6.1. WSGI Utilities

This module defines special classes and functions to make it easier to write applications based on the WSGI specification.

6.1.1. Overview

WSGI applications are simple callable objects that take two arguments:

result = application(environ, start_response)
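For comparison, a bare WSGI callable that uses none of the helper classes in this module might look like the following sketch. It relies only on Python's standard wsgiref package; the fake_start_response function and the captured dictionary are illustrative names, not part of this module.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI callable: no WSGIApp or WSGIContext involved."""
    body = b"Hello world!"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    # the result is an iterable of byte strings
    return [body]


# Call the application directly with a fake environ, as a server would
captured = {}


def fake_start_response(status, headers, exc_info=None):
    # record what the application passed to the call-back
    captured["status"] = status
    captured["headers"] = headers


environ = {}
setup_testing_defaults(environ)
result = application(environ, fake_start_response)
```

The utility classes described below wrap exactly this pair of arguments so that your own methods only need to deal with a single context object.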

In these utility classes, the arguments are encapsulated into a special context object based on WSGIContext. The context object allows you to get and set information specific to handling a single request. It also contains utility methods that are useful for extracting information from the URL, headers and request body and, likewise, methods that are useful for setting the response status and headers. Even in multi-threaded servers, each context instance is used by a single thread.

The application callable itself is modeled by an instance of the class WSGIApp. The instance may be called by multiple threads simultaneously so any state stored in the application is shared across all contexts and threads.

Many of the app class’ methods take an abbreviated form of the WSGI callable signature:

result = wsgi_app.page_method(context)

In this pattern, wsgi_app is a WSGIApp instance and page_method is the name of some response generating method defined in it.

In practice, you’ll derive a class from WSGIApp for your application and, possibly, derive a class from WSGIContext too. In the latter case, you must set the class attribute WSGIApp.ContextClass to your custom context class before creating your application instance.

The lifecycle of a script that runs your application can be summed up:

  1. Define your WSGIApp sub-class
  2. Set the values of any class attributes that are specific to a particular runtime environment. For example, you’ll probably want to set the path to the WSGIApp.settings_file where you can provide other runtime configuration options.
  3. Configure the class by calling the WSGIApp.setup() class method.
  4. Create an instance of the class
  5. Start handling requests!

Here’s an example:

#
#   Runtime configuration directives
#

#: path to settings file
SETTINGS_FILE = '/var/www/wsgi/data/settings.json'

#   Step 1: define the WSGIApp sub-class
class MyApp(WSGIApp):
    """Your class definitions here"""

    #   Step 2: set class attributes to configured values
    settings_file = SETTINGS_FILE

#   Step 3: call setup to configure the application
MyApp.setup()

#   Step 4: create an instance
application = MyApp()

#   Step 5: start handling requests, your framework may differ!
application.run_server()

In the last step we call a run_server method which uses Python’s built-in HTTP/WSGI server implementation. This is suitable for testing an application but in practice you’ll probably want to deploy your application with some other WSGI driver, such as Apache and mod_wsgi.

6.1.1.1. Testing

The core WSGIApp class has a number of methods that make it easy to test your application from the command line, using Python’s built-in support for WSGI. In the example above you saw how the run_server method can be used.

There is also a facility to launch an application from the command line with options to override several settings. You can invoke this behaviour simply by calling the main class method:

from pyslet.wsgi import WSGIApp

class MyApp(WSGIApp):
    """Your class definitions here"""
    pass

if __name__ == "__main__":
    MyApp.main()

This simple example is available in the samples directory. You can invoke your script from the command line; use --help to see what options are available:

$ python samples/wsgi_basic.py --help
Usage: wsgi_basic.py [options]

Options:
  -h, --help            show this help message and exit
  -v                    increase verbosity of output up to 3x
  -p PORT, --port=PORT  port on which to listen
  -i, --interactive     Enable interactive prompt after starting server
  --static=STATIC       Path to the directory of static files
  --private=PRIVATE     Path to the directory for data files
  --settings=SETTINGS   Path to the settings file

You could start a simple interactive server on port 8081 and hit it from your web browser with the following command:

$ python samples/wsgi_basic.py -ivvp8081
INFO:pyslet.wsgi:Starting MyApp server on port 8081
cmd: INFO:pyslet.wsgi:HTTP server on port 8081 running
1.0.0.127.in-addr.arpa - - [11/Dec/2014 23:49:54] "GET / HTTP/1.1" 200 78
cmd: stop

Typing ‘stop’ at the cmd prompt in interactive mode exits the server. Anything other than stop is evaluated as a Python expression in the context of a method on your application object, which allows you to interrogate your application while it is running:

cmd: self
<__main__.MyApp object at 0x1004c2b50>
cmd: self.settings
{'WSGIApp': {'interactive': True, 'static': None, 'port': 8081, 'private': None, 'level': 20}}

If you include -vvv on the launch you’ll get full debugging information including all WSGI environment information and all application output logged to the terminal.

6.1.2. Handling Pages

To handle a page you need to register your page with the request dispatcher. You typically do this during WSGIApp.init_dispatcher() by calling WSGIApp.set_method() and passing a pattern to match in the path and a bound method:

class MyApp(WSGIApp):

    def init_dispatcher(self):
        super(MyApp, self).init_dispatcher()
        self.set_method("/*", self.home)

    def home(self, context):
        data = "<html><head><title>Hello</title></head>" \
            "<body><p>Hello world!</p></body></html>"
        context.set_status(200)
        return self.html_response(context, data)

In this example we registered our simple ‘home’ method as the handler for all paths. The star is used instead of a complete path component and represents a wildcard that matches any value. When used at the end of a path it matches any (possibly empty) sequence of path components.

6.1.3. Data Storage

Most applications will need to read from or write data to some type of data store. Pyslet exposes its own data access layer to web applications. For details of the data access layer see the OData section.

To associate a data container with your application simply derive your application from WSGIDataApp instead of the more basic WSGIApp.

You’ll need to supply a metadata XML document describing your data schema and information about the data source in the settings file.

The minimum required to get an application working with a sqlite3 database would be a directory with the following layout:

settings.json
metadata.xml
data/

The settings.json file would contain:

{
"WSGIApp": {
    "private": "data"
    },
"WSGIDataApp": {
    "metadata": "metadata.xml"
    }
}

If the settings file is in samples/wsgi_data your source might look like this:

from pyslet.wsgi import WSGIDataApp

class MyApp(WSGIDataApp):

    settings_file = 'samples/wsgi_data/settings.json'

    # method definitions as before

if __name__ == "__main__":
    MyApp.main()

To create your database the first time you will either want to run a custom SQL script or get Pyslet to create the tables for you. With the script above both options can be achieved with the command line:

$ python samples/wsgi_data.py --create_tables -ivvp8081

This command starts the server as before but instructs it to create the tables in the database before running. Obviously you can only specify this option the first time!

Alternatively you might want to customise the table creation script, in which case you can create a pro-forma to edit using the --sqlout option instead:

$ python samples/wsgi_data.py --sqlout > wsgi_data.sql

6.1.4. Session Management

The WSGIDataApp is further extended by SessionApp to cover the common use case of needing to track information across multiple requests from the same user session.

The approach taken requires cookies to be enabled in the user’s browser. See Problems with Cookies below for details.

A decorator, session_decorator(), is defined to make it easy to write (page) methods that depend on the existence of an active session. The session initiation logic is a little convoluted and is likely to involve at least one redirect when a protected page is first requested, but this all happens transparently to your application. You may want to look at overriding the ctest_page() and cfail_page() methods to provide more user-friendly messages in cases where cookies are blocked.

6.1.4.1. CSRF

Hand-in-hand with session management is defence against cross-site request forgery (CSRF) attacks. Relying purely on a session cookie to identify a user is problematic because a third party site could cause the user’s browser to submit requests to your application on their behalf. The browser will send the session cookie even if the request originated outside one of your application’s pages.

POST requests that affect the state of the server or carry out some other action requiring authorisation must be protected. Requests that simply return information (i.e., GET requests) are usually safe, even if the response contains confidential information, as the browser prevents the third party site from actually reading the HTML. Be careful when returning data other than HTML though, for example, data that could be parsed as valid JavaScript will need additional protection. The importance of using HTTP request methods appropriately cannot be overstated!

The most common pattern for preventing this type of fraud is to use a special token in POST requests that can’t be guessed by the third party and isn’t exposed outside the page from which the POSTed form is supposed to originate. If you decorate a page that is the target of a POST request (the page that performs the action) with the session decorator then the request will fail if a CSRF token is not included in the request. The token can be read from the session object and will need to be inserted into any forms in your application. You shouldn’t expose your CSRF token in the URL as that makes it vulnerable to being discovered, so don’t add it to forms that use the GET action.
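The token pattern itself can be illustrated with the standard library alone. The sketch below is not how SessionApp performs its check; new_csrf_token and check_csrf_token are hypothetical helpers that show one way to create an unguessable value and compare it safely with the value POSTed back.

```python
import hmac
import secrets


def new_csrf_token():
    """Returns an unguessable, URL-safe random token (hypothetical helper)."""
    return secrets.token_urlsafe(32)


def check_csrf_token(session_token, posted_token):
    """Compares the token held for the session with the POSTed value.

    hmac.compare_digest is used so that the comparison time does not
    leak how many leading characters matched."""
    if not posted_token:
        return False
    return hmac.compare_digest(session_token.encode("utf-8"),
                               posted_token.encode("utf-8"))


token = new_csrf_token()
```

A request that omits the token, or supplies a different one, fails the check and should be rejected before any state is changed.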

Here’s a simple example method that shows the use of the session decorator:

@session_decorator
def home(self, context):
    page = """<html><head><title>Session Page</title></head><body>
        <h1>Session Page</h1>
        %s
        </body></html>"""
    with self.container['Sessions'].open() as collection:
        try:
            entity = collection[context.session.sid]
            user_name = entity['UserName'].value
        except KeyError:
            user_name = None
    if user_name:
        noform = """<p>Welcome: %s</p>"""
        page = page % (noform % xml.EscapeCharData(user_name))
    else:
        form = """<form method="POST" action="setname">
            <p>Please enter your name: <input type="text" name="name"/>
                <input type="hidden" name=%s value=%s />
                <input type="submit" value="Set"/></p>
            </form>"""
        page = page % (
            form % (xml.EscapeCharData(self.csrf_token, True),
                    xml.EscapeCharData(context.session.sid, True)))
    context.set_status(200)
    return self.html_response(context, page)

We’ve added a simple database table to store the session data with the following entity:

<EntityType Name="Session">
    <Key>
        <PropertyRef Name="SessionID"/>
    </Key>
    <Property Name="SessionID" Type="Edm.String"
        MaxLength="64" Nullable="false"/>
    <Property Name="UserName" Type="Edm.String"
        Nullable="true" MaxLength="256" Unicode="true"/>
</EntityType>

Our database must also contain a small table used for key management, see below for information about encryption.

Our method reads the value of the UserName property from the database and prints a welcome message if it is set. If not, it prints a form allowing you to enter your name. Notice that we must include a hidden field containing the CSRF token. The name of the token parameter is given in SessionApp.csrf_token and the value is read from the session object passed in the accompanying cookie - the browser should prevent third parties from reading the cookie’s value.

The action method that processes the form looks like this:

@session_decorator
def setname(self, context):
    user_name = context.get_form_string('name')
    if user_name:
        with self.container['Sessions'].open() as collection:
            try:
                entity = collection[context.session.sid]
                entity['UserName'].set_from_value(user_name)
                collection.update_entity(entity)
            except KeyError:
                entity = collection.new_entity()
                entity['SessionID'].set_from_value(context.session.sid)
                entity['UserName'].set_from_value(user_name)
                collection.insert_entity(entity)
    return self.redirect_page(context, context.get_app_root())

A sample application containing this code is provided and can again be run from the command line:

$ python samples/wsgi/wsgi_session.py --create_tables -ivvp8081

6.1.4.2. Problems with Cookies

There has been significant uncertainty over the use of cookies with some browsers blocking them in certain situations and some users blocking them entirely. In particular, the E-Privacy Directive in the European Union has led to a spate of scrolling consent banners and pop-ups on website landing pages.

It is worth bearing in mind that use of cookies, as opposed to URL-based solutions or cacheable basic auth credentials, is currently considered more secure for passing session identifiers. When designing your application you need to balance the privacy rights of your users with the need to keep their information safe and secure. Indeed, the main provisions of the directive are about providing security of services. As a result, it is generally accepted that the use of cookies for tracking sessions is essential and does not require any special consent from the user.

By extending WSGIDataApp this implementation always persists session data on the server. This gets around most of the perceived issues with the directive and cookies but does not absolve you and your application of the need to obtain consent from a more general data protection perspective!

Perhaps more onerous, but less discussed, is the obligation to remove ‘traffic data’, sometimes referred to as metadata, about the transmission of a communication. For this reason, we don’t store the originating IP address of the session even though doing so might actually increase security. As always, it’s a balance.

Finally, by relying on cookies we will sometimes fall foul of browser attempts to automate the privacy preferences of their users. The most common scenario is when our application is opened in a frame within another application. In this case, some browsers will apply a much stricter policy on blocking cookies. For example, Microsoft’s Internet Explorer (from version 6) requires the implementation of the P3P standard for communicating privacy information. Although some sites have chosen to fake a policy to trick the browser into accepting their cookies this has resulted in legal action so is not to be recommended.

See: http://msdn.microsoft.com/en-us/library/ms537343(v=VS.85).aspx

To maximise the chances of being able to create a session this class uses automatic redirection to test for cookie storage and a mechanism for transferring the session to a new window if it detects that cookies are blocked.

For a more detailed explanation of how this is achieved see my blog post Putting Cookies in the Frame.

In many cases, once the application has been opened in a new window and the test cookie has been set successfully, future framed calls to the application will receive cookies and the user experience will be much smoother.

6.1.5. Encrypting Data

Sometimes you’ll want to encrypt sensitive data stored in a data store to prevent, say, a database administrator from being able to read it. This module provides a utility class called AppCipher which is designed to make this easier.

An AppCipher is initialised with a key. There are various strategies for storing keys for application use. In the simplest case you might read the key from a configuration file that is only available on the application server and not to the database administrator, say.

To assist with key management AppCipher will store old keys (suitably encrypted) in the data store using an entity with the following properties:

<EntityType Name="AppKey">
    <Key>
        <PropertyRef Name="KeyNum"/>
    </Key>
    <Property Name="KeyNum" Nullable="false" Type="Edm.Int32"/>
    <Property Name="KeyString" Nullable="false"
        Type="Edm.String" MaxLength="256" Unicode="false"/>
</EntityType>

A SessionApp requires an AppCipher to be specified in the settings and an AppKeys entity set in the data store to enable signing of the session cookie (to guard against cookie tampering).
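To see why a signature guards against tampering, consider this minimal sketch based on the standard hmac module. It does not reproduce AppCipher's interface or cookie format; sign_value, verify_value and the example key are assumptions made purely for illustration.

```python
import hashlib
import hmac


def sign_value(key, value):
    """Returns 'value-signature' using a hex HMAC-SHA256 signature."""
    sig = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "%s-%s" % (value, sig)


def verify_value(key, signed):
    """Returns the original value, or None if the signature is wrong."""
    value, sep, sig = signed.rpartition("-")
    if not sep:
        return None
    expected = hmac.new(key, value.encode("utf-8"),
                        hashlib.sha256).hexdigest()
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None


key = b"example-secret-key"     # illustrative only, never hard-code keys
cookie = sign_value(key, "session123")
```

A client that alters the value cannot produce a matching signature without the key, so the altered cookie is simply rejected.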

The default implementation of AppCipher does not use any encryption (it merely obfuscates the input using base64 encoding) so to be useful you’ll need to use a class derived from AppCipher. If you have the PyCrypto module installed you can use the AESAppCipher class, which encrypts the data with the AES algorithm.
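The difference between obfuscation and encryption is easy to demonstrate. This short sketch uses only the standard base64 module (the sample data is made up) and shows that base64 output can be reversed by anyone, without a key.

```python
import base64

secret = b"card number 1234"        # made-up sample data
obfuscated = base64.b64encode(secret)

# anyone can reverse the encoding without knowing any key
recovered = base64.b64decode(obfuscated)
```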

For details, see the reference section below.

6.1.6. Reference

class pyslet.wsgi.WSGIContext(environ, start_response, canonical_root=None)

Bases: object

A class used for managing WSGI calls

environ
The WSGI environment
start_response
The WSGI call-back
canonical_root
A URL that overrides the automatically derived canonical root, see WSGIApp for more details.

This class acts as a holding place for information specific to each request being handled by a WSGI-based application. In some frameworks this might be called the request object but we already have requests modelled in the http package and, anyway, this holds information about the WSGI environment and the response too.

MAX_CONTENT = 65536

The maximum amount of content we’ll read into memory (64K)

environ = None

the WSGI environ

start_response_method = None

the WSGI start_response callable

status = None

the response status code (an integer), see set_status()

status_message = None

the response status message (a string), see set_status()

headers = None

a list of (name, value) tuples containing the headers to return to the client. name and value must be strings

set_status(code)

Sets the status of the response

code
An HTTP integer response code.

This method sets the status_message automatically from the code. You must call this method before calling start_response.

add_header(name, value)

Adds a header to the response

name
The name of the header (a string)
value
The value of the header (a string)

start_response()

Calls the WSGI start_response method

If the status has not been set a 500 response is generated. The status string is created automatically from status and status_message and the headers are set from headers.

The return value is the return value of the WSGI start_response call, an obsolete callable that older applications use to write the body data of the response.

If you want to use the exc_info mechanism you must call start_response yourself directly using the value of start_response_method
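The way a status string is assembled from a code and message can be sketched as follows. This is an illustration built on the standard http.HTTPStatus table rather than this class's own code; status_line is a hypothetical helper and the "Unknown" fallback is an assumption.

```python
from http import HTTPStatus


def status_line(code=None):
    """Builds a WSGI status string such as '200 OK'.

    Falls back to 500 when no status code has been set."""
    if code is None:
        code = 500
    try:
        message = HTTPStatus(code).phrase
    except ValueError:
        message = "Unknown"     # assumption: placeholder for unlisted codes
    return "%i %s" % (code, message)
```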

get_app_root()

Returns the root of this application

The result is a pyslet.rfc2396.URI instance. It is calculated from the environment in the same way as get_url() but only examines the SCRIPT_NAME portion of the path.

It always ends in a trailing slash. So if you have a script bound to /script/myscript.py running over http on www.example.com then you will get:

http://www.example.com/script/myscript.py/

This allows you to generate absolute URLs by resolving them relative to the computed application root, e.g.:

URI.from_octets('images/counter.png').resolve(
    context.get_app_root())

would return:

http://www.example.com/script/myscript.py/images/counter.png

for the above example. This is preferable to using absolute paths which would strip away the SCRIPT_NAME prefix when used.
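The same effect can be observed with the standard library's urljoin function, used here in place of pyslet's URI class purely for illustration:

```python
from urllib.parse import urljoin

app_root = "http://www.example.com/script/myscript.py/"

# a relative reference keeps the SCRIPT_NAME prefix
relative = urljoin(app_root, "images/counter.png")

# an absolute path replaces the whole path, losing the prefix
absolute = urljoin(app_root, "/images/counter.png")
```

Here relative is http://www.example.com/script/myscript.py/images/counter.png whereas absolute is http://www.example.com/images/counter.png, which no longer points into the application.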

get_url()

Returns the URL used in the request

The result is a pyslet.rfc2396.URI instance. It is calculated from the environment using the algorithm described in the URL Reconstruction section of the WSGI specification except that it ignores the Host header for security reasons.

Unlike the result of get_app_root() it doesn’t necessarily end with a trailing slash. So if you have a script bound to /script/myscript.py running over http on www.example.com then you may get:

http://www.example.com/script/myscript.py

A good pattern to adopt when faced with a missing trailing slash on a URL that is intended to behave as a ‘directory’ is to add the slash to the URL and use xml:base (for XML responses) or HTML’s <base> tag to set the root for relative links. The alternative is to issue an explicit redirect but this requires another request from the client.

This causes particular pain in OData services which frequently respond on the service script’s URL without a slash but generate incorrect relative links to the contained feeds as a result.

get_query()

Returns a dictionary of query parameters

The dictionary maps parameter names onto strings. In cases where multiple values have been supplied the values are comma separated, so a URL ending in ?option=Apple&option=Pear would result in the dictionary:

{'option': 'Apple,Pear'}

This method only computes the dictionary once. Future calls return the same dictionary!

Note that the dictionary does not contain any cookie values or form parameters.
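The comma-joining behaviour described above can be approximated with the standard library. This sketch is not the method's actual implementation; query_dict is a hypothetical function built on urllib.parse.parse_qs.

```python
from urllib.parse import parse_qs


def query_dict(query_string):
    """Maps each parameter name onto a single comma-separated string."""
    values = parse_qs(query_string, keep_blank_values=True)
    return {name: ",".join(vlist) for name, vlist in values.items()}


result = query_dict("option=Apple&option=Pear&size=10")
```

One consequence of this design is that a value which itself contains a comma cannot be distinguished from two separate values.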

get_content()

Returns the content of the request as a string

The content is read from the input, up to CONTENT_LENGTH bytes, and is returned as a binary string. If the content exceeds MAX_CONTENT (default: 64K) then BadRequest is raised.

This method can be called multiple times but the content is only actually read from the input the first time. Subsequent calls return the same string.

This method cannot be called on the same context as get_form(); whichever is called first takes precedence. Calls to get_content after get_form return None.
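The read-once, size-limited behaviour can be sketched like this. ContentReader is a hypothetical stand-in for the context class, and it raises ValueError where the real method raises BadRequest.

```python
import io

MAX_CONTENT = 65536


class ContentReader(object):
    """Hypothetical helper that reads a WSGI request body at most once."""

    def __init__(self, environ):
        self.environ = environ
        self._content = None

    def get_content(self):
        if self._content is None:
            # an empty or missing CONTENT_LENGTH means no content
            length = int(self.environ.get("CONTENT_LENGTH") or 0)
            if length > MAX_CONTENT:
                raise ValueError("content exceeds MAX_CONTENT")
            # never read beyond the declared length
            self._content = self.environ["wsgi.input"].read(length)
        return self._content


reader = ContentReader({"CONTENT_LENGTH": "5",
                        "wsgi.input": io.BytesIO(b"hello world")})
first = reader.get_content()
second = reader.get_content()
```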

get_form()

Returns a FieldStorage object parsed from the content.

The query string is excluded before the form is parsed as this only covers parameters submitted in the content of the request. To search the query string you will need to examine the dictionary returned by get_query() too.

This method can be called multiple times but the form is only actually read from the input the first time. Subsequent calls return the same FieldStorage object.

This method cannot be called on the same context as get_content(); whichever is called first takes precedence. Calls to get_form after get_content return None.

Warning: get_form will only parse the form from the content if the request method was POST!

get_form_string(name, max_length=65536)

Returns the value of a string parameter from the form.

name
The name of the parameter
max_length (optional, defaults to 64KB)

Due to an issue in the implementation of FieldStorage it isn’t actually possible to definitively tell the difference between a file upload and an ordinary input field. HTML5 clarifies the situation to say that ordinary fields don’t have a content type but FieldStorage assumes ‘text/plain’ in this case and sets the file and type attribute of the field anyway.

To prevent obtuse clients sending large files disguised as ordinary form fields, tricking your application into loading them into memory, this method checks the size of any file attribute (if present) against max_length before returning the field’s value.

If the parameter is missing from the form then an empty string is returned.

get_form_long(name)

Returns the value of a (long) integer parameter from the form.

name
The name of the parameter

If the parameter is missing from the form then None is returned. If the parameter is present but is not a valid integer then BadRequest is raised.

get_cookies()

Returns a dictionary of cookies from the request

If no cookies were passed an empty dictionary is returned.

For details of how multi-valued cookies are handled see: pyslet.http.cookie.CookieParser.request_cookie_string().

class pyslet.wsgi.WSGIApp

Bases: pyslet.wsgi.DispatchNode

An object to help support WSGI-based applications.

Instances are designed to be callable by the WSGI middleware. On creation each instance is assigned a random identifier which is used to provide comparison and hash implementations. We go to this trouble so that derived classes can use techniques like the functools lru_cache decorator in future versions.

ContextClass

the context class to use for this application, must be (derived from) WSGIContext

alias of WSGIContext

static_files = None

The path to the directory for static_files. Defaults to None. A pyslet.vfs.OSFilePath instance.

private_files = None

Private data directory

A pyslet.vfs.OSFilePath instance.

The directory used for storing private data. The directory is partitioned into sub-directories based on the lower-cased class name of the object that owns the data. For example, if private_files is set to ‘/var/www/data’ and you derive a class called ‘MyApp’ from WSGIApp you can assume that it is safe to store and retrieve private data files from ‘/var/www/data/myapp’.

private_files defaults to None for safety. The current WSGIApp implementation does not depend on any private data.

settings_file = None

The path to the settings file. Defaults to None.

A pyslet.vfs.OSFilePath instance.

The format of the settings file is a json dictionary. The dictionary’s keys are class names that define a scope for class-specific settings. The key ‘WSGIApp’ is reserved for settings defined by this class. The defined settings are:

level (None)
If specified, used to set the root logging level, a value between 0 (NOTSET) and 50 (CRITICAL). For more information see python’s logging module.
port (8080)
The port number used by run_server()
canonical_root (“http://localhost” or “http://localhost:<port>”)

The canonical URL scheme, host (and port if required) for the application. This value is passed to the context and used by WSGIContext.get_url() and similar methods in preference to the SERVER_NAME and SERVER_PORT to construct absolute URLs returned or recorded by the application. Note that the Host header is always ignored to prevent related security attacks.

If no value is given then the default is calculated taking into consideration the port setting.

interactive (False)
Sets the behaviour of run_server(), if specified the main thread prompts the user with a command line interface allowing you to interact with the running server. When False, run_server will run forever and can only be killed by an application request that sets stop to True or by an external signal that kills the process.
static (None)
A URL to the static files (not a local file path). This will normally be an absolute path or a relative path. Relative paths are relative to the settings file in which the setting is defined. As URL syntax is used you must use the ‘/’ as a path separator and add proper URL-escaping. On Windows, UNC paths can be specified by putting the host name in the authority section of the URL.
private (None)
A URL to the private files. Interpreted as per the ‘static’ setting above.
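The class-scoped layout of the settings dictionary can be illustrated with the standard json module. This is only a sketch: load_settings is a hypothetical function, the ‘MyApp’ scope and its ‘greeting’ setting are invented for the example, and the defaults are those listed above (canonical_root is omitted because it is calculated).

```python
import json

SETTINGS_TEXT = """{
    "WSGIApp": {"port": 8081, "interactive": true},
    "MyApp": {"greeting": "Hello"}
}"""

# defaults for the settings reserved by the 'WSGIApp' scope
DEFAULTS = {"level": None, "port": 8080, "interactive": False,
            "static": None, "private": None}


def load_settings(text):
    """Parses the settings and fills in missing 'WSGIApp' defaults."""
    settings = json.loads(text)
    base = settings.setdefault("WSGIApp", {})
    for name, value in DEFAULTS.items():
        base.setdefault(name, value)
    return settings


settings = load_settings(SETTINGS_TEXT)
```

Keeping each class's settings under its own key means a derived class can add options without any risk of clashing with those of its parents.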
settings = None

the class settings loaded from settings_file by setup()

base = None

the base URI of this class. It is set from the path to the settings file itself and is used to locate data files on the server. This is a pyslet.rfc2396.FileURL instance. Not to be confused with the base URI of resources exposed by the application this class implements!

private_base = None

the base URI of this class’ private files. This is set from the private_files member and is a pyslet.rfc2396.FileURL instance

content_type = {'ico': MediaType('image', 'vnd.microsoft.icon', {})}

The mime type mapping table.

This table is used before falling back on Python’s built-in guess_type function from the mimetypes module. Add your own custom mappings here.

It maps file extension (without the dot) on to MediaType instances.
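The lookup order described here, custom table first and then the mimetypes module, might be sketched as below. Plain strings are used instead of MediaType instances, and both guess_content_type and its application/octet-stream default are assumptions made for the example.

```python
import mimetypes

# custom table consulted first; keys are extensions without the dot
CONTENT_TYPE = {"ico": "image/vnd.microsoft.icon"}


def guess_content_type(file_name, default="application/octet-stream"):
    """Returns a media type string for file_name."""
    ext = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    if ext in CONTENT_TYPE:
        return CONTENT_TYPE[ext]
    # fall back on Python's built-in table
    guessed, encoding = mimetypes.guess_type(file_name)
    return guessed or default
```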

MAX_CHUNK = 65536

the maximum chunk size to read into memory when returning a (static) file. Defaults to 64K.

js_origin = 0

the integer millisecond time (since the epoch) corresponding to 01 January 1970 00:00:00 UTC, the JavaScript time origin.

clslock = <_RLock owner=None count=0>

a threading.RLock instance that can be used to lock the class when dealing with data that might be shared amongst threads.

classmethod main()

Runs the application

Options are parsed from the command line and used to setup() the class before an instance is created and launched with run_server().

classmethod add_options(parser)

Defines command line options.

parser
An OptionParser instance, as defined by Python’s built-in optparse module.

The following options are added to parser by the base implementation:

-v
Sets the logging level to WARNING, INFO or DEBUG depending on the number of times it is specified. Overrides the ‘level’ setting in the settings file.
-p, --port
Overrides the value of the ‘port’ setting in the settings file.
-i, --interactive
Overrides the value of the ‘interactive’ setting in the settings file.
--static
Overrides the value of static_files.
--private
Overrides the value of private_files.
--settings
Sets the path to the settings_file.
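To show how such options behave when combined on the command line, here is a sketch that defines three of them with optparse. It is not the base implementation: make_parser and level_from_verbosity are hypothetical, the dest names are invented, and the mapping from the -v count to a logging level is an assumption consistent with the description above.

```python
import logging
from optparse import OptionParser


def make_parser():
    """Builds a parser with options similar to -v, -p and -i."""
    parser = OptionParser()
    parser.add_option("-v", action="count", dest="verbose", default=0)
    parser.add_option("-p", "--port", action="store", dest="port",
                      type="int", default=None)
    parser.add_option("-i", "--interactive", action="store_true",
                      dest="interactive", default=False)
    return parser


def level_from_verbosity(count):
    """Assumed mapping: 1 -> WARNING, 2 -> INFO, 3 or more -> DEBUG."""
    if count <= 0:
        return None
    return [logging.WARNING, logging.INFO, logging.DEBUG][min(count, 3) - 1]


# short options can be clustered, the last one taking the argument
options, args = make_parser().parse_args(["-ivvp8081"])
```

Parsing -ivvp8081 this way turns on interactive mode, counts two -v flags and sets the port to 8081.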
classmethod setup(options=None, args=None, **kwargs)

Perform one-time class setup

options
An optional object containing the command line options, such as an optparse.Values instance created by calling parse_args on the OptionParser instance passed to add_options().
args
An optional list of positional command-line arguments such as would be returned from parse_args after the options have been removed.

All arguments are given as keyword arguments to enable use of super and diamond inheritance.

The purpose of this method is to perform any actions required to setup the class prior to the creation of any instances.

The default implementation loads the settings file and sets the value of settings. If no settings file can be found then an empty dictionary is created and populated with any overrides parsed from options.

Finally, the root logger is initialised based on the level setting.

Derived classes should always use super to call the base implementation before their own setup actions are performed.

classmethod resolve_setup_path(uri_path, private=False)

Resolves a settings-relative path

uri_path
The relative URI of a file or directory.
private (False)
Resolve relative to the private files directory

Returns uri_path as an OSFilePath instance after resolving relative to the settings file location or to the private files location as indicated by the private flag. If the required location is not set then uri_path must be an absolute file URL (starting with, e.g., file:///). On Windows systems the authority component of the URL may be used to specify the host name for a UNC path.

stop = None

flag: set to True to request run_server() to exit

id = None

a unique ID for this instance

init_dispatcher()

Used to initialise the dispatcher.

By default all requested paths generate a 404 error. You register pages during init_dispatcher() by calling set_method(). Derived classes should use super to pass the call to their parents.

set_method(path, method)

Registers a bound method in the dispatcher

path
A path or path pattern
method

A bound method or callable with the basic signature:

result = method(context)

A star in the path is treated as a wildcard and matches a complete path segment. A star at the end of the path (which must be after a ‘/’) matches any sequence of path segments. The matching sequence may be empty, in other words, “/images/*” matches “/images/”. In keeping with common practice a missing trailing slash is ignored when dispatching so “/images” will also be routed to a method registered with “/images/*” though if a separate registration is made for “/images” it will be matched in preference.

Named matches always take precedence over wildcards so you can register “/images/*” and “/images/counter.png” and the latter path will be routed to its preferred handler. Similarly you can register “/*/background.png” and “/home/background.png” but remember the ‘*’ only matches a single path component! There is no way to match background.png in any directory.
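
The matching rules can be illustrated with a small stand-alone sketch. This is not Pyslet's dispatcher: the functions matches, specificity and route are hypothetical names, and the way precedence is scored here (literal segments beat wildcards, segment by segment) is an assumption chosen to reproduce the behaviour described above.

```python
def _segments(path):
    # "/images/logo.png" -> ["images", "logo.png"]; "/images/" -> ["images", ""]
    return path.split("/")[1:]


def matches(pattern, path):
    """True if path matches pattern under the wildcard rules described above."""
    pat, segs = _segments(pattern), _segments(path)
    if pat and pat[-1] == "*":
        # Trailing star: compare the fixed head, accept any (even empty) tail.
        head = pat[:-1]
        if len(segs) < len(head):
            return False
        return all(p in ("*", s) for p, s in zip(head, segs))
    if len(pat) != len(segs):
        return False
    # A star elsewhere matches exactly one segment.
    return all(p in ("*", s) for p, s in zip(pat, segs))


def specificity(pattern):
    """Smaller tuples are more specific: literal=0, star=1, trailing star=2."""
    pat = _segments(pattern)
    last = len(pat) - 1
    return tuple(0 if p != "*" else (2 if i == last else 1)
                 for i, p in enumerate(pat))


def route(table, path):
    """Return the handler registered for the most specific matching pattern."""
    candidates = [p for p in table if matches(p, path)]
    if not candidates:
        return None  # a real dispatcher would generate a 404 here
    return table[min(candidates, key=specificity)]


table = {
    "/images/*": "images",
    "/images/counter.png": "counter",
    "/*/background.png": "any_background",
    "/home/background.png": "home_background",
}
print(route(table, "/images/counter.png"))
print(route(table, "/a/b/background.png"))
```

In this sketch a request for “/images” is routed to the handler registered with “/images/*”, and “/a/b/background.png” matches nothing because a star in the middle of a pattern only ever consumes one segment.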

call_wrapper(environ, start_response)

Alternative entry point for debugging

Although instances are callable you may use this method instead as your application’s entry point when debugging.

This method logs, at DEBUG level, the environ variables, the headers output by the application and all the data returned (in quoted-printable form).

It also catches a common error, that of returning something other than a string for a header value or in the generated output. These are logged at ERROR level and converted to strings before being passed to the calling framework.

static_page(context)

Returns a static page

This method can be bound to any path using set_method() and it will look in the static_files directory for that file. For example, if static_files is “/var/www/html” and the PATH_INFO variable in the request is “/images/logo.png” then the path “/var/www/html/images/logo.png” will be returned.

There are significant restrictions on the names of the path components. Each component must match a basic label syntax (equivalent to the syntax of domain labels in host names) except the last component which must have a single ‘.’ separating two valid labels. This conservative syntax is designed to be safe for passing to file handling functions.
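
A rough sketch of this kind of conservative check is shown below. The exact label syntax used by Pyslet is not given here, so the regular expression (letters, digits and hyphens, with no leading or trailing hyphen) is an assumption, and is_safe_static_path is a hypothetical helper name.

```python
import re

# Assumed label syntax, modelled on host name labels.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
DIR_RE = re.compile(LABEL)
FILE_RE = re.compile(r"%s\.%s" % (LABEL, LABEL))


def is_safe_static_path(path_info):
    """True if every directory is a label and the file name is label.label."""
    segs = path_info.split("/")
    if len(segs) < 2 or segs[0] != "":
        return False  # must start with "/"
    *dirs, name = segs[1:]
    return (all(DIR_RE.fullmatch(d) for d in dirs) and
            bool(FILE_RE.fullmatch(name)))


print(is_safe_static_path("/images/logo.png"))
print(is_safe_static_path("/../etc/passwd"))
```

Under these assumptions names such as “..”, “.htaccess” and “logo.tar.gz” are all rejected, which is the point of such a restrictive syntax.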

file_response(context, file_path)

Returns a file from the file system

file_path
The system file path of the file to be returned as a pyslet.vfs.OSFilePath instance.

The Content-Length header is set from the file size, the Last-Modified date is set from the file’s st_mtime and the file’s data is returned in chunks of MAX_CHUNK in the response.
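
Returning data in fixed-size chunks is a common WSGI pattern. The generator below is a minimal sketch of the idea only: iter_file_chunks is a hypothetical name and the value given to MAX_CHUNK is an assumption, not the value used by the class.

```python
import io

MAX_CHUNK = 0x10000  # assumed value, for illustration only


def iter_file_chunks(f, chunk_size=MAX_CHUNK):
    """Yield successive blocks of at most chunk_size bytes from file object f."""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        yield chunk


# A WSGI application can return such a generator as its response body.
chunks = list(iter_file_chunks(io.BytesIO(b"abcdefg"), chunk_size=3))
print(chunks)
```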

The status is not set and must have been set before calling this method.

html_response(context, data)

Returns an HTML page

data
A string containing the HTML page data. This may be a unicode or binary string.

The Content-Type header is set to text/html (with an explicit charset if data is a unicode string). The status is not set and must have been set before calling this method.

json_response(context, data)

Returns a JSON response

data
A string containing the JSON data. This may be a unicode or binary string (encoded with utf-8).

The Content-Type is set to “application/json”. The status is not set and must have been set before calling this method.

text_response(context, data)

Returns a plain text response

data
A string containing the text data. This may be a unicode or binary string (encoded with US-ASCII).

The Content-Type is set to “text/plain” (with an explicit charset if a unicode string is passed). The status is not set and must have been set before calling this method.

Warning: do not encode unicode strings before passing them to this method as data. If you do, you risk problems with non-ASCII characters because the default charset for text/plain is US-ASCII and not UTF-8 or ISO8859-1 (latin-1).

redirect_page(context, location, code=303)

Returns a redirect response

location
A URI instance or a string of octets.
code (303)
The redirect status code. As a reminder, the typical codes are 301 for a permanent redirect, 302 for a temporary redirect and 303 for a temporary redirect following a POST request. The 303 code is useful for implementing the widely adopted pattern of always redirecting the user after a successful POST request to prevent browsers prompting for re-submission and is therefore the default.

This method takes care of setting the status, the Location header and generating a simple HTML redirection page response containing a clickable link to location.

error_page(context, code=500, msg=None)

Generates an error response

code (500)
The status code to send.
msg (None)
An optional plain-text error message. If not given then the status line is echoed in the body of the response.

class pyslet.wsgi.WSGIDataApp(**kwargs)

Bases: pyslet.wsgi.WSGIApp

Extends WSGIApp to include a data store

The key ‘WSGIDataApp’ is reserved for settings defined by this class in the settings file. The defined settings are:

container (None)
The name of the container to use for the data store. By default, the default container is used. For future compatibility you should not depend on using this option.
metadata (None)
URI of the metadata file containing the data schema. The file is assumed to be relative to the settings_file.
source_type (‘sqlite’)
The type of data source to create. The default value is sqlite. A value of ‘mysql’ selects Pyslet’s mysqldbds module instead.
sqlite_path (‘database.sqlite3’)
URI of the database file. The file is assumed to be relative to the private_files directory, though an absolute path may be given.
dbhost (‘localhost’)
For mysql databases, the hostname to connect to.
dbname (None)
The name of the database to connect to.
dbuser (None)
The user name to connect to the database with.
dbpassword (None)
The password to use in conjunction with dbuser
keynum (‘0’)
The identification number of the key to use when storing encrypted data in the container.
secret (None)
The key corresponding to keynum. The key is read in plain text from the settings file and must be provided in order to use the app_cipher for managing encrypted data and secure hashing. Derived classes could use an alternative mechanism for reading the key, for example, using the keyring python module.
cipher (‘aes’)
The type of cipher to use. By default AESAppCipher is used which uses AES internally with a 256 bit key created by computing the SHA256 digest of the secret string. The only other supported value is ‘plaintext’ which does not provide any encryption but allows the app_cipher object to be used in cases where encryption may or may not be used depending on the deployment environment. For example, it is often useful to turn off encryption in a development environment!
when (None)

An optional value indicating when the specified secret comes into operation. The value should be a fully specified time point in ISO format with timezone offset, such as ‘2015-01-01T09:00:00-05:00’. This value is used when the application is being restarted after a key change, for details see AppCipher.change_key().

The use of AES requires the PyCrypto module to be installed.
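
As an illustration, the settings above might be grouped under the reserved key as follows. This sketch assumes the settings file is a JSON dictionary keyed by class name, which is not stated in this section; the metadata file name and the secret are placeholders.

```python
import json

# All values are illustrative; omitted settings take their defaults.
settings = {
    "WSGIDataApp": {
        "metadata": "metadata.xml",          # hypothetical file name
        "source_type": "sqlite",
        "sqlite_path": "database.sqlite3",
        "keynum": "1",
        "secret": "replace-with-a-real-secret",
        "cipher": "aes",
        "when": "2015-01-01T09:00:00-05:00",
    }
}

settings_text = json.dumps(settings, indent=4, sort_keys=True)
print(settings_text)
```

Because the secret is stored in plain text, a file like this should be readable only by the account that runs the application.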

classmethod add_options(parser)

Adds the following options:

-s, --sqlout
Print the suggested SQL database schema and then exit. The setting of --create is ignored.
--create_tables
Create tables in the database.
-m, --memory
Use an in-memory SQLite database. Overrides any source_type and encryption setting values. Implies --create_tables.

metadata = None

the metadata document for the underlying data service

data_source = None

the data source object for the underlying data service. The type of this object will vary depending on the source type. For SQL-type containers this will be an instance of a class derived from SQLEntityContainer.

container = None

the entity container (cf database)

classmethod setup(options=None, args=None, **kwargs)

Adds database initialisation

Loads the metadata document. Creates the data_source according to the configured settings (creating the tables only if requested in the command line options). Finally sets the container to the entity container for the application.

If the -s or --sqlout option is given in options then the data source’s create table script is output to standard output and sys.exit(0) is used to terminate the process.

classmethod new_app_cipher()

Creates an AppCipher instance

This method is called automatically on construction. You won’t normally need to call it yourself but you may do so, for example, when writing a script that requires access to data encrypted by the application.

If there is no ‘secret’ defined then None is returned.

Reads the values from the settings file and creates an instance of the appropriate class based on the cipher setting value. The cipher uses the ‘AppKeys’ entity set in container to store information about expired keys. The AppKey entities have the following three properties:

KeyNum (integer key)
The key identification number
KeyString (string)

The encrypted secret, for example:

'1:OBimcmOesYOt021NuPXTP01MoBOCSgviOpIL'

The number before the colon is the key identification number of the secret used to encrypt the string (and will always be different from the KeyNum field of course). The data after the colon is the base-64 encoded encrypted string. The same format is used for all data encrypted by AppCipher objects. In this case the secret was the word ‘secret’ and the algorithm used is AES.

Expires (DateTime)
The UTC time at which this secret will expire. After this time a newer key should be used for encrypting data though this key may of course still be used for decrypting data.

app_cipher = None

the application’s cipher, an AppCipher instance.

pyslet.wsgi.session_decorator(page_method)

Decorates a web method with session handling

page_method
An unbound method with signature: page_method(obj, context) which performs the WSGI protocol and returns the page generator.

Our decorator just calls SessionContext.session_wrapper().

class pyslet.wsgi.SessionContext(environ, start_response, canonical_root=None)

Bases: pyslet.wsgi.WSGIContext

Extends the base class with a session object.

session = None

a session object, or None if no session available

start_response()

Saves the session cookie.

class pyslet.wsgi.SessionApp(**kwargs)

Bases: pyslet.wsgi.WSGIDataApp

Extends WSGIDataApp to include session handling.

These sessions require support for cookies. The SessionApp class itself uses two cookies purely for session tracking.

The key ‘SessionApp’ is reserved for settings defined by this class in the settings file. The defined settings are:

timeout (600)
The number of seconds after which an inactive session will time out and no longer be accessible to the client.
cookie (‘sid’)
The name of the session cookie.
cookie_test (‘ctest’)
The name of the test cookie. This cookie is set with a longer lifetime and acts both as a test of whether cookies are supported or not and can double up as an indicator of whether user consent has been obtained for any extended use of cookies. It defaults to the value ‘0’, indicating that cookies can be stored but that no special consent has been obtained.
cookie_test_age (8640000)
The age of the test cookie (in seconds). The default value is equivalent to 100 days. If you use the test cookie to record consent to some cookie policy you should ensure that when you set the value you use a reasonable lifespan.
csrftoken (‘csrftoken’)
The name of the form field containing the CSRF token.

classmethod setup(options=None, args=None, **kwargs)

Adds database initialisation

csrf_token = None

The name of our CSRF token

ContextClass

Extended context class

alias of SessionContext

SessionClass

The session class to use, must be (derived from) Session

alias of CookieSession

init_dispatcher()

Adds pre-defined pages for this application

These pages are mapped to /ctest and /wlaunch. These names are not currently configurable. See ctest() and wlaunch() for more information.

session_wrapper(context, page_method)

Called by the session_decorator

Uses set_session() to ensure the context has a session object. If this request is a POST then the form is parsed and the CSRF token checked for validity.

set_session(context)

Sets the session object in the context

The session is read from the session cookie, established and marked as being seen now. If no cookie is found a new session is created. In both cases a cookie header is set to update the cookie in the browser.

Adds the session cookie to the response headers

The cookie is bound to the path returned by WSGIContext.get_app_root() and is marked as being http_only and is marked secure if we have been accessed through an https URL.

You won’t normally have to call this method but you may want to override it if your application wishes to override the cookie settings.

Removes the session cookie

Adds the test cookie

establish_session(context)

Mark the session as established

This will update the session ID. Override this method to update any data store accordingly if you are already associating protected information with the session, to prevent it becoming orphaned.

merge_session(context, merge_session)

Merges a session into the session in the context

Override this method to update any data store. If you are already associating protected information with merge_session you need to transfer it to the context session.

The default implementation does nothing and merge_session is simply discarded.

session_page(context, page_method, return_path)

Returns a session protected page

context
The WSGIContext object
page_method

A function or bound method that will handle the page. Must have the signature:

page_method(context)

and return the generator for the page as per the WSGI specification.

return_path
A pyslet.rfc2396.URI instance pointing at the page that will be returned by page_method, used if the session is not established yet and a redirect to the test page needs to be implemented.

This method is only called after the session has been created, in other words, context.session must be a valid session.

This method either calls the page_method (after ensuring that the session is established) or initiates a redirection sequence which culminates in a request to return_path.

ctest(context)

The cookie test handler

This page takes three required query parameters and one optional parameter:

return
The return URL the user originally requested
s
The session that should be received in a cookie
sig
The session signature which includes the User-Agent at the end of the message.
framed (optional)
An optional parameter, if present and equal to ‘1’ it means we’ve already attempted to load the page in a new window so if we still can’t read cookies we’ll return the cfail_page().

If cookies cannot be read back from the context this page will call the ctest_page() to provide an opportunity to open the application in a new window (or cfail_page() if this possibility has already been exhausted).

If cookies are successfully read, they are compared with the expected values (from the query) and the user is returned to the return URL with an automatic redirect. The return URL must be within the same application (to prevent ‘open redirect’ issues) and, to be extra safe, we change the user-visible session ID as we’ve exposed the previous value in the URL which makes it more liable to snooping.

ctest_page(context, target_url, return_url, s, sig)

Returns the cookie test page

Called when cookies are blocked (perhaps in a frame).

context
The request context
target_url
A string containing the base link to the wlaunch page. This page can be opened in a new window (which may get around the cookie restrictions). You must pass the return_url and the sid values as the ‘return’ and ‘sid’ query parameters respectively.
return_url
A string containing the URL the user originally requested, and the location they should be returned to when the session is established.
s
The session
sig
The session signature

You may want to override this implementation to provide a more sophisticated page. The default simply presents the target_url with added “return”, “s” and “sig” parameters as a simple hypertext link that will open in a new window.

A more sophisticated application might render a button or a form but bear in mind that browsers that cause this page to load are likely to prevent automated ways of opening this link.

wlaunch(context)

Handles redirection to a new window

The query parameters must contain:

return
The return URL the user originally requested
s
The session that should also be received in a cookie
sig
The signature of the session, return URL and User-Agent

This page initiates the redirect sequence again, but this time setting the framed query parameter to prevent infinite redirection loops.

cfail_page(context)

Called when cookies are blocked completely.

The default simply returns a plain text message stating that cookies are blocked. You may want to include a page here with information about how to enable cookies, a link to the privacy policy for your application to help people make an informed decision to turn on cookies, etc.

check_redirect(context, target_path)

Checks a target path for an open redirect

target_path
A string or URI instance.

Returns True if the redirect is safe.

The test ensures that the canonical root of our application matches the canonical root of the target. In other words, it must have the same scheme and matching authority (host/port).
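
The comparison can be sketched with the standard library as follows. This is an illustration of the idea, not the method's implementation: canonical_root and is_safe_redirect are hypothetical helpers, and filling in default ports for http and https is an assumption about what ‘canonical’ means here.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_root(url):
    """Reduce a URL to a comparable (scheme, host, port) triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname or "", port)


def is_safe_redirect(app_root, target):
    """True if target has the same scheme and authority as app_root."""
    return canonical_root(app_root) == canonical_root(target)


root = "https://example.com/app/"
print(is_safe_redirect(root, "https://EXAMPLE.com:443/app/page"))
print(is_safe_redirect(root, "https://evil.example.org/app/page"))
```

This sketch only handles absolute URLs. A relative target would need to be resolved against the application root before being compared.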

class pyslet.wsgi.AppCipher(key_num, key, key_set, when=None)

Bases: object

A cipher for encrypting application data

key_num
A key number
key
A binary string containing the application key.
key_set
An entity set used to store previous keys. The entity set must have an integer key property ‘KeyNum’ and a string field ‘KeyString’. The string field must be large enough to contain encrypted versions of previous keys.
when (None)
A fully specified pyslet.iso8601.TimePoint at which time the key will become active. If None, the key is active straight away. Otherwise, the key_set is searched for a key that is still active and that key is used when encrypting data until the when time, at which point the given key takes over.

The object wraps an underlying cipher. Strings are encrypted using the cipher and then encoded using base64. The output is then prefixed with an ASCII representation of the key number (key_num) followed by a ‘:’. For example, if key_num is 7 and the cipher is plain-text (the default) then encrypt(“Hello”) results in:

"7:SGVsbG8="

When decrypting a string, the key number is parsed and matched against the key_num of the key currently in force. If the string was encrypted with a different key then the key_set is used to look up that key (which is itself encrypted of course). The process continues until a key encrypted with key_num is found.

The upshot of this process is that you can change the key associated with an application. See change_key() for details.
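
The string format for the plain-text case can be reproduced with the standard library alone. The helper names below are hypothetical and the sketch only mirrors the format described above (key number, colon, base64 data). It does not use the AppCipher class.

```python
import base64


def plaintext_encrypt(key_num, data):
    """Prefix base64-encoded data with the key number and a colon."""
    return "%i:%s" % (key_num, base64.b64encode(data).decode("ascii"))


def split_key_num(text):
    """Return (key_num, raw_bytes) parsed from a string in the same format."""
    num, _, payload = text.partition(":")
    return int(num), base64.b64decode(payload)


print(plaintext_encrypt(7, b"Hello"))   # 7:SGVsbG8=
print(split_key_num("7:SGVsbG8="))      # (7, b'Hello')
```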

MAX_AGE = 100

the maximum age of a key, which is the number of times the key can be changed before the original key is considered too old to be used for decryption.

new_cipher(key)

Returns a new cipher object with the given key

The default implementation creates a plain-text ‘cipher’ and is not suitable for secure use of encrypt/decrypt but, with a sufficiently good key, may still be used for hashing.

change_key(key_num, key, when)

Changes the key of this application.

key_num
The number given to the new key, must differ from the last MAX_AGE key numbers.
key
A binary string containing the new application key.
when
A fully specified pyslet.iso8601.TimePoint at which point the new key will come into effect.

Many organizations have a policy of changing keys on a routine basis, for example, to ensure that people who have had temporary access to the key only have temporary access to the data it protects. This method makes it easier to implement such a policy for applications that use the AppCipher class.

The existing key is encrypted with the new key and a record is written to the key_set to record the existing key number, the encrypted key string and the when time, which is treated as an expiry time in this context.

This procedure ensures that strings encrypted with an old key can always be decrypted because the value of the old key can be looked up. Although it is encrypted, it will be encrypted with a new(er) key and the procedure can be repeated as necessary until a key encrypted with the newest key is found.

The key change process then becomes:

  1. Start a utility process connected to the application’s entity container using the existing key and then call the change_key method. Pass a value for when that will give you time to reconfigure all AppCipher clients. Assuming the key change is planned, a time in hours or even days ahead can be used.
  2. Update or reconfigure all existing applications so that they will be initialised with the new key and the same value for when next time they are restarted.
  3. Restart/refresh all running applications before the changeover time. As this does not need to be done simultaneously, a load balanced set of application servers can be cycled on a schedule to ensure continuous running.

Following a key change the entity container will still contain data encrypted with old keys and the architecture is such that compromise of a key is sufficient to read all encrypted data with that key and all previous keys. Therefore, changing the key only protects new data.

In situations where policy dictates a key change it might make sense to add a facility to the application for re-encrypting data in the data store by going through a read-decrypt/encrypt-write cycle with each protected data field. Of course, the old key could still be used to decrypt this information from archived backups of the data store. Alternatively, if the protected data is itself subject to change on a routine basis you may simply rely on the natural turnover of data in the application. The strategy you choose will depend on your application.

The MAX_AGE attribute determines the maximum number of keys that can be in use in the data set simultaneously. Eventually you will have to update encrypted data in the data store.
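
The way old keys are recovered through the chain can be sketched with a toy model. Everything here is hypothetical: a dictionary stands in for the key_set entity set, and XOR stands in for the real cipher purely so that the example is self-contained. XOR with a repeating key is not secure and is not what AppCipher does.

```python
import base64


def xor_bytes(key, data):
    # Toy stand-in for a real cipher: XOR is its own inverse. NOT secure.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def toy_encrypt(key_num, key, data):
    payload = base64.b64encode(xor_bytes(key, data)).decode("ascii")
    return "%i:%s" % (key_num, payload)


def toy_change_key(key_set, old_num, old_key, new_num, new_key):
    # Record the old key, encrypted with the new key, under the old number.
    key_set[old_num] = toy_encrypt(new_num, new_key, old_key)


def toy_decrypt(cur_num, cur_key, key_set, text):
    num, _, payload = text.partition(":")
    num = int(num)
    if num == cur_num:
        key = cur_key
    else:
        # Recover the older key first; it is stored under a newer key.
        key = toy_decrypt(cur_num, cur_key, key_set, key_set[num])
    return xor_bytes(key, base64.b64decode(payload))


key_set = {}
token = toy_encrypt(1, b"first", b"payload")          # written under key 1
toy_change_key(key_set, 1, b"first", 2, b"second")    # key 1 -> key 2
toy_change_key(key_set, 2, b"second", 3, b"third")    # key 2 -> key 3
recovered = toy_decrypt(3, b"third", key_set, token)
print(recovered)
```

Only the newest key is supplied to toy_decrypt. The two older keys are recovered by following the records in key_set, which also shows why compromise of the current key exposes data protected by all previous keys.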

encrypt(data)

Encrypts data with the current key.

data
A binary input string.

Returns a character string of ASCII characters suitable for storage.

decrypt(data)

Decrypts data.

data
A character string containing the encrypted data

Returns a binary string containing the decrypted data.

sign(message)

Signs a message with the current key.

message
A binary message string.

Returns a character string of ASCII characters containing a signature of the message. It is recommended that character strings are encoded using UTF-8 before signing.

check_signature(signature, message=None)

Checks a signature returned by sign

signature
The ASCII signature to be checked for validity.
message
A binary message string. This is optional: if None then the message will be extracted from the signature string (reversing ascii_sign).

On success the method returns the validated message (a binary string) and on failure it raises ValueError.

ascii_sign(message)

Signs a message with the current key

message
A binary message string

The difference between ascii_sign and sign is that ascii_sign returns the entire message, including the signature, as a URI-encoded character string suitable for storage and/or transmission.

The message is %-encoded (as implemented by pyslet.rfc2396.escape_data()). You may apply the corresponding unescape data function to the entire string to get a binary string that contains an exact copy of the original data.

class pyslet.wsgi.AESAppCipher(key_num, key, key_set, when=None)

Bases: pyslet.wsgi.AppCipher

A cipher object that uses AES to encrypt the data

The PyCrypto module must be installed to use this class.

The key is hashed using the SHA256 algorithm to obtain a 32 byte value for the AES key. The encrypted strings contain random initialisation vectors so repeated calls won’t generate the same encrypted values. The CFB mode of operation is used.
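
The key derivation step can be checked with the standard library. The sketch below covers only the hashing of the secret. The AES encryption itself requires PyCrypto and is not shown.

```python
import hashlib

secret = b"secret"                         # the application key (example value)
aes_key = hashlib.sha256(secret).digest()  # raw digest, not the hex string

print(len(aes_key))        # 32 bytes
print(len(aes_key) * 8)    # 256 bits
```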

6.1.6.1. Utility Functions

pyslet.wsgi.generate_key(key_length=128)

Generates a new key

key_length
The minimum key length in bits. Defaults to 128.

The key is returned as a sequence of 16-bit hexadecimal strings separated by ‘.’ to make them easier to read and transcribe into other systems.
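
A key in this general shape could be produced as sketched below. This is not the library's implementation: generate_key_sketch is a hypothetical name, and the use of os.urandom, lower-case hex digits and rounding up to whole 16-bit groups are all assumptions.

```python
import binascii
import os


def generate_key_sketch(key_length=128):
    """Return random 16-bit hex groups joined by '.', at least key_length bits."""
    n_groups = (key_length + 15) // 16      # round up to whole 16-bit groups
    raw = os.urandom(2 * n_groups)          # 2 bytes per group
    hex_digits = binascii.hexlify(raw).decode("ascii")
    return ".".join(hex_digits[i:i + 4]
                    for i in range(0, len(hex_digits), 4))


key = generate_key_sketch()
print(key)   # eight groups of four hex digits; the value differs on every run
```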

pyslet.wsgi.key60(src)

Generates a non-negative 60-bit long from a source string.

src
A binary string.

The idea behind this function is to create an (almost) unique integer from a given string. The integer can then be used as the key field of an associated entity without having to create foreign keys that are long strings. There is of course a small chance that two source strings will result in the same integer.

The integer is calculated by truncating the SHA256 hexdigest to 15 characters (60-bits) and then converting to long. Future versions of Python promise improvements here, which would allow us to squeeze an extra 3 bits using int.from_bytes but alas, not in Python 2.x.
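
The calculation described above amounts to a couple of lines. The function below is a sketch under a different name (key60_sketch) so that it is not mistaken for the library function. In Python 3 the result is an int rather than a long.

```python
import hashlib


def key60_sketch(src):
    """Return a non-negative integer below 2**60 derived from a binary string."""
    # 15 hex characters carry 15 * 4 = 60 bits.
    return int(hashlib.sha256(src).hexdigest()[:15], 16)


value = key60_sketch(b"some source string")
print(value < 2 ** 60)
```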

6.1.6.2. Exceptions

If thrown while handling a WSGI request these errors will be caught by the underlying handlers and generate calls to WSGIApp.error_page() with an appropriate 4xx response code.

class pyslet.wsgi.BadRequest

Bases: exceptions.Exception

An exception that will generate a 400 response code

class pyslet.wsgi.PageNotAuthorized

Bases: pyslet.wsgi.BadRequest

An exception that will generate a 403 response code

class pyslet.wsgi.PageNotFound

Bases: pyslet.wsgi.BadRequest

An exception that will generate a 404 response code

class pyslet.wsgi.MethodNotAllowed

Bases: pyslet.wsgi.BadRequest

An exception that will generate a 405 response code

Other sub-classes of Exception are caught and generate 500 errors:

class pyslet.wsgi.SessionError

Bases: exceptions.RuntimeError

Unexpected session handling error