5.4. HTTP Protocol Parameters

5.4.1. URLs

class pyslet.http.params.HTTPURL(octets='http://localhost/')

Bases: pyslet.rfc2396.ServerBasedURL

Represents http URLs

DEFAULT_PORT = 80

the default HTTP port

canonicalize()

Returns a canonical form of this URI

This method is almost identical to the implementation in ServerBasedURL except that a missing path is replaced by ‘/’ in keeping with rules for making HTTP requests.
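The missing-path rule can be illustrated with the standard library alone. The sketch below is not pyslet's implementation: the function name canonical_http_url is hypothetical, and the lower-casing of the scheme and host and the removal of a default port are assumptions about what a canonical form usually involves. Userinfo and IPv6 literals are ignored for brevity.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports taken from the DEFAULT_PORT values documented in this section.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_http_url(url):
    """Hypothetical helper: normalizes an http(s) URL, replacing a missing path with '/'."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lower-cases hostname
    netloc = host
    # Assumption: a port equal to the scheme default is dropped.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = "%s:%d" % (host, parts.port)
    # The rule described above: an empty path becomes '/'.
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


print(canonical_http_url("HTTP://Example.COM"))  # http://example.com/
```

The empty path matters because an HTTP request line always needs a non-empty request target.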

class pyslet.http.params.HTTPSURL(octets='https://localhost/')

Bases: pyslet.http.params.HTTPURL

Represents https URLs

DEFAULT_PORT = 443

the default HTTPS port

5.4.2. Parameters

This module defines classes and functions for handling basic parameters used by HTTP. Refer to Section 3 of RFC2616 for details.

The approach taken by this module is to provide classes for each of the parameter types. Most classes have a class method from_str, which returns a new instance parsed from a string; it is the inverse of the to_bytes method. In all cases, string arguments provided on construction should be binary strings, not character strings.

Instances are generally immutable objects which is consistent with them representing values of parameters in the protocol. They can be used as values in dictionaries (__hash__ is defined) and comparison methods are also provided, including inequalities where a logical ordering exists.

class pyslet.http.params.Parameter

Bases: object

Abstract base class for HTTP Parameters

Provides conversion to strings based on the to_bytes() method. In Python 2, also provides conversion to the unicode string type. In Python 3, implements __bytes__ so that bytes(parameter) can be used; this idiom is portable because in Python 2 __str__ is also mapped to to_bytes.

The HTTP grammar and the parsers and classes that implement it all use binary strings but usage of byte values outside those of the US ASCII codepoints is discouraged and unlikely to be portable between systems.

When required, Pyslet converts to character strings using the ISO-8859-1 codec. This ensures that the conversions never generate unicode decoding errors and is consistent with the text of RFC2616.
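The reason ISO-8859-1 never fails can be checked directly in Python 3, without pyslet. In the sketch below the helper names to_text and to_binary are hypothetical; the point is only that every one of the 256 byte values maps to exactly one character and back again, which is not true of a codec such as UTF-8.

```python
def to_text(data):
    """Decodes a binary string; ISO-8859-1 maps each byte to one code point."""
    return data.decode("iso-8859-1")


def to_binary(text):
    """Encodes a character string; fails only for code points above U+00FF."""
    return text.encode("iso-8859-1")


every_byte = bytes(range(256))

# Decoding any byte sequence succeeds and the round trip is lossless.
round_trip = to_binary(to_text(every_byte))
print(round_trip == every_byte)  # True

# By contrast, the same bytes are not valid UTF-8.
try:
    every_byte.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)  # False
```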

As the purpose of these modules is to provide a way to use HTTP constructs in other contexts too, parameters use character strings where possible. Therefore, if an attribute must represent a token then it is converted to a character string and must therefore be compared using character strings and not binary strings. For example, see the type and subtype attributes of MediaType. Similarly where tokens are passed as arguments to constructors these must also be character strings.

Where an attribute may be, or may contain, a value that would be represented as a quoted string in the protocol then it is stored as a binary string. You need to take particular care with parameter lists as the parameter names are tokens so are character strings but the parameter values are binary strings. The distinction is lost in Python 2 but the following code snippet will behave unexpectedly in Python 3 so for future compatibility it is better to make usage explicit now:

Python 2.7
>>> from pyslet.http.params import MediaType
>>> t = MediaType.from_str("text/plain; charset=utf-8")
>>> "Yes" if t["charset"] == 'utf-8' else "No"
'Yes'
>>> "Yes" if t["charset"] == b'utf-8' else "No"
'Yes'

Python 3.5
>>> from pyslet.http.params import MediaType
>>> t = MediaType.from_str("text/plain; charset=utf-8")
>>> "Yes" if t["charset"] == 'utf-8' else "No"
'No'
>>> "Yes" if t["charset"] == b'utf-8' else "No"
'Yes'

Such values may be set using character strings, in which case ISO-8859-1 is used to encode them.

classmethod bstr(arg)

Returns arg as a binary string

classmethod bparameters(parameters)

Ensures parameter values are binary strings

to_bytes()

Returns a binary string representation of the parameter

This method should be used in preference to str for compatibility with Python 3.

class pyslet.http.params.SortableParameter

Bases: pyslet.py2.SortableMixin, pyslet.http.params.Parameter

Base class for sortable parameters

Inherits from SortableMixin allowing sorting to be implemented using a class-specific sortkey method implementation.

A __hash__ implementation that calls sortkey is also provided to enable instances to be used as dictionary keys.

otherkey(other)

Overridden to provide comparison with strings.

If other is of either character or binary string types then it is passed to the classmethod from_str which is assumed to return a new instance of the same class as self which can then be compared by the return value of sortkey.

This enables comparisons such as the following:

>>> t = MediaType.from_str("text/plain")
>>> t == "text/plain"
True
>>> t > "image/png"
True
>>> t < "video/mp4"
True

class pyslet.http.params.HTTPVersion(major=1, minor=None)

Bases: pyslet.http.params.SortableParameter

Represents the HTTP Version.

major
The (optional) major version as an int
minor
The (optional) minor version as an int

The default instance, HTTPVersion(), represents HTTP/1.1

HTTPVersion objects are sortable (such that 1.1 > 1.0 and 1.2 < 1.25).

On conversion to a string the output is of the form:

HTTP/<major>.<minor>
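The ordering follows from treating the major and minor numbers as separate integers, as RFC 2616 section 3.1 requires, rather than as a decimal fraction. The following stand-alone sketch shows the idea with plain tuples; the names parse_http_version and format_http_version are hypothetical and are not part of pyslet.

```python
import re


def parse_http_version(text):
    """Hypothetical helper: returns (major, minor) as a tuple of ints."""
    match = re.fullmatch(r"HTTP/(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("not an HTTP version: %r" % text)
    return (int(match.group(1)), int(match.group(2)))


def format_http_version(version):
    """Formats a (major, minor) tuple as HTTP/<major>.<minor>."""
    return "HTTP/%d.%d" % version


# Tuples compare component by component, so 1.10 is later than 1.9.
print(parse_http_version("HTTP/1.10") > parse_http_version("HTTP/1.9"))  # True
print(format_http_version((1, 1)))  # HTTP/1.1
```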

For convenience, the constants HTTP_1p1 and HTTP_1p0 are provided for comparisons, e.g.:

if HTTPVersion.from_str(version_str) < HTTP_1p1:
    # do something to support a legacy system...

major = None

major protocol version (read only)

minor = None

minor protocol version (read only)

classmethod from_str(source)

Constructs an HTTPVersion object from a string.

class pyslet.http.params.FullDate(src=None, date=None, time=None)

Bases: pyslet.http.params.Parameter, pyslet.iso8601.TimePoint

A special sub-class for HTTP-formatted dates

We extend the basic ISO TimePoint, mixing in the Parameter base class and providing an implementation of to_bytes.

The effect is to change the way instances are formatted while retaining other timepoint features, including comparisons. Take care not to pass an instance as an argument where a plain TimePoint is expected as unexpected formatting errors could result. You can always wrap an instance to convert between the two types:

>>> from pyslet.iso8601 import TimePoint
>>> from pyslet.http.params import FullDate
>>> eagle = TimePoint.from_str('1969-07-20T15:17:40-05:00')
>>> print(eagle)
1969-07-20T15:17:40-05:00
>>> eagle = FullDate(eagle)
>>> print(eagle)
Sun, 20 Jul 1969 20:17:40 GMT
>>> eagle = TimePoint(eagle)
>>> print(eagle)
1969-07-20T15:17:40-05:00

Notice that, when formatting, the date is always expressed in GMT as per the recommendation in the HTTP specification.

classmethod from_http_str(source)

Returns an instance parsed from an HTTP formatted string

There are three supported formats as described in the specification:

"Sun, 06 Nov 1994 08:49:37 GMT"
"Sunday, 06-Nov-94 08:49:37 GMT"
"Sun Nov  6 08:49:37 1994"

to_bytes()

Formats the instance according to RFC 1123

The format is as follows:

Sun, 06 Nov 1994 08:49:37 GMT

This format is also described in RFC2616 in the production rfc1123-date.
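For readers who want to see the format produced without pyslet, here is a minimal sketch using only the datetime module. The function name format_rfc1123 is hypothetical. English day and month names are taken from fixed lists rather than from strftime so that the output does not depend on the locale.

```python
from datetime import datetime, timedelta, timezone

WKDAY = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTH = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_rfc1123(moment):
    """Hypothetical helper: formats an aware datetime as an rfc1123-date in GMT."""
    utc = moment.astimezone(timezone.utc)
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        WKDAY[utc.weekday()],  # datetime.weekday() returns 0 for Monday
        utc.day,
        MONTH[utc.month - 1],
        utc.year,
        utc.hour,
        utc.minute,
        utc.second,
    )


# The same instant as the eagle example: 15:17:40 at UTC-05:00.
eagle = datetime(1969, 7, 20, 15, 17, 40,
                 tzinfo=timezone(timedelta(hours=-5)))
print(format_rfc1123(eagle))  # Sun, 20 Jul 1969 20:17:40 GMT
```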

class pyslet.http.params.TransferEncoding(token='chunked', parameters={})

Bases: pyslet.http.params.SortableParameter

Represents an HTTP transfer-encoding.

token
The transfer encoding identifier, defaults to “chunked”
parameters
A parameter dictionary mapping parameter names to tuples of strings: (parameter name, parameter value)

When sorted, the order in which parameters were parsed is ignored. Instances are sorted first by token and then by alphabetical parameter name/value pairs.
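One way to picture this ordering is as a sort key built from the lower-cased token followed by the parameter pairs in alphabetical order. The sketch below is an illustration only: encoding_sortkey is a hypothetical function, the parameters are simplified to a plain name-to-value dictionary of character strings, and the encoding and parameter names are made up.

```python
def encoding_sortkey(token, parameters):
    """Hypothetical sort key: token first, then name/value pairs in sorted order."""
    return (token.lower(), tuple(sorted(parameters.items())))


# The same parameters given in a different order produce the same key.
first = encoding_sortkey("x-custom", {"level": "9", "window": "15"})
second = encoding_sortkey("X-Custom", {"window": "15", "level": "9"})
print(first == second)  # True

# A different token dominates the comparison.
print(encoding_sortkey("chunked", {}) < first)  # True
```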

token = None

the lower-cased transfer-encoding token (defaults to “chunked”)

parameters = None

declared extension parameters

classmethod from_str(source)

Parses the transfer-encoding from a source string.

If the encoding is not parsed correctly BadSyntax is raised.

classmethod list_from_str(source)

Creates a list of transfer-encodings from a string

Transfer-encodings are comma-separated

class pyslet.http.params.Chunk(size=0, extensions=None)

Bases: pyslet.http.params.SortableParameter

Represents an HTTP chunk header

size
The size of this chunk (defaults to 0)
extensions
A parameter dictionary mapping parameter names to tuples of strings: (chunk-ext-name, chunk-ext-val)

For completeness, instances are sortable by size and then by alphabetical parameter name, value pairs.

size = None

the chunk-size

classmethod from_str(source)

Parses the chunk header from a source string of TEXT.

If the chunk header is not parsed correctly BadSyntax is raised. The header includes the chunk-size and any chunk-extension parameters but it does not include the trailing CRLF or the chunk-data.
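To make the shape of a chunk header concrete, the following sketch parses one by hand. It is a simplification and not pyslet's parser: parse_chunk_header is a hypothetical name, ValueError stands in for BadSyntax, extensions are returned as a plain dictionary, and quoted-string extension values (which may themselves contain a semicolon) are not handled.

```python
import re


def parse_chunk_header(line):
    """Hypothetical helper: returns (size, extensions) from a chunk header line.

    The line must not include the trailing CRLF or any chunk-data.
    """
    fields = [field.strip() for field in line.split(";")]
    # chunk-size is one or more hexadecimal digits.
    if not re.fullmatch(r"[0-9A-Fa-f]+", fields[0]):
        raise ValueError("bad chunk-size: %r" % fields[0])
    size = int(fields[0], 16)
    extensions = {}
    for field in fields[1:]:
        # chunk-ext-val is optional, so a bare name maps to None.
        name, sep, value = field.partition("=")
        extensions[name] = value if sep else None
    return size, extensions


print(parse_chunk_header("1A;name=value;flag"))
# (26, {'name': 'value', 'flag': None})
```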

class pyslet.http.params.MediaType(type='application', subtype='octet-stream', parameters={})

Bases: pyslet.http.params.SortableParameter

Represents an HTTP media-type.

The built-in str function can be used to format instances according to the grammar defined in the specification.

type
The type code string, defaults to ‘application’
subtype
The sub-type code, defaults to ‘octet-stream’
parameters
A dictionary such as would be returned by grammar.WordParser.parse_parameters() containing the media type’s parameters.

Instances are immutable and support parameter value access by lower-case key (as a character string), returning the corresponding value or raising KeyError. E.g., mtype[‘charset’]

Instances also define comparison methods and a hash implementation. Media-types are compared by (lower case) type, subtype and ultimately parameters.

classmethod from_str(source)

Creates a media-type from a source string.

Enforces the following rule from the specification:

Linear white space (LWS) MUST NOT be used between the type and subtype, nor between an attribute and its value

The source may be either characters or bytes. Character strings must consist of iso-8859-1 characters only and should be plain ascii.

class pyslet.http.params.ProductToken(token=None, version=None)

Bases: pyslet.http.params.SortableParameter

Represents an HTTP product token.

The comparison operations use a more interesting sort than plain text on version in order to provide a more intuitive ordering. As it is common practice to use dotted decimal notation for versions (with some alphanumeric modifiers) the version string is exploded (see explode()) internally on construction and this exploded value is used in comparisons. The upshot is that version 1.0.3 sorts before 1.0.10 as you would expect and 1.0a < 1.0 < 1.0.3a3 < 1.0.3a20 < 1.0.3b1 < 1.0.3. There are limits to this algorithm: 1.0dev > 1.0b1 even though it looks like it should be the other way around. Similarly, 1.0-live < 1.0-prod, etc.

You shouldn’t use this comparison as a definitive way to determine that one release is more recent or up-to-date than another unless you know that the product in question uses a numbering scheme compatible with these rules. On the other hand, it can be useful when sorting lists for human consumption.

token = None

the product’s token

version = None

the product’s version

classmethod explode(version)

Returns an exploded version string.

Version strings are split by dot and then by runs of non-digit characters resulting in a list of tuples. Numbers that have no modifier are treated as if they had a ~ suffix. This ensures that when sorting, 1.0 > 1.0a (i.e., qualifiers indicate earlier releases, ~ being the ASCII character with the largest codepoint).

Examples will help:

explode("2.15")==((2, "~"),(15, "~"))
explode("2.17b3")==((2, "~"),(17, "b", 3, "~"))
explode("2.b3")==((2, "~"),(-1, "b", 3, "~"))

Note that a missing leading numeric component is treated as -1 to force “a3” to sort before “0a3”.
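The rules above are compact enough to re-implement as a stand-alone sketch. The function below, explode_version, is a hypothetical reconstruction written to reproduce the three examples; it is not copied from pyslet and may differ from the real explode() on inputs the examples do not cover.

```python
import re


def explode_version(version):
    """Hypothetical re-implementation of the explode rules described above."""
    exploded = []
    for component in version.split("."):
        parts = []
        # Each match is a run of digits followed by a run of non-digits;
        # either run may be empty.
        for digits, modifier in re.findall(r"(\d*)(\D*)", component):
            if not digits and not modifier:
                continue  # the empty match at the end of the component
            # A missing leading number is treated as -1.
            parts.append(int(digits) if digits else -1)
            # A number without a modifier gets the '~' suffix.
            parts.append(modifier if modifier else "~")
        exploded.append(tuple(parts))
    return tuple(exploded)


versions = ["1.0.3", "1.0", "1.0.3b1", "1.0a", "1.0.3a20", "1.0.3a3"]
print(sorted(versions, key=explode_version))
# ['1.0a', '1.0', '1.0.3a3', '1.0.3a20', '1.0.3b1', '1.0.3']
```

Because integers and strings always alternate within each inner tuple, the tuples remain comparable in Python 3, where comparing an int with a str would otherwise raise TypeError.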

classmethod from_str(source)

Creates a product token from a source string.

classmethod list_from_str(source)

Creates a list of product tokens from a source string.

Individual tokens are separated by white space.

class pyslet.http.params.LanguageTag(primary, *subtags)

Bases: pyslet.http.params.SortableParameter

Represents an HTTP language-tag.

Language tags are compared by lower casing all components and then sorting by primary tag, then by each sub-tag. Note that en sorts before en-US.

partial_match(range)

True if this tag is a partial match against range

range
A tuple of lower-cased subtags. An empty tuple matches all instances.

For example:

lang=LanguageTag("en","US","Texas")
lang.partial_match(())==True
lang.partial_match(("en",))==True
lang.partial_match(("en","us"))==True
lang.partial_match(("en","us","texas"))==True
lang.partial_match(("en","gb"))==False
lang.partial_match(("en","us","tex"))==False
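The matching rule amounts to a prefix test on the tuple of lower-cased subtags, which also explains why en sorts before en-US. The sketch below demonstrates this with plain tuples; language_key and partial_match are hypothetical stand-alone functions, not the pyslet methods.

```python
def language_key(primary, *subtags):
    """Hypothetical helper: the lower-cased tuple used for sorting and matching."""
    return tuple(part.lower() for part in (primary,) + subtags)


def partial_match(key, range_):
    """True if range_ is a prefix of key; whole subtags must match exactly."""
    return key[:len(range_)] == tuple(range_)


lang = language_key("en", "US", "Texas")
print(partial_match(lang, ("en", "us")))         # True
print(partial_match(lang, ("en", "us", "tex")))  # False

# A tuple that is a strict prefix of another sorts first.
print(language_key("en") < language_key("en", "US"))  # True
```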
classmethod from_str(source)

Creates a language tag from a source string.

Enforces the following rules from the specification:

White space is not allowed within the tag

classmethod list_from_str(source)

Creates a list of language tags from a source string.

class pyslet.http.params.EntityTag(tag, weak=True)

Bases: pyslet.http.params.SortableParameter

Represents an HTTP entity-tag.

tag
The opaque tag
weak
A boolean indicating if the entity-tag is a weak or strong entity tag. Defaults to True.

Instances are compared by tag and then, if the tags match, by whether the tag is weak or not.

weak = None

True if this is a weak tag

tag = None

the opaque tag

classmethod from_str(source)

Creates an entity-tag from a source string.
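In RFC 2616 an entity-tag is an optional W/ prefix followed by a quoted string. The following sketch shows that structure with a regular expression. It is a simplified illustration, not pyslet's parser: parse_entity_tag is a hypothetical name, ValueError stands in for BadSyntax, and backslash escapes inside the quoted string are rejected rather than decoded.

```python
import re

# Optional weak prefix, then a quoted string without quotes or backslashes.
_ENTITY_TAG = re.compile(r'(W/)?"([^"\\]*)"')


def parse_entity_tag(text):
    """Hypothetical helper: returns (tag, weak) for an entity-tag string."""
    match = _ENTITY_TAG.fullmatch(text)
    if match is None:
        raise ValueError("not an entity-tag: %r" % text)
    return match.group(2), match.group(1) is not None


print(parse_entity_tag('W/"xyzzy"'))  # ('xyzzy', True)
print(parse_entity_tag('"xyzzy"'))    # ('xyzzy', False)
```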

5.4.2.1. Parsing Parameter Values

In most cases parameter values will be parsed directly by the class methods provided in the parameter types themselves. For completeness a parameter parser is exposed to enable you to parse these values from more complex strings.

class pyslet.http.params.ParameterParser(source, ignore_sp=True)

Bases: pyslet.http.grammar.WordParser

An extended parser for parameter values

This parser defines attributes for dealing with English date names that are useful beyond the basic parsing functions to allow the formatting of date information in English regardless of the locale.

require_http_version()

Parses an HTTPVersion instance

Returns an HTTPVersion instance.

wkday = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

A list of English day-of-week abbreviations: wkday[0] == “Mon”, etc.

weekday = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

A list of English day-of-week full names: weekday[0] == “Monday”

month = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

A list of English month names: month[0] == “Jan”, etc.

require_fulldate()

Parses a FullDate instance.

Returns a FullDate instance or raises BadSyntax if none is found.

parse_delta_seconds()

Parses a delta-seconds value, see WordParser.parse_integer()

parse_charset()

Parses a charset, see WordParser.parse_tokenlower()

parse_content_coding()

Parses a content-coding, see WordParser.parse_tokenlower()

require_transfer_encoding()

Parses a TransferEncoding instance

require_chunk()

Parses a chunk header

Returns a Chunk instance.

require_media_type()

Parses a MediaType instance.

require_product_token()

Parses a ProductToken instance.

Raises BadSyntax if no product token was found.

parse_qvalue()

Parses a qvalue returning a float

Returns None if no qvalue was found.
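For reference, RFC 2616 section 3.9 restricts a qvalue to the range 0 to 1 with at most three digits after the decimal point. The sketch below applies that grammar to a whole string; it is an assumption-laden stand-in (the function name parse_qvalue_text is hypothetical), whereas the parser method works from its current position in a larger source.

```python
import re

# qvalue = ( "0" [ "." 0*3DIGIT ] ) | ( "1" [ "." 0*3("0") ] )
_QVALUE = re.compile(r"0(?:\.\d{0,3})?|1(?:\.0{0,3})?")


def parse_qvalue_text(text):
    """Hypothetical helper: returns the qvalue as a float, or None if invalid."""
    match = _QVALUE.fullmatch(text)
    return float(match.group(0)) if match else None


print(parse_qvalue_text("0.8"))     # 0.8
print(parse_qvalue_text("1.000"))   # 1.0
print(parse_qvalue_text("1.5"))     # None
print(parse_qvalue_text("0.1234"))  # None
```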

require_language_tag()

Parses a language tag returning a LanguageTag instance. Raises BadSyntax if no language tag was found.

require_entity_tag()

Parses an entity-tag returning an EntityTag instance. Raises BadSyntax if no entity-tag was found.