6.11. File System Abstraction

The purpose of this module is to provide an abstraction over the top of the native file system, potentially allowing alternative implementations to be provided in the future. This module was developed particularly with operating environments in mind where access to the file system is limited or not allowed. Pyslet modules that use these classes to access the file system can be easily repointed at some other implementation.

class pyslet.vfs.VirtualFilePath(*args)

Bases: pyslet.py2.SortableMixin

Abstract class representing a virtual file system

Instances represent paths within a file system. You can’t create an instance of VirtualFilePath directly; instead you must create instances using a class derived from it. (Do not call the __init__ method of VirtualFilePath from your derived classes.)

All instances are created from one or more strings, either byte strings or unicode strings, or existing instances. In the case of byte strings the encoding is assumed to be the default encoding of the file system. If multiple arguments are given then they are joined to make a single path using join().

Instances can be converted to either binary or character strings; use to_bytes() for the former. Note that the builtin str function returns a binary string in Python 2, not a character string.

Instances are immutable and can be used as keys in dictionaries. Instances must be from the same file system to be comparable; the unicode representation is used for the comparison.

An empty path is False, other paths are True. You can also compare a file path with a string (or unicode string) which is first converted to a file path instance.

fs_name = None

The name of the file system, must be overridden by derived classes.

The purpose of providing a name for a file system is to enable file systems to be mapped onto the authority (host) component of a file URL.

supports_unicode_filenames = False

Indicates whether this file system supports unicode file names natively. In general, you don’t need to worry about this as all methods that accept strings will accept either type of string and convert to the native representation.

When creating derived classes you must also override sep, curdir, pardir, ext, drive_sep (if applicable) and empty with the correct string types.

supports_unc = False

Indicates whether this file system supports UNC paths.

UNC paths are of the general form:

\\ComputerName\SharedFolder\Resource

This format is used in Microsoft Windows. See is_unc() for details.

supports_drives = False

Indicates whether this file system supports ‘drives’, i.e., is Windows-like in having drive letters that may prefix paths.

codec = 'utf-8'

The codec used by this file system

This codec is used to convert between byte strings and unicode strings. The default is utf-8.

sep = '/'

The path separator used by this file system

This is either a character or byte string, depending on the setting of supports_unicode_filenames.

curdir = '.'

The path component that represents the current directory

pardir = '..'

The path component that represents the parent directory

ext = '.'

The extension character

drive_sep = ':'

The drive separator

empty = ''

An empty path string (for use with join)

classmethod getcwd()

Returns an instance representing the working directory.

classmethod getcroot()

Returns an instance representing the current root.

UNIX users will find this odd but in other file systems there are multiple roots. Rather than invent an abstract concept of the root of roots we just accept that there can be more than one. (We might struggle to perform actions like listdir() on the root of roots.)

The current root is determined by stripping back the current working directory until it can no longer be split.
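The stripping-back idea can be sketched with the standard library alone. This is an illustration of the technique rather than Pyslet’s own code: the function name current_root is hypothetical, and plain strings from os.path stand in for path instances.

```python
import os


def current_root(start=None):
    """Strip trailing components from a path until no more can be split off."""
    path = os.path.abspath(start if start is not None else os.getcwd())
    while True:
        head, tail = os.path.split(path)
        if not tail:
            # Nothing left to remove: path is a root such as "/" or "C:\\"
            return path
        path = head


root = current_root()
```

On a UNIX-like system this always ends at the single root; on a system with drives the result depends on which drive holds the working directory, which is why there can be more than one root.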

classmethod mkdtemp(suffix='', prefix='')

Creates a temporary directory in the file system

Returns an instance representing the path to the new directory.

Similar to Python’s tempfile.mkdtemp. As with that function, the caller is responsible for cleaning up the directory, which can be done with rmtree().

classmethod path_str(arg)

Converts a single argument to the correct string type

File systems can use either binary or character strings and we convert between them using codec. This method takes either type of string or an existing instance and returns a path string of the correct type.
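A minimal sketch of this kind of conversion, assuming Python 3 string types. The function name to_path_string and its unicode_names parameter are hypothetical stand-ins for the class method and the supports_unicode_filenames attribute; handling of existing path instances is omitted.

```python
def to_path_string(arg, codec="utf-8", unicode_names=True):
    """Return arg as the string type the (hypothetical) file system expects."""
    if isinstance(arg, bytes):
        # Binary input: decode only if the file system wants character strings
        return arg.decode(codec) if unicode_names else arg
    if isinstance(arg, str):
        # Character input: encode only if the file system wants binary strings
        return arg if unicode_names else arg.encode(codec)
    raise TypeError("expected a byte string or a character string")


as_text = to_path_string(b"caf\xc3\xa9")
as_bytes = to_path_string("caf\u00e9", unicode_names=False)
```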

path = None

The path, either a character or a binary string

to_bytes()

Returns the binary string representation of the path.

sortkey()

Instances are sortable using character strings.

join(*components)

Returns a new instance by joining path components

Starting with the current instance, this method appends each component, returning a new instance representing the joined path. If components contains an absolute path then previous components, including the instance’s path, are discarded.

For details see Python’s os.path.join function.

For the benefit of derived classes a default implementation is provided.
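The rule about absolute components is easiest to see with the standard library function that join() is modelled on. The example uses posixpath so that the behaviour is the same on every platform; it shows plain strings, not VirtualFilePath instances.

```python
import posixpath

# Relative components are appended in order
appended = posixpath.join("/home/user", "docs", "notes.txt")

# An absolute component discards everything that came before it
restarted = posixpath.join("/home/user", "/etc", "hosts")
```

Here appended is "/home/user/docs/notes.txt" while restarted is "/etc/hosts".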

split()

Splits a path

Returns a tuple of two instances (head, tail) where tail is the last path component and head is everything leading up to it.

For details see Python’s os.path.split.

splitext()

Splits an extension from a path

Returns a tuple of (root, ext) where root is an instance containing just the root file path and ext is a string of characters representing the original path’s extension.

For details see Python’s os.path.splitext.

splitdrive()

Splits a drive designation

Returns a tuple of two instances (drive, tail) where drive is either a drive specification or is empty.

Default implementation uses the drive_sep to determine if the first path component is a drive.
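One plausible reading of that default can be sketched on plain strings. The function split_drive below is hypothetical and is not taken from Pyslet; it assumes that a drive is whatever precedes (and includes) the drive separator within the first path component.

```python
def split_drive(path, sep="/", drive_sep=":"):
    """Split a leading drive designation from path, if there is one."""
    first = path.split(sep, 1)[0]      # text before the first path separator
    pos = first.find(drive_sep)
    if pos < 0:
        return "", path                # no drive separator in first component
    cut = pos + len(drive_sep)
    return path[:cut], path[cut:]


with_drive = split_drive("C:/Users/demo")
without_drive = split_drive("/usr/local")
```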

splitunc()

Splits a UNC path

Returns a tuple of two instances (mount, path) where mount is an instance representing the UNC mount point or an instance representing the empty path if this isn’t a UNC path.

Default implementation checks for a double separator at the start of the path and at least one more separator.
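The check described above might look something like the following on plain strings. This is a sketch under assumptions: split_unc is a hypothetical name, the separator is a single character, and the mount point is taken to end at the separator that follows the shared folder name.

```python
def split_unc(path, sep="\\"):
    """Split a UNC mount point (\\\\host\\share) from the rest of path."""
    if not path.startswith(sep * 2) or path.startswith(sep * 3):
        return "", path                    # no leading double separator
    host_end = path.find(sep, 2)           # the "one more separator"
    if host_end == -1:
        return "", path                    # host only, no shared folder
    share_end = path.find(sep, host_end + 1)
    if share_end == -1:
        share_end = len(path)              # path is just the mount point
    return path[:share_end], path[share_end:]


mount, rest = split_unc(r"\\ComputerName\SharedFolder\Resource")
```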

abspath()

Returns an absolute path instance.

realpath()

Returns a real path, with any symbolic links removed.

The default implementation normalises the path using normpath() and normcase().

normpath()

Returns a normalised path instance.

normcase()

Returns a case-normalised path instance.

The default implementation returns the path unchanged.

is_unc()

Returns True if this path is a UNC path.

UNC paths contain a host designation. A path cannot contain a drive specification and also be a UNC path.

Default implementation calls splitunc() and returns True if the unc component is non-empty.

is_single_component()

Returns True if this path is a single, non-root, component.

E.g., tests that the path does not contain a slash (it may be empty)

is_empty()

Returns True if this path is empty

is_dirlike()

Returns True if this is a directory-like path.

E.g., test that the path ends in a slash (last component is empty).

is_root()

Returns True if this is a root path.

E.g., tests if it consists of just one or more slashes only (not counting any drive specification in file systems that support them).

isabs()

Returns True if the path is an absolute path.

stat()

Return information about the path.

exists()

Returns True if this is an existing item in the file system.

isfile()

Returns True if this is a regular file in the file system.

isdir()

Returns True if this is a directory in the file system.

open(mode='r')

Returns an open file-like object from this path.

copy(dst)

Copies a file to dst path like Python’s shutil.copy.

Note that you can’t copy between file system implementations.

move(dst)

Moves a file to dst path like Python’s os.rename.

remove()

Removes a file.

listdir()

List directory contents

Returns a list containing path instances of the entries in the directory.

chdir()

Changes the current working directory to this path

mkdir()

Creates a new directory at this path.

If an item at this path already exists OSError is raised. This method ignores any trailing separator.

makedirs()

Recursive directory creation function.

Like mkdir(), but makes all intermediate-level directories needed to contain the leaf directory.

The default implementation repeatedly uses a combination of split and mkdir.
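The split-and-mkdir approach can be sketched with the standard library. The function make_dirs is a hypothetical illustration working on plain strings with os.path.split and os.mkdir; it is not Pyslet’s implementation.

```python
import os


def make_dirs(path):
    """Create path, first creating any missing parent directories."""
    head, tail = os.path.split(path)
    if not tail:
        # Trailing separator: split again to find the real last component
        head, tail = os.path.split(head)
    if head and tail and not os.path.isdir(head):
        make_dirs(head)   # create the parent chain first
    os.mkdir(path)        # raises OSError if path already exists
```

As with mkdir(), the final call raises OSError when the leaf directory is already present.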

walk()

A generator function that walks the file system

Similar to os.walk. For each directory in the tree rooted at this path (including this path itself), it yields a 3-tuple of:

(dirpath, dirnames, filenames)

dirpath is an instance, dirnames and filenames are lists of path instances.
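For readers unfamiliar with this style of traversal, the following sketch uses os.walk itself to collect every file below a directory. The helper list_tree is hypothetical; note that os.walk yields plain strings, whereas walk() as described above yields path instances.

```python
import os


def list_tree(root):
    """Return the files below root as paths relative to root, top-down."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return found
```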

rmtree(ignore_errors=False)

Removes the tree rooted at this directory

ignore_errors can be used to ignore any errors from the file system.

6.11.1. Accessing the Local File System

class pyslet.vfs.OSFilePath(*path)

Bases: pyslet.vfs.VirtualFilePath

A concrete implementation mapping to Python’s os and os.path modules

In most cases the methods map straightforwardly to functions in os and os.path.

fs_name = ''

An empty string.

The file system name affects the way URIs are interpreted; an empty string is consistent with the use of file:/// to reference the local file system.

supports_unicode_filenames = False

Copied from os.path

That means you won’t know ahead of time whether paths are expected as binary or unicode strings. In most cases it won’t matter as the methods will convert as appropriate but it does affect the type of the static path constants defined below.

supports_unc = False

Automatically determined from os.path

Tests if os.path has defined splitunc.

supports_drives = False

Automatically determined

The method chosen is straight out of the documentation for os.path. We join the segments “C:” and “foo” and check to see if the result contains the path separator or not.
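The test can be reproduced against the two path flavours that ship with Python. The helper name has_drives is hypothetical; ntpath and posixpath are the standard library modules behind os.path on Windows and on POSIX systems respectively.

```python
import ntpath
import posixpath


def has_drives(pathmod):
    """True if joining "C:" and "foo" does not insert a path separator."""
    return pathmod.sep not in pathmod.join("C:", "foo")


windows_like = has_drives(ntpath)     # "C:foo": relative to drive C:
posix_like = has_drives(posixpath)    # "C:/foo": "C:" is an ordinary name
```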

codec = 'UTF-8'

as returned by sys.getfilesystemencoding()

sep = '/'

copied from os.sep

curdir = '.'

copied from os.curdir

pardir = '..'

copied from os.pardir

ext = '.'

copied from os.extsep

drive_sep = ':'

always set to ‘:’

Correctly set to either binary or character string depending on the setting of supports_unicode_filenames.

empty = ''

Set to the empty string

Uses either a binary or character string depending on the setting of supports_unicode_filenames.

6.11.2. Misc Definitions

class pyslet.vfs.ZipHooks

Bases: object

Context manager for compatibility with zipfile

The zipfile module allows you to write either a string or the contents of a named file to a zip archive. This class monkey-patches the builtin open function and os.stat with versions that support VirtualFilePath objects, allowing us to copy the contents of a file represented by a virtual file path directly to a zip archive without having to load it into memory first.

For more information on this approach see this blog post.

This implementation uses a lock on the class attributes to ensure thread safety.
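The general shape of such a context manager might be sketched as follows, assuming Python 3. The class PatchedOpenAndStat and its hook arguments are hypothetical and are not the ZipHooks implementation; the sketch only shows how a class-level lock can guard the swap and restore of builtins.open and os.stat.

```python
import builtins
import os
import threading


class PatchedOpenAndStat:
    """Temporarily replace builtins.open and os.stat (illustration only)."""

    _lock = threading.RLock()  # class attribute shared by all instances

    def __init__(self, open_hook, stat_hook):
        self.open_hook = open_hook
        self.stat_hook = stat_hook
        self._saved = None

    def __enter__(self):
        self._lock.acquire()
        # Remember whatever is installed now so it can be put back later
        self._saved = (builtins.open, os.stat)
        builtins.open = self.open_hook
        os.stat = self.stat_hook
        return self

    def __exit__(self, exc_type, exc, tb):
        builtins.open, os.stat = self._saved
        self._lock.release()
        return False  # do not swallow exceptions
```

In this sketch the lock is held for the whole of the with block, so only one thread at a time can install hooks. Code in other threads that calls open directly will still see the patched function while the block is active.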

As currently implemented, Pyslet does not contain a full implementation of VirtualFilePath so this class is provided in readiness for a more comprehensive implementation based on pyslet.blockstore.StreamStore.