6.11. File System Abstraction
The purpose of this module is to provide an abstraction over the top of the native file system, potentially allowing alternative implementations to be provided in the future. This module was developed with particular attention to operating environments where access to the file system is limited or not allowed. Pyslet modules that use these classes to access the file system can be easily repointed at some other implementation.
Abstract class representing a virtual file system
Instances represent paths within a file system. You can’t create an instance of VirtualFilePath directly; instead, you must create instances using a class derived from it. (Do not call the __init__ method of VirtualFilePath from your derived classes.)
All instances are created from one or more strings, either byte strings or unicode strings, or existing instances. In the case of byte strings the encoding is assumed to be the default encoding of the file system. If multiple arguments are given then they are joined to make a single path using
Instances can be converted to either binary or character strings; use to_bytes() for the former. Note that the builtin str function returns a binary string in Python 2, not a character string.
Instances are immutable and can be used as keys in dictionaries. Instances must be from the same file system to be comparable; the comparison uses the unicode representation.
An empty path is False, other paths are True. You can also compare a file path with a string (or unicode string) which is first converted to a file path instance.
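The behaviour described above (hashable, comparable with plain strings, empty paths being False) can be illustrated with a minimal sketch. SketchPath is a hypothetical stand-in written for this example only; it is not Pyslet's implementation, it assumes utf-8 for byte strings, and it does not enforce immutability or model multiple file systems.

```python
class SketchPath(object):
    """Hypothetical stand-in for a path class, for illustration only."""

    def __init__(self, path=u""):
        # Accept an existing instance, a byte string or a character string.
        if isinstance(path, SketchPath):
            path = path.path
        elif isinstance(path, bytes):
            path = path.decode("utf-8")  # assumed file system encoding
        self.path = path

    def __eq__(self, other):
        # Plain strings are first converted to a path instance.
        if not isinstance(other, SketchPath):
            other = SketchPath(other)
        return self.path == other.path

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Hash on the character representation so equal paths collide.
        return hash(self.path)

    def __bool__(self):
        # An empty path is False, any other path is True.
        return bool(self.path)

    __nonzero__ = __bool__  # Python 2 spelling of __bool__


lookup = {SketchPath(u"docs/readme.txt"): "found"}
```

With these definitions, SketchPath(b"docs/readme.txt") finds the same dictionary entry as the character-string form, and SketchPath(u"") tests as False.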
The name of the file system; it must be overridden by derived classes.
The purpose of providing a name for a file system is to enable file systems to be mapped onto the authority (host) component of a file URL.
Indicates whether this file system supports unicode file names natively. In general, you don’t need to worry about this as all methods that accept strings will accept either type of string and convert to the native representation.
Indicates whether this file system supports UNC paths.
UNC paths are of the general form:
This format is used in Microsoft Windows. See
Indicates whether this file system supports ‘drives’, i.e., is Windows-like in having drive letters that may prefix paths.
The codec used by this file system
This codec is used to convert between byte strings and unicode strings. The default is utf-8.
The path separator used by this file system
This is either a character or byte string, depending on the setting of
The path component that represents the current directory
The path component that represents the parent directory
The extension character
The drive separator
An empty path string (for use with join)
Returns an instance representing the working directory.
Returns an instance representing the current root.
UNIX users will find this odd but in other file systems there are multiple roots. Rather than invent an abstract concept of the root of roots we just accept that there can be more than one. (We might struggle to perform actions like listdir() on the root of roots.)
The current root is determined by stripping back the current working directory until it can no longer be split.
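The stripping process can be sketched with the standard library. The function name current_root and its pathmod parameter are assumptions made for this example; posixpath and ntpath are used explicitly so the result does not depend on the platform running the code.

```python
import ntpath
import posixpath


def current_root(cwd, pathmod=posixpath):
    """Strip trailing components from an absolute path until none remain."""
    head = cwd
    while True:
        parent, tail = pathmod.split(head)
        if not tail or parent == head:
            # Nothing left to split off: head is the root.
            return head
        head = parent


unix_root = current_root("/usr/local/lib", posixpath)
windows_root = current_root("C:\\Users\\me", ntpath)
```

Here unix_root is "/" while windows_root is the root of drive C:, which shows why a single root cannot be assumed.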
Creates a temporary directory in the file system
Returns an instance representing the path to the new directory.
Similar to Python’s tempfile.mkdtemp; as with that function, the caller is responsible for cleaning up the directory, which can be done with
Converts a single argument to the correct string type
File systems can use either binary or character strings and we convert between them using codec. This method takes either type of string or an existing instance and returns a path string of the correct type.
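A minimal sketch of this kind of conversion follows. The function to_path_str and its native_unicode flag are hypothetical names chosen for the example, and the handling of existing path instances is left out.

```python
def to_path_str(arg, codec="utf-8", native_unicode=True):
    """Return arg as the string type a file system uses natively.

    native_unicode is a hypothetical flag: True means the file system
    works with character strings, False means byte strings.
    """
    if isinstance(arg, bytes):
        return arg.decode(codec) if native_unicode else arg
    return arg if native_unicode else arg.encode(codec)


as_text = to_path_str(b"caf\xc3\xa9")
as_bytes = to_path_str(u"caf\xe9", native_unicode=False)
```

Strings that are already of the native type pass through unchanged, so callers do not need to check the type before calling.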
the path, either character or binary string
Returns the binary string representation of the path.
Instances are sortable using character strings.
Returns a new instance by joining path components
Starting with the current instance, this method appends each component, returning a new instance representing the joined path. If components contains an absolute path then previous components, including the instance’s path, are discarded.
For details see Python’s os.path.join function.
For the benefit of derived classes a default implementation is provided.
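The rule about absolute components follows os.path.join, which can be demonstrated directly. This example uses posixpath rather than os.path so that the results are the same on every platform; it shows the standard library function, not the Pyslet method itself.

```python
import posixpath

# Relative components are appended in order.
appended = posixpath.join("/srv/data", "reports", "2020.csv")

# An absolute component discards everything before it.
restarted = posixpath.join("/srv/data", "/etc", "hosts")
```

The first call yields "/srv/data/reports/2020.csv"; the second yields "/etc/hosts" because "/etc" is absolute.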
Splits a path
Returns a tuple of two instances (head, tail) where tail is the last path component and head is everything leading up to it.
For details see Python’s os.path.split.
Splits an extension from a path
Returns a tuple of (root, ext) where root is an instance containing just the root file path and ext is a string of characters representing the original path’s extension.
For details see Python’s os.path.splitext.
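The two standard library functions referenced for split and splitext behave as follows. The example uses posixpath for platform-independent results and works on plain strings, whereas the methods documented here return path instances.

```python
import posixpath

# split: the tail is the last component, the head is everything before it.
head, tail = posixpath.split("/srv/data/report.tar.gz")

# A trailing separator gives an empty tail.
dir_head, dir_tail = posixpath.split("/srv/data/")

# splitext: only the final extension is separated from the root.
root, ext = posixpath.splitext("/srv/data/report.tar.gz")

# A leading dot on its own does not count as an extension.
hidden_root, hidden_ext = posixpath.splitext(".bashrc")
```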
Splits a drive designation
Returns a tuple of two instances (drive, tail) where drive is either a drive specification or is empty.
Default implementation uses the drive_sep to determine if the first path component is a drive.
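One plausible reading of that default is sketched below on plain strings. The function split_drive and its parameters are assumptions for illustration; the actual default implementation may differ in detail.

```python
def split_drive(path, sep="\\", drive_sep=":"):
    """Split a drive prefix from path, looking only at the first component."""
    first = path.split(sep, 1)[0]
    pos = first.find(drive_sep)
    if pos < 0:
        # No drive separator in the first component: no drive.
        return "", path
    cut = pos + len(drive_sep)
    return path[:cut], path[cut:]


with_drive = split_drive("C:\\temp\\notes.txt")
without_drive = split_drive("\\temp\\notes.txt")
```

Restricting the search to the first component means a drive separator appearing later in the path is not mistaken for a drive.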
Splits a UNC path
Returns a tuple of two instances (mount, path) where mount is an instance representing the UNC mount point or an instance representing the empty path if this isn’t a UNC path.
Default implementation checks for a double separator at the start of the path and at least one more separator.
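The check described above might look like the following sketch, which works on plain strings. The name split_unc is hypothetical, and the choice to treat the host plus the following component as the mount point is an assumption based on the usual Windows convention.

```python
def split_unc(path, sep="\\"):
    """Split a UNC mount point from path, or return an empty mount."""
    if not path.startswith(sep * 2):
        return "", path
    # There must be at least one more separator after the host name.
    host_end = path.find(sep, 2 * len(sep))
    if host_end < 0:
        return "", path
    # The mount runs to the next separator, or to the end of the path.
    share_end = path.find(sep, host_end + len(sep))
    if share_end < 0:
        share_end = len(path)
    return path[:share_end], path[share_end:]


mount, rest = split_unc("\\\\server\\share\\dir\\f.txt")
```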
Returns an absolute path instance.
Returns a real path, with any symbolic links removed.
Returns a normalised path instance.
Returns a case-normalised path instance.
The default implementation returns the path unchanged.
Returns True if this path is a UNC path.
UNC paths contain a host designation. A path cannot contain a drive specification and also be a UNC path.
Default implementation calls splitunc() and returns True if the unc component is non-empty.
Returns True if this path is a single, non-root, component.
E.g., tests that the path does not contain a slash (it may be empty)
Returns True if this path is empty
Returns True if this is a directory-like path.
E.g., test that the path ends in a slash (last component is empty).
Returns True if this is a root path.
E.g., tests if it consists of just one or more slashes only (not counting any drive specification in file systems that support them).
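The purely textual tests described above can be sketched on plain strings with a "/" separator. The function names mirror the descriptions but are written for this example; drives and UNC prefixes are ignored, and the result of the directory-like test for an empty path is a choice made here, since the text does not state it.

```python
SEP = "/"  # assumed separator for this sketch


def is_single_component(path):
    # No separator anywhere; an empty path also qualifies.
    return SEP not in path


def is_empty(path):
    return path == ""


def is_dirlike(path):
    # Ends in a separator, so the last component is empty.
    # The empty path returns False here (an assumption of this sketch).
    return path.endswith(SEP)


def is_root(path):
    # One or more separators and nothing else.
    return path != "" and path.strip(SEP) == ""
```

None of these checks touch the file system; they only inspect the path string.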
Returns True if the path is an absolute path.
Return information about the path.
Returns True if this is an existing item in the file system.
Returns True if this is a regular file in the file system.
Returns True if this is a directory in the file system.
Returns an open file-like object from this path.
Copies a file to dst path like Python’s shutil.copy.
Note that you can’t copy between file system implementations.
Moves a file to dst path like Python’s os.rename.
Removes a file.
List directory contents
Returns a list containing path instances of the entries in the directory.
Changes the current working directory to this path
Creates a new directory at this path.
If an item at this path already exists, OSError is raised. This method ignores any trailing separator.
Recursive directory creation function.
Like mkdir(), but makes all intermediate-level directories needed to contain the leaf directory.
The default implementation repeatedly uses a combination of split and mkdir.
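The split-and-mkdir approach can be sketched with the standard library. make_dirs is a hypothetical function operating on plain strings; unlike os.makedirs, this sketch silently does nothing when the leaf already exists.

```python
import os
import shutil
import tempfile


def make_dirs(path):
    """Create path and any missing parents using only split and mkdir."""
    missing = []
    head = path
    # Walk up the path, collecting directories that do not exist yet.
    while head and not os.path.isdir(head):
        parent, tail = os.path.split(head)
        if tail:  # skip the empty tail left by a trailing separator
            missing.append(head)
        if parent == head:
            break
        head = parent
    # Create them from the outermost inwards.
    for item in reversed(missing):
        os.mkdir(item)


base = tempfile.mkdtemp()
try:
    leaf = os.path.join(base, "a", "b", "c")
    make_dirs(leaf)
    created = os.path.isdir(leaf)
finally:
    shutil.rmtree(base)  # the caller cleans up the temporary directory
```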
A generator function that walks the file system
Similar to os.walk. For each directory in the tree rooted at this path (including this path itself), it yields a 3-tuple of:
(dirpath, dirnames, filenames)
dirpath is an instance; dirnames and filenames are lists of path instances.
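For comparison, the shape of the tuples yielded by os.walk is shown below on a small temporary tree. This uses the standard library function, which yields strings; the method documented here yields path instances instead.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
try:
    # Build a tiny tree: base/top.txt and base/sub/leaf.txt
    os.mkdir(os.path.join(base, "sub"))
    for name in (os.path.join(base, "top.txt"),
                 os.path.join(base, "sub", "leaf.txt")):
        with open(name, "w") as f:
            f.write("x")
    seen = []
    for dirpath, dirnames, filenames in os.walk(base):
        # Record each directory relative to the starting point.
        seen.append((os.path.relpath(dirpath, base),
                     sorted(dirnames), sorted(filenames)))
finally:
    shutil.rmtree(base)
```

The starting directory itself is visited first, followed by each subdirectory.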
Removes the tree rooted at this directory
ignore_errors can be used to ignore any errors from the file system.
6.11.1. Accessing the Local File System
A concrete implementation mapping to Python’s os modules
In most cases the methods map straightforwardly to functions in os and os.path.
An empty string.
The file system name affects the way URIs are interpreted. An empty string is consistent with the use of file:/// to reference the local file system.
Copied from os.path
That means you won’t know ahead of time whether paths are expected as binary or unicode strings. In most cases it won’t matter as the methods will convert as appropriate but it does affect the type of the static path constants defined below.
Automatically determined from os.path
Tests if os.path has defined splitunc.
The method chosen is straight out of the documentation for os.path. We join the segments “C:” and “foo” and check to see if the result contains the path separator or not.
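Both feature tests can be sketched with the standard library. The helper names are hypothetical; ntpath and posixpath stand in for the os.path of a Windows and a POSIX system respectively, so the drive test can be shown for both cases on any platform.

```python
import ntpath
import os.path
import posixpath


def supports_drives(pathmod):
    """True if joining "C:" and "foo" does NOT insert a separator."""
    return pathmod.sep not in pathmod.join("C:", "foo")


def supports_unc(pathmod):
    """True only if the path module defines splitunc."""
    return hasattr(pathmod, "splitunc")


windows_like = supports_drives(ntpath)    # join gives "C:foo"
posix_like = supports_drives(posixpath)   # join gives "C:/foo"
local_unc = supports_unc(os.path)         # depends on platform and version
```

On a Windows-like path module, "C:foo" is a path relative to the current directory on drive C:, so no separator is inserted; a POSIX path module treats "C:" as an ordinary directory name.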
as returned by sys.getfilesystemencoding()
copied from os.sep
copied from os.curdir
copied from os.pardir
copied from os.extsep
always set to ‘:’
Correctly set to either binary or character string depending on the setting of
6.11.2. Misc Definitions
Context manager for compatibility with zipfile
The zipfile module allows you to write either a string or the contents of a named file to a zip archive. This class monkey-patches the builtin open function and os.stat with versions that support VirtualFilePath objects, allowing us to copy the contents of a file represented by a virtual file path directly to a zip archive without having to load it into memory first.
For more information on this approach see this blog post.
This implementation uses a lock on the class attributes to ensure thread safety.
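The general pattern of patching under a lock and restoring on exit can be sketched as follows. This is a simplified illustration and not the class documented here: PatchedStat, FakePath and convert are hypothetical names, only os.stat is patched, and the builtin open function is left alone.

```python
import os
import threading


class PatchedStat(object):
    """Hypothetical sketch: temporarily wrap os.stat inside a with block."""

    _lock = threading.RLock()  # class-level lock shared by all instances

    def __init__(self, convert):
        self.convert = convert  # maps special path objects to strings
        self._saved = None

    def __enter__(self):
        self._lock.acquire()  # held until __exit__ restores os.stat
        self._saved = os.stat
        saved, convert = self._saved, self.convert

        def stat(path, *args, **kwargs):
            return saved(convert(path), *args, **kwargs)

        os.stat = stat
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.stat = self._saved
        self._lock.release()
        return False  # never suppress exceptions


class FakePath(object):
    """Hypothetical path object that os.stat would normally reject."""

    def __init__(self, name):
        self.name = name


def convert(path):
    return path.name if isinstance(path, FakePath) else path


original = os.stat
with PatchedStat(convert):
    same_mode = (os.stat(FakePath(os.curdir)).st_mode ==
                 original(os.curdir).st_mode)
restored = os.stat is original
```

Holding the lock for the whole block means a second thread that tries to apply the patch waits until the first has restored the original function.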
As currently implemented, Pyslet does not contain a full implementation of VirtualFilePath so this class is provided in readiness for a more comprehensive implementation based on