File systems

High-level APIs

PyFilesystem

Works with files and directories in archives, in memory, in the cloud, etc.

Integrated file systems:

  • AppFS for predefined storage locations in operating systems where applications can store data

  • FTPFS for working with FTP servers

  • MemoryFS for an in-memory file system, suited to caches, temporary data storage, unit tests, etc.

  • MountFS for a virtual file system that can mount other file systems

  • MultiFS for a virtual file system that combines other file systems

  • OSFS for the OS file system

  • TarFS reads and writes compressed tar archives

  • TempFS for temporary data on the OS file system that is removed when the file system is closed

  • ZipFS reads and writes Zip files

File systems of the PyFilesystem organization on GitHub:

File systems from third-party developers:

fsspec

A unified Python interface for many local, remote and embedded file systems and byte stores. If your project already uses pandas, Intake, Dask or DVC, for example, fsspec is probably already installed.
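
As a rough illustration of this unified interface, the sketch below uses the in-memory implementation that ships with fsspec, so no remote service or credentials are needed. The paths and file contents are made up for the example; with a suitable extension installed, the same calls should also work with another protocol such as s3 or gcs.

```python
import fsspec

# "memory" selects the built-in in-memory file system.
fs = fsspec.filesystem("memory")

# Hypothetical example path and content.
fs.makedirs("/demo/reports", exist_ok=True)
with fs.open("/demo/reports/summary.txt", "w") as f:
    f.write("42 rows processed\n")

# List the directory and read the file back as bytes.
listing = fs.ls("/demo/reports", detail=False)
content = fs.cat_file("/demo/reports/summary.txt").decode()

# The same file can also be addressed by URL; the protocol prefix
# decides which implementation is used.
with fsspec.open("memory://demo/reports/summary.txt", "r") as f:
    first_line = f.readline()
```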

In addition to the integrated implementations, there are also many extensions, for example:

  • abfs for the Azure Blob Service

  • adl for Azure Data Lake Storage

  • alluxiofs for the Alluxio distributed cache

  • boxfs for access to Box file storage

  • dropbox for access to Dropbox shares

  • dvc for accessing a DVC repository as a file system

  • gcsfs for Google Cloud Storage

  • gdrive for access to Google Drive and shares

  • huggingface_hub for access to the Hugging Face Hub file system

  • lakefs for lakeFS data lakes

  • ocifs for access to the Oracle Cloud Object Storage

  • ossfs for the Alibaba Cloud (Aliyun) object storage system (OSS)

  • p9fs for 9P servers

  • s3fs for Amazon S3 and other compatible storage

  • wandbfs for accessing Weights & Biases (wandb) data

  • webdav4 for WebDAV

See also

Rclone is a command-line program for managing files on cloud storage. It supports more than 70 cloud storage services. You can find an example of its use with Python in rclone.py.

Specialised libraries

PyArrow

Apache Arrow Python bindings for the Hadoop Distributed File System (HDFS) and other fsspec-compatible file systems.

paramiko

Python implementation of the SSHv2 protocol, which offers both client and server functions. It forms the basis for the high-level SSH library Fabric.

boto3

The AWS SDK for Python, which facilitates integration with Amazon S3, Amazon EC2, Amazon DynamoDB and other AWS services.

azure-storage-blob

Azure Storage Blobs client library for Python.

oss2

Python SDK for the Alibaba Cloud Object Storage.

minio

MinIO Python Client SDK for Amazon S3 compatible cloud storage.

PyDrive2

Actively maintained fork of PyDrive, a Python wrapper library for the google-api-python-client that simplifies many common Google Drive API tasks.

Qcloud COSv5 SDK

Python SDK for the Tencent Cloud Object Storage (COS).

linode_api4

Python bindings for the Linode API v4.

airfs

Brings standard Python I/O to various storage services (such as Alibaba Cloud OSS, Amazon Web Services S3, GitHub, Microsoft Azure Blobs Storage and Files Storage, and OpenStack Swift/Object Store).

yandex-s3

Asyncio-compatible SDK for Yandex Object Storage.


Dormant projects

PyDrive

Python wrapper library of the google-api-python-client, which simplifies many common Google Drive API tasks.

digital-ocean-spaces

Python client for DigitalOcean Spaces with a built-in shell.
