Enhancing performance describes several ways to improve the performance of pandas. In addition, there are specialised libraries that parallelise the processing of DataFrames.
cuDF is a GPU DataFrame library that implements a Pandas-like API.
Modin parallelises almost the entire Pandas API. In most cases, existing Pandas code only needs its pandas import replaced by the following line:
import modin.pandas as pd
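As a minimal sketch of what this drop-in replacement means in practice, the following code is written once against a Pandas-like module and works with either Modin or plain Pandas. The helper names load_pandas_like and mean_temp_by_city, the column names, and the fallback to plain Pandas are assumptions made for illustration; they are not part of Modin or Pandas.

```python
import importlib


def load_pandas_like(prefer_modin=True):
    """Return modin.pandas when it can be imported, else plain pandas.

    Hypothetical helper for illustration; not part of Modin or pandas.
    """
    if prefer_modin:
        try:
            return importlib.import_module("modin.pandas")
        except ImportError:
            pass  # Modin is not installed: fall back to plain pandas
    return importlib.import_module("pandas")


def mean_temp_by_city(pd, records):
    """Average the 'temp' column per 'city' using the module passed in as pd."""
    df = pd.DataFrame(records)  # same constructor call in both libraries
    return df.groupby("city")["temp"].mean()
```

Calling pd = load_pandas_like() followed by mean_temp_by_city(pd, records) would then run on Modin where it is available. Note that Modin also needs an execution engine such as Ray or Dask to be installed before it can do any work.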
The restrictions refer to
pd.read_json, which is only implemented for
Dask DataFrame is a large parallel DataFrame made up of multiple Pandas DataFrames. Here, the dask.dataframe API is a subset of the Pandas API, although there are minor