Converting Python data structures into pandas

Python data structures such as lists, dicts and NumPy arrays can be converted into pandas Series or DataFrames.

[1]:
import numpy as np
import pandas as pd

Series

Python lists can easily be converted into pandas Series:

[2]:
list1 = [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728,  0.176255, -0.322612]

pd.Series(list1)
[2]:
0   -0.751442
1    0.816935
2   -0.272546
3   -0.268295
4   -0.296728
5    0.176255
6   -0.322612
dtype: float64
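NumPy arrays can be converted in the same way. The following is a minimal sketch; the name array1 is only chosen for illustration and reuses the values of list1:

```python
import numpy as np
import pandas as pd

# Same values as list1, but stored in a NumPy array (array1 is an illustrative name)
array1 = np.array(
    [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728, 0.176255, -0.322612]
)

series_from_array = pd.Series(array1)

# The Series takes over the dtype of the array
print(series_from_array.dtype)
```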

Several lists can be concatenated with + and then converted into a single pandas Series:

[3]:
list2 = [-0.029608, -0.277982, 2.693057, -0.850817, 0.783868, -1.137835, -0.617132]

pd.Series(list1 + list2)
[3]:
0    -0.751442
1     0.816935
2    -0.272546
3    -0.268295
4    -0.296728
5     0.176255
6    -0.322612
7    -0.029608
8    -0.277982
9     2.693057
10   -0.850817
11    0.783868
12   -1.137835
13   -0.617132
dtype: float64
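If the values are already available as separate Series, pd.concat offers a comparable result. This is only a sketch; series1, series2 and combined are illustrative names, and ignore_index=True is assumed to be wanted so that the result is renumbered from 0:

```python
import pandas as pd

list1 = [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728, 0.176255, -0.322612]
list2 = [-0.029608, -0.277982, 2.693057, -0.850817, 0.783868, -1.137835, -0.617132]

series1 = pd.Series(list1)
series2 = pd.Series(list2)

# Without ignore_index=True, the labels 0 to 6 would appear twice
combined = pd.concat([series1, series2], ignore_index=True)
print(combined)
```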

A list can also be passed as an index:

[4]:
date = [
    "2022-01-31",
    "2022-02-01",
    "2022-02-02",
    "2022-02-03",
    "2022-02-04",
    "2022-02-05",
    "2022-02-06",
]

pd.Series(list1, index=date)
[4]:
2022-01-31   -0.751442
2022-02-01    0.816935
2022-02-02   -0.272546
2022-02-03   -0.268295
2022-02-04   -0.296728
2022-02-05    0.176255
2022-02-06   -0.322612
dtype: float64
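In the example above the index labels are plain strings. If you want to work with real dates, you can convert the list with pd.to_datetime first. A minimal sketch, where series_by_date is an illustrative name:

```python
import pandas as pd

list1 = [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728, 0.176255, -0.322612]
date = [
    "2022-01-31",
    "2022-02-01",
    "2022-02-02",
    "2022-02-03",
    "2022-02-04",
    "2022-02-05",
    "2022-02-06",
]

# pd.to_datetime turns the list of strings into a DatetimeIndex
series_by_date = pd.Series(list1, index=pd.to_datetime(date))

# With a DatetimeIndex, date strings can be used for slicing;
# both endpoints are included
print(series_by_date.loc["2022-02-01":"2022-02-03"])
```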

With Python dictionaries you can pass not only the values but also the corresponding keys to a pandas Series; the keys become the index:

[5]:
dict1 = {
    "2022-01-31": -0.751442,
    "2022-02-01": 0.816935,
    "2022-02-02": -0.272546,
    "2022-02-03": -0.268295,
    "2022-02-04": -0.296728,
    "2022-02-05": 0.176255,
    "2022-02-06": -0.322612,
}

pd.Series(dict1)
[5]:
2022-01-31   -0.751442
2022-02-01    0.816935
2022-02-02   -0.272546
2022-02-03   -0.268295
2022-02-04   -0.296728
2022-02-05    0.176255
2022-02-06   -0.322612
dtype: float64

When you pass a dict, the index of the resulting pandas Series follows the order of the keys in the dict.
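If you need a different order or only some of the keys, you can pass an index together with the dict. The sketch below uses a shortened dict1; the label "2022-02-07" is deliberately not a key of the dict, to show that missing labels result in NaN:

```python
import pandas as pd

dict1 = {
    "2022-01-31": -0.751442,
    "2022-02-01": 0.816935,
    "2022-02-02": -0.272546,
}

# The values are looked up by label and arranged in the order of the index;
# "2022-02-07" is not in dict1 and therefore becomes NaN
selected = pd.Series(dict1, index=["2022-02-02", "2022-01-31", "2022-02-07"])
print(selected)
```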

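With collections.ChainMap you can also combine several dicts into one pandas Series. Note that a ChainMap iterates over its mappings in reverse order, so the keys of the last dict come first in the result.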

First we define a second dict:

[6]:
dict2 = {
    "2022-02-07": -0.029608,
    "2022-02-08": -0.277982,
    "2022-02-09": 2.693057,
    "2022-02-10": -0.850817,
    "2022-02-11": 0.783868,
    "2022-02-12": -1.137835,
    "2022-02-13": -0.617132,
}
[7]:
from collections import ChainMap


pd.Series(ChainMap(dict1, dict2))
[7]:
2022-02-07   -0.029608
2022-02-08   -0.277982
2022-02-09    2.693057
2022-02-10   -0.850817
2022-02-11    0.783868
2022-02-12   -1.137835
2022-02-13   -0.617132
2022-01-31   -0.751442
2022-02-01    0.816935
2022-02-02   -0.272546
2022-02-03   -0.268295
2022-02-04   -0.296728
2022-02-05    0.176255
2022-02-06   -0.322612
dtype: float64
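Two details of ChainMap are worth checking before relying on it: how duplicate keys are resolved and how to restore a chronological order. The following sketch uses two small dicts, first and second, that are made up for illustration; the value 1.0 is not part of the data above:

```python
from collections import ChainMap

import pandas as pd

# Hypothetical dicts that share the key "2022-02-07"
first = {"2022-02-06": -0.322612, "2022-02-07": 1.0}
second = {"2022-02-07": -0.029608, "2022-02-08": -0.277982}

combined = pd.Series(ChainMap(first, second))

# A ChainMap looks up keys in the first mapping first,
# so the duplicate key takes its value from `first`
print(combined["2022-02-07"])

# sort_index() orders the ISO date strings chronologically
ordered = combined.sort_index()
print(ordered)
```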

DataFrame

Lists of lists can be loaded into a pandas DataFrame; each inner list becomes one row:

[8]:
df = pd.DataFrame([list1, list2])
df
[8]:
           0         1         2         3         4         5         6
0 -0.751442  0.816935 -0.272546 -0.268295 -0.296728  0.176255 -0.322612
1 -0.029608 -0.277982  2.693057 -0.850817  0.783868 -1.137835 -0.617132
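If each list should become a column rather than a row, one option is to pass a dict of lists instead. A minimal sketch; the column names "list1" and "list2" and the name df_columns are chosen only for illustration:

```python
import pandas as pd

list1 = [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728, 0.176255, -0.322612]
list2 = [-0.029608, -0.277982, 2.693057, -0.850817, 0.783868, -1.137835, -0.617132]

# Each dict key becomes a column, each list supplies the values of that column
df_columns = pd.DataFrame({"list1": list1, "list2": list2})
print(df_columns.shape)
```

Transposing the row-wise DataFrame with pd.DataFrame([list1, list2]).T gives the same values, but with the column labels 0 and 1.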

You can also pass a list as the DataFrame index:

[9]:
pd.DataFrame([list1, list2], index=["2022-01-31", "2022-02-01"])
[9]:
                    0         1         2         3         4         5         6
2022-01-31 -0.751442  0.816935 -0.272546 -0.268295 -0.296728  0.176255 -0.322612
2022-02-01 -0.029608 -0.277982  2.693057 -0.850817  0.783868 -1.137835 -0.617132
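Column labels can be set in the same call with the columns parameter. In the following sketch the labels "a" to "g" are placeholders without any meaning for the data:

```python
import pandas as pd

list1 = [-0.751442, 0.816935, -0.272546, -0.268295, -0.296728, 0.176255, -0.322612]
list2 = [-0.029608, -0.277982, 2.693057, -0.850817, 0.783868, -1.137835, -0.617132]

# index labels the rows, columns labels the columns (placeholder names)
df_labelled = pd.DataFrame(
    [list1, list2],
    index=["2022-01-31", "2022-02-01"],
    columns=list("abcdefg"),
)

# Row and column labels can then be combined in .loc
print(df_labelled.loc["2022-02-01", "c"])
```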

A pandas DataFrame can be created from a dict of lists of equal length; each key becomes a column:

[10]:
data = {
    "Code": ["U+0000", "U+0001", "U+0002", "U+0003", "U+0004", "U+0005"],
    "Decimal": [0, 1, 2, 3, 4, 5],
    "Octal": ["000", "001", "002", "003", "004", "005"],
    "Key": ["NUL", "Ctrl-A", "Ctrl-B", "Ctrl-C", "Ctrl-D", "Ctrl-E"],
}
[11]:
pd.DataFrame(data)
[11]:
     Code  Decimal Octal     Key
0  U+0000        0   000     NUL
1  U+0001        1   001  Ctrl-A
2  U+0002        2   002  Ctrl-B
3  U+0003        3   003  Ctrl-C
4  U+0004        4   004  Ctrl-D
5  U+0005        5   005  Ctrl-E
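By default the rows are numbered from 0. If one of the columns identifies the rows, it can be turned into the index with set_index. A sketch with a shortened version of the data; the names codes and by_code are illustrative:

```python
import pandas as pd

# Shortened version of the data above, for illustration only
codes = {
    "Code": ["U+0000", "U+0001", "U+0002"],
    "Decimal": [0, 1, 2],
    "Key": ["NUL", "Ctrl-A", "Ctrl-B"],
}

# set_index returns a new DataFrame with "Code" as the row labels
by_code = pd.DataFrame(codes).set_index("Code")
print(by_code.loc["U+0001", "Key"])
```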

Another common form of data is a nested dict of dicts. The outer keys become the columns and the inner keys become the row index:

[12]:
data2 = {
    "U+0006": {"Decimal": "6", "Octal": "006", "Key": "Ctrl-F"},
    "U+0007": {"Decimal": "7", "Octal": "007", "Key": "Ctrl-G"},
}

df2 = pd.DataFrame(data2)

df2
[12]:
         U+0006  U+0007
Decimal       6       7
Octal       006     007
Key      Ctrl-F  Ctrl-G
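If the outer keys should become the rows instead, which matches the layout of the Code table further up, pd.DataFrame.from_dict with orient="index" can be used. A minimal sketch; df_rows is an illustrative name:

```python
import pandas as pd

data2 = {
    "U+0006": {"Decimal": "6", "Octal": "006", "Key": "Ctrl-F"},
    "U+0007": {"Decimal": "7", "Octal": "007", "Key": "Ctrl-G"},
}

# orient="index" uses the outer keys as row labels
# and the inner keys as column names
df_rows = pd.DataFrame.from_dict(data2, orient="index")
print(df_rows)
```

Transposing with pd.DataFrame(data2).T leads to the same layout.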

Dicts of Series are treated in a similar way:

[13]:
data3 = {"U+0006": df2["U+0006"][2:], "U+0007": df2["U+0007"][2:]}

pd.DataFrame(data3)
[13]:
     U+0006  U+0007
Key  Ctrl-F  Ctrl-G
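When the Series in the dict do not share the same index, pandas aligns them by label and fills the gaps with NaN. The two Series in this sketch, s1 and s2, are constructed for illustration so that each one lacks a label the other has:

```python
import pandas as pd

# Two Series with only partly overlapping labels (illustrative data)
s1 = pd.Series({"Decimal": "6", "Octal": "006"})
s2 = pd.Series({"Octal": "007", "Key": "Ctrl-G"})

# The row index is the union of both indexes;
# missing combinations are filled with NaN
aligned = pd.DataFrame({"U+0006": s1, "U+0007": s2})
print(aligned)
```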