{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Intake for data scientists\n",
"\n",
"Intake makes it easy to load data from many different formats and sources. For a complete overview, take a look at the [Plugin Directory](https://intake.readthedocs.io/en/latest/plugin-directory.html) and the [Intake Project Dashboard](https://intake.github.io/status/). Intake then loads the data into common containers such as pandas DataFrames, NumPy arrays or Python lists. The data is then easy to search and is also accessible to distributed systems. If a plugin is missing, you can write one yourself, as described in [Making Drivers](https://intake.readthedocs.io/en/latest/making-plugins.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load a data source\n",
"\n",
"Below, we read two CSV files as a single data source and print the resulting Intake catalog entry."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"sources:\n",
" csv:\n",
" args:\n",
" urlpath: states_*.csv\n",
" description: ''\n",
" driver: intake.source.csv.CSVSource\n",
" metadata: {}\n",
"\n"
]
}
],
"source": [
"import intake\n",
"\n",
"\n",
"ds = intake.open_csv(\"states_*.csv\")\n",
"\n",
"print(ds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Intake's `open_*` functions can read a variety of data sources. Which arguments are available depends on the data format or service."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure the search path for data sources\n",
"\n",
"Intake looks for catalog files in the paths listed under `catalog_path` in the Intake configuration file and in the environment variable `INTAKE_PATH`. The variable holds a list of paths separated by colons, or by semicolons on Windows. When `intake` is imported, the entries of all catalogs found this way are available through `intake.cat` as part of a global catalog."
]
},
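{
"cell_type": "markdown",
"metadata": {},
"source": [
"The search-path format described above can be illustrated with a minimal sketch. It uses only the Python standard library; `split_search_path` is a hypothetical helper, not part of Intake, and it assumes that plain splitting on `os.pathsep` (`:` on POSIX, `;` on Windows) is a close enough approximation. Intake's own parsing may treat some entries, such as URLs, differently."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"\n",
"def split_search_path(value, sep=os.pathsep):\n",
"    # Hypothetical helper: split a search-path string into its non-empty entries.\n",
"    return [entry for entry in value.split(sep) if entry]\n",
"\n",
"\n",
"# Show the catalog search path set for the current process, if any.\n",
"print(split_search_path(os.environ.get(\"INTAKE_PATH\", \"\")))"
]
},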
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read data\n",
"\n",
"Intake reads data in containers of various formats:\n",
"\n",
"* Tables in pandas DataFrames\n",
"* Multi-dimensional arrays in NumPy arrays\n",
"* Semi-structured data in Python lists of objects, usually dictionaries\n",
"\n",
"To find out in which container format Intake holds the data, you can use the `container` attribute:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'dataframe'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ds.container"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to `dataframe`, the result can also be `ndarray` or `python`."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" state slug code nickname \\\n",
"0 Alabama alabama AL Yellowhammer State \n",
"1 Alaska alaska AK The Last Frontier \n",
"2 Arizona arizona AZ The Grand Canyon State \n",
"3 Arkansas arkansas AR The Natural State \n",
"4 California california CA Golden State \n",
"\n",
" website admission_date admission_number capital_city \\\n",
"0 http://www.alabama.gov 1819-12-14 22 Montgomery \n",
"1 http://alaska.gov 1959-01-03 49 Juneau \n",
"2 https://az.gov 1912-02-14 48 Phoenix \n",
"3 http://arkansas.gov 1836-06-15 25 Little Rock \n",
"4 http://www.ca.gov 1850-09-09 31 Sacramento \n",
"\n",
" capital_url population population_rank \\\n",
"0 http://www.montgomeryal.gov 4833722 23 \n",
"1 http://www.juneau.org 735132 47 \n",
"2 https://www.phoenix.gov 6626624 15 \n",
"3 http://www.littlerock.org 2959373 32 \n",
"4 http://www.cityofsacramento.org 38332521 1 \n",
"\n",
" constitution_url \\\n",
"0 http://alisondb.legislature.state.al.us/alison... \n",
"1 http://www.legis.state.ak.us/basis/folioproxy.... \n",
"2 http://www.azleg.gov/Constitution.asp \n",
"3 http://www.arkleg.state.ar.us/assembly/Summary... \n",
"4 http://www.leginfo.ca.gov/const-toc.html \n",
"\n",
" state_flag_url \\\n",
"0 https://cdn.civil.services/us-states/flags/ala... \n",
"1 https://cdn.civil.services/us-states/flags/ala... \n",
"2 https://cdn.civil.services/us-states/flags/ari... \n",
"3 https://cdn.civil.services/us-states/flags/ark... \n",
"4 https://cdn.civil.services/us-states/flags/cal... \n",
"\n",
" state_seal_url \\\n",
"0 https://cdn.civil.services/us-states/seals/ala... \n",
"1 https://cdn.civil.services/us-states/seals/ala... \n",
"2 https://cdn.civil.services/us-states/seals/ari... \n",
"3 https://cdn.civil.services/us-states/seals/ark... \n",
"4 https://cdn.civil.services/us-states/seals/cal... \n",
"\n",
" map_image_url \\\n",
"0 https://cdn.civil.services/us-states/maps/alab... \n",
"1 https://cdn.civil.services/us-states/maps/alas... \n",
"2 https://cdn.civil.services/us-states/maps/ariz... \n",
"3 https://cdn.civil.services/us-states/maps/arka... \n",
"4 https://cdn.civil.services/us-states/maps/cali... \n",
"\n",
" landscape_background_url \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" skyline_background_url \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" twitter_url \\\n",
"0 https://twitter.com/alabamagov \n",
"1 https://twitter.com/alaska \n",
"2 NaN \n",
"3 https://twitter.com/arkansasgov \n",
"4 https://twitter.com/cagovernment \n",
"\n",
" facebook_url \n",
"0 https://www.facebook.com/alabamagov \n",
"1 https://www.facebook.com/AlaskaLocalGovernments \n",
"2 NaN \n",
"3 https://www.facebook.com/Arkansas.gov \n",
"4 NaN "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = ds.read()\n",
"\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
" state slug code nickname \\\n",
"0 Alabama alabama AL Yellowhammer State \n",
"1 Alaska alaska AK The Last Frontier \n",
"2 Arizona arizona AZ The Grand Canyon State \n",
"3 Arkansas arkansas AR The Natural State \n",
"4 California california CA Golden State \n",
"\n",
" website admission_date admission_number capital_city \\\n",
"0 http://www.alabama.gov 1819-12-14 22 Montgomery \n",
"1 http://alaska.gov 1959-01-03 49 Juneau \n",
"2 https://az.gov 1912-02-14 48 Phoenix \n",
"3 http://arkansas.gov 1836-06-15 25 Little Rock \n",
"4 http://www.ca.gov 1850-09-09 31 Sacramento \n",
"\n",
" capital_url population population_rank \\\n",
"0 http://www.montgomeryal.gov 4833722 23 \n",
"1 http://www.juneau.org 735132 47 \n",
"2 https://www.phoenix.gov 6626624 15 \n",
"3 http://www.littlerock.org 2959373 32 \n",
"4 http://www.cityofsacramento.org 38332521 1 \n",
"\n",
" constitution_url \\\n",
"0 http://alisondb.legislature.state.al.us/alison... \n",
"1 http://www.legis.state.ak.us/basis/folioproxy.... \n",
"2 http://www.azleg.gov/Constitution.asp \n",
"3 http://www.arkleg.state.ar.us/assembly/Summary... \n",
"4 http://www.leginfo.ca.gov/const-toc.html \n",
"\n",
" state_flag_url \\\n",
"0 https://cdn.civil.services/us-states/flags/ala... \n",
"1 https://cdn.civil.services/us-states/flags/ala... \n",
"2 https://cdn.civil.services/us-states/flags/ari... \n",
"3 https://cdn.civil.services/us-states/flags/ark... \n",
"4 https://cdn.civil.services/us-states/flags/cal... \n",
"\n",
" state_seal_url \\\n",
"0 https://cdn.civil.services/us-states/seals/ala... \n",
"1 https://cdn.civil.services/us-states/seals/ala... \n",
"2 https://cdn.civil.services/us-states/seals/ari... \n",
"3 https://cdn.civil.services/us-states/seals/ark... \n",
"4 https://cdn.civil.services/us-states/seals/cal... \n",
"\n",
" map_image_url \\\n",
"0 https://cdn.civil.services/us-states/maps/alab... \n",
"1 https://cdn.civil.services/us-states/maps/alas... \n",
"2 https://cdn.civil.services/us-states/maps/ariz... \n",
"3 https://cdn.civil.services/us-states/maps/arka... \n",
"4 https://cdn.civil.services/us-states/maps/cali... \n",
"\n",
" landscape_background_url \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" skyline_background_url \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" twitter_url \\\n",
"0 https://twitter.com/alabamagov \n",
"1 https://twitter.com/alaska \n",
"2 NaN \n",
"3 https://twitter.com/arkansasgov \n",
"4 https://twitter.com/cagovernment \n",
"\n",
" facebook_url \n",
"0 https://www.facebook.com/alabamagov \n",
"1 https://www.facebook.com/AlaskaLocalGovernments \n",
"2 NaN \n",
"3 https://www.facebook.com/Arkansas.gov \n",
"4 NaN "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ddf = ds.to_dask()\n",
"\n",
"ddf.head()"
]
},
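{
"cell_type": "markdown",
"metadata": {},
"source": [
"The catalog file `us_states.yml` that is opened in the next cell is not shown in this notebook. The following sketch writes a file with a plausible layout to a temporary directory. The field values are taken from the entry that `cat.states` displays further down; the file name `us_states_sketch.yml` and the `{{ CATALOG_DIR }}` placeholder in `urlpath` are assumptions, so the real file may look different."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tempfile\n",
"from pathlib import Path\n",
"\n",
"# Assumed catalog layout, reconstructed from the entry that `cat.states` displays.\n",
"CATALOG_TEXT = \"\"\"sources:\n",
"  states:\n",
"    description: US state information from [CivilServices](https://civil.services/)\n",
"    driver: intake.source.csv.CSVSource\n",
"    args:\n",
"      urlpath: \"{{ CATALOG_DIR }}/states_*.csv\"\n",
"    metadata:\n",
"      origin_url: https://github.com/CivilServiceUSA/us-states/blob/v1.0.0/data/states.csv\n",
"\"\"\"\n",
"\n",
"# Write to a temporary directory so that no existing catalog is overwritten.\n",
"sketch_path = Path(tempfile.mkdtemp()) / \"us_states_sketch.yml\"\n",
"sketch_path.write_text(CATALOG_TEXT, encoding=\"utf-8\")\n",
"print(sketch_path.read_text(encoding=\"utf-8\"))"
]
},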
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"cat = intake.open_catalog(\"us_states.yml\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['states']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(cat)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"application/yaml": "states:\n args:\n urlpath: /Users/veit/cusy/trn/Python4DataScience/docs/data-processing/intake/states_*.csv\n description: US state information from [CivilServices](https://civil.services/)\n driver: intake.source.csv.CSVSource\n metadata:\n catalog_dir: /Users/veit/cusy/trn/Python4DataScience/docs/data-processing/intake/\n origin_url: https://github.com/CivilServiceUSA/us-states/blob/v1.0.0/data/states.csv\n",
"text/plain": [
"states:\n",
" args:\n",
" urlpath: /Users/veit/cusy/trn/Python4DataScience/docs/data-processing/intake/states_*.csv\n",
" description: US state information from [CivilServices](https://civil.services/)\n",
" driver: intake.source.csv.CSVSource\n",
" metadata:\n",
" catalog_dir: /Users/veit/cusy/trn/Python4DataScience/docs/data-processing/intake/\n",
" origin_url: https://github.com/CivilServiceUSA/us-states/blob/v1.0.0/data/states.csv\n"
]
},
"metadata": {
"application/json": {
"root": "states"
}
},
"output_type": "display_data"
}
],
"source": [
"cat.states"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Dask DataFrame Structure:\n",
" state slug code nickname website admission_date admission_number capital_city capital_url population population_rank constitution_url state_flag_url state_seal_url map_image_url landscape_background_url skyline_background_url twitter_url facebook_url\n",
"npartitions=2 \n",
" string string string string string string int64 string string int64 int64 string string string string string string string string\n",
" ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...\n",
" ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...\n",
"Dask Name: read_csv, 1 expression\n",
"Expr=ReadCSV(93f13fb)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat.states.to_dask()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" state slug\n",
"0 Alabama alabama\n",
"1 Alaska alaska\n",
"2 Arizona arizona\n",
"3 Arkansas arkansas\n",
"4 California california"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat.states.to_dask()[[\"state\", \"slug\"]].head()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
" 0 1 2 3 4 \\\n",
"0 Alabama alabama AL Yellowhammer State http://www.alabama.gov \n",
"1 Alaska alaska AK The Last Frontier http://alaska.gov \n",
"2 Arizona arizona AZ The Grand Canyon State https://az.gov \n",
"3 Arkansas arkansas AR The Natural State http://arkansas.gov \n",
"4 California california CA Golden State http://www.ca.gov \n",
"\n",
" 5 6 7 8 9 10 \\\n",
"0 1819-12-14 22 Montgomery http://www.montgomeryal.gov 4833722 23 \n",
"1 1959-01-03 49 Juneau http://www.juneau.org 735132 47 \n",
"2 1912-02-14 48 Phoenix https://www.phoenix.gov 6626624 15 \n",
"3 1836-06-15 25 Little Rock http://www.littlerock.org 2959373 32 \n",
"4 1850-09-09 31 Sacramento http://www.cityofsacramento.org 38332521 1 \n",
"\n",
" 11 \\\n",
"0 http://alisondb.legislature.state.al.us/alison... \n",
"1 http://www.legis.state.ak.us/basis/folioproxy.... \n",
"2 http://www.azleg.gov/Constitution.asp \n",
"3 http://www.arkleg.state.ar.us/assembly/Summary... \n",
"4 http://www.leginfo.ca.gov/const-toc.html \n",
"\n",
" 12 \\\n",
"0 https://cdn.civil.services/us-states/flags/ala... \n",
"1 https://cdn.civil.services/us-states/flags/ala... \n",
"2 https://cdn.civil.services/us-states/flags/ari... \n",
"3 https://cdn.civil.services/us-states/flags/ark... \n",
"4 https://cdn.civil.services/us-states/flags/cal... \n",
"\n",
" 13 \\\n",
"0 https://cdn.civil.services/us-states/seals/ala... \n",
"1 https://cdn.civil.services/us-states/seals/ala... \n",
"2 https://cdn.civil.services/us-states/seals/ari... \n",
"3 https://cdn.civil.services/us-states/seals/ark... \n",
"4 https://cdn.civil.services/us-states/seals/cal... \n",
"\n",
" 14 \\\n",
"0 https://cdn.civil.services/us-states/maps/alab... \n",
"1 https://cdn.civil.services/us-states/maps/alas... \n",
"2 https://cdn.civil.services/us-states/maps/ariz... \n",
"3 https://cdn.civil.services/us-states/maps/arka... \n",
"4 https://cdn.civil.services/us-states/maps/cali... \n",
"\n",
" 15 \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" 16 \\\n",
"0 https://cdn.civil.services/us-states/backgroun... \n",
"1 https://cdn.civil.services/us-states/backgroun... \n",
"2 https://cdn.civil.services/us-states/backgroun... \n",
"3 https://cdn.civil.services/us-states/backgroun... \n",
"4 https://cdn.civil.services/us-states/backgroun... \n",
"\n",
" 17 \\\n",
"0 https://twitter.com/alabamagov \n",
"1 https://twitter.com/alaska \n",
"2 NaN \n",
"3 https://twitter.com/arkansasgov \n",
"4 https://twitter.com/cagovernment \n",
"\n",
" 18 \n",
"0 https://www.facebook.com/alabamagov \n",
"1 https://www.facebook.com/AlaskaLocalGovernments \n",
"2 NaN \n",
"3 https://www.facebook.com/Arkansas.gov \n",
"4 NaN "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cat.states(csv_kwargs={\"header\": None, \"skiprows\": 1}).read().head()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.13 Kernel",
"language": "python",
"name": "python313"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}