Intake-GUI: Exploring data in a graphical user interface#
Intake GUI has been re-implemented so that it can be used not only in Jupyter notebooks, but also in other web applications. It displays the contents of all installed catalogs and lets you select local and remote catalogs, search them, and select sources from them.
Intake supports the division of labor between data engineers who curate, manage, and deploy data, and data scientists who analyse and visualise data without having to know how it’s stored.
The Intake GUI is based on Panel, which offers a composable dashboard solution for displaying plots, images, tables, text and widgets. Panel works both in a Jupyter notebook and in a standalone Tornado application.
From a data engineer’s point of view, this means that you can deploy the Intake GUI at an endpoint and offer it to your data users as a data exploration tool. It also means that it’s easy to adapt and rearrange the GUI, for example to insert your own logo, reuse parts of it in your own applications or add new functions.
In the future, Intake-GUI should also allow the input of user parameters as well as the editing and saving of catalogs.
[1]:
import intake
intake.gui
The GUI contains three main areas:
1. a list of catalogs. The builtin catalog shown by default contains the data sets installed in the system, just like intake.cat.
2. a list of the sources in the currently selected catalog.
3. a description of the currently selected source.
Ad 1: Catalogs#
No catalog is currently displayed in the list of catalogs. However, under the three main areas there are three buttons that can be used to add, remove, or search catalogs.
The same actions are also available through the API, for example Add Catalog:
[2]:
intake.gui.add("./us_crime/us_crime.yaml")
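For orientation, the file passed to add is an ordinary Intake catalog, that is, a YAML document with a top-level sources mapping. The contents of us_crime.yaml are not reproduced on this page, so the following is only a minimal sketch that mirrors some of the fields of the us_crime entry printed in the Sources section. The helper name write_catalog and the temporary directory are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical catalog text, modelled on the fields of the us_crime entry
# printed in the Sources section; the real us_crime.yaml may differ.
CATALOG_YAML = """\
sources:
  us_crime:
    description: US Crime data
    driver: csv
    args:
      urlpath: "{{ CATALOG_DIR }}/data/crime.csv"
    metadata:
      plots:
        line_example:
          kind: line
          y: [Robbery, Burglary]
          x: Year
"""


def write_catalog(directory):
    """Write the sketch catalog into `directory` and return its path."""
    path = Path(directory) / "us_crime.yaml"
    path.write_text(CATALOG_YAML, encoding="utf-8")
    return path


catalog_path = write_catalog(tempfile.mkdtemp())
print(catalog_path.name)

# With Intake and its GUI dependencies installed, the file could then be
# added in the same way as in the cell above:
#     intake.gui.add(str(catalog_path))
```

The {{ CATALOG_DIR }} placeholder is resolved by Intake relative to the directory of the catalog file, so the CSV file would be expected in a data subdirectory next to it.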
Remote catalogs are e.g. available under
Ad 2: Sources#
Selecting a source from the list updates the descriptive text on the left side of the user interface.
This is also available via the API:
[3]:
intake.gui.sources
[3]:
[name: us_crime
 container: dataframe
 plugin: ['csv']
 driver: ['csv']
 description: US Crime data [UCRDataTool](https://www.ucrdatatool.gov/Search/Crime/State/StatebyState.cfm)
 direct_access: forbid
 user_parameters: []
 metadata:
   plots:
     line_example:
       kind: line
       y: ['Robbery', 'Burglary']
       x: Year
     violin_example:
       kind: violin
       y: ['Burglary rate', 'Larceny-theft rate', 'Robbery rate', 'Violent Crime rate']
       group_label: Type of crime
       value_label: Rate per 100k
       invert: True
 args:
   urlpath: {{ CATALOG_DIR }}/data/crime.csv]
This is a list of regular Intake data source entries. To look at the first rows of the first source, we can enter the following:
[4]:
source = intake.gui.sources[0]
source.to_dask().head()
[4]:
| | Year | Population | Violent crime total | Murder and nonnegligent Manslaughter | Legacy rape /1 | Revised rape /2 | Robbery | Aggravated assault | Property crime total | Burglary | ... | Violent Crime rate | Murder and nonnegligent manslaughter rate | Legacy rape rate /1 | Revised rape rate /2 | Robbery rate | Aggravated assault rate | Property crime rate | Burglary rate | Larceny-theft rate | Motor vehicle theft rate |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1960 | 179323175 | 288460 | 9110 | 17190 | NaN | 107840 | 154320 | 3095700 | 912100 | ... | 160.9 | 5.1 | 9.6 | NaN | 60.1 | 86.1 | 1726.3 | 508.6 | 1034.7 | 183.0 |
1 | 1961 | 182992000 | 289390 | 8740 | 17220 | NaN | 106670 | 156760 | 3198600 | 949600 | ... | 158.1 | 4.8 | 9.4 | NaN | 58.3 | 85.7 | 1747.9 | 518.9 | 1045.4 | 183.6 |
2 | 1962 | 185771000 | 301510 | 8530 | 17550 | NaN | 110860 | 164570 | 3450700 | 994300 | ... | 162.3 | 4.6 | 9.4 | NaN | 59.7 | 88.6 | 1857.5 | 535.2 | 1124.8 | 197.4 |
3 | 1963 | 188483000 | 316970 | 8640 | 17650 | NaN | 116470 | 174210 | 3792500 | 1086400 | ... | 168.2 | 4.6 | 9.4 | NaN | 61.8 | 92.4 | 2012.1 | 576.4 | 1219.1 | 216.6 |
4 | 1964 | 191141000 | 364220 | 9360 | 21420 | NaN | 130390 | 203050 | 4200400 | 1213200 | ... | 190.6 | 4.9 | 11.2 | NaN | 68.2 | 106.2 | 2197.5 | 634.7 | 1315.5 | 247.4 |
5 rows × 22 columns
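The rate columns appear to be the corresponding counts scaled to 100,000 inhabitants, which would match the value_label "Rate per 100k" in the plot metadata. This reading is inferred from the numbers rather than stated by the catalog. The short check below uses values copied from the 1960 row; the function name rate_per_100k is chosen here for illustration.

```python
# Values copied from the 1960 row of the table above.
population = 179_323_175
counts = {
    "Violent crime total": 288_460,
    "Robbery": 107_840,
    "Burglary": 912_100,
}
reported_rates = {
    "Violent crime total": 160.9,
    "Robbery": 60.1,
    "Burglary": 508.6,
}


def rate_per_100k(count, population):
    """Scale an absolute count to a rate per 100,000 inhabitants."""
    return count / population * 100_000


for name, count in counts.items():
    computed = round(rate_per_100k(count, population), 1)
    print(f"{name}: computed {computed}, reported {reported_rates[name]}")
```

For these three columns the computed values agree with the reported rates after rounding to one decimal place.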
[5]:
source.gui
[6]:
intake.gui.source.description
[7]:
cat = intake.open_catalog("./us_crime/us_crime.yaml")
cat.gui