{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# httpx installation and sample application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation\n", "\n", "The httpx library is useful for communicating with REST APIs. With [Spack](../../productive/envs/spack/index.rst) you can provide httpx in your kernel:\n", "\n", "``` bash\n", "$ spack env activate python-311\n", "$ spack install py-httpx\n", "```\n", "\n", "Alternatively, you can install httpx with other package managers, for example\n", "\n", "``` bash\n", "$ uv add httpx\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example OSM Nominatim API\n", "\n", "In this example we get our data from the [OpenStreetMap Nominatim API](https://nominatim.org/release-docs/develop/api/Overview/). This can be reached via the URL `https://nominatim.openstreetmap.org/search?`. To e.g. receive information about the Berlin Congress Center in Berlin in JSON format, the URL `https://nominatim.openstreetmap.org/search.php?q=Alexanderplatz+Berlin&format=json` should be given, and if you want to display the corresponding map section you just have to leave out `&format=json`.\n", "\n", "Then we define the search URL and the parameters. Nominatim expects at least the following two parameters\n", "\n", "| Key | Value |\n", "| --------- | ------------------------------------ |\n", "| `q` | Address query that allows the following specifications: `street`, `city`, `county`, `state`, `country` and `postalcode`. |\n", "| `format` | Format in which the data is returned. Possible values are `html`, `xml`, `json`, `jsonv2`, `geojson` and `geocodejson`. |\n", "\n", "The query can then be made with:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import httpx\n", "\n", "\n", "search_url = \"https://nominatim.openstreetmap.org/search?\"\n", "params = {\n", " \"q\": \"Alexanderplatz, Berlin\",\n", " \"format\": \"json\",\n", "}\n", "r = httpx.get(search_url, params=params)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "200" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r.status_code" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[{'place_id': 128497332,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'way',\n", " 'osm_id': 783052052,\n", " 'lat': '52.5219814',\n", " 'lon': '13.413635717448294',\n", " 'class': 'place',\n", " 'type': 'square',\n", " 'place_rank': 25,\n", " 'importance': 0.5136915868107359,\n", " 'addresstype': 'square',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',\n", " 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']},\n", " {'place_id': 128243381,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'node',\n", " 'osm_id': 3908141014,\n", " 'lat': '52.5215661',\n", " 'lon': '13.4112804',\n", " 'class': 'railway',\n", " 'type': 'station',\n", " 'place_rank': 30,\n", " 'importance': 0.43609907778808027,\n", " 'addresstype': 'railway',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Dircksenstraße, Mitte, Berlin, 10179, Deutschland',\n", " 'boundingbox': ['52.5165661', '52.5265661', '13.4062804', '13.4162804']},\n", " {'place_id': 128416772,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'way',\n", " 'osm_id': 346206374,\n", " 'lat': '52.5216214',\n", " 'lon': '13.4131913',\n", " 'class': 'highway',\n", " 'type': 'pedestrian',\n", " 'place_rank': 26,\n", " 'importance': 0.10000999999999993,\n", " 'addresstype': 'road',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',\n", " 'boundingbox': ['52.5216214', '52.5216661', '13.4131913', '13.4131914']}]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Three different locations are found, the square, a bus stop and a hotel. In order to be able to filter further, we can only display the most important location:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'place_id': 128497332,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'way',\n", " 'osm_id': 783052052,\n", " 'lat': '52.5219814',\n", " 'lon': '13.413635717448294',\n", " 'class': 'place',\n", " 'type': 'square',\n", " 'place_rank': 25,\n", " 'importance': 0.5136915868107359,\n", " 'addresstype': 'square',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',\n", " 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params = {\"q\": \"Alexanderplatz, Berlin\", \"format\": \"json\", \"limit\": \"1\"}\n", "r = httpx.get(search_url, params=params)\n", "r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clean Code\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we know the code works, let’s turn everything into a clean and flexible function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To ensure that the interaction was successful, we use the `raise_for_status` method of `httpx`, which throws an exception if the HTTP status code isn’t `200 OK`:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r.raise_for_status()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we don't want to exceed the load limits of the Nominatim API, we will delay our httpx with the `time.sleep` function:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'place_id': 128497332,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'way',\n", " 'osm_id': 783052052,\n", " 'lat': '52.5219814',\n", " 'lon': '13.413635717448294',\n", " 'class': 'place',\n", " 'type': 'square',\n", " 'place_rank': 25,\n", " 'importance': 0.5136915868107359,\n", " 'addresstype': 'square',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',\n", " 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from time import sleep\n", "\n", "\n", "sleep(1)\n", "r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we declare the function itself. As arguments we need the address, the format, the limit of the objects to be returned with the default value `1` and further `kwargs` (**k**ey**w**ord **arg**ument**s**) that are passed as parameters:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def nominatim_search(address, format=\"json\", limit=1, **kwargs):\n", " \"\"\"Thin wrapper around the Nominatim search API.\n", " For the list of parameters see\n", " https://nominatim.org/release-docs/develop/api/Search/#parameters\n", " \"\"\"\n", " search_url = \"https://nominatim.openstreetmap.org/search?\"\n", " params = {\"q\": address, \"format\": format, \"limit\": limit, **kwargs}\n", " r = httpx.get(search_url, params=params)\n", " # Raise an exception if the status is unsuccessful\n", " r.raise_for_status()\n", "\n", " sleep(1)\n", " return r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can try out the function, for example with" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'place_id': 128497332,\n", " 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',\n", " 'osm_type': 'way',\n", " 'osm_id': 783052052,\n", " 'lat': '52.5219814',\n", " 'lon': '13.413635717448294',\n", " 'class': 'place',\n", " 'type': 'square',\n", " 'place_rank': 25,\n", " 'importance': 0.5136915868107359,\n", " 'addresstype': 'square',\n", " 'name': 'Alexanderplatz',\n", " 'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',\n", " 'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nominatim_search(\"Alexanderplatz, Berlin\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Caching\n", "\n", "If the same queries are to be asked over and over again within a session, it makes sense to call up this data only once and use it again. In Python we can use `lru_cache` from Python’s standard `functools` library. `lru_cache` saves the last `N` requests (**L**east **R**ecent **U**sed) and as soon as the limit is exceeded, the oldest values are discarded. To use this for the `nominatim_search` method, all you have to do is define an import and a decorator:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "from functools import lru_cache\n", "\n", "\n", "@lru_cache(maxsize=1000)\n", "def nominatim_search(address, format=\"json\", limit=1, **kwargs):\n", " \"\"\"…\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, `lru_cache` only saves the results during a session. If a script terminates because of a timeout or an exception, the results are lost. If the data is to be saved more permanently, tools such as [joblib](https://joblib.readthedocs.io/en/stable/) or [python-diskcache](https://grantjenks.com/docs/diskcache/) can be used." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.13 Kernel", "language": "python", "name": "python313" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }