httpx installation and sample application
Installation
The httpx library is useful for communicating with REST APIs. With Spack you can provide httpx in your kernel:
$ spack env activate python-311
$ spack install py-httpx
Alternatively, you can install httpx with other package managers, for example
$ uv add httpx
Example OSM Nominatim API
In this example we get our data from the OpenStreetMap Nominatim API, which can be reached via the URL https://nominatim.openstreetmap.org/search?. For example, to receive information about Alexanderplatz in Berlin in JSON format, request the URL https://nominatim.openstreetmap.org/search.php?q=Alexanderplatz+Berlin&format=json. To display the corresponding map section instead, simply leave out &format=json.
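How the query string of such a URL is assembled can be sketched with the standard library alone. The following is a small illustration that is not part of the original notebook; it only uses urllib.parse.urlencode and makes no network request.

```python
from urllib.parse import urlencode

base_url = "https://nominatim.openstreetmap.org/search"
params = {"q": "Alexanderplatz Berlin", "format": "json"}

# urlencode joins the key/value pairs with "&" and encodes spaces as "+"
query = urlencode(params)
url = f"{base_url}?{query}"
print(url)
```

httpx performs an equivalent encoding step itself when a dictionary is passed via the params argument, so the URL rarely needs to be built by hand.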
Then we define the search URL and the parameters. Nominatim expects at least the following two parameters:
| Key | Value |
|---|---|
| q | Address query |
| format | Format in which the data is returned, for example json |
The query can then be made with:
[1]:
import httpx
search_url = "https://nominatim.openstreetmap.org/search?"
params = {
    "q": "Alexanderplatz, Berlin",
    "format": "json",
}
r = httpx.get(search_url, params=params)
[2]:
r.status_code
[2]:
200
[3]:
r.json()
[3]:
[{'place_id': 128497332,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'way',
'osm_id': 783052052,
'lat': '52.5219814',
'lon': '13.413635717448294',
'class': 'place',
'type': 'square',
'place_rank': 25,
'importance': 0.5136915868107359,
'addresstype': 'square',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',
'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']},
{'place_id': 128243381,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'node',
'osm_id': 3908141014,
'lat': '52.5215661',
'lon': '13.4112804',
'class': 'railway',
'type': 'station',
'place_rank': 30,
'importance': 0.43609907778808027,
'addresstype': 'railway',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Dircksenstraße, Mitte, Berlin, 10179, Deutschland',
'boundingbox': ['52.5165661', '52.5265661', '13.4062804', '13.4162804']},
{'place_id': 128416772,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'way',
'osm_id': 346206374,
'lat': '52.5216214',
'lon': '13.4131913',
'class': 'highway',
'type': 'pedestrian',
'place_rank': 26,
'importance': 0.10000999999999993,
'addresstype': 'road',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',
'boundingbox': ['52.5216214', '52.5216661', '13.4131913', '13.4131914']}]
Three different objects are found: the square, a railway station and a pedestrian way. To narrow the result down, we can use the limit parameter to return only the most important location:
[4]:
params = {"q": "Alexanderplatz, Berlin", "format": "json", "limit": "1"}
r = httpx.get(search_url, params=params)
r.json()
[4]:
[{'place_id': 128497332,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'way',
'osm_id': 783052052,
'lat': '52.5219814',
'lon': '13.413635717448294',
'class': 'place',
'type': 'square',
'place_rank': 25,
'importance': 0.5136915868107359,
'addresstype': 'square',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',
'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
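If all results have already been fetched, a similar selection can be made locally. The sketch below uses abbreviated copies of the three results shown earlier and picks the entry with the highest importance value. That the server orders its results strictly by this value is an assumption here, not something the output above proves.

```python
# Abbreviated copies of the three results returned by the first query
results = [
    {"name": "Alexanderplatz", "type": "square", "importance": 0.5136915868107359},
    {"name": "Alexanderplatz", "type": "station", "importance": 0.43609907778808027},
    {"name": "Alexanderplatz", "type": "pedestrian", "importance": 0.10000999999999993},
]

# Pick the entry with the highest importance score
best = max(results, key=lambda item: item["importance"])
print(best["type"])
```

Filtering locally avoids a second request, while limit keeps the response small when only one hit is needed.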
Clean Code
Now that we know the code works, let’s turn everything into a clean and flexible function.
To ensure that the interaction was successful, we use the raise_for_status method of httpx, which raises an exception if the response does not carry a 2xx success status code:
[5]:
r.raise_for_status()
[5]:
<Response [200 OK]>
Since we don’t want to exceed the load limits of the Nominatim API, we will delay our requests with the time.sleep function:
[6]:
from time import sleep
sleep(1)
r.json()
[6]:
[{'place_id': 128497332,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'way',
'osm_id': 783052052,
'lat': '52.5219814',
'lon': '13.413635717448294',
'class': 'place',
'type': 'square',
'place_rank': 25,
'importance': 0.5136915868107359,
'addresstype': 'square',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',
'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
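A fixed sleep(1) always waits a full second, even if the previous request was sent long ago. A possible refinement is to wait only for the time still missing since the last call. The class Throttle below is a hypothetical helper written for this illustration; it assumes a minimum interval of one second, matching the sleep(1) used above.

```python
import time


class Throttle:
    """Ensure at least min_interval seconds between calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval that is still missing
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=1.0)
throttle.wait()  # the first call returns immediately
```

Calling throttle.wait() directly before each request would then keep consecutive requests at least one second apart without adding a delay after the last one.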
Next we declare the function itself. As arguments we need the address, the format (default json), the maximum number of objects to be returned (default 1) and further kwargs (keyword arguments) that are passed on as query parameters:
[7]:
def nominatim_search(address, format="json", limit=1, **kwargs):
    """Thin wrapper around the Nominatim search API.

    For the list of parameters see
    https://nominatim.org/release-docs/develop/api/Search/#parameters
    """
    search_url = "https://nominatim.openstreetmap.org/search?"
    params = {"q": address, "format": format, "limit": limit, **kwargs}
    r = httpx.get(search_url, params=params)
    # Raise an exception if the status is unsuccessful
    r.raise_for_status()

    sleep(1)
    return r.json()
Now we can try out the function, for example with
[8]:
nominatim_search("Alexanderplatz, Berlin")
[8]:
[{'place_id': 128497332,
'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright',
'osm_type': 'way',
'osm_id': 783052052,
'lat': '52.5219814',
'lon': '13.413635717448294',
'class': 'place',
'type': 'square',
'place_rank': 25,
'importance': 0.5136915868107359,
'addresstype': 'square',
'name': 'Alexanderplatz',
'display_name': 'Alexanderplatz, Mitte, Berlin, 10178, Deutschland',
'boundingbox': ['52.5201457', '52.5238113', '13.4103097', '13.4160801']}]
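Note that lat, lon and the boundingbox entries come back as strings. A small conversion step is usually needed before the values can be used for calculations. The helper coordinates below is a hypothetical name chosen for this sketch, and the data is an abbreviated copy of the result shown above.

```python
# Abbreviated copy of the single result returned by nominatim_search
result = [
    {
        "lat": "52.5219814",
        "lon": "13.413635717448294",
        "display_name": "Alexanderplatz, Mitte, Berlin, 10178, Deutschland",
        "boundingbox": ["52.5201457", "52.5238113", "13.4103097", "13.4160801"],
    }
]


def coordinates(place):
    """Return (latitude, longitude) of one result entry as floats."""
    return float(place["lat"]), float(place["lon"])


lat, lon = coordinates(result[0])
print(lat, lon)
```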
Caching
If the same queries are made over and over again within a session, it makes sense to fetch the data only once and reuse it. In Python we can use lru_cache from the standard functools library. lru_cache stores the results of the last N distinct calls (LRU stands for Least Recently Used); as soon as the limit is exceeded, the least recently used entries are discarded. To use this for the nominatim_search function, all you have to do is add an import and a decorator:
[9]:
from functools import lru_cache
@lru_cache(maxsize=1000)
def nominatim_search(address, format="json", limit=1, **kwargs):
    """…"""
However, lru_cache only saves the results during a session. If a script terminates because of a timeout or an exception, the results are lost. If the data is to be saved more permanently, tools such as joblib or python-diskcache can be used.
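The in-session behaviour of lru_cache, including the eviction of old entries, can be observed with a stand-in function that makes no network request. The function lookup and the tiny maxsize=2 are assumptions chosen only to keep the illustration short. Note also that lru_cache requires all arguments to be hashable.

```python
from functools import lru_cache

calls = []  # records every call that actually runs the function body


@lru_cache(maxsize=2)
def lookup(address):
    # Stand-in for a slow network request
    calls.append(address)
    return address.upper()


lookup("a")
lookup("a")  # served from the cache, the body is not executed
lookup("b")
lookup("c")  # the cache is full, so the least recently used entry "a" is dropped
lookup("a")  # has to be computed again

info = lookup.cache_info()
print(info)
print(calls)
```

cache_info() reports hits, misses and the current size, which helps to judge whether the chosen maxsize fits the query pattern.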