I’d like to announce the release of the new major version 4 of pyosmium.
pyosmium was originally created as a thin Python wrapper around the osmium library, a fast and flexible C++ library for reading, writing and processing OSM data. With the new version 4, pyosmium adds a convenience layer on top which gives the library a more pythonic feel and speeds up processing considerably.
The most important new features are:
- Iterative processing. OSM files can now be iterated through with a simple for loop. (Processing a file using handler functions is still possible.)
- Filter functions. Filters make it possible to quickly skip over uninteresting parts of the file, so pyosmium scripts can now be used on large OSM files without pre-filtering.
- Writers with automatic reference completion. One of the challenges of writing OSM files is that for every way and relation written, you usually also need to write the nodes and members they reference. pyosmium now implements writers which help with that.
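To make the reference-completion problem concrete, here is a minimal conceptual sketch in plain Python. It does not use pyosmium and is not how the pyosmium writers are implemented. Plain dicts stand in for OSM objects, and the function name `extract_with_references` and the sample data are made up for illustration only.

```python
# Conceptual sketch only: plain dicts stand in for OSM objects.
# This is NOT pyosmium code; all names and data here are hypothetical.

def extract_with_references(nodes, ways, matches):
    """Return (kept_nodes, kept_ways) for a thematic extract.

    nodes:   dict mapping node id -> tags dict
    ways:    dict mapping way id -> (tags dict, list of node ids)
    matches: predicate that takes a tags dict and returns True/False
    """
    # Pass 1: keep the matching ways and remember which nodes they reference.
    kept_ways = {wid: w for wid, w in ways.items() if matches(w[0])}
    needed = {nid for _, refs in kept_ways.values() for nid in refs}

    # Pass 2: keep nodes that match themselves or are needed by a kept way.
    kept_nodes = {nid: tags for nid, tags in nodes.items()
                  if nid in needed or matches(tags)}
    return kept_nodes, kept_ways


def is_school(tags):
    return tags.get("amenity") == "school"


nodes = {
    1: {}, 2: {}, 3: {},          # untagged geometry nodes
    4: {"amenity": "school"},     # a school mapped as a single node
    5: {"amenity": "cafe"},
}
ways = {
    10: ({"amenity": "school"}, [1, 2, 3, 1]),   # school outline
    11: ({"highway": "residential"}, [3, 5]),
}

kept_nodes, kept_ways = extract_with_references(nodes, ways, is_school)
print(sorted(kept_nodes))  # [1, 2, 3, 4]
print(sorted(kept_ways))   # [10]
```

Nodes 1 to 3 carry no tags of their own, so a plain tag filter would drop them and leave way 10 without geometry. Collecting the referenced ids first and then picking up those nodes in a second pass is the kind of bookkeeping a reference-completing writer takes off your hands.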
Here is an example of how to quickly find the most frequent tags used together with amenity=school using the new iterative syntax:
    import osmium
    from collections import Counter

    tag_counter = Counter()
    total = 0

    for o in osmium.FileProcessor('planet.osm.pbf')\
                   .with_filter(osmium.filter.TagFilter(('amenity', 'school'))):
        tag_counter.update([tag.k for tag in o.tags if tag.k != 'amenity'])
        total += 1

    for tag, cnt in tag_counter.most_common(10):
        print(f"{cnt:6d} ({cnt*100/total:5.2f}%) {tag}")
Running this on a full OSM planet file takes less than 5 minutes on a 12-core machine with 128GB RAM.
Or do you want to create a thematic extract of schools?
    import osmium

    with osmium.BackReferenceWriter('schools.pbf', 'planet.osm.pbf', overwrite=True) as writer:
        for o in osmium.FileProcessor('planet.osm.pbf')\
                       .with_filter(osmium.filter.TagFilter(('amenity', 'school'))):
            writer.add(o)
This is done in about 13 minutes. For comparison, osmium-tool’s tags-filter needs about 10 minutes for the same task on the same machine.
There are many more small improvements and additions. For a complete list of changes, have a look at the release notes. The improved documentation now comes with a cookbook section with documented examples to get you started.