Using the JOSM Conflation plugin to add 1500 addresses in 10 minutes

Posted by watmildon on 4/24/2023

The setup

This diary is a follow-on to my previous entry, “Adding addresses with JOSM and MapWithAI”. You’ll need all of the setup from that entry plus the Conflation plugin.

Finding a good area

The key to doing this quickly is to find an area with:

  • High quality address data with good spatial positioning
  • High density of building outlines to act as targets for the address data
  • Extremely low current address density (resolving conflicts is important but does slow you down!)
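One rough way to check the last criterion is to count how many buildings in a candidate area already carry an address. The sketch below assumes the buildings have already been loaded as plain dictionaries of OSM tags. The `address_coverage` helper and the sample data are made up for illustration; only `building` and `addr:housenumber` are real OSM keys.

```python
def address_coverage(buildings):
    """Return the fraction of buildings that already have a house number."""
    if not buildings:
        return 0.0
    addressed = sum(1 for tags in buildings if "addr:housenumber" in tags)
    return addressed / len(buildings)


# Hypothetical sample: four building outlines, one already addressed.
sample_buildings = [
    {"building": "house"},
    {"building": "house", "addr:housenumber": "1204"},
    {"building": "yes"},
    {"building": "garage"},
]

coverage = address_coverage(sample_buildings)
print(f"{coverage:.0%} of buildings already have an address")
```

A low number suggests there will be few conflicts to resolve. What counts as low enough is a judgment call.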

The area I’ve been spending most of my time in recently is Phoenix, AZ, which has great address data from the National Address Database and a high level of building coverage. The suburbs are also quite sprawling, which means you can cover a lot of very regular ground very quickly. Here’s a good candidate for rapid addition:

Aerial image of a suburb of Phoenix. The houses are laid out very regularly.

Using the same setup from the other diary entry, pull in the address data. Looks great so far:

Same aerial image of a suburb of Phoenix but now with dots representing address nodes. The houses and nodes match very well.

Let’s get conflating

The conflation plugin is extremely powerful. It can be used to ask “which things in set A are closely matched with things in set B”. I have described how to use it to find unmapped hospitals, but this time we’ll be using it to match address nodes with building outlines.
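To give a feel for that kind of matching, here is a minimal sketch that pairs each address point with the nearest building centroid and leaves it unmatched if nothing is within a cutoff. This is not the plugin’s actual algorithm, just a simplified illustration. It assumes planar coordinates, uses a vertex average as a stand-in for a true centroid, and borrows the cutoff of 30 that shows up later in this entry. All names and sample data are hypothetical.

```python
import math


def centroid(outline):
    """Average of the outline's vertices (a simple stand-in for a true centroid)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def match_addresses(addresses, buildings, max_distance=30.0):
    """Pair each address with its nearest building centroid within max_distance.

    Simplification: two addresses may end up on the same building.
    """
    centroids = {bid: centroid(outline) for bid, outline in buildings.items()}
    matched, unmatched = [], []
    for addr_id, (ax, ay) in addresses.items():
        best_id, best_dist = None, math.inf
        for bid, (cx, cy) in centroids.items():
            dist = math.hypot(ax - cx, ay - cy)
            if dist < best_dist:
                best_id, best_dist = bid, dist
        if best_id is not None and best_dist <= max_distance:
            matched.append((addr_id, best_id, best_dist))
        else:
            unmatched.append(addr_id)
    return matched, unmatched


# Hypothetical data: two square houses and three address points.
buildings = {
    "house_a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "house_b": [(40, 0), (50, 0), (50, 10), (40, 10)],
}
addresses = {
    "addr_1": (6.0, 5.0),
    "addr_2": (44.0, 7.0),
    "addr_3": (200.0, 200.0),  # far from every building
}

matched, unmatched = match_addresses(addresses, buildings)
print(matched)
print(unmatched)
```

Here `addr_1` and `addr_2` land on the two houses, while `addr_3` has no building within the cutoff and would need a manual look.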

The first set of things we need to select are the address nodes we want to add. These will be in the conflation tool as the “Reference”.

  • Activate the MapWithAI layer that has the address nodes (in the Layers pane, right click > Activate)
  • Select the relevant set of nodes (this may be all the nodes if your download area is precise)
  • Click the “Configure” button on the conflation tool pane
  • Click the upper “Freeze” button on the “Configure conflation settings” pane

Next we need to get the building outlines into the conflation tool as the “Subject”.

  • Activate the layer with the OSM data (in the Layers pane, right click > Activate)
  • Open the find element dialog (Ctrl+F)
  • Search for all building elements that are ways: building=* type:way
  • Click the lower “Freeze” button
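To make the search expression concrete, here is a rough Python equivalent of what `building=* type:way` selects: every way that has a `building` key, whatever its value. The element dictionaries are a made-up representation for illustration, not a JOSM data structure.

```python
def is_building_way(element):
    """True for ways that carry a building tag, whatever its value."""
    return element["type"] == "way" and "building" in element["tags"]


# Hypothetical elements from the OSM data layer.
elements = [
    {"id": 1, "type": "way", "tags": {"building": "house"}},
    {"id": 2, "type": "way", "tags": {"highway": "residential"}},
    {"id": 3, "type": "node", "tags": {"building": "yes"}},
    {"id": 4, "type": "relation",
     "tags": {"building": "apartments", "type": "multipolygon"}},
]

subject_ids = [e["id"] for e in elements if is_building_way(e)]
print(subject_ids)
```

Only the first element is selected. Buildings mapped as relations are left out by this filter, which is likely one reason they show up in the unmatched list (see Example 5).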

Screen shot of the configuration pane for the conflation plugin showing that the reference and subject have been set

Hit “Generate match…” to set the tool to work. It will relatively quickly generate a list of matched and unmatched items. This set has 23 unmatched items listed in the “Reference only” tab. Let’s go take a look.

Conflation tool dialog showing a list of matched and unmatched items

Unmatched items

Each unmatched item will need to be reviewed to determine what needs to be done. There are lots of ways things can go unmatched, but I’ll outline the most common here.

Example 1: An ambiguous area.

In this case the aerial imagery shows an incomplete construction site that doesn’t really match the address data. It’s not at all clear that any of this is useful, so we delete the nodes and move on. Remember, we don’t have to add everything, but we really want to avoid adding incorrect data.

Aerial view of a large construction site, mostly still brownfield, with address nodes scattered across it

Example 2: Features without buildings.

Here are three cases where the address node should be merged into the OSM layer: a house without a building object, a utility area, and a park.

Aerial view of a house but no OSM building outline

Aerial view of a fenced-in utility area

Aerial view of a park

Example 3: An extra address node.

No data set is perfect. Sometimes you’ll find address nodes that don’t obviously belong to anything on the ground. Delete them and move on.

Aerial image of a suburban street with an address node in the middle of the street

Example 4: Bad conflation due to misaligned data.

Again, we have a data set issue that needs fixing. It’s clear from context that the address data for some houses is shifted a bit to the west. It is best to move the nodes where they belong and merge them by hand.

Aerial image of a suburban street with misaligned address nodes such that one house has no address node associated with it

Example 5: Large building or building that is a relation.

Sometimes an address clearly goes with a building but the building shape makes the conflation difficult. Merge it by hand.

Aerial image of a large building with a lone address node nearby but not in the outline

Reviewing matched items

Now we need to review the matched items. Matches with a low “distance” mean that the address node was sitting very near the centroid of a building. Let’s start by reviewing the large distance matches.

Screenshot of the conflation matches sorted by decreasing "distance". The top item has a distance very near our cutoff of 30

Click through them one at a time and see if each looks like a reasonable match. You only need to intervene on matches that are incorrect. Keep going until the distance is small enough that most items are inside the building outline. For this set that’s something like 15.

The most common mismatch looks like this:

A match where the address node is matched to the neighbor’s shed
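The review order described above can be sketched in a few lines: sort the matches by decreasing distance and stop once the distance drops below the point where most nodes sit inside their building. The match list and the `matches_to_review` helper are hypothetical, and the threshold of 15 is simply the value that worked for this particular set.

```python
def matches_to_review(matches, review_above=15.0):
    """Return matches farther than review_above, largest distance first."""
    flagged = [m for m in matches if m["distance"] > review_above]
    return sorted(flagged, key=lambda m: m["distance"], reverse=True)


# Hypothetical match list; distances are in whatever unit the tool reports.
matches = [
    {"address": "addr_1", "building": "house_a", "distance": 2.1},
    {"address": "addr_2", "building": "shed_b", "distance": 28.4},
    {"address": "addr_3", "building": "house_c", "distance": 16.0},
    {"address": "addr_4", "building": "house_d", "distance": 9.7},
]

review_queue = matches_to_review(matches)
for m in review_queue:
    print(m["address"], "->", m["building"], m["distance"])
```

Only the two matches above the threshold come back, with the most suspicious one first.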

Bulk conflation and final review

You’re now ready to hit the big “Conflate” button. This will merge all of the address data into the “matching” building outlines. The area will look very nice.

Zoomed out aerial view of the OSM data now with address information

Pan through the area and see if anything looks amiss. When I did this I ended up needing to correct a set of houses where the address data had matched onto some smaller outbuildings and not the main structure. Moving the tags is easy with copy/paste in JOSM.

Aerial view of a cul-de-sac with sheds having address data that should be on the larger building outlines
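One way to spot this kind of problem without relying only on your eyes is to flag addressed buildings with a suspiciously small footprint. The sketch below computes each outline’s area with the shoelace formula and reports any addressed building under a threshold. It assumes planar coordinates in metres; the 40 m² threshold, the helper names, and the sample data are all made up for illustration.

```python
def polygon_area(outline):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def small_addressed_buildings(buildings, min_area=40.0):
    """Return ids of addressed buildings whose footprint is below min_area."""
    return [
        bid
        for bid, b in buildings.items()
        if "addr:housenumber" in b["tags"] and polygon_area(b["outline"]) < min_area
    ]


# Hypothetical data: the address for house_1 ended up on its shed.
buildings = {
    "house_1": {"outline": [(0, 0), (12, 0), (12, 10), (0, 10)],
                "tags": {"building": "house"}},
    "shed_1": {"outline": [(15, 0), (18, 0), (18, 3), (15, 3)],
               "tags": {"building": "yes", "addr:housenumber": "1204"}},
    "house_2": {"outline": [(30, 0), (40, 0), (40, 10), (30, 10)],
                "tags": {"building": "house", "addr:housenumber": "1208"}},
}

flagged = small_addressed_buildings(buildings)
print(flagged)
```

Anything flagged still needs a human look, since some small buildings legitimately have their own address.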

Upload

Run the validator and fix anything it flags before uploading. The most common fix is adding a directional or suffix to a roadway. Your first few runs will probably take a bit longer, but this process takes me about 10 minutes for a chunk of the map like this one.

Let me know if you have any questions or if you found this useful!

Happy mapping.