AI Generated Changeset Comments

Posted by FargoColdYa on 4/26/2024

Summary: What if AI wrote the changeset comments? We could send locations, tag types, and quantities and get a comment back. The AI would have to run locally with small models to keep costs down, and the user would validate the output.

Problem 1: Time. Assume that 1,000 users each create 2 changesets in 1 day, and that writing each changeset comment takes 3.5 seconds. 1,000 users × 2 changesets × 3.5 seconds per comment = 7,000 seconds. Collectively, OSM users would spend about 1.9 hours per day just writing changeset comments.
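A quick back-of-the-envelope check of those numbers (the user and edit counts are the post's assumptions, not measurements):

```python
# Rough cost of hand-written changeset comments, using the assumptions above.
users = 1_000              # assumed active editors per day
changesets_per_user = 2    # assumed changesets each editor uploads per day
seconds_per_comment = 3.5  # assumed time to type one comment

total_seconds = users * changesets_per_user * seconds_per_comment
print(total_seconds)           # 7000.0 seconds
print(total_seconds / 3600)    # ~1.94 hours of collective effort per day
```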

Problem 2: Skill Outsourcing. Users should spend their time on the things AI can't do.

Problem 3: Server-Side Peer Review. We already have human-generated changeset comments. We could also create AI-generated comments for the same changesets, then ask the AI, "are these 2 changeset comments so different that the edit looks malicious?"
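A minimal sketch of what that server-side check could look like. The `ask_model` call is purely hypothetical and stands in for whatever local model is used:

```python
def build_review_prompt(human_comment: str, ai_comment: str) -> str:
    """Build a prompt asking the model to compare the two comments."""
    return (
        "Here are two descriptions of the same OpenStreetMap changeset.\n"
        f"Human-written comment: {human_comment}\n"
        f"AI-generated comment: {ai_comment}\n"
        "Are these two descriptions so different that the edit looks "
        "malicious or misleading? Answer YES or NO with one sentence of reasoning."
    )

# ask_model() is a placeholder for a call to a small local model:
# verdict = ask_model(build_review_prompt(human_comment, ai_comment))
```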

General AI Inputs: 1. Location: Where did the user map? 2. Feature Types: What tags did the user use?

AI Prompt: “You are an AI system. A user made edits in OpenStreetMap, a collaborative mapping project. They mapped locations[Mappleville, MN, USA; Bobville, MN, USA] with tags[50xSidewalks, 20xMarkedCrossings, & 10xReligious Areas]. You will create a changeset comment that concisely tells human reviewers what this changeset was about in 3 sentences or less. Exact numbers are not important. Changesets describe changes, so don’t request anything. Don’t mention anything that is common across all changesets.”

AI Response (https://www.meta.ai/): “Added sidewalks, marked crossings, and religious areas in Mappleville and Bobville, MN. Improved pedestrian and accessibility mapping. Enhanced local community information.”
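One way the prompt above could be assembled from the general inputs (location and feature types); the function and field names are my own guesses, not an existing API:

```python
def build_changeset_prompt(locations: list[str], tag_counts: dict[str, int]) -> str:
    """Fill the prompt template with the changeset's locations and tag counts."""
    location_text = "; ".join(locations)
    tag_text = ", ".join(f"{count}x{tag}" for tag, count in tag_counts.items())
    return (
        "You are an AI system. A user made edits in OpenStreetMap, a collaborative "
        f"mapping project. They mapped locations[{location_text}] with tags[{tag_text}]. "
        "You will create a changeset comment that concisely tells human reviewers what "
        "this changeset was about in 3 sentences or less. Exact numbers are not important. "
        "Changesets describe changes, so don't request anything. Don't mention anything "
        "that is common across all changesets."
    )

prompt = build_changeset_prompt(
    ["Mappleville, MN, USA", "Bobville, MN, USA"],
    {"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10},
)
```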

Specific AI Inputs for Locations: 1. Cities[1 to 5], States[1 to 5], Countries[1 to 5]. 2. Is this a place with unclear boundaries? (What if somebody maps the ocean?) 3. What is the size of the bounding box for this edit in km?
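A rough sketch of the bounding-box input, using an equirectangular approximation (good enough for a prompt hint, not for surveying):

```python
import math

def bbox_size_km(min_lat, min_lon, max_lat, max_lon):
    """Approximate width and height of a changeset bounding box in kilometres."""
    km_per_degree = 111.32  # roughly 1 degree of latitude
    height_km = (max_lat - min_lat) * km_per_degree
    mean_lat = math.radians((min_lat + max_lat) / 2)
    width_km = (max_lon - min_lon) * km_per_degree * math.cos(mean_lat)
    return width_km, height_km

# Example: a small suburban changeset
print(bbox_size_km(44.95, -93.30, 44.98, -93.25))  # roughly (3.9, 3.3) km
```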

Specific AI Inputs for Feature Types: Tags[1 to 6] & corresponding Quantities

Algorithms: 1. Sort the tags by how frequently each was used, in descending order, limited to the top 5. 2. For each city, count how often each tag was used and create a table, unless the table would be huge. (A sketch of both steps follows below.)
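A sketch of both steps; the input shape (a list of (city, tag) pairs, one per edited feature) is an assumption:

```python
from collections import Counter

def top_tags(edits: list[tuple[str, str]], limit: int = 5) -> list[tuple[str, int]]:
    """Step 1: sort tags by how often they were used, descending, capped at `limit`."""
    counts = Counter(tag for _, tag in edits)
    return counts.most_common(limit)

def tags_per_city(edits: list[tuple[str, str]], max_rows: int = 50):
    """Step 2: count tag usage per city; give up if the table would be huge."""
    table = Counter(edits)  # keys are (city, tag) pairs
    if len(table) > max_rows:
        return None  # table is huge, skip it
    return table

edits = [("Mappleville", "sidewalk")] * 30 + [("Bobville", "sidewalk")] * 20 \
      + [("Mappleville", "crossing")] * 20 + [("Bobville", "place_of_worship")] * 10
print(top_tags(edits))       # [('sidewalk', 50), ('crossing', 20), ('place_of_worship', 10)]
print(tags_per_city(edits))  # Counter({('Mappleville', 'sidewalk'): 30, ...})
```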

Complexities of the process: 1. Disputed Boundaries: What if this was the changeset that changed the border? 2. Large Edits: Do not run this over changesets larger than 500 edits. 3. Malicious Inputs: Somebody put a war crime in a building's name tag, and the AI received that as an input. What does the AI say? 4. Resource Allocation: Developer time could be better spent doing something else. 5. Irregular Edits: What if I use every tag in OSM only once? What if I map an area the size of a continent? (A sketch of guard conditions for some of these cases follows below.)
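A sketch of pre-flight guards for some of those cases; the 500-edit limit comes from the post, the other thresholds are placeholders:

```python
def should_generate_comment(edit_count: int, bbox_km: tuple[float, float],
                            distinct_tags: int) -> bool:
    """Skip AI comment generation for changesets that are too big or too irregular."""
    max_edits = 500         # from the post: don't run over changesets larger than 500 edits
    max_span_km = 1000.0    # placeholder: skip continent-sized bounding boxes
    max_distinct_tags = 50  # placeholder: skip "every tag in OSM, used once" changesets

    if edit_count > max_edits:
        return False
    if max(bbox_km) > max_span_km:
        return False
    if distinct_tags > max_distinct_tags:
        return False
    return True
```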

Complexities of AI in general: 1. Uncommon Languages: Are these models only good at the 5 biggest languages? 2. Edit Safety: The user mapped religious areas in 2 different nations that share a disputed border and are at war. 3. Money: Laptops with TPUs are not common in 2024 (but will be in 2030). Mobile editors with TPUs are not common in 2024 (but will be on high-end phones in 2030). Running AI costs money. Who will pay for it?

Solutions: 1. The AI runs locally on a TPU. 2. If you use the output of an AI for your changeset comment, you are responsible for its safety.
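As a rough illustration of "runs locally with a small model", this is how the prompt built earlier could be fed to a small quantized model via llama-cpp-python; the model file and parameters are placeholders, and any other local runtime would do just as well:

```python
from llama_cpp import Llama  # assumes llama-cpp-python is installed

# Load a small quantized model from disk; the path and settings are placeholders.
llm = Llama(model_path="small-model.gguf", n_ctx=2048)

# build_changeset_prompt() is the helper from the prompt-building sketch above.
prompt = build_changeset_prompt(
    ["Mappleville, MN, USA", "Bobville, MN, USA"],
    {"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10},
)

# Generate a short comment; the user reviews it before the changeset is saved,
# since they remain responsible for it.
result = llm(prompt, max_tokens=120, stop=["\n\n"])
comment = result["choices"][0]["text"].strip()
print(comment)
```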

Disclaimers: 1. I don’t work in AI. 2. I am describing something I don’t have the resources to build. 3. I assume that developer resources should focus on high-priority tasks.

Expected Development Difficulty: 1. Web to TPU is hard: graphics have standard libraries (OpenGL), but TPUs are not common and don’t have standard libraries. 2. The per-city tag counts can create giant tables if you are not careful.

The benefits of manual changeset comments: 1. Spam is harder to create in bulk. 2. Self-reflection is encouraged. 3. Individuality is good to see. 4. Changeset comments are OSM’s alternative to a Change Approval Board (CAB meetings); they are supposed to take effort.

TLDR: OpenStreetMap (OSM) edits could be aided with AI-generated changeset comments, potentially saving the community about 1.9 hours of collective effort per day. AI could analyze edit locations and feature types to generate concise comments, freeing users to focus on tasks that require human expertise. However, implementing AI-generated comments requires addressing complexities like disputed boundaries, TPU libraries, and malicious inputs.