Growth and Maintenance in OpenStreetMap
Alan McConchie
Design Technologist, Stamen Design
PhD Candidate, University of British Columbia
@mappingmashups
State of the Map US, June 7, 2015
a decade of OpenStreetMap:
1. Do OSM's first “explorers” stick around?
2. Does activity shift to “gardening”?
3. Do imports hinder growth of community?
I discovered that the edits I made in Aylmer and Hull — suburbs of Gatineau, Quebec that I know very well, because I pass through them every time I go to Ottawa — had been replaced by a batch import from CanVec. It removed a lot of human intelligence that I had added to the existing streets, such as pedestrian crossings, traffic lights, and turning circles. It also obliterated service roads, and turned all streets into highway=unclassified instead of residential, tertiary, or secondary, and divided highways into two-way streets. And any areas that had shared a node with a street way were now bollixed up. In other words, a big mess to revert, or to fix: the CanVec data had some useful information to add to the map — but not at the cost of erasing hours upon hours of existing work. My work. I was pissed[…] My basic point was, and is, the following: that you’re not going to get local people to contribute to OSM if they believe that their edits are going to be wiped out by the next person to import a pile of data.
Jonathan Crowe, 2011 (emphasis mine)
http://www.maproomblog.com/2011/02/the_state_of_openstreetmap_in_canada.php
of a greater movement of collaborative productivity, where people all over the world can and do join forces to create something great, something of value. I believe that in 40 years, probably even in 15, hardly anything of the data we have collected will retain much value - but we will have been part of a great development, and mankind will be the better for it. Will OSM, instead of being the social endeavor of “a great map that people made themselves”, then be the technical challenge of “the geo database where a few clever guys managed to combine lots of existing data”?
Frederik Ramm, 2012 (emphasis mine)
https://lists.openstreetmap.org/pipermail/talk-us/2012-December/009966.html
not universally accepted) theory is that massive imports are to blame for slowing down the development of the US community. …instead of the fun task of going out, finding new streets, and filling in a blank map, new contributors in the US were now faced with the relative tedium of correcting repetitive errors in an existing dataset. Meanwhile, in countries like Germany, almost all data has been collected by volunteer mappers, with only a few small-scale imports conducted in places where there already was an active community. This was not by design - there simply wasn't a comparable dataset that could have been imported - but it was ultimately beneficial. At least if you accept the theory, of course.
Tobias Knerr, 2015 (emphasis mine)
http://forum.openstreetmap.org/viewtopic.php?id=30121
Divide data into chunks
• Community members manually upload each chunk
NOTE: The term "Import" is highly loaded in the OSM community. "A distributed and curated merge," is a more accurate description of what Seattle OSM is planning to do.
Source: wiki.osm.org/wiki/Seattle_Import
imports
Four training events and editathons
Five events out walking around
20+ active participants
slideshare.net/gwhathistory/osm-sotm-us-2013-imports4community-002
affect how local OSM communities grow and evolve?
Do the recent imports have different data signatures from TIGER?
Broadly, how has OSM evolved differently in different cities?
The analysis
dump of OSM up to December 31, 2014
• Extract a small number of study areas
• Only look at nodes, disregard all other data types
• Overlay grid (1 km)
• Find earliest node in each grid cell
• For each username, count following values:
    Number of total node edits
    Number of version 1 edits (node creation)
    Number of version 2+ edits (move node or change attributes)
    Number of edits in “blank spots on the map”
• Only count a user's activity within each study area.
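The counting steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the talk. The `NodeEdit` record, the `grid_cell` and `summarize_contributors` helpers, and the flat-earth approximation for the 1 km grid are all assumptions made for the example. It also assumes a “blank spot” edit is the earliest node edit in its grid cell, and that the input has already been clipped to a single study area.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class NodeEdit:
    """One version of one node (hypothetical record, not an OSM library type)."""
    user: str
    version: int
    timestamp: datetime
    lat: float
    lon: float


def grid_cell(lat, lon, origin_lat, origin_lon, cell_km=1.0):
    """Map a coordinate to a (row, col) cell of an approximate 1 km grid.

    Assumption: a simple equirectangular approximation anchored at the
    south-west corner of the study area, adequate for a city-sized box.
    """
    km_per_deg_lat = 111.32
    km_per_deg_lon = 111.32 * math.cos(math.radians(origin_lat))
    row = math.floor((lat - origin_lat) * km_per_deg_lat / cell_km)
    col = math.floor((lon - origin_lon) * km_per_deg_lon / cell_km)
    return (row, col)


def summarize_contributors(edits, origin_lat, origin_lon):
    """Count total, version 1, version 2+, and blank-spot edits per username."""
    edits = list(edits)  # iterated twice below

    # Find the earliest node edit in each grid cell.
    first_in_cell = {}
    for e in edits:
        cell = grid_cell(e.lat, e.lon, origin_lat, origin_lon)
        current = first_in_cell.get(cell)
        if current is None or e.timestamp < current.timestamp:
            first_in_cell[cell] = e

    stats = defaultdict(
        lambda: {"total": 0, "created": 0, "modified": 0, "blank_spot": 0}
    )
    for e in edits:
        s = stats[e.user]
        s["total"] += 1
        if e.version == 1:
            s["created"] += 1   # node creation
        else:
            s["modified"] += 1  # node moved or attributes changed

    # Assumption: the author of the earliest edit in a cell gets the credit.
    for e in first_in_cell.values():
        stats[e.user]["blank_spot"] += 1

    return dict(stats)
```

Restricting the input to one study area before calling `summarize_contributors` mirrors the last bullet: a user's activity elsewhere never enters the counts.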
OSM contributors plotted by number of “blank spot” edits and number of total edits. Circles are individual editors, sized according to number of days active. London
Bay Area: OSM contributors plotted by number of “blank spot” edits and number of total edits. Circles are individual editors, sized according to number of days active.
boxes around approximate extent of urban areas
Bounding box population estimated using Gridded Population of the World dataset for year 2000 (GPWv3)
London: population 8,500,000; land area 3300 sq km
Berlin: population 1,900,000; land area 2100 sq km
Seattle: population 3,200,000; land area 6300 sq km
nodes by month (new nodes + modified nodes)
Chart annotations: TIGER imports; Boston building import; Haiti earthquake; Paris building import; second Boston import; NYC building imports; Seattle building import
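A monthly series of this kind (new nodes plus modified nodes) could be tallied roughly as follows. This is a sketch under assumptions, not the talk's actual pipeline. The `nodes_by_month` helper and its input shape, an iterable of `(timestamp, version)` pairs, are invented for illustration. Version 1 is treated as a new node and any higher version as a modification.

```python
from collections import Counter
from datetime import datetime


def nodes_by_month(edits):
    """Tally new and modified node edits per calendar month.

    edits: iterable of (timestamp, version) pairs (hypothetical input shape).
    """
    new = Counter()
    modified = Counter()
    for timestamp, version in edits:
        month = timestamp.strftime("%Y-%m")
        if version == 1:
            new[month] += 1       # node creation
        else:
            modified[month] += 1  # later version of an existing node

    months = sorted(set(new) | set(modified))
    return {
        m: {"new": new[m], "modified": modified[m], "total": new[m] + modified[m]}
        for m in months
    }
```

Keeping the two counters separate makes it possible to plot additions and maintenance as distinct lines, or to sum them as the caption does.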
nodes by month
Note: Modification history lost before OSM API v0.5
Chart labels: Berlin, Haiti, Bay Area, Montevideo, London, New York, Los Angeles, Paris
nodes by month: Four U.S. cities
solid lines: aggregated blankspot contributors (Bay Area n = 119, Seattle = 54, New York = 61, Boston = 27)
dotted lines: aggregated non-blankspot contributors (Bay Area n = 3485, Seattle = 2067, New York = 3116, Boston = 1406)
law effects trump everything else.
• Gradual shift to new contributors (less so in Canada!)
• Subtle shift to maintenance tasks…
• …but there are always new things to add.
• Maintenance is “bursty”, like additions are.
• Do imports hinder the growth of community? …not very clear cut at all