City Labels in OpenStreetMap
In the second part of his critique of OpenStreetMap Justin O’Beirne discusses various issues surrounding labelling of cities in OpenStreetMap’s cartography, specifically in our default mapnik rendering of the US.
The issues he highlights can be broadly divided into two categories: problems with our stylesheets and rendering technology; and problems with our data, and in particular with our US data.
The issue I intend to address here is the one he tackles first: label density, which stems largely from data quality and, more importantly, consistency issues. Although the post talks about cities, the real question is what gets tagged as a city and what gets tagged as some lesser type of place.
By way of background, OpenStreetMap tagging has four commonly used values for the place tag which designate a populated place. In order, from largest to smallest, those are: city, town, village and hamlet. The question which then arises is how we decide which of those values to use for a given settlement.
Like so many tags the specific names used come, because of OpenStreetMap’s origins, from typical British usage. It is therefore generally not a good idea to interpret the names too literally in other jurisdictions – indeed some tag values like highway=trunk aren’t even interpreted literally in England!
To the British the question of which places should be cities is fairly clear. There are a few alternative definitions (places with royal charters vs places with cathedrals) but those only affect a few edge cases; in general there is little debate, and only a relatively small number of large and/or important towns qualify.
At the other end of the spectrum a hamlet would normally only be used for very small places that amount to little more than a handful of houses.
In between lies the distinction between villages and towns which is much less well defined but in my opinion would generally lie around the few thousand mark in population terms – once you reach 2-3 thousand residents you are probably a town rather than a village.
Interestingly the OpenStreetMap wiki disagrees a little here and suggests hamlet for populations up to one thousand and village up to ten thousand. I would argue that both of those values are too high for normal British usage and certainly larger than I would use when tagging places.
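To make the difference concrete, here is a minimal sketch of a purely population-based classifier. The hamlet and village limits are the wiki figures quoted above; the 100,000 lower limit for a city is my assumption, as is the treatment of the exact boundary values, and the function name is purely illustrative.

```python
def place_value(population, hamlet_max=1_000, village_max=10_000, city_min=100_000):
    """Return a place=* value for a settlement of the given population.

    hamlet_max and village_max default to the wiki figures quoted above.
    city_min is an assumption for illustration, not a documented rule.
    """
    if population < hamlet_max:
        return "hamlet"
    if population < village_max:
        return "village"
    if population < city_min:
        return "town"
    return "city"


# With the wiki-style limits a place of 2,500 people is still a village.
print(place_value(2_500))
# With a village limit closer to British usage it becomes a town.
print(place_value(2_500, village_max=2_000))
```

The limits are parameters rather than constants precisely because the right values are a matter of local convention, which is the point at issue here.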
All of which brings us back to the variations in density in the US map…
The first thing to understand about the US is that most populated places there appear to have been initially imported from the USGS GNIS data set. I haven’t found any documentation as to how places were categorised but I suspect it was done based on population, most likely using the values in the OpenStreetMap wiki or something close to them.
Justin’s first example starts with the apparent high density of places in Florida so I took a look at a randomly selected place in his example which appeared to be fairly small – the town(?) of Frostproof. The OpenStreetMap history for Frostproof reveals that it was originally imported from GNIS as a village (probably because of its population of 2922) but has recently been retagged as a city.
My suspicion is that this is the result of an overly literal interpretation of the place=city tag. As I understand things, many relatively small places in the US officially style themselves as cities, and Wikipedia certainly describes Frostproof in this way. Nobody in Britain, or indeed probably in Europe as a whole, would consider somewhere that small to be a city, however, and tagging it as such certainly goes against normal OpenStreetMap tagging practice.
In most of the rest of the US no such retagging of small towns as cities appears to have taken place, making place names there appear much less dense at low zoom levels. The sort of places which Justin’s article suggests should be appearing in those areas mostly appear to be in the 25-100 thousand population range and hence have been tagged as towns during the GNIS import. The solution here, if more place names are considered cartographically desirable, would be either to lower the threshold at which places are tagged as cities instead of towns, or to alter the stylesheets to render towns at lower zoom levels.
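As a rough illustration of the first option, the sketch below shows how the set of places that would be tagged as cities changes as the threshold is lowered. The place names and populations are hypothetical and chosen only to span the ranges discussed above; the 100,000 and 25,000 thresholds are assumptions for the sake of the example, not agreed values.

```python
# Hypothetical populations, for illustration only; not real data.
SAMPLE_PLACES = {"A": 30_000, "B": 60_000, "C": 120_000, "D": 3_000}


def city_labels(places, city_min):
    """Return the names that would be tagged place=city for a given threshold."""
    return sorted(name for name, pop in places.items() if pop >= city_min)


print(city_labels(SAMPLE_PLACES, city_min=100_000))  # ['C']
print(city_labels(SAMPLE_PLACES, city_min=25_000))   # ['A', 'B', 'C']
```

Lowering the threshold changes the data itself, whereas the stylesheet option leaves the tags alone and only changes the zoom level at which towns are drawn.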
The relatively high density around Los Angeles which the article mentions appears to be the result of a fairly large number of places with populations just over the 100 thousand mark. Despite their large populations, and the fact they are likely independent cities legally, I suspect that many of them would be tagged as suburbs in Britain rather than as cities or towns and hence would be given lower priority when rendering.
The real lesson to be drawn from all this however is that the US OpenStreetMap community probably needs to reach a consensus on how to map populated places to tag values so that a better level of consistency can be achieved with less variation from area to area across the map.