Open-source geo is really something right now

tunesmith · on June 12, 2016

One thing that mystifies me is a common pattern of attitudes I see regarding capturing knowledge.

Clicking around, you'll find repeated references to Leibniz and Descartes dreaming of a way to capture all of human knowledge via axiomatic statements, lemmas, and conclusions, a dream that is then easily dismissed as impractical.

This article talks about the Semantic Web, which is then immediately made fun of with derisive references to Rudolf Carnap and the urge to "Pokemonize" all of human knowledge.

What I want to know is, even if it is impossible to capture all of human wisdom/conclusions, why does that by definition mean the effort is worthless? After all, looking at information rather than conclusions, Wikipedia isn't complete and will never be complete, but it is still useful for what is there.

cwp · on June 12, 2016

It's not just that it's impossible to capture everything in one database; it's that it's a waste of time to try. There is no format that can capture all nuances and be useful for all purposes.

Data is messy. The only thing you can do with it is preserve the original format that was used to record it, transform a copy to a format that's useful for the analysis you want to do, and join it to other, similarly messy data.
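To make that preserve / transform / join workflow concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the sample records, the field names (`addr`, `ADDRESS`, `parcel_id`), and the helpers `normalize`, `transform`, and `join` are hypothetical, and the normalization is deliberately crude. Real data would need far more care.

```python
import copy

# Hypothetical raw records, kept exactly as they were recorded.
RAW_ADDRESSES = [
    {"addr": " 12 Main St. ", "city": "springfield"},
    {"addr": "7 OAK AVE", "city": "Springfield "},
]
RAW_PARCELS = [
    {"ADDRESS": "12 main st", "parcel_id": "P-001"},
    {"ADDRESS": "7 Oak Ave.", "parcel_id": "P-002"},
]


def normalize(text):
    """Lowercase, drop periods, and collapse whitespace: a crude join key."""
    return " ".join(text.lower().replace(".", "").split())


def transform(records, field):
    """Return copies with an added 'key' field; the originals stay untouched."""
    out = []
    for rec in records:
        new = copy.deepcopy(rec)
        new["key"] = normalize(rec[field])
        out.append(new)
    return out


def join(left, right):
    """Inner join of two transformed lists on their 'key' field."""
    index = {r["key"]: r for r in right}
    return [{**l, **index[l["key"]]} for l in left if l["key"] in index]


addresses = transform(RAW_ADDRESSES, "addr")
parcels = transform(RAW_PARCELS, "ADDRESS")
joined = join(addresses, parcels)

for row in joined:
    print(row["addr"].strip(), "->", row["parcel_id"])
```

The point of the design is that the raw lists are never modified, so a different analysis can apply a different `transform` to the same originals later.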

Wikipedia is a perfect example. It's messy as hell. It's blobs of data with connections between them. Most of the blobs are unstructured text (or at best semi-structured), but there are a bunch of images in various formats there, and links to outside data sources in myriad formats and locations. There are some standards and attempts to maintain order, but the whole thing is damn difficult to process by machine. It's a very, very long way from the Semantic Web.

adrianN · on June 13, 2016

It's certainly possible to capture everything a human can know in one database. The human brain demonstrates that it's possible. It might be a bit tricky to capture artifacts that come from messy wetware with something simpler than a simulation of messy wetware, but there is no theoretical barrier to having all human knowledge in one query-able location.

akira2501 · on June 13, 2016

> It's certainly possible to capture everything a human can know in one database.

Storing the information isn't the real challenge; structuring the connections between different pieces of knowledge is what seems to be the very difficult part.

Without the latter, you can have all the data in the world, but no way to make any use or sense of it.

pjc50 · on June 13, 2016

The brain is not a computer and the mind is not a database. It does not store discrete data items.

gosub · on June 13, 2016

If it can compute, it's a computer. If it can store and retrieve data, it's a database.

mcherm · on June 13, 2016

The human brain stores information. I would not call it a database.

Let's start with your statement: "If it can store and retrieve data, it's a database." One of the notable characteristics of human memory is that it is not reliable. It does NOT reliably retrieve the data that it stored -- and I don't just mean that we forget some things, I mean that many of our memories are factually incorrect.

Human memory is extraordinarily USEFUL, but in order for the term "database" to have a useful meaning, I have to categorize human memory as a form of information storage that is NOT a "database".

scrupulusalbion · on June 13, 2016

This would mean that a floppy disc (hardware) and the filesystem used to arrange the bits thereon (software) are each a database. How is breaking down the distinction between database, filesystem, and storage medium helpful?

Retra · on June 13, 2016

I think the idea is that it could get closer to the Semantic Web if there were more respect for that kind of effort, and even though it doesn't solve all of the problems with data, it does solve some of them.

maxerickson · on June 12, 2016

I think this article is making fun of the idea that you need to start with a tidy structure.

That matches what happened: the OpenAddresses project mostly put all the messy data in one place, which made it possible to extract the parcels from the mess. Tracking down the data provided a lot more value than pontificating about how parcel data should best be stored.

jboynyc · on June 13, 2016

> why does that by definition mean the effort is worthless?

Paul Ford himself doesn't claim it's a "worthless" exercise in this piece. In fact, he says "It’s pretty exciting to imagine that one day we’ll stumble into the one true universal database of all human knowledge."

As he portrays it, the vision of the semantic web is a kind of receding horizon that drives forward a lot of worthwhile efforts. That's why he brings it up in the first place. Sure, he pokes fun at it a bit, but that's just Paul Ford's (highly effective and enjoyable, IMO) writing style.

pella · on June 12, 2016

-> https://openaddresses.io/

hackney · on June 13, 2016

Interesting. The post references sources such as Mapzen, which allows you to extract portions of data for peculiar needs. Mapzen also provides an API for turn-by-turn navigation, and there is a plugin to use it on a web-based map. I use DeLorme's Street Atlas on my tablet. It works great and is inexpensive. Just don't bother with any map pack additions: when you try to download more than a few square miles, DeLorme promptly throttles you to 1990 dial-up speeds. Viking (SourceForge) is also an option, but there is no real-time GPS with Windows.

caniszczyk · on June 13, 2016

Also check out the work done by LocationTech: https://www.locationtech.org/