Actual title: Google and other tech giants are happy to have control over the We...

Vinnl · on Aug 7, 2020

> But there is absolutely no business incentive there - rather the opposite. Easy portability of data is not something most companies would want.

That depends on what kind of data it is. For example, your home address is not part of your bank's primary business model, but keeping it up-to-date is important for it. If data portability in and out of the bank makes it more likely that you'll keep it up-to-date, that's useful for your bank as well.

Legislation and customer demand are also making it more and more palatable. If some data is not critical to your business model, but being the sole guardian of it is a legal/reputational liability, then handing control of that data over to someone else and re-using it is very useful.

riffraff · on Aug 8, 2020

That's interest on the side of the data consumer, not the data provider, for lack of better words.

If the bank were the one owning the information, they would not want it shared with others, as that would allow their clients to easily migrate to another bank, which they definitely do not want.

But as the one receiving the data, sure, it would be nice to have others share it with me, they'd say.

I'm afraid without legislation data sharing is never going to be a thing.

Vinnl · on Aug 8, 2020

At this moment that bank is the entity that keeps this data. Their challenge is, however, that the data gets outdated. But if they give other parties the ability to access that data, then the consumer will have more motivation to keep it up-to-date, and the bank will now have access to more accurate address data.

(Note that the bank is an example - it could be another party.)

mschuster91 · on Aug 8, 2020

The solution for this would be for banks to use the government as a single source of truth - in Germany we have the Melderegister anyway, since it's mandatory to register your primary address.

Unfortunately it's not allowed by law that a consumer gives "push access" to e.g. banks, health insurance or employers.

bryanrasmussen · on Aug 8, 2020

somehow all this makes me think that well-maintained metadata would be a real boon to the more untrustworthy elements of the web.

Vinnl · on Aug 8, 2020

A fair concern - this really depends on legislation like GDPR that only allows data sharing with the consumer's explicit consent.

jacques_chester · on Aug 7, 2020

It's a textbook collective action problem. Everyone would gain from having a high-quality shared ontology, but nobody gains enough individually (it's a public good).

The typical solutions to collective action problems are (1) benefactors who subsidise production (either privately or through taxation), or (2) direct command and control. Google was apparently filling the role of benefactor.

PeterisP · on Aug 7, 2020

I'm not even sure if everyone would gain from having a high-quality shared ontology, because as soon as you go beyond trivial examples, the details of the data model inevitably have competing incompatible needs which require some compromise.

I could certainly imagine that for many companies the disadvantages of using a model that's not simply a copy of their specific view of that problem domain are larger than the hypothetical benefits of interoperability, so even if such a shared ontology existed, many would intentionally choose to use their own ontology instead of adapting to that standard.

mandevil · on Aug 7, 2020

At one point I worked at a company founded in 2005ish, so one of their core things was an ontology. We found that while some very generic things were reusable (person and address, say) almost everything that drove business logic was different, use case to use case.

majormajor · on Aug 8, 2020

Yeah, trying to standardize this seems like it would quickly turn into one of those "things people believe about time" rabbit-holes of edge cases and differences.

Even inside a single company I get scared now when someone says "if only we had just one standard way to handle this sort of thing"... if it's rarely simple for just one company, how would it work globally?

breck · on Aug 8, 2020

This is a really good point.

Without substantial benefits of a large universal ontology and without the ability to painlessly diverge from an ontology to not compromise on accurately modeling a particular domain, it’s hard to see the net benefits. Everyone will want to customize things for their domain, or point of view. An ontology should be easy to fork, like a repo.

smoe · on Aug 8, 2020

I think even in those cases there are benefits.

I would certainly want to extend, modify and replace the data models for my core business as I see fit. But beyond those there are still going to be a whole lot of models I need in order to run the company but want to keep low-maintenance. E.g. for hiring I might not have strong opinions on what a job post, a candidate or an application should look like, so I'd be happy sticking to the standard in those cases and benefit from it being easier to mix and match tooling and pass around the data.

Also, I reckon that a partially customized ontology, which is inevitable, is still easier to map between orgs than one each of them builds completely from scratch.

Or maybe see it less as a standardized ontology but as a standardized way to create ontologies.
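
A minimal sketch of what "stick to the standard, extend where needed" could look like in Python. Everything here is an assumption for illustration: `JobPosting` is only loosely modeled on the schema.org type of the same name, and `AcmeJobPosting`, `internal_cost_center` and `to_standard` are hypothetical names.

```python
from dataclasses import dataclass, asdict, fields


@dataclass
class JobPosting:
    # Shared "standard" shape (illustrative field names, not a real spec).
    title: str
    hiring_organization: str
    date_posted: str


@dataclass
class AcmeJobPosting(JobPosting):
    # Hypothetical org-specific extension on top of the shared shape.
    internal_cost_center: str = ""


def to_standard(posting):
    """Drop org-specific fields so the record can be passed to other tools."""
    standard_names = {f.name for f in fields(JobPosting)}
    return {k: v for k, v in asdict(posting).items() if k in standard_names}


posting = AcmeJobPosting("Engineer", "Acme", "2020-08-08", "CC-42")
print(to_standard(posting))
```

The extension stays local, and mapping back to the shared part is a mechanical filter rather than a hand-written translation between two unrelated models.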

fnord77 · on Aug 8, 2020

this is where NIST or a similar agency could step in.

acdha · on Aug 8, 2020

> I recently looked into using schema.org types as the basis for an information capturing system, but many of the types are somewhat outdated, of questionable quality or just missing.

It grew out of the semantic web community so this was roughly what I expected. That space just seems cursed to have these lofty ideals which are never realized because it’s hard to justify spending time on something which has no known consumer. Schema.org seemed poised to change that but they only use a couple of types and then only for a few types of searches.

dbish · on Aug 7, 2020

Spent time with schema.org years ago. It's just not needed/useful for most scenarios, and the amount of work and convincing it takes to get most groups to use it isn't worthwhile, so continuing to extend it isn't worth the effort.

jshen · on Aug 8, 2020

Not needed nor useful for who? It’s obviously useful for a more robust and open web, for our collective society, so I’m not sure who the subject of your statement is.

whbrown · on Aug 8, 2020

Not to be curt, but it clearly isn't obviously useful; otherwise the project wouldn't be languishing as it is. The notion of creating a single overarching conceptual map to regulate the representation of the varied manifold of human experience on the web is almost certainly a deeply misguided idea, and even if it's philosophically sound (a big if), it's not clear that schema is anything like the correct approach. I'm open to being convinced of its value if you'd like to elaborate, but I'll just say it's far from obvious.

jshen · on Aug 8, 2020

Ah, I think we might be talking about different things. I think the larger promise of the semantic web is a categorically different thing than adding a bit of metadata to pages to know basic things like author, content type, description, etc.

It’s the latter I think is clearly valuable, in order for us to have competition for the likes of Google and Facebook. It lowers the barrier for creating competing search engines, modern RSS readers, and even things like distributed social networks.
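
To make the "bit of metadata" case concrete, here is a rough sketch of how a small reader or crawler might pull it out of a page, assuming the page embeds schema.org data as JSON-LD in a `<script type="application/ld+json">` tag. It uses only the Python standard library; `JsonLdExtractor`, `extract_metadata` and `SAMPLE_PAGE` are made-up names for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_metadata(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample page, not taken from any real site.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BlogPosting",
 "headline": "Hello", "author": {"@type": "Person", "name": "Jane Doe"}}
</script>
</head><body>...</body></html>
"""

for item in extract_metadata(SAMPLE_PAGE):
    print(item["@type"], "by", item["author"]["name"])
```

This only covers the JSON-LD embedding; pages that use microdata or RDFa attributes would need a different extractor.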

PaulHoule · on Aug 7, 2020

Schema.org is designed to be useful to Bing and Google but not other entities. It is enough to help them compile better training sets to extract that kind of metadata without schema.org, but not enough to build a simple extractor that would be useful to a smaller software company.

zerocrates · on Aug 8, 2020

Yes, despite its agnostic branding and name indicating basically totally maximal scope, schema.org has basically the features Google's interested in supporting for pulling out things from pages and emails.

To the extent that other uses can basically piggyback on data that sites added to target Google, it does provide some value, but I don't see it as really even attempting to be a generally useful "semantic web" or linked data vocabulary in the sense of interoperating with other things.

melvinroest · on Aug 8, 2020

The Dutch startup I work at [1] is active in the semantic web technology space. It's not pretty much inactive. The industry is simply not in the foreground of things.

[1] triply.cc

dang · on Aug 7, 2020

We've edited the title to a different subset of the tweet. It's not always obvious how to condense those into 80 chars.

Submitted title was 'Google is happy to control core Web schemas, but they neglect project'

onion2k · on Aug 7, 2020

Development indeed seems slow, while changes that are needed by one of the larger involved companies get pushed through quickly.

Whichever company did that would be accused of trying to "take over" the web.

Ideally large companies should be sponsoring open efforts to define things that affect how the web works rather than doing the work themselves. Smaller open teams that move fast to define structures that work for as many people as possible, even if they're not perfect for Google, Microsoft, etc, would be more useful to the internet industry as a whole.

clairity · on Aug 7, 2020

i recently went through schema.org a bit while putting together a blog, and it was a long list (for a human to digest), but relative to all the objects in the world, tiny. google's vested interest and stamp on it was pretty evident.

i also went through microformats, which seems to be much smaller, and more tightly-focused around blogs and structuring data shared among federated sites.
