> It also makes it possible to cache things locally indexed on their IDs, and to form URLs that make requests for just the precise documents that aren't available locally. In order to achieve this, it's necessary to have both (1) IDs, and (2) a way to convert a list of IDs into a single request for all of the documents at once.
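As a sketch of that flow (the cache shape and document fields here are my own illustration, not anything from the spec):

```python
# Minimal sketch of a local cache indexed on document IDs; the "people"
# documents and their fields are illustrative assumptions only.
cache = {}

def remember(doc):
    cache[doc["id"]] = doc

def lookup(wanted):
    """Split requested IDs into locally cached docs and IDs still needed."""
    have = [cache[i] for i in wanted if i in cache]
    need = [i for i in wanted if i not in cache]
    return have, need

remember({"id": "1", "name": "Ann"})
have, need = lookup(["1", "9", "12"])
# "need" is ["9", "12"]: exactly the documents that aren't available
# locally, ready to be turned into one request for all of them at once.
```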
Couldn't the id column contain a canonical URL then? E.g.:

I think the biggest benefit of this kind of specification would be for data that is partly distributed and partly shared between different hosts, as there would be at least some common ground for how clients and servers communicate. IDs are very abstract and do not necessarily tie data to a particular host. Of course, for some data domains the ID could be a URL, but that is a decision made by the data provider. Making it a spec decision would cripple overall use.
I'd say it's the other way around. URLs are opaque to the client; IDs imply more knowledge of the implementation. E.g. the client would have to know which host to communicate with and how to construct URLs from IDs. With hyperlinked documents, all the client needs to know is HTTP.
URLs are opaque, and can often serve as very useful IDs, but, alone, they imply a one-at-a-time model of fetching documents, and this spec is trying to provide a way to easily request only the documents a client needs in a compound document.
Keep in mind that this spec actually requires that every ID be able to be readily converted into a URL based on information found in the same payload, so URLs are still front-and-center in the design. It just separates out the notion of a unique identifier, so that it can be used in other kinds of requests.
How would you combine multiple authors into a single request? With IDs plus a template, you can form a single request for all of the "people" you don't have yet, which is one of the primary goals here. If you use URLs as IDs, you give that up.
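For example, combining the missing IDs into one URL might look like the following. This is a naive sketch of RFC 6570-style template expansion; the template itself is an assumption, not something taken from the spec:

```python
# A URI template as it might appear in a payload (assumed, not from the spec).
template = "https://example.com/people/{people}"

def expand(template, **params):
    # Naive level-1 expansion: list values are joined with commas so a
    # whole batch of IDs collapses into a single request URL.
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        template = template.replace("{" + key + "}", value)
    return template

url = expand(template, people=["9", "12"])
# → "https://example.com/people/9,12"
```

One GET to that URL retrieves every "people" document the client is missing, instead of one request per author.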
If I understand your question correctly, I would say that http pipelining solves that issue. It can only be used for GET requests, so there are limitations.
SPDY begins to offer some very appealing alternatives, where the transport can push all of the individual dependent documents along with the document being sent. It really does fix things, pushing all the data at once in a glorious resource-oriented fashion. That said, I would also enjoy a spec that does resource description of subresources, so we can send linked data around without every piece of data having to be an endpoint.
That said, an immediate follow-on question arises: now that we're sending subresources, can we get the most important agent to understand and grok them? Can the browser follow our subresourcing, resolve those subresources to their canonical URLs, and serve them if it's seen them inside another document? There are two questions. First, is your spec good enough to enable that facility, where addressing can be well known? Here, in this JSON Resource Description spec as presented, yes, via URI templates, very good. Second, does the browser bother to inspect the JSON it sees? No? Well, I'm not super bothered that this academic interest hasn't materialized, knowing that at least in principle the specs make it possible.
Thanks for the link - I wasn't aware there were so many issues with pipelining. I have mainly used it server to server, where it seems to present fewer problems (not surprisingly, really: I'm in much better control of the chain of components).
It can only be used for GET requests, has problems in web browsers, and still requires the overhead of individual requests on the server side to construct and return many responses.
In theory, things like pipelining allow you to never have to worry about compound documents. In practice, I don't know anyone who has gotten this to work well for browser clients and general-purpose frameworks when dealing with non-trivial numbers of documents.
Server-side support seems like the wrong thing to base the protocol design on, but I admit that is probably just me showing my limited experience with very high-traffic APIs. I would think, though, that much of this could be alleviated by proper caching. As requests would naturally be finely granulated into individual resources, presumably that could be done efficiently.
The point about browser support is probably more pressing. I'm curious how big an issue that still is. Which browsers support it properly these days, and which don't?
I wonder if it would be worth building an API around the assumption of pipelining support and then providing a fallback hack for those that lack it, e.g. something similar to the good old _method hidden-field hack for the lack of HTTP method support. I'm thinking of something like an optional "batch request endpoint" that would tunnel multiple requests through, similar to what a pipeline would do. I believe Facebook offers something similar in their APIs.
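A batch-endpoint payload along those lines might look like this. The field names and the endpoint path are assumptions, modeled loosely on the Facebook-style batch requests mentioned above, not on anything specified here:

```python
import json

# Hypothetical batch payload: each entry describes one tunneled sub-request.
batch = [
    {"method": "GET", "relative_url": "people/9"},
    {"method": "GET", "relative_url": "people/12"},
]
body = json.dumps({"batch": batch})
# POST this body to a single (assumed) /batch endpoint; the server replays
# each sub-request internally and returns an array of individual responses,
# much as a pipeline would, but over one round trip.
```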
N.B. Pipelining can be used for any idempotent request (so PUTs and DELETEs work too). That said, lack of broad implementation support for pipelining is still an issue. Since HTTP/2.0 is being based on SPDY, hopefully we'll see a day where this is less of an issue.