What data format do libraries prefer for Web APIs?

Recently libraries have begun to focus quite a bit on linked data and the importance of APIs for building applications. There are a variety of options for the format of data returned by APIs: XML, JSON, RDF/XML, Turtle, CSV, MARC, to name a few. I was wondering if there is emerging consensus around what web service data format is preferable. Characteristics such as reusability, eas of use, extensibility, tool support are among the possible considerations. Opinions about the type of XML, JSON, RDF etc are welcome as well.

Ed Summers

api
linked-data
web

Comments

Joe: 'XML' is too broad a category. You've got REST, SOAP, WDDX ... and within SOAP, you've got doc/lit, rpc/encoded, doc/lit wrapped. And outside of SOAP, there's XML using attributes vs. all elements, etc.

Answer by Erin White

Our library has been developing with XML for our APIs but we've also been dipping our toes into JSON. For us the deciding factor for XML has been (1) our web technology of choice, PHP, and its ability to work easily with XML; (2) staff's expertise with XML over other formats; and (3) our vendor's API decisions, which until recently have been predominantly XML. I can see JSON becoming more popular for us in the future because it is so lightweight and flexible.

As far as emerging consensus, I have no answers but would be interested to hear others'.

Comments

phette23: I don't have an answer so I'll just comment: I personally feel JSON APIs have a clear advantage. They're lightweight, easy to process in any language, & work well with cross-domain requests (no server side scripting!). As someone who tries to work in JavaScript as much as possible, that's a huge win.
Joe: @phette23: the disadvantage is that you can't validate and there aren't the stub builders available like there are if you supplied a WSDL file for a SOAP service. (I've amazed people that they could consume one of our more complex services in under 30 min of work, when it takes days to work out the kinks on groups with similar interfaces done in JSON or REST)

Answer by Sam K

I think it depends on the desired end result. I mainly use APIs to obtain data for analysis not for web presentation so XML (with it's relatively high overhead) is fine. If I were doing more web work I think I'd like JSON better.

That said my language of choice is python and it's easy enough to parse and process either.

Comments

Answer by Chad Ferguson

Like the others have mentioned the two most accepted and widely used choices are XML and JSON. XML would be optimal in situations where your data is primarily marked-up, you need validation (schema), or require attributes. JSON on the other hand is more light-weight and is great for data using the basic types (primarily string), is native to javascript, and is considerably smaller than xml.

Personally I prefer to work with JSON.

Comments

Answer by The Beard

This would seem like a tough question to answer because much depends on the community the library itself serves. For example, a library that is part of a higher education institution may be able to take advantage of more current technologies (JSON, as an example) with an ample supply of students, graduate assistants and other technologists who can act more in concert with the times.

A typical library in a K-12 rural district may have one librarian who can choose technologies for acquisition, but their ability to scale or adopt new technologies could be difficult simply because there's little infrastructure to sustain any change they'd like to make.

In working with (and designing) a number of APIs related to search, discovery and retrieval of educational resources, one common thread that has emerged is just how powerful JSON is. XML is still very widely used.

Comments

Answer by Ed Chamberlain

JSON and XML seem to be used extensively. As API's tend to be layered over or originating from commercial systems, we often have to work around those limitations.

The public facing API's we've developed at Cambridge attempt to provide XML and JSON where possible. However, this has its drawbacks, in that the we've created data feeds that work ok in each model, rather than play up to the strengths of one.

I've personally found that JSON's proximity to the data models in programming languages makes it more preferable for developers. XMLs' ability to represent complex data models and concepts using nested structures and attributes makes it a nice fit for metadata folk.

We've also experimented with offering RDF and SPARQL. Feedback so far indicates that developers (librarycentric or otherwise) prefer easier lighter formats to work with although RDF has its uses. Even with RDF, its hard to find a universal standard that works across several sources of data.