Here's a contribution from Jeff Young, who manages the RDF aspects of VIAF:
Since Wikidata’s introduction to the Linked Data Web in 2014 and subsequent integration of Freebase, it has become a premier example of how to publish and manage Linked Data. Like VIAF, Wikidata uses Schema.org as its core RDF vocabulary and both datasets publish using Linked Data best practices. This consistency should allow applications to treat both datasets as complementary. The main difference will be in the coverage of entities/information, based on their respective sources.
The VIAF RDF changes outlined on the Developer Network blog are intended to further enrich the data and align it with this common purpose. Some of the VIAF changes provide additional information to help disambiguate entities, such as schema:location and schema:description. Where possible, schema:name values are now language-tagged, which should make it easier for applications to select a language-appropriate label for display.
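As a rough illustration, here is a minimal sketch (Python with rdflib) of how an application might pick a language-appropriate label from language-tagged schema:name values; the VIAF identifier and names below are made up for the example:

```python
from rdflib import Graph, URIRef, Namespace

SCHEMA = Namespace("http://schema.org/")

# Fabricated Turtle standing in for a VIAF entity description.
g = Graph()
g.parse(data="""
@prefix schema: <http://schema.org/> .
<http://viaf.org/viaf/123456> schema:name "Tolstoy, Leo"@en ,
                                          "Tolstoï, Léon"@fr .
""", format="turtle")

def label_for(graph, entity, preferred_langs=("en",)):
    # Return the first schema:name matching a preferred language tag,
    # falling back to any available name.
    names = list(graph.objects(entity, SCHEMA.name))
    for lang in preferred_langs:
        for name in names:
            if getattr(name, "language", None) == lang:
                return str(name)
    return str(names[0]) if names else None

print(label_for(g, URIRef("http://viaf.org/viaf/123456"), ("fr", "en")))
```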
The biggest change, though, is in the “shape of the data” that gets returned via Linked Data requests. Previously, this was a record-oriented view rather than a concise description of the entity. Like Wikidata, the new response will focus on the entity itself and depend on the related entities to describe themselves.
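To make the "describe themselves" idea concrete, here is a hedged sketch of dereferencing one entity and listing what the response says about it. The Accept header, URI, and entity identifier are assumptions for illustration, not documented guarantees of viaf.org:

```python
import requests
from rdflib import Graph, URIRef

uri = "http://viaf.org/viaf/123456"  # hypothetical entity URI
resp = requests.get(uri, headers={"Accept": "application/rdf+xml"})
resp.raise_for_status()

g = Graph()
g.parse(data=resp.text, format="xml")

# An entity-focused response centers on the requested URI; related
# entities appear as URIs that can be dereferenced in their own right.
for p, o in g.predicate_objects(URIRef(uri)):
    print(p, o)
```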
Alignment with Wikidata is a major step in the evolution of VIAF, which started with RDF/XML representations of name authority clusters in 2009 and transitioned to "primary entities" in 2011. The introduction of VIAF as Schema.org in 2014 extends the audience, and alignment with Wikidata further strengthens industry-standard practices. These steps should help ensure that VIAF remains an authoritative source of entity identifiers and information in the linked web of data.
Jeff
Note: We expect these RDF changes to be visible on viaf.org April 16, 2015. The bulk distribution will follow shortly after that.
--Th
There can be a bit of a tradeoff here, depending on the granularity of the dependent entities and the design of the backend storage system.
One problem is latency: the dependent entities cannot be requested until at least one round-trip time (RTT) has elapsed; if there are chains of dependencies, this can quickly add up (HTTP/2.0 can reduce this effect somewhat).
If most of the dependent entities are used most of the time then it can be more efficient to send a dataset containing multiple entities.
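A back-of-the-envelope sketch of those two points (chained fetches vs. a bundled dataset), with made-up numbers:

```python
# Assumed numbers, purely illustrative.
RTT = 0.050        # 50 ms round-trip time
chain_depth = 5    # entity -> dependent -> dependent -> ...

# Each hop in a dependency chain needs the previous response before the
# next request can be issued, so the chain costs at least depth * RTT.
chained = chain_depth * RTT

# One response carrying all the entities costs a single round trip.
bundled = 1 * RTT

print(f"chained fetches: at least {chained * 1000:.0f} ms")
print(f"bundled dataset: at least {bundled * 1000:.0f} ms")
```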
This is especially true if the data is compressed (either ahead of time or on the fly). Most compression algorithms require a bit of a run-up to get started.
This is very much the case for RDF - especially N-Triples / N-Quads, where there aren't any prefixes, and for RDF/XML, which is XML...
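A small experiment makes the run-up point visible: gzip builds its dictionary from repeated strings, so many tiny N-Triples payloads compress much worse than one concatenated payload. The triples below are fabricated for the test:

```python
import gzip

# Fabricated N-Triples, one line per entity.
triples = [
    f'<http://viaf.org/viaf/{i}> <http://schema.org/name> "Entity {i}"@en .\n'
    for i in range(1000)
]

# Compressing each entity's triples separately vs. all in one payload.
separate = sum(len(gzip.compress(t.encode())) for t in triples)
together = len(gzip.compress("".join(triples).encode()))

print(f"compressed separately: {separate} bytes")
print(f"compressed together:   {together} bytes")
```

On a typical run the concatenated payload compresses to a small fraction of the separate total, since the repeated predicate and URI prefixes only pay off once the dictionary is warm.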
This does not lessen the importance of making the other entities available by name (blank nodes must die).
[If the client requests an RDF format that supports named graphs, then each entity can go in a separate graph. This can help with caching.]
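For example, here is a sketch (Python with rdflib; URIs invented for illustration) of putting each entity's description in its own named graph and serializing as TriG:

```python
from rdflib import Dataset, Literal, Namespace, URIRef

SCHEMA = Namespace("http://schema.org/")
ds = Dataset()

for viaf_id, name in [("111", "Author One"), ("222", "Author Two")]:
    entity = URIRef(f"http://viaf.org/viaf/{viaf_id}")
    g = ds.graph(entity)  # one named graph per entity, keyed by its URI
    g.add((entity, SCHEMA.name, Literal(name, lang="en")))

# TriG keeps each entity's triples grouped under its graph name, so a
# cache can store or evict entities independently.
print(ds.serialize(format="trig"))
```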
Posted by: Simon Spero | April 15, 2015 at 11:57