Identities timeline

I've spent a couple of days trying out the Simile project's Timeline JavaScript widget.  This was partly inspired by looking at the Zoomii interface and being impressed with what can be done with JavaScript.  The Timeline widget ingests a data file representing events and displays them on a timeline which can be scrolled through horizontally.  My first timeline is still a bit rough, but since I'm not sure how much further we will take it, I thought it was worth sharing.

The timeline shows the top 1,000 people in WorldCat Identities (by number of library holdings) plotted according to the first date that Identities has associated with each name.  Sometimes that is a birth date, other times just the first publication date associated with them.  Whenever you display information in a new way, anomalies that were hidden in the data seem to become apparent.  I've noticed a few in the timeline, but will resist the urge to point them out here.  Incorporating more of the information from the Virtual International Authority File would help, since it has more complete date information.
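The selection described above can be sketched in a few lines of Python.  This is only an illustration under assumed names: the record keys (name, holdings, dates), the function top_identities and the sample holdings counts are all hypothetical, not the Identities data format or the code behind the timeline.

```python
def top_identities(records, limit=1000):
    """Pick the most widely held names and the date each one is plotted at.

    `records` is a list of dicts with hypothetical keys:
    "name", "holdings" (int) and "dates" (list of years, possibly empty).
    """
    # Rank by library holdings, highest first, and keep the top `limit`.
    ranked = sorted(records, key=lambda r: r["holdings"], reverse=True)[:limit]
    points = []
    for rec in ranked:
        if not rec["dates"]:
            continue  # nothing to plot without at least one date
        # Plot at the earliest date on record (birth or first publication).
        points.append({"name": rec["name"], "year": min(rec["dates"])})
    return points


# Made-up sample values, for illustration only.
sample = [
    {"name": "Shaw, Bernard", "holdings": 900, "dates": [1856, 1950]},
    {"name": "Undated author", "holdings": 800, "dates": []},
    {"name": "Minor author", "holdings": 5, "dates": [1990]},
]
print(top_identities(sample, limit=2))
```

Names without any date simply drop out in this sketch, which is one place where the anomalies mentioned above would show up.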

Simile Timeline is quite flexible.  I was able to add a condensed scrollable view at the bottom of the page, as well as expand 1770-1980 to display by year while the rest of the timeline is by decade.  I started out trying to use the code in the Subversion repository, but since all the documentation is for older code, I ended up using the older JavaScript code hosted at MIT.

Scaling this up to handle more than a thousand Identities looks difficult.  Even at this size the scrolling is uneven on many workstations, and more names make the problem worse.  My original hope was to have tens of thousands of names, but that doesn't seem practical right now.

--Th

Linking to WorldCat Identities

Since previous posts about linking to WorldCat Identities are getting out of date, here's a summary of the current API.  There are several ways of linking to Identities:

  1. Directly to the pages themselves
  2. OpenURL
  3. NameFinder searches
  4. SRU searches

DIRECT LINKING

The first is by far the simplest.  If you have an LCCN for a person (or corporation, or horse, etc.) you can link using that:

People that do not have an LCCN can also be referenced directly:

We only do this for names without LCCN's, so you can't just stick an arbitrary name in and expect to get an Identity page back.  For that matter, we only have pages for names found in WorldCat, so at least right now we do not cover all possible LCCN's.  The best way to get an LCCN for a name is probably from the results of a NameFinder search (discussed below).  There is a form to do these searches at http://worldcat.org/identities/.

We are attempting to make these links as permanent as possible, so when Ralph finally gets an LCCN associated with him, we will make the above link redirect to that new URI.

OpenURL LINKING

OpenURL links are what we use in WorldCat.org to link to pages about people:

Here’s the short version, with just last name and OCLC#.

http://worldcat.org/identities/find?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:identity&rft.namelast=Shaw&rft_id=info:oclcnum/30702926
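The short form can be assembled programmatically.  Here is a minimal sketch using Python's standard library; the function name short_identity_openurl is my own, and the parameter names are simply copied from the example link.

```python
from urllib.parse import urlencode

FIND_URL = "http://worldcat.org/identities/find"


def short_identity_openurl(last_name, oclc_number):
    """Build the short OpenURL: just a last name and an OCLC number."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:identity",
        "rft.namelast": last_name,
        "rft_id": "info:oclcnum/" + str(oclc_number),
    }
    # safe=":/" leaves the info: URIs readable, as in the example link.
    return FIND_URL + "?" + urlencode(params, safe=":/")


print(short_identity_openurl("Shaw", 30702926))
```

Any other characters in the name (spaces, accents) are percent-encoded by urlencode.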


Here’s an example with everything turned on.

http://worldcat.org/identities/find?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:identity&rft.namelast=Shaw&rft.namefirst=Bernard&rft.nameinit=G&rft.nameinit1=G&rft.nameinitm=&rft.namesuffix=&rft.nametitle=&rft.date=1856-1950&rft.name=&rft.birthdate=1856&rft.deathdate=1950&rft.arn=240672&rft.title=Pygmalion&rft_id=info:oclcnum/30702926

The 'arn' in that last URI refers to the Authority Record Number, a number assigned by OCLC to each authority record.  Currently we only use a few of the possible fields in the OpenURL.  First we look to see if there is an ARN and use that to find a record.  If there is no ARN, then we use a combination of the last name and OCLC number.  If those are absent or do not result in a unique record, then we assemble a full name from the component pieces and send that to the NameFinder.
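The order of those checks can be sketched as follows.  This is not the server code: the three lookup functions are hypothetical stand-ins passed in by the caller, and the order in which the name pieces are joined is an assumption.

```python
OCLCNUM_PREFIX = "info:oclcnum/"


def resolve_identity(fields, find_by_arn, find_by_name_and_oclcnum, name_finder):
    """Apply the lookup order described above to a dict of OpenURL fields.

    The three callables are hypothetical stand-ins for the real lookups;
    each returns a list of matching records.
    """
    # 1. An ARN, when present, is used directly.
    arn = fields.get("rft.arn")
    if arn:
        return find_by_arn(arn)

    # 2. Otherwise try last name plus OCLC number, accepting a unique hit.
    last = fields.get("rft.namelast")
    rft_id = fields.get("rft_id", "")
    if last and rft_id.startswith(OCLCNUM_PREFIX):
        matches = find_by_name_and_oclcnum(last, rft_id[len(OCLCNUM_PREFIX):])
        if len(matches) == 1:
            return matches

    # 3. Fall back to NameFinder with a full name built from the pieces.
    #    The joining order used here is an assumption.
    keys = ("rft.nametitle", "rft.namefirst", "rft.nameinitm",
            "rft.namelast", "rft.namesuffix")
    full_name = " ".join(fields[k] for k in keys if fields.get(k))
    return name_finder(full_name)
```

Passing the lookups in as arguments keeps the sketch independent of how the records are actually stored.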


NameFinder SEARCHES

The NameFinder service gives back a list of candidate names with URI's, ranking information, a sample title and other information about the name.  The REST-ful version of this is:

http://worldcat.org/identities/find?fullName=George+Bernard+Shaw, but there is also a SOAP version.  NameFinder looks at lots of possible variations in names, so it almost always returns a list rather than a unique Identity record.
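Building the REST-ful NameFinder link from a name is a one-liner with the standard library.  A small sketch; the function name name_finder_url is hypothetical, and fullName is the only parameter taken from the link above.

```python
from urllib.parse import urlencode


def name_finder_url(full_name):
    """Return the NameFinder search link for a full name."""
    # urlencode turns the spaces in the name into "+" signs.
    return "http://worldcat.org/identities/find?" + urlencode({"fullName": full_name})


print(name_finder_url("George Bernard Shaw"))
```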


SRU SEARCHES

There is also full SRU searching against the component databases that make up Identities. 

There are 5 SRU databases associated with Identities: CorporateIdentities, PersonalIdentities, SubjectIdentities, AbandonedIdentities, and Identities.  The last database does a federated search across the other 4.  They can be found at:


 

http://worldcat.org/identities/search/CorporateIdentities

http://worldcat.org/identities/search/PersonalIdentities

http://worldcat.org/identities/search/SubjectIdentities

http://worldcat.org/identities/search/AbandonedIdentities

http://worldcat.org/identities/search/Identities

 

The Explain record for each of those services lists the indexes that can be searched.  The only other trick is that a sortKey of “holdingscount” can be used to order the result sets by library holdings counts.


The URI for Ralph LeVan (http://worldcat.org/identities/np-levan,+ralph+r) turns into an SRU search: http://worldcat.org/identities/search/Identities?query=local.pnkey+exact+"np-levan,+ralph+r".
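Here is a rough sketch of building such SRU links in Python.  The database names and the local.pnkey index come from the text above; the function name is hypothetical, "sortKeys" is the standard SRU 1.1 parameter name and I am assuming that is how the holdingscount sort key is passed, and I am reading the "+" signs in the example key as encoded spaces.

```python
from urllib.parse import urlencode

SRU_BASE = "http://worldcat.org/identities/search/"
DATABASES = {
    "CorporateIdentities",
    "PersonalIdentities",
    "SubjectIdentities",
    "AbandonedIdentities",
    "Identities",
}


def identities_sru_url(database, cql_query, sort_by_holdings=False):
    """Build a search link against one of the five Identities databases."""
    if database not in DATABASES:
        raise ValueError("unknown Identities database: " + database)
    params = {"query": cql_query}
    if sort_by_holdings:
        # Assumption: the sort key travels in the SRU "sortKeys" parameter.
        params["sortKeys"] = "holdingscount"
    return SRU_BASE + database + "?" + urlencode(params)


# Assumption: the "+" signs in the example key stand for spaces.
print(identities_sru_url("Identities", 'local.pnkey exact "np-levan, ralph r"'))
```

The quotes and comma come out percent-encoded here, which should be equivalent to the hand-written link but is worth checking against the live service.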

All the URI's return SRU searchRetrieveResponses (except for NameFinder which returns pages originally designed for the ePrints-UK project).


If this explanation isn't clear, or doesn't go far enough, please let me know.  Since the whole thing is built on top of SRU and all the SRU is exposed, it should be possible to do lots of things with the results and pages.


Thanks to Ralph LeVan who both wrote most of the code involved and drafted the explanation of much of the above.


--Th

Update on LCCN Permalinks Post

One of the comments on a recent post about LCCN permalinks asked about links to the LC/NACO authority file, to which Ann Della Porta replied about plans.

I should have mentioned in the original post that here at OCLC Research we have long offered a Web service that will show individual authority records (e.g. http://errol.oclc.org/laf/n79-32879.html).  Unfortunately we've been using a pattern for LCCNs different from LC's in their permalinks.  We should probably update our service to support LC's style.

The service is actually layered on top of OAI-PMH, which in turn is layered on top of SRU, so adding additional patterns is mostly a matter of adding new terms in an index.
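To make the difference in patterns concrete, here is a sketch of converting our hyphenated form into a normalized one.  I am assuming LC's style is the normalized LCCN form (blanks removed, no hyphen, serial number zero-padded to six digits); the function name is hypothetical and this is not code from our service.

```python
def normalize_lccn(raw):
    """Normalize an LCCN such as "n79-32879" to "n79032879".

    Assumed rules: drop blanks, drop anything from a "/" onward, and
    replace a hyphen by zero-padding the digits after it to six places.
    """
    lccn = raw.replace(" ", "")
    lccn = lccn.split("/", 1)[0]
    if "-" in lccn:
        prefix, serial = lccn.split("-", 1)
        if not serial.isdigit() or len(serial) > 6:
            raise ValueError("unexpected serial part in LCCN: " + raw)
        lccn = prefix + serial.zfill(6)
    return lccn


print(normalize_lccn("n79-32879"))
```

With both forms indexed as terms, either pattern could find the same record.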

--Th

ETD 2008 in Aberdeen

I recently attended ETD 2008.  The ETD (Electronic Theses and Dissertations) conferences are sponsored by NDLTD (Networked Digital Library of Theses and Dissertations).  I'm on the board of NDLTD and try to get to the conferences when possible.  This year's conference was at the Robert Gordon University (The Professional University) in Aberdeen, Scotland.  There continues to be a lot of activity around ETDs, probably because this is one of the first types of locally generated digital material that libraries have had to cope with.  New services, such as the UK's EThOS system, have been in development for some time but are just now becoming available.

A lot of the people who come to these conferences are first-timers trying to understand how to cope with digital theses on their campuses.  In fact the conferences often run a special pre-session just to get newcomers up to speed (I gave a short talk about ETD metadata at ETD 2008's).  We get a fair number of repeat participants, however.  A Canadian librarian I was talking to as we walked between buildings told me she had come to a conference a few years ago.  When she went back to her campus she told herself 'I can do this' and started up their ETD program.  She was finding this year's conference very helpful in keeping up with what has been happening since then.

As part of my board duties as co-chair of the Services and Standards committee, and thanks in large part to my co-chair (Ana Pavani from Brazil), we made some progress in updating ETD-MS, NDLTD's suggestions on how to extend Dublin Core metadata for electronic theses.  In particular we decided on ways to code degree level (0, 1 & 2 for undergraduate, masters and doctoral) and access rights (no public access, limited public access, full public access), and plan to add a dc.publisher.country element.
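For anyone processing this metadata, the two code lists could be modelled like this.  It is only a sketch: the class and function names are my own, and ETD-MS itself does not define any Python types.

```python
from enum import Enum, IntEnum


class DegreeLevel(IntEnum):
    """Degree level codes as decided at the meeting."""
    UNDERGRADUATE = 0
    MASTERS = 1
    DOCTORAL = 2


class AccessRights(Enum):
    """The three access rights values."""
    NO_PUBLIC_ACCESS = "no public access"
    LIMITED_PUBLIC_ACCESS = "limited public access"
    FULL_PUBLIC_ACCESS = "full public access"


def describe(level_code, rights_text):
    """Validate the two coded values and return a readable summary."""
    level = DegreeLevel(int(level_code))  # ValueError for codes other than 0-2
    rights = AccessRights(rights_text.strip().lower())
    return f"{level.name.lower()} thesis, {rights.value}"


print(describe("2", "Full public access"))
```

Anything outside the agreed lists raises a ValueError, which is a cheap way to catch miscoded records in a harvest.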

One of the treats of going to the conference was the chance to catch up with Herbert Van de Sompel.  My group here at OCLC has been involved in the OAI-ORE work, and Herbert gave an excellent talk about it at the conference.  He also talked about the MESUR project he consults on.  He had some nice pictures showing relationships between fields based on people's movement from journal to journal while looking for articles.  He claims that the traditional impact factor measurement for the importance of journals gives very different results than most measures based on the usage data they have been analyzing.

Another interesting talk was about a program supported by NDLTD to use LOCKSS to preserve electronic theses, as part of the MetaArchive Cooperative.  Martin Halbert from Emory attended the conference to talk about this, along with Gail McMillan of Virginia Tech, who has long been involved with ETDs and NDLTD.

I enjoyed Aberdeen.  I had only been there before for a few hours, twenty-some years ago.  This time I went in a couple of days early and was able to rent a car and get out to see two great castles (Drum and Crathes).  Both are well preserved with beautiful grounds.  The number of gray granite buildings in the city is remarkable, but they are not as attractive as they might be.  Old Aberdeen, however, was charming, and it was nice of the city to put on a reception for us at the Beach Ballroom.  I can also highly recommend Marks & Spencer's scones with Jersey double cream.

As part of OCLC's support to NDLTD we harvest metadata about ETDs, and make a union catalog (which includes metadata from WorldCat) harvestable.  This is starting to be a sizeable file, with nearly 750,000 records in it right now.

--Th

July 2008
