
Six Degrees

I'm sure most people have encountered the idea of 'six degrees of separation', which suggests that everyone is connected to everyone else by at most six jumps.  WorldCat Identities has connections between names, and I thought it would be interesting to see how many steps it takes to get from one person to another.

Since Identities is built as a Web service this wasn't very difficult to do.  I found it worked better to ignore links to corporate identities, since some large publishers have their names in many records and get picked up in Identities as related names.

Here's the shortest path between Mark Twain and Jane Austen:

Mark Twain -> Charles Warner (Author) -> Hamilton Mabie (Editor) -> George Edwards (Illustrator) -> Austin Dobson (Editor) -> Jane Austen

But Identities links are not symmetrical, so the path the other way might be different (and usually is).  Here's the path from Austen to Twain:

Jane Austen -> R. Brimley Johnson (Editor) -> Edgar Allan Poe -> Nathaniel Hawthorne -> Mark Twain
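
For anyone curious how paths like these can be found, here is a minimal sketch of a breadth-first search over directed links. It is not the code I actually ran, and it does not call the Identities service: the `links` dictionary, the single-letter names, and the `skip` parameter (standing in for ignoring corporate identities) are all made up for illustration.

```python
from collections import deque


def shortest_path(links, start, goal, skip=frozenset()):
    """Breadth-first search over directed links.

    links maps a name to the list of names it links to (hypothetical data).
    skip holds names that should never be visited, e.g. corporate identities.
    Returns the list of names from start to goal, or None if there is no path.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # remembers how each name was first reached
    queue = deque([start])
    while queue:
        name = queue.popleft()
        for neighbor in links.get(name, ()):
            if neighbor in previous or neighbor in skip:
                continue
            previous[neighbor] = name
            if neighbor == goal:
                # Walk back to the start, then reverse to get start -> goal.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Toy directed graph: the links are one-way, so A -> D and D -> A differ.
links = {
    "A": ["Publisher", "B"],
    "B": ["C"],
    "C": ["D"],
    "Publisher": ["D"],
    "D": ["A"],
}

print(shortest_path(links, "A", "D"))                      # shortcut through the publisher
print(shortest_path(links, "A", "D", skip={"Publisher"}))  # people only
print(shortest_path(links, "D", "A"))                      # the other direction
```

Because breadth-first search explores names in order of distance from the start, the first path it finds to the goal is a shortest one. In a real run the `links` lookup would be replaced by a request for each name's related names.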

By clicking on the plus icon after each of the related names in Identities you can see the results in WorldCat for the combination of the Identity name and the related name.  For instance, searching for Jane Austen and R. Brimley Johnson shows that he wrote a book entitled Jane Austen, Her Life, Her Work, Her Family, and Her Critics.  Combining him with Edgar Allan Poe finds The Complete Poetical Works of Edgar Allan Poe, with Three Essays on Poetry.

Unfortunately, it isn't possible to get from everyone to everyone else.  There does seem to be a path from myself to Kevin Bacon, though:

Thomas Hickey -> Diane Vizine-Goetz -> Joan Mitchell -> Gregory New -> Pat Thomas -> Willi Baer -> Michael Cerenzie -> Frankie Muniz -> Kevin Bacon

I'm not sure about the link in that chain from Gregory New (a former Dewey editor) to Pat Thomas.  There might be two different people named Pat Thomas involved.

If you want to try your hand at mining Identities in similar ways, please use the version of it running on our Research servers:  http://orlabs.oclc.org/Identities/.  The links above were found using the Research version on December 17, 2007.

--Th

Update:  At Ed Summers' request I've put my code up.  My natural modesty always makes me reluctant to show how raw the code is, but it did take a little thought, so it's probably worth sharing.  Here it is:
http://outgoing.typepad.com/code/relations2.py.

Java Guidon

Guidon was our name for the client OCLC wrote for displaying electronic journals.  Last night someone in Japan asked my permission to load this old screen shot of an experimental version of Guidon into Wikimedia for use in a Japanese article about electronic publishing.  I'm not sure how they found it; I haven't seen that image for years.  Looking around, I found an article in the 1995 Annual Review of OCLC Research, so the screen is probably from 1995 or so.  The date on the .gif file is July 1996.

This was towards the end of OCLC's experimentation with SGML-encoded documents, and in production we were still distributing a C++ version of the Guidon client.  The screen shot is of a port of Guidon running in Netscape, all done in Java applets.  It had a lot of nice features, including zoom (using our own (patented!) way of loading TeX/Metafont fonts) and a two-column scroll that really worked (to the surprise of everyone who saw it).

Unfortunately for the project, OCLC abandoned doing our own formatting of electronic journals soon after this, so it never got close to production.  Page layout is expensive (several dollars/page) even when fairly well automated, which makes it impossible to compete with the PDF files that publishers are already producing.  But the articles we were putting up under Guidon in the mid '90s were easier to read than PDFs are now, so there are still improvements ahead for electronic media.

--Th

Note: Yes, I gave permission for use of the image.  If anyone notices it appearing in the Japanese Wikipedia, let me know and I'll link to it.
