Name searching

For some time we've had a simple Web service up for searching the NACO name authority file.  This grew out of our (somewhat limited) participation in the eprintsUK project.  One of our plans was to have a service that could be used to verify names for institutional repositories.  We haven't given up on that, and Ralph LeVan in particular is interested in linking local names to global names.  A couple of weeks ago Ralph redid the matching algorithm used in our name lookup.

This is something we've known was needed for some time.  As a matter of fact, I worked out a new ranking algorithm that improved retrieval, but it never made it into our public service.  Ralph's, though, is substantially better: it tolerates many spelling errors, and it is smart about ranking based on usage counts and on whether a match is a preferred form or a cross reference in the authority file.

I'm pretty far down the list if you just search for Hickey (my 'name rank' in WorldCat was something like 650,000th the last time I looked), but a search for T B Hickey retrieves only my record.  Because of the tolerance for spelling variants, the searches T B Hicky and T B Hikcey both work too.

Here's another search that failed in the old service: Jim Gray.  The Jim Gray I was after comes up fourth in the list (we need to get the $q subfield, the fuller form of the name, displaying to differentiate the second and third entries).  This one is a bit tricky, since Jim's real name is James Nicholas Gray, but he writes and is established as Jim Gray.  Our old system lost him in a sea of James Grays.  Even the new system won't find him as Jim Grey (with an e), since that form is quite common.
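To make the behavior described above concrete, here is a minimal Python sketch of spelling-tolerant lookup with ranking. It is not Ralph's algorithm, which this post does not show. It assumes a character-level similarity from the standard library's `difflib.SequenceMatcher`, a cutoff of 0.8, and a simple tie-break: preferred forms before cross references, then higher usage counts first. The `Heading` class, its fields, the record identifiers, and the usage counts are all invented for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Heading:
    """One searchable name form; every field here is a hypothetical stand-in."""
    name: str        # name form, written forename-first as in the searches above
    record_id: str   # made-up identifier of the authority record it belongs to
    preferred: bool  # True for the established form, False for a cross reference
    usage: int       # made-up usage count


def normalize(name: str) -> str:
    """Lowercase, turn punctuation into spaces, and collapse runs of whitespace."""
    kept = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(query: str, candidate: str) -> float:
    """Character-level similarity in [0, 1] from difflib's SequenceMatcher."""
    return SequenceMatcher(None, normalize(query), normalize(candidate)).ratio()


def search(query, headings, threshold=0.8, limit=10):
    """Return headings whose similarity meets the threshold, best first."""
    scored = [(similarity(query, h.name), h) for h in headings]
    hits = [(s, h) for s, h in scored if s >= threshold]
    # Closest spelling first; ties go to preferred forms, then to higher usage.
    hits.sort(key=lambda pair: (-pair[0], not pair[1].preferred, -pair[1].usage))
    return [h for _, h in hits[:limit]]


# Invented toy data, not taken from the NACO file.
HEADINGS = [
    Heading("T B Hickey", "rec-1", True, 40),
    Heading("Thomas Hickey", "rec-2", True, 15),
    Heading("Jim Gray", "rec-3", True, 120),
    Heading("Jim Gray", "rec-4", True, 30),
    Heading("Jim Gray", "rec-5", False, 500),
    Heading("James Gray", "rec-6", True, 900),
]

if __name__ == "__main__":
    for q in ("T B Hicky", "T B Hikcey", "Jim Gray"):
        print(q, "->", [h.record_id for h in search(q, HEADINGS)])
```

With this toy data, both misspellings of Hickey still clear the cutoff against "T B Hickey", and the three identical "Jim Gray" forms are ordered by preferred status and then usage. Unlike the service described above, the sketch has no rule for holding back a fuzzy match when the misspelled form (such as Jim Grey) is itself a common name.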

We think the system works pretty well.  Give it a try, and if you find something that doesn't work the way you expected let us know and we'll see if we can fix it.

--Th

Comments

Maxresults didn't seem to have any effect at all when I used it. Not a showstopper, but still a bug?

Neat look-up service! My question: why wouldn't "Eaton, John" also find the folks I found with "Eaton, John H"? If I hadn't already known the middle initial, I wouldn't have found the established name for my target individual.

Maxresults is a bit of a holdover from the earlier service. We'll either drop it or make it work.

Good point about John Eaton. Evidently the code thought there were enough good matches not to go further. A 'next page' link would help.

--Th

I searched on a Korean name (Myung Mi Kim, who apparently doesn't have a name record) and the results I got (Firefox on OSX) weren't correctly encoded for display.

I know character encoding is a rats' nest, though.

Argh. Yes, she does have a record -- being conditioned by OPACs, I did a last-name-first search and got bogus results. Perhaps a note on the page?
