For some time we've had a simple Web service up for searching the NACO name authority file. This grew out of our (somewhat limited) participation in the eprintsUK project. One of our plans was to have a service that could be used to verify names for institutional repositories. We haven't given up on that, and Ralph LeVan in particular is interested in linking local names to global names. A couple of weeks ago Ralph redid the matching algorithm used in our name lookup.
This is something we've known was needed for some time. As a matter of fact, I worked out a new ranking algorithm that improved retrieval, but it never made it into our public service. Ralph's, though, is substantially better, since it tolerates many spelling errors and is smart about ranking based on usage counts and on whether a match is the preferred form or a cross reference in the authority file.
I'm pretty far down the list if you just search for Hickey (my 'name rank' in WorldCat was something like 650,000th the last time I looked), but a search for T B Hickey retrieves only my record. Because of the tolerance for spelling variants, the searches T B Hicky and T B Hikcey both work too. Here's another search that failed in the old service: Jim Gray. The Jim Gray I was after comes up fourth in the list (we need to get the $q displaying to differentiate the second and third entries). This one is a bit tricky, since Jim's real name is James Nicholas Gray, but he writes and is established as Jim Gray. Our old system lost him in a sea of James Grays. Even the new system won't find him as Jim Grey (with an e), since that form is quite common.
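For readers who want to see how those pieces could fit together, here is a minimal sketch in Python. It is not Ralph's algorithm, which isn't described in any detail here. It only shows one plausible way to combine tolerance for misspellings with a boost for usage counts and for preferred forms. The `Heading` class, the weights, the 0.6 cutoff, and every count in `SAMPLE` are assumptions made up for illustration, and the string comparison is simply the standard library's `difflib.SequenceMatcher`.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Heading:
    name: str        # name string as it might appear in an authority file
    preferred: bool  # True for the established form, False for a cross reference
    usage: int       # invented usage count, for illustration only


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and sort the tokens so that
    'Hickey, T. B.' and 'T B Hickey' compare as the same string."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(query: str, name: str) -> float:
    """Fuzzy string similarity between 0 and 1; tolerates small misspellings."""
    return SequenceMatcher(None, normalize(query), normalize(name)).ratio()


def score(query: str, heading: Heading) -> float:
    """Similarity plus small boosts for heavily used and preferred headings.
    The weights are arbitrary assumptions, not tuned values."""
    usage_boost = math.log10(1 + heading.usage) / 10
    form_boost = 0.05 if heading.preferred else 0.0
    return similarity(query, heading.name) + usage_boost + form_boost


def search(query: str, headings: list[Heading], min_sim: float = 0.6, limit: int = 5) -> list[Heading]:
    """Keep headings that are similar enough, then rank them by score."""
    hits = [h for h in headings if similarity(query, h.name) >= min_sim]
    return sorted(hits, key=lambda h: score(query, h), reverse=True)[:limit]


# Made-up sample data; the counts do not come from any real authority file.
SAMPLE = [
    Heading("Gray, James", True, 900),
    Heading("Gray, Jim", True, 40),
    Heading("Grey, Jim", True, 300),
    Heading("Hickey, T. B.", True, 25),
    Heading("Hickman, T.", False, 500),
]

for query in ("T B Hikcey", "Jim Gray", "Jim Grey"):
    print(query, "->", [h.name for h in search(query, SAMPLE)])
```

In this toy version the usage boost is deliberately small compared with string similarity, so an exact match on a lightly used name can still outrank a near match on a heavily used one. Where that balance should sit is a design choice, and a real service would presumably tune it against actual searches.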
We think the system works pretty well. Give it a try, and if you find something that doesn't work the way you expected let us know and we'll see if we can fix it.
--Th
Maxresults didn't seem to have any effect at all when I used it. Not a showstopper, but still a bug?
Posted by: Dorothea | May 16, 2006 at 13:19
Neat look-up service! My question: why wouldn't "Eaton, John" also find the folks I found with "Eaton, John H"? If I hadn't already known the middle initial, I wouldn't have found the established name for my target individual.
Posted by: Bodling | May 16, 2006 at 16:11
Maxresults is a bit of a holdover from the earlier service. We'll either drop it or make it work.
Good point about John Eaton. Evidently the code thought there were enough good matches not to go further. A 'next page' link would help.
--Th
Posted by: Thom | May 17, 2006 at 10:09
I searched on a Korean name (Myung Mi Kim, who apparently doesn't have a name record) and the results I got (Firefox on OSX) weren't correctly encoded for display.
I know character encoding is a rats' nest, though.
Posted by: Dorothea | May 19, 2006 at 09:14
Argh. Yes, she does have a record -- being conditioned by OPACs, I did a last-name-first search and got bogus results. Perhaps a note on the page?
Posted by: Dorothea | May 19, 2006 at 09:16