
Black napkins and NetFlix

I'm not sure these are related topics, but when I think of one the other comes to mind.

First the black napkins.  While dining out at one of the fancier restaurants in Columbus, I noticed that they supply black napkins for customers with dark outfits on.  One occasionally hears about 'white glove' service, although I suppose the phrase is getting dated since no one wears white gloves anymore.  I'm assuming that the black napkins are to avoid getting visible lint on the customer's clothes.

Now NetFlix.  I'm sure most people are familiar with how NetFlix operates.  For a subscription ($10 to $18) you are entitled to have 1 to 3 DVDs out at a time.  You keep them as long as you want, but if you want to get a new movie, you need to mail one of them back in a prepaid mailer.  Since they are DVDs, standard first-class postage is all that is needed.

Certainly there are library services that work in this manner -- the NLS (National Library Service for the Blind and Physically Handicapped) in the U.S. does.  And certainly books are bulkier and harder to mail than DVDs.  And yes, OCLC's corporate library sends books to me through inter-office mail, as do many colleges.  But, do any public libraries offer this sort of service?  And if so, do they have anything like the NetFlix web site that everyone seems to love?

I suppose what brings these topics together in my mind is the level of service they provide.  A level of service that libraries have a hard time matching.

--Th

WorldCat Wiki

The OCLC Members Council has a Digital Library Research Interest Group that we in OCLC Research meet with three times a year.  I gave a short talk about our plans for a WorldCat Wiki, a new project involving several groups here at OCLC.

The idea is to have a Wiki that complements WorldCat.  People could add reviews, cover art, comments, etc. and relate these to bibliographic records (maybe at the FRBR work-level too).  We hope the system is flexible enough so that people do (good) things we're not expecting.  We'd like the Wiki to be available anywhere WorldCat records are.

The DLR interest group had some good suggestions about how this sort of capability might fit into their local situations.  Some of the ideas might take a little more structure in terms of group membership than we had been planning.  There certainly is a lot of effort being expended within and with libraries that isn't being shared very well and this might help with that.

I've mentioned the MetaWiki software we're using to build the Wiki part of this.

--Th


Variety

I was in a grocery store the other day and was struck by the variety of Coca-Cola available.  These are all identifiable as 'Coke', not other products that the Coca-Cola company makes.  I can remember when Coke came in one flavor and, if I remember correctly, only one or two sizes.  (Potato chips only came in one flavor too, as strange as that now sounds.) Now, if you count Diet Coke, and caffeine-free Coke, and caffeine-free Diet Coke in refrigerator 12-packs, and Vanilla Coke in cute little 8-ounce cans, this store had more than two dozen varieties of Coke!  And this was a small store in Ohio; I'm sure Coke is sold in literally hundreds, if not thousands, of different ways around the world.

What I'm getting at is that one-size-fits-all isn't what people expect, and that we need variety in our interfaces.  Libraries have known about variety all along.  While I would maintain that resource needs across different academic campuses are very similar, the physical collections of libraries diverge quite a bit.  This is because libraries accommodate the people who make the highest demands on, and the most use of, the library, and those people are always different at different campuses.  The people our interfaces reach have differences too, and we need to accommodate them.

There are some subtleties and complex relationships in our data.  We should make these as understandable and usable as possible.  But just because some people have a hard time differentiating between Shakespeare as an author and Shakespeare as a subject doesn't mean that at least some of our interfaces shouldn't expose the difference to users that are interested, and maybe even to those that didn't know they were interested, but don't have trouble with the concept.  This doesn't mean that having every library's interface be different is a virtue, or that we need to impress people with how clever we are, or that I find the beverage industry a particularly good role model, but there is a place for variety and useful complexity.

Prompted by a comment by Paul Miller

--Th

xISBN extensions

I think it's fair to say that xISBN (here's an application) has been a fairly popular service.  We've done quite a bit of work to make the results better, but little to explore what other types of services could extend or complement it.  I recently posted a similar list to Code4Lib, but thought an entry here might reach a slightly different group.

Here are a few ideas that either we've thought of, or have been proposed to us:

  • ISBN to author/title or more extensive work-level metadata
  • Adding 'distinctive' information to the response to help selection
  • xOCLC: take in an OCLC number and return the OCLC numbers in its FRBR work-set
  • xLCCN: do the same thing for LCCNs
  • xISSN:  do the same thing for ISSNs
  • SOAP wrappers around everything
  • Expose services via OpenURL 1.0

Some of these would take more work than others.  The OCLC number and LCCN services sound easy.  The author/title information probably isn't too hard, as we are doing similar things in FictionFinder.  Deciding what is distinctive for one ISBN versus the others in a work-set sounds a bit harder.  ISSNs sound hard, too.  Our FRBR algorithm does group some serials together, but FRBRizing serials is at least slightly contentious and we don't claim to have even a proposed solution, although we do see the need.
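
The xOCLC and xLCCN ideas have the same shape as xISBN: map an identifier to its work-set, then return the other identifiers of the same kind in that set.  Here is a minimal sketch of that shape, assuming a small in-memory table.  The sample identifiers, the work-set IDs, and the names `build_index` and `expand` are all made up for illustration and are not how the OCLC services are implemented.

```python
from collections import defaultdict

# Hypothetical sample rows: (identifier scheme, identifier value, work-set id).
# These values are invented for illustration; they are not real WorldCat data.
ROWS = [
    ("isbn", "0000000001", "w1"),
    ("isbn", "0000000002", "w1"),
    ("oclc", "1001", "w1"),
    ("oclc", "1002", "w1"),
    ("oclc", "2001", "w2"),
    ("lccn", "00-000001", "w2"),
]


def build_index(rows):
    """Build two lookups: identifier -> work-set, and work-set -> identifiers."""
    work_of = {}
    members = defaultdict(list)
    for scheme, value, work in rows:
        work_of[(scheme, value)] = work
        members[work].append((scheme, value))
    return work_of, members


def expand(scheme, value, work_of, members):
    """Return every identifier of the same scheme in the same work-set."""
    work = work_of.get((scheme, value))
    if work is None:
        return []  # unknown identifier: empty result rather than an error
    return sorted(v for s, v in members[work] if s == scheme)


WORK_OF, MEMBERS = build_index(ROWS)
print(expand("oclc", "1001", WORK_OF, MEMBERS))
```

Keying the table on (scheme, value) means one index could answer ISBN, OCLC number, and LCCN requests alike; whether a real service would share one table this way is an open design question.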

We're open to suggestions.  It would be nice to have some sort of premium service that would help sustain the free services.

--Th

114

I can reach something like 114 controls from the driver's seat in my car.  Of course, like bibliographic records, there are lots of ways to count things, and this is a generous counting (e.g. the 4-way mirror control counted as 4), but even collapsing those, there are at least 50, and that doesn't count the on-screen controls of the navigation system, which would at least double the number.

What brought this to mind are some things I've read lately about how simple OPACs should be.  I'm all for simplicity and love being able to type almost anything into Google's search box and have it (usually) do something appropriate, but average people manage to cope with some fairly complex systems in their daily lives, if those systems are important to them.  If complexity adds real functionality that many users need, and they don't have to cope with it just to do common tasks, it probably deserves a place in our interfaces.

That said, 114 controls do stretch my ability to cope.

--Th

Counting records in WorldCat

How many records are in WorldCat?  The simple answer is just over 60 million, since the 60 millionth OCLC number was just added May 2nd (you can watch records go in here).  But of course, things aren't quite that simple.  According to Gary Smith here at OCLC, in the first week of May 2005, there were slightly more than two million cross reference records in WorldCat.  These are created when records get merged.  When you ask for a record that got merged into another record, the cross reference record automatically refers you to the merged record.

There are also records that are missing completely.  The famous Apple Cake recipe, which has probably been deleted more than once from WorldCat, would be an example of that.  This month there are nearly 450,000 of those.  Taking both cross reference and missing records into account reduces the 60 million by some 4.4% to around 57.4 million bibliographic records.

At least as important as record count are the holdings associated with the records.  There are about 16 holdings/OCLC#, a statistic that has been relatively stable for decades.  When things are busy, nearly thirty holdings get added each second, although three per second is more typical.  Libraries often modify records while setting holdings, and we have copies of all those changed records, although we don't keep them online (yet).

This year (2005) WorldCat will pass 1,000,000,000 holdings.
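
As a rough check on that claim, the figures quoted above can be multiplied out.  This is only back-of-the-envelope arithmetic: it uses the rounded numbers from this post and assumes holdings keep arriving at the "typical" rate, which is a simplification.

```python
# Rounded figures quoted in the post (May 2005).
oclc_numbers = 60_000_000        # "just over 60 million"
holdings_per_number = 16         # "about 16 holdings/OCLC#"
typical_rate_per_second = 3      # "three per second is more typical"

estimated_holdings = oclc_numbers * holdings_per_number
remaining = 1_000_000_000 - estimated_holdings

# Assumption: the typical rate holds steady around the clock.
seconds_per_day = 24 * 60 * 60
days_to_billion = remaining / (typical_rate_per_second * seconds_per_day)

print(f"estimated holdings today: {estimated_holdings:,}")
print(f"days to one billion at the typical rate: {days_to_billion:.0f}")
```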

--Th

xISBN, mod_Python, and optimization

We've had some problems with our xISBN service over the last couple of weeks.  One lesson we learned is that optimization is very context-sensitive.  The other is that component tests are a good thing.  Sometimes we get ahead of ourselves here in Research, but for code that goes into semi-production we end up needing the tests, and as a background activity are extending them to exercise more of the FRBR code functionality.

xISBN is a simple web service.  You give it a single ISBN and it returns all the ISBNs we think belong to that FRBR work.  To make this fast (and cheap, since we're not charging for the service) xISBN loads the whole table into main memory using the FlatFile code mentioned previously.

Now, we've spent some time optimizing the FlatFile code so that lookups are fast.  We also added code that checks the input file for correctness (e.g. making sure the keys are in sorted order), since we've gotten burned by that a couple of times.  Combined with the growing popularity of xISBN, those improvements turned into real problems.
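
The FlatFile code itself isn't shown in this post.  As a rough sketch of the general idea (a sorted table held in memory, a sort-order check while loading, and binary-search lookups), here is a small stand-in.  The tab-separated "ISBN, work-set ID" format, the sample data, and the class name `SortedTable` are assumptions made for illustration.

```python
import bisect


class SortedTable:
    """Hypothetical stand-in for an in-memory lookup table with sorted keys."""

    def __init__(self, lines):
        self.keys = []
        self.values = []
        for line in lines:
            key, value = line.rstrip("\n").split("\t", 1)
            # Correctness check at load time: keys must be strictly ascending.
            # This is one extra comparison per line on every load.
            if self.keys and key <= self.keys[-1]:
                raise ValueError(f"keys out of order at {key!r}")
            self.keys.append(key)
            self.values.append(value)

    def get(self, key, default=None):
        """Binary search for key; O(log n) comparisons per lookup."""
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return default


# Made-up sample data: ISBN <tab> work-set id.
table = SortedTable(["0000000001\tw1\n", "0000000002\tw1\n", "0000000003\tw2\n"])
print(table.get("0000000002"))
```

In a sketch like this the lookups are cheap, while the validation cost is paid once per load, so how much it matters depends on how often the table gets loaded.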


More FRBR Workshop, identifiers, etc.

Although the content of a conference is its justification, meeting people for the first time is always one of a conference’s pleasures.  The FRBR Workshop had a very nice dinner in the OCLC Atrium and, among the other fascinating people at our table, was the Inquiring Librarian (Jenn Riley of Indiana U.) who writes a blog worth looking at.

Diane Vizine-Goetz gave an excellent presentation about the use of subjects in fiction, although I suppose I may be prejudiced as we are working on some projects together.

Ketil Albertsen gave a very thoughtful talk about identifiers.  For the Paradigma project, they needed to develop identifiers for harvested objects.  Anyone who has worked with identifiers will recognize that identifiers aren’t as simple as they seem.  Ketil’s presentation lists more than 30 issues they considered while designing their identifiers, along with pros and cons, and an estimate of how sure they are of their decision.  Here are the first few rules (slightly paraphrased):

  • ID value carries no information about identified object
  • ID values should use a restricted character/symbol set
  • Check digits are included in ID display
  • Check digits are not stored internally
  • IDs have a fixed length
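
To make the check-digit rules concrete, here is a minimal sketch of an opaque, fixed-length ID whose check digit is computed only when the ID is displayed or parsed, and never stored.  The Luhn algorithm stands in for whatever scheme Paradigma actually chose, and the 8-digit length, the hyphenated display form, and the function names are all assumptions for illustration.

```python
ID_LENGTH = 8  # assumed fixed length of the stored ID (illustrative only)


def luhn_check_digit(digits: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    for position, ch in enumerate(reversed(digits)):
        d = int(ch)
        if position % 2 == 0:  # double every second digit, starting at the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def _is_valid_stored_id(stored_id: str) -> bool:
    return len(stored_id) == ID_LENGTH and all(c in "0123456789" for c in stored_id)


def display_form(stored_id: str) -> str:
    """Add the check digit for display; the stored value stays unchanged."""
    if not _is_valid_stored_id(stored_id):
        raise ValueError(f"not a valid stored ID: {stored_id!r}")
    return f"{stored_id}-{luhn_check_digit(stored_id)}"


def parse_display(shown: str) -> str:
    """Verify the check digit and strip it, returning the stored ID."""
    stored_id, _, check = shown.partition("-")
    if not _is_valid_stored_id(stored_id) or check != str(luhn_check_digit(stored_id)):
        raise ValueError(f"bad ID or check digit: {shown!r}")
    return stored_id


shown = display_form("00012345")
print(shown, "->", parse_display(shown))
```

The stored value here is just a zero-padded number, so it carries no information about the object it identifies, and the digits-only alphabet keeps the character set restricted.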


FRBR statistics

At the FRBR Workshop at OCLC today, one of the most common questions/assertions was that only a minority of bibliographic records need FRBR to group them.  That's an easy impression to get from some of the statistics we've released (e.g. 78% of the works in WorldCat have a single manifestation), but at least slightly misleading.  Here's a statistic that is closer to what a typical user sees:  63% of the holdings in WorldCat are associated with works with two or more manifestations.

Update: Sorry, I got that wrong.  Here are the latest figures:

Out of 54,830,689 manifestations we found 43,794,883 work sets
88% of the works in WorldCat have a single manifestation
55% of the holdings are in works with more than one manifestation
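
It can seem odd that most works have a single manifestation while most holdings sit in multi-manifestation works.  A small sketch shows how both can hold at once.  The toy data below is invented purely for illustration; only the two totals at the top come from the figures above.

```python
# Totals quoted above.
manifestations = 54_830_689
work_sets = 43_794_883
print(f"average manifestations per work: {manifestations / work_sets:.2f}")

# Invented toy data: work id -> holdings count for each of its manifestations.
toy_works = {
    "w1": [40, 25, 10],   # a widely held work with three manifestations
    "w2": [30, 12],       # another popular work with two
    "w3": [5],
    "w4": [3],
    "w5": [8],
    "w6": [2],
    "w7": [6],
    "w8": [4],
}


def work_stats(works):
    """Return (share of works with one manifestation,
    share of holdings attached to works with more than one)."""
    single = sum(1 for m in works.values() if len(m) == 1)
    total_holdings = sum(sum(m) for m in works.values())
    multi_holdings = sum(sum(m) for m in works.values() if len(m) > 1)
    return single / len(works), multi_holdings / total_holdings


single_share, multi_holdings_share = work_stats(toy_works)
print(f"single-manifestation works: {single_share:.0%}")
print(f"holdings in multi-manifestation works: {multi_holdings_share:.0%}")
```

The works with several manifestations tend to be the widely held ones, so a small share of works can account for a large share of holdings.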

--Th

MetaWiki

Here's the promised follow-up to my ERRoLs post.

Jeff Young has been gradually generalizing his ideas on what can be done with web protocols like OAI-PMH and SRU.  One thing that has been missing in all of this is an editing capability.  Jeff realized that by adding an edit capability he had all the pieces needed to make a wiki.  Actually, more than that.  All the pieces to make a wiki with fielded data.


Exchanging FRBR information

I'm giving a presentation at the FRBR Workshop being held this week at OCLC.  Please pardon the PowerPoint.  I'll put an updated HTML version up soon.

--Th
