AACR2 lists four uses for uniform titles, but the most common is to group items that appear with multiple titles under a single heading. Works such as Don Quixote that are published in multiple languages and under hundreds of different titles benefit from this. Unfortunately, when we try to group manifestations into works, uniform titles do not always correspond to what anyone would consider a work.
We have been aware of this since we started trying to group bibliographic records into works (something we've dabbled in for nearly 20 years here at OCLC, and worked on seriously for half that time). My last post about controlling names was an unpleasant reminder of this, since the most popular 'work' presented under our newly controlled J.S. Bach records is actually there because of a MARC21 240 (uniform title) field containing Selections. Our current work clustering always uses the 240 in preference to the title proper reflected in the 245 (title statement) field. Music has its own highly developed approach to uniform titles, but similar groupings occur in other areas.
In Bach's case we found 1,429 different titles collocated under Selections. For some of these, Selections might be the best place to put them, but others, such as Switched-On Bach (by Carlos), have multiple manifestation records and a life of their own beyond simply Selections. Another case we've long known about is Treaties, etc., which groups treaties (e.g. Great Britain, Treaties, etc.). Although different treaties are obviously different works, that clustering somehow seems less surprising than hiding Switched-On Bach under Selections.
Some would probably argue that manifestations collected under Selections are really just themselves collections of works by Bach and some other mechanism is needed to get access to those works. I don't think there are any easy answers to this problem, but we are going to try out (here in OCLC Research first) a fairly simple approach. There are uniform titles that occur so many times that we consider them 'noise' titles for doing things like matching names. For FRBR processing we are going to try ignoring the top 25 uniform titles. Here they are, along with a count of how many times we see them in WorldCat:
3,125 SPEECHES
3,404 CANTATAS
3,873 QUARTETS\STRINGS
4,377 CHORAL MUSIC
4,662 CONSTITUTION
4,761 CHAMBER MUSIC
5,263 ESSAYS
5,428 OPERAS
5,535 SONATAS\PIANO
5,585 SYMPHONIES
7,016 ANNUAL REPORT
7,361 ORGAN MUSIC
8,333 VOCAL MUSIC
8,929 PLAYS
11,483 ORCHESTRA MUSIC
12,899 CORRESPONDENCE
13,191 INSTRUMENTAL MUSIC
14,811 SHORT STORIES
23,098 PIANO MUSIC
24,406 TREATIES ETC
26,234 SONGS
46,877 POEMS
58,303 LAWS ETC
59,210 WORKS
91,940 SELECTIONS
There are a number of other generic uniform titles beyond the top 25, but at that point we start to see uniform titles for works (e.g. The Book of Common Prayer is #26).
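The rule described above (prefer the 240, unless it is one of the noise titles, in which case fall back to the 245) can be sketched roughly as follows. This is only an illustration: the record layout (a dict keyed by MARC tag), the function names, and the simplified `normalize` are all assumptions, and the noise set holds just a handful of the 25 titles.

```python
# Hypothetical subset of the "noise" uniform titles listed above, in normalized form.
NOISE_UNIFORM_TITLES = {
    "SELECTIONS", "WORKS", "LAWS ETC", "POEMS", "SONGS", "TREATIES ETC",
}


def normalize(title):
    """Simplified stand-in for real title normalization (not PCC/NACO)."""
    # Replace punctuation with spaces, then upper-case and collapse whitespace.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in title)
    return " ".join(cleaned.upper().split())


def work_title(record):
    """Pick the title used for work clustering.

    `record` is an assumed dict with an optional "240" (uniform title)
    and a required "245" (title proper).
    """
    uniform = record.get("240")
    if uniform and normalize(uniform) not in NOISE_UNIFORM_TITLES:
        return normalize(uniform)  # a specific uniform title is trusted
    return normalize(record["245"])  # generic or missing 240: use the title proper


print(work_title({"240": "Selections", "245": "Switched-On Bach"}))
print(work_title({"240": "Don Quixote", "245": "El ingenioso hidalgo"}))
```

In this sketch the first record clusters on its 245 because Selections is treated as noise, while the second keeps its uniform title.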
This isn't our first abandonment of the 240 field. WorldCat Identities originally preferred the 240 to the 245 for the work display. Unfortunately relatively few people benefited from seeing Prestuplenie i nakazanie instead of Crime and Punishment, so we switched to using the most common form of the 245 for display.
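Choosing the most common form of the 245 for display amounts to a simple frequency count over the titles in a work cluster. A minimal sketch, assuming the 245 strings for one cluster are already collected in a list (the function name and input shape are hypothetical):

```python
from collections import Counter


def display_title(titles_245):
    """Return the most frequent 245 title in a cluster, or None if it is empty."""
    # Ignore blank entries and surrounding whitespace before counting.
    counts = Counter(t.strip() for t in titles_245 if t.strip())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


cluster = [
    "Crime and Punishment",
    "Prestuplenie i nakazanie",
    "Crime and Punishment",
]
print(display_title(cluster))
```

A real implementation would probably count normalized forms and then pick a representative spelling, but the idea is the same.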
Note: The list of common uniform titles is in upper case because of normalization. In the past we normalized to lower case for ease of reading, but the latest version of PCC/NACO normalization uses Unicode mappings to normalize case, and since some of these mappings are available only to uppercase, we are following their guidelines and switching to uppercase.
--Th
Update (3 December 2008): We couldn't stand the uppercase, so after we've done the normalization we now 'lower' the characters that have a lower case character associated with them.
The list of uniform titles to ignore hasn't changed much, except that 'quartets\strings', 'sonatas\piano' and 'symphonies' have been removed. For non-240 titles we have a longer list generated algorithmically. (For VIAF name matching we have similar lists of titles we don't trust to bring names together, one for each authority file we are processing.)
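The uppercase-then-lowercase step mentioned in the update could look something like the sketch below. Python's built-in Unicode case mappings stand in here for the PCC/NACO rules, which this does not attempt to reproduce, and the function name is hypothetical.

```python
def normalize_case(text):
    """Upper-case first, then lower each character that has a lowercase form."""
    # Upper-casing applies mappings that only exist in that direction,
    # for example the German sharp s becomes "SS".
    upper = text.upper()
    # Characters without a lowercase counterpart (digits, punctuation)
    # pass through unchanged.
    return "".join(ch.lower() for ch in upper)


print(normalize_case("Straße"))
print(normalize_case("STRASSE"))
print(normalize_case("QUARTETS\\STRINGS"))
```

Going through uppercase first means that variant spellings such as "Straße" and "STRASSE" end up with the same normalized form, which lower-casing alone would not achieve.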