There has been an interesting discussion on the Code4Lib mailing list over the last couple of days about how to rank results in a FRBR environment. I weighed in with the common opinion around here (at least in OR) that the major factor in ranking should be some sort of popularity score. We typically use the total number of WorldCat holdings for the work, though circulation data could presumably be used as well. I claimed that other ranking criteria, such as the number of times a term occurs, are secondary at best.
Shortly after posting that, we had a visitor who pointed out a weakness in ranking only by library counts. Diane Vizine-Goetz was demonstrating a soon-to-be-released version of FictionFinder by searching for 'Don Quixote'. The second most highly ranked item was Henry Fielding's History of the Adventures of Joseph Andrews, described as "A Henry Fielding novel written to imitate the action of Cervantes' romantic-heroic character, Don Quixote."
Now obviously the Fielding novel is related to Don Quixote, but it doesn't seem as though it should be second in the list. That is especially true because several other 'works' were listed that look as though they should have been included in the main Don Quixote group but were missed because of title variants (e.g., The Ingenious Gentleman Don Quixote de la Mancha). It's even conceivable that Joseph Andrews could have come ahead of Don Quixote in the list if it had more library holdings. (Actually, it isn't even close: 4,866 holdings versus Don Quixote's 40,257.)
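The holdings-first ordering discussed here can be sketched in a few lines of Python. This is only an illustration, not how FictionFinder is implemented: the `WorkSet` class, the `rank` function, and the `term_hits` values are made up for the example, and only the two holdings counts come from the figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class WorkSet:
    title: str
    holdings: int   # total library holdings for the work
    term_hits: int  # hypothetical count of query-term occurrences


def rank(works):
    """Order work-sets by holdings, using term hits only to break ties."""
    return sorted(works, key=lambda w: (w.holdings, w.term_hits), reverse=True)


# Holdings are the figures quoted in the post; term_hits are invented.
works = [
    WorkSet("History of the Adventures of Joseph Andrews", 4866, 1),
    WorkSet("Don Quixote", 40257, 3),
]

ranked = rank(works)
for position, work in enumerate(ranked, start=1):
    print(position, work.title, work.holdings)
```

Because holdings dominate the sort key, any work that matches the query at all is placed purely by how widely it is held, which is how a related but different novel can end up near the top.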
So, I think it is clear that a simple library count isn't the best possible way to rank FRBR work-sets. What should be done to fix it is less clear. In the above example the string 'Cervantes, author of Don Quixote' actually appears in the subtitle of many manifestations of Joseph Andrews, so matching on the search terms alone would not necessarily push it down the list. For now, ranking by library holdings is easy to understand and, in our experience, works very well.
--Th