This posting is partly a shameless experiment in manipulating search engine results. WorldCat Identities normally returns XML pages with an associated style sheet that transforms the page to HTML within your browser. This works fairly well with modern browsers, less well with mobile devices, and very poorly for harvesters such as Google and Yahoo, which don't really know what to do with XML and are reluctant to run our style sheet on their machines to see what it would produce.
So we (and by "we" I mean that others did most of the work, primarily Ralph LeVan) recently added a feature to Identities that detects the major harvesters and converts the XML pages to HTML on our end. That way the harvesters see plain HTML, something they are very familiar with. The change just went in last weekend, but the server-side rendered HTML is already showing up as the cached version of some Identities pages in Google. More importantly, this seems to be having the desired effect of making the pages easier to find.
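For anyone curious what detecting a harvester might look like, here is a minimal sketch in Python. It is not the code Identities runs. The token list, the function names, and the choice of content types are all assumptions made for illustration; the idea is only that the server looks at the User-Agent header and decides whether to run the style sheet itself or leave that to the browser.

```python
# Hypothetical substrings that identify major harvesters in a User-Agent
# header. The list Identities actually uses is not given in this post.
HARVESTER_TOKENS = ("googlebot", "slurp", "msnbot")


def is_harvester(user_agent):
    """Return True if the User-Agent string looks like a known harvester."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in HARVESTER_TOKENS)


def choose_response(user_agent):
    """Return (content_type, render_on_server) for a request.

    Harvesters get HTML produced on the server; everyone else gets the
    XML page and applies the style sheet in the browser.
    """
    if is_harvester(user_agent):
        return ("text/html", True)
    return ("text/xml", False)
```

A request without a User-Agent header is treated like an ordinary browser here, so it would get the XML page. That default is a guess on my part, not a description of the real service.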
Another thing we have done is to create sitemaps for the top 50,000 Identities pages (based on library holdings). We haven't added pointers to those sitemaps in our robots.txt file yet, but that is scheduled to happen soon. When that happens it will be easier for harvesters to find Identities pages about popular authors, and they should do a good job indexing them because they will see HTML.
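As a rough illustration of the sitemap step, the sketch below ranks pages by holdings, keeps the top N, and writes them out in the sitemaps.org XML format, along with the Sitemap line that would go in robots.txt. The function names, the (url, holdings) input shape, and the example URLs in the tests are hypothetical; this is not how the Identities sitemaps were actually built. One thing worth knowing is that the sitemap protocol caps a single sitemap file at 50,000 URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def top_pages(pages, limit=50000):
    """Return the URLs of the most widely held pages.

    pages is an iterable of (url, holdings) pairs; this input shape is an
    assumption made for the example.
    """
    ranked = sorted(pages, key=lambda page: page[1], reverse=True)
    return [url for url, _holdings in ranked[:limit]]


def build_sitemap(urls):
    """Return a sitemap document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def robots_sitemap_line(sitemap_url):
    """Return the robots.txt line that points harvesters at a sitemap."""
    return "Sitemap: " + sitemap_url
```

ElementTree escapes characters such as & in the URLs, which the sitemap format requires.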
Here's the experiment. Rosemary Sutcliff's Identities page doesn't seem to be in Google today. I'm hoping that the link to it from this blog entry prompts Google to harvest the page, and that the page achieves a reasonable ranking in a Google search (e.g., on the first page of results). Some Identities pages already rank this well. Lorcan Dempsey noticed that a Google search for Raymond Williams ranks the Identities page highly (probably because Lorcan has linked to it from his blog). Interestingly, a search for Williams, Raymond does even better. Identities also does well if you enter very specific searches, even for famous people (e.g., Armstrong, Louis 1901-1971).
--Th
Update: This was originally posted August 1, 2008. Google's cache shows Rosemary Sutcliff's WorldCat Identities page was harvested on August 4th, and I got a Google alert that showed it was available on August 8th.