One often hears statements like 'transmitting the Library of Congress in 15 minutes' or 'equivalent of the Library of Congress on a key fob.' I suppose I've contributed to that by putting all of WorldCat on an iPod (not hard to do, but the iPod isn't up to interacting with it yet), but how realistic is LC on a key fob?
The Library of Congress is larger than I thought. The site claims 29 million books, 2.7 million recordings, 12 million photographs, 4.8 million maps and 58 million manuscripts! Sometimes people equate a volume to a megabyte (the typical novel is around that), but more realistically, you'll need a scanned image of those pages, around 100Kbytes/page. At 500 pages/volume that gives us about 50 megabytes per volume. At 500 megabytes/recording, 2 megabytes/photo, 5 megabytes/map and 50 megabytes/manuscript I get: 30m x 50mb + 3m x 500mb + 12m x 2mb + 5m x 5mb + 60m x 50mb = 6 petabytes. This doesn't include video. At 5 gigabytes/video, it only takes 300,000 videos to match the scanned size of all the books, so let's call the collection an even 10 petabytes. This is quite a bit larger than the size people often use, but more realistic.
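For anyone who wants to check the sum or plug in different guesses, here is a minimal Python sketch of the same estimate. The counts are the rounded figures used above and the per-item sizes are the same rough assumptions; the names `collection` and `size_in_petabytes` are just labels for this illustration, and decimal units (1 petabyte = 10^9 megabytes) are assumed.

```python
# Rounded item counts and assumed per-item sizes in megabytes (rough guesses).
MB_PER_PB = 1_000_000_000  # decimal units: 1 PB = 10^9 MB

collection = {
    "books": (30_000_000, 50),        # 500 pages x 100 KB/page
    "recordings": (3_000_000, 500),
    "photographs": (12_000_000, 2),
    "maps": (5_000_000, 5),
    "manuscripts": (60_000_000, 50),
}


def size_in_petabytes(items):
    """Sum count x size over every category and convert MB to PB."""
    total_mb = sum(count * mb for count, mb in items.values())
    return total_mb / MB_PER_PB


total_pb = size_in_petabytes(collection)
print(f"{total_pb:.2f} PB")  # prints "6.05 PB"
```

Books and recordings each come to about 1.5 petabytes and manuscripts to about 3, so photographs and maps barely register in the total.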
My son recently put a terabyte together for around $1,000. I just bought a one-gigabyte Memory Stick for $100 (about 100x the cost of disk). 10 petabytes is 10,000 terabytes and 10,000,000 gigabytes, so we've got something between 4 and 7 orders of magnitude in cost reduction to wait for. Assuming the cost of storage continues to decline at 50% per year, it takes about three years to get a 10x reduction in price. So it will be 12 years before you could stuff the contents of the Library of Congress on your personal storage server, 21 years before you could get it on your key fob.
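The waiting-time arithmetic can be sketched in a few lines of Python. This assumes a constant 50% yearly decline, which is only an extrapolation, and `years_to_wait` is a name made up for this example. Halving every year works out to a bit over three years per factor of ten (about 3.3), so the exact figures run slightly longer than the rounded ones above.

```python
import math


def years_to_wait(current_cost, target_cost, annual_decline=0.5):
    """Years until current_cost falls to target_cost at a constant yearly decline."""
    ratio = current_cost / target_cost
    return math.log(ratio) / -math.log(1 - annual_decline)


# Disk: 10,000 terabytes at $1,000 each today, target budget $1,000.
disk_years = years_to_wait(10_000 * 1_000, 1_000)

# Key fob: 10,000,000 gigabytes at $100 each today, target budget $100.
fob_years = years_to_wait(10_000_000 * 100, 100)

print(f"disk: {disk_years:.1f} years, key fob: {fob_years:.1f} years")
```

Run as written, this gives roughly 13 years for the disk and 23 for the key fob, against 12 and 21 with the rounded three-years-per-magnitude rule.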
That's not too bad. 15-20 years is a long way to extrapolate, but the cost decline has held up for the last 40. I got asked the same question in 1970 though, and didn't have any easy answer then as to what it would mean. We were talking microfilm, but it was still the same question. Of course I suppose the contents of the Library of Congress fall into the "limited canon of book-centered knowledge", so the interest in this has probably already declined. Still, it does give you an idea of the capabilities we might have and how hard it's going to be for publishers to keep everyone from having a copy of everything they come in contact with.
Update 2011 June 30: It's been six years since this was originally posted. Using the rough 3 years/10-fold decrease, we should have seen a 100-fold decline in storage prices. It looks like you can get about two terabytes in a USB external drive for about $100, which is a factor of 20 rather than 100. SD cards are now around $1/gigabyte, a 100-fold reduction in 6 years, pretty much on target.
So, $1,000 now gets you 20 terabytes of disk, but we need 10 petabytes, or 500 x 20TB. Call that 2.5 orders of magnitude, which at 3 years/magnitude gives 7-8 years left at the original rate, and more like 12 years at the current rate of decline. That is the same wait predicted 6 years ago, especially if, as one of the comments suggests, LC continues to grow their collection. The $100 key fob, at 100 gig, still needs a 100,000-fold drop in price, but only 15 years might get us there!
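Here is a rough Python sketch of the update arithmetic, for anyone who wants to redo it with newer prices. It turns the observed disk decline (a factor of 20 over 6 years) into years per order of magnitude and applies that to what is still needed. The function names are invented for this illustration, and it assumes the decline stays steady from here on.

```python
import math


def years_per_magnitude(observed_factor, years_elapsed):
    """Years needed for a 10x price drop, given an observed drop over some period."""
    return years_elapsed / math.log10(observed_factor)


def years_remaining(needed_factor, years_per_10x):
    """Years until prices fall by needed_factor at the given pace."""
    return math.log10(needed_factor) * years_per_10x


original_pace = 3.0                          # the rule of thumb from 2005
observed_pace = years_per_magnitude(20, 6)   # disk: 20x cheaper in 6 years

disk_factor = 10_000 / 20        # 10 PB needed vs 20 TB per $1,000
fob_factor = 10_000_000 / 100    # 10 PB needed vs 100 GB per $100

disk_original = years_remaining(disk_factor, original_pace)
disk_observed = years_remaining(disk_factor, observed_pace)
fob_original = years_remaining(fob_factor, original_pace)

print(f"disk: {disk_original:.1f} or {disk_observed:.1f} years")
print(f"key fob: {fob_original:.1f} years")
```

The observed disk pace comes out near 4.6 years per factor of ten, which is where the 12-year figure comes from; the key fob number uses the original pace, since flash has stayed on it.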