One often hears statements like 'transmitting the Library of Congress in 15 minutes' or 'equivalent of the Library of Congress on a key fob.' I suppose I've contributed to that by putting all of WorldCat on an iPod (not hard to do, but the iPod isn't up to interacting with it yet), but how realistic is LC on a key fob?
The Library of Congress is larger than I thought. The site claims 29 million books, 2.7 million recordings, 12 million photographs, 4.8 million maps and 58 million manuscripts! Sometimes people equate a volume to a megabyte (the text of a typical novel is around that), but more realistically, you'll need scanned images of the pages, at around 100 Kbytes/page. At 500 pages/volume that gives us about 50 megabytes per volume. At 500 megabytes/recording, 2 megabytes/photo, 5 megabytes/map and 50 megabytes/manuscript, and rounding the counts up a bit, I get: 30m x 50MB + 3m x 500MB + 12m x 2MB + 5m x 5MB + 60m x 50MB = 6 petabytes. This doesn't include video. At 5 gigabytes/video, it only takes 300,000 videos to match the scanned size of all the books, so let's call the collection an even 10 petabytes. This is quite a bit larger than the size people often use, but more realistic.
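For anyone who wants to redo the arithmetic, here is a minimal sketch of the estimate in Python. The counts are the rounded ones used in the sum above, the per-item sizes are the guesses from this post, and the names (`collection`, `total_bytes`) are just made up for illustration.

```python
# Rough size estimate for the collection, using the rounded counts and
# per-item size guesses from the post (decimal units: 1 MB = 10**6 bytes).
MB = 10**6
PB = 10**15

collection = {
    # item type: (rounded count, assumed bytes per item)
    "books": (30_000_000, 50 * MB),
    "recordings": (3_000_000, 500 * MB),
    "photographs": (12_000_000, 2 * MB),
    "maps": (5_000_000, 5 * MB),
    "manuscripts": (60_000_000, 50 * MB),
}


def total_bytes(items):
    """Sum count * size over every item type."""
    return sum(count * size for count, size in items.values())


total_pb = total_bytes(collection) / PB
print(f"{total_pb:.2f} PB")  # 6.05 PB, before adding anything for video
```

Changing any one of the per-item guesses shows how sensitive the total is to it. Manuscripts and books together account for about three quarters of the 6 petabytes.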
My son recently put a terabyte together for around $1,000. I just bought a one-gigabyte Memory Stick for $100 (about 100x the cost of disk). 10 petabytes is 10,000 terabytes, or 10,000,000 gigabytes, so at today's prices the collection would cost about $10 million on disk or $1 billion on Memory Sticks. That leaves something between 4 and 7 orders of magnitude in cost reduction to wait for. Assuming the cost of storage continues to decline at 50% per year, it takes about three years to get a 10x reduction in price. So it will be 12 years before you could stuff the contents of the Library of Congress on your personal storage server, 21 years before you could get it on your key fob.
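The waiting-time estimate can be sketched the same way. This assumes a steady 50% annual price decline, as above. The function name `years_to_reduce` is made up for illustration, and the $1,000 and $100 targets are my reading of "personal storage server" and "key fob".

```python
import math


def years_to_reduce(factor, annual_decline=0.5):
    """Years for a price to fall by `factor` at a constant annual decline."""
    return math.log(factor) / -math.log(1 - annual_decline)


# Prices from the post: $1,000 per terabyte of disk, $100 per gigabyte of flash.
disk_cost = 10_000 * 1_000       # 10,000 TB -> $10 million
flash_cost = 10_000_000 * 100    # 10,000,000 GB -> $1 billion

print(round(years_to_reduce(10), 2))                # 3.32 years per factor of ten
print(round(years_to_reduce(disk_cost / 1_000)))    # 13 years to a $1,000 server
print(round(years_to_reduce(flash_cost / 100)))     # 23 years to a $100 key fob
```

Halving every year gives a factor of ten in a bit over 3.3 years rather than exactly three, so the unrounded figures come out a year or two longer than the 12 and 21 years quoted. Given how rough the inputs are, that is well inside the noise.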
That's not too bad. 15-20 years is a long way to extrapolate, but the cost decline has held up for the last 40. I got asked the same question in 1970 though, and didn't have any easy answer then as to what it would mean. We were talking microfilm, but it was still the same question. Of course I suppose the contents of the Library of Congress fall into the "limited canon of book-centered knowledge", so the interest in this has probably already declined. Still, it does give you an idea of the capabilities we might have and how hard it's going to be for publishers to keep everyone from having a copy of everything they come in contact with.
Update 2011 June 30: It's been six years since this was originally posted. Using the rough 3 years/10-fold decrease, we should have seen a 100-fold decline in storage prices. Looks like you can get about two terabytes in a USB external drive for about $100, or a factor of 20 rather than 100. SD cards are now in the $1/gigabyte range, a 100-fold reduction in 6 years, pretty much on target.
So, $1,000 now gets you 20 terabytes of disk, but we need 10 petabytes, or 500 x 20 TB. Call that 2.5 orders of magnitude, which at 3 years/magnitude gives 7-8 years left at the original rate, and more like 12 years at the current rate of decline. That is the same wait predicted 6 years ago, especially if, as one of the comments suggests, LC continues to grow their collection. The $100 key fob, at 100 gig, still needs a 100,000-fold drop in price, but only 15 years might get us there!
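The "current rate" figure comes from turning the observed disk decline (20-fold in six years) into years per factor of ten and applying it to the remaining gap. A small sketch, with the helper name `years_per_tenfold` made up for illustration:

```python
import math


def years_per_tenfold(observed_factor, years_elapsed):
    """Years needed per 10x price drop, given an observed drop over some period."""
    return years_elapsed / math.log10(observed_factor)


remaining_orders = math.log10(500)       # 10 PB needed vs. 20 TB per $1,000
observed_rate = years_per_tenfold(20, 6)  # disk: 20x cheaper over 6 years

print(round(observed_rate, 1))                     # 4.6 years per factor of ten
print(round(remaining_orders * 3, 1))              # 8.1 years at the original rate
print(round(remaining_orders * observed_rate, 1))  # 12.4 years at the observed rate
```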
Update 2015 May: Ten years since this was first posted, and the exponential drop in prices seems to have slowed. USB sticks in general seem to have stopped growing, although you can get a 1 terabyte stick for about $800. Amazon is offering at least one 128 GByte Micro SD card for $20. Those little cards are 165 cubic millimeters (15x11x1). A cubic foot could hold about 154 thousand of them, or 20 million gigabytes, probably enough to hold LC, even with some growth, especially since image compression algorithms have improved substantially over the last ten years. That would be heavy to pick up, but it would be about as portable as the first portable PCs!
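A quick check of the cubic-foot figure, as a sketch. It assumes the cards pack with no wasted space at all, so it is an upper bound, and it comes out a little higher than the 154 thousand used above. Either count gives roughly 20 petabytes.

```python
# How many 15 x 11 x 1 mm Micro SD cards fit in a cubic foot, by volume alone.
MM_PER_FOOT = 304.8
CARD_GB = 128

card_mm3 = 15 * 11 * 1                # 165 cubic millimeters per card
cubic_foot_mm3 = MM_PER_FOOT ** 3     # about 28.3 million cubic millimeters

max_cards = int(cubic_foot_mm3 // card_mm3)      # ignores packing losses
capacity_pb = 154_000 * CARD_GB / 1_000_000      # using the post's card count

print(max_cards)              # 171617
print(round(capacity_pb, 1))  # 19.7 (petabytes)
```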
But 50 thousand 128 GByte Micro SD cards would cost $1 million. To get to $100 we're still looking for 4 orders of magnitude, or 12 more years at best. 5 terabyte disk drives are now about $130, so $1,000 gets you nearly 40 TBytes on disk, half of what it cost four years ago, but not nearly as fast a drop as predicted. Disk is still cheaper than flash memory, but only by a factor of 10 or so. If we gave up on the $100 stick and targeted a $1,000 handheld device (which would be disk today), we are only three orders of magnitude away from having LC in our hand! Maybe another 10 years?
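The 2015 prices can be run through the same kind of sketch. All the inputs are the ones quoted above and the variable names are made up for illustration.

```python
import math

# Flash: 50 thousand 128 GByte Micro SD cards at $20 each.
card_price, card_gb, cards = 20, 128, 50_000
flash_cost = cards * card_price                      # $1,000,000
orders_to_fob = math.log10(flash_cost / 100)         # orders of magnitude down to $100

# Disk: 5 terabyte drives at about $130 each.
drive_price, drive_tb = 130, 5
tb_per_1000_dollars = 1_000 / drive_price * drive_tb

# Price per terabyte for each medium.
flash_per_tb = card_price / card_gb * 1_000
disk_per_tb = drive_price / drive_tb

print(flash_cost, round(orders_to_fob))    # 1000000 4
print(round(tb_per_1000_dollars, 1))       # 38.5
print(round(flash_per_tb / disk_per_tb))   # 6
```

On these particular prices flash is about six times the cost of disk per terabyte, so the gap between the two is now down to roughly one order of magnitude.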
As several of the comments have mentioned, we're going to be depending on telecommunications for access to collections this size for some time to come. One thing that has changed is that there are now lots of places with storage systems big enough to hold 20 petabytes without any problem, so hosting such a collection is now technically feasible.