Sunday, June 24, 2007

Massive info processing needed

Step function improvements in information processing and communications capabilities are needed now and in the future.

Massive amounts of data are becoming increasingly available and currently humanity only has narrowband access to these biomedical, genetic, astronomical, particle accelerator and other data sets.

Large and diverse data sets requiring storage, processing and upload/download will only continue to proliferate. For example, as the web goes 3-D, rapid video, audio and data download and upload are needed, and as objects become IP tagged, sensor and RFID networks require instantaneous data collection and summarization.

Scientific philosophers such as Anders Sandberg correctly point to information processing capability as the main limit on any society's growth and development. To attain a new level and not just asymptote out on the current plateau, a re-envisioning of information processing is necessary.

As with energy, some alternatives are being explored at the fringes, but there is no real replacement methodology on the horizon. Also as with energy, there may be a dearth of real progress until the present means is more substantially threatened; however, this point is coming much sooner in information processing than in energy.


sedicious said...

Forgive me for playing devil's advocate here, but what greater access to all this data is really needed? I mean, sure, it's rather a hassle to access it and use the large datasets, but as long as the data is only of interest to experts anyway, who is it that's really missing out? Any specific examples?

I ask because I think the real challenge is not nearly so much with the "biomedical, genetic, astronomical, particle accelerator and other data sets" as it is with calendars, emails, voicemail, blogs, news, press releases, podcasts, legal filings, notices and rulings, etc. For compiling scientific and statistical data our current methods are a bit clumsy and expensive, but we actually do see both technological solutions and the social/economic apparatuses to implement them. This is much less the case for textual, audio, and video data, where the current state of the art can do some metadata-based retrieval, but very little in the way of summarization, correlation, compilation, etc.

LaBlogga said...

Hi Sedicious,

Thank you for your thoughtful comments. The benefit of better processing, tools, communication and administration for data sets is to open them up 1) to all industry experts (right now it is hard for worldwide experts to access the same large data sets easily) and 2) to amateurs (per Wikinomics), who can try a much greater diversity of methods and uses.

Specific data set examples include the petabytes coming down from the Hubble Space Telescope and the expected data load from the CERN particle accelerator.

I agree that there is a huge amount of other data to be processed and much better tools to be developed for visualization, etc., but the biggest advances will come from the next level of scientific data administration: for example, 3-D modeling of the brain, cancer tissues, and memory and aging processes.

Deepak said...

Life scientists have no idea what kind of data deluge they are going to face in coming years (well, some of them do). We've been very happy in our little silos, but to get to the next generation of discovery, data will have to be opened up and put up in the cloud.

That said, I agree that it's not just scientific data that needs to be processed.

LaBlogga said...

Hi Deepak, thanks for the comment. Yes, there are and will be many kinds of data that will need to be processed, stored, transmitted and manipulated, in step functions more prolific than today. It is still the sciences from which the key advances for humanity are most likely to emanate.