Sunday, April 12, 2009

Gene Encyclopedia of all Life

A tremendous resource would be an open gene database of all of the genes present in eukaryotic, archaeal and bacterial life. There are several open genomic databases now but the information is organized around genomes and organisms rather than specific genes and gene function.

A gene database of all life is in the same vein as E.O. Wilson’s Encyclopedia of Life, but at the next level of detail. The Encyclopedia of Life hopes to provide a webpage with scientific information for every known species on Earth. The gene database would provide a webpage and scientific details for each gene present in life, along with other information such as a cross-reference to all of the different species in which the gene is found.
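As a rough illustration of what "organized around genes rather than genomes" could mean, here is a minimal Python sketch that inverts organism-centric records into gene-centric entries, each carrying a cross-reference to the species in which the gene appears. The `GeneEntry` fields, the function names and the toy records (`geneX`, `species_a` and so on) are all hypothetical placeholders, not the schema or API of any existing database.

```python
from dataclasses import dataclass, field


@dataclass
class GeneEntry:
    """One gene-centric page. Hypothetical fields, not an existing schema."""
    name: str
    function: str
    species: set = field(default_factory=set)


def build_gene_index(genomes):
    """Invert organism-centric records into gene-centric entries.

    `genomes` maps a species name to a list of (gene_name, function) pairs.
    """
    index = {}
    for species, genes in genomes.items():
        for gene_name, function in genes:
            # Create the entry the first time a gene is seen, then
            # record every species in which it turns up.
            entry = index.setdefault(gene_name, GeneEntry(gene_name, function))
            entry.species.add(species)
    return index


def search_by_function(index, keyword):
    """Return sorted names of genes whose function text mentions the keyword."""
    keyword = keyword.lower()
    return sorted(e.name for e in index.values() if keyword in e.function.lower())


# Toy, made-up records for illustration only.
toy_genomes = {
    "species_a": [("geneX", "DNA repair"), ("geneY", "sugar transport")],
    "species_b": [("geneX", "DNA repair")],
}
gene_index = build_gene_index(toy_genomes)
```

With an index like this, `gene_index["geneX"].species` lists every species carrying that gene, and `search_by_function(gene_index, "repair")` looks genes up by what they do rather than by which organism they came from. A real system would also have to decide when two genes in different species count as "the same" gene, which this sketch sidesteps by matching on name.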

Merge the Entrez Genome Project and the PartsRegistry
The foundations for a gene database of all life exist, and perhaps the vision too, since the idea is an obvious one. What is missing is its targeted pursuit as a funded research priority. Existing genomics databases such as the U.S. NCBI’s Entrez Genome Project database could be extended and merged into one database that is more explicitly searchable by gene function. Such an effort could join forces with the PartsRegistry from synthetic biology, which provides a homepage, datasheet and genomic sequence by gene or biological function.

NCBI’s Entrez Genome Project database: a genomic catalog of all life


Unifying the work of E.O. Wilson, Craig Venter, Penny Boston and Drew Endy
An interesting project would be the unification of the Encyclopedia of Life, genomics-by-organism databases and parts registry-by-gene databases, together with the aggressive pursuit of cataloguing and sequencing newly discovered organisms and genes. A gene encyclopedia could rapidly extend human knowledge and facilitate the era of personalized medicine, as these novel genes could have extensive application in human therapies and pharmaceuticals, energy, climate management, agriculture and other areas.
Tremendous novelty and diversity remain unstudied among species (E.O. Wilson), among organisms in the sea (Craig Venter), and among extremophile life in caves (Penny Boston). Of these, 70-90% are novel organisms, most of which have not undergone any gene identification, sequencing, functional assessment or cataloging.

A data resource like a gene encyclopedia could also shift the research focus up a level, toward analytics. It will be interesting to see whether an era of fully fungible genes across life arises, how easy it is to transplant function from one organism to another, and how function expresses differently in different life forms.
