Post-Map Workplan

From biology to physiology

At first glance, the relatively small number of human genes reported by Celera Genomics Group and the International Human Genome Sequencing Consortium implies a level of simplicity that would make it easier to predict the structure and function of the protein each gene encodes.

For companies involved in structure prediction, fewer genes suggest less variety in protein structures, making it easier to apply homology modeling techniques to determine the three-dimensional structure of an encoded protein.

And for functional genomics companies such as Lexicon Genetics Inc. (LEXG, The Woodlands, Texas), "getting knockouts for every mammalian gene is now realistic" given the smaller number of genes, according to CEO Arthur Sands.

Initially, the results point to increased utility for model organisms, for example by allowing a gene of interest in a given model organism to be mapped immediately to its corresponding human gene. They also point to a more rapid influx of data into computational and in silico modeling systems, as a smaller-than-expected number of genes reduces the number of protein families to be elucidated.

But although the findings by Celera (CRA, Rockville, Md.) and the Consortium (IHGSC) reveal common sets of genes across a variety of organisms, they also expose a more intricate picture, one in which a single gene can give rise to multiple proteins through alternative splicing and in which proteins are built from complex domain architectures.

Going forward, the challenge will be to use the findings to refine the various approaches to predicting gene structure and gene function, and ultimately to develop model systems that accurately forecast physiological outcomes in humans.

Conservation across organisms

Last week's reports highlighted that the human genetic code is broadly conserved across invertebrate and vertebrate species. The Consortium's Nature paper noted that 77 percent of human proteins and 93 percent of human protein domains are conserved between humans and invertebrates.