2012 September 24 Monday
Implications Of Lots More Genetic Variants

Been meaning to comment on this report of a couple of weeks ago: We don't carry around anywhere near as much junk DNA as previously claimed. Not too many slacker letters in the genome going along for the ride.

"Our genome is simply alive with switches: millions of places that determine whether a gene is switched on or off," says Ewan Birney of EMBL-EBI, lead analysis coordinator for ENCODE. "The Human Genome Project showed that only 2% of the genome contains genes, the instructions to make proteins. With ENCODE, we can see that around 80% of the genome is actively doing something. We found that a much bigger part of the genome (a surprising amount, in fact) is involved in controlling when and where proteins are produced, than in simply manufacturing the building blocks."

This report has a number of important implications:

  • A larger portion of the genome being functionally significant brings with it a larger number of genetic variants. Many variants that were thought to be functionally insignificant really cause changes in how our bodies function.
  • Therefore we have far more genetic variation between humans at all levels of aggregation. We are more different in more ways due to genes than previously asserted.
  • It will be harder to figure out what each genetic variant does because it will be harder to hold most variants constant while looking at the effects of a small number of them.
  • It will also be harder to figure out what each genetic variant does because so many more variants will need to be studied. Some may cancel out the effects of others.
  • We will have more functionally significant existing genetic variants to choose among when creating genetically engineered super offspring.
  • Our genetic load of harmful mutations (we all carry harmful mutations) is much larger. We are more flawed than we previously thought. So that also opens up greater possibility for making better humans. More flaws to get rid of. This can even work for those of us already born since we can use cell therapies built from cells with fewer harmful mutations.

If we had had cheaper genetic sequencing equipment 30 years ago we could not have made very good use of the data because of the cost of computer memory, disk, and CPU at the time. The amount of data we need to process to tease out what genes do is huge. We each have about 3 billion genetic letters. But when sequencing a genome it has to be sequenced many times over to identify errors and to fit together overlapping pieces. So each genome's sequencing requires processing tens of billions of genetic letters, with lots of comparisons and building up of data structures to gradually connect all the pieces together.
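To put rough numbers on that, here is a back-of-the-envelope sketch. The 3 billion figure is from above. The coverage values (10x and 30x) and the 2-bits-per-letter packing are my own illustrative assumptions, not figures from the ENCODE report, and real sequencing files are far larger because they also store quality scores and read names.

```python
# Rough estimate of how many genetic letters get processed per genome.
GENOME_LETTERS = 3_000_000_000  # about 3 billion letters per genome


def letters_to_process(coverage: int, genome_letters: int = GENOME_LETTERS) -> int:
    """Total letters read if each position is sequenced `coverage` times on average."""
    return coverage * genome_letters


def raw_bytes(letters: int, bits_per_letter: int = 2) -> int:
    """Minimum storage if each of A, C, G, T is packed into 2 bits (assumed packing)."""
    return (letters * bits_per_letter + 7) // 8


# Coverage values below are assumptions chosen only to illustrate the scale.
for coverage in (10, 30):
    total = letters_to_process(coverage)
    gigabytes = raw_bytes(total) / 1e9
    print(f"{coverage}x coverage: {total:,} letters, about {gigabytes:.1f} GB packed")
```

Even the most compact possible encoding runs to gigabytes per genome before any of the comparison and assembly work begins.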

Once each genome's sequence is known, comparing it against other genomes requires even more computing power. The differences in those letters need to be compared with many attributes of each of us, for tens or hundreds of millions of people, in order to discover all their effects.

By Randall Parker    2012 September 24 09:17 PM