August 26, 2005
Public DNA Sequence Databases Reach 100 Gigabases Of Data

The big public DNA sequencing databases have hit the 100 gigabase mark.

The world’s three leading public repositories for DNA and RNA sequence information have reached 100 gigabases [100,000,000,000 bases; the ’letters’ of the genetic code] of sequence. Thanks to their data exchange policy, which has paved the way for the global exchange of many types of biological information, the three members of the International Nucleotide Sequence Database Collaboration [INSDC] – EMBL Bank [Hinxton, UK], GenBank [Bethesda, USA] and the DNA Data Bank of Japan [Mishima, Japan] – all reached this milestone together.

Graham Cameron, Associate Director of EMBL’s European Bioinformatics Institute, says "This is an important milestone in the history of the nucleotide sequence databases. From the first EMBL Data Library entry made available in 1982 to today’s provision of over 55 million sequence entries from at least 200,000 different organisms, these resources have anticipated the needs of molecular biologists and addressed them – often in the face of a serious lack of resources."

Does 100,000,000,000 sequenced DNA letters sound like a lot? I find this disappointing. The human genome is estimated at about 2.9 gigabases. So 100 gigabases is only enough to represent the DNA sequences of about 34 people. Suddenly it sounds a lot less staggering.
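For anyone who wants to check the division, here is a minimal sketch of the arithmetic. The two figures come from the post (the genome size is an estimate); the constant and function names are my own and purely illustrative.

```python
# Figures taken from the post; the human genome size is an estimate.
TOTAL_BASES = 100_000_000_000        # 100 gigabases in the public databases
HUMAN_GENOME_BASES = 2_900_000_000   # roughly 2.9 gigabases per human genome


def genome_equivalents(total_bases: int, genome_bases: int) -> float:
    """Return how many genomes of the given size fit into total_bases."""
    return total_bases / genome_bases


equivalents = genome_equivalents(TOTAL_BASES, HUMAN_GENOME_BASES)
print(f"{equivalents:.1f} human-genome equivalents")  # prints 34.5
```

Rounding down gives the "about 34 people" figure. A different genome size estimate shifts the result slightly but not the conclusion.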

All this information from many organisms helps scientists in many ways. However, in 5 or 10 years DNA sequencing costs will drop down to $1000 or perhaps even $100 per person. Then literally hundreds of millions of people will get their DNA sequenced and orders of magnitude more sequencing of other species will get done.

Cheap DNA sequencing would lead very quickly to the identification of DNA sequences that contribute to many disease risks, longevity, personality, intelligence, and assorted abilities and aspects of appearance. Identifying the genetic variations that contribute to differences in disease risks and longevity will help guide the genetic engineering of stem cells to maximize longevity and health improvements.

With cheap DNA sequencing, drug side effects due to genetic variations would become much more avoidable, and drug development efforts would hit fewer failures in late stage testing due to harmful side effects in small portions of populations. Hence the rate of drug development would accelerate.

Money that government DNA sequencing projects spend on sequencing organisms with today's technology would be better spent on more research to develop cheaper sequencing methods, though efforts already underway promise cheaper DNA sequencing methods in the not too distant future. Check out previous posts in my Biotech Advance Rates category for reports on efforts to cut DNA sequencing costs by orders of magnitude.

Randall Parker, 2005 August 26 03:45 PM  Biotech Advance Rates

TJ Green said at August 31, 2005 4:53 PM:

Our genes are our past, the future is unknown. To ensure the survival of our species, we are all made slightly different. Sometimes this strategy can go dramatically wrong, like the rhesus negative mother with the rhesus positive child. In the past rhesus negative women would have died with their children (with a figure now of 85% Rh+ and 15% Rh-, it gives you an idea of how many must have died). This is another reason why we need a global DNA database, so we can "police" the genes.
