
Scientists have for the first time decoded the complete DNA sequence of a single human being, a mammoth feat that shatters old beliefs about the "book of life" and marks a historic step toward the era when medical care can be tailored to an individual's genes.

With the boggling array of genetic quirks, burps and hiccups found in the full DNA sequence of one healthy middle-aged man, the human genome has now shrugged off its reputation for being perhaps the world's most boring and predictable molecule.

Coiled inside the body's cells, DNA is the chemical chain that encodes the instructions to build and operate a human in two sets of 23 chromosomes - one set passed down from each parent.

The first two maps of the human genome, published by an international government-funded consortium and a private company in 2001, were based on a patchwork of DNA from several donors. Both versions were also half maps, decoding only one set of the 23 chromosomes on the assumption the two sets would hardly differ.

Those maps suggested that humans were 99.9 per cent genetically identical, with only one one-thousandth of DNA information accounting for all the vibrant variety of humanity.

Now researchers from Canada, the United States and Spain have decoded all 46 of the chromosomes belonging to J. Craig Venter, the 60-year-old upstart American biologist whose company, Celera Genomics, compiled the private version of the human genome seven years ago. And the results indicate that those first celebrated DNA maps seriously underestimated the genetic diversity of humans - by a factor of at least five.

The new work suggests that the genetic code in the chromosomes we carry can vary widely, not only between any two strangers waiting at a bus stop, but between brothers and sisters.

"The biggest single surprise is how much we missed the boat with the human genome seven years ago, and how different we really are," Dr. Venter said in an interview. "The overwhelming message back then was that we are all like identical clones of each other. ... It's comforting to know we are more unique than that."

The findings, released today in PLoS Biology, a free, online scientific journal, give researchers a trove of new targets when hunting for genetic traits that contribute to disease. They also fuel hopes that people could one day learn from their codes which drugs best suit them, or what ills might befall them and take steps to prevent them.

At the same time, the study serves up a sobering dose of reality for genetic medicine. Diagnosing conditions through genetic tests may be trickier than expected, since the differences between maternal and paternal chromosomes mean there could be two very different sides to every story. As well, the work shows that relying on DNA to predict anyone's medical future at the moment might be a lot like reading tea leaves: The picture could be fuzzy and fleeting for a long time to come.

"It is clear," Dr. Venter said, "that we are still at the earliest stages of discovery about ourselves and only with continued sequencing of more individual genomes will we be able to garner a full understanding of how our genes influence our lives."

The more genomes researchers can read, he said, the better they can understand how the genetic script relates to a person's actual performance and tease apart the effects of environmental forces.

Steve Scherer, the senior scientist in Genetics and Genome Biology at Toronto's Hospital for Sick Children who led the analysis of the Venter genome, noted for example that nearly half of Dr. Venter's 23,224 genes contained variants, or mutations - "a number geneticists have wondered about for 50 years." At this point, Dr. Scherer said, no one can interpret most of the new information. In fact, the researchers note that decoding Dr. Venter's DNA has so far revealed not much more about his potential health problems than knowing his family history.

Still, Dr. Scherer remains optimistic that the learning curve is likely to be surmounted in the not-too-distant future.

"With this type of knowledge now in hand, the stage is set for an era of personalized medicine, where genome sequence information becomes a critical reference to assist with health-related decisions," said Dr. Scherer, who is also a professor of medical and molecular genetics at the University of Toronto.

Most experts predict that routinely reading individual genomes will become a reality within five years as the technology to unravel the six billion chemical units that make up DNA gets faster and cheaper.

Kathy Siminovitch, director of genomic medicine at Toronto's Mount Sinai Hospital and the Samuel Lunenfeld Research Institute, noted that the first Human Genome Project rang in at roughly $1-billion (U.S.). But with the new generation of "ultra-fast" DNA sequencing machines that have hit the market within the past two years, she said, the bill is expected to drop to less than $100,000 by year's end.

Connecticut biotech firm 454 Life Sciences, for instance, has been using the technology to decode the full genome of James Watson, the Nobel laureate who co-discovered the structure of DNA in 1953. That publication is expected later this year.

"It seems like it is possible to think that a $1,000 genome could be within reach," said Dr. Siminovitch, who is buying an ultra-fast sequencer for the University Health Network. "When we see how much variation there is in [Dr. Venter's] DNA, then chances are there is this much variation in all DNA. ... This publication [of the Venter genome] will drive the momentum to get the price down and to be able to do this on lots of people."

Work on Dr. Venter's DNA began in 2003, growing out of the original Celera map, which was a compilation of the DNA from five people. But 60 per cent of it had belonged to Dr. Venter, which, at the time, cost roughly $60-million to decode.

After leaving Celera over a business dispute in 2002, Dr. Venter set up the J. Craig Venter Institute in Rockville, Md., where biologists, geneticists and computer analysts, along with collaborators at the University of California in San Diego and Spain's University of Barcelona, spent four years and at least another $10-million continuing to unravel his DNA.

With most of it in hand last summer, after 32 million "reads" through sequencing machines, the Venter team turned to Dr. Scherer and scientist Lars Feuk at Sick Kids to analyze it for variations and mutations.

The Sick Kids team has made several new discoveries about the unexpected quirks in DNA over the past three years. Where scientists once assumed that genetic typos, or single chemical changes in the code, were the dominant form of mutation, the Toronto researchers have shown that DNA can also vary widely in structure and size.

Dr. Scherer and colleagues have found that people can carry several extra copies of genes, or be missing them completely, and still be healthy. The phenomenon, dubbed "copy number variation," could act like a dosing effect to explain the towering height of a basketball player, for instance, or why one child might look so much like his father. Now, by studying Dr. Venter's DNA, they have discovered another form of variation that, much like a genetic hiccup, can add or delete just a couple of extra chemical units to a stretch of code, which may well affect the function of a gene.

"We're recognizing this form of variation, of these small insertions and deletions, for the very first time," Dr. Scherer said. He explained that researchers once estimated there were about 100 such variants in a human genome, "but now we see about one million of them."

"It's different from everything we've learned ... the chromosomes don't line up at all."

The ongoing study suggests the chromosomes Dr. Venter inherited from his parents are different in at least 15 million places.
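As a rough check on how the figures in this article fit together, the short Python sketch below compares the 15 million differing places with the three billion base pairs in one set of chromosomes. It assumes, as a simplification not stated by the researchers, that each "place" counts as a single base pair; the variable names are illustrative only.

```python
# Back-of-the-envelope arithmetic using figures quoted in the article.
# Assumption (ours, not the researchers'): each differing "place" is
# treated as one base pair, which understates multi-letter changes.

BASE_PAIRS_PER_SET = 3_000_000_000   # about three billion rungs per chromosome set
DIFFERING_PLACES = 15_000_000        # differences between the two inherited sets
EARLIER_ESTIMATE = 0.001             # the 2001 maps' one one-thousandth (0.1 per cent)

# Share of the code that differs between the maternal and paternal sets.
differing_fraction = DIFFERING_PLACES / BASE_PAIRS_PER_SET

# How many times larger that share is than the earlier estimate.
ratio_to_earlier = differing_fraction / EARLIER_ESTIMATE

print(f"Share differing: {differing_fraction:.1%}")
print(f"Ratio to 2001 estimate: {ratio_to_earlier:.1f}")
```

Under that simplification, the share works out to about half of one per cent, which is in line with the "factor of at least five" cited for the earlier underestimate.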

"This raises all sorts of questions," Dr. Scherer said. "You can have no gene on one chromosome and have two copies of the gene on the other ... there's really a more dynamic interplay than we thought." It is not yet clear, he added, how or when one parent's chromosomes might kick in to have the dominant effect.

Dr. Venter noted that the genetic variation between unrelated people might be much higher, considering that both of his parents hailed from Western Europe. People with parents from more diverse populations, he said, might have even greater differences in their chromosomes.

But he stressed the new findings do not suggest there are racial differences in DNA. "Race is a social construct, not a scientific one," Dr. Venter said. We are all originally related, and all of us genetically mixed, he said, so that no "bright lines" can be drawn to cleanly divide populations at the level of DNA.

A new glossary of genetics

Human genome All of the genetic information carried inside a human cell.

Chromosomes The rod-shaped structures inside our cells made up of DNA. They house genes along their length like boxcars on a train. People inherit 46 chromosomes from their parents, 23 from each parent.

DNA Deoxyribonucleic acid is the chemical code that provides the genetic instructions to build and operate a human being. It is wound like a spiralling ladder into the 23 pairs of chromosomes found in the nucleus of our cells. There are about three billion rungs on the ladder.

Genes The essential units of heredity that make up only about 3 per cent of the genome. Each gene encodes a recipe to make a protein, and proteins make the stuff that helps to make us human: the shape of our lips, the sound of a laugh, the frontal lobes of our brains.

Nucleotides Chemical units that form the building blocks of DNA and are represented by the letters A, C, G and T (A for adenine, C cytosine, G guanine, T thymine). One 'letter' is found at the end of each rung on the ladder that makes up DNA with As joining to Ts, and Cs to Gs. There are three billion of these so-called base pairs or rungs across the ladder.

Junk DNA The 97 per cent of genetic code in DNA that lies outside genes and does not encode the recipe for a protein. Now thought to be linked to regulating genes.

SNiPs The mutation type best known in human DNA. It stands for "single nucleotide polymorphism" and refers to a single-letter change in the DNA code, like a typo, a T where others carry a C, for example.

CNVs More recent type of variation co-discovered by researchers at Toronto's Hospital for Sick Children. It refers to large stretches of nucleotides that can be missing or added to DNA regions both inside and outside of genes. These can also result in people carrying several more copies of a gene or none at all, but still be apparently healthy.

INDELS Blips in the code where at least two nucleotides are inserted or deleted. Resembling small CNVs, these are far more prevalent than expected and may affect the function of genes.
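To make the difference between a single-letter change and an insertion or deletion concrete, here is a minimal Python sketch that compares two short, already aligned stretches of code. The sequences are invented for illustration, the function name `classify_variants` is hypothetical, and the use of "-" to mark a missing letter is an assumed convention; this is not the method the researchers used.

```python
def classify_variants(reference: str, sample: str) -> list[tuple[int, str, str, str]]:
    """Label each difference between two aligned sequences.

    Both strings must have the same length; '-' marks a gap
    (an assumed convention for this toy example).
    Returns (position, kind, reference_text, sample_text) tuples.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    i = 0
    while i < len(reference):
        ref_letter, sample_letter = reference[i], sample[i]
        if ref_letter == sample_letter:
            i += 1  # identical letters: nothing to report
        elif ref_letter != "-" and sample_letter != "-":
            # One letter swapped for another: a single-letter change (SNP).
            variants.append((i, "SNP", ref_letter, sample_letter))
            i += 1
        else:
            # A run of gaps: letters inserted into, or deleted from, the sample.
            gapped = reference if ref_letter == "-" else sample
            kind = "insertion" if ref_letter == "-" else "deletion"
            j = i
            while j < len(gapped) and gapped[j] == "-":
                j += 1
            variants.append((i, kind, reference[i:j], sample[i:j]))
            i = j
    return variants


# Invented toy sequences, 12 aligned positions each.
reference = "ACGTACGG--TA"
sample = "ACCTA--GTTTA"

for variant in classify_variants(reference, sample):
    print(variant)
```

In this toy pair, the G swapped for a C is the single-letter "typo", while the two missing letters and the two added letters are the kind of small blips the glossary calls indels.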

Carolyn Abraham

Decoding the human condition

2001 First human genome maps

Two versions: a public one, compiled with DNA from more than 700 anonymous donors, and the private Celera version based on five donors, 60 per cent of it from the DNA of J. Craig Venter.

Decoded only one of the two sets of the 23 chromosomes people inherit (assuming that the two sets would differ little).

Excluded segments of mismatched code to compile "a consensus sequence" that represents no one person, nor the true unpredictable nature of DNA.

2007 First full genome map of a single human

Assembled from scratch by decoding the complete DNA sequence of one person, 60-year-old Dr. Venter, the former head of Celera Genomics.

Sequences the DNA in all of Dr. Venter's 46 chromosomes, or the two sets of 23 passed down from mother and father.

All sequences, even those that seem highly variable, included.

Carolyn Abraham
