Washington Post December 04, 2002

'Junk DNA' Contains Essential Information

by Justin Gillis

The huge stretches of genetic material dismissed in biology classrooms for generations as "junk DNA" actually contain instructions essential for the growth and survival of people and other organisms, and may hold keys to understanding complex diseases like cancer, strokes and heart attacks, researchers reported today.

That is the most striking finding of the first comprehensive comparison between the genetic instruction set, or genome, of human beings and that of laboratory mice, due for publication tomorrow in the journal Nature. The new results suggest that the genomes of both organisms contain at least twice as much critically important genetic material as previously believed, a finding that promises to upend decades of scientific dogma and rewrite the rule book for how nature builds complex creatures.

The newly discovered mother lode of genetic instructions does not, by and large, contain genes, which are templates for building the proteins that do most of the work in human or other bodies. Instead, the new material appears to consist mostly of instructions for how the body should use its genes--when and where to turn them on and off, for example, and for how long.

Scientists have long known that genomes contain such instructions and that these are likely to be important in understanding disease and development. But the new analyses shocked them by revealing that the instruction set is at least as big as the gene set, and probably bigger. It's the scientific equivalent, perhaps, of a consumer buying a trim new gadget and opening the box to find a 300-page instruction manual.

"My goodness, there's a lot more that matters in the human genome than we had realized," said Eric Lander, director of genome research at the Whitehead Institute for Biomedical Research in Cambridge, Mass., and a primary author of the new work. "I feel we're dramatically closer now to knowing what all the players are. That's got to make a huge difference in being able to understand the basis of disease."

In news conferences today in Washington, London and Rome, top researchers unveiled a relatively complete draft of the genome of the laboratory mouse, by far the most important experimental organism in biology, one that has contributed immensely during the past century to understanding human ailments. The availability of that draft genome permitted the first detailed comparisons between the genomes of mouse and human.

Scientists have been eagerly awaiting such a comparison for two years. Even though they compiled a good draft of the human genome in 2000, complete with fanfare at the White House, they knew they didn't have the tools to make much sense of it, in part because it contains enormous stretches of repetitive genetic information of no known function. Merely finding the genes, the protein-encoding regions, buried in the more than 3 billion bits of human genetic information was a vexing problem. Genes are marked off by no clear boundaries, and the best available computer programs were inadequate to the task.

But nature has provided a sort of key, in the form of the humble house mouse, Mus musculus. Several scientists this morning compared the availability of the mouse genome to the discovery of the Rosetta stone, the famous archeological find that contained the same passage written in three scripts, enabling linguists for the first time to decipher Egyptian hieroglyphics.

Deoxyribonucleic acid, or DNA, the long carrier molecule for genetic information, mutates slowly over time. Favorable mutations are the basis of evolution, but most mutations are actually unfavorable. For that reason, stretches of DNA in an organism's genome that don't matter much are free to change a lot over time, whereas stretches that matter a great deal are under tight evolutionary constraints, changing less (if they change too much, the organism dies in the womb, never growing up to leave offspring). Biologists say these important regions are "highly conserved" in evolution.

Humans and mice are both mammals that last shared a common ancestor about 75 million years ago, a very short stretch in the history of life on the Earth. That means the most important parts of their genomes should share striking similarities, whereas the less important parts will have changed considerably.
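The reasoning above, that important stretches stay similar while unimportant ones drift, can be illustrated with a toy comparison. The sketch below is only an illustration, not the method used by the researchers: the two sequences are invented, they are assumed to be already aligned and of equal length, and the function names, the 10-letter window and the 90 percent cutoff are all hypothetical choices.

```python
def window_identity(seq_a, seq_b, window):
    """Return (start, fraction_identical) for each non-overlapping window.

    Assumes seq_a and seq_b are already aligned, so position i in one
    corresponds to position i in the other.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        results.append((start, matches / window))
    return results


def conserved_windows(seq_a, seq_b, window=10, threshold=0.9):
    """Return the start positions of windows at or above the identity cutoff."""
    return [
        start
        for start, fraction in window_identity(seq_a, seq_b, window)
        if fraction >= threshold
    ]


# Invented toy sequences (not real human or mouse DNA): the first and
# third blocks are nearly identical, the middle block has diverged.
human = "ATGGCCATTG" + "TTACGGATCA" + "GGCTAAGCTT"
mouse = "ATGGCCATTG" + "CAGTTCGAAC" + "GGCTAAGCTA"

for start, fraction in window_identity(human, mouse, 10):
    print(f"window at {start}: {fraction:.0%} identical")

print("conserved windows start at:", conserved_windows(human, mouse))
```

In this toy case the first and third windows pass the cutoff and the middle one does not, which is the pattern the comparison of two genomes is meant to expose on a vastly larger scale.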

Now, able to line the two genomes up side by side for the first time, scientists are finding that is indeed the case. The similar regions are jumping out at them like Christmas lights in a darkened room. In fact, 80 percent of known human genes have closely matched counterparts in the mouse, the new papers reveal, and the other 20 percent show a lesser but still recognizable match. It looks as though mice and people share essentially all the same genes.

This finding alone is striking, and it should lead to a detailed understanding of many new genes linked to illness. In recent years scientists have created more than a thousand genetically altered mice as "models" of human disease, a critical first step in devising drugs or other treatments. The new data should accelerate that task by revealing in more detail which mouse genes the scientists need to alter to mimic human ailments.

The big surprise in the research, however, was that about 5 percent of the genetic material of mice and people is highly conserved, and matching genes alone can account for only about 2 percent of the genetic material. That means as much as 3 percent of the genetic material is playing a critical but mysterious role--one so important that nature has kept that genetic information largely intact for 75 million years.
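The arithmetic behind those percentages can be worked through in a few lines. This is a rough sketch under stated assumptions: the genome size is rounded to 3 billion letters (the article says only "more than 3 billion"), the percentages are the approximate figures quoted above, and the helper name bases_for is hypothetical.

```python
# Assumption: genome size rounded to 3 billion letters for illustration.
GENOME_SIZE_BASES = 3_000_000_000

# Approximate figures quoted in the article.
CONSERVED_PERCENT = 5   # share of the genome that is highly conserved
GENE_PERCENT = 2        # share accounted for by matching genes


def bases_for(percent, genome_size=GENOME_SIZE_BASES):
    """Convert a whole-number percentage of the genome into a count of letters."""
    return genome_size * percent // 100


conserved = bases_for(CONSERVED_PERCENT)
genes = bases_for(GENE_PERCENT)
unexplained = conserved - genes  # conserved material not accounted for by genes

print(f"highly conserved:        {conserved:,} letters")
print(f"accounted for by genes:  {genes:,} letters")
print(f"conserved but not genes: {unexplained:,} letters")
print(f"conserved material is {conserved / genes:.1f}x the gene share")
```

Under these rounded figures, the conserved material outside genes is larger than the genes themselves, which is why the researchers describe the instruction set as at least as big as the gene set.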

It's only speculation now, but most scientists think those stretches of DNA will prove to be regulatory regions--instructional segments that somehow govern the behavior of genes. More and more, to cite one example, it looks as though mice and people will turn out to have very different brains not because the genes encoding their brain cells are so different, but because the instructions that regulate how many times those cells reproduce during development are different--producing a far bigger brain in a human than in a mouse.

Likewise, scientists said today, it's likely to turn out that many complex diseases develop not because the genes encoding important proteins are broken, but because the instructions for how to use those genes are scrambled.

Scientists have always known the instruction book would be important, but few of them imagined it would be so large a proportion of the genome--the implication being that the instructions, and the machinery for interpreting them, may matter as much as or more than the genes themselves. Key scientists said the new discoveries were likely to force them to abandon the term "junk DNA" and send them back to the drawing board to come up with sweeping new models for how nature builds and maintains organisms.

"We will have to develop a much more dynamic view of what a gene is, how it's controlled, how it's encoded," said Aravinda Chakravarti, head of the Institute for Genetic Medicine at Johns Hopkins University. "It's fun to find a whole new set of questions you could spend the rest of your life answering."

© 2002 The Washington Post Company

File Date: 12.08.02