How AI Is Helping Scientists Decode the ‘Dark Genome’
Imagine DNA as a book written with about 3 billion letters. It uses only four characters (A, C, G, and T), yet those letters contain the instructions that shape the human body. Scientists can clearly read only about 2% of this book. That portion includes genes that dictate how proteins are produced and function. The remaining 98% is known as the “dark genome,” a vast and largely unexplored region.
The dark genome does not directly produce proteins. However, it is a master regulator of gene expression, controlling cell identity, environmental responses, and disease development. Many genetic mutations linked to disease lie in this non-coding DNA, so researchers consider it a major target for study.
Now, a new artificial intelligence model developed by Google’s DeepMind, called AlphaGenome, may help scientists better understand this hidden layer. AlphaGenome can analyze up to 1 million DNA letters at a time, identifying where genes are located and predicting how non-coding regions influence gene activity. That includes effects on gene expression and gene splicing, the process that allows a single gene to produce different proteins.
One of AlphaGenome’s most powerful features is its ability to predict the impact of changing even a single letter in the genetic code. Unlike large language models that focus on predicting the next word in a sequence, AlphaGenome connects DNA sequences to biological outcomes, examining how changes in genetic text alter function. The model was trained using publicly available data from human and mouse cell experiments.
Still, researchers caution that the system has limits. It is less accurate at predicting gene regulation over long stretches of DNA, and it needs improvement across different tissues, since the same genetic code can behave differently in neurons than in cardiac muscle cells. Even so, scientists say continued progress could speed efforts to link DNA variations to disease.
May For The Teen Times
1. What percentage of the DNA book can scientists clearly read today?
2. Who developed the new AI model AlphaGenome to decode non-coding DNA?
3. How does the new model predict the impact of changing genetic letters?
4. Why are genetic mutations in the hidden layer a target for research?
1. Should humans use AI to change their own genetic instructions?
2. How can understanding the dark genome help us treat future diseases?
3. Why is it important for scientists to share their data publicly?
4. Is it exciting to solve the mysteries of the human body?