Researchers are using artificial intelligence (AI) to dig deep into the mechanisms of gene activation, a crucial process in growth, development, and disease. Utilizing machine learning, the team identified “synthetic extreme” DNA sequences that play specific roles in gene activation. These sequences were discovered by testing millions of DNA sequences and comparing gene activation elements in humans and fruit flies.
Artificial intelligence (AI) has drawn significant attention recently, particularly with the emergence of ChatGPT and related technologies. Beyond chatbots, biologists are harnessing AI to explore the fundamental workings of our genes.
Researchers from the University of California San Diego used machine learning, a form of AI, to solve a long-standing puzzle in gene activation, a critical process in growth, development, and disease. Professor James T. Kadonaga and his team identified the downstream core promoter region (DPR), an essential DNA activation code involved in up to a third of our genes.
Building on this breakthrough, Kadonaga and colleagues Long Vo ngoc and Torrey E. Rhyne employed machine learning to identify “synthetic extreme” DNA sequences with specific functions in gene activation. They tested millions of DNA sequences, comparing the DPR gene activation element in humans and fruit flies (Drosophila), and discovered custom-tailored DPR sequences that are active in humans but not in fruit flies, and vice versa.
This approach holds promise beyond gene activation research. It can be employed to identify synthetic DNA sequences with potential applications in biotechnology and medicine. For instance, the method could aid in comparing the effects of different drugs or identifying DNA sequences that activate genes in specific tissues. The possibilities of this AI-based approach are vast.
Machine learning, a subset of AI, enables computer systems to learn and improve based on data and experience. The researchers used support vector regression to train their machine learning models on DNA sequences characterized in real-world laboratory experiments. The models identified human-specific and fruit fly-specific DNA sequences, and the predicted functions of these extreme sequences were then verified in the laboratory.
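To make the idea of a "synthetic extreme" more concrete, the selection step can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the researchers' code: the two "models" below are hypothetical stand-ins (fixed random linear scorers) for the trained support vector regression models, the sequence length of 12 is arbitrary, and the candidates are random strings rather than real DPR sequences. The sketch encodes each sequence numerically, scores it with both models, and keeps the sequences whose predicted activity differs most between the two species.

```python
import random

BASES = "ACGT"
LENGTH = 12  # arbitrary illustrative length, not the real DPR size


def one_hot(seq):
    """Encode a DNA string as a flat list of 0/1 features, four per base."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def make_toy_model(seed, length):
    """Hypothetical stand-in for a trained regressor: a fixed linear scorer."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(4 * length)]

    def predict(seq):
        return sum(w * x for w, x in zip(weights, one_hot(seq)))

    return predict


def rank_extremes(candidates, human_model, fly_model, top_n=3):
    """Return the sequences with the largest score gap in each direction."""
    scored = sorted((human_model(s) - fly_model(s), s) for s in candidates)
    fly_specific = [s for _, s in scored[:top_n]]            # fly score far above human
    human_specific = [s for _, s in scored[-top_n:][::-1]]   # human score far above fly
    return human_specific, fly_specific


# Random candidates stand in for the millions of sequences that were tested.
rng = random.Random(0)
candidates = ["".join(rng.choice(BASES) for _ in range(LENGTH)) for _ in range(2000)]

human_model = make_toy_model(1, LENGTH)  # assumption: toy "human" model
fly_model = make_toy_model(2, LENGTH)    # assumption: toy "fruit fly" model

human_specific, fly_specific = rank_extremes(candidates, human_model, fly_model)
print("Predicted human-specific:", human_specific)
print("Predicted fly-specific:", fly_specific)
```

In a real analysis, the toy scorers would be replaced by models trained on measured activity data, and the top-ranked sequences would still need to be confirmed in the laboratory, as the researchers did.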
The exceptional capabilities of AI models in predicting rare sequences highlight the potential for wider use of machine learning and AI technologies in biology. This work demonstrates the application of AI in designing customized DNA elements for gene activation, offering practical implications for biotechnology and biomedical research. Biologists are just scratching the surface of AI’s potential.