When biologists finished mapping the human genome in 2003, many expected they'd cracked the code of life. Four billion years of evolution had other plans. It turns out only about 2% of our DNA actually codes for proteins, spread across roughly 20,000 genes. The rest? Much of it is regulatory sequence whose logic even the most sophisticated AI can't fully parse.
The Regulatory Layer Nobody Warned Us About
The real action isn't in those protein-coding sequences but in how they're controlled. Transcription factors—molecular operations managers that bind to DNA and trigger gene expression—don't work like simple on/off switches here. Unlike bacteria, where regulation follows straightforward "OR" logic, human cells operate on combinatorial "AND" logic, integrating multiple signals before making regulatory decisions. "In the human genome the logic is more like what computer scientists designate 'AND,'" said Karen Adelman, a gene regulation researcher at Harvard Medical School. "Many signals are integrated to reach a regulatory decision: this and that and also that other thing."
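To make the contrast concrete, here is a minimal sketch of the two kinds of logic. It is a toy illustration, not a model of any real gene: the factor names and the idea of a fixed "required" list are assumptions made up for the example.

```python
def or_switch(signals):
    """Toy 'simple switch': any one bound factor is enough to turn the gene on."""
    return any(signals.values())


def and_gate(signals, required):
    """Toy combinatorial rule: every required factor must be present."""
    return all(signals.get(name, False) for name in required)


# Hypothetical transcription factors; True means the factor is bound.
signals = {"factor_a": True, "factor_b": True, "factor_c": False}
required = ["factor_a", "factor_b", "factor_c"]

or_result = or_switch(signals)            # one bound factor suffices
and_result = and_gate(signals, required)  # a single missing factor blocks expression

print("OR-style switch fires:", or_result)
print("AND-style gate fires:", and_result)
```

With two of three factors bound, the OR-style switch fires and the AND-style gate stays off. Real regulation is graded and probabilistic rather than Boolean, so treat this only as a picture of why integrating many signals makes outcomes harder to predict.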
Enhancers: The Puzzle Inside the Puzzle
Then there's the enhancer problem. Our genome contains hundreds of thousands—possibly millions—of these DNA segments that serve as gathering points for transcription factors. Each gene might be influenced by dozens of enhancers, and each enhancer might regulate multiple genes. Some sit right next to their target genes; others lurk millions of nucleotides away. "It's embarrassing that 25 years after the Human Genome Project, we don't know where all the enhancers are in the genome, let alone what they do," said Wendy Bickmore, a genome biologist at the University of Edinburgh.
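One way to picture that many-to-many relationship is as a pair of lookup tables built from a list of enhancer-gene links. The sketch below uses invented enhancer and gene names; nothing in it comes from a real annotation.

```python
from collections import defaultdict

# Hypothetical enhancer-to-gene links, invented for illustration only.
links = [
    ("enh1", "geneA"),
    ("enh1", "geneB"),
    ("enh2", "geneA"),
    ("enh3", "geneA"),
    ("enh3", "geneC"),
]

genes_by_enhancer = defaultdict(set)
enhancers_by_gene = defaultdict(set)
for enhancer, gene in links:
    genes_by_enhancer[enhancer].add(gene)  # one enhancer can regulate several genes
    enhancers_by_gene[gene].add(enhancer)  # one gene can answer to several enhancers

print("geneA is influenced by:", sorted(enhancers_by_gene["geneA"]))
print("enh1 regulates:", sorted(genes_by_enhancer["enh1"]))
```

Even in this five-link toy, asking "what controls geneA?" returns three answers. Scale that to hundreds of thousands of enhancers, most of them unmapped, and the bookkeeping alone becomes a research problem.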
Loops, Hubs, and Condensates
Those distant enhancers get brought to their target genes through loops extruded by a protein motor called cohesin—imagine chromatin being pushed through a ring like rope through your fingers. When these elements finally meet, they don't assemble into neat molecular machines. Instead, they form what's called a condensate: a loose, fluid blob where components interact weakly, fleetingly, and rather indiscriminately. "There'll be a bit of loop extrusion going on over here, in the next cell it might be over here, and the whole thing is turning over incredibly fast," Bickmore explained. Even identical cells—two skin cells, say—never have quite the same regulatory configuration at any given moment.
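A rough way to see why identical cells end up in different states is to treat each enhancer contact as a coin flip. The sketch below is a deliberately crude assumption, not a model of loop extrusion or condensate physics: the five enhancers, the 50% contact probability, and the independence between contacts are all made up for illustration.

```python
import random

ENHANCERS = ["enh1", "enh2", "enh3", "enh4", "enh5"]  # hypothetical names
CONTACT_PROBABILITY = 0.5  # assumed chance that an enhancer is in contact right now


def snapshot(rng):
    """Return the set of enhancers touching the gene in one cell at one instant."""
    return frozenset(e for e in ENHANCERS if rng.random() < CONTACT_PROBABILITY)


rng = random.Random(0)  # fixed seed so the run is repeatable
cells = [snapshot(rng) for _ in range(1000)]  # 1,000 "identical" cells
distinct = set(cells)

print(f"{len(distinct)} distinct configurations across {len(cells)} cells")
```

Every simulated cell follows exactly the same rules, yet the snapshots differ from cell to cell. With only five enhancers there are already 32 possible configurations; real genes, with dozens of enhancers and constant turnover, have vastly more.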
Why AI Might Be Fundamentally Blind
This is where genomic foundation models like Evo 2, Genos, and Google DeepMind's AlphaGenome face their reckoning. These systems train on vast quantities of genetic sequences, learning correlations between DNA variations and organismal traits. The hope: all that regulatory complexity—transcription factors, splicing, epigenetic marks, chromatin folding—will be implicitly captured in the patterns the algorithms detect. It's a reasonable bet, but probably insufficient. "I wouldn't have designed it this way if I was God," Bickmore said. "But here we are!" The genome isn't a static program; it's an open informational system that responds dynamically to its environment and internal state.
The Informiome Problem
Adrian Woolfson, co-founder of biotech company Genyro, argues the challenge runs even deeper. Beyond our DNA lies what he calls the "informiome": layers of extra-genetic information, including diet, environment, microbiome, and culture, that profoundly affect how genomes function. "While the human genome forms the foundation of the human informiome, other layers of extra-genetic information are equally important," Woolfson wrote in his April 2026 book On the Future of Species.
Key Takeaways
- Only ~2% of human DNA codes for proteins; gene regulation dominates the genome's function
- Human cells use combinatorial "AND" logic rather than simple on/off switches, making prediction difficult
- Enhancers number in the hundreds of thousands to millions, with complex many-to-many relationships to genes
- Chromatin loops and condensates create regulatory configurations that differ between otherwise identical cells
- AI foundation models may capture useful correlations but miss fundamental mechanisms
The Bottom Line
Don't get me wrong—AlphaGenome and its cousins will absolutely ship useful predictions. Drug discovery, disease risk modeling, evolutionary analysis: these applications don't need philosophical understanding of how life works. But if you're hoping computational genomics will eventually unlock the "secret of life"? That's a different ballgame entirely. The genome is a strange loop, not an algorithm, and some puzzles resist even the most elegant brute force.