Extreme long-range regulation of developmental genes
Having spent six years grappling with the mysteries of gene regulation at the University of Bergen, Norway, Boris Lenhard brings a new research programme to the CSC. Heading up the Computational Regulatory Genomics Group, he is one of an increasing number of new faces in the recently formed Integrative Biology section.
Lenhard’s group uses computational methods to mine large DNA datasets for information about how genes are regulated. While exons encode the units that make up proteins, vast non-coding stretches of the genome (introns and intergenic regions) contain sequences that influence gene expression. These non-coding sequences bind proteins such as transcription factors, which can enhance or quell gene expression.
It is clear that non-coding DNA immediately adjacent to a coding region might affect how that gene is expressed, but Lenhard’s group is investigating a more unusual phenomenon: “Our most exciting research interest is long-range regulation,” he says. Focussing on genes involved in early development, his team is looking at how regulatory elements can act as remote puppet-masters, influencing gene expression across large regions of the genome. “We’ve found that, particularly for developmental genes, enhancers are often found in large clusters along the length of DNA. Clusters contain tens to hundreds of regulatory elements, which can cover megabases around the coding sequence itself. Some genes have a regulatory span that exceeds the total size of an average bacterial genome.” Single genes, it seems, can have vast genomic regions directing their expression. Making sense of the large datasets that describe these regions is the challenge.
“What originally attracted us to investigate this model of regulation was that many of these elements are extremely well conserved across species,” says Lenhard. “If you look across all vertebrate species, they are bizarrely similar. There are almost 500 such elements that are perfectly conserved between humans and mice over 200 base pairs or more. Why would these regions be perfectly preserved over 60 million years of evolution?”
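To make the criterion in that quote concrete, here is a minimal Python sketch of how perfectly conserved stretches could be picked out of a pairwise alignment. It is an illustration only, not the group's actual pipeline: the function name `perfectly_conserved_runs`, the toy sequences and the convention of marking gaps with `-` are all assumptions made for this example.

```python
def perfectly_conserved_runs(seq_a, seq_b, min_len=200):
    """Find gap-free runs where two aligned sequences are identical.

    seq_a and seq_b are two rows of a pairwise alignment (equal length,
    gaps written as '-'). Returns half-open (start, end) alignment
    coordinates of every run of at least min_len identical columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    run_start = None
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        # A column counts as conserved only if both bases are the same
        # unambiguous nucleotide; gaps and N never match.
        if a == b and a in "ACGT":
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                runs.append((run_start, i))
            run_start = None

    # Close a run that extends to the end of the alignment.
    if run_start is not None and len(seq_a) - run_start >= min_len:
        runs.append((run_start, len(seq_a)))
    return runs


# Toy alignment (hypothetical data); a real scan would use min_len=200.
human = "ACGTACGTTTGACCA-GGT"
mouse = "ACGTACGTATGACCAAGGT"
print(perfectly_conserved_runs(human, mouse, min_len=5))  # [(0, 8), (9, 15)]
```

With the default threshold of 200 columns the toy alignment returns no runs. A genome-wide scan would apply the same test to each aligned block of the two genomes.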
The genes in question govern multicellular processes in very early embryonic development – a time when the basic body plan and general shape of an organism are being established. Perhaps the fundamental importance and great complexity of this process have meant that these sections of DNA are so intricately composed that any changes would completely disrupt the formation of an organism.
Lenhard’s team is establishing collaborations with other groups within the CSC and around the world to obtain primary data that might help answer their questions about how these promoter and enhancer elements interact to orchestrate the early development of multicellular life. On the other side of the coin, these developmental genes are implicated in a variety of diseases, from congenital malformations to cancer. Combining the power of genome-wide computational analyses with investigative experimental work should yield vital information about how extreme long-range regulation occurs.