Population genetics - Emergence of New Genes
Understand de novo gene origination, microRNA gene evolution, and the role of positive selection in shaping emerging genes.
Summary
Origin and Evolution of New Genes
Introduction
One of the most fascinating questions in evolutionary biology is how entirely new genes arise. Genes are not only inherited, modified versions of ancestral genes; genomes also generate novel genes from sequence that previously had no coding function. This process, along with the rapid evolution of these young genes, reveals how genetic innovation drives biological complexity and species diversity. The mechanisms by which new genes originate and then acquire function through positive selection represent fundamental processes in genome evolution.
De Novo Gene Origination: Birth from Non-Coding DNA
What Are De Novo Genes?
De novo genes are genes that have arisen recently in evolutionary time from sequences that had no previous protein-coding function. Rather than emerging through duplication and divergence of existing genes (a well-known pathway), de novo genes originate directly from non-coding DNA regions.
The key insight from recent research is that de novo genes typically emerge through a surprising pathway: non-coding DNA sequences occasionally acquire the ability to be transcribed into RNA and then translated into protein. This means that vast stretches of "junk DNA" are not entirely non-functional—they contain hidden potential. When the right combination of regulatory signals and open reading frames aligns, a functional gene can suddenly emerge.
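To make the idea of an open reading frame "aligning" concrete, here is a minimal sketch that scans the forward strand of a DNA sequence for ATG-to-stop stretches. The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are illustrative assumptions, not part of any cited study, and a real de novo gene also needs transcription and translation signals that this sketch does not model.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon. Length is counted in codons, including the start codon
    and excluding the stop codon. `end` is the index just past the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3    # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ATG
    return orfs

# Hypothetical toy sequence: ATG, five GCT codons, then a TAA stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))   # [(2, 23, 2)]
```

Most random non-coding sequence yields only short ORFs under a scan like this, which is one reason a long, transcribed ORF in a formerly non-coding region is a notable event.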
Initial Characteristics of Young Genes
When de novo genes first arise, they typically have several characteristic features that distinguish them from established genes:
Low expression levels: Newly formed genes are often expressed at low levels in specific tissues or under particular conditions. This makes evolutionary sense—a new gene that is barely expressed poses little risk to the organism's physiology. If it has no immediate benefit, it also causes minimal harm.
Relaxed functional constraints: Young de novo genes are generally not subject to strong negative selection early in their existence. Because they weren't critical to the ancestral organism (the organism existed fine without them), losing or modifying them doesn't immediately compromise fitness. This creates an evolutionary "window" where the gene can diverge and experiment with different sequences and functions.
Species-specific distribution: De novo genes are often found in only one species or a small group of closely related species. This makes them useful molecular markers for recent speciation events. Their species-specific nature reflects rapid turnover—they originate and may persist, diverge, or be lost over relatively short evolutionary timescales (thousands to millions of years, rather than hundreds of millions).
Rapid Acquisition of Essential Functions
Despite their humble origins, experimental and comparative genomic evidence reveals that some de novo genes quickly acquire important biological functions. This is perhaps the most remarkable finding: genes that emerged from scratch can become essential surprisingly fast. Some examples include genes involved in spermatogenesis, developmental processes, and stress responses. This rapid functional acquisition suggests that young genes are indeed subject to strong positive selection when they provide beneficial traits—a topic we'll explore in detail below.
MicroRNA Gene Evolution: Expanding the Regulatory Genome
The Special Case of microRNAs
MicroRNAs (miRNAs) represent a particularly interesting class of newly evolved genes. These are small non-coding RNAs, typically 20–25 nucleotides long, that regulate other genes post-transcriptionally by binding to messenger RNAs and preventing their translation or promoting their degradation.
Research has shown that animal microRNA genes have undergone extensive diversification through multiple mechanisms: duplication of existing miRNA genes, divergence of duplicated copies to recognize new target sequences, and the acquisition of entirely novel targeting specificity. This evolutionary flexibility has allowed microRNAs to become central players in fine-tuning developmental and physiological processes.
Origin from Hairpin-Forming Sequences
A key mechanistic discovery is that new microRNA genes frequently originate from genomic sequences that naturally form hairpin (stem-loop) structures. A hairpin arises when a transcribed DNA sequence contains two nearby segments that are roughly reverse complements of each other, so the resulting RNA folds back on itself through complementary base pairing.
The critical step for a hairpin sequence to become a functional microRNA gene is processing: the RNA hairpin must be recognized and cleaved by two cellular enzymes:
Drosha: A nuclear enzyme that recognizes the hairpin structure and cuts it, producing a smaller hairpin called a pre-miRNA
Dicer: A cytoplasmic enzyme that further processes the pre-miRNA into a mature miRNA duplex
When a newly arisen hairpin sequence becomes a substrate for Drosha and Dicer processing, it graduates from being a random genomic feature to being a functional regulatory molecule. Once mature, the microRNA can bind to target mRNAs and regulate them.
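The folding step described above can be illustrated with a rough check of whether the two ends of an RNA could pair into a stem. This is a simplified sketch under stated assumptions: it counts only Watson-Crick pairs (no G-U wobble), treats the first and last `stem_len` bases as the two arms, and uses hypothetical thresholds (`min_loop`, `min_paired`). It says nothing about whether Drosha or Dicer would actually process the molecule.

```python
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def paired_fraction(rna, stem_len):
    """Fraction of stem positions that form Watson-Crick pairs when the
    5' arm (first stem_len bases) folds back onto the 3' arm (last
    stem_len bases)."""
    rna = rna.upper().replace("T", "U")
    if len(rna) < 2 * stem_len:
        raise ValueError("sequence is shorter than two stem arms")
    arm5 = rna[:stem_len]
    arm3 = rna[-stem_len:]
    # Position i of the 5' arm faces position -1-i of the 3' arm.
    pairs = sum((a, b) in WC_PAIRS for a, b in zip(arm5, reversed(arm3)))
    return pairs / stem_len

def looks_like_hairpin(rna, stem_len, min_loop=3, min_paired=0.8):
    """True if the arms are mostly complementary and a loop of at least
    min_loop unpaired bases separates them."""
    loop_len = len(rna) - 2 * stem_len
    return loop_len >= min_loop and paired_fraction(rna, stem_len) >= min_paired

# Hypothetical toy hairpin: 7-base stem around a 4-base loop.
print(looks_like_hairpin("GGGAUCC" + "UUCG" + "GGAUCCC", stem_len=7))  # True
print(looks_like_hairpin("AAAAAAA" + "UUCG" + "AAAAAAA", stem_len=7))  # False
```

Dedicated RNA folding tools use thermodynamic models rather than a simple pair count; the sketch only shows why reverse-complementary arms are the raw material for a hairpin.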
Functional Diversification and Developmental Significance
The evolution of new microRNAs contributes substantially to the evolution of developmental programs. Different species often have lineage-specific microRNAs—microRNAs found in one species or group but not in others. These novel miRNAs regulate different combinations of target genes, allowing species to evolve distinct patterns of gene expression during development.
For example, microRNAs have been implicated in controlling the timing of developmental transitions, cell fate decisions, and tissue differentiation. When a new microRNA arises with a novel target specificity, it can rewire regulatory networks and produce evolutionary changes in body plan or physiology.
Conservation and Selective Constraints
A particularly important feature for understanding microRNA evolution is the seed sequence, a stretch of 6–8 nucleotides near the 5′ end of the mature microRNA, conventionally positions 2–7 or 2–8. The seed is the main determinant of which mRNA targets the microRNA can bind.
Comparative analysis across species reveals that seed sequences are highly conserved, even when other parts of the microRNA sequence diverge substantially. This pattern indicates strong selective pressure: mutations that change the seed sequence risk disrupting target specificity and altering critical regulatory relationships. Thus, while microRNA sequences can drift in regions that don't affect targeting, the seed remains locked in by natural selection.
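A small sketch can show why seed changes matter more than changes elsewhere in the microRNA. It assumes the common convention that the seed spans positions 2–8 from the 5′ end and that a target site is a perfect reverse complement of the seed; the function names and the short UTR string are hypothetical. The microRNA used in the usage lines is the commonly cited let-7a mature sequence, included only as an illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed(mirna, start=2, end=8):
    """Seed of a mature miRNA, using 1-based inclusive positions
    counted from the 5' end (default 2-8, an assumed convention)."""
    return mirna.upper().replace("T", "U")[start - 1:end]

def seed_sites(mirna, utr):
    """0-based positions in `utr` that are perfectly complementary to
    the miRNA seed (a simplification of real target recognition)."""
    site = "".join(COMPLEMENT[b] for b in reversed(seed(mirna)))
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]

let7a = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAAA"                         # hypothetical 3' UTR fragment

outside_seed = let7a[:14] + "A" + let7a[15:]  # change at position 15
inside_seed = let7a[:2] + "C" + let7a[3:]     # change at position 3

print(seed(let7a))                    # GAGGUAG
print(seed_sites(let7a, utr))         # [3]
print(seed_sites(outside_seed, utr))  # [3]  (targeting unchanged)
print(seed_sites(inside_seed, utr))   # []   (site no longer matched)
```

In this toy model, a substitution outside the seed leaves the predicted site intact while a single seed substitution abolishes it, mirroring the conservation pattern described above.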
Birth-Death Dynamics and Lineage-Specific Networks
MicroRNA families exhibit what researchers call birth-death dynamics: new microRNA genes continuously arise (birth), while others are lost or pseudogenized (death) over evolutionary time. At any point, a species possesses a particular complement of microRNAs that has accumulated through this ongoing process.
Importantly, these microRNAs rarely arise as isolated regulators. Instead, they become integrated into regulatory networks where multiple microRNAs collectively control developmental pathways. Different lineages accumulate different sets of microRNAs, leading to lineage-specific regulatory networks. This is part of how evolution generates phenotypic diversity: by rewiring gene regulatory networks through the emergence and loss of microRNAs.
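The birth-death dynamics described above can be sketched as a toy simulation. Every modeling choice here is an assumption made for illustration: at most one new microRNA gene arises per time step (with probability `birth_prob`), each existing gene is lost independently with probability `death_prob`, and the parameter values are not estimates from any real lineage.

```python
import random

def simulate_birth_death(n0, birth_prob, death_prob, steps, seed=0):
    """Return the gene count at each step of a simple birth-death process."""
    rng = random.Random(seed)          # seeded for reproducibility
    n = n0
    history = [n]
    for _ in range(steps):
        born = 1 if rng.random() < birth_prob else 0
        lost = sum(rng.random() < death_prob for _ in range(n))
        n = n + born - lost
        history.append(n)
    return history

def expected_equilibrium(birth_prob, death_prob):
    """Gene count at which expected births equal expected deaths:
    birth_prob = death_prob * N, so N = birth_prob / death_prob."""
    return birth_prob / death_prob

# Two hypothetical lineages with the same rates but different random histories.
lineage_a = simulate_birth_death(10, 0.5, 0.05, steps=200, seed=1)
lineage_b = simulate_birth_death(10, 0.5, 0.05, steps=200, seed=2)
print(lineage_a[-1], lineage_b[-1], expected_equilibrium(0.5, 0.05))
```

Even with identical rates, the two runs generally end with different gene counts and different individual genes gained and lost, which is the intuition behind lineage-specific microRNA repertoires.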
Positive Selection on Emerging Genes: Adaptation in Action
Why Young Genes Experience Strong Selection
Once de novo genes arise, they don't necessarily remain unchanged. In fact, comparative and population genetic studies reveal that newly formed genes frequently experience positive selection—a process where advantageous mutations spread rapidly through populations because they increase fitness.
This might initially seem surprising: shouldn't new genes with unknown or weak functions experience mostly negative selection (removing deleterious mutations)? The answer reveals something important about evolution. While young genes may start with low expression and weak constraints, if they provide even a slight benefit, natural selection will act to refine and improve them. Young genes are often evolving in response to new evolutionary challenges—such as combating pathogens, adapting to new environments, or acquiring specialized functions—that create strong selective pressure.
Gene Age and Selection Strength
Research by Sawyer and colleagues on Drosophila (fruit flies) revealed a striking pattern: the strength of positive selection on genes is inversely correlated with gene age. Younger genes show stronger signatures of positive selection, while older, more established genes show weaker signatures.
What this means: by comparing non-synonymous changes (changes to DNA that alter the amino acid sequence) with synonymous changes (changes that do not), both as fixed differences between species and as polymorphisms within a species, you can estimate the proportion of amino-acid changes that are adaptive rather than neutral. In young Drosophila genes, this proportion is notably high, indicating that many amino-acid changes were beneficial and selected for. In older genes, the proportion is lower.
This pattern makes evolutionary sense: young genes are still "finding their way," acquiring and refining functions. Older genes have already settled into established roles, so most mutations are neutral or slightly deleterious rather than beneficial.
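One widely used way to estimate the proportion of adaptive amino-acid changes is the McDonald-Kreitman framework, in which alpha = 1 - (Ds × Pn) / (Dn × Ps). The sketch below assumes that estimator; the source does not state which method the cited study used, and the counts for the "young" and "old" genes are hypothetical numbers chosen only to show the contrast.

```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman estimate of the proportion of adaptive
    amino-acid substitutions.

    dn, ds: non-synonymous and synonymous fixed differences between species
    pn, ps: non-synonymous and synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1 - (ds * pn) / (dn * ps)

# Hypothetical counts, for illustration only.
young_gene = mk_alpha(dn=20, ds=40, pn=5, ps=40)
old_gene = mk_alpha(dn=10, ds=40, pn=10, ps=40)
print(young_gene)  # 0.75
print(old_gene)    # 0.0
```

An alpha near 0 means the excess of non-synonymous divergence over what polymorphism predicts is negligible, while values approaching 1 imply that most amino-acid substitutions were driven by positive selection. Negative values can occur when slightly deleterious polymorphisms inflate Pn.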
Targeting Protein-Protein Interactions
A particularly important insight is where these advantageous mutations tend to occur within newly evolved genes: they frequently map to protein-protein interaction interfaces. These are the physical surfaces where a protein contacts other proteins.
Why does this make sense? A new gene produces a new protein with a novel sequence. For this protein to be useful, it often must interact with existing cellular machinery—other proteins, signaling complexes, or regulatory networks. The new protein must "fit" into these existing interaction networks. Natural selection will favor mutations that improve the fit, enhance binding affinity, or allow recognition by appropriate regulatory partners. Thus, the interfaces between the new protein and its partners become hotspots for adaptive evolution.
Detecting Selection: Signatures and Methods
Population geneticists use several approaches to identify genes under positive selection:
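Comparing sequence divergence: By analyzing DNA sequences from multiple species, researchers calculate the ratio of non-synonymous to synonymous substitution rates (dN/dS). A ratio above 1 suggests positive selection, a ratio near 1 is consistent with neutral evolution, and a ratio well below 1 indicates negative (purifying) selection.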
Analyzing nucleotide polymorphisms: Within populations, the pattern of genetic variation can reveal whether rare mutations are spreading rapidly (a signature of positive selection) or being removed (negative selection).
Identifying selective sweeps: A selective sweep occurs when a beneficial allele increases in frequency so rapidly that genetic variation around it is eliminated. Population-genetic scans can identify these "sweeps"—regions of unusually low genetic diversity—which often cluster near de novo genes.
These methods have consistently found that de novo genes and other newly evolved genes show heightened signatures of positive selection compared to the genome-wide background.
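Two of the approaches listed above reduce to simple calculations, sketched here under stated assumptions. `dn_ds_ratio` uses raw per-site proportions with no correction for multiple substitutions, and `windowed_diversity` computes nucleotide diversity (average pairwise differences per site) in fixed windows over pre-aligned, equal-length sequences. The function names, the window settings, and the toy alignment are all hypothetical.

```python
from itertools import combinations

def dn_ds_ratio(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Uncorrected ratio of non-synonymous to synonymous changes per site."""
    return (nonsyn_subs / nonsyn_sites) / (syn_subs / syn_sites)

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

def windowed_diversity(seqs, window, step):
    """(window_start, diversity) for each window along the alignment."""
    length = len(seqs[0])
    return [(start, nucleotide_diversity([s[start:start + window] for s in seqs]))
            for start in range(0, length - window + 1, step)]

# Hypothetical alignment: no variation in the first half, some in the second.
alignment = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
for start, pi in windowed_diversity(alignment, window=4, step=4):
    print(start, round(pi, 3))
```

In a real scan, a window with diversity far below the genome-wide background is a candidate sweep region, though low diversity can also result from low mutation rate or background selection, so such windows are a starting point rather than proof.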
Summary: The Rapid Evolution of Genetic Innovation
The origin and early evolution of new genes illustrate a general principle: evolution is not only a process of slow, gradual change to established structures, but also one of rapid innovation. New genes can arise de novo from non-coding DNA, acquire basic functions while expressed at low levels, and then be refined through positive selection as they become integrated into cellular networks. MicroRNA genes exemplify this process, emerging from hairpin-forming sequences and diversifying to create lineage-specific regulatory networks. The strong positive selection observed on young genes demonstrates that natural selection efficiently optimizes new genetic functions, particularly at protein-interaction interfaces. Together, these mechanisms explain how genomes generate the genetic novelty that underlies the origin of new traits and the diversification of life.
Flashcards
What are the two types of genomic regions from which new genes can arise according to Wang (2003)?
Recently evolved sequences and ancient genomic regions.
What is the typical starting point for de novo genes according to McLysaght and Hurst (2016)?
Non-coding DNA that acquires transcription and translation.
What does the species-specific nature of many de novo genes reflect about the genome?
Rapid turnover in the genome.
What has experimental evidence shown regarding the functional acquisition of some de novo genes?
They can quickly acquire essential functions.
Through what three processes have animal microRNA genes diversified according to Liu et al. (2008)?
Duplication
Divergence
Acquisition of novel targets
From what specific genomic structures do new microRNA genes frequently arise?
Hairpin-forming genomic fragments.
What biological role is served by the functional diversification of microRNAs?
Fine-tuned regulation of developmental pathways.
What does the conservation of seed sequences in microRNAs indicate?
Selective pressure to maintain target specificity.
What phenomenon generates lineage-specific regulatory networks in microRNA families?
Rapid birth-death cycles.
What is the primary effect of positive selection on advantageous mutations in newly formed genes?
Rapid fixation of the mutations.
How does the strength of positive selection correlate with the age of a gene?
Younger genes experience stronger positive selection.
To what structural changes are adaptive fixations in new genes often linked?
Changes in protein-protein interaction interfaces.
What signature is frequently identified near de novo genes during population-genetic scans?
Selective sweeps.
Quiz
Question 1: What is the observed relationship between gene age and the proportion of adaptive amino‑acid changes?
- Younger genes show a higher proportion of adaptive changes (correct)
- Older genes accumulate more adaptive changes
- Gene age has no effect on adaptive change frequency
- Only ancient genes undergo adaptive amino‑acid replacements
Key Concepts
Gene Evolution Mechanisms
De novo gene origination
MicroRNA gene evolution
Gene duplication and divergence
Species‑specific genes
Selection and Adaptation
Positive selection
Gene age effect on adaptation
Selective sweep
Protein Interaction Evolution
Protein‑protein interaction interface evolution
Definitions
De novo gene origination
The process by which new protein‑coding genes arise from previously non‑coding DNA sequences.
MicroRNA gene evolution
The diversification of microRNA genes through duplication, divergence, and the acquisition of novel hairpin precursors.
Positive selection
Evolutionary pressure that favors advantageous mutations, leading to rapid fixation in a population.
Gene age effect on adaptation
The observation that younger genes tend to experience stronger positive selection than older, more conserved genes.
Species‑specific genes
Genes that are unique to a particular species, often arising rapidly and contributing to lineage‑specific traits.
Selective sweep
A reduction in genetic variation at sites linked to a beneficial allele, caused by the rapid rise of that allele to fixation in a population.
Protein‑protein interaction interface evolution
Adaptive changes in the regions of proteins that mediate physical interactions with other proteins.
Gene duplication and divergence
The creation of gene copies followed by functional divergence, a major mechanism for generating new genetic material.