Phylogenetics Study Guide

📖 Core Concepts

- Phylogenetics – study of evolutionary history using heritable traits (DNA, proteins, morphology).
- Phylogenetic tree – diagram of hypothesized relationships; tips = observed taxa, internal nodes = inferred ancestors.
- Rooted tree – has a designated root (the common ancestor of every taxon in the tree); shows direction of change.
- Unrooted tree – shows relationships only; no root and no direction.
- Clade – an ancestor plus all its descendants (monophyletic group).
- Cladistics – reconstructs relationships using shared derived characters (synapomorphies).
- Parsimony – prefers the tree requiring the fewest evolutionary changes.
- Maximum Likelihood (ML) – evaluates a tree by the probability of the observed data given that tree and a model of character change.
- Bayesian inference – combines prior probabilities with a likelihood to produce a posterior distribution of trees; explored via Markov chain Monte Carlo (MCMC).
- Phenetic distance methods – build trees from matrices of overall similarity or distance (e.g., neighbor‑joining).
- Long‑branch attraction (LBA) – artifact in which rapidly evolving, distantly related lineages are wrongly grouped together because of convergent (homoplastic) changes.
- Bootstrap – resampling technique that provides a statistical support value for each node.

📌 Must Remember

- Rooted vs. Unrooted – rooted → lineage ancestry; unrooted → relationship pattern only.
- Parsimony principle – “simplest explanation wins” → fewest character changes.
- ML & Bayesian both require an explicit substitution model (e.g., Jukes‑Cantor, GTR).
- Neighbor‑joining builds a tree by repeatedly joining the pair of taxa that minimizes its rate‑corrected criterion $Q$; this is not always the pair with the smallest raw distance.
- Bootstrap ≥ 70 % is commonly taken as moderate support; ≥ 95 % as strong support.
- Dollo’s Law – once a complex character is lost, it is unlikely to re‑evolve.
- Phenetics ignores phylogeny; cladistics requires synapomorphies; evolutionary taxonomy blends both.
- Robinson–Foulds metric quantifies topological distance between two trees.

🔄 Key Processes

- Data collection – obtain DNA/protein sequences or morphological characters.
- Alignment – line up homologous positions; trim ambiguous regions.
- Model selection – use information‑theoretic criteria (AIC, BIC) to pick the best substitution model.
- Tree inference
  - Parsimony: enumerate (or heuristically search) trees → choose the one with the minimum number of steps.
  - ML: compute the likelihood $L = P(\text{data} \mid \text{tree}, \text{model})$ for each candidate → pick the highest $L$.
  - Bayesian: define priors, run MCMC → obtain a posterior sample; summarize with a majority‑rule consensus.
  - Neighbor‑joining: calculate pairwise distances → iteratively join the pair that minimizes the NJ criterion $Q$ (distance corrected for each taxon's average divergence) → produce an unrooted tree.
- Support assessment – bootstrap (or jackknife) replicates → assign % support to nodes.
- Tree visualisation & interpretation – root (if needed), annotate branch lengths (time vs. change), identify clades.

🔍 Key Comparisons

- Parsimony vs. ML – Parsimony counts changes; ML uses an explicit probabilistic model of change.
- Rooted vs. Unrooted – Rooted = direction + ancestor; Unrooted = relationship pattern only.
- Phenetics vs. Cladistics – Phenetics = overall similarity; Cladistics = shared derived traits only.
- Bootstrap vs. Posterior Probability – Bootstrap = frequency under resampling; Posterior = probability given data, model & priors.

⚠️ Common Misunderstandings

- “Long branch = ancient” – branch length usually measures the amount of change, not elapsed time, so a long branch may only reflect a fast rate; long branches grouping together can be an LBA artifact rather than evidence of relationship or age.
- “Higher bootstrap = correct tree” – high support can still be misleading if the model is wrong or taxon sampling is poor.
- “Bayesian = always better” – Bayesian results can depend heavily on priors; poor priors can bias the posterior.
- “Unrooted tree shows evolution” – without a root, direction of change cannot be inferred.

🧠 Mental Models / Intuition

- Tree as a family diagram – think of the root as the “great‑grandparent” and each internal node as a descendant that is in turn the ancestor of everything below it.
- Parsimony as “Occam’s razor” – the simplest story (fewest mutations) is preferred unless the data demand complexity.
- Likelihood as “fit score” – higher likelihood = the tree and model predict the observed data more accurately, like a better‑fitting curve.
- MCMC as “random walk” through tree space – the chain spends more time in high‑probability regions, giving a natural weighting to likely trees.

🚩 Exceptions & Edge Cases

- Homoplasy (e.g., convergent evolution) can make parsimony misleading; ML/Bayesian can accommodate it via substitution models.
- Dollo’s Law is a rule of thumb; rare cases of re‑evolution of complex traits have been documented.
- Very short internal branches → low resolution; may produce a “star” phylogeny regardless of method.
- Sparse taxon sampling → leaves long, unbroken branches, which increases the risk of LBA.

📍 When to Use Which

- Parsimony – small morphological datasets, when computational speed matters, and when homoplasy is expected to be low.
- Maximum Likelihood – medium‑size DNA datasets, need for model‑based inference, and when branch‑length estimates are required.
- Bayesian – large molecular datasets, desire for posterior probabilities, or when incorporating prior knowledge (e.g., fossil calibrations).
- Neighbor‑joining – quick exploratory analysis, large numbers of taxa, or when only a distance matrix is available.
- Bootstrap – to assess node stability for any method; use about 1,000 or more replicates for stable estimates.

👀 Patterns to Recognize

- Star‑like topology → very rapid radiation or insufficient data.
- Long branch attached to a short internal branch → red flag for possible LBA.
- High bootstrap on shallow nodes but low on deep nodes → recent divergences well resolved, ancient splits ambiguous.
- Consistent clade across methods → robust evolutionary signal.

🗂️ Exam Traps

- Choosing “rooted” vs. “unrooted” – a question may ask which tree type is appropriate for visualizing similarity; answer: unrooted.
- Confusing phenetic distance with phylogenetic distance – similarity ≠ ancestry; phenetic methods ignore character polarity.
- Assuming bootstrap = probability of correctness – it is a frequency under resampling, not a direct probability.
- Mixing up “clade” and “grade” – a grade is a paraphyletic group (missing some descendants); a clade is monophyletic.
- Misinterpreting posterior probabilities as p‑values – they are conditional on the model and priors, not frequentist significance.

---

Keep this guide handy – it condenses the high‑yield concepts you’ll need to ace any phylogenetics exam.
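
Worked example: counting parsimony steps. The parsimony step in Key Processes ("choose the tree with the minimum number of steps") can be made concrete with Fitch's algorithm, which counts the minimum number of state changes one character needs on a fixed binary tree. The sketch below is illustrative only: the four taxa, the four-site alignment, and the function names `fitch` and `parsimony_length` are made up for this example, and the trees are written as nested tuples rather than in any standard file format.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character.

    tree   : a taxon name (str) or a 2-tuple of subtrees
    states : dict mapping taxon name -> observed state at this site
    """
    if isinstance(tree, str):              # tip: its state is observed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:                             # children agree: no extra change
        return shared, left_cost + right_cost
    # children disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over every column of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


# Hypothetical toy data: four taxa, four aligned sites.
ALIGNMENT = {"A": "AAGT", "B": "AAGC", "C": "CTGC", "D": "CTAT"}

# The three possible groupings of four taxa.
TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}

LENGTHS = {name: parsimony_length(t, ALIGNMENT) for name, t in TREES.items()}
BEST = min(LENGTHS, key=LENGTHS.get)

if __name__ == "__main__":
    for name, steps in LENGTHS.items():
        print(name, steps)
    print("most parsimonious:", BEST)
```

On this toy alignment the grouping ((A,B),(C,D)) needs 5 changes, against 7 and 6 for the alternatives, so parsimony prefers it. With more taxa the number of possible trees grows too fast to enumerate, which is why real analyses fall back on heuristic searches.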

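Worked example: which pair does neighbor‑joining join first? Neighbor‑joining does not simply pick the two taxa with the smallest raw distance. At each step it minimizes the criterion $Q(i,j) = (n-2)\,d(i,j) - \sum_k d(i,k) - \sum_k d(j,k)$, which corrects each distance for how divergent the two taxa are from everything else. The sketch below covers only this first pair‑selection step, not the full algorithm; the five‑taxon distance matrix is a small textbook‑style illustration, and the function names are hypothetical.

```python
def nj_first_pair(dist):
    """Return ((i, j), Q) for the pair minimizing the neighbor-joining criterion."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]   # total divergence of each taxon
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q


def closest_raw_pair(dist):
    """Return the pair (i, j) with the smallest raw distance, for comparison."""
    n = len(dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda p: dist[p[0]][p[1]])


# Illustrative symmetric distance matrix for taxa a, b, c, d, e.
TAXA = ["a", "b", "c", "d", "e"]
DIST = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

NJ_PAIR, NJ_Q = nj_first_pair(DIST)
RAW_PAIR = closest_raw_pair(DIST)

if __name__ == "__main__":
    print("NJ joins first:", [TAXA[k] for k in NJ_PAIR], "Q =", NJ_Q)
    print("smallest raw distance:", [TAXA[k] for k in RAW_PAIR])
```

Here d and e are the closest pair by raw distance (3), yet neighbor‑joining joins a and b first because their Q value (−50) is lower than that of d and e (−48). Always joining the closest pair instead gives a UPGMA‑style clustering, which assumes roughly equal rates across lineages.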