
Protein - Experimental Techniques, Proteomics, and Resources

Understand experimental methods, purification strategies, and computational tools for protein analysis, structure determination, and proteomics.


Summary

Methods for Studying Protein Structure and Function

Introduction

Understanding how proteins work requires studying both their structure and their function. Scientists have developed numerous experimental and computational approaches to tackle this challenge. These methods generally fall into three categories: experimental techniques that directly observe proteins, approaches that examine proteins in different contexts (test tube, living cells, or computers), and large-scale analyses that study many proteins simultaneously. This section explores the most important methods you'll encounter in your study of protein biology.

Experimental Techniques for Protein Structure and Function

X-ray Crystallography

X-ray crystallography is one of the most powerful techniques for determining protein structure at atomic resolution. The method works by first crystallizing a purified protein, then passing X-rays through the crystal. The atoms in the crystal diffract the X-rays, and the resulting diffraction pattern can be analyzed mathematically to determine the three-dimensional positions of the atoms in the protein.

Why this matters: X-ray crystallography has been responsible for solving the large majority of protein structures in the Protein Data Bank. It provides atomic-level detail that's essential for understanding how proteins work and for designing drugs that target them.

Important limitation: Proteins must be crystallizable, and crystallization is often difficult. As a result, globular proteins (compact, roughly spherical proteins) are over-represented in our structural knowledge, while membrane proteins and very large complexes are under-represented.

Nuclear Magnetic Resonance (NMR) Spectroscopy

NMR spectroscopy determines protein structures without requiring crystals. Instead, it measures how atomic nuclei behave in a magnetic field. This allows researchers to study proteins in solution, which is closer to their natural environment than a crystal lattice.
Key advantage: NMR can reveal protein dynamics and motion, not just static structures. It's particularly useful for studying smaller proteins and can detect movement between different conformations.

Mass Spectrometry

Mass spectrometry is a technique that ionizes proteins or peptides and measures their mass-to-charge ratio. This allows researchers to:

Determine the exact molecular weight of a protein
Identify post-translational modifications (chemical changes made to proteins after synthesis)
Sequence peptides by breaking them into fragments and analyzing the pattern
Perform large-scale identification of many proteins simultaneously

This technique has become indispensable for proteomics (studying all proteins in a sample).

Site-Directed Mutagenesis

Site-directed mutagenesis is a molecular technique that deliberately introduces specific changes into a gene so that the encoded protein's amino acid sequence is altered at chosen positions. Scientists use this method to test hypotheses about which residues are important for function. For example, if you suspect that a particular amino acid is critical for binding a substrate, you can mutate it and see how the protein's function changes.

Practical value: This technique directly reveals structure-function relationships by showing how specific structural changes affect protein behavior.

Immunohistochemistry

Immunohistochemistry uses antibodies (proteins that specifically bind to target proteins) to visualize where a particular protein is located within tissue sections. The antibodies are labeled with dyes or enzymes that produce visible signals, allowing you to see the protein's location under a microscope.

Key application: This technique is particularly valuable for studying protein location in intact tissues and is commonly used in medical diagnostics.
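As a worked illustration of the mass-to-charge measurement described under Mass Spectrometry above, the following minimal Python sketch computes the expected m/z of a peptide ion. It assumes monoisotopic residue masses, free N- and C-termini, and protonation as the only source of charge. The function names are hypothetical and chosen for illustration; the phosphorylation shift is included only as one example of how a post-translational modification shows up as a mass difference.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water per peptide (free N- and C-termini)
PROTON = 1.00728    # mass added per positive charge
PHOSPHO = 79.96633  # mass shift of one phosphorylation (HPO3)


def peptide_mass(sequence, n_phospho=0):
    """Neutral monoisotopic mass of a peptide, optionally phosphorylated."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence.upper())
    return residues + WATER + n_phospho * PHOSPHO


def mass_to_charge(sequence, charge, n_phospho=0):
    """m/z of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence, n_phospho) + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(z, round(mass_to_charge("GA", z), 4))
```

The same peptide appears at different m/z values depending on its charge state, which is why the instrument reports a ratio rather than a mass directly.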
In-Vitro, In-Vivo, and In-Silico Approaches

These three complementary approaches study proteins in fundamentally different ways:

In-Vitro Studies (Test Tube)

In-vitro (Latin for "in glass") refers to experiments performed outside living systems, typically with purified proteins in test tubes or other controlled containers. Examples include:

Measuring enzyme kinetics (how fast an enzyme works)
Testing protein-protein binding
Studying protein folding in isolation

Advantage: Complete experimental control. You can measure exactly what you want without the complexity of a living cell interfering.

Limitation: Purified proteins are removed from their natural cellular environment, and that environment may be necessary for their normal function.

In-Vivo Studies (Living Systems)

In-vivo (Latin for "within the living") means studying proteins within living cells or whole organisms. This reveals what proteins actually do in their natural context.

Advantage: Shows physiological relevance, that is, how the protein actually functions to support life processes.

Limitation: Much more complex; many variables are harder to control.

In-Silico Studies (Computer-Based)

In-silico (pseudo-Latin for "in silicon", coined by analogy with in vitro and in vivo) refers to computational modeling and simulations. Computers predict:

Protein structures based on sequence
How proteins move and change shape
How proteins bind to other molecules
Effects of mutations

Advantage: Fast and can handle large-scale analyses.

Limitation: Predictions must be validated experimentally; computational models are only as good as the underlying assumptions.

Protein Purification and Cellular Localization

Overview of Purification

Before you can study a protein, you need to obtain it in pure form. Protein purification is a multi-step process that starts with cells and progressively isolates the protein of interest.

Step-by-Step Purification Process

Cell Lysis: The first step is breaking open cells to release their contents.
This produces a crude lysate, a complex mixture containing thousands of different proteins along with nucleic acids, lipids, and other cellular components.

Ultracentrifugation: High-speed spinning separates the lysate into fractions based on size and density. This removes insoluble material like membranes, organelles, and nucleic acids, leaving soluble proteins in the supernatant.

Salting-Out: Adding high concentrations of salt causes proteins to precipitate (form solid particles) out of solution, which is useful for concentrating dilute protein solutions.

Chromatography Techniques: These separate proteins based on different properties:

Size exclusion chromatography separates proteins by molecular weight
Ion exchange chromatography separates proteins by charge
Affinity chromatography separates proteins by their ability to bind specific molecules

Monitoring Purification Progress

Scientists track purification using several analytical methods:

Gel electrophoresis separates proteins by size, showing you how many different proteins remain
Spectroscopy measures protein concentration
Enzyme assays (for enzymatic proteins) confirm the target protein is still active
Isoelectric focusing separates proteins by their isoelectric point (the pH where they have no net charge)

Affinity Tags for Recombinant Proteins

When scientists use genetic engineering to produce proteins, they often attach affinity tags, short sequences that have special binding properties. The most common example is a poly-histidine tag (or "His-tag"), which is a string of 6-10 histidine residues.

How it works: His-tags bind very tightly to nickel ions immobilized on a chromatography column. When the crude lysate passes through a column containing nickel, only proteins with His-tags bind and stick to the column.
Washing removes all other proteins, then the His-tagged protein is eluted (released from the column) by adding a solution of free histidine or imidazole, which competes with the His-tag for nickel binding.

Advantage: This provides a fast, highly specific purification step. The tag can be removed after purification if needed.

Determining Cellular Localization

Once you have a purified protein, you'll want to understand where it functions within the cell. Several techniques address this:

Fluorescent Fusion Proteins: Scientists fuse a gene encoding a fluorescent protein (most commonly green fluorescent protein or GFP) to the gene of their target protein. When this fusion protein is expressed in cells, it fluoresces green, allowing direct visualization of where the protein is located using fluorescence microscopy.

Indirect Immunofluorescence: If you can't or don't want to add a fluorescent tag, you can use antibodies instead:

Add an unlabeled primary antibody that specifically recognizes your protein of interest
Add a secondary antibody, pre-labeled with a fluorescent dye, that binds the primary antibody
The fluorescence of the bound secondary antibody reveals your protein's location
To identify specific compartments, you can co-label known compartment markers (like mitochondrial proteins) with differently colored dyes and check whether your protein colocalizes with that compartment

Fluorescent Dyes: Specific dyes preferentially accumulate in particular organelles (for example, certain dyes accumulate in mitochondria). By combining these with immunofluorescence, you can determine which compartment contains your protein.

Immunoelectron Microscopy: For ultra-high resolution localization, antibodies can be conjugated to electron-dense gold particles. When viewed under an electron microscope, these gold particles appear as dark spots, revealing protein location at the ultrastructural level.
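The His-tag affinity step described earlier in this section (bind, wash, elute) can be mimicked with a toy Python sketch. This is only an illustration under simplifying assumptions: a protein is treated as "bound" if its sequence contains six or more consecutive histidines, and everything else ends up in the flow-through. Real columns also retain some histidine-rich host proteins weakly, which this sketch ignores. The function names and the example sequences are hypothetical.

```python
import re

# Six or more consecutive histidines, matching the 6-10 residue tags described above.
HIS_TAG = re.compile(r"H{6,}")


def has_his_tag(sequence):
    """True if the sequence contains a run of at least six histidines."""
    return HIS_TAG.search(sequence.upper()) is not None


def nickel_column(lysate):
    """Split a lysate (name -> sequence) into bound and flow-through fractions."""
    bound = [name for name, seq in lysate.items() if has_his_tag(seq)]
    flow_through = [name for name in lysate if name not in bound]
    return bound, flow_through


if __name__ == "__main__":
    # Hypothetical example sequences, for illustration only.
    lysate = {
        "tagged_target": "MHHHHHHSSGLVPRGS",
        "contaminant": "MKTAYIAKQR",
        "his_rich": "MHHAHHKHH",  # histidines present, but no run of six
    }
    bound, flow_through = nickel_column(lysate)
    print("eluted with imidazole:", bound)
    print("washed away:", flow_through)
```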
Protein Digestion

The Process: Breaking Down Dietary Protein

When you eat protein, your digestive system breaks it down into smaller pieces through a process called proteolysis. This breaks peptide bonds (the bonds connecting amino acids), converting large dietary proteins into small peptides and amino acids that can be absorbed through the intestinal wall.

Proteases and Peptidases: Enzyme Classification

Proteases (also called peptidases) are enzymes that hydrolyze peptide bonds. They're classified based on where they cut:

Exopeptidases cleave peptide bonds at the termini (ends) of proteins, one amino acid at a time
Endopeptidases cleave peptide bonds in the interior of protein chains, breaking them into smaller fragments

Pepsin: The Stomach's Protease

Pepsin is an endopeptidase secreted in the stomach that initiates protein digestion. It works best in the acidic environment of the stomach (around pH 2) and makes the first cuts in dietary proteins, producing smaller peptides.

Pancreatic Proteases: Trypsin and Chymotrypsin

After partially digested proteins move to the small intestine, two pancreatic endopeptidases take over:

Trypsin cleaves peptide bonds specifically after arginine and lysine residues (basic amino acids).
Chymotrypsin cleaves peptide bonds after large hydrophobic amino acids like phenylalanine, tryptophan, and tyrosine.

Why multiple enzymes? Their complementary specificities ensure that bonds throughout the protein sequence, not just in a few specific locations, get broken. Together, these enzymes break dietary proteins down into amino acids and small peptides ready for absorption.

Proteomics: Large-Scale Protein Analysis

What is Proteomics?

The proteome is the complete set of proteins present in a cell, tissue, or organism at a specific point in time.
This is distinct from the genome (the DNA sequence), because:

Multiple proteins can be made from a single gene
Proteins are modified after synthesis
Protein abundance changes with time and cellular conditions
Some proteins are present in many copies; others in just a few

Proteomics is the systematic study of proteomes, essentially the protein equivalent of genomics. Rather than asking "what proteins could be made?" (which is what genomics tells you), proteomics asks "what proteins actually are present right now, in what amounts, and how are they modified?"

Large-Scale Proteomic Approaches

Modern proteomics relies on high-throughput technologies that can analyze thousands of proteins simultaneously:

High-throughput Mass Spectrometry: Mass spectrometry can identify and quantify hundreds or thousands of proteins from complex mixtures. Proteins are typically digested into peptides first, and the mass spectrometer identifies peptides by their mass and fragmentation pattern. Software then assigns these peptides back to their original proteins.

Protein Microarrays: Thousands of different proteins (or antibodies) are attached to a glass slide in a grid pattern. When a protein-containing sample is applied to the array, proteins of interest bind to their matching spots. Fluorescent detection reveals which proteins are present and their relative abundance.

Bioinformatic Databases: Large databases store and organize proteomic data, making it searchable and comparable across different studies. These databases integrate information about protein sequences, structures, functions, and interactions.

The Interactome

The interactome is the complete set of biologically possible protein-protein interactions within a cell.
Understanding the interactome is crucial because:

Proteins typically function as parts of larger complexes or networks
Knowing which proteins interact reveals how cellular processes are organized
Incorrect protein interactions are implicated in diseases

Two-Hybrid Screening is a technique for systematically exploring the interactome. It tests pairs of proteins to determine whether they physically interact, allowing researchers to map interaction networks on a cell-wide scale.

Techniques in Proteomics

Two-Dimensional Electrophoresis

Two-dimensional (2D) electrophoresis separates proteins in two perpendicular directions:

First dimension: Isoelectric focusing separates proteins by their isoelectric point (charge)
Second dimension: SDS-PAGE (denaturing gel electrophoresis) separates proteins by molecular weight

This creates a 2D "map" where each protein appears at a characteristic position, in principle allowing visualization of thousands of proteins simultaneously. However, this technique is less commonly used today compared to mass spectrometry approaches.

Mass Spectrometry in Proteomics

Mass spectrometry has become the gold standard for large-scale protein identification. A typical workflow includes:

Sample preparation: Proteins are often digested into peptides
Separation: Peptides are separated by liquid chromatography
Ionization and mass analysis: The mass spectrometer measures the mass-to-charge ratio of the peptides
Fragmentation: Peptides are further fragmented, and the fragment pattern is analyzed
Database searching: Software matches the fragment patterns to known protein sequences

This approach is remarkably sensitive and can detect rare proteins even in complex mixtures.

Protein Microarrays

In protein microarray experiments, thousands of different antibodies (or sometimes proteins) are attached to distinct spots on a glass slide. When a cellular extract or serum sample is applied, proteins bind to their cognate antibodies.
Fluorescent detection reveals which proteins are present and compares their relative abundances between samples (for example, normal versus diseased tissue).

<extrainfo>

Structural Genomics

Structural genomics is an ambitious effort to determine the three-dimensional structures of proteins representing every possible structural fold (the way proteins fold into their basic shapes). The idea is that if we know one example of each possible fold, we can use those structures as templates to model the structure of any other protein.

This approach complements homology modeling (see below) by aiming to create a comprehensive library of template structures covering all possible protein architectures.

</extrainfo>

Protein Structure Determination

Why Structure Matters

A protein's three-dimensional structure directly determines its function. Knowing the structure reveals:

How the protein binds to other molecules
What regions are important for activity
How mutations might affect function
Where drugs might bind

Understanding both tertiary structure (how a single protein folds) and quaternary structure (how multiple protein subunits assemble together) is essential for drug design and understanding protein function.

Major Structure Determination Methods

X-ray Crystallography (Revisited in Detail)

X-ray crystallography remains the workhorse for protein structure determination. The method yields atomic-resolution information but requires producing high-quality protein crystals, which is a major bottleneck.

What's challenging: Crystallization is often unpredictable. Many proteins are difficult or impossible to crystallize, particularly membrane proteins and large, flexible complexes. This creates a significant bias in the structures we know: we're much better informed about globular proteins than about membrane proteins.

Nuclear Magnetic Resonance Spectroscopy (Revisited)

NMR provides atomic-resolution structures of proteins in solution.
The technique measures how nuclear spins interact with a magnetic field and with each other. Distance constraints between atoms are extracted from these measurements and used to calculate the three-dimensional structure.

Unique advantages: NMR can reveal multiple conformations and protein dynamics, showing how proteins move and change shape. This information is often lost in crystal structures.

Size limitation: NMR is most practical for proteins under roughly 30 kDa. Larger proteins produce spectra that are too crowded to interpret easily.

Circular Dichroism

Circular dichroism (CD) is a spectroscopic technique that doesn't determine complete structures but rather measures secondary structure content. CD measures the difference in a protein's absorption of left-handed versus right-handed circularly polarized light. Different secondary structures (α-helix, β-sheet, random coil) produce characteristic CD spectra.

What it tells you: The approximate percentage of your protein that is α-helix versus β-sheet. This is much quicker and easier than X-ray crystallography or NMR but provides less detailed information.

Cryoelectron Microscopy and Electron Crystallography

Cryoelectron microscopy (cryo-EM) visualizes proteins directly by freezing them rapidly and imaging them with an electron microscope. Recent technological advances have made this an increasingly powerful method. While its resolution has typically been lower than that of X-ray crystallography, cryo-EM is particularly valuable for:

Very large protein complexes (which are difficult to crystallize)
Viruses
Proteins in multiple conformational states

Electron crystallography can produce high-resolution structures from two-dimensional crystals of membrane proteins, cases where traditional three-dimensional crystallography fails.

The Protein Data Bank

Solved protein structures are deposited in the Protein Data Bank (PDB), a public repository containing the three-dimensional coordinates for each atom in well over 100,000 solved structures.
Researchers worldwide can freely download these structures for further analysis and visualization.

Structural Bias

The set of solved structures in the PDB is not unbiased. There's a strong bias toward globular proteins because they crystallize more readily. Meanwhile, membrane proteins and large protein complexes are under-represented because they're technically more challenging to crystallize.

This bias has important implications: we have more detailed structural knowledge of some protein classes than others. This is particularly problematic for membrane proteins, which are targets for many drugs, yet we know their structures far less well than those of globular proteins.

Protein Structure Prediction

Homology Modeling: Using Structure Similarity

If a protein of interest is too difficult to study experimentally, scientists can often predict its structure using homology modeling. This approach is based on a key observation: proteins with similar amino acid sequences typically have similar three-dimensional structures.

How it works:

Search structure databases for proteins of known structure whose sequences are similar to your target protein
Identify a suitable template, a protein with a solved structure that is homologous (evolutionarily related) to your target
Align the sequences of your target and the template
Use the template's structure as a framework, modifying it based on differences in your target's sequence

Critical bottleneck: The accuracy of homology modeling depends heavily on the quality of the sequence alignment. A good alignment to a closely related template can yield an accurate model; a poor alignment yields a poor one.

The Role of Structural Genomics

Structural genomics attempts to solve enough diverse protein structures that most unsolved proteins will have a homologous structure in the database to use as a template. By systematically solving representatives of different protein folds, structural genomics aims to make homology modeling applicable to nearly all proteins.
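The template-selection and alignment steps of homology modeling described above can be made concrete with a small Python sketch that scores candidate templates by percent sequence identity. It assumes the sequences have already been aligned (gaps written as "-"), and it uses one common convention, identity over columns where neither sequence has a gap; other denominators are also in use. The function names and the short example alignments are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent of identical residues over aligned columns without gaps."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def pick_template(alignments):
    """Return the template name with the highest identity to the target.

    `alignments` maps template name -> (aligned_target, aligned_template).
    """
    return max(alignments, key=lambda name: percent_identity(*alignments[name]))


if __name__ == "__main__":
    # Hypothetical toy alignments, far shorter than real proteins.
    alignments = {
        "close_homolog": ("ACDE-G", "ACDQFG"),
        "distant_homolog": ("ACDEG", "TCWQG"),
    }
    for name, pair in alignments.items():
        print(name, percent_identity(*pair))
    print("chosen template:", pick_template(alignments))
```

Percent identity is only a first filter; it says nothing about whether the alignment itself is correct, which is the bottleneck the text highlights.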
Limitations and Special Cases

Intrinsically Disordered Proteins: Approximately 33% of eukaryotic proteins contain large regions that lack a fixed three-dimensional structure. These intrinsically disordered proteins are biologically functional despite lacking stable tertiary structure. Traditional structure prediction approaches don't apply to these regions because they have no stable structure to predict. Predicting disorder itself is important: identifying which regions of a protein are disordered helps characterize its structure and function.

Applications to Protein Engineering

Structure prediction has practical applications in protein engineering, the rational design of novel proteins with desired properties. By understanding protein structure, engineers can:

Modify enzymes to work at different pH or temperature
Design new protein binding sites
Create entirely novel protein folds

Computational structure prediction tools make these designs better informed and more likely to succeed.

In-Silico Simulation of Molecular Processes

Molecular Docking

Molecular docking is a computational technique that predicts how two molecules will bind to each other. This is particularly valuable for predicting:

Protein-ligand interactions: How a small molecule (drug candidate) might bind to a protein target
Protein-protein interactions: How two proteins might fit together

The docking software positions the two molecules in space and calculates their interaction energy, finding the binding pose (orientation and position) with the most favorable energy.

Practical application: Molecular docking is used extensively in drug design to predict whether drug candidates will bind well to their target proteins before synthesizing and testing them experimentally.

Classical Molecular Dynamics

Molecular dynamics simulations compute how proteins move and change shape over time.
The approach uses molecular mechanics force fields (mathematical descriptions of how atoms interact) and Newton's laws of motion to simulate protein motion at the atomic level.

What it reveals:

Which motions and conformational changes are energetically favorable
How flexible different regions of the protein are
How proteins respond to binding events
Effects of mutations on protein dynamics

Computational demand: Simulating even microseconds of protein motion requires substantial computing power, which is why these simulations are typically limited to relatively short timescales (nanoseconds to microseconds).

<extrainfo>

Online Databases and Resources for Protein Information

Scientists have created numerous searchable databases to organize and distribute protein information. These are valuable resources for finding sequences, structures, functional information, and interactions:

NCBI Protein Database: The National Center for Biotechnology Information maintains the Entrez Protein database, containing curated protein sequences from many organisms.

NCBI Protein Structure Database: Offers three-dimensional structures of proteins.

Human Protein Reference Database: Provides curated information specifically about human proteins.

PDB Europe: The Protein Data Bank in Europe hosts protein structural data along with educational materials and tutorials.

RCSB PDB: The Research Collaboratory for Structural Bioinformatics maintains the main Protein Data Bank with detailed structures and educational "Molecule of the Month" features.

UniProt: The Universal Protein Resource provides comprehensive information combining protein sequences with functional annotations, interaction data, and literature references.

Educational Resources: The Virtual Library of Biochemistry and Cell Biology provides comprehensive guides like "Proteins: Biogenesis to Degradation," detailing the complete life cycle of proteins from synthesis through folding, trafficking, and eventual degradation.

</extrainfo>
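To make the classical molecular dynamics idea from the summary concrete (a force field plus Newton's laws of motion), here is a deliberately tiny Python sketch. It assumes a single one-dimensional harmonic "bond" as the entire force field, reduced (unitless) parameters, and the velocity Verlet integration scheme, which is a common choice but is not named in the text above. Real force fields add angle, torsion, electrostatic, and van der Waals terms for thousands of atoms. All function names are hypothetical.

```python
def simulate_harmonic_bond(x0, v0, k, mass, dt, n_steps):
    """Integrate one particle on a spring (F = -k * x) with velocity Verlet.

    x is the displacement from the bond's equilibrium length.
    Returns the lists of positions and velocities at every step.
    """
    x, v = x0, v0
    force = -k * x
    positions, velocities = [x], [v]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * force / mass   # half-step velocity update
        x = x + dt * v_half                    # full-step position update
        force = -k * x                         # force at the new position
        v = v_half + 0.5 * dt * force / mass   # complete the velocity update
        positions.append(x)
        velocities.append(v)
    return positions, velocities


def total_energy(x, v, k, mass):
    """Kinetic plus potential energy of the harmonic bond."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    xs, vs = simulate_harmonic_bond(x0=1.0, v0=0.0, k=1.0, mass=1.0,
                                    dt=0.01, n_steps=1000)
    print("final displacement:", xs[-1])
    print("energy drift:", total_energy(xs[-1], vs[-1], 1.0, 1.0) - 0.5)
```

The small time step is the reason for the computational demand noted above: each step covers only a tiny slice of time, so reaching microseconds requires an enormous number of force evaluations.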
Flashcards
What does immunohistochemistry use to visualize the location of proteins in tissue sections?
Antibodies
What is the primary purpose of creating specific amino-acid changes through site-directed mutagenesis?
To study structure-function relationships
Why is the set of solved structures in the Protein Data Bank biased toward globular proteins?
They crystallize more readily for X-ray crystallography
In what state does nuclear magnetic resonance (NMR) spectroscopy elucidate protein structures?
In solution
What three characteristics of a protein can be identified using mass spectrometry?
Protein mass, composition, and post-translational modifications
What is the defining characteristic of in-vitro protein studies?
Analyzing purified proteins under controlled conditions
What is the primary goal of conducting in-vivo experiments on proteins?
To examine protein function within living cells or organisms
Which process releases cellular contents into a crude lysate to begin the purification process?
Cell lysis
What is the purpose of ultracentrifugation during protein purification?
To separate soluble proteins from membranes, organelles, and nucleic acids
Chromatography separates proteins based on which three physical properties?
Molecular weight, net charge, and binding affinity
How does a poly-histidine tag (His-tag) enable the selective purification of recombinant proteins?
It binds to nickel ions on a chromatography column
What technique uses electron-dense gold particles to map protein position at ultrastructural resolution?
Immunoelectron microscopy
What is the biological role of proteolysis in the digestive system?
Breaking down dietary proteins into small peptides and amino acids
What is the functional difference between exopeptidases and endopeptidases?
Exopeptidases cleave terminal peptide bonds, while endopeptidases cleave internal bonds
Which endopeptidase initiates the process of protein hydrolysis in the stomach?
Pepsin
Which two endopeptidases are secreted by the pancreas to complete protein hydrolysis?
Trypsin and chymotrypsin
What term refers to the complete set of proteins present in a cell or tissue at a given time?
The proteome
What method is used to separate many proteins simultaneously by their isoelectric point and molecular weight?
Two-dimensional electrophoresis
What is the primary function of two-hybrid screening in proteomics?
To systematically explore protein–protein interactions
What is the definition of the "interactome"?
The complete set of biologically possible protein–protein interactions in a cell
What is the ultimate goal of structural genomics?
To determine the structures of proteins representing every possible fold
What information does circular dichroism provide about a protein's structure?
The proportion of β-sheet and α-helical secondary structure
For what kind of structures is cryoelectron microscopy typically used?
Lower-resolution structures of very large protein complexes (e.g., viruses)
What type of proteins are specifically suited for high-resolution study via electron crystallography?
Two-dimensional crystals of membrane proteins
What specific data does the Protein Data Bank (PDB) provide for each atom in a solved structure?
Cartesian coordinates
Which two categories of protein structures are under-represented in the Protein Data Bank due to crystallization difficulties?
Membrane proteins and large protein complexes
On what basis does homology modeling predict a protein's structure?
Using a template protein with sequence similarity
What is considered the main bottleneck in achieving accurate homology modeling?
Accurate sequence alignment
Approximately what percentage of eukaryotic proteins contain large, functional, intrinsically disordered segments?
33%
What is the primary use of molecular docking in in-silico simulations?
Predicting intermolecular interactions (e.g., protein–ligand or protein–protein binding)
What does classical molecular dynamics use to simulate the motion of proteins over time?
Molecular mechanics force fields
Which resource is known for providing comprehensive protein sequence and functional information?
UniProt (Universal Protein Resource)

Key Concepts
Protein Structure Determination
X‑ray crystallography
Nuclear magnetic resonance spectroscopy
Homology modeling
Intrinsically disordered proteins
Molecular dynamics simulation
Protein Analysis Techniques
Mass spectrometry
Two‑dimensional electrophoresis
Protein microarray
Proteomics Resources
Proteomics
Protein Data Bank