3. • Structure Determination
Many functions of biological systems depend on the
structure and function of proteins.
Determining the structure and function of proteins helps
in studying their dynamics.
To understand the functions of proteins at a molecular
level, it is often necessary to determine their
three-dimensional structure.
Introduction
4. Introduction
Why Structure Determination?
It helps us understand:
• How proteins interact with other molecules, including other
proteins.
• How enzymes perform catalysis.
• How miscoding and/or misfolding of proteins is associated with
diseases.
6. X-Ray Crystallography
• What is X-Ray Crystallography?
– A form of very high resolution microscopy.
– Enables us to visualize protein structures at the atomic level
– Enhances our understanding of protein function.
• What is the principle behind X-Ray Crystallography?
– It is based on the fact that X-rays are diffracted by crystals.
http://pruffle.mit.edu/atomiccontrol/education/xray/xray_diff_files/image002.gif
7. Why X-Rays? Not Others?
Microscopy, wavelength and what can be visualized:
• Light, 300 nm: individual cells and sub-cellular organelles
• Electron, 10 nm: cellular architecture; shapes of large protein molecules
• X-rays, 0.1 nm or 1 Å: atomic detail of protein
8. Why use X-rays and crystals?
Optical microscopy vs. X-ray diffraction
• The wavelength of X-rays is on the order of atomic diameters and bond lengths, allowing
individual atoms to be resolved.
• No lenses are available to focus X-rays. The crystal acts as an amplifier of the
scattered X-rays.
http://classes.soe.ucsc.edu/bme220/Spring07/NOTES/Xraycryst.IMcNae_MWalkinshaw.pdf
9. X-Ray Crystallography
• 1. Protein purification.
• 2. Protein crystallization.
• 3. Data collection.
• 4. Structure Solution (Phasing)
• 5. Structure determination (Model building and refinement)
Steps in Structure Determination
http://www2.uah.es/farmamol/New_Science_Press/nsp-protein-5.pdf
10. X-Ray Crystallography
• What is Protein Purification?
– A series of processes intended to isolate one or a few proteins from a
complex mixture, usually cells, tissues or whole organisms.
• Why Protein Purification?
– Characterization of the function.
– Structure
– Interactions of the protein.
• Requirements
– A minimum of 5 to 10 milligrams of pure, soluble protein is required,
at better than 95% purity
Step1:Protein Purification
http://classes.soe.ucsc.edu/bme220/Spring07/NOTES/Xraycryst.IMcNae_MWalkinshaw.pdf
11. X-Ray Crystallography
• Why Crystallization:
– X-ray scattering from a single unit would be unimaginably weak.
– A crystal arranges a huge number of molecules in the same orientation.
– Scattered waves add up in phase and raise the signal to a level that
can be measured.
– This is often the rate-limiting step in straightforward structure
determinations, especially for membrane proteins
Step2:Protein crystallization
http://xray.bmc.uu.se/~kaspars/xray.ppt
12. Step2:Protein crystallization
Crystals MUST be:
Small in size:
• Less than 1 millimeter
PERFECT:
• No cracks
• No inclusions, such as air bubbles
Improving Crystal Quality
Hanging Drop Method:
1 to 5 μl of protein solution is suspended over
a 1 ml reservoir containing a precipitant
solution, e.g. ammonium sulfate or
polyethylene glycol
http://classes.soe.ucsc.edu/bme220/Spring07/NOTES/Xraycryst.IMcNae_MWalkinshaw.pdf
14. Mounting Crystals:
• Crystals are mounted in a way so that the sample
can be rotated and an X‐Ray beam can be passed
through the sample.
• Methods of mounting include using either a capillary
or a tube.
• Both capillaries and tubes are mounted on a
goniometer.
X-Ray Crystallography
Step3:Data collection:
Exposing X‐Rays:
Once the crystals are correctly mounted, they are
exposed to X‐Ray Beams. X‐Ray Sources include:
• Synchrotron: gives high resolution and luminosity
• X‐Ray generators: for smaller, laboratory use
http://serc.carleton.edu/research_education/geochemsheets/techniques/SXD.html
15. X-Ray Crystallography
• The source of the X-rays is often a synchrotron.
• The typical size for a crystal for data collection may be 0.3 x 0.3 x
0.1 mm.
• The crystals are bombarded with X-rays which are scattered from
the planes of the crystal lattice.
• The scattered X-rays are captured as a diffraction pattern on a
detector such as film or an electronic device.
Step3:Data collection:
http://pruffle.mit.edu/atomiccontrol/education/xray/xray_diff_files/image006.gif
16. X-Ray Crystallography
• Rotate the crystal through 1 degree and record the XRD pattern
• If the XRD pattern is very crowded, reduce the rotation step
• Repeat until 30 degrees have been covered
• Sometimes 180 degrees are needed, depending on crystal symmetry
• The lower the symmetry, the more data are required
• For high resolution, use a synchrotron
Step3:Data collection
http://upload.wikimedia.org/wikipedia/commons/d/de/Kappa_goniometer_animation.ogg
17. X-Ray Crystallography
Step4:Structure Solution (Phasing)
A typical diffraction pattern from a
protein crystal
GOAL= From Diffraction Data to Electron Density
The 3D structure obtained above is
the electron density map of the
crystal.
http://www.chem.ucla.edu/harding/IGOC/E/electron_density_map01
http://www.chem.ucla.edu/harding/IGOC/D/diffraction_pattern01.jpg
19. • What is the Phase problem?
– In the measurement of data from an X-ray crystallographic
experiment only the amplitude of the wave is determined.
– To compute a structure, the phase must also be known.
– Since it cannot be determined directly, it must be determined
indirectly or by some other experiment.
X-Ray Crystallography
Step4:Structure Solution (Phasing)
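The phase problem described above can be illustrated with a toy calculation. This is a minimal sketch, assuming a made-up one-dimensional "electron density" of two Gaussian peaks and using NumPy's FFT as a stand-in for the crystallographic Fourier transform; none of the numbers come from a real experiment.

```python
import numpy as np

# Toy 1-D "electron density": two Gaussian "atoms" on a 64-point grid (assumed values).
x = np.arange(64)
density = (np.exp(-0.5 * ((x - 20) / 2.0) ** 2)
           + 0.6 * np.exp(-0.5 * ((x - 45) / 2.0) ** 2))

# Structure factors are the Fourier transform of the density.
F = np.fft.fft(density)
amplitudes = np.abs(F)   # recoverable from the measured intensities
phases = np.angle(F)     # lost in the diffraction experiment

# With amplitudes and phases, the inverse transform recovers the density.
with_phases = np.fft.ifft(amplitudes * np.exp(1j * phases)).real

# With amplitudes only (all phases set to zero), the resulting map is wrong.
without_phases = np.fft.ifft(amplitudes).real

error_with = float(np.max(np.abs(with_phases - density)))
error_without = float(np.max(np.abs(without_phases - density)))
print(f"max error with phases:    {error_with:.2e}")
print(f"max error without phases: {error_without:.2e}")
```

The map computed without phases does not resemble the original density, which is why phases must be obtained indirectly or from an additional experiment.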
20. • Methods for solving the phase problem
– Molecular Replacement (MR)
– Multiple/Single Isomorphous Replacement (MIR/SIR)
– Multiple/Single-wavelength Anomalous Diffraction (MAD/SAD)
• Principle using Fourier Transform (FT) :
– FT of the diffraction data gives us a representation of the contents
of the crystal.
X-Ray Crystallography
Step4:Structure Solution (Phasing)
http://xray.bmc.uu.se/~kaspars/xray.ppt
21. X-Ray Crystallography
Step5: Structure determination (Fitting):
• Fitting of protein sequence in the electron density.
• The electron density is not self-explanatory; it has to be interpreted.
• Can be automated, if resolution is close to 2Å or better.
• What can be interpreted is largely defined by resolution.
http://xray.bmc.uu.se/~kaspars/xray.ppt
22. X-Ray Crystallography
Step5: Structure determination (Refinement):
Automated improvement of the model, so it explains the observed data
better.
The phases get improved as well, so the electron density maps get better.
24. Nuclear magnetic resonance (NMR)
Introduction:
• The aim:
Measure set of distances between atomic nuclei.
• Why?
– For proteins that are hard to crystallize.
– For proteins that can be dissolved at high concentrations.
– To study dynamics of the protein: conformational equilibria,
folding and intra-, intermolecular interactions.
25. Nuclear magnetic resonance (NMR)
The concept
• The basis is nuclear spin.
• Spin is characterized by an angular momentum vector.
• It can be parallel or anti-parallel to the external magnetic field.
• This forms two energy states, low and high.
• Applying radio-frequency radiation can change the state.
http://www.umkcradres.org/Spec/RADPAGE/Magnet2.jpg
26. Nuclear magnetic resonance (NMR)
The concept
• Perturbation of the spins causes an NMR signal to be observed.
• The signal consists of RF waves with frequencies that match the energy
difference between the spin states of the individual nuclei involved.
• The resonance frequencies of different types of nuclei are widely
different.
http://en.wikipedia.org/wiki/File:NMR_EPR.gif
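How widely the resonance frequencies of different nuclei differ can be sketched with a short calculation. The gyromagnetic ratios below are approximate literature values, and the 14.1 tesla field strength is an assumption chosen for illustration; the function name is hypothetical.

```python
# Gyromagnetic ratios divided by 2*pi, in MHz per tesla (approximate literature values).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": -4.316}


def larmor_frequency_mhz(nucleus: str, field_tesla: float) -> float:
    """Return the magnitude of the resonance frequency in MHz at the given field."""
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * field_tesla


field = 14.1  # tesla; an assumed field strength
for nucleus in GAMMA_BAR_MHZ_PER_T:
    print(f"{nucleus}: {larmor_frequency_mhz(nucleus, field):.1f} MHz")

# Ratio of the proton frequency to the 15N frequency at the same field.
ratio = larmor_frequency_mhz("1H", field) / larmor_frequency_mhz("15N", field)
print(f"1H / 15N frequency ratio: {ratio:.2f}")
```

The ratio comes out close to ten, so signals from protons and nitrogen nuclei sit in clearly separate frequency ranges.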
27. Nuclear magnetic resonance (NMR)
The concept
• Chemical shift is the resonant frequency of a nucleus relative to a
standard.
• Nuclear Overhauser effect (NOE) permits distance measurements
between nuclei.
http://www.cs.duke.edu/brd/Teaching/Bio/asmb/current/2papers/Intro-reviews/flemming.pdf
28. Nuclear magnetic resonance (NMR)
• 1. Protein solution.
• 2. NMR spectroscopy (data collection)
• 3. Sequential resonance assignment
• 4. Collection of conformational constraints
• 5. Structure calculation
Steps in Structure Determination
http://uah.es/farmamol/New_Science_Press/nsp-protein-5.pdf
29. • Highly purified protein preparation.
• Unlike crystallography, structure determination by NMR is carried out on
an aqueous sample.
• Usually, the sample volume is between 300 and 600 microlitres, with a
protein concentration in the range 0.1 – 3 millimolar.
• The purified protein is usually dissolved in a buffer solution.
Nuclear magnetic resonance (NMR)
Step1: Protein solution
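The sample volume and concentration quoted above translate directly into an amount of protein. The following is a small sketch of that arithmetic; the 20 kDa molecular weight and the function name are assumptions made for the example.

```python
def protein_mass_mg(volume_ul: float, concentration_mm: float,
                    molecular_weight_da: float) -> float:
    """Mass of protein (mg) needed for a sample of given volume and concentration."""
    litres = volume_ul * 1e-6
    moles = litres * (concentration_mm * 1e-3)   # mol = L x mol/L
    grams = moles * molecular_weight_da          # Da is numerically g/mol
    return grams * 1e3


# Assumed example: a 20 kDa protein, 500 microlitres at 1 millimolar.
mass = protein_mass_mg(volume_ul=500, concentration_mm=1.0, molecular_weight_da=20000)
print(f"{mass:.1f} mg of protein required")
```

For larger proteins or higher concentrations the required mass scales linearly, which is one reason sample preparation can be demanding.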
30. • Each distinct nucleus produces a chemical shift by which it can be recognized.
• Chemical shifts often overlap, so two main categories of experiment are used:
- One where magnetization is transferred through the chemical bonds.
- One where the transfer is through space.
Nuclear magnetic resonance (NMR)
Step2: NMR spectroscopy (data collection)
31. • Map each chemical shift to its atom by
sequential walking.
• The application of multidimensional
NMR spectroscopy allowed the
development of general
strategies for the assignment.
• Take advantage of the known
protein sequence.
Nuclear magnetic resonance (NMR)
Step3: Sequential resonance assignment
http://en.wikipedia.org/wiki/File:1H_NMR_Ethanol_Coupling_shown.GIF
32. • Chemical shift assignments can typically be obtained within one week.
• The assignment of inter-atomic distances based on observed proton/proton NOEs
is quite time consuming.
• Structure calculation and NOE assignment is an iterative process.
Nuclear magnetic resonance (NMR)
Step3: Sequential resonance assignment
33. • Geometric conformational information has to be derived from the NMR
data:
• Distance restraints.
• Angle restraints.
• Orientation restraints.
• Chemical shift data provide information on the type of secondary
structure.
Nuclear magnetic resonance (NMR)
Step4: Collection of conformational constraints
34. • The determined restraints are the input.
• Using computer programs, the process
results in an ensemble of structures.
Nuclear magnetic resonance (NMR)
Step5: Structure calculation
http://en.wikipedia.org/wiki/File:Ensemble_of_NMR_structures.jpg
36. • Every experiment has associated errors
• Random errors will affect the reproducibility and precision of the
resulting structures
• Systematic errors affect the accuracy of the model
• Precision indicates the degree of reproducibility of the
measurement and is often expressed as the variance of the
measured data set under the same conditions
• Accuracy, however, indicates the degree to which a measurement
approaches its correct value
• Ideally, a model of a protein will be more accurate the better it fits the
actual molecule it represents, and more precise the
less uncertainty there is about the positions of its atoms
Structure Quality Measures
Definitions
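The distinction between precision and accuracy drawn above can be made concrete with a few numbers. This is a sketch with invented measurements of a quantity whose true value is assumed to be 2.00; variance stands in for precision and the offset of the mean from the true value stands in for accuracy.

```python
import statistics


def precision_and_accuracy(measurements, true_value):
    """Return (variance, bias): spread of the data and offset of its mean."""
    variance = statistics.variance(measurements)        # small variance = precise
    bias = statistics.mean(measurements) - true_value   # small bias = accurate
    return variance, bias


TRUE_VALUE = 2.00  # assumed "correct" value for the example

# Systematic error: tightly clustered but shifted away from the true value.
precise_but_inaccurate = [2.10, 2.11, 2.09, 2.10]
# Random error: centred on the true value but widely scattered.
accurate_but_imprecise = [1.80, 2.25, 1.95, 2.00]

var_a, bias_a = precision_and_accuracy(precise_but_inaccurate, TRUE_VALUE)
var_b, bias_b = precision_and_accuracy(accurate_but_imprecise, TRUE_VALUE)
print(f"precise/inaccurate: variance={var_a:.5f}, bias={bias_a:+.3f}")
print(f"accurate/imprecise: variance={var_b:.5f}, bias={bias_b:+.3f}")
```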
37. • R-Factor
– A measure of agreement between the crystallographic model and the
original X-ray diffraction data.
– The R-factor is used to assess the progress of structure refinement, and
the final R-factor is one measure of model quality.
– The R-factor is calculated as follows:
R = Σ | |Fobs| − |Fcalc| | / Σ |Fobs|, summed over all reflections
• |Fobs| is the amplitude derived from the measured intensity of a reflection in the
diffraction pattern
• |Fcalc| is the amplitude of the same reflection calculated from the
current model
– The absolute range of values is 0 to 1; the lower the value, the better the structure
– Usually ranges between 0.6 and 0.2
Structure Quality Measures
X-Ray Crystallography Quality Assessment
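The R-factor can be sketched in a few lines, assuming the standard definition R = Σ||Fobs| − |Fcalc|| / Σ|Fobs| summed over all reflections. The amplitudes below are invented, and the scale factor that real refinement programs apply between observed and calculated amplitudes is deliberately left out.

```python
import numpy as np


def r_factor(f_obs, f_calc) -> float:
    """Crystallographic R-factor from observed and calculated amplitudes."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    return float(np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs))


# Invented amplitudes for four reflections.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

A model that reproduces the observed amplitudes exactly gives R = 0, and the value grows as the agreement gets worse.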
38. • Free R-Factor
– The free R-factor, Rfree, is computed in the same manner as R-Factor,
but using only a small set of randomly chosen intensities (the "test set")
which are set aside from the beginning and not used during refinement
– They are used only in the cross-validation or quality control process of
assessing the agreement between calculated (from the model) and
observed data
• The quantities RSR, Rmerge and Rsymm are similarly used to describe
the internal agreement of measurements in a crystallographic data
set.
– These quantities are generally less used, and they are explained on our
Wiki
Structure Quality Measures
X-Ray Crystallography Quality Assessment
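The cross-validation idea behind Rfree can be sketched as follows. The data are synthetic, the 5% test-set fraction is an assumed choice, and the helper names are hypothetical; in a real refinement the test set is excluded while the model is being fitted, which this toy example does not simulate.

```python
import numpy as np


def r_factor(f_obs, f_calc) -> float:
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    return float(np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs))


def split_work_and_test(n_reflections: int, test_fraction: float = 0.05, seed: int = 0):
    """Return a boolean mask marking a random test set (assumed 5% by default)."""
    rng = np.random.default_rng(seed)
    is_test = np.zeros(n_reflections, dtype=bool)
    n_test = max(1, int(round(test_fraction * n_reflections)))
    is_test[rng.choice(n_reflections, size=n_test, replace=False)] = True
    return is_test


# Synthetic amplitudes: "calculated" values are the observed ones with 10% noise.
rng = np.random.default_rng(1)
f_obs = rng.uniform(10, 100, size=1000)
f_calc = f_obs * (1 + rng.normal(0, 0.1, size=1000))

is_test = split_work_and_test(len(f_obs))
r_work = r_factor(f_obs[~is_test], f_calc[~is_test])  # reflections used in refinement
r_free = r_factor(f_obs[is_test], f_calc[is_test])    # reflections set aside
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```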
39. • Knowledge-based quality measures
– Knowledge-based (KB) metrics describe how well the structure model
conforms to expectations
– They use selected features, such as:
• Bond length and bond angle distributions, dihedral angle distributions,
atomic packing, hydrogen bond geometries, and other geometric features.
– Ideal values are derived from high-resolution X-ray structures
• Model versus data measures
– The most general form of MvD validation involves comparison of
distances and dihedral angles in models with the corresponding
experimental restraints.
– MvD measures are used widely with NMR
Structure Quality Measures
NMR Quality Assessment
40. • Common MvD Measures
– Root-Mean Square Deviation (RMSD)
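• A common approach to assess the quality of NMR structures and to
determine the relative difference between structures
• An RMSD is a measure of the distance separation between
equivalent atoms:
RMSD = sqrt( (1/N) Σ d_i² ), where d_i is the distance between the i-th pair of equivalent atoms and N is the number of pairs
• Two identical structures will have an RMSD of 0 Å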
– RPF Quality Scores
• Recent efforts in NMR structure validation have included increased
use of RPF Scores to calculate the ‘‘goodness-of-fit’’ between the
3D protein NMR structures and experimental NOESY peak list
Structure Quality Measures
NMR Quality Assessment
http://biomaps.rutgers.edu/JACS_127_1665_2005.pdf
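The RMSD between two structures can be sketched as below. This assumes the two coordinate sets list equivalent atoms in the same order and have already been superimposed; no optimal superposition is performed, and the coordinates are invented.

```python
import numpy as np


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation (same units as the input) between equivalent atoms."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate sets must have the same shape")
    squared_distances = np.sum((a - b) ** 2, axis=1)  # one value per atom pair
    return float(np.sqrt(np.mean(squared_distances)))


# Invented coordinates in angstroms: structure_b is structure_a shifted by 1 A along z.
structure_a = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
structure_b = [[0.0, 0.0, 1.0], [1.5, 0.0, 1.0], [1.5, 1.5, 1.0]]

print(f"RMSD(a, a) = {rmsd(structure_a, structure_a):.2f} A")
print(f"RMSD(a, b) = {rmsd(structure_a, structure_b):.2f} A")
```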
41. • RPF Quality Scores
– Recall
TP / (TP + FN)
– Precision
TP / (TP + FP)
– F-measure
• Overall performance score calculated from the recall and precision
• It provides measure of the overall fit between the query model
structure and the experimental data
(2 x Recall x Precision) / (Recall + Precision)
Structure Quality Measures
NMR Quality Assessment
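The recall, precision and F-measure formulas above can be written out directly. The counts of true positives, false positives and false negatives in the example are invented, the function name is hypothetical, and the sketch does not guard against zero denominators.

```python
def rpf_scores(tp: int, fp: int, fn: int):
    """Return (recall, precision, f_measure) from TP, FP and FN counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = (2 * recall * precision) / (recall + precision)
    return recall, precision, f_measure


# Invented counts for illustration.
recall, precision, f_measure = rpf_scores(tp=80, fp=20, fn=20)
print(f"recall={recall:.2f}, precision={precision:.2f}, F-measure={f_measure:.2f}")
```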
43. X-Ray Pros
• Get whole 3D structure by analysis of good crystallized material
• Produces a single model that is easy to visualize and interpret
• More mathematically direct image construction
• Quality indicators available (resolution, R-factor)
• Large molecules can be determined
X-Ray Cons
• Protein has to form stable crystals that diffract well
• Crystal production can be difficult and time consuming
• Inability to examine solutions and the behavior of the molecules in solution
• There is no chance for direct determination of secondary structures
• Unnatural, non-physiological environment
NMR Pros
• Can provide information on dynamics and identify individual side-chain motion
• Secondary structure can be derived from limited experimental data
• Free from artifacts resulting from crystallization
• Useful for protein-folding studies
• Closer to biological conditions in some respects
NMR Cons
• Requires concentrated solution, therefore danger of aggregation
• Currently limited to determination of relatively small proteins
• A weaker interpretation of the experimental data
• Produces an ensemble of possible structures rather than one model
Advantages & Disadvantages
X-Ray vs. NMR
45. Cryo-Electron microscopy
Another method for structure determination
• Definition:
– A technology for studying the architecture of cells, viruses and
protein assemblies at molecular resolution.
• Biological specimens:
1. Thin film
2. Vitreous sections
46. Cryo-Electron microscopy
Another method for structure determination
• Advantages :
1. Allows the observation of specimens that have not been stained or
fixed in any way
2. Showing them in their native environment
3. Fewer functionally irrelevant conformational changes
• Disadvantages:
1. Expensive
2. The resolution of cryo-electron microscopy maps is often not high enough to show atomic detail
50. Unit Cell vs. Biological Cell
• Unit Cell: the asymmetric unit is the smallest portion of a crystal
structure to which symmetry operations can be applied in order to
generate the complete unit cell (the crystal repeating unit)
• Biological Cell: the macromolecular assembly (biological assembly) that has either been
shown to be or is believed to be the functional form of the molecule.
hemoglobin
(αβ)2
51. Unit Cell vs. Biological Cell
• Thus, a biological assembly may be built from:
• one copy of the asymmetric unit
• a portion of the asymmetric unit
• Asymmetric unit with multiple biological assemblies
52. X-Ray Crystallography
Step1:Protein Purification(Backup)
A figure summarizing the steps involved in a metal binding strategy for protein
purification
http://upload.wikimedia.org/wikipedia/commons/thumb/e/e9/Protein_Purification_MetalBinding.tif/lossy-page1-320px-Protein_Purification_MetalBinding.tif.jpg
53. X-Ray Crystallography
Bragg's law
Step2:Protein crystallization(Backup)
http://www.eserc.stonybrook.edu/ProjectJava/Bragg/
Scattered beams in phase:
they add up
Scattered beams not in
phase: they cancel each other
nλ = 2d sin θ
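Bragg's law, nλ = 2d sin θ, can be rearranged to give the lattice-plane spacing d from a measured diffraction angle. The sketch below assumes the angle is supplied as 2θ in degrees, as is common on diffractometers; the function name and the example angle are hypothetical, and 1.5418 Å is the approximate Cu Kα wavelength of a laboratory X-ray generator.

```python
import math


def bragg_spacing(wavelength_angstrom: float, two_theta_deg: float, order: int = 1) -> float:
    """Plane spacing d (angstroms) from Bragg's law: n * lambda = 2 * d * sin(theta)."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength_angstrom / (2.0 * math.sin(theta))


cu_k_alpha = 1.5418  # angstroms, approximate Cu K-alpha wavelength
d = bragg_spacing(cu_k_alpha, two_theta_deg=30.0)
print(f"d = {d:.3f} A for a reflection at 2-theta = 30 degrees")
```

Reflections at larger angles correspond to smaller spacings, so data recorded further from the beam centre carry the higher-resolution information.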
54. • The biological material is spread on an electron microscopy grid and is preserved in a frozen-
hydrated state by rapid freezing, usually in liquid ethane near liquid nitrogen temperature. By
maintaining specimens at liquid nitrogen temperature or colder, they can be introduced into the
high-vacuum of the electron microscope column. Most biological specimens are
extremely radiation sensitive, so they must be imaged with low-dose techniques (usefully, the low
temperature of cryo-electron microscopy provides an additional protective factor
against radiation damage).
• Consequently, the images are extremely noisy. For some biological systems it is possible to
average images to increase the signal-to-noise ratio and retrieve high-resolution information about
the specimen using the technique known as single particle analysis. This approach in general
requires that the things being averaged are identical, although some limited conformational
heterogeneity can now be studied (e.g. ribosome). Three-dimensional reconstructions from cryo-
EM images of protein complexes and viruses have been solved to sub-nanometer or near-atomic
resolution, allowing new insights into the structure and biology of these large assemblies.
• Analysis of ordered arrays of protein, such as 2-D crystals of transmembrane
proteins or helical arrays of proteins, also allows a kind of averaging which can provide high-
resolution information about the specimen. This technique is called electron crystallography.
Thin film
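The benefit of averaging noisy images, as described above, can be sketched with synthetic data. This is a toy model that assumes identical, perfectly aligned particles and Gaussian noise; real single particle analysis also has to determine each particle's orientation, which is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "particle": a bright square on a dark 32 x 32 background.
signal = np.zeros((32, 32))
signal[12:20, 12:20] = 1.0

NOISE_SIGMA = 5.0   # assumed noise level, much larger than the signal
N_IMAGES = 400      # assumed number of particle images


def noisy_image():
    """One low-dose image: the particle buried in Gaussian noise."""
    return signal + rng.normal(0.0, NOISE_SIGMA, size=signal.shape)


def residual_noise(image) -> float:
    """Standard deviation of what remains after subtracting the true signal."""
    return float(np.std(image - signal))


single = noisy_image()
average = np.mean([noisy_image() for _ in range(N_IMAGES)], axis=0)

noise_single = residual_noise(single)
noise_average = residual_noise(average)
print(f"noise in one image:       {noise_single:.2f}")
print(f"noise after averaging {N_IMAGES}: {noise_average:.2f}")
```

Under these assumptions the noise falls roughly with the square root of the number of images averaged, so 400 images reduce it by about a factor of 20.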
55. • The thin film method is limited to thin specimens (typically < 500 nm) because the electrons
cannot cross thicker samples without multiple scattering events. Thicker specimens can be
vitrified by plunge freezing (cryofixation) in ethane (up to tens of μm in thickness) or more
commonly by high pressure freezing (up to hundreds of μm). They can then be cut in thin sections
(40 to 200 nm thick) with a diamond knife in a cryo ultramicrotome at temperatures lower than
-135 °C (devitrification temperature). The sections are collected on an electron microscope grid
and are imaged in the same manner as specimens vitrified in thin film. This technique is called
cryo-electron microscopy of vitreous sections (CEMOVIS) or cryo-electron microscopy of frozen-
hydrated sections.
Vitreous sections
Editor's Notes
We all are familiar with crystals from rock collections or small molecules, such as salt or
sugar. We usually associate them with properties like hard, durable, and pretty.
Unfortunately, only the latter is true for protein crystals.
If the crystals are not perfect then the end image that is formed will have random patterns or have other problems.
To obtain any useful structural information, some form of intelligence (machine or human) has to interpret the electron density in the form of a model that best fits the data.
Phase errors and unidentifiable sections of density also play a role in restricting accurate model building. These can be overcome or decreased.
For example, protons (1H) resonate at a frequency roughly ten times higher than nitrogen nuclei (15N).
Find out which chemical shift corresponds to which atom. This is typically achieved by sequential walking using information derived from several different types of NMR experiment.
Only the application of multidimensional NMR spectroscopy allowed the development of general strategies for the assignment of signals in proteins.
Use the known protein sequence to connect nuclei of amino acid residues that are neighbours in the sequence.
Chemical shift assignments can typically be obtained within one week and can also be automated. The chemical shifts already allow the secondary structure elements of a protein to be defined.
The assignment of interatomic distances based on observed proton/proton NOEs is quite time consuming.
Structure calculation and NOE assignment is an iterative process.
Distance restraints: geometric conformational information in the form of distances and/or torsion angles has to be derived from the NMR data.
Angle restraints: restraints on the torsion angles of the chemical bonds, typically the psi and phi angles, can be generated. One approach uses the chemical shifts to generate angle restraints.
Orientation restraints: the analyte molecules in a sample can be partially ordered with respect to the external magnetic field of the spectrometer by manipulating the sample conditions.
Chemical shift data provide information on the type of secondary structure.
The determined restraints can be used as input for the structure calculation process.
Using computer programs such as GeNMR, CYANA or XPLOR-NIH, the process results in an ensemble of structures that, if the data were sufficient to dictate a certain fold, will converge.
Typically one hundred structures are calculated, and those structures that comply best with the NMR input data and are energetically most favorable are selected as a group of structures often referred to as an NMR bundle.