2. Concepts: Linkage and Linkage Mapping
Linkage (of genes):
“the association of genes that results from their
being on the same chromosome (i.e., physically
associated)”. For example, genes A and B on
chromosome Chr1 (Fig. 1a).
Linkage group:
“all genes in one chromosome form one linkage
group”. For example: Chr1 and Chr2 are two different
linkage groups (Fig. 1a).
Linked (genes):
“a pair of linked genes (specifically, their alleles) tend
to be transmitted together during meiotic cycle and
progenies deviate from Mendelian ratios depending
upon recombination fraction (r) between the two
genes”. For example, genes A and B in Fig. 1b.
Fig. 1a. A and B linked (Chr1); C unlinked to A and B (Chr2)
Fig. 1b. Test cross frequencies (parents AABB x aabb; test cross AaBb x aabb)
F1 gamete   Progeny genotype   Frequency (unlinked)   Frequency (linked)
AB          AaBb               1/4                    (1-r)/2
Ab          Aabb               1/4                    r/2
aB          aaBb               1/4                    r/2
ab          aabb               1/4                    (1-r)/2
Source: R.H.J. Schlegel, Encyclopedic Dictionary of Plant Breeding
3. Concepts: Linkage and Linkage Mapping
Linkage map:
- “is a map of the frequencies of
recombination that occur between markers
on homologous chromosomes during
meiosis.”
- distance is measured in cM.
Physical map:
- “shows the physical locations of genes and
other DNA sequences of interest.”
- distance is measured in base pairs
Comparative map:
- a map that compares linkage maps or
physical maps of related species based on
shared markers or sequences, respectively
(Fig. 2)
Fig. 2. Test cross frequencies
Source: Fig. 2 - www.pnas.org/content/102/37/13206/F3.expansion.html
4. Concepts: QTL Analysis
Qualitative traits:
1. Monogenic or oligogenic
2. Discrete phenotypic classes (nominal scale)
3. Typically, environmental effect on trait expression is absent or low
4. Discontinuous variation (Fig. 3)
5. Genes have large effects
6. Mapped as visible markers (i.e., linkage mapping)
Quantitative traits:
1. Polygenic (quantitative trait loci)
2. Continuum of measures (interval scale)
3. Trait expression may show profound environmental effect
4. Continuous variation (Fig. 4)
5. Genes have smaller effects
6. Mapping requires QTL analysis
Fig. 3. Discrete trait
Source: cubocube.com
Fig. 4. Fruit shape: a quantitative trait
Source: www.nature.com
5. Lecture Outline: Linkage Mapping
1. A peek into the history of linkage mapping
1.1. Mendel’s work: rediscovery, validation and exceptions
1.2. Early genetic linkage maps
- natural mutants as genetic markers
- two-point and three-point linkage analysis
1.3. Mapping functions
2. Molecular era and revolution in genetic linkage mapping
2.1. Molecular markers
- isozymes, RFLPs, SSRs and SNPs
2.2. Mapping populations in plants
- F2, RILs, BC
2.3. Methods and tools for linkage mapping in plants
- maximum likelihood, LOD support, multipoint linkage mapping
2.4. Mapping polyploid genomes and outcrossing species
6. 1. A peek into the history of linkage mapping
1.1. Mendel’s work: rediscovery,
validation and exceptions
- Experiments in Plant Hybridization
(1865). Crosses between natural
mutants (Fig. 5)
- Rediscovered in 1900
- Laws of segregation (Fig. 6) and
independent assortment (Fig. 7)
- Wide validity in diverse organisms
for unlinked qualitative traits
Fig. 5. Mendel’s traits
Source: Mendel’s traits - www.nature.com
Fig. 6. Monohybrid cross
Source: monohybrid cross - www.desktopclass.com
Fig. 7
Source: Punnett square - sites.saschina.org
7. 1. A peek into the history of linkage mapping
1.1. Mendel’s work: rediscovery,
validation and exceptions
- Bateson and Punnett (1904)
- Deviation from Mendelian inheritance
(Fig. 8)
Fig. 8
Source: www.cas.miamioh.edu
Timeline:
1865: Gregor Mendel proposed basic laws of inheritance
1900: H. de Vries, E. von Tschermak and C. Correns rediscovered Mendel’s work
1902: Boveri and Sutton proposed the chromosome theory of inheritance
1904: Bateson and Punnett reported linkage
8. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
- 1900 – 1910: concepts of gene, allele,
genotype, phenotype, homozygote,
heterozygote
Thomas Hunt Morgan:
i. studied Drosophila genetics
ii. genes responsible for discrete
phenotypic differences are located on
chromosomes
iii. likelihood of co-transmission and
reshuffling (due to recombination)
were dependent on linkage between
genes (Fig. 9)
iv. linkages can be quantified
(i.e., linkage mapping is a possibility)
Fig. 9. An illustration of Morgan’s
study in Drosophila
Source: Fig. 9 - http://bio.vtn2.com/bio-home/harvey/lect/images/morgan15.4.gif
9. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
Quantifying genetic linkages:
- mostly dihybrid test crosses and F2
populations (Fig. 10)
- segregating for wild-type (+) and mutant
(-) alleles
- sex-linked genes (X-linked)
First genetic linkage map of Sturtevant
(Morgan’s student):
- Series of dihybrid crosses. Example,
Fig. 10
- Map distance between body color and
eye color genes
= Recombination frequency, RF (%)
= (recombinant types) * 100 / total
= [(0 + 2)/373] * 100 ≈ 0.5
Fig. 10. An illustration of a dihybrid
cross, based on Sturtevant
(1913), showing wild-type (+) and mutant (-)
alleles in parental and recombinant types
Source: Fig. 10 - http://www.esp.org/foundations/genetics/classical/holdings/s/ahs-13.pdf
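The RF formula above is simple arithmetic; the sketch below reproduces the body color and eye color example (2 recombinants among 373 offspring). The function name is hypothetical.

```python
def recombination_frequency_percent(recombinants, total):
    """RF (%) = recombinant types * 100 / total offspring scored."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recombinants / total


# Sturtevant's body color (B) and eye color (O) cross: 0 + 2 recombinants of 373
rf_bo = recombination_frequency_percent(0 + 2, 373)
print(round(rf_bo, 1))  # 0.5
```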
10. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
First genetic linkage map of Sturtevant
(Morgan’s student) (Fig. 11):
- a series of two-point recombination
frequencies (%) between 6 genes (Fig.
12). Here, 19 different populations
- started marker order from closest
linkages and manually added other loci
Fig. 11. First genetic linkage map. Sturtevant (1913)
Fig. 12. Sturtevant table of RF (%)
Factors concerned   Proportion of crossovers   % of crossovers
BCO                 193 / 16278                1.2
BO                  2 / 373                    0.5
BP                  1464 / 4551                32.2
BR                  115 / 324                  35.5
BM                  260 / 693                  37.5
COP                 224 / 748                  29.9
COR                 1643 / 4749                34.6
COM                 76 / 161                   47.2
OP                  247 / 836                  29.5
OR                  183 / 538                  34.0
OM                  218 / 404                  54.0
CR                  236 / 829                  28.5
CM                  112 / 333                  33.6
B(C,O)              214 / 21736                1.0
(C,O)P              471 / 1584                 29.7
(C,O)R              2062 / 6116                33.7
(C,O)M              406 / 898                  45.2
PR                  17 / 573                   3.0
PM                  109 / 405                  26.9
Source: Fig. 11, Fig. 12 - www.nature.com/scitable/content/The-linear-arrangement-of-six-sex-linked-16655
11. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
Limitations of two-point linkage
analysis
- Consider that 2 genes are far enough
apart that 2 crossovers (XOs) occur
between them (occasionally) and
involves:
i. same two nonsister chromatids for
both (Fig. 13)
ii. different nonsister chromatids for
both (Fig. 14)
- Result: either underestimation or
overestimation of RF
Fig. 13. Double crossover (same two chromatids). Gametes: AB, AB, ab, ab
Fig. 14. Double crossover (different chromatids). Gametes: Ab, Ab, aB, aB
12. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
The three point test cross
- Using trihybrid crosses
- more efficient; detects double XOs
- allows calculation of XO interference
Example (Fig. 15):
i. First, test linkage. Here, they are
linked
ii. Most frequent are parental types
iii. Four single crossovers (SCOs) and
two double crossovers (DCOs)
Cross: triple heterozygote (X- Z+ Y+ / X+ Z- Y-) x triple homozygous (X- Z- Y- / X- Z- Y-)
Offspring phenotypes   No. of individuals   Parental/Recombinant   XO type
X+ Y- Z+               1                    Recombinant            DCO
X- Y+ Z+               440                  Parental
X- Y- Z+               26                   Recombinant            SCO #1
X- Y- Z-               61                   Recombinant            SCO #2
X+ Y+ Z-               32                   Recombinant            SCO #1
X+ Y- Z-               442                  Parental
X+ Y+ Z+               58                   Recombinant            SCO #2
X- Y+ Z-               2                    Recombinant            DCO
Total                  1062
Fig. 15. Three point test cross freq.
13. 1. A peek into the history of linkage mapping
1.2. Early genetic linkage maps
Example (Fig. 16) continued:
iv. Compare either parental type to
double XO types
v. Conclusion: gene Z is in center
vi. Map distance (X-Z)
= [SCO (X-Z) + DCOs]*100/total
vii. Coefficient of coincidence (C)
= observed DCO freq./expected DCO
freq.
where, expected DCO freq
= (X-Z recombination freq. * Z-Y recombination freq.)
viii. Interference = 1 - C
Fig. 16. Three point test cross freq. (same cross and offspring counts as Fig. 15)
Comparison of parental (P) and double crossover (DCO) types (S = same allele, D = different allele):
Type      X    Y    Z
P         X-   Y+   Z+
DCO       X+   Y-   Z+
Compare   D    D    S
P         X+   Y-   Z-
DCO       X+   Y-   Z+
Compare   S    S    D
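The three-point analysis can be worked through numerically with the offspring counts from the table. This is a minimal sketch, not a general-purpose tool: the helper names are hypothetical, and the expected DCO frequency is taken as the product of the two interval recombination frequencies (each including the DCOs).

```python
# Offspring counts from the three-point test cross (alleles listed as X, Y, Z)
counts = {
    ("X+", "Y-", "Z+"): 1,
    ("X-", "Y+", "Z+"): 440,
    ("X-", "Y-", "Z+"): 26,
    ("X-", "Y-", "Z-"): 61,
    ("X+", "Y+", "Z-"): 32,
    ("X+", "Y-", "Z-"): 442,
    ("X+", "Y+", "Z+"): 58,
    ("X-", "Y+", "Z-"): 2,
}
total = sum(counts.values())
ranked = sorted(counts, key=counts.get, reverse=True)
parental, dco = ranked[:2], ranked[-2:]  # most and least frequent classes
GENE_INDEX = {"X": 0, "Y": 1, "Z": 2}


def middle_gene(parental_type, dco_types):
    """The middle gene is the only one that differs between a parental
    type and the matching double-crossover type."""
    for dco_type in dco_types:
        diff = [p[0] for p, d in zip(parental_type, dco_type) if p != d]
        if len(diff) == 1:
            return diff[0]
    raise ValueError("no DCO type differs from the parental type at one gene")


def is_recombinant(offspring, g1, g2):
    """True if the allele pair at g1 and g2 matches neither parental type."""
    i, j = GENE_INDEX[g1], GENE_INDEX[g2]
    return all((offspring[i], offspring[j]) != (p[i], p[j]) for p in parental)


middle = middle_gene(parental[0], dco)
rf = {
    g: sum(n for t, n in counts.items() if is_recombinant(t, g, middle)) / total
    for g in GENE_INDEX if g != middle
}
observed_dco = sum(counts[t] for t in dco) / total
expected_dco = rf["X"] * rf["Y"]
coincidence = observed_dco / expected_dco
interference = 1 - coincidence
```

With these counts the sketch places Z in the middle, gives map distances of about 11.5 (X-Z) and 5.7 (Z-Y), a coefficient of coincidence of about 0.43 and an interference of about 0.57.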
14. 1. A peek into the history of linkage mapping
1.3. Mapping functions
- “for more than three loci,
relationship among possible
recombination fractions is complex”
- “RFs between loci flanking a region
are not simple sum of recombination
fractions for adjacent loci within the
region”
- “conversion of recombination
fractions to additive map distances
requires mapping functions” (Fig. 17):
i. Haldane
ii. Kosambi
Fig. 17. Table: Haldane and Kosambi
mapping functions. Chart:
comparison of mapping functions.
“r” is recombination fraction and
“d” is map distance.
Source: Ben Hui Liu, Statistical Genomics; Rongling Wu et al., Statistical Genetics of Quantitative Traits
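The two mapping functions named above have standard closed forms, sketched here with map distance in cM. The function names are hypothetical; r must be below 0.5 because both functions diverge there.

```python
import math


def haldane_cm(r):
    """Haldane (no interference): d = -50 * ln(1 - 2r), in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi (some interference): d = 25 * ln((1 + 2r) / (1 - 2r)), in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def haldane_r(d_cm):
    """Inverse Haldane function: r = 0.5 * (1 - exp(-d / 50))."""
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))


d_haldane = haldane_cm(0.1)  # about 11.2 cM
d_kosambi = kosambi_cm(0.1)  # about 10.1 cM
```

For small r both functions are close to 100 * r; they diverge as r grows, with Haldane giving the larger distance.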
15. 1. A peek into the history of linkage mapping
Summary:
- Paucity of visible natural markers
(phenotypic mutants)
- Radiation mutants offered additional traits,
but lethality and sterility were problems
- Nevertheless, two point and three point
linkage maps persisted for several decades
(~70 years)
- Example:
i. tomato: 258 morphological and
physiological markers (Rick 1975)
Fig. 18. An illustration of a tomato
linkage map made in 1952
Source: Fig. 18 – An introduction to Genetic Analysis, 5th edition.
16. 2. Molecular era and revolution in genetic linkage mapping
2.1. Molecular markers
- gel electrophoresis brought isozyme markers into
the picture
- restriction endonucleases and Southern blot
techniques brought RFLPs
- DNA sequencing and PCR brought SSRs and
SNPs
- virtually unlimited number of “visible markers”
- gaps in genetic linkage maps could be filled
- comparative mapping, gene cloning, QTL
analysis and MAS could be done
Fig. 19. Classes of molecular
markers (RFLP, SSR)
Source: Fig. 19 - nature.berkeley.edu/brunslab/tour/tour2.html
17. 2. Molecular era and revolution in genetic linkage mapping
2.2. Mapping populations in
plants - considerations:
1st: marker polymorphism
- adequate polymorphic markers
between parents
- contrasting traits of interest
2nd: reproductive mode
- If inbreeding is a possibility:
F2, recombinant inbred lines
(RIL), backcross (BC)
- Mostly outcrossing (or self-
incompatible), long generation
time:
pseudo-testcross, backcross
Fig. 20a. F2 population
Fig. 20b. RIL population
Fig. 20c. BC population
Fig. 20d. pseudo-testcross population
Source: Fig. 20 - K. Meksem and G. Kahl, The Handbook of Plant Genome Mapping
18. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Steps:
i. Data generation: genotype mapping population and prepare input format
for mapping
ii. Calculating recombination fractions (RFs): maximum likelihood estimates
of pair-wise RFs
iii. Locus grouping: grouping of markers into prospective linkage groups based
on linkage (maximum recombination fraction) and LOD (minimum limit of
support) thresholds
iv. Locus ordering: finding the best possible order based on highest multi
point likelihood (LOD) among different probable orders
v. Multilocus distance estimation
19. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
i. Data generation:
mapmaker input file format (Fig. 21)
Type of cross: F2 intercross, F2 backcross, F3 self, RI self, RI sib
Genotype scores (default symbols):
A : homozygous for parent A
H : heterozygous
B : homozygous for parent B
C : not homozygous for parent A
D : not homozygous for parent B
- : missing
Other fields labeled in the figure: population size, number of markers, marker names, scores
Fig. 21. MapMaker input format
20. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
ii. Calculating recombination fractions (RFs): in backcross mating design (BC1)
- progenies can be distinctly
categorized into parental
or recombinant (Fig. 22a)
- recombination fraction is
simply the frequency of
recombinant type
(Fig 22b)
Fig. 22a. Freq. of gametes in BC mating
Fig. 22b. RF estimation is plain
and simple for a
backcross mating design
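For a backcross, the estimate and its LOD support can be computed directly from the counts. The sketch below assumes a binomial likelihood for the number of recombinants and compares the fitted r against free recombination (r = 0.5); the counts and function names are hypothetical.

```python
import math


def log10_likelihood(r, n_recombinant, n_total):
    """log10 of r^k * (1 - r)^(n - k), skipping zero-count terms."""
    n_parental = n_total - n_recombinant
    value = 0.0
    if n_recombinant > 0:
        value += n_recombinant * math.log10(r)
    if n_parental > 0:
        value += n_parental * math.log10(1.0 - r)
    return value


def backcross_rf_and_lod(n_recombinant, n_total):
    """Return (r_hat, LOD) where LOD = log10[L(r_hat) / L(0.5)]."""
    r_hat = n_recombinant / n_total
    lod = (log10_likelihood(r_hat, n_recombinant, n_total)
           - log10_likelihood(0.5, n_recombinant, n_total))
    return r_hat, lod


# Hypothetical backcross: 20 recombinants among 100 progeny
r_hat, lod = backcross_rf_and_lod(20, 100)
```

For these hypothetical counts r_hat is 0.2 and the LOD is about 8.4, well above a threshold of 3.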
21. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
ii. Calculating recombination fractions (RFs): in F2 mating design (Fig. 23a)
- progenies cannot be distinctly
categorized. For illustration, four
possible genotypes shown in Fig. 23b
belong to same genotype class
A1A2B1B2, but may come from parental
gametes without XO or recombinant
gametes (with XO) in both parents
Fig. 23a. F2 mating design and F2 genotypes
Fig. 23b. The counts (in parenthesis)
and frequencies of the 16
possible genotypes in an F2
family
22. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
ii. Calculating recombination fractions (RFs): in F2 mating design
- 16 possible genotypes coalesce into 9 observable genotypic classes
Fig. 24. Frequencies of the nine observed genotypes in
an F2 population
23. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
ii. Calculating recombination fractions (RFs): in F2 mating design
- likelihood function for estimating RF (r) (Fig. 25)
- “Maximum likelihood for r is
obtained by setting S(r) = 0 and
solving for r”
- “however, there is no explicit
solution for r”
- different ways to invoke iterative
algorithm to solve for r:
a. Grid search
b. Newton-Raphson method
Fig. 25. Likelihood function of r
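A grid search is the simpler of the two iterative options. The sketch below assumes codominant markers in coupling phase and pools the nine F2 genotypic classes into four groups with the same dependence on r: parental homozygotes (AABB, aabb), recombinant homozygotes (AAbb, aaBB), single heterozygotes (AABb, AaBB, Aabb, aaBb) and the double heterozygote (AaBb). The counts and function names are hypothetical.

```python
import math


def f2_log_likelihood(r, n_par_hom, n_rec_hom, n_single_het, n_double_het):
    """Log-likelihood of r for pooled F2 classes (constant terms dropped).

    Class probabilities are proportional to (1 - r)^2, r^2, r(1 - r)
    and (1 - r)^2 + r^2, respectively.
    """
    return (2 * n_par_hom * math.log(1 - r)
            + 2 * n_rec_hom * math.log(r)
            + n_single_het * (math.log(r) + math.log(1 - r))
            + n_double_het * math.log((1 - r) ** 2 + r ** 2))


def grid_search_r(class_counts, step=0.001):
    """Evaluate the log-likelihood on a grid over (0, 0.5] and keep the best r."""
    grid = [i * step for i in range(1, int(round(0.5 / step)) + 1)]
    return max(grid, key=lambda r: f2_log_likelihood(r, *class_counts))


# Hypothetical counts for 1000 F2 plants
r_hat = grid_search_r((320, 20, 320, 340))
```

The hypothetical counts equal the class frequencies expected at r = 0.2, so the search returns a value at or very near 0.2. A Newton-Raphson search would refine the same maximum using derivatives of this log-likelihood.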
24. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
iii. Locus grouping :
- MapMaker’s “GROUP” command builds preliminary linkage groups based on
maximum-likelihood estimates of RF and corresponding LOD score between
marker pairs
- maximum allowable RF and minimum LOD score thresholds can be manually
updated to track changes in grouping structure with corresponding changes in
thresholds
- finally, linkage groups are formed by marker associations. For example, if A is
linked to B, and B is linked to C, all three belong to a group (remember, RF and
LOD thresholds are there for minimizing spurious linkages)
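The transitive grouping rule (A linked to B, B linked to C, so all three share a group) can be sketched with a union-find structure. This is not MapMaker's code: the function name, thresholds and pairwise values are all hypothetical.

```python
def group_loci(markers, pairwise, max_rf=0.4, min_lod=3.0):
    """Merge markers into linkage groups.

    pairwise maps (marker_a, marker_b) to (rf, lod). Two markers are joined
    when rf <= max_rf and lod >= min_lod; groups follow by transitivity.
    """
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent links to the group representative (path halving)
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), (rf, lod) in pairwise.items():
        if rf <= max_rf and lod >= min_lod:
            parent[find(a)] = find(b)

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical two-point results: (rf, lod)
pairwise = {
    ("A", "B"): (0.10, 8.0),
    ("B", "C"): (0.15, 6.0),
    ("A", "C"): (0.24, 2.5),   # fails the LOD threshold on its own
    ("C", "D"): (0.48, 0.1),   # unlinked
    ("D", "E"): (0.05, 12.0),
}
groups = group_loci(["A", "B", "C", "D", "E"], pairwise)
```

Here A and C end up in the same group through B even though their own LOD is below the threshold, which illustrates why the thresholds matter for avoiding spurious chains.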
25. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
iv. Locus ordering:
- “ ordering is the central problem in linkage mapping, and also the most
interesting in the sense that for groups of even modest size there is no sure
way to find the best (N! / 2) possible order”
- MapMaker’s “COMPARE” command is exhaustive - computes maximum
likelihood score for all possible orders and reports a subset of most likely ones
- however, ordering more than 5-7 markers with “COMPARE” is not practical
(time issue!)
Source: Meksem and Kahl, The Handbook of Plant Genome Mapping
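An exhaustive search over the N!/2 orders can be sketched for a handful of loci. Note the swapped-in criterion: instead of MapMaker's multipoint likelihood, this sketch ranks orders by the minimum sum of adjacent recombination frequencies, a simpler stand-in. The data are the recombination percentages for B, (C,O), P and R from Sturtevant's table (Fig. 12); function names are hypothetical.

```python
from itertools import permutations
from math import factorial

# Pairwise recombination percentages from Sturtevant's table; "CO" is (C,O)
rf = {
    frozenset(("B", "CO")): 1.0,
    frozenset(("B", "P")): 32.2,
    frozenset(("B", "R")): 35.5,
    frozenset(("CO", "P")): 29.7,
    frozenset(("CO", "R")): 33.7,
    frozenset(("P", "R")): 3.0,
}


def sum_adjacent_rf(order):
    """Sum of recombination percentages between neighbouring loci."""
    return sum(rf[frozenset(pair)] for pair in zip(order, order[1:]))


def candidate_orders(loci):
    """All orders, keeping one orientation of each (an order and its
    reverse describe the same map), which leaves N!/2 candidates."""
    return [p for p in permutations(loci) if p[0] < p[-1]]


loci = ["B", "CO", "P", "R"]
orders = candidate_orders(loci)
best = min(orders, key=sum_adjacent_rf)
n_orders_for_10_loci = factorial(10) // 2  # 1,814,400 orders
```

For four loci there are only 12 orders, and the best one under this criterion is B, (C,O), P, R. The count for 10 loci shows why exhaustive comparison stops being practical beyond a few markers.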
26. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
iv. Locus ordering:
- therefore, have to resort to faster algorithms. For example, MapMaker’s
“ORDER” command:
a. identifies the most informative subset of markers (default 5 markers)
b. performs exhaustive order search (akin to COMPARE) and finds the best one
c. tries to add remaining markers individually (at default RF = 0.5 and LOD =
3.0)
d. drops LOD threshold to 2.0 and tries remaining ones
e. in case markers still cannot be assigned a particular position, reports as such
f. such markers can be manually tried with the “TRY” command and dropped if that fails
Source: Meksem and Kahl, The Handbook of Plant Genome Mapping
27. 2. Molecular era and revolution in genetic linkage mapping
2.3. Methods and tools for linkage mapping in plants
Detailed procedural discourse on MapMaker
v. Multipoint distance estimation:
- MapMaker uses MAP command for multipoint estimates (not two-point
estimates)
- it employs EM algorithm (expectation-maximization algorithm), where
mutually dependent unknown parameters are alternately updated to converge
to a maximum.
- for example, an initial estimate (two-point) of r (θold = θ1, θ2, … θl-1, where l is
the number of loci) is used to compute the expected number of recombinant types
for each interval (E step)
- (M step): using the new expected values, the MLE θnew is computed
- E and M steps are iterated until θnew ≈ θold (the likelihood converges to a maximum)
- map distances are calculated using different mapping functions (default
Haldane)
Source: Ben Hui Liu, Statistical Genomics
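The E and M steps can be illustrated on a much smaller problem than MapMaker's multipoint case: a single interval in an F2, where the only ambiguity is whether a double heterozygote (AaBb) carries two parental or two recombinant gametes. This is a sketch under that simplification, using the same pooled classes and hypothetical counts as a two-locus F2 with codominant markers; the function name is hypothetical.

```python
def em_f2_r(n_par_hom, n_rec_hom, n_single_het, n_double_het,
            r=0.25, tol=1e-10, max_iter=1000):
    """EM estimate of r for one interval in an F2 population.

    Classes: parental homozygotes (0 recombinant gametes), recombinant
    homozygotes (2), single heterozygotes (1), and double heterozygotes
    (0 or 2, unobserved).
    """
    n_gametes = 2 * (n_par_hom + n_rec_hom + n_single_het + n_double_het)
    for _ in range(max_iter):
        # E step: probability that a double heterozygote arose from two
        # recombinant gametes, given the current r
        p_two_rec = r ** 2 / ((1 - r) ** 2 + r ** 2)
        expected_rec = (2 * n_rec_hom + n_single_het
                        + 2 * n_double_het * p_two_rec)
        # M step: r is the expected fraction of recombinant gametes
        r_new = expected_rec / n_gametes
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r


# Hypothetical counts for 1000 F2 plants
r_em = em_f2_r(320, 20, 320, 340)
```

For these counts the iterations settle at about 0.2. The multipoint version applies the same alternation to all intervals at once.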
28. 2. Molecular era and revolution in genetic linkage mapping
Revisiting tomato genetic linkage maps:
- Example:
Tomato: (Sim et al. 2012)
Fig. 26a and 26b
- 7,666 SNPs
Fig. 26a. SNP distribution
Fig. 26b. Two tomato linkage maps compared to draft genome assembly
Source: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0040563
29. 2. Molecular era and revolution in genetic linkage mapping
2.4. Mapping polyploid genomes
- Allopolyploids show disomic segregation.
Hence, linkage mapping in allopolyploids is
similar to diploid linkage mapping
- Autopolyploids (e.g., potato, sugarcane, etc.)
show polysomic segregation (Fig. 27a).
Hence, linkage mapping in autopolyploids
employs different mapping techniques
- For example, single dose markers (SDMs)
segregating in 1:1 ratio (Fig. 27b) are used in
the pseudo-testcross mapping strategy
- Also, biparental and double-dose markers
can be integrated using TetraploidMap
software
Fig. 27a. Single locus segregation (autotetraploid)
Fig. 27b. Segregation of a SDM: Aaaa x aaaa gives 1/2 Aaaa : 1/2 aaaa
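Whether a marker behaves as a single dose marker can be checked with a chi-square goodness-of-fit test against the 1:1 ratio. This is a minimal sketch with hypothetical counts and names; 3.841 is the standard 5% critical value for one degree of freedom.

```python
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_1_to_1(n_present, n_absent):
    """Goodness-of-fit statistic for a 1:1 (presence : absence) ratio."""
    expected = (n_present + n_absent) / 2
    return ((n_present - expected) ** 2 + (n_absent - expected) ** 2) / expected


def fits_single_dose(n_present, n_absent):
    """True when the 1:1 hypothesis is not rejected at the 5% level."""
    return chi_square_1_to_1(n_present, n_absent) < CRITICAL_5PCT_DF1


# Hypothetical marker scores in 100 progeny
chi2_a = chi_square_1_to_1(52, 48)  # 0.16: consistent with 1:1
chi2_b = chi_square_1_to_1(70, 30)  # 16.0: departs from 1:1
```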
30. 2. Molecular era and revolution in genetic linkage mapping
2.4. Mapping polyploid genomes
- Example TetraploidMap: four homologous chromosomes and a consensus
map
Source: TetraploidMap manual
31. Linkage Mapping
Summary
i. Genetic linkage maps were originally built to map phenotypic
mutants
ii. Modern linkage maps use molecular markers (predominantly, DNA
markers)
iii. Different types of mapping populations are used
iv. Mapping studies in diploids and allopolyploids use similar tools and
techniques
v. Linkage maps in autopolyploids necessitate different mapping
strategies
vi. Linkage maps are useful for
- tagging markers along chromosomes
- identifying markers linked to genes and cloning genes
- identifying quantitative trait loci for traits of interest
- marker assisted selection
- comparative mapping and evolutionary studies
32. Lecture Outline: QTL Analysis
3. QTL mapping: models and methods
3.1. Single QTL model
3.1.1. Single marker analysis (SMA)
- t-tests, ANOVA, linear regression
3.1.2. Simple interval mapping (SIM)
3.2. Multiple QTL model
3.2.1. Multiple regression
3.2.2. Composite interval mapping (CIM)
3.3. QTL mapping in polyploid genomes
33. 3. QTL Mapping: Models and Methods
3.1. Single QTL model
- Assessing marker-trait associations at individual marker loci
- gene effects for single QTL model:
Backcross: g = 0.5 (µ1 - µ2), where
µ1 = mean of the homozygous class
µ2 = mean of the heterozygous class
F2: additive (α) = 0.5 (µ1 - µ3) and
dominance (d) = 0.5 (2µ2 - µ1 - µ3), where
µ3 = mean of the class homozygous for parent B alleles
- Employs single marker analysis (SMA) techniques
Source: Ben Hui Liu, Statistical Genomics
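The gene-effect formulas above translate directly into code. The class means below are hypothetical, as are the function names.

```python
def backcross_effect(mu1, mu2):
    """g = 0.5 * (mu1 - mu2): homozygous versus heterozygous class means."""
    return 0.5 * (mu1 - mu2)


def f2_effects(mu1, mu2, mu3):
    """Return (additive, dominance) effects from the three F2 class means."""
    additive = 0.5 * (mu1 - mu3)
    dominance = 0.5 * (2 * mu2 - mu1 - mu3)
    return additive, dominance


# Hypothetical class means for AA, Aa and aa
additive, dominance = f2_effects(12.0, 11.0, 8.0)  # (2.0, 1.0)
g = backcross_effect(12.0, 11.0)                   # 0.5
```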
34. 3.1.1. Single marker analysis (SMA)
- based on linear model:
yj = µ + f (markerj) + ɛj, where
yj is trait value of the jth individual in the population
µ is population mean
f (markerj) is a function of marker genotype
ɛj is the residual associated with the jth individual
Different methods:
a. marker genotypes treated as classification variable
- for a backcross (2 genotypes): use t-test
- for F2 population (up to 3 genotypes): use ANOVA
b. marker genotypes treated as dummy variables
- use marker-trait regression
c. likelihood ratio test and maximum likelihood estimation
Source: Ben Hui Liu, Statistical Genomics
35. 3.1.1. SMA
- t-test and ANOVA
Steps (given alleles A and a at a marker locus):
a. sort marker genotype classes into groups
- “AA” and “Aa” in backcross; “AA”, “Aa”, and
“aa” in F2
b. test for a significant difference in means
- t statistic (in backcross), F statistic (in F2)
Fig. 27. One way analysis
- Linear regression approach
BC:
yj = β0 + β1xj + ɛj, where
yj is the trait value for the jth individual in
the population, xj is the dummy variable
taking 1 if the individual is AA and -1 for
Aa. β0 is the intercept for the regression
which is the overall mean for the trait. β1
is the slope for the regression line and ɛj
is the random error.
F2:
yj = β0 + β1x1j + β2x2j + ɛj, where
yj is the trait value for the jth individual in the population,
x1j is the dummy variable for the marker additive effect
taking 1, 0, and -1 for marker genotypes AA, Aa and aa,
respectively. x2j is the dummy variable for the marker
dominance effect taking 0, 1, and 0 for marker genotypes
AA, Aa and aa, respectively (it must be coded differently
from x1j for β1 and β2 to be separately estimable). β0 is the
intercept for the regression. β1 and β2 are the slopes for the
additive and dominance effects, respectively. ɛj is the
random error.
Source: Ben Hui Liu, Statistical Genomics
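The backcross regression can be sketched with ordinary least squares written out by hand. The marker scores and trait values are hypothetical, as is the function name; the marker is coded 1 for AA and -1 for Aa as described above.

```python
import math


def single_marker_regression(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, t statistic for b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)  # residual variance uses n - 2 df
    return b0, b1, b1 / se_b1


# Hypothetical backcross data: marker genotype and trait value per individual
genotypes = ["AA", "AA", "AA", "AA", "Aa", "Aa", "Aa", "Aa"]
trait = [10.1, 9.8, 10.4, 10.0, 8.9, 9.2, 9.0, 8.7]
x = [1 if g == "AA" else -1 for g in genotypes]
b0, b1, t_stat = single_marker_regression(x, trait)
```

With the 1 / -1 coding the slope equals half the difference between the two genotype class means, and in simple regression the F statistic for the marker is the square of this t statistic.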
36. 3.1.1. SMA
Advantages:
1. Conceptually and computationally simple
2. Genetic linkage map information not needed
3. Easily incorporates covariates
4. Informative when markers sufficiently cover the genome
5. Can be extended to multiple regression for multiple QTL model
Limitations:
1. Location and effects of detected QTLs are confounded:
a larger QTL effect could be because the marker is close to a QTL, or
because the marker is farther from the QTL but the QTL contributes
much more to the trait
2. QTL position cannot be precisely detected
3. Power to detect QTL is low when marker density is low
4. Multiple comparison increases false positives
5. Missing genotypes are totally excluded from analysis
6. Limited ability to separate linked QTLs and
no ability to assess interacting QTLs
37. 3.1.1. SMA
Software tools
Basic statistical analysis platforms:
Excel, JMP, SAS, R, etc.
QTL mapping platforms:
WinQTLCartographer, R/qtl, JoinMap, MapMaker/QTL, etc.
Windows QTL Cartographer:
- SMA analysis fits the data to the simple linear
regression model
y = b0 + b1 x + e
- Results reported include b0, b1 and the F statistic
for each marker
- The F statistic compares the hypotheses
H0: b1 = 0 and H1: b1 ≠ 0
- The pr(F) is a measure of how much support there
is for H0
- A smaller pr(F) indicates less support for H0 and
thus more support for H1
- The likelihood ratio test statistic compares two nested
hypotheses H0 and H1 with likelihoods L0 and L1.
The “Likelihood Ratio Test Statistic” is: -2ln(L0/L1)
38. 3.1.2. Simple interval mapping (SIM)
- “Mapping Mendelian Factors Underlying Quantitative Traits
Using RFLP Linkage Maps” (Lander and Botstein 1989)
- Concept:
Based on joint segregation of a pair of adjacent markers and a
putative QTL within an interval flanked by the marker pair (Fig.
28)
Methods:
a. Likelihood approach (preferred over regression)
b. Regression approach (faster computation than ML)
Source: Ben Hui Liu, Statistical Genomics
Fig. 28. Linkage relationship of a QTL and two
flanking markers
39. 3.1.2. SIM
Likelihood approach (employed in WinQTLCart):
Terms of the likelihood, as annotated on the slide:
- the density function for the normal distribution with mean μQk and variance σ2; there are K = 1 to N genotypes
- the probability of the QTL genotype, given the jth genotypes of the flanking markers
- the likelihood of phenotypic value z, given the jth genotypes of the flanking markers
- MLE estimate under the reduced model of no QTL: μQQ = μQq = μqq
- MLE estimate under the full model including a QTL
- LOD scores (log10 of the odds ratio), or LR = 4.6 LOD
Source: Course notes, QTL mapping and Discovery
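The relation LR = 4.6 LOD follows from converting between natural and base-10 logarithms (2 ln 10 ≈ 4.6). The sketch below assumes the two models have already been fitted and their maximized natural-log likelihoods are available; the numeric values and function names are hypothetical.

```python
import math


def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """LR = -2 ln(L0 / L1), computed from natural-log likelihoods."""
    return -2.0 * (loglik_reduced - loglik_full)


def lod_score(loglik_reduced, loglik_full):
    """LOD = log10(L1 / L0), computed from natural-log likelihoods."""
    return (loglik_full - loglik_reduced) / math.log(10)


# Hypothetical maximized log-likelihoods: no-QTL model and QTL model
loglik_no_qtl, loglik_qtl = -120.0, -113.1
lr = likelihood_ratio_statistic(loglik_no_qtl, loglik_qtl)
lod = lod_score(loglik_no_qtl, loglik_qtl)
conversion = lr / lod  # equals 2 * ln(10), about 4.6
```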
40. 3.1.2. SIM
Advantages:
1. Conceptually and computationally simple
2. Genetic linkage map information not needed
3. Easily incorporates covariates
4. Informative when markers sufficiently cover the genome
5. Can be extended to multiple regression for multiple QTL model
Limitations:
1. Location and effects of detected QTLs are confounded:
a larger QTL effect could be because the marker is close to a QTL, or
because the marker is farther from the QTL but the QTL contributes
much more to the trait
2. QTL positions cannot be precisely detected
3. Power to detect QTL is low when marker density is low
4. Multiple comparison increases false positives
5. Missing genotypes are totally excluded from analysis
41. 3.2. Composite interval mapping (CIM)
Source: Course notes, QTL mapping and Discovery
Figure labels: Test Interval; Left Marker; Right Marker; Blocked Region (Cofactors)
CIM is a combination of IM and multiple regression (multiple QTL model)
- Fits both the effects of a QTL as well as the effects of covariates (subset of
selected genetic markers)
- CIM adds background loci to simple interval mapping (IM).
- It fits parameters for a target QTL in one interval while simultaneously fitting
partial regression coefficients for "background markers" to account for
variance caused by non-target QTL.
- Background markers are usually 20-40 cM apart
42. 3.2. CIM
General CIM statistical model can be written with the following terms (as annotated on the slide):
- phenotypic trait value of subject i
- overall mean
- row vector of predictor variables corresponding to the effects of the
putative QTL (Zi1α: additive effect; Zi1d: dominance effect)
- row vector of predictor variables corresponding to the rth cofactor marker
- column vector with the coefficient of the rth cofactor marker
- residual, distributed as N(0, δ2)
43. 3.2. CIM
Set of statistical models evaluated in the CIM analysis
(WinQTLCartographer):
- For backcross, recombinant inbred lines, and double haploids,
only Model 0 and Model 1 are generated and tested
- For F2 design, all four models are generated and tested
44. Comparison of SMA, SIM and CIM
Much more precise location
Source: http://solcap.msu.edu/pdf%20files/5PAA_Douches_2_Mapping_Populations.pdf
45. 3.3. QTL mapping in polyploid genomes
- Generally, QTL mapping in allopolyploid genomes is the same as in
diploids
- However, QTL mapping in autopolyploid genomes requires
different strategies
- Example:
QTL mapping in
autotetraploids using
TetraploidMap
46. QTL Analysis
Summary
- Single marker analysis (SMA) involves t-test, ANOVA, or linear
regression approach
- Interval mapping is based on joint segregation of a pair of
adjacent markers
- CIM is a combination of IM and multiple regression and is
the most desirable of the three
- QTL mapping in autopolyploids requires different analytical
strategies