Lecture 6: QTL Mapping


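Experimental Design: Crosses

[Figure: crossing scheme. P1 x P2 produces the F1. Crossing the F1 back to either parent gives the backcross designs (B1, B2); F1 x F1 gives the F2 design; continued intercrossing gives the advanced intercross design (AIC, AICk).]

Experimental Designs: Marker Analysis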

Single marker analysis

Flanking marker analysis (interval mapping)

Composite interval mapping: interval mapping plus additional markers

Multipoint mapping: uses all markers on a chromosome simultaneously

Conditional Probabilities of QTL Genotypes

The basic building block for all QTL methods is Pr(Qk | Mj), the probability of QTL genotype Qk given that the marker genotype is Mj.

Consider a QTL linked to a marker (recombination fraction = c). Cross MMQQ x mmqq. The F1 are all MQ/mq, formed from MQ and mq gametes.

The gametes produced by the F1, which form the F2, have frequencies freq(MQ) = freq(mq) = (1-c)/2 and freq(mQ) = freq(Mq) = c/2

Hence, in the F2,

Pr(MMQQ) = Pr(MQ)Pr(MQ) = (1-c)^2/4

Pr(MMQq) = 2Pr(MQ)Pr(Mq) = 2c(1-c)/4

Pr(MMqq) = Pr(Mq)Pr(Mq) = c^2/4

Why the 2? MQ from father, Mq from mother, OR MQ from mother, Mq from father

Since Pr(MM) = 1/4, the conditional probabilities become

Pr(QQ | MM) = Pr(MMQQ)/Pr(MM) = (1-c)^2

Pr(Qq | MM) = Pr(MMQq)/Pr(MM) = 2c(1-c)

Pr(qq | MM) = Pr(MMqq)/Pr(MM) = c^2
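The single-marker conditional probabilities can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `qtl_given_MM` and the value c = 0.1 are chosen here purely for illustration.

```python
def qtl_given_MM(c):
    """Return Pr(QQ|MM), Pr(Qq|MM), Pr(qq|MM) in the F2 for recombination fraction c."""
    # Gamete frequencies produced by the MQ/mq F1
    f_MQ = (1 - c) / 2
    f_Mq = c / 2
    pr_MM = 0.25  # Pr(MM) in the F2
    return {
        "QQ": f_MQ * f_MQ / pr_MM,
        "Qq": 2 * f_MQ * f_Mq / pr_MM,  # factor 2: either parent can supply MQ
        "qq": f_Mq * f_Mq / pr_MM,
    }


probs = qtl_given_MM(0.1)
print(probs)
```

The three probabilities sum to one for any c, which is a quick sanity check on the bookkeeping.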




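Genetic map

[Figure: genetic map with marker M1, the QTL Q, and marker M2 in that order. c1 is the recombination fraction between M1 and Q, c2 between Q and M2, and c12 between M1 and M2.]

No interference: c12 = c1 + c2 - 2c1c2

Complete interference: c12 = c1 + c2

2 Marker loci

Suppose the cross is M1M1QQM2M2 x m1m1qqm2m2

With no interference, the F1 gamete frequencies are Pr(M1QM2) = (1-c1)(1-c2)/2,

Pr(M1Qm2) = (1-c1)c2/2, Pr(m1QM2) = c1(1-c2)/2

Likewise, Pr(M1M2) = (1-c12)/2 = (1 - c1 - c2 + 2c1c2)/2

A little bookkeeping gives, for example for marker genotype M1M1M2M2,

Pr(QQ | M1M1M2M2) = (1-c1)^2 (1-c2)^2 / (1-c12)^2

Pr(Qq | M1M1M2M2) = 2 c1 c2 (1-c1)(1-c2) / (1-c12)^2

Pr(qq | M1M1M2M2) = c1^2 c2^2 / (1-c12)^2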
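The flanking-marker bookkeeping can be sketched in a few lines. This is an illustrative example under the no-interference assumption, for the M1M1M2M2 marker class only; the function name and the values c1 = c2 = 0.1 are assumptions made here, not taken from the slides.

```python
def qtl_given_flanking_homozygote(c1, c2):
    """Pr(QQ), Pr(Qq), Pr(qq) given marker genotype M1M1M2M2, assuming no interference."""
    # F1 gamete frequencies for the two gametes that carry both M1 and M2
    g_Q = (1 - c1) * (1 - c2) / 2  # M1 Q M2: no recombination in either interval
    g_q = c1 * c2 / 2              # M1 q M2: recombination in both intervals
    g_M1M2 = g_Q + g_q             # equals (1 - c12)/2 with c12 = c1 + c2 - 2*c1*c2
    return {
        "QQ": (g_Q / g_M1M2) ** 2,
        "Qq": 2 * g_Q * g_q / g_M1M2 ** 2,
        "qq": (g_q / g_M1M2) ** 2,
    }


flank = qtl_given_flanking_homozygote(0.1, 0.1)
print(flank)
```

With both flanking markers homozygous for the P1 alleles, the QTL genotype is QQ with high probability, since a qq or Qq genotype requires a double recombinant.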










Expected Marker Means

The expected trait mean for marker genotype Mj is just

μ_Mj = sum over k = 1..N of μ_Qk Pr(Qk | Mj)

For example, if QQ = 2a, Qq = a(1+k), qq = 0, then in the F2 of an MMQQ/mmqq cross,

(μ_MM - μ_mm)/2 = a(1-2c)

• If the trait mean is significantly different for the genotypes at a marker locus, it is linked to a QTL

• A small MM-mm difference could be (i) a tightly-linked QTL of small effect or (ii) loose linkage to a large QTL
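The confounding of effect size and linkage in single-marker means can be illustrated numerically. This is a small sketch assuming the genotypic values QQ = 2a, Qq = a(1+k), qq = 0 used above; the function name and the parameter values are hypothetical.

```python
def marker_means(a, k, c):
    """Expected F2 trait means for marker genotypes MM and mm."""
    geno_value = {"QQ": 2 * a, "Qq": a * (1 + k), "qq": 0.0}
    pr_given_MM = {"QQ": (1 - c) ** 2, "Qq": 2 * c * (1 - c), "qq": c ** 2}
    pr_given_mm = {"QQ": c ** 2, "Qq": 2 * c * (1 - c), "qq": (1 - c) ** 2}
    mu_MM = sum(geno_value[g] * pr_given_MM[g] for g in geno_value)
    mu_mm = sum(geno_value[g] * pr_given_mm[g] for g in geno_value)
    return mu_MM, mu_mm


# A completely linked QTL of effect a = 1 versus a loosely linked QTL of effect a = 2
tight = marker_means(a=1.0, k=0.0, c=0.0)
loose = marker_means(a=2.0, k=0.0, c=0.25)
half_diff_tight = (tight[0] - tight[1]) / 2
half_diff_loose = (loose[0] - loose[1]) / 2
print(half_diff_tight, half_diff_loose)
```

Both settings give the same half-difference a(1-2c), so the marker means alone cannot tell the two situations apart.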










Hence, the use of single markers provides for detection of a QTL. However, single marker means do not allow separate estimation of a and c.

Now consider using interval mapping (flanking markers). With no interference,

(μ_M1M1M2M2 - μ_m1m1m2m2)/2 = a(1-c1-c2)/(1-c1-c2+2c1c2) ≈ a(1-2c1c2)

This is essentially a, even for modest linkage

Hence, a and c can be estimated from the mean values of flanking marker genotypes

Linear Models for QTL Detection

The use of differences in the mean trait value for different marker genotypes to detect a QTL and estimate its effects is a use of linear models.

One-way ANOVA: z_ik = μ + b_i + e_ik

z_ik = value of the trait in the kth individual of marker genotype i
b_i = effect of marker genotype i on the trait value

Detection: a QTL is linked to the marker if at least one of the b_i is significantly different from zero

Estimation (QTL effect and position): This requires relating the b_i to the QTL effects and map position
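The detection step amounts to an ordinary one-way ANOVA across marker genotype classes. A minimal pure-Python sketch follows; the trait values are made up for illustration and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """groups maps marker genotype -> list of trait values.

    Returns (F, df_between, df_within) for the one-way ANOVA.
    """
    all_values = [z for values in groups.values() for z in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        mean_i = sum(values) / len(values)
        ss_between += len(values) * (mean_i - grand_mean) ** 2
        ss_within += sum((z - mean_i) ** 2 for z in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical trait values for the three F2 marker genotypes
traits = {
    "MM": [12.1, 11.8, 12.6, 12.3],
    "Mm": [11.2, 11.5, 10.9, 11.4],
    "mm": [10.1, 10.4, 9.8, 10.3],
}
f_ratio, df_b, df_w = one_way_anova_f(traits)
print(f_ratio, df_b, df_w)
```

The F-ratio would then be compared against the F distribution with the returned degrees of freedom; a large value suggests that at least one b_i differs from zero.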

Detecting epistasis

One major advantage of linear models is their flexibility. To test for epistasis between two QTLs, use an ANOVA with an interaction term

z = μ + a_i + b_k + d_ik + e

a_i = effect from marker genotype at the first marker set (can be > 1 locus)
b_k = effect from marker genotype at the second marker set
d_ik = interaction between marker genotype i in the 1st marker set and k in the 2nd marker set

• At least one of the a_i significantly different from 0: QTL linked to the first marker set

• At least one of the b_k significantly different from 0: QTL linked to the second marker set

• At least one of the d_ik significantly different from 0: interactions between QTL in sets 1 and 2

Maximum Likelihood Methods

ML methods use the entire distribution of the data, not just the marker genotype means.

More powerful than linear models, but not as flexible in extending solutions (new analysis required for each model)

Basic likelihood function:

ℓ(z | Mj) = sum over k = 1..N of φ(z, μ_Qk, σ^2) Pr(Qk | Mj)

ℓ(z | Mj) = likelihood of trait value z given marker genotype is type j
φ(z, μ_Qk, σ^2) = distribution of trait value given QTL genotype is k, which is normal with mean μ_Qk (QTL effects enter here)
Pr(Qk | Mj) = probability of QTL genotype k given marker genotype j (genetic map and linkage phase enter here)
The sum is over the N possible linked QTL genotypes

This is a mixture model

ML methods combine both detection and estimation of QTL effects/position.

The test for a linked QTL is given by the LR test

LR = -2 ln [ max ℓ_r(z) / max ℓ(z) ]

max ℓ_r(z) = maximum of the likelihood under a no-linked-QTL model
max ℓ(z) = maximum of the full likelihood

The LR score is often plotted by trying different locations for the QTL (i.e., values of c) and computing a LOD score for each
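As a rough illustration of the mixture likelihood, the sketch below evaluates the log-likelihood for F2 individuals at fixed, user-supplied parameter values and converts an LR statistic to a LOD score using LOD = LR/(2 ln 10). It does not maximize the likelihood (a real analysis would, for instance with an EM algorithm), and the data, parameter values and function names are all hypothetical.

```python
import math


def normal_pdf(z, mean, sd):
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def f2_qtl_given_marker(c):
    """Pr(QTL genotype | marker genotype) in an F2, single marker, recombination fraction c."""
    return {
        "MM": {"QQ": (1 - c) ** 2, "Qq": 2 * c * (1 - c), "qq": c ** 2},
        "Mm": {"QQ": c * (1 - c), "Qq": (1 - c) ** 2 + c ** 2, "qq": c * (1 - c)},
        "mm": {"QQ": c ** 2, "Qq": 2 * c * (1 - c), "qq": (1 - c) ** 2},
    }


def mixture_loglik(data, mu_by_qtl, sd, c):
    """Log-likelihood of (marker genotype, trait value) pairs under the mixture model."""
    pr = f2_qtl_given_marker(c)
    total = 0.0
    for marker, z in data:
        lik = sum(pr[marker][q] * normal_pdf(z, mu_by_qtl[q], sd) for q in mu_by_qtl)
        total += math.log(lik)
    return total


def lr_to_lod(lr):
    """Convert a likelihood-ratio statistic to a LOD score."""
    return lr / (2 * math.log(10))


# Hypothetical data and parameter values (not maximum likelihood estimates)
data = [("MM", 2.1), ("MM", 1.8), ("Mm", 1.1), ("mm", 0.2), ("mm", -0.1)]
ll_full = mixture_loglik(data, {"QQ": 2.0, "Qq": 1.0, "qq": 0.0}, sd=0.5, c=0.1)
ll_null = mixture_loglik(data, {"QQ": 1.0, "Qq": 1.0, "qq": 1.0}, sd=0.5, c=0.1)
lod = lr_to_lod(2 * (ll_full - ll_null))
print(ll_full, ll_null, lod)
```

Setting all three QTL means equal collapses the mixture to a single normal, which is how the no-linked-QTL model is represented here. Repeating the full-model evaluation over a grid of c values gives the kind of profile that is plotted as a QTL map.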

[Figure: a typical QTL map from a likelihood analysis]





Interval Mapping with Marker Cofactors

Consider interval mapping using the markers i and i+1. QTLs linked to these markers, but outside this interval, can contribute (falsely) to estimation of QTL position and effect

Now suppose we also add the two markers flanking the interval (i-1 and i+2)

[Figure: markers i-1, i, i+1 and i+2 along the chromosome, with the interval being mapped lying between i and i+1]

Inclusion of markers i-1 and i+2 fully accounts for any linked QTLs to the left of i-1 and the right of i+2

However, this still does not account for QTLs in the areas between i-1 and i and between i+1 and i+2

Interval mapping + marker cofactors is called Composite Interval Mapping (CIM)

CIM works by adding an additional term to the linear model, namely a regression term for each marker cofactor (the markers other than i and i+1)

CIM also (potentially) includes unlinked markers to account for QTL on other chromosomes.

Power and Repeatability: The Beavis Effect

QTLs with low power of detection tend to have their

effects overestimated, often very dramatically

As power of detection increases, the overestimation

of detected QTLs becomes far less serious

This is often called the Beavis Effect, after Bill

Beavis who first noticed this in simulation studies
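The selection bias behind the Beavis effect can be reproduced with a toy simulation. This sketch is not Beavis's original simulation design: it simply assumes each experiment yields a normally distributed effect estimate with a fixed standard error and declares a QTL "detected" when the estimate exceeds a significance threshold. All parameter values and names are illustrative.

```python
import random


def beavis_demo(true_effect, se=0.15, n_experiments=10000, threshold=1.96, seed=1):
    """Return (power, mean estimate among detected QTLs) for repeated experiments."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_experiments)]
    # Only experiments passing the significance threshold report a QTL
    detected = [e for e in estimates if abs(e) / se > threshold]
    power = len(detected) / n_experiments
    mean_detected = sum(detected) / len(detected)
    return power, mean_detected


low_power, low_mean = beavis_demo(true_effect=0.2)
high_power, high_mean = beavis_demo(true_effect=0.6)
print(low_power, low_mean / 0.2)
print(high_power, high_mean / 0.6)
```

The ratio of the mean detected estimate to the true effect is well above one in the low-power setting and close to one in the high-power setting, which is the pattern described above.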

Mapping in Outbred Populations

QTL mapping in outbred populations has far lower power compared to line crosses.

Not every individual is informative for linkage. An individual must be a double heterozygote to provide linkage information

Parents can differ in linkage phase, e.g., MQ/mq vs. Mq/mQ. Hence, one cannot pool families, but rather must analyze each parent separately.

Marker vs. QTL Informative

Can easily check to see if a parent/family is marker informative (at least one parent is a marker heterozygote).

No easy way to check if they are also QTL informative (at least one parent is a QTL heterozygote)

A fully-informative parent is both marker and QTL informative, i.e., a double heterozygote.

Types of families (considering marker information)

Fully marker informative family: MiMj x MkMl

Both parents are different heterozygotes. All offspring are informative in distinguishing alternative alleles from both parents.

Backcross family: MiMj x MkMk

One parent is a marker homozygote. All offspring are informative in distinguishing the heterozygous parent's alternative alleles.

Intercross family: MiMj x MiMj

Both parents are the same marker heterozygote. Only homozygous offspring are informative in distinguishing alternative parental alleles.

Sib Families

QTL mapping can occur in sib families. Here, one looks separately within each family for differences in trait means for individuals carrying alternative marker alleles. Hence, a separate analysis can be done for each parent in each family.

Information across families is combined using a standard nested ANOVA.

Half-sibs (common sire)

z_ijk = μ + s_i + m_ij + e_ijk

z_ijk = trait value for the kth offspring of sire i with marker genotype j
s_i = the effect of sire i
m_ij = the effect of marker j in sire i

A significant marker effect indicates linkage to a QTL

This is tested using the standard F-ratio, F = MS_m / MS_e (the marker mean square over the error mean square)

What can we say about QTL effect and position?

The marker variance is proportional to (1-2c)^2 σ_A^2.

Thus, the marker variance confounds both position and QTL effect, here measured by the additive variance of the QTL

Since σ_A^2 = 2a^2 p(1-p), we can get a small variance for a QTL of large effect (a >> 1) if one allele is rare

If 2p(1-p) is small, heterozygotes are rare (most sires are QTL homozygotes). However, if a is large, in these rare families, there is a large effect.

Hence, there is a tradeoff in getting a sufficient number of families to have a few with the QTL segregating, but also to have family sizes large enough to detect differences between parental marker alleles

The granddaughter design. Widely used in dairy cattle to improve power.

Each sire (i) produces a number of sons that are genotyped for the sire allele. Each son then produces a number of offspring in which the trait is measured

z_ijkl = μ + g_i + m_ij + s_ijk + e_ijkl

g_i = effect of sire i
m_ij = effect of marker allele j from sire i
s_ijk = effect of the kth son of sire i with sire marker allele j

Advantage: large sample size for each m_ij value.

General Pedigree Methods

Random effects (hence, variance component) method for detecting QTLs in general pedigrees

z_i = μ + A_i + A'_i + e_i

z_i = trait value for individual i
A_i = genetic effect of chromosomal region of interest
A'_i = genetic value of other (background) QTLs

The covariance between individuals i and j is thus

σ(z_i, z_j) = R_ij σ_A^2 + 2Θ_ij σ_A'^2

R_ij = fraction of chromosomal region shared IBD between individuals i and j
2Θ_ij = resemblance between relatives correction

Assume z is MVN, giving the covariance matrix as

V = R σ_A^2 + A σ_A'^2 + I σ_e^2

R is estimated from marker data; A is estimated from the pedigree

The resulting likelihood function is

ℓ(z | μ, σ_A^2, σ_A'^2, σ_e^2) = (2π)^(-n/2) |V|^(-1/2) exp[ -(1/2) (z - μ)^T V^(-1) (z - μ) ]

A significant σ_A^2 indicates a linked QTL.

Haseman-Elston Regressions

One simple test for linkage of a QTL to a marker locus is the Haseman-Elston regression, used in human genetics

The idea is simple: If a marker is linked to a QTL, then relatives that share IBD marker alleles likely share IBD QTL alleles and hence are more similar to each other than expected by chance.

The approach: regress the (squared) difference in trait value in the same sets of relatives on the fraction of IBD marker alleles they share.

For the ith pair of relatives,

Y_i = (z_i1 - z_i2)^2 = α + b π_i + e_i

π_i = fraction of marker alleles IBD in this pair of relatives. π = 0, 0.5, or 1

The expected slope is a function of the additive variance of the linked QTL, the distance c between marker and QTL, and the type of relative

This is a one-sided test, as the null hypothesis (no linkage) is b = 0 versus the alternative b < 0

Note that parent-offspring are NOT an appropriate

pair of relatives for this test. WHY?
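The Haseman-Elston regression itself is just an ordinary least-squares fit of squared trait differences on IBD sharing. The sketch below uses made-up sib-pair data and a hypothetical function name; it returns only the intercept and slope and leaves out the significance test.

```python
def he_regression(pairs):
    """pairs: list of (z1, z2, pi), with pi the fraction of marker alleles IBD.

    Returns the OLS (intercept, slope) of (z1 - z2)^2 on pi.
    """
    x = [pi for _, _, pi in pairs]
    y = [(z1 - z2) ** 2 for z1, z2, _ in pairs]
    n = len(pairs)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical sib pairs: (trait of sib 1, trait of sib 2, fraction of alleles IBD)
sib_pairs = [
    (10.0, 13.0, 0.0), (9.0, 12.5, 0.0),
    (10.0, 12.0, 0.5), (11.0, 9.5, 0.5),
    (10.5, 11.0, 1.0), (12.0, 12.4, 1.0),
]
intercept, slope = he_regression(sib_pairs)
print(intercept, slope)
```

In this toy data set pairs sharing more alleles IBD are more alike, so the fitted slope is negative, the direction expected under linkage.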

Affected Sib Pair Methods

As with the HE regression, the idea is that if the marker is linked to a QTL, individuals with more IBD marker alleles will have closer phenotypes.

Example of an allele-sharing method.

Consider a discrete phenotype (disease presence/absence).

A sib can either be affected or unaffected.

A pair of sibs can either be concordant (either doubly affected or both unaffected), or discordant (a singly-affected pair)

The IBD probabilities for a random pair of full sibs are

Pr(0 IBD) = Pr(2 IBD) = 1/4, Pr(1 IBD) = 1/2
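These null IBD probabilities follow from enumerating which parental allele each sib receives. A short sketch of that enumeration is given here; the function name is hypothetical, and both parents are assumed fully informative so that the four transmissions are equally likely and independent.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def full_sib_ibd_distribution():
    """Distribution of the number of alleles shared IBD by two full sibs."""
    counts = Counter()
    # Each sib receives paternal allele 0 or 1 and maternal allele 0 or 1
    outcomes = list(product([0, 1], repeat=4))  # (pat1, mat1, pat2, mat2)
    for pat1, mat1, pat2, mat2 in outcomes:
        shared = int(pat1 == pat2) + int(mat1 == mat2)
        counts[shared] += 1
    return {k: Fraction(v, len(outcomes)) for k, v in counts.items()}


ibd = full_sib_ibd_distribution()
print(ibd)
```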

The idea of affected sib pair (ASP) methods is to compare

this expected distribution across one (or more) classes of

phenotypes. A departure from this expectation implies

the marker is linked to a QTL

There are a huge number of versions of this simple test.

p_ij = frequency of a pair with i affected sibs sharing j marker alleles IBD

n2 = number of doubly-affected pairs
p21 = frequency of doubly-affected pairs sharing 1 allele IBD
p22 = frequency of doubly-affected pairs with 2 alleles IBD

• Compare the frequency of doubly-affected sib pairs that have both marker alleles IBD with the null value 1/4. One-sided test, as p22 > 1/4 under linkage

• Compare the mean number of IBD alleles in doubly-affected pairs with the null value of 1. One-sided test, as p21 + 2p22 > 1 under linkage
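One way to turn these two comparisons into test statistics is a normal approximation using the null variances: the indicator "2 alleles IBD" has variance (1/4)(3/4) per pair, and the number of IBD alleles has variance 1/2 per pair. The sketch below is written under those assumptions; the exact forms used in the lecture are not shown in this transcript, and the counts and function name are hypothetical.

```python
import math


def asp_tests(n20, n21, n22):
    """Normal-approximation statistics for doubly-affected sib pairs.

    n20, n21, n22 = number of pairs sharing 0, 1, 2 marker alleles IBD.
    Returns (t2, tm): the proportion test and the mean test.
    """
    n2 = n20 + n21 + n22
    p21 = n21 / n2
    p22 = n22 / n2
    # Proportion test: under no linkage p22 has mean 1/4 and variance 3/(16 n2)
    t2 = (p22 - 0.25) / math.sqrt(3 / (16 * n2))
    # Mean test: under no linkage the mean IBD count is 1 with variance 1/(2 n2)
    tm = math.sqrt(2 * n2) * (p21 + 2 * p22 - 1)
    return t2, tm


t2_null, tm_null = asp_tests(25, 50, 25)    # exactly the null proportions
t2_link, tm_link = asp_tests(10, 50, 40)    # excess sharing
print(t2_null, tm_null, t2_link, tm_link)
```

Both statistics are compared one-sidedly against a standard normal, so large positive values point toward linkage.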

Finally, maximum likelihood approaches have been suggested. In particular, the goodness-of-fit of the full distribution of IBD values in doubly-affected sibs, comparing p2i = n2i/n2 with 1/4 (i = 0, 2) or 1/2 (i = 1)

The resulting test statistic is called MLS, for Maximum LOD Score, and is given by

MLS = sum over i = 0..2 of n2i log10(p2i / π2i)

where π20 = π22 = 1/4 and π21 = 1/2 are the null values. Linkage is indicated by MLS > 3

An alternative formulation of MLS is to consider just the contribution from one parent, where (under no linkage), Pr(sibs share same parental allele IBD) = Pr(sibs don’t share same parental allele) = 1/2

MLS = n [ (1 - p1) log10( (1 - p1)/(1/2) ) + p1 log10( p1/(1/2) ) ]

p1 = fraction of sib pairs sharing parental allele IBD
n = number of sib pairs

Example: Genomic scan for type I diabetes

Marker D6S415. For sibs with diabetes, 74 pairs shared the same parental allele IBD, 60 did not

Marker D6S273. For sibs with diabetes, 92 pairs shared the same parental allele IBD, 31 did not
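The one-parent MLS for the two markers can be computed directly from the counts above. This is a minimal sketch of the binomial LOD score against the null sharing probability of 1/2; the function name is hypothetical.

```python
import math


def mls_one_parent(n_shared, n_not_shared):
    """Maximum LOD score for sharing of one parental allele, null probability 1/2."""
    n = n_shared + n_not_shared
    p1 = n_shared / n  # observed fraction of pairs sharing the parental allele IBD
    return (n_shared * math.log10(p1 / 0.5)
            + n_not_shared * math.log10((1 - p1) / 0.5))


mls_d6s415 = mls_one_parent(74, 60)
mls_d6s273 = mls_one_parent(92, 31)
print(round(mls_d6s415, 2), round(mls_d6s273, 2))
```

With these counts the score for D6S415 comes out at roughly 0.3 and the score for D6S273 at roughly 6.9, so only D6S273 exceeds the MLS > 3 criterion.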
