Introduction

So far we have looked at selection on a single trait, but breeders are generally interested in multiple traits at the same time. An important observation here is that selection for one trait only (for example, yield) can lead to unexpected, even undesired, changes in other traits (for example, health or welfare of individuals). So, we need to pay attention to multiple traits to ensure our breeding programme will be sustainable in long-term. This correlated response to selection is ultimately affected by genetic correlations between the traits, but also correlations between the phenotypes (phenotype correlations), because we select based on phenotype information (Lande, 1979; Falconer and MacKay, 1996; Kelly, 2011). If two traits are positively correlated, selection on one-trait will at the same time change both traits in the same direction. However, when two traits are negatively correlated, selection on one trait will change the other trait in the opposite direction. Sometimes such a change (in the other direction for the correlated trait) might be desirable, other times not. Therefore, when we want to improve multiple traits with selection, we need to take into account at least three aspects: 1) the breeding objective, 2) the economic importance of a change in different traits, 3) trait genetic and phenotype variation and correlations.

In this vignette, we will highlight some of these aspects, by learning how selection on one trait can affect another trait. We will achieve this by:

Base population with two correlated traits

We will start by simulating founder maize genomes with 10 chromosomes for 100 founders and capture 100 segregating sites per chromosome.

# Clean the working environment
rm(list = ls())

# Set the default plot layout
par(mfrow = c(1, 1))

# Load AlphaSimR, simulate founder genomes, and set the SP object
library(AlphaSimR)
## Loading required package: R6
# ... this runMacs() call will take quite a bit of time to run ...
founderGenomes = runMacs(nInd = 100,
                         nChr = 10,
                         segSites = 100,
                         species = "MAIZE")
SP = SimParam$new(founderGenomes)

Now we will specify two traits that are genetically correlated. To this end, we have to provide a number of genetic parameters: two genetic means (we will use 10 for both traits), two genetic variances (we will use 1 and 2), a matrix of genetic correlations (we will use genetic correlation between traits equal to 0.6), and two heritabilities (we will use 0.5 for both traits). Also, we will define that the traits have a simple additive genetic architecture and each trait is controlled by the same 100 loci on each chromosome.

# Genetic parameters for two traits
means = c(10, 10)
vars = c(1, 2)
cors = matrix(data = c(1.0, 0.6,
                       0.6, 1.0),
                byrow = TRUE, nrow = 2, ncol = 2)
h2s = c(0.5, 0.5)

# Define the traits
SP$addTraitA(nQtlPerChr = 100, mean = means, var = vars, corA = cors)
str(SP$traits)
## List of 2
##  $ :Formal class 'TraitA' [package "AlphaSimR"] with 6 slots
##   .. ..@ addEff    : num [1:1000] -0.0393 0.1068 -0.0489 -0.0327 0.0519 ...
##   .. ..@ intercept : num 9.49
##   .. ..@ nLoci     : int 1000
##   .. ..@ lociPerChr: int [1:10] 100 100 100 100 100 100 100 100 100 100
##   .. ..@ lociLoc   : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
##   .. ..@ name      : chr "Trait1"
##  $ :Formal class 'TraitA' [package "AlphaSimR"] with 6 slots
##   .. ..@ addEff    : num [1:1000] -0.0907 0.1363 -0.0308 -0.1061 0.0266 ...
##   .. ..@ intercept : num 10.9
##   .. ..@ nLoci     : int 1000
##   .. ..@ lociPerChr: int [1:10] 100 100 100 100 100 100 100 100 100 100
##   .. ..@ lociLoc   : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
##   .. ..@ name      : chr "Trait2"

We can now create a base population and inspect that we have indeed generated two traits.

# Base population
basePop = newPop(founderGenomes)
basePop
## An object of class "Pop" 
## Ploidy: 2 
## Individuals: 100 
## Chromosomes: 10 
## Loci: 1000 
## Traits: 2
head(gv(basePop))
##         Trait1    Trait2
## [1,]  8.465496  9.100605
## [2,]  8.355808  8.709362
## [3,] 10.135319 10.301975
## [4,]  8.823296  9.045402
## [5,]  8.242311  8.744313
## [6,]  8.873968 10.560081

Let’s explore genetic relationships between the two traits.

# Explore genetic relationship between traits
plot(x = gv(basePop),
     xlab = "Genetic value - trait 1", ylab = "Genetic value - trait 2")

# Genetic covariance matrix (variances on diagonal and covariance on off-diagonal elements)
(VarG = varG(basePop))
##           Trait1    Trait2
## Trait1 1.0000000 0.9166939
## Trait2 0.9166939 2.0000000
# Genetic correlation matrix (correlation off-diagonal on off-diagonal elements)
cov2cor(VarG)
##           Trait1    Trait2
## Trait1 1.0000000 0.6482005
## Trait2 0.6482005 1.0000000
# Equivalent to the above
cor(gv(basePop))
##           Trait1    Trait2
## Trait1 1.0000000 0.6482005
## Trait2 0.6482005 1.0000000

As expected, the traits show a genetic correlation close to the defined simulation parameters. The correlation is not exactly as defined in the input due to the stochastic nature of AlphaSimR simulations and our population is also quite small, just 100 individuals.

Now, we move to exploring relationship between phenotypes. But, first we need to generate phenotypes for the two traits. For this, we use the setPheno() function.

# Phenotype the population
basePop = setPheno(basePop, h2 = h2s)
head(pheno(basePop))
##         Trait1    Trait2
## [1,]  8.610797 10.682791
## [2,]  7.443327  7.958452
## [3,]  9.797044 11.660588
## [4,]  9.160284  8.681686
## [5,]  9.388897  7.960772
## [6,] 10.061457 10.272663
# Explore phenotype relationship between traits
plot(x = pheno(basePop),
     xlab = "Phenotype value - trait 1", ylab = "Phenotype value - trait 2")

# Phenotype covariance matrix (variances on diagonal and covariance on off-diagonal elements)
(VarP = varP(basePop))
##          Trait1   Trait2
## Trait1 1.957380 1.651203
## Trait2 1.651203 4.871044
# Phenotype correlation matrix (correlation off-diagonal on off-diagonal elements)
cov2cor(VarP)
##           Trait1    Trait2
## Trait1 1.0000000 0.5347512
## Trait2 0.5347512 1.0000000

We could see above that correlation between phenotype values is smaller than between genetic values, which is due to the fact that we have implicitly assumed that the environmental components of the two traits are uncorrelated. See the argument varE in setPheno() or functions SimParam$setVarE() and SimParam$setCorE() to set the desired level of environmental correlation. These environmental correlations can have different magnitude or even different sign than genetic correlations.

help(setPheno)

Selection on a single trait and multi-trait response to selection

We will now select the best 20 individuals based on their phenotype values for the first trait and evaluate response to this selection for both traits. Of note, we are selecting individuals that have a high phenotype value for the first trait. This direction of selection is obviously always important, but particularly with multiple traits, where we could be interested in selecting different traits in different directions (you will address this in exercises) - see help about the selectInd() function (use help(selectInd)) and check its argument selectTop.

# Select individuals based on highest phenotypes for the first trait
nSelected = 20
basePopSelected = selectInd(pop = basePop,
                            nInd = nSelected,
                            use = "pheno",
                            trait = 1)
basePopSelected
## An object of class "Pop" 
## Ploidy: 2 
## Individuals: 20 
## Chromosomes: 10 
## Loci: 1000 
## Traits: 2

Let’s calculate selection differential for the two traits.

# Average phenotype value for all individuals
(meanPAll = meanP(basePop))
##    Trait1    Trait2 
## 10.086267  9.954896
# Average phenotype value for selected individuals
(meanPSel = meanP(basePopSelected))
##   Trait1   Trait2 
## 12.02408 11.66014
# Selection differential for both traits
(selDiff = meanPSel - meanPAll)
##   Trait1   Trait2 
## 1.937812 1.705246

As expected, selection on the first trait generated a positive selection differential for the first trait. It also generated a positive selection differential for the second trait, which was driven by a positive genetic correlation between the traits. The magnitude of the indirect response to selection for correlated traits will depend on genetic and phenotype covariance matrices, that is, genetic and phenotype variances for the traits and genetic and phenotype correlations between traits (Lande, 1979; Falconer and MacKay, 1996; Kelly, 2011).

To better understand this correlated response we can plot phenotype and genetic values for both traits against each other and inspect how selection on the first trait brings about correlated response in the second trait. In the plot below the purple full lines will show the mean of all individuals in the base population for both traits, while the purple dashed lines will show the mean of the selected individuals, again for both traits.

pheno = pheno(basePop)
gv = gv(basePop)
gvSel = gv(basePopSelected)
phenoSel = pheno(basePopSelected)

phenoRange = range(pheno)
gvRange = range(gv)
ranges = range(phenoRange, gvRange)

par(mfrow = c(1, 1))
plot(x = pheno[, 1], y = pheno[, 2],
     xlab = "Phenotype value - trait 1", ylab = "Phenotype value - trait 2",
     xlim = ranges, ylim = ranges, col = "black")
points(x = phenoSel[, 1], y = phenoSel[, 2], pch = 19, col = "purple")
abline(v = meanP(basePop)[1], col = "purple", lwd = 3)
abline(h = meanP(basePop)[2], col = "purple", lwd = 3) 
abline(v = meanP(basePopSelected)[1], col = "purple", lwd = 3, lty =2)
abline(h = meanP(basePopSelected)[2], col = "purple", lwd = 3, lty =2)

plot(x = gv[, 1], y = gv[, 2],
     xlab = "Genetic value - trait 1", ylab = "Genetic value - trait 2",
     xlim = ranges, ylim = ranges, col = "black")
points(x = gvSel[, 1], y = gvSel[, 2], pch = 19, col = "purple")
abline(v = meanG(basePop)[1], col = "purple", lwd = 3)
abline(h = meanG(basePop)[2], col = "purple", lwd = 3) 
abline(v = meanG(basePopSelected)[1], col = "purple", lwd = 3, lty =2)
abline(h = meanG(basePopSelected)[2], col = "purple", lwd = 3, lty =2)

Furthermore, selection does not only change the mean for the first and second trait, it also changes their variances and correlation between the traits.

# Genetic covariances and correlation before selection (all individuals)
print("VarGBasePop")
## [1] "VarGBasePop"
(varGBasePop = varG(basePop))
##           Trait1    Trait2
## Trait1 1.0000000 0.9166939
## Trait2 0.9166939 2.0000000
print("CorGBasePop")
## [1] "CorGBasePop"
cov2cor(varGBasePop)
##           Trait1    Trait2
## Trait1 1.0000000 0.6482005
## Trait2 0.6482005 1.0000000
# Genetic covariances and correlation after selection (selected individuals)
print("VarGBasePopSelected")
## [1] "VarGBasePopSelected"
(varGbasePopSelected = varG(basePopSelected))
##           Trait1    Trait2
## Trait1 0.5797141 0.7797646
## Trait2 0.7797646 2.0290627
print("CorGBasePopSelected")
## [1] "CorGBasePopSelected"
cov2cor(varGbasePopSelected)
##           Trait1    Trait2
## Trait1 1.0000000 0.7189668
## Trait2 0.7189668 1.0000000
# Phenotype covariances and correlation before selection (all individuals)
print("VarPBasePop")
## [1] "VarPBasePop"
(varPBasePop = varP(basePop))
##          Trait1   Trait2
## Trait1 1.957380 1.651203
## Trait2 1.651203 4.871044
print("CorPBasePop")
## [1] "CorPBasePop"
cov2cor(varPBasePop)
##           Trait1    Trait2
## Trait1 1.0000000 0.5347512
## Trait2 0.5347512 1.0000000
# Phenotype covariances and correlation after selection (selected individuals)
print("VarPBasePopSelected")
## [1] "VarPBasePopSelected"
(varPbasePopSelected = varP(basePopSelected))
##           Trait1    Trait2
## Trait1 0.3273143 0.4759985
## Trait2 0.4759985 3.4905234
print("CorPBasePopSelected")
## [1] "CorPBasePopSelected"
cov2cor(varPbasePopSelected)
##           Trait1    Trait2
## Trait1 1.0000000 0.4453257
## Trait2 0.4453257 1.0000000

EXTRA: Genetic change within and between generations

As mentioned above, when traits are correlated, response to selection is a function of genetic and phenotype covariance matrices, selection direction, and selection pressure we put on the traits. Expected within-generation change in genetic values for \(n\) traits (\(\boldsymbol{\Delta G}\)) can be predicted with a multi-trait version of the breeder’s equation (Lande, 1979; Kelly, 2011) by multiplying the \(n \times n\) genetic covariance matrix (\(\boldsymbol{V}_G\)) with the \(n \times n\) inverse phenotype covariance matrix (\(\boldsymbol{V}^{-1}_P\)) and with the \(n \times 1\) selection differential vector (\(\boldsymbol{S}\)):

\[\boldsymbol{\Delta G} = \boldsymbol{V}_G \boldsymbol{V}^{-1}_P \boldsymbol{S}.\]

# Genetic covariance matrix
(Vg = varG(basePop))
##           Trait1    Trait2
## Trait1 1.0000000 0.9166939
## Trait2 0.9166939 2.0000000
# Phenotype covariance matrix
(Vp = varP(basePop))
##          Trait1   Trait2
## Trait1 1.957380 1.651203
## Trait2 1.651203 4.871044
# Selection differential vector
(selDiff = meanPSel - meanPAll)
##   Trait1   Trait2 
## 1.937812 1.705246
# Expected genetic change (predicted from the multi-trait breeder's equation)
(deltaGExpected = c(Vg %*% solve(Vp) %*% selDiff))
## [1] 0.9914861 0.9324119
# Observed genetic change
(deltaGObserved = meanG(basePopSelected) - meanG(basePop))
##    Trait1    Trait2 
## 0.9427463 1.2215312

How accurately the breeder’s equation predicts the response to selection will depend on how well the assumptions of the equation are met. Importantly, this equation only predicts the within-generation change in genetic means, that is, the change between the genetic mean of all individuals and the genetic mean of selected individuals.

The between-generation change in genetic means will depend furthermore on recombination and segregation of the genomes of selected individuals and how these two processes impact the correlation between traits. To evaluate this, we will create a new generation by crossing the selected individuals at random and evaluate trait means and covariances in the new generation.

# Cross the parents and phenotype progeny
newPop = randCross(basePopSelected, nCrosses = nInd(basePop))
newPop = setPheno(newPop, h2 = h2s)
# Observed genetic change WITHIN a generation
(deltaGWithinGeneration = meanG(basePopSelected) - meanG(basePop))
##    Trait1    Trait2 
## 0.9427463 1.2215312
# Observed genetic change BETWEEN generations
(deltaGBetweenGeneration = meanG(newPop) - meanG(basePop))
##   Trait1   Trait2 
## 0.912371 1.234301
# Evaluate trait covariances and correlatons
# ... in the base population
print("VarGBase")
## [1] "VarGBase"
(VgBase = varG(basePop))
##           Trait1    Trait2
## Trait1 1.0000000 0.9166939
## Trait2 0.9166939 2.0000000
print("CorGBase")
## [1] "CorGBase"
cov2cor(VgBase)
##           Trait1    Trait2
## Trait1 1.0000000 0.6482005
## Trait2 0.6482005 1.0000000
# ... in the base population - selected
print("VarGBaseSel")
## [1] "VarGBaseSel"
(VgBaseSel = varG(basePopSelected))
##           Trait1    Trait2
## Trait1 0.5797141 0.7797646
## Trait2 0.7797646 2.0290627
print("CorGBaseSel")
## [1] "CorGBaseSel"
cov2cor(VgBaseSel)
##           Trait1    Trait2
## Trait1 1.0000000 0.7189668
## Trait2 0.7189668 1.0000000
# ... in the new population
print("VarGNew")
## [1] "VarGNew"
(VgNew = varG(newPop))
##           Trait1    Trait2
## Trait1 0.6370037 0.6201768
## Trait2 0.6201768 1.4578467
print("CorGNew")
## [1] "CorGNew"
cov2cor(VgNew)
##           Trait1    Trait2
## Trait1 1.0000000 0.6435594
## Trait2 0.6435594 1.0000000

Now, let’s plot the phenotype and genetic values for both the base and new generation to inspect changes due to selection. Below the purple solid lines denote the average in the current population (all individuals), purple dashed lines denote the average of selected individuals, and black lines denote the average in the base population.

par(mfrow = c(2, 2),
    mar = c(4, 4, 2, 1))

pheno = pheno(basePop)
phenoSel = pheno(basePopSelected)
phenoNew = pheno(newPop)

gv = gv(basePop)
gvSel = gv(basePopSelected)
gvNew = gv(newPop)

phenoRange = range(c(pheno, phenoNew))
gvRange = range(c(gv, gvNew))
ranges = range(phenoRange, gvRange)

plot(x = pheno[, 1], y = pheno[, 2],
     xlab = "Phenotype value - trait 1", ylab = "Phenotype value - trait 2",
     xlim = ranges, ylim = ranges, col = "black",
     main = "Base population")
points(x = phenoSel[, 1], y = phenoSel[, 2], pch = 19, col = "purple")
abline(v = meanP(basePop)[1], col = "purple", lwd = 3)
abline(h = meanP(basePop)[2], col = "purple", lwd = 3) 
abline(v = meanP(basePopSelected)[1], col = "purple", lwd = 3, lty =2)
abline(h = meanP(basePopSelected)[2], col = "purple", lwd = 3, lty =2)

plot(x = gv[, 1], y = gv[, 2],
     xlab = "Genetic value - trait 1", ylab = "Genetic value - trait 2",
     xlim = ranges, ylim = ranges, col = "black",
     main = "Base population")
points(x = gvSel[, 1], y = gvSel[, 2], pch = 19, col = "purple")
abline(v = meanG(basePop)[1], col = "purple", lwd = 3)
abline(h = meanG(basePop)[2], col = "purple", lwd = 3) 
abline(v = meanG(basePopSelected)[1], col = "purple", lwd = 3, lty =2)
abline(h = meanG(basePopSelected)[2], col = "purple", lwd = 3, lty =2)

plot(x = phenoNew[, 1], y = phenoNew[, 2],
     xlab = "Phenotype value - trait 1", ylab = "Phenotype value - trait 2",
     xlim = ranges, ylim = ranges, col = "black",
     main = "New population")
abline(v = meanP(basePop)[1], col = "black", lwd = 3)
abline(h = meanP(basePop)[2], col = "black", lwd = 3)
abline(v = meanP(newPop)[1], col = "purple", lwd = 3)
abline(h = meanP(newPop)[2], col = "purple", lwd = 3)

plot(x = gvNew[, 1], y = gvNew[, 2],
     xlab = "Genetic value - trait 1", ylab = "Genetic value - trait 2",
     xlim = ranges, ylim = ranges, col = "black",
     main = "New population")
abline(v = meanG(basePop)[1], col = "black", lwd = 3)
abline(h = meanG(basePop)[2], col = "black", lwd = 3)
abline(v = meanG(newPop)[1], col = "purple", lwd = 3)
abline(h = meanG(newPop)[2], col = "purple", lwd = 3)

The above plots clearly show that selection on the first trait increases that trait, but also the second trait, due to positive genetic correlation between the two traits.

EXTRA: Selection over many generations

Finally, we can perform selection on the first trait over many generations and track long-term changes in genetic means, variances, and covariances (or correlations) between the two traits. For example, let’s evaluate response to selection over 10 generations. First, we need to allocate containers to store all these summaries. Since we will be storing summaries of different dimensions (vector for means, matrix for variances and covariances, and a scalar for correlation), we will use R’s list data structure.

# Allocate containers
nGenerations = 10 + 1 # +1 to store starting generation
meanGAll = vector("list", length = nGenerations)
varGAll = vector("list", length = nGenerations)
corGAll = numeric(nGenerations)

# Save the starting values
meanGAll[[1]] = meanG(basePop)
varGAll[[1]] = varG(basePop)
corGAll[1] = cov2cor(varGAll[[1]])[2, 1]

Now, we will repeat selection and crossing as we have done before between the basePop and newPop, but this time we will use R’s for loop to simulate across many generations.

# To make the for loop below simpler we will make a copy of the object basePopSelected
newPopSelected = basePopSelected

# Selection over many generations
for (generation in 1:(nGenerations - 1)) {
  # Cross parents, phenotype progeny, and select new parents
  newPop = randCross(newPopSelected, nCrosses = nInd(basePop))
  newPop = setPheno(newPop, h2 = h2s)
  newPopSelected = selectInd(pop = newPop,
                             nInd = nSelected,
                             use = "pheno",
                             trait = 1)
  # Save summaries
  meanGAll[[1 + generation]] = meanG(newPop)
  varGAll[[1 + generation]] = varG(newPop)
  corGAll[1 + generation] = cov2cor(varGAll[[1 + generation]])[2, 1]
}

Let’s now plot how the means, variances, and correlations changed over these generations.

meanGTrait1 = sapply(meanGAll, function(x) x[1])
meanGTrait2 = sapply(meanGAll, function(x) x[2])
meanRanges = range(c(meanGTrait1, meanGTrait2))

varGTrait1 = sapply(varGAll, function(x) x[1, 1])
varGTrait2 = sapply(varGAll, function(x) x[2, 2])
varRanges = range(c(varGTrait1, varGTrait2))

par(mfrow = c(1, 1))

# Plot mean of genetic values over generations
plot(x = 1:nGenerations, y = meanGTrait1, type = "l", col = "black", lwd = 3,
     xlab = "Generation", ylab = "Mean of genetic values", ylim = meanRanges)
lines(x = 1:nGenerations, y = meanGTrait2, type = "l", col = "purple", lty = 2, lwd = 3)
legend(x = "topleft", legend = c("1", "2"), title = "Trait",
       lwd = 3, lty = c(1, 2), col = c("black", "purple"))

# Plot variance of genetic values over generations
plot(x = 1:nGenerations, y = varGTrait1, type = "l", col = "black", lwd = 3,
     xlab = "Generation", ylab = "Variance of genetic values", ylim = varRanges)
lines(x = 1:nGenerations, y = varGTrait2, type = "l", col = "purple", lty = 2, lwd = 3)
legend(x = "topleft", legend = c("1", "2"), title = "Trait",
       lwd = 3, lty = c(1, 2), col = c("black", "purple"))

# Plot correlation between genetic values over generations
plot(x = 1:nGenerations, y = corGAll, type = "l", col = "black", lwd = 3,
     xlab = "Generation", ylab = "Genetic correlation", ylim = c(-1, 1))

As expected, we have seen a continued positive response to selection on both traits, even though we have selected on just the first trait. We have seen a decrease in genetic variance for both traits as well as change in correlation between the traits. The magnitude of these differences will depend on the type of selection we are performing, which we will evaluate in exercises.

References

Falconer D.S., Mackay T.F.C. (1996) Introduction to quantitative genetics. 4th edition, Longman, https://www.pearson.com/uk/educators/higher-education-educators/program/Falconer-Introduction-to-Quantitative-Genetics-4th-Edition/PGM432132.html

Kelly J.K. (2011) The breeder’s equation. Nature Education Knowledge 4(5):5, https://www.nature.com/scitable/knowledge/library/the-breeder-s-equation-24204828

Lande R. (1979) Quantitative genetic analysis of multivariate evolution applied to brain:body allometry. Evolution, 33, 402–416, https://doi.org/10.2307/2407630