Genomic selection on rice

Early generation selection in a recurrent selection breeding program within a synthetic population

Published in: Science

  1. Genomic selection on rice: early generation selection in a recurrent selection breeding program within a synthetic population. Cécile Grenier, Tuong-Vi Cao. Since 1967 / Science to cultivate change
  2. Genomic selection • Decreased genotyping costs and new statistical methods enable simultaneous estimation of all marker effects. • GS is a new form of marker-assisted selection (MAS) that estimates all marker effects across the whole genome to calculate genomic estimated breeding values (GEBVs). • Markers are not tested for significance; all markers are used in selection.
  3. Genomic selection on rice. Can genomic selection be applied to a rice synthetic population (SP) managed through recurrent selection (RS)? Can GS be adapted to Recurrent Genomic Selection (RGS)? Theme 2 – Varietal Development
  4. The breeding scheme [Diagram labels: Synthetic Population (3000 S0 plants); Candidate units; Evaluation (phenotype in target environments); Selected units; Recombination; Evaluations; Varieties]
  5. The SP-derived training and breeding populations [Diagram labels: Synthetic Population (3000 S0 plants); Extraction of 400 S0 plants; Fixation through SSD for ~350 lines; 343 S2:4 and S3:5 families; Recombination (35 plants); Training Population; Breeding Population]
  6. Testing the Feasibility of Genomic Selection through Cross-Validations. Phenotypes (Y) and Genotypes (X) of 343 families (from a SP with 10 cycles of recombination); Whole Genome Regression Model
  7. GBS technology • 6,874 SNP with MAF ≥ 2.5% (1 marker every ~57 kb) • 4,098 SNP with MAF ≥ 10.0% (1 marker every ~95 kb) • LD decay curve for chromosome 1 and MAF ≥ 10% [Figure] • At ½ of the initial r², the average extent of LD is ~0.639 Mb, i.e. at least 610 markers are required to cover the whole genome
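The marker spacings and the minimum marker count on slide 7 can be checked with a few lines of arithmetic. This is only a sketch: it assumes a rice genome size of about 390 Mb, a value the slide does not state, and the function names are illustrative.

```python
# Assumed genome size (NOT given on the slide): roughly 390 Mb for rice.
GENOME_KB = 390_000

def spacing_kb(n_markers: int, genome_kb: float = GENOME_KB) -> float:
    """Average distance between adjacent markers, in kb."""
    return genome_kb / n_markers

def min_markers(ld_extent_mb: float, genome_kb: float = GENOME_KB) -> int:
    """Number of whole LD blocks of the given extent that tile the genome."""
    return int(genome_kb / (ld_extent_mb * 1000))

spacing_maf_2_5 = spacing_kb(6874)    # SNP set with MAF >= 2.5%
spacing_maf_10 = spacing_kb(4098)     # SNP set with MAF >= 10%
markers_needed = min_markers(0.639)   # LD extent at half the initial r²

print(round(spacing_maf_2_5), round(spacing_maf_10), markers_needed)
```

Under that assumed genome size, the three values agree with the ~57 kb, ~95 kb and 610 markers quoted on the slide.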
  8. The genetic material [Figures: Heatmap (G matrix of 343 individuals with 6874 SNP); Un-rooted Neighbor Joining tree (dissimilarity matrix among 343 individuals with 6874 SNP)]
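The slide does not say how the G matrix was computed. As a hedged illustration, the sketch below builds a genomic relationship matrix with one common estimator (VanRaden's first method), assuming genotypes coded 0/1/2 and using simulated data in place of the real 343 x 6874 matrix. The function name `vanraden_g` is hypothetical.

```python
import numpy as np

def vanraden_g(M: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix from an (individuals x SNP) matrix coded 0/1/2."""
    p = M.mean(axis=0) / 2.0              # frequency of the counted allele per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))   # expected variance summed over SNP
    return Z @ Z.T / scale

rng = np.random.default_rng(0)
# Simulated genotypes (an assumption; not the data from the slides).
freqs = rng.uniform(0.1, 0.9, size=200)
M = rng.binomial(2, freqs, size=(30, 200)).astype(float)
G = vanraden_g(M)
print(G.shape)
```

A heatmap of `G` would then show blocks of related individuals. For highly inbred families a different genotype coding or scaling may be more appropriate.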
  9. The genetic material. Evaluation of the 343 families (301 S2:4 and 42 S3:5) under a lattice design with 2 repetitions. Heritabilities: panicle weight (h2 = 0.19), grain yield (h2 = 0.30), flowering date (h2 = 0.86), plant height (h2 = 0.61)
  10. Testing the Feasibility of Genomic Selection through Cross-Validations. Phenotypes (Y) and Genotypes (X) of 343 families (from a SP with 10 cycles of recombination); Whole Genome Regression Model. k-fold cross-validation: • 100 samplings of Training Population (TP) and Validation Population (VP) • 100 correlations cor(ŷ, y) between predicted and observed values in the VP • Mean of correlations: 'Predictive ability of genomic selection'
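The cross-validation procedure on slide 10 can be sketched as follows. This is a minimal illustration, not the authors' code: it uses plain ridge regression as a stand-in for the whole-genome regression models, reads "100 samplings" as repeated random splits with a fraction 1/k held out, and runs on simulated data. The penalty `lam` and all function names are assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Fit ridge regression in dual form (an n x n solve), useful when SNP >> individuals."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), y - y_mean)
    return x_mean, y_mean, Xc.T @ alpha   # marker effects are Xc' alpha

def predictive_ability(X, y, k=9, n_rep=100, lam=1.0, seed=0):
    """Mean cor(y_hat, y) in the validation set over n_rep random TP/VP splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = n // k                        # e.g. 343 // 9 = 38 validation individuals
    cors = []
    for _ in range(n_rep):
        perm = rng.permutation(n)
        val, trn = perm[:n_val], perm[n_val:]
        x_mean, y_mean, beta = ridge_fit(X[trn], y[trn], lam)
        y_hat = y_mean + (X[val] - x_mean) @ beta
        cors.append(np.corrcoef(y_hat, y[val])[0, 1])
    return float(np.mean(cors))

# Simulated toy data (hypothetical): 200 individuals, 100 SNP, high heritability.
rng = np.random.default_rng(1)
X = rng.binomial(2, 0.3, size=(200, 100)).astype(float)
g = X @ rng.normal(0.0, 1.0, 100)                 # true genetic values
y = g + rng.normal(0.0, 0.5 * g.std(), 200)       # phenotype = genetics + noise
pa = predictive_ability(X, y, k=3, n_rep=20, lam=25.0)
print(round(pa, 2))
```

The returned mean correlation corresponds to what the slide calls the predictive ability of genomic selection.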
  11. GS in rice synthetic populations
      Regression models: G-BLUP, Ridge Regression, Bayesian LASSO, Bayesian RR
      Incidence matrix: choice of SNP markers based on LD and MAF
        Limit for r²   MAF (%)   No. SNP
        r² ≤ 0.75      2.5       1758
        r² ≤ 0.75      5         1158
        r² ≤ 0.75      10        678
        r² ≤ 0.90      2.5       4314
        r² ≤ 0.90      5         3268
        r² ≤ 0.90      10        2152
        r² ≤ 1.00      2.5       6874
        r² ≤ 1.00      5         5605
        r² ≤ 1.00      10        4098
      k-fold cross-validation: a fraction 1/k of the population (n = 343) is used for validation
        k   No. ind. [tst]
        3   114
        6   57
        9   38
      Traits: FD (flowering date) {h2 = 0.86}; PH (plant height) {h2 = 0.61}; PW (panicle weight) {h2 = 0.19}; GY (grain yield) {h2 = 0.30}
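Slide 11 builds nine incidence matrices by combining an r² limit with a MAF threshold, but the slides do not describe the exact pruning algorithm. The sketch below shows one simple way to do it, under stated assumptions: genotypes coded 0/1/2, markers ordered along the genome, and a greedy rule that compares each marker only with the last one kept. The function name `filter_markers` is hypothetical.

```python
import numpy as np

def filter_markers(M, maf_min=0.025, r2_max=0.75):
    """Return indices of SNP columns kept after a MAF filter and greedy LD pruning."""
    p = M.mean(axis=0) / 2.0
    maf = np.minimum(p, 1.0 - p)
    candidates = np.flatnonzero(maf >= maf_min)   # drops rare and monomorphic SNP
    kept = []
    for j in candidates:                          # markers assumed in genome order
        if kept:
            r = np.corrcoef(M[:, kept[-1]], M[:, j])[0, 1]
            if r * r > r2_max:                    # too redundant with the last kept SNP
                continue
        kept.append(int(j))
    return np.array(kept, dtype=int)

# Toy genotypes (simulated): column 1 duplicates column 0, column 3 is monomorphic.
rng = np.random.default_rng(2)
base = rng.binomial(2, 0.4, size=(100, 3)).astype(float)
M = np.column_stack([base[:, 0], base[:, 0], base[:, 1], np.zeros(100), base[:, 2]])
kept = filter_markers(M)
print(kept)
```

Lowering `r2_max` or raising `maf_min` shrinks the marker set, which is the pattern visible in the table of SNP counts.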
  12. Statistical models for GS
      Penalized regressions – parametric linear regression models (frequentist and Bayesian)
        • Ridge Regression (RR), Best Linear Unbiased Predictors (BLUP), Least Absolute Shrinkage and Selection Operator (LASSO), G-BLUP, RR-BLUP, Bayesian RR, Bayesian LASSO…
      Non-parametric nonlinear models
        • RKHS (reproducing kernel Hilbert spaces), NN (neural networks), RBFNN (radial basis function neural networks)
      Comparison of rrBLUP, B-RR and B-LASSO
        Variable selection: rrBLUP: No; B-RR: No; B-LASSO: Yes
        Marker effects: rrBLUP and B-RR: all markers have an effect; B-LASSO: some markers have null effect
        Shrinkage of estimated effects: rrBLUP: same extent of shrinkage; B-RR: same extent of shrinkage; B-LASSO: marker-specific shrinkage
        Hyper-parameters: rrBLUP: No; B-RR: σβ²; B-LASSO: σ², λ²
        Distribution of effects: rrBLUP: Gaussian; B-RR: Gaussian; B-LASSO: double exponential
        Best for: rrBLUP and B-RR: traits controlled by many loci with small effects; B-LASSO: traits controlled by few loci varying in effect size
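The contrast between "same extent of shrinkage" and "marker-specific shrinkage" can be illustrated with the frequentist ridge and LASSO penalties in the special case of an orthonormal marker design (X'X = I), where both have closed forms. This is a simplification chosen for clarity, not the Bayesian models used in the slides; in a Bayesian LASSO the posterior means are strongly shrunk but generally not exactly zero. The effect values are made up.

```python
import numpy as np

def ridge_shrink(b_ols, lam):
    """Ridge: every effect is scaled by the same factor 1 / (1 + lam)."""
    return b_ols / (1.0 + lam)

def lasso_shrink(b_ols, lam):
    """LASSO: soft-thresholding; effects with |b| <= lam become exactly zero."""
    return np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)

# Hypothetical least-squares marker effects: two large, two small.
b_ols = np.array([2.0, -1.0, 0.3, -0.1])
b_ridge = ridge_shrink(b_ols, lam=0.5)
b_lasso = lasso_shrink(b_ols, lam=0.5)

print(b_ridge / b_ols)   # identical ratio for every marker
print(b_lasso)           # small effects removed, large ones kept
```

Ridge keeps every marker and shrinks them all proportionally, which suits many small effects. The LASSO penalty removes small effects and shrinks large ones proportionally less, which suits a few loci of varying size.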
  13. Regression model and marker effects. Bayesian LASSO (MAF ≥ 2.5%, r² ≤ 0.75) with 9-fold CV, grain yield: • cor(ŷ[tst], y[tst]) = 0.25 (test set) • cor(ŷ, y) = 0.84 [Figure: marker effects]
  14. Accuracy is a function of trait genetic architecture, heritability and, for FD, of the choice of markers. Bayesian LASSO (9 X matrices) with 9-fold CV [Figure: four panels of accuracy (cor(ŷ, y)) versus No. SNP, for grain yield {h2 = 0.30}, panicle weight {h2 = 0.19}, flowering date {h2 = 0.86} and plant height {h2 = 0.61}]
  15. Selection of markers (predictors) based on LD improved the accuracy for oligogenic traits. Flowering date (9 X matrices, 3-fold CV, Ridge Regression) [Figure: accuracy (cor(ŷ, y)) versus number of SNP, with series for r² ≤ 0.75, r² ≤ 0.80, r² ≤ 0.90 and r² ≤ 1.00, and MAF thresholds of 2.5%, 5.0%, 7.5% and 10%]
  16. Slight superiority of the Bayesian statistics. Grain yield (9 X matrices with 9-fold CV) [Figure: accuracy (cor(ŷ, y)) versus No. SNP for BL, BRR, GBLUP and RR]
  17. The RS breeding scheme [Diagram labels: Synthetic Population (3000 S0 plants); Candidate units; Evaluation (phenotype in target environments); Selected units; Recombination; Evaluations; Varieties]
  18. The RGS breeding scheme [Diagram labels: Breeding Population (3000 S0 plants); Candidate units; Whole Genome Genotyping; Genomic prediction; GEBVs; Selected units; Recombination; Promising lines; MET Evaluations; Training Population; Evaluations; GS models; New varieties]
  19. Conclusions
      • Yes, GS on a rice synthetic population is feasible.
      • Accuracies were modest, but the data came from 1 site and 1 year, and this is a promising first result.
      • A low accuracy may still be worthwhile given the cost of field evaluation, the time gained by selecting during the off-season, and the possibility of applying a stronger selection intensity.
      • Soon to come:
        - More data, more sites, and more adequate statistics (experimental design and multi-site evaluations accounted for in the model) for nonparametric nonlinear models
        - GS on the breeding population, using the entire training population to develop the genomic prediction model
        - Maximizing the benefit of GS on an earlier generation of the RS scheme (S0 generation)
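The argument that a low accuracy can still pay off follows from the standard expression for expected genetic gain per unit time, R = i · r · σA / L (selection intensity times accuracy times additive genetic standard deviation, divided by cycle length). The sketch below only illustrates that trade-off. Every input value is hypothetical and none is taken from the slides.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year: R = i * r * sigma_A / L."""
    return intensity * accuracy * sigma_a / cycle_years

# Hypothetical scenario: phenotypic selection is more accurate but slower
# and less intense; genomic selection is less accurate but faster and stronger.
phenotypic = gain_per_year(intensity=1.4, accuracy=0.55, sigma_a=1.0, cycle_years=2.0)
genomic = gain_per_year(intensity=1.75, accuracy=0.25, sigma_a=1.0, cycle_years=1.0)

print(phenotypic, genomic)
```

With these made-up inputs the shorter cycle and stronger intensity offset the lower accuracy. Real values for intensity, accuracy and cycle length would have to come from the breeding program itself.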
