Best practices for data analysis when using
UMI adapters to improve variant detection
Wendy Lee, PhD
Staff Scientist
Outline
• Overview of NGS workflow that includes sample multiplexing
• Overview of workflow with xGen® Dual Index UMI Adapters—Tech
Access
• Discussion of data analysis steps:
– Extracting UMIs from sequencing reads
– Constructing consensus reads within UMI families
• Improving variant calling accuracy using consensus reads
UMI: unique molecular identifier
NGS workflow with xGen Dual Index UMI Adapters
[Figure: workflow diagram; recoverable labels: xGen Universal Blockers, xGen]
xGen Dual Index UMI Adapters—Tech Access
3-in-1 design
• Designed for Illumina sequencers
• Compatible with standard end-repair and A-tailing library
construction, including PCR-free library methods
• Dual unique sample indices reduce sample cross-talk
• Degenerate 9-base UMI is incorporated for error correction and/or
counting applications
Consensus calling reduces artifacts in sequencing data
[Figure panels: Total reads; Dedup by start/stop positions; Consensus reads (Min3). Labels: TP, "A UMI family".]
Extracting UMIs within sample index reads during
demultiplexing
Assumptions and requirements
• Sequencing data are generated from the Illumina platform
• The following tools are installed in a Linux environment:
– Picard, version 2.9.0
– Burrows-Wheeler Aligner (BWA), version 0.7.15-r1140
– Fgbio, version 0.5.0
– VarDict Java
• Access to the raw basecall data output from the sequencer
Data analysis guidelines on IDT website
www.idtdna.com/UMI-techaccess
Overall workflow
Inputs: sample sheet and Illumina basecalls
Steps D1–6: Convert base calls to short reads with UMI information while demultiplexing the NGS run
Short-read files with UMI information
Steps C1–4: Call consensus reads using UMIs
Steps P1–4: Post-consensus calling analysis
Variant calls
Extract UMIs from sample index reads
through Illumina demultiplexing workflow
Step D1: Create the sample barcode input file
Barcode_file.txt
Step D2: Create output directory for storing
the extracted barcode from the
sample index reads
Step D3: Determine the read structure
100T8B9M8B100T
Step D4: Run ExtractIlluminaBarcodes (Picard)
Extracted barcode files
Step D5: Create an input file to specify the
output BAM file associated with the sample
Library_param.txt
Step D6: Run IlluminaBasecallsToSam (Picard)
Unmapped BAM files
Sample sheet
[Figure: sample sheet example]
Steps D1 and D2: Create a barcode file containing the sample barcode
information for each sample, and create an output directory for the extracted barcodes.
Steps 1 and 2 of 6 in demultiplexing
• UMI bases are written as Ns in the barcode sequence
• The barcode file is tab-delimited
• In this example, the file is saved as /mnt/demodata/barcode_file.txt
• In this example, the output directory is /mnt/demodata/barcodes
barcode_name       library_name  barcode_sequence_1  barcode_sequence_2
20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN   GCGCATAT
20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN   CTGTACCA
20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN   GAACGGTT
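Steps D1 and D2 can also be scripted. The sketch below is only an illustration: the helper name write_barcode_file and the temporary directory standing in for /mnt/demodata are assumptions, while the column names and sample rows are copied from the example above.

```python
import csv
import os
import tempfile

# Column names as shown in the example barcode file above.
HEADER = ["barcode_name", "library_name", "barcode_sequence_1", "barcode_sequence_2"]

# Sample rows copied from the example; the 9 Ns mark the UMI positions.
SAMPLES = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGT" + "N" * 9, "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGA" + "N" * 9, "CTGTACCA"),
    ("20180326-BN573-S3", "Mix1_Rep3", "TGAGCTAG" + "N" * 9, "GAACGGTT"),
]


def write_barcode_file(base_dir, samples=SAMPLES):
    """Hypothetical helper: write barcode_file.txt (Step D1) and create
    the barcodes output directory (Step D2) under base_dir."""
    barcode_path = os.path.join(base_dir, "barcode_file.txt")
    with open(barcode_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(samples)
    barcodes_dir = os.path.join(base_dir, "barcodes")
    os.makedirs(barcodes_dir, exist_ok=True)
    return barcode_path, barcodes_dir


# A temporary directory stands in for /mnt/demodata in this sketch.
demo_dir = tempfile.mkdtemp()
barcode_path, barcodes_dir = write_barcode_file(demo_dir)
```

Writing the file programmatically guarantees real tab characters between columns, which are easy to lose when copying the table from a slide.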
Step D3: Determine the read structure for running
ExtractIlluminaBarcodes.
Step 3 of 6 in demultiplexing
For xGen Dual Index UMI Adapters—Tech Access with DNA insert of 100 bp,
use the following corresponding read structure:
100T8B9M8B100T
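T – template (insert) bases
B – sample barcode bases
M – molecular index (UMI) bases
Reading left to right: read 1 is 100T, index read 1 is 8B9M (the 8-base sample barcode followed by the 9-base UMI), index read 2 is 8B, and read 2 is 100T.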
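To make the read structure notation concrete, here is a minimal sketch that splits a read structure string into its segments and totals the cycles per segment type. The function names are hypothetical, and the parser only accepts the three codes listed above (T, B, M); it is not Picard's own parser.

```python
import re

# One segment is a cycle count followed by a type code (T, B, or M).
SEGMENT = re.compile(r"(\d+)([TBM])")


def parse_read_structure(structure):
    """Return a list of (cycle_count, code) tuples for a read structure string."""
    segments = [(int(count), code) for count, code in SEGMENT.findall(structure)]
    # Rebuild the string to make sure nothing was silently skipped.
    if "".join(f"{count}{code}" for count, code in segments) != structure:
        raise ValueError(f"unsupported read structure: {structure!r}")
    return segments


def cycles_by_code(structure):
    """Total the cycles per code, e.g. template vs. barcode vs. UMI bases."""
    totals = {"T": 0, "B": 0, "M": 0}
    for count, code in parse_read_structure(structure):
        totals[code] += count
    return totals


segments = parse_read_structure("100T8B9M8B100T")
totals = cycles_by_code("100T8B9M8B100T")
```

For 100T8B9M8B100T the totals are 200 template cycles, 16 sample barcode cycles, and 9 UMI cycles, which matches the 9-base UMI and the two 8-base sample indexes in the barcode file.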
Step 4 of 6 in demultiplexing
Step D4: Run Picard ExtractIlluminaBarcodes to extract sample barcodes.
Input:
• BARCODE_FILE: barcode file created in Step D1
• BASECALLS_DIR: directory with the sequencing basecall files
• READ_STRUCTURE: 100T8B9M8B100T from Step D3
• LANE: lane number; ExtractIlluminaBarcodes processes one lane at a time
Output:
1. A metrics file with the barcode extraction summary
2. Extracted barcodes in the output directory created in Step D2
java -Xmx4g -jar picard-2.9.0.jar ExtractIlluminaBarcodes \
BARCODE_FILE=/mnt/demodata/barcode_file.txt \
BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
READ_STRUCTURE=100T8B9M8B100T \
LANE=1 \
OUTPUT_DIR=/mnt/demodata/barcodes \
METRICS_FILE=/mnt/demodata/barcode_metrics.txt
Step D5: Create a tab-delimited file to specify the BAM file for each sample in
the sequencing run with the corresponding barcode sequence(s).
Step 5 of 6 in demultiplexing
In this example, the file is saved as /mnt/demodata/library_param.txt.
Be sure to create the output directory for the BAM files before running Step D6.
In this example, the output directory is /mnt/bam/L001.
OUTPUT                               SAMPLE_ALIAS       LIBRARY_NAME  BARCODE_1          BARCODE_2
/mnt/bam/L001/BN573-S1_unmapped.bam  20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN  GCGCATAT
/mnt/bam/L001/BN573-S2_unmapped.bam  20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN  CTGTACCA
/mnt/bam/L001/BN573-S3_unmapped.bam  20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN  GAACGGTT
/mnt/bam/L001/Unmatched.bam          Unmatched          Unmatched     N
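Because the library parameter file repeats the sample information from the barcode file, it can be derived from the same rows. The sketch below is an assumption-laden illustration: the helper name, and the rule that the BAM file name is the sample alias without its date prefix, are inferred from the example paths and are not stated in the slides. The row for unmatched reads is not generated here.

```python
import posixpath

# Column names as shown in the example library parameter file above.
HEADER = ["OUTPUT", "SAMPLE_ALIAS", "LIBRARY_NAME", "BARCODE_1", "BARCODE_2"]


def library_param_lines(barcode_rows, bam_dir):
    """Hypothetical helper: build tab-delimited library parameter lines
    from (alias, library, barcode_1, barcode_2) tuples."""
    lines = ["\t".join(HEADER)]
    for alias, library, barcode_1, barcode_2 in barcode_rows:
        # Assumed naming rule: drop the leading date, e.g.
        # 20180326-BN573-S1 -> BN573-S1_unmapped.bam
        short_name = alias.split("-", 1)[1]
        output = posixpath.join(bam_dir, short_name + "_unmapped.bam")
        lines.append("\t".join([output, alias, library, barcode_1, barcode_2]))
    return lines


rows = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGTNNNNNNNNN", "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGANNNNNNNNN", "CTGTACCA"),
]
param_lines = library_param_lines(rows, "/mnt/bam/L001")
```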
Step D6: Run IlluminaBasecallsToSam to convert sequencing
base-calls to short reads in the BAM files.
Step 6 of 6 in demultiplexing
java -Xmx4g -jar picard-2.9.0.jar IlluminaBasecallsToSam \
BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
BARCODES_DIR=/mnt/demodata/barcodes \
LANE=1 \
READ_STRUCTURE=100T8B9M8B100T \
RUN_BARCODE=180326_BN573 \
LIBRARY_PARAMS=/mnt/demodata/library_param.txt \
TMP_DIR=/mnt/tmp \
MOLECULAR_INDEX_TAG=RX \
ADAPTERS_TO_CHECK=INDEXED \
READ_GROUP_ID=BN573-S1 \
NUM_PROCESSORS=8
Notes on the parameters:
• BARCODES_DIR: directory holding the barcode files written in Step D4
• LANE: the run is processed one lane at a time
• READ_STRUCTURE: the read structure from Step D3
• RUN_BARCODE: prefixed to the read names in the output
• LIBRARY_PARAMS: the file created in Step D5
• MOLECULAR_INDEX_TAG: BAM tag that stores the UMI sequence
BAM file created by IlluminaBasecallsToSam
• The reads in the BAM file generated by IlluminaBasecallsToSam are
not yet aligned to the reference genome.
• The UMI sequence is in the RX tag.
• The UMI sequence quality is in the QX tag.
• The sequencing adapter location is in the XT tag. Adapter sequence can
be trimmed using SamToFastq in Picard tools.
An example record from the BAM file:
180326_BN573:1:1101:10008:4281 77 * 0 0 * * 0 0
ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG
CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG
RG:Z:BN573-S1 XT:i:114 QX:Z:FFFFGGGG RX:Z:GGTAAAATG
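To show where the UMI sits in such a record, the sketch below parses the text (SAM) form of the example and pulls out the optional tags. It assumes the record has been converted to tab-separated text, for instance with a tool such as samtools view; the function name is hypothetical and this is not how Picard or fgbio read BAM files internally.

```python
def parse_sam_record(line):
    """Split one SAM text line into its core fields and optional tags."""
    cols = line.rstrip("\n").split("\t")
    record = {
        "name": cols[0],
        "flag": int(cols[1]),
        "seq": cols[9],
        "qual": cols[10],
        "tags": {},
    }
    # Optional fields have the form TAG:TYPE:VALUE; type "i" is an integer.
    for item in cols[11:]:
        tag, value_type, value = item.split(":", 2)
        record["tags"][tag] = int(value) if value_type == "i" else value
    return record


# The example record from the slide, rebuilt with tab separators.
example = "\t".join([
    "180326_BN573:1:1101:10008:4281", "77", "*", "0", "0", "*", "*", "0", "0",
    "ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG",
    "CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG",
    "RG:Z:BN573-S1", "XT:i:114", "QX:Z:FFFFGGGG", "RX:Z:GGTAAAATG",
])
record = parse_sam_record(example)
umi = record["tags"]["RX"]
is_unmapped = bool(record["flag"] & 0x4)  # SAM flag bit 0x4 marks an unmapped read
```

The flag value 77 has the unmapped bit set, consistent with the statement that these reads are not yet aligned.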
Calling consensus using UMIs
Workflow for consensus calling
Input: unmapped BAM with UMI tags (UMIs extracted from the sample index during demultiplexing)
Step C1: Align reads to reference genome
Mapped BAM without UMI tags
Step C2: Include UMI tags from unmapped BAM in the mapped BAM
Mapped BAM with UMI tags
Step C3: Group reads by UMIs
Mapped BAM with UMI family tags
Step C4: Call consensus
Unmapped BAM with consensus reads
Step C1,2: Aligning reads from unmapped BAM files to reference
genome, and including the UMI tags
Steps 1 and 2 of 4 in consensus calling
The following command consists of three steps:
1. Convert BAM to FASTQ
2. Align reads using BWA-MEM
3. Include UMI tags from the unmapped BAM in the mapped BAM
java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
I=BN573-S1_unmapped.bam \
F=/dev/stdout INTERLEAVE=true \
| bwa mem -p -t 8 hg38.fa /dev/stdin \
| java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
UNMAPPED=BN573-S1_unmapped.bam ALIGNED=/dev/stdin \
O=BN573-S1_mapped.bam R=hg38.fa \
SORT_ORDER=coordinate MAX_GAPS=-1 \
ORIENTATIONS=FR
Step C3: Grouping reads by UMIs
Step 3 of 4 in consensus calling
The reads are grouped into families that share the same UMI. With
--strategy=adjacency and --edits=1, UMIs that differ by a single base can be
placed in the same family, and the family ID is written to the MI tag.
java -Xmx4g -jar fgbio.jar GroupReadsByUmi \
--input=BN573-S1_mapped.bam --output=BN573-S1_grouped.bam \
--strategy=adjacency --edits=1 --min-map-q=20 \
--assign-tag=MI
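The idea behind grouping UMIs with one allowed edit can be sketched in a few lines. This is a simplified illustration and not fgbio's implementation: it ignores mapping position and mapping quality, assumes all UMIs have the same length, and uses a count rule borrowed from the directed adjacency approach (a more abundant UMI absorbs a neighbor whose count is low enough). All names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, edits=1):
    """Assign each distinct UMI a family ID.

    A UMI joins the family of a more abundant UMI when they differ by at
    most `edits` bases and count(parent) >= 2 * count(child) - 1.
    """
    counts = Counter(umis)
    # Most abundant first; ties broken alphabetically so the result is stable.
    ordered = sorted(counts, key=lambda umi: (-counts[umi], umi))
    family = {}
    next_id = 0
    for root in ordered:
        if root in family:
            continue
        family[root] = next_id
        stack = [root]
        while stack:
            parent = stack.pop()
            for child in ordered:
                if child in family:
                    continue
                close = hamming(parent, child) <= edits
                if close and counts[parent] >= 2 * counts[child] - 1:
                    family[child] = next_id
                    stack.append(child)
        next_id += 1
    return family


observed = ["ACGTACGTA"] * 5 + ["ACGTACGTT"] + ["TTTTTTTTT"] * 3
families = group_umis(observed)
```

In this toy input, the single read with UMI ACGTACGTT is treated as a sequencing error of the abundant ACGTACGTA and joins its family, while TTTTTTTTT forms a family of its own.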
Step 4 of 4 in consensus calling
Step C4: Calling consensus
Consensus reads will be generated using fgbio’s
CallMolecularConsensusReads
java -Xmx4g -jar fgbio.jar CallMolecularConsensusReads \
--input=BN573-S1_grouped.bam \
--output=BN573-S1_ssConsensus_unmapped.bam \
--min-reads=1 \
--rejects=BN573-S1_ssConsensus_rejected.bam \
--min-input-base-quality=30 \
--read-group-id=BN573-S1
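As a rough picture of what consensus calling does within one UMI family, the sketch below takes a per-position majority vote. It is a deliberate simplification: fgbio's CallMolecularConsensusReads uses base qualities in a likelihood model rather than a plain vote, and the function name and tie rule here are assumptions made for illustration.

```python
from collections import Counter


def call_consensus(reads, min_reads=1):
    """Majority-vote consensus for the reads of one UMI family.

    Returns None when the family has fewer than min_reads reads.
    Positions where the two most common bases tie are reported as N.
    """
    if not reads or len(reads) < min_reads:
        return None
    length = min(len(read) for read in reads)  # compare only shared positions
    consensus = []
    for i in range(length):
        ranked = Counter(read[i] for read in reads).most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append("N")
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


family_reads = ["ACGT", "ACGT", "ACTT"]
consensus = call_consensus(family_reads)
```

The isolated T in the third read is outvoted, which is how an error present in only one copy of the original molecule drops out of the consensus read.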
Workflow for post consensus-calling analysis
Input: unmapped BAM with consensus reads
Step P1: Align reads to reference genome
Mapped BAM without UMI tags
Step P2: Include UMI tags from unmapped BAM in the mapped BAM
Mapped BAM with UMI tags
Step P3: Filter consensus reads
Filtered consensus BAM
Step P4: Variant calling
VCF
Steps 1 and 2 of 4 in post-consensus calling analysis
Step P1,2: Aligning reads from unmapped BAM files to reference
genome and merging the UMI tags
The following command consists of three steps:
1. Converting BAM to FASTQ
2. Aligning reads using bwa mem
3. Including UMI tags from the unmapped BAM in the mapped BAM
java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
I=BN573-S1_ssConsensus_unmapped.bam \
F=/dev/stdout INTERLEAVE=true \
| bwa mem -p -t 8 hg38.fa /dev/stdin \
| java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
UNMAPPED=BN573-S1_ssConsensus_unmapped.bam \
ALIGNED=/dev/stdin \
O=BN573-S1_ssConsensus_mapped.bam R=hg38.fa \
SORT_ORDER=coordinate MAX_GAPS=-1 \
ORIENTATIONS=FR
Step 3 of 4 in post-consensus calling analysis
Step P3: Filtering consensus reads
There are two kinds of filtering of consensus reads:
1. Masking or filtering individual bases in reads
2. Filtering reads (i.e., not writing them to the output BAM file)
java -Xmx4g -jar fgbio.jar FilterConsensusReads \
--input=BN573-S1_ssConsensus_mapped.bam \
--output=BN573-S1_ssConsensus_mapped_filtered.bam \
--min-reads=3 \
--min-base-quality=50 \
--max-no-call-fraction=0.05
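The two kinds of filtering can be illustrated with a small sketch whose parameters loosely mirror the three options in the command above. It is an assumption-based simplification, not fgbio's logic: here `depth` stands for the number of raw reads behind a consensus read, bases below the quality threshold are masked to N, and a read is dropped when too few reads support it or too many bases are N. All names are hypothetical.

```python
def mask_low_quality(bases, quals, min_base_quality):
    """Replace bases whose quality is below the threshold with N."""
    return "".join(
        base if qual >= min_base_quality else "N"
        for base, qual in zip(bases, quals)
    )


def filter_consensus_read(bases, quals, depth,
                          min_reads=3, min_base_quality=50,
                          max_no_call_fraction=0.05):
    """Return the masked read, or None if the read should be discarded."""
    if not bases or depth < min_reads:
        return None  # read-level filter: too few supporting reads
    masked = mask_low_quality(bases, quals, min_base_quality)
    if masked.count("N") / len(masked) > max_no_call_fraction:
        return None  # read-level filter: too many no-calls after masking
    return masked


bases = "ACGT" * 10                # 40-base consensus read
good_quals = [60] * 40
one_low = [40] + [60] * 39         # a single low-quality base
three_low = [40, 40, 40] + [60] * 37
kept = filter_consensus_read(bases, one_low, depth=5)
dropped = filter_consensus_read(bases, three_low, depth=5)
```

With one masked base the no-call fraction is 1/40 = 0.025 and the read is kept with an N at that position; with three masked bases it is 3/40 = 0.075, above the 0.05 limit, so the read is discarded.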
Step 4 of 4 in post-consensus calling analysis
Step P4: Variant calling
• Variant calling can be accomplished with the variant caller of your choice
• The following example shows how to use VarDictJava to generate a VCF file
VarDictJava/bin/VarDict \
-G hg38.fa \
-N tumor \
-f 0.01 \
-b BN573-S1_ssConsensus_mapped_filtered.bam \
-z -c 1 -S 2 -E 3 -g 4 -th 4 target_regions.bed \
| VarDictJava/VarDict/teststrandbias.R \
| VarDictJava/VarDict/var2vcf_valid.pl -N tumor -E -f 0.01 \
| awk '{if ($1 ~ /^#/) print; else if ($4 != $5) print}' \
> BN573-S1.ssConsensus.VarDict.vcf
Tumor model system for benchmarking
• 25 ng of a 1% mixture (0.5% minimum allelic frequency) was used to
assess sensitivity and positive predictive value (PPV)
• Libraries were captured with a set of custom xGen Lockdown Probes
covering a total target area of ~35 kb
• Variant calling was performed with VarDict
Consensus analysis increases variant calling accuracy
[Figure panels: All expected variants; Positive predictive value (PPV). Label: 0.2% variant calling threshold.]
THANK YOU
Take-home messages
• Building consensus sequences enables in silico error correction,
dramatically increasing variant calling specificity
• Due to the prevalence of artifacts arising from sample degradation,
PCR amplification and sequencing, consensus analysis is necessary
to accurately detect variants present below 1%
• xGen Dual Index UMI Adapters mitigate index switching and can
accurately assign rare variants in multiplexing studies
Sensitivity and specificity (PPV)
TP: True positive
FP: False positive
FN: False negative
PPV: Positive predictive value
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity (PPV)} = \frac{TP}{TP + FP}$$
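The two metrics are simple ratios of variant call counts. The sketch below computes them; the counts are made-up numbers for illustration only and are not results from this study.

```python
def sensitivity(tp, fn):
    """Fraction of expected variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def ppv(tp, fp):
    """Fraction of called variants that are real: TP / (TP + FP)."""
    return tp / (tp + fp)


# Hypothetical counts: 9 expected variants found, 1 missed, 3 false calls.
example_sensitivity = sensitivity(tp=9, fn=1)
example_ppv = ppv(tp=9, fp=3)
```

With these hypothetical counts, sensitivity is 9/10 = 0.9 and PPV is 9/12 = 0.75.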

 

More from Integrated DNA Technologies (20)

Overcoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAsOvercoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAs
 
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymesIncreasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
 
The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...
 
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
 
Optimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editingOptimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editing
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
 
Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...
 
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotypingrhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
 
Unique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samplesUnique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samples
 
Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...
 
Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...
 
Cpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexesCpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexes
 
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
 
Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...
 
Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...
 
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDTHigh efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI tools
 
Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...
 
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
 
PrimeTime® qPCR products for gene expression
PrimeTime® qPCR products for gene expressionPrimeTime® qPCR products for gene expression
PrimeTime® qPCR products for gene expression
 

Recently uploaded

Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 

Recently uploaded (20)

The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 

Best practices for data analysis when using UMI adapters to improve variant detection

  • 1. Best practices for data analysis when using UMI adapters to improve variant detection. Wendy Lee, PhD, Staff Scientist
  • 2. Outline
      • Overview of NGS workflow that includes sample multiplexing
      • Overview of workflow with xGen® Dual Index UMI Adapters—Tech Access
      • Discussion of data analysis steps:
          – Extracting UMIs from sequencing reads
          – Constructing consensus reads within UMI families
      • Improving variant calling accuracy using consensus reads
      UMI: unique molecular identifier
  • 3. NGS workflow with xGen Dual Index UMI Adapters [workflow figure; legible label: xGen Universal Blockers]
  • 4. xGen Dual Index UMI Adapters—Tech Access: 3-in-1 design
      • Designed for Illumina sequencers
      • Compatible with standard end-repair and A-tailing library construction, including PCR-free library methods
      • Dual unique sample indices reduce sample cross-talk
      • Degenerate 9-base UMI is incorporated for error correction and/or counting applications
  • 5. xGen Dual Index UMI Adapters—Tech Access: 3-in-1 design [figure]
  • 6. Consensus calling reduces artifacts in sequencing data [figure; TP = true positive; panels: Total reads, Dedup by start/stop positions]
  • 7. Consensus calling reduces artifacts in sequencing data [figure; panels: Total reads, Dedup by start/stop positions, Consensus reads (Min3); callout: A UMI family]
  • 8. Consensus calling reduces artifacts in sequencing data [figure; panels: Dedup by start/stop positions, Consensus reads (Min3)]
  • 9. Extracting UMIs within sample index reads during demultiplexing
  • 10. Assumptions and requirements
      • Sequencing data are generated from the Illumina platform
      • The following tools are installed in a Linux environment:
          – Picard, version 2.9.0
          – Burrows-Wheeler Aligner (BWA), version 0.7.15-r1140
          – Fgbio, version 0.5.0
          – VarDict Java
      • Access to the raw basecall data output from the sequencer
  • 11. Data analysis guidelines on IDT website: www.idtdna.com/UMI-techaccess
  • 12. Overall workflow
      Inputs: Illumina basecalls and sample sheet
      Steps D1–6: Convert base calls to short reads with UMI information while demultiplexing NGS runs. Output: short-read files with UMI info
      Steps C1–4: Call consensus reads using UMIs
      Steps P1–4: Post-consensus calling analysis. Output: variant calls
  • 13. Extract UMIs from sample index reads through Illumina demultiplexing workflow
      Input: sample sheet
      Step D1: Create the sample barcode input file. Output: Barcode_file.txt
      Step D2: Create output directory for storing the extracted barcode from the sample index reads
      Step D3: Determine the read structure: 100T8B9M8B100T
      Step D4: Run ExtractIlluminaBarcodes (Picard). Output: extracted barcode files
      Step D5: Create an input file to specify the output BAM file associated with the sample. Output: Library_param.txt
      Step D6: Run IlluminaBasecallsToSam (Picard). Output: unmapped BAM files
  • 16. Steps D1,2: Create a barcode file containing the sample barcode information for each sample. Steps 1 and 2 of 6 in demultiplexing [workflow diagram repeated from slide 13]
  • 17. Steps D1,2: Create a barcode file containing the sample barcode information for each sample. Steps 1 and 2 of 6 in demultiplexing
      • UMI bases are written as Ns in the barcode sequence
      • This is a tab-delimited file
      • In this example, we saved this file in /mnt/demodata/barcode_file.txt
      • In this example, we create an output directory in /mnt/demodata/barcodes

      barcode_name       library_name  barcode_sequence_1  barcode_sequence_2
      20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN   GCGCATAT
      20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN   CTGTACCA
      20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN   GAACGGTT
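The barcode file from Step D1 can also be generated rather than typed by hand. The following is a minimal sketch, assuming the three example samples from the slide; the function and variable names are hypothetical, and an in-memory buffer stands in for /mnt/demodata/barcode_file.txt.

```python
import csv
import io

UMI_LENGTH = 9  # length of the degenerate UMI, written as Ns after the first index

# (barcode_name, library_name, first sample index, second sample index),
# copied from the example rows on the slide.
SAMPLES = [
    ("20180326-BN573-S1", "Mix1_Rep1", "CTGATCGT", "GCGCATAT"),
    ("20180326-BN573-S2", "Mix1_Rep2", "ACTCTCGA", "CTGTACCA"),
    ("20180326-BN573-S3", "Mix1_Rep3", "TGAGCTAG", "GAACGGTT"),
]

HEADER = ["barcode_name", "library_name", "barcode_sequence_1", "barcode_sequence_2"]


def write_barcode_file(handle, samples, umi_length=UMI_LENGTH):
    """Write a tab-delimited barcode file, appending Ns for the UMI to the first index."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for name, library, index_1, index_2 in samples:
        writer.writerow([name, library, index_1 + "N" * umi_length, index_2])


# An in-memory buffer is used here; in practice, pass an open file handle instead.
buffer = io.StringIO()
write_barcode_file(buffer, SAMPLES)
barcode_text = buffer.getvalue()
print(barcode_text)
```

Generating the file keeps the number of Ns tied to a single constant, which avoids a mismatch with the UMI length used in the read structure.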
  • 18. Step D3: Determine the read structure for running ExtractIlluminaBarcodes. Step 3 of 6 in demultiplexing [workflow diagram repeated from slide 13]
  • 19. Step D3: Determine the read structure for running ExtractIlluminaBarcodes. Step 3 of 6 in demultiplexing
      For xGen Dual Index UMI Adapters—Tech Access with DNA insert of 100 bp, use the following corresponding read structure: 100T8B9M8B100T
          T: template (insert)
          B: sample barcode
          M: molecular index (UMI)
      [read layout figure]
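To make the read structure notation concrete, here is a small sketch that splits a string such as 100T8B9M8B100T into its segments. It only knows the three operators named on the slide (T, B, M); the function name is hypothetical and this is not Picard's own parser.

```python
import re

# Operators described on the slide.
SEGMENT_KINDS = {
    "T": "template (insert)",
    "B": "sample barcode",
    "M": "molecular index (UMI)",
}


def parse_read_structure(structure):
    """Split a read structure string into (length, operator) pairs."""
    segments = re.findall(r"(\d+)([A-Z])", structure)
    # Reject strings with leftover characters that the pattern did not consume.
    if "".join(length + kind for length, kind in segments) != structure:
        raise ValueError(f"malformed read structure: {structure!r}")
    for _, kind in segments:
        if kind not in SEGMENT_KINDS:
            raise ValueError(f"unsupported operator: {kind!r}")
    return [(int(length), kind) for length, kind in segments]


segments = parse_read_structure("100T8B9M8B100T")
total_cycles = sum(length for length, _ in segments)
for length, kind in segments:
    print(f"{length:>3} cycles: {SEGMENT_KINDS[kind]}")
print("total cycles:", total_cycles)
```

The segment lengths should add up to the number of sequencing cycles in the run, which is a quick sanity check before running ExtractIlluminaBarcodes.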
  • 20. Step D4: Run Picard ExtractIlluminaBarcodes to extract sample barcodes. Step 4 of 6 in demultiplexing [workflow diagram repeated from slide 13]
  • 21. Step D4: Run Picard ExtractIlluminaBarcodes to extract sample barcodes. Step 4 of 6 in demultiplexing
      Input:
          BARCODE_FILE: Barcode file created in Step D1
          BASECALLS_DIR: Directory with sequencing basecall files
          READ_STRUCTURE: 100T8B9M8B100T from Step D3
          LANE: ExtractIlluminaBarcodes processes one lane at a time
      Output:
          1. A metrics file with the barcode extraction summary
          2. Extracted barcodes in the output directory created in Step D2
      Command:
          java -Xmx4g -jar picard-2.9.0.jar ExtractIlluminaBarcodes \
              BARCODE_FILE=/mnt/demodata/barcode_file.txt \
              BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
              READ_STRUCTURE=100T8B9M8B100T \
              LANE=1 \
              OUTPUT_DIR=/mnt/demodata/barcodes \
              METRICS_FILE=/mnt/demodata/barcode_metrics.txt
  • 22. Step D5: Create a tab-delimited file to specify the BAM file for each sample in the sequencing run with the corresponding barcode sequence(s). Step 5 of 6 in demultiplexing [workflow diagram repeated from slide 13]
  • 23. Step D5: Create a tab-delimited file to specify the BAM file for each sample in the sequencing run with the corresponding barcode sequence(s). Step 5 of 6 in demultiplexing
      In this example, we saved this file in /mnt/demodata/library_param.txt. Be sure to create the output directory for the BAM file. In this example, the output directory is /mnt/bam/L001

      OUTPUT                               SAMPLE_ALIAS       LIBRARY_NAME  BARCODE_1          BARCODE_2
      /mnt/bam/L001/BN573-S1_unmapped.bam  20180326-BN573-S1  Mix1_Rep1     CTGATCGTNNNNNNNNN  GCGCATAT
      /mnt/bam/L001/BN573-S2_unmapped.bam  20180326-BN573-S2  Mix1_Rep2     ACTCTCGANNNNNNNNN  CTGTACCA
      /mnt/bam/L001/BN573-S3_unmapped.bam  20180326-BN573-S3  Mix1_Rep3     TGAGCTAGNNNNNNNNN  GAACGGTT
      /mnt/bam/L001/Unmatched.bam          Unmatched          Unmatched     N
  • 24. Step D6: Run IlluminaBasecallsToSam to convert sequencing base-calls to short reads in the BAM files. Step 6 of 6 in demultiplexing [workflow diagram repeated from slide 13]
  • 25. Step D6: Run IlluminaBasecallsToSam to convert sequencing base-calls to short reads in the BAM files. Step 6 of 6 in demultiplexing
      Command:
          java -Xmx4g -jar picard-2.9.0.jar IlluminaBasecallsToSam \
              BASECALLS_DIR=/mnt/runs/BN573/Data/Intensities/BaseCalls \
              BARCODES_DIR=/mnt/demodata/barcodes \
              LANE=1 \
              READ_STRUCTURE=100T8B9M8B100T \
              RUN_BARCODE=180326_BN573 \
              LIBRARY_PARAMS=/mnt/demodata/library_param.txt \
              TMP_DIR=/mnt/tmp \
              MOLECULAR_INDEX_TAG=RX \
              ADAPTERS_TO_CHECK=INDEXED \
              READ_GROUP_ID=BN573-S1 \
              NUM_PROCESSORS=8
      Notes on the parameters:
          BARCODES_DIR: output directory from Step D4
          LANE: the run is processed by lane
          READ_STRUCTURE: from Step D3
          RUN_BARCODE: prefixed to the read names in the output
          LIBRARY_PARAMS: file created in Step D5
          MOLECULAR_INDEX_TAG: BAM tag that stores the UMI sequence
  • 26. BAM file created by IlluminaBasecallsToSam
      • The reads in the BAM file generated by IlluminaBasecallsToSam are not yet aligned to the reference genome.
      • UMI sequence is in the RX tag.
      • UMI sequence quality is in the QX tag.
      • Sequencing adapter location is in the XT tag. Adapter sequence can be trimmed using SamToFastq in Picard tools.
      An example record from the BAM file:
          180326_BN573:1:1101:10008:4281 77 * 0 0 * * 0 0
          ACAACGCTCCACGGGAGACCCACCCATCCCTGCCAGGTGAGCCAGACAGTGGCCAAGGGTCTCTAGGTCGAGGCAG
          CDDDDCCCDDFFGGGGGGGGGGGGGGHHHHHHHHHHHGHHHHHHHHGHHHHHGHHHHGGHHHHHHHHHHGEFGGGG
          RG:Z:BN573-S1 XT:i:114 QX:Z:FFFFGGGG RX:Z:GGTAAAATG
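As an illustration of how the tags in the example record can be read, the sketch below parses one SAM-formatted text line (for example, as printed by a BAM viewer). The helper name is hypothetical, and the SEQ and QUAL fields are shortened here for display; everything else is copied from the slide.

```python
MANDATORY_FIELDS = 11  # QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
FLAG_UNMAPPED = 0x4    # SAM flag bit set when the read is not aligned


def parse_sam_record(line):
    """Return the read name, flag, mapping status and optional tags of one SAM line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    tags = {}
    for field in fields[MANDATORY_FIELDS:]:
        tag, type_code, value = field.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return {
        "name": fields[0],
        "flag": flag,
        "unmapped": bool(flag & FLAG_UNMAPPED),
        "tags": tags,
    }


EXAMPLE = "\t".join([
    "180326_BN573:1:1101:10008:4281", "77", "*", "0", "0", "*", "*", "0", "0",
    "ACAACGCT", "CDDDDCCC",  # SEQ and QUAL shortened for this sketch
    "RG:Z:BN573-S1", "XT:i:114", "QX:Z:FFFFGGGG", "RX:Z:GGTAAAATG",
])

record = parse_sam_record(EXAMPLE)
print("UMI:", record["tags"]["RX"])
print("unmapped:", record["unmapped"])
```

Flag 77 has the unmapped bit set, which matches the first bullet: reads in this BAM are not yet aligned.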
  • 28. Workflow for consensus calling
      Input: unmapped BAM with UMI tags (UMIs extracted from the sample index during demultiplexing)
      Step C1: Align reads to reference genome. Output: mapped BAM without UMI tags
      Step C2: Include UMI tags from unmapped BAM in the mapped BAM. Output: mapped BAM with UMI tags
      Step C3: Group reads by UMIs. Output: mapped BAM with UMI family tags
      Step C4: Call consensus. Output: unmapped BAM with consensus reads
  • 29. Step C1,2: Aligning reads from unmapped BAM files to reference genome, and including the UMI tags. Steps 1 and 2 of 4 in consensus calling [workflow diagram repeated from slide 28]
  • 30. Step C1,2: Aligning reads from unmapped BAM files to reference genome, and including the UMI tags. Steps 1 and 2 of 4 in consensus calling
      The following command consists of three steps:
          1. Convert BAM to FASTQ
          2. Align reads using BWA-MEM
          3. Include UMI tags from the unmapped BAM in the mapped BAM
      Command:
          java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
              I=BN573-S1_unmapped.bam F=/dev/stdout INTERLEAVE=true \
          | bwa mem -p -t 8 hg38.fa /dev/stdin \
          | java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
              UNMAPPED=BN573-S1_unmapped.bam ALIGNED=/dev/stdin \
              O=BN573-S1_mapped.bam R=hg38.fa \
              SORT_ORDER=coordinate MAX_GAPS=-1 ORIENTATIONS=FR
  • 31. Step C3: Grouping reads by UMIs. Step 3 of 4 in consensus calling [workflow diagram repeated from slide 28]
  • 32. Step C3: Grouping reads by UMIs. Step 3 of 4 in consensus calling
      The reads are grouped into families that share the same UMI.
      Command:
          java -Xmx4g -jar fgbio.jar GroupReadsByUmi \
              --input=BN573-S1_mapped.bam \
              --output=BN573-S1_grouped.bam \
              --strategy=adjacency --edits=1 --min-map-q=20 \
              --assign-tag=MI
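The idea of tolerating one mismatch between UMIs (--edits=1) can be sketched as follows. This is a deliberately simplified illustration, not fgbio's implementation: it clusters UMIs by Hamming distance alone, whereas the adjacency strategy also takes read counts into account, and real grouping is applied only to reads that share the same mapping position. All names are hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, max_edits=1):
    """Map each distinct UMI to a family id using single-linkage clustering.

    Intended for UMIs observed at one mapping position.
    """
    distinct = sorted(set(umis))
    family_of = {}
    next_id = 0
    for umi in distinct:
        if umi in family_of:
            continue
        family_of[umi] = next_id
        stack = [umi]
        # Pull in every UMI reachable through steps of at most max_edits mismatches.
        while stack:
            current = stack.pop()
            for other in distinct:
                if other not in family_of and hamming(current, other) <= max_edits:
                    family_of[other] = next_id
                    stack.append(other)
        next_id += 1
    return family_of


observed = ["GGTAAAATG", "GGTAAAATC", "GGTAAAATG", "ACGTACGTA"]
families = group_umis(observed, max_edits=1)
for umi in observed:
    print(umi, "-> family", families[umi])
```

In this toy input, GGTAAAATC differs from GGTAAAATG at one position, so both land in the same family, while ACGTACGTA forms its own.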
  • 33. Step C4: Calling consensus. Step 4 of 4 in consensus calling [workflow diagram repeated from slide 28]
  • 34. Step C4: Calling consensus. Step 4 of 4 in consensus calling
      Consensus reads will be generated using fgbio’s CallMolecularConsensusReads.
      Command:
          java -Xmx4g -jar fgbio.jar CallMolecularConsensusReads \
              --input=BN573-S1_grouped.bam \
              --output=BN573-S1_ssConsensus_unmapped.bam \
              --min-reads=1 \
              --rejects=BN573-S1_ssConsensus_rejected.bam \
              --min-input-base-quality=30 \
              --read-group-id=BN573-S1
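To show what a consensus read is, here is a minimal sketch that takes a per-position majority vote across the reads of one UMI family. This is an assumption-laden simplification: fgbio's CallMolecularConsensusReads uses a quality-aware likelihood model rather than a plain vote, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def call_consensus(reads, min_reads=1):
    """Return a majority-vote consensus for one UMI family, or None if it is too small.

    Positions where the top two bases tie are reported as N.
    """
    if not reads or len(reads) < min_reads:
        return None
    if len({len(read) for read in reads}) != 1:
        raise ValueError("this sketch expects reads of equal length")
    consensus = []
    for column in zip(*reads):
        ranked = Counter(column).most_common()
        best_base, best_count = ranked[0]
        tie = len(ranked) > 1 and ranked[1][1] == best_count
        consensus.append("N" if tie else best_base)
    return "".join(consensus)


family = ["ACGTA", "ACGTA", "ACTTA"]  # third read carries an error at position 3
print(call_consensus(family, min_reads=3))
print(call_consensus(family[:2], min_reads=3))
```

The single discordant base is outvoted, which is the mechanism by which consensus calling removes PCR and sequencing errors; a family with fewer reads than the minimum yields no consensus.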
  • 35. Workflow for post-consensus calling analysis
      Input: unmapped BAM with consensus reads
      Step P1: Align reads to reference genome. Output: mapped BAM without UMI tags
      Step P2: Include UMI tags from unmapped BAM in the mapped BAM. Output: mapped BAM with UMI tags
      Step P3: Filter consensus reads. Output: filtered consensus BAM
      Step P4: Variant calling. Output: VCF
  • 36. Step P1,2: Aligning reads from unmapped BAM files to reference genome and merging the UMI tags. Steps 1 and 2 of 4 in post-consensus calling analysis [workflow diagram repeated from slide 35; input: unmapped BAM with single strand consensus reads]
  • 37. Step P1,2: Aligning reads from unmapped BAM files to reference genome and merging the UMI tags. Steps 1 and 2 of 4 in post-consensus calling analysis
      The following command consists of three steps:
          1. Converting BAM to FASTQ
          2. Aligning reads using bwa mem
          3. Including UMI tags from the unmapped BAM in the mapped BAM
      Command (file names follow the ssConsensus naming used in Steps C4 and P3):
          java -Xmx4g -jar picard-2.9.0.jar SamToFastq \
              I=BN573-S1_ssConsensus_unmapped.bam F=/dev/stdout INTERLEAVE=true \
          | bwa mem -p -t 8 hg38.fa /dev/stdin \
          | java -Xmx4g -jar picard-2.9.0.jar MergeBamAlignment \
              UNMAPPED=BN573-S1_ssConsensus_unmapped.bam ALIGNED=/dev/stdin \
              O=BN573-S1_ssConsensus_mapped.bam R=hg38.fa \
              SORT_ORDER=coordinate MAX_GAPS=-1 ORIENTATIONS=FR
  • 38. Step P3: Filtering consensus reads. Step 3 of 4 in post-consensus calling analysis [workflow diagram repeated from slide 35]
  • 39. Step P3: Filtering consensus reads. Step 3 of 4 in post-consensus calling analysis
      There are two kinds of filtering of consensus reads:
          1. Masking or filtering individual bases in reads
          2. Filtering reads (i.e., not writing them to the output BAM file)
      Command:
          java -Xmx4g -jar fgbio.jar FilterConsensusReads \
              --input=BN573-S1_ssConsensus_mapped.bam \
              --output=BN573-S1_ssConsensus_mapped_filtered.bam \
              --min-reads=3 \
              --min-base-quality=50 \
              --max-no-call-fraction=0.05
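The two kinds of filtering can be sketched on a single consensus read as follows. This is a simplified illustration under stated assumptions, not fgbio's logic: it treats family depth as one number per read and ignores the per-base depth and error-rate criteria that FilterConsensusReads can also apply. The function name and toy data are hypothetical; the default thresholds mirror the command on the slide.

```python
def filter_consensus_read(bases, quals, depth,
                          min_reads=3, min_base_quality=50,
                          max_no_call_fraction=0.05):
    """Mask low-quality bases, then drop the read if it fails read-level checks.

    Returns the masked base string, or None when the read is filtered out.
    """
    if not bases or len(bases) != len(quals):
        raise ValueError("bases and quals must be non-empty and of equal length")
    # Read-level filter: the family behind this consensus read is too small.
    if depth < min_reads:
        return None
    # Base-level filter: mask individual low-quality bases as N.
    masked = "".join(
        base if qual >= min_base_quality else "N"
        for base, qual in zip(bases, quals)
    )
    # Read-level filter: too many no-calls after masking.
    if masked.count("N") / len(masked) > max_no_call_fraction:
        return None
    return masked


bases = "ACGT" * 10                  # 40-base toy consensus read
one_low = [60] * 39 + [40]           # a single base below the quality threshold
three_low = [60] * 37 + [40] * 3     # three bases below the threshold

print(filter_consensus_read(bases, one_low, depth=5))
print(filter_consensus_read(bases, three_low, depth=5))
print(filter_consensus_read(bases, one_low, depth=2))
```

With one masked base out of 40 (2.5%), the read is kept with an N at that position; with three masked bases (7.5%) or a family of only two reads, it is dropped.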
  • 40. Step 4 of 4 in post-consensus calling analysis. Step P4: Variant calling. [Workflow diagram repeated from slide 35; the input is the unmapped BAM with single strand consensus reads.]
  • 41. Step 4 of 4 in post-consensus calling analysis. Step P4: Variant calling. • Variant calling can be accomplished with the variant caller of your choice • The following example shows how to use VarDictJava to generate a VCF file. Command: VarDictJava/bin/VarDict -G hg38.fa -N tumor -f 0.01 -b BN573-S1_ssConsensus_mapped_filtered.bam -z -c 1 -S 2 -E 3 -g 4 -th 4 target_regions.bed | VarDictJava/VarDict/teststrandbias.R | VarDictJava/VarDict/var2vcf_valid.pl -N tumor -E -f 0.01 | awk '{if ($1 ~/^#/) print; else if ($4 != $5) print}' > BN573-S1.ssConsensus.VarDict.vcf
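The final awk stage of the command on slide 41 keeps header lines and drops records whose fourth and fifth columns are identical; in a VCF file these columns are REF and ALT. A minimal Python sketch of the same logic follows. `keep_vcf_line` and `filter_vcf_lines` are hypothetical helper names, and the sketch compares the two fields as plain strings.

```python
from typing import Iterable, List


def keep_vcf_line(line: str) -> bool:
    """Mirror the awk filter: keep headers and records where REF != ALT."""
    fields = line.split()  # whitespace splitting, like awk's default
    if fields and fields[0].startswith("#"):
        return True  # header or meta-information line
    # Missing fields behave like empty strings, as they do in awk.
    ref = fields[3] if len(fields) > 3 else ""
    alt = fields[4] if len(fields) > 4 else ""
    return ref != alt


def filter_vcf_lines(lines: Iterable[str]) -> List[str]:
    """Return only the lines that the awk filter would print."""
    return [line for line in lines if keep_vcf_line(line)]
```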
  • 42. Tumor model system for benchmarking • 25 ng of a 1% mixture (0.5% minimum allelic frequency) was used to assess sensitivity and positive predictive value (PPV) • Libraries were captured with a set of custom xGen Lockdown Probes covering a total target area of ~35 kb • Variant calling was performed with VarDict
  • 43. Consensus analysis increases variant calling accuracy [Figure; labels: All expected variants, 0.2% variant calling threshold, Positive predictive value (PPV)]
  • 45. Take-home messages • Building consensus sequences enables in silico error correction, dramatically increasing variant calling specificity • Due to the prevalence of artifacts arising from sample degradation, PCR amplification, and sequencing, consensus analysis is necessary to accurately detect variants present below 1% • xGen Dual Index UMI Adapters mitigate index switching and can accurately assign rare variants in multiplexing studies • www.idtdna.com/UMI-techaccess
  • 46. Sensitivity and specificity (PPV). TP: True positive; FP: False positive; FN: False negative; PPV: Positive Predictive Value. Sensitivity = TP / (TP + FN). Specificity (PPV) = TP / (TP + FP).
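The two metrics on slide 46 translate directly into code. The sketch below uses hypothetical function names; note that the slides use "specificity" to mean PPV, i.e., TP / (TP + FP).

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of expected variants that were called: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined when TP + FN is 0")
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of called variants that are real: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("PPV is undefined when TP + FP is 0")
    return tp / (tp + fp)
```

For example, with 9 true positives, 1 false negative, and 3 false positives (made-up counts for illustration), sensitivity is 9 / 10 and PPV is 9 / 12.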