CNV analysis for exome data
CNV detection is now available for exome data! Perform faster analyses by integrating CNV detection into your exome sequencing workflow, whether you are analyzing clinical exomes or whole exomes.
The CNVExome module has been validated for routine diagnostic use by several laboratories and provides a solid alternative to cytogenetic standards (CGH array, karyotype, MLPA).
December 2021 Newsletter
Copy number variations (CNVs), whether duplications or deletions, alter the diploid state of DNA and are a significant source of variability in the human genome.
Like point mutations, these alterations may have no phenotypic consequences for their carrier. Nonetheless, they may drive the acquisition of adaptive traits or be responsible for diseases. Clinically relevant CNVs may thus be identified in patients with intellectual disability (ID) and other neurodevelopmental disorders.
Analysis of CNVs became popular with the emergence of DNA microarrays, in particular microarray-based comparative genomic hybridization (array CGH or aCGH), which offers a finer resolution than karyotype analysis, itself limited to large rearrangements. However, cytogenetic techniques cannot detect short variations (i.e. SNVs, indels), whose detection relies mainly on next generation sequencing (NGS).
As molecular biology techniques also enable the detection of CNVs, we set out to develop an NGS-based alternative to array CGH, with the aim of unifying approaches by offering a joint analysis of SNVs and CNVs for exome data. This combined approach makes some diagnoses easier, notably in cases of recessive pathologies involving a CNV and an SNV at the same locus.
Type of CNVs detected by CNVExome
The CNVExome module detects both:
- large intra- or intergenic rearrangements spanning tens to hundreds of kilobases
- small intragenic events involving one or several exons
Analysis and interpretation
A board for CNV interpretation is available within the GermlineVar and GermlineFamily analysis interfaces through the CNVs tab if the CNVExome analysis was performed. The board is divided into two parts:
1. On the left, a table displaying newly identified CNVs, which can be filtered by size and sample frequency or restricted to CNVs in OMIM morbid genes.
2. On the right, a detailed panel is available for each CNV and comprises 4 tabs:
CNV summary. This first tab contains information and annotations regarding the selected CNV, namely:
- clinical regions and genes included within the interval affected by the CNV
- overlaps between the CNV and the DGV database [http://dgv.tcag.ca/dgv/app/home]
- short variants (SNV, indels) identified by GermlineVar within the same region
- a section dedicated to the Variant Knowledge Base (VKB) displaying validated VKB entries sharing more than 95% identity with the selected CNV and their corresponding evaluation
A summary of those elements is available as metrics at the top of the page.
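The text does not define how the 95% identity between a CNV and a VKB entry is computed. One common convention for comparing CNV intervals is reciprocal overlap, where the shared segment must cover a given fraction of both intervals. The sketch below illustrates that convention only; the function names, the tuple layout and the use of half-open coordinates are assumptions made for this example, not the CNVExome implementation.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Return the reciprocal overlap of two half-open intervals [start, end).

    The result is the smaller of the two fractions (shared / length_a,
    shared / length_b), so 1.0 means identical intervals and 0.0 means
    no shared base.
    """
    if end_a <= start_a or end_b <= start_b:
        raise ValueError("intervals must have a positive length")
    shared = min(end_a, end_b) - max(start_a, start_b)
    if shared <= 0:
        return 0.0
    return min(shared / (end_a - start_a), shared / (end_b - start_b))


def is_similar(cnv_a, cnv_b, threshold=0.95):
    """Compare two CNVs given as (chromosome, start, end, kind) tuples.

    Hypothetical rule: same chromosome, same kind (e.g. "DEL" or "DUP"),
    and reciprocal overlap strictly above the threshold.
    """
    chrom_a, start_a, end_a, kind_a = cnv_a
    chrom_b, start_b, end_b, kind_b = cnv_b
    if chrom_a != chrom_b or kind_a != kind_b:
        return False
    return reciprocal_overlap(start_a, end_a, start_b, end_b) > threshold
```

For example, under these assumptions a deletion at chr1:1000-2000 and one at chr1:1010-2000 share 990 bases, giving a reciprocal overlap of 0.99, so `is_similar` would treat them as matching.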
Evaluations. As with variants detected by GermlineVar, SeqOne users can define the pathogenicity class of each CNV and edit notes and interpretations within the comment section. New evaluations help enrich the personal knowledge base of the laboratory.
Genome browser. The tab allows you to visualize the selected CNV in the IGV genome browser. To check the stability of the signal, up to 5 additional samples can be displayed in addition to the sample being analyzed.
UCSC. An external link to the UCSC genome browser allows you to visualize the genomic region of the CNV. The link opens a new page or tab in your web browser and applies the user's personal settings, thereby offering a personalized view.
In addition to alignment files, several files can be downloaded:
- the result of CNV analysis as a tabular file
- the ploidy for each chromosome
- the predicted number of copies for each exon of the manifest
Setting up a CNV model
As in pipelines for CNV analysis of panel data, the identification of CNVs from exome data is based on the analysis of the depth of coverage.
The approach, widely accepted in the scientific community, is based on the assumption that the number of reads covering a region is directly proportional to the number of copies of the locus within the sample. This approach, relatively simple to apply to gene panel data, is especially tricky for exome data. Indeed, the depth of sequencing on each region is very variable and influenced by several factors such as the initial amount of DNA, the complexity of the sample, the GC content of sequences, the efficiency of capture and sequencing, and the alignment.
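As an illustration of the proportionality assumption described above, here is a minimal Python sketch that converts depth ratios into copy-number estimates. It is not the CNVExome algorithm: the function name, the median-ratio scaling used to correct for overall sequencing yield, and the default ploidy of 2 are all assumptions chosen for this example.

```python
from statistics import median


def estimate_copy_number(sample_depths, reference_depths, ploidy=2):
    """Estimate the copy number of each region from read depth.

    sample_depths and reference_depths hold one depth per region, in the
    same order. The reference is assumed to represent the diploid state.
    """
    if len(sample_depths) != len(reference_depths):
        raise ValueError("both depth lists must cover the same regions")

    # Ratio of sample depth to reference depth, region by region.
    ratios = [s / r for s, r in zip(sample_depths, reference_depths) if r > 0]
    if not ratios:
        raise ValueError("no region has a positive reference depth")

    # The median ratio captures the global difference in sequencing yield,
    # assuming most regions are at the normal copy number.
    scale = median(ratios)

    copies = []
    for s, r in zip(sample_depths, reference_depths):
        if r <= 0 or scale <= 0:
            copies.append(None)  # no usable signal for this region
        else:
            copies.append(ploidy * (s / r) / scale)
    return copies
```

With a sample sequenced twice as deeply as the reference, depths of 200, 200, 100, 300 and 200 against a flat reference of 100 give estimates of 2, 2, 1, 3 and 2 copies. A correction this simple ignores the other sources of variability listed above (GC content, capture efficiency, alignment).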
CNVExome is therefore an individual analysis: each sample is compared to a model computed by SeqOne.
To account for technical variations in sample preparation, we build this model from a large cohort of fifty to hundreds of samples sequenced by the laboratory using:
- the same protocol for library preparation
- the same capture kit
- the same sequencing platform
These requirements, and the variable number of samples needed to set up a model, are justified by the need for a cohort with homogeneous coverage. Several factors must therefore be considered, such as the depth of sequencing, the degree of variability within the cohort, and the reproducibility of sample preparation protocols.
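To make the idea of a cohort-based model more concrete, the following Python sketch builds a per-exon reference from a set of samples and scores a new sample against it. It is a simplified illustration under stated assumptions, not the model computed by SeqOne: the function names, the median normalization, and the use of a mean and standard deviation per exon are all hypothetical choices.

```python
from statistics import mean, median, stdev


def normalize(depths):
    """Divide each exon depth by the sample's median depth."""
    m = median(depths)
    if m <= 0:
        raise ValueError("median depth must be positive")
    return [d / m for d in depths]


def build_model(cohort):
    """Build a per-exon (mean, standard deviation) model.

    cohort is a list of samples; each sample is a list of exon depths,
    with exons in the same order for every sample.
    """
    if len(cohort) < 2:
        raise ValueError("at least two samples are needed")
    normalized = [normalize(sample) for sample in cohort]
    # zip(*...) regroups values exon by exon across the cohort.
    return [(mean(values), stdev(values)) for values in zip(*normalized)]


def score_sample(depths, model):
    """Return one z-score per exon, or None where the model has no spread."""
    scores = []
    for value, (mu, sigma) in zip(normalize(depths), model):
        scores.append(None if sigma == 0 else (value - mu) / sigma)
    return scores
```

In this sketch, an exon whose normalized depth varies widely across the cohort gets a large standard deviation, so only strong deviations stand out for it. This is one way to see why a homogeneous cohort matters: the more variable the coverage, the less sensitive the comparison becomes.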
How to get CNVExome?
Get in touch with SeqOne Customer Support at firstname.lastname@example.org to find out how to set up CNVExome for your entity.