Get actionable insights from your tumor samples with TheraOne

With the new TheraOne view, SeqOne aims to accelerate the identification and delivery of a treatment tailored to each patient, based on the genomic profile of their tumor.

TheraOne helps you identify actionable biomarkers from both DNA and RNA analyses so that you can offer your patients a personalized therapeutic solution.

December 2021 Newsletter

Introduction

The comprehensive characterization of a tumor’s unique genomic landscape is a challenging undertaking that the right bioinformatic tools make accessible. However, improving patient outcomes doesn’t end with detecting mutations and other genomic events. A greater challenge lies in evaluating the real contribution of each genetic variant to a patient’s cancer, and its clinical implications.

Too often, it is still up to oncologists and pathologists to search through various resources in order to identify treatments that would fit the unique genomic profile of a patient’s tumor.  

With TheraOne, we address this issue by mapping the potentially actionable genomic events for each patient according to their type of cancer, in order to identify a personalized treatment strategy or to direct them towards relevant clinical trials.

This new module takes advantage of SeqOne’s powerful bioinformatic pipelines for both DNA and RNA assays, and aggregates all their outputs into a single patient-centric dashboard to quickly identify actionable biomarkers.

Patients and samples

TheraOne adds a new ‘Patients’ section to the SeqOne dashboard that lists all patients available on the account.

Simply put, a patient can be seen as a collection of samples and their associated analyses on SeqOne. One or more samples, of the same or different nature (DNA, RNA) can thus be associated with the same patient, and the variants originating from the different somatic pipelines (SomaVar, SomaCNVCapture, SomaMSI, SomaRNA) will be accessible from this view.

A new patient can be created:

  • When uploading a new sample to the platform,
  • From the “Create Patient” icon in the upper right corner of the user dashboard.

All variants relevant to the patient’s cancer type

The new Cancer ROI viewer groups together the variants detected through the various analyses carried out on the samples attached to the patient: variants, CNVs, fusions and splices. It lists the associations that exist between specific genomic regions and each type of variant (SNV, CNV, fusions or splices) on the one hand, and the patient’s cancer type on the other hand.

This association is based on a database, built in particular from the COSMIC gene census and COSMIC mutation census. 

SeqOne Cancer ROI database

The SeqOne Cancer ROI DB identifies the cancer genes, exons and codons for which a cancer driver mutation is known for the patient’s disease.

For each gene, the mutation type (point mutation, deletion, amplification, fusion or splice) and cancer function (tumor suppressor or oncogene) are identified using the COSMIC gene census and a manually curated internal database.

– For tumor suppressor genes, all exons are added to the database. 
– For oncogenes, all exons with a tier 1 to 3 mutation in the COSMIC mutation census, and all codons with a tier 1 mutation, are added.
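
The two inclusion rules above can be sketched as follows. This is only an illustration under stated assumptions: the `Mutation` record, the `build_roi` function and the role labels are hypothetical names, and the real database also relies on manual curation that is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    gene: str
    exon: int
    codon: int
    tier: int  # tier from the mutation census (1 = strongest evidence)


def build_roi(gene_roles, gene_exons, mutations):
    """Return (exon_roi, codon_roi) as sets of (gene, exon) and (gene, codon)."""
    exon_roi, codon_roi = set(), set()
    # Rule 1: every exon of a tumor suppressor gene is included.
    for gene, role in gene_roles.items():
        if role == "tumor_suppressor":
            exon_roi.update((gene, exon) for exon in gene_exons[gene])
    # Rule 2: for oncogenes, exons with a tier 1-3 mutation and
    # codons with a tier 1 mutation are included.
    for m in mutations:
        if gene_roles.get(m.gene) != "oncogene":
            continue
        if 1 <= m.tier <= 3:
            exon_roi.add((m.gene, m.exon))
        if m.tier == 1:
            codon_roi.add((m.gene, m.codon))
    return exon_roi, codon_roi
```

With a toy tumor suppressor of three exons and an oncogene carrying one tier 1 and one tier 2 mutation, the exon set contains the three suppressor exons plus the two mutated oncogene exons, while the codon set contains only the tier 1 codon.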

Additional predictive biomarkers

Both microsatellite instability (MSI) and tumor mutational burden (TMB), used as predictive biomarkers for immunotherapy in cancer treatment, can be visualized from the Signature tab. 

MSI

Microsatellite status is assessed in a cohort-based fashion. Rather than looking only at a set of predefined regions, our tool identifies every microsatellite region within the targeted genomic regions in order to increase the resolution of the assay. It then generates a panel of controls directly from the sample cohort in order to identify unstable microsatellites.

For more information on the SomaMSI workset, you can read our dedicated blog post.

TMB

When a SomaVar analysis is performed on a large panel, missense somatic mutations are used to compute a Mut/Mb score for each tumor sample. This raw score is then interpreted using an organ-specific regression model built from the TCGA (The Cancer Genome Atlas) exome dataset.
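
As a rough illustration, the raw Mut/Mb score is simply a mutation count divided by the size of the sequenced target expressed in megabases. The function name and parameters below are hypothetical, and the organ-specific regression step is not shown because the text does not describe it.

```python
def tmb_mut_per_mb(n_missense_somatic: int, panel_size_bp: int) -> float:
    """Raw tumor mutational burden in mutations per megabase.

    n_missense_somatic: number of missense somatic mutations retained.
    panel_size_bp: size of the targeted region in base pairs (assumed input).
    """
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    return n_missense_somatic / (panel_size_bp / 1_000_000)
```

For example, 18 missense somatic mutations on a 1.5 Mb panel give a raw score of 12 Mut/Mb.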

Prioritized actionable information 

TheraOne matches the genomic profile of your patient’s tumor with actionable information on available drugs and ongoing clinical trials.

Through the Actionability tab, you can access:

  • A list of prioritized FDA-approved treatments for which actionable variants have been identified in the patient. Each variant-drug association in this list comes with its AMP/ASCO classification, type of response and specificity.
  • Phase I to IV clinical trials matching the tumor’s genomic profile, with ongoing recruitment near the user’s geographic location.

Reporting

A comprehensive report can be generated and exported in either an editable or a PDF format, including:

  • Patient and sample metadata,
  • Identified genomic alterations and biomarkers,
  • Actionability and available FDA-approved drugs,
  • Recruiting clinical trials in the user’s geographic area.

How to get access to TheraOne on SeqOne?

If you are interested in this new module, contact our Customer Support service at support@seqone.com.

Analysis of microsatellite instability (MSI) in a somatic context

A workset dedicated to the analysis of microsatellite instability (MSI) in a somatic context is now available for DNA panels.

December 2021 Newsletter

Introduction

A microsatellite region is a genomic region consisting of 5 to 50 repetitions of nucleotide motifs of 1 to 6 base pairs. These regions exist at multiple loci throughout the genome [1].

Microsatellite instability (MSI) refers to changes in the length of repeated sequences that sometimes occur in tumors. MSI arises as a consequence of genomic instability in tumor cells with a reduced ability to replicate their DNA accurately.

MSI is a frequent marker of functional inactivation of DNA mismatch repair (MMR) genes, namely MLH1, MSH2, MSH6, PMS1 and PMS2. DNA mismatch repair enzymes usually remove single or multiple misincorporated bases resulting from random errors during DNA replication or recombination. Functional loss of mismatch repair may therefore lead to microsatellite instability. MSI is mostly observed in colorectal, gastric and endometrial cancer, but it also occurs in many other types of cancer [2]. Knowing the microsatellite status of a tumor may help in choosing the most appropriate treatment.

SeqOne now offers a tool dedicated to analyzing microsatellite regions in order to determine the MSI or MSS (microsatellite stable) status of a tumor sample.

How does it work?

Determining whether the microsatellite regions of a tumor sample are stable or unstable requires two preliminary steps:

  • identifying the microsatellite regions sequenced in the sample
  • determining the ‘normal’ microsatellite status

To this aim, regions covered by the manifest are first scanned to identify microsatellite regions.
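
To give an idea of what such a scan involves, here is a minimal sketch that finds tandem repeats matching the definition given in the introduction (motifs of 1 to 6 bp repeated 5 to 50 times). The regular expression and the `find_microsatellites` function are illustrative assumptions, not the scanner actually used by SomaMSI.

```python
import re

# A motif of 1 to 6 bases (shortest motif preferred) followed by at least
# 4 more copies of itself, i.e. 5 or more repetitions in total.
MICROSAT = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def find_microsatellites(seq, max_repeats=50):
    """Return (start, end, motif, repeats) tuples, 0-based and end-exclusive."""
    hits = []
    for match in MICROSAT.finditer(seq.upper()):
        motif = match.group(1)
        repeats = (match.end() - match.start()) // len(motif)
        if repeats <= max_repeats:
            hits.append((match.start(), match.end(), motif, repeats))
    return hits
```

On a target sequence containing a run of eight A bases and six copies of CA, the function reports both regions with their motif and repeat count.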

Then a reference sample is generated from cohort samples estimated to be negative controls. Since the MSI status of cohort samples is unknown, the estimation is based on the deletion rate observed in microsatellite regions: the half of the cohort with the lowest rates is selected to generate a reference sample, and each selected sample contributes a fraction of its microsatellite sequences.
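
The selection of presumed negative controls can be sketched as below. The function name and the input dictionary are hypothetical, and rounding down for odd cohort sizes is an assumption the text does not settle.

```python
def select_reference_samples(deletion_rates):
    """Pick the half of the cohort with the lowest microsatellite deletion rate.

    deletion_rates: dict mapping sample name -> deletion rate observed in
    microsatellite regions. For an odd cohort size the half is rounded down
    (assumption).
    """
    ranked = sorted(deletion_rates, key=deletion_rates.get)
    return ranked[: len(ranked) // 2]
```

With four samples whose rates are 0.02, 0.15, 0.01 and 0.30, the two samples at 0.01 and 0.02 are retained to build the reference.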

Each tumoral sample is then compared to the reference sample using msisensor [3]. A score corresponding to the percentage of the microsatellite regions exhibiting instability is calculated.

Based on this score, the MSI or MSS status of each sample is determined by comparison with a threshold calculated from the distribution of cohort samples. If all samples of the run have an MSS status, the threshold is set at 10 to avoid generating false positives.
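
The scoring and status assignment can be illustrated as follows. How the cohort threshold is derived from the score distribution is not detailed here, so the sketch takes it as an input and defaults to the fallback value of 10; treating a score equal to the threshold as MSI is also an assumption.

```python
def msi_score(n_unstable_sites, n_sites):
    """Percentage of microsatellite regions exhibiting instability."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return 100.0 * n_unstable_sites / n_sites


def msi_status(score, threshold=10.0):
    """Return 'MSI' or 'MSS'; the default threshold is the fallback of 10."""
    return "MSI" if score >= threshold else "MSS"
```

A sample with 12 unstable regions out of 80 scores 15 and is called MSI with the fallback threshold, whereas 3 unstable regions out of 80 (3.75) gives MSS.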

Using SomaMSI on SeqOne

SomaMSI is a full-fledged analysis workset for gene panel data.

Like SomaCNVCapture, SomaMSI is a cohort-based analysis.

Once the analysis is completed, results for each sample are accessible within the SomaMSI analysis interface under the ‘Results’ tab, divided into two parts:

  • A graphical representation displays the MSI score of each sample. The threshold determined for the cohort of samples delimits two zones on the graph, respectively for samples considered as MSI and MSS.
  • Details about the score and status assigned to each sample are displayed in the right panel.

Compatibility

The SomaMSI analysis workset is compatible with data including UMIs if the configuration was specified when creating the project.

In the case of UMI data, only “standard” and “disabled UMI” configurations will be available at the start of the analysis.

References

[1] Schlötterer C, Harr B (March 2004). Microsatellite Instability

[2] Boland CR, Goel A. Microsatellite instability in colorectal cancer. Gastroenterology. 2010; 138 (6): 2073-2087.e3. doi: 10.1053/j.gastro.2009.12.064

[3] Niu B, Ye K, Zhang Q, Lu C, Xie M, McLellan MD, Wendl MC, Ding L. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics. 2014 Apr 1; 30 (7): 1015-6. doi: 10.1093/bioinformatics/btt755. Epub 2013 Dec 25. PMID: 24371154; PMCID: PMC3967115.

CNV analysis for exome data

CNV detection is now available for exome data! Perform faster analyses by integrating CNV detection into your exome sequencing workflow, whether you are analyzing clinical exomes or whole exomes.

The CNVExome module has been validated for routine diagnostics by several laboratories and provides a solid alternative to cytogenetic standards (CGH array, karyotype, MLPA).

December 2021 Newsletter

Background

Copy number variations (CNVs), whether they are duplications or deletions, altering the diploid state of DNA, are a source of significant variability in the human genome.

These alterations, like point mutations, may have no phenotypic consequences for their carrier. Nonetheless, they may drive the acquisition of adaptive traits or be responsible for diseases. For instance, clinically relevant CNVs may be identified in patients with intellectual disability (ID) and other neurodevelopmental disorders.

Analysis of CNVs became popular with the emergence of DNA microarrays, in particular microarray-based comparative genomic hybridization (array CGH or aCGH), which offers a finer resolution than karyotype analysis, itself limited to large rearrangements. However, cytogenetic techniques cannot detect short variations (i.e. SNVs, indels), whose detection relies mainly on next generation sequencing (NGS).

Since molecular biology techniques can also detect CNVs, we set out to develop an NGS-based alternative to array CGH, with the aim of unifying approaches by offering a joint analysis of SNVs and CNVs for exome data. This combined approach makes some diagnoses easier, notably in cases of recessive pathologies involving a CNV and an SNV at the same locus.

Type of CNVs detected by CNVExome

The CNVExome module detects both:

  • large intra- or intergenic rearrangements spanning tens to hundreds of kilobases
  • small intragenic events involving one or several exons

Analysis and interpretation

A board for CNV interpretation is available within the GermlineVar and GermlineFamily analysis interfaces, through the CNVs tab, if the CNVExome analysis was performed. The board is divided into two parts:

1. On the left, a table displaying newly identified CNVs with the possibility to filter results based on size and sample frequency or to select those belonging to OMIM morbid genes.

2. On the right, a detailed panel is available for each CNV and comprises four tabs:

CNV summary. This first tab contains information and annotations regarding the selected CNV, namely:

  • clinical regions and genes included within the interval affected by the CNV
  • overlaps between the CNV and the DGV database (http://dgv.tcag.ca/dgv/app/home)
  • short variants (SNV, indels) identified by GermlineVar within the same region
  • a section dedicated to the Variant Knowledge Base (VKB) displaying validated VKB entries sharing more than 95% identity with the selected CNV and their corresponding evaluation

A summary of those elements is available as metrics at the top of the page.

Evaluations. As with variants detected by GermlineVar, SeqOne users can define the pathogenicity class of each CNV and edit notes and interpretations within the comment section. New evaluations help enrich the laboratory’s own knowledge base.

Genome browser. The tab allows you to visualize the selected CNV in the IGV genome browser. To check the stability of the signal, up to 5 additional samples can be displayed in addition to the sample being analyzed.

UCSC. An external link to the UCSC genome browser allows you to visualize the genomic region of the CNV. The link opens a new page or tab in your web browser and applies the user’s personal settings, thereby offering a personalized view.

In addition to alignment files, several files can be downloaded:

  • the result of the CNV analysis as a tabular file
  • the ploidy for each chromosome
  • the predicted number of copies for each exon of the manifest

Setting up a CNV model

Like the pipelines for CNV analysis on panel data, the identification of CNVs from exome data is based on the analysis of the depth of coverage.

The approach, widely accepted in the scientific community, is based on the assumption that the number of reads covering a region is directly proportional to the number of copies of the locus within the sample. This approach, relatively simple to apply to gene panel data, is especially tricky for exome data. Indeed, the depth of sequencing on each region is very variable and influenced by several factors such as the initial amount of DNA, the complexity of the sample, the GC content of sequences, the efficiency of capture and sequencing, and the alignment.

Thus CNVExome is an individual analysis as each sample is compared to a model computed by SeqOne.
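
The proportionality assumption can be written down in a few lines. This sketch assumes that both depths have already been normalized for the factors listed above and that the model depth corresponds to a diploid state; the function names are hypothetical and the actual CNVExome model is more elaborate.

```python
def estimate_copy_number(sample_depth, model_depth, ploidy=2):
    """Copy number implied by the ratio of normalized depths."""
    if model_depth <= 0:
        raise ValueError("model_depth must be positive")
    return ploidy * sample_depth / model_depth


def classify(copies, ploidy=2):
    """Label a region by comparing the nearest integer copy number to ploidy."""
    nearest = round(copies)
    if nearest < ploidy:
        return "deletion"
    if nearest > ploidy:
        return "duplication"
    return "normal"
```

An exon covered at 45X where the model expects 90X yields one copy (a heterozygous deletion), while 135X yields three copies (a duplication).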

In order to take into account technical variations in sample preparation, we build this model from a large cohort of fifty to several hundred samples sequenced by the laboratory using:

  • the same protocol for library preparation
  • the same capture kit
  • the same sequencing platform

These requirements, and the variable number of samples needed to set up a model, are justified by the need for a cohort with homogeneous coverage. Several factors must therefore be considered, such as the depth of sequencing, the degree of variability within the cohort and the reproducibility of the sample preparation protocols.

How to get CNVExome?

Get in touch with SeqOne Customer Support at support@seqone.com to find out how to set up CNVExome for your organization.

SeqOne expands its repertoire of structural variants

Your GermlineVar and SomaVar worksets now include a new tool that detects structural variants up to 300 bp.

February 2021 Newsletter

The challenge of structural variants

Structural variants (SVs) are an important type of genetic variation, with a significant impact on phenotypic diversity [1]. Generally defined as insertions, deletions, duplications, inversions or translocations of 50 bp or more [2], they play a role in the development of many diseases, including certain cancers.

Schematic representation of some structural variants

Although less common than other forms of variation such as single-nucleotide polymorphisms (SNPs), structural variants can drastically alter the function of the genes in which they occur. They nevertheless remain far less studied than SNPs, mainly because they are difficult to detect.

Identifying SVs from NGS data

Each type of SV produces a specific pattern once the sequences have been aligned. SVs often disrupt that alignment, particularly with short-read sequencing: when an SV is as large as or larger than a sequencing read, it becomes difficult, if not impossible, to align the sequences correctly to the reference genome. These characteristic patterns are nonetheless signatures that can be exploited to detect SVs [2].

  1. Variable depth of coverage.

This criterion is mainly used to detect CNVs. It can also suggest the presence of other types of rearrangements, such as gene fusions, without giving a precise indication of the exact genomic coordinates of the fusion points.

Find more information on CNV detection in germline and somatic contexts on SeqOne.

  2. Discordant read pairs.

In paired-end sequencing, read pairs with an unexpected orientation or position can be used to infer the presence of an SV, although its exact genomic coordinates remain imprecise (the breakpoint itself is not directly sequenced).

  3. Split reads.

After alignment with tools such as BWA, some sequences are only partially aligned to the reference genome. The unaligned portion of these sequences is then masked by a process called soft-clipping. When the unaligned portion of a sequence is long enough to be aligned unambiguously at other genomic coordinates, the sequence is called a split read. Because these sequences by definition span the breakpoint, they make it possible to determine its genomic coordinates precisely.
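
The discordant-pair and split-read signatures both leave traces in standard SAM fields, which the sketch below reads directly: the FLAG bits for pairing and the CIGAR string for soft-clipping. The function names and the 20-base clipping cutoff are illustrative assumptions, and what counts as a "proper pair" is decided by the aligner, so this is only a first-pass filter and not how GRIDSS works.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_discordant(flag):
    """Paired read, both mates mapped, but not flagged as a proper pair."""
    paired = flag & 0x1
    proper_pair = flag & 0x2
    any_unmapped = (flag & 0x4) or (flag & 0x8)
    return bool(paired and not proper_pair and not any_unmapped)


def soft_clipped_bases(cigar):
    """Return (left, right) soft-clip lengths, ignoring hard clips."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def is_split_read_candidate(cigar, min_clip=20):
    """True when one end carries a soft clip long enough to realign."""
    return max(soft_clipped_bases(cigar)) >= min_clip
```

A read with CIGAR `60M40S` has 40 soft-clipped bases on its right end and would be kept as a split-read candidate, whereas `97M3S` would not.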

A dedicated module in your SeqOne pipelines

We have therefore integrated the SV caller GRIDSS into the GermlineVar and SomaVar bioinformatic pipelines. GRIDSS takes advantage not only of the signatures described above, but also of all other sequences showing soft-clipping or indels, as well as partially aligned read pairs, which it reassembles genome-wide before variant calling [3]. Adding this tool extends the repertoire of variants detected by our pipelines, which now includes SVs up to 300 bp.

This module complements our tools for detecting transposable element insertions and fusion genes.

References

[1] Sudmant, P.H., Rausch, T., Gardner, E.J., Handsaker, R.E., Abyzov, A., Huddleston, J., Zhang, Y., Ye, K., Jun, G., Fritz, M.H.-Y., et al. (2015). An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81.

[2] Alkan, C., Coe, B.P., and Eichler, E.E. (2011). Genome structural variation discovery and genotyping. Nat Rev Genet 12, 363–376.

[3] Cameron, D.L., Schröder, J., Penington, J.S., Do, H., Molania, R., Dobrovic, A., Speed, T.P., and Papenfuss, A.T. (2017). GRIDSS: sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly. Genome Res 27, 2050–2060.

UMI management for somatic analyses

Get the most out of UMIs by selecting one of SomaVar’s 3 new analysis modes, and browsing through a more comprehensive QC report.

February 2021 Newsletter

Introduction

PCR is an essential step in most NGS protocols, both for library preparation and for enrichment in targets of interest. For each unique molecule, it generates a variable number of clones, or duplicates.

This step is known to introduce biases, since it preferentially amplifies certain sequences and thus artificially increases the coverage of a given position in the genome. It also introduces errors, which can be problematic when searching for variants with low allele frequency.

Adding a unique molecular index (UMI, also known as a unique molecular identifier) to the sequencing preparation workflow, upstream of PCR, offers a solution to these problems. UMIs make it possible to identify all sequences originating from the same initial molecule, and thus improve sequencing accuracy by eliminating errors [1].

Benefits of using UMIs

UMIs offer two major advantages:

  • A more precise estimate of the allele frequency, through improved deduplication.

In the bioinformatic analysis pipeline, PCR duplicates are identified during the deduplication step: reads aligning to exactly the same position in the reference genome are treated as clones of a single initial molecule. Only one of these sequences is then retained as representative of the starting molecule for the remainder of the bioinformatic workflow.

However, two sequences with identical genomic coordinates could just as well come from two distinct molecules, originating from different cells. With a classic deduplication approach, these would be reduced to a single molecule. Signal is therefore lost for variant detection, and what remains is only a partial representation of the signal at this locus.

When each molecule is indexed prior to PCR amplification, each clone can be associated with its initial molecule: sequences that are identical after alignment but come from different molecules will carry different UMIs.

  • Increased specificity to identify low frequency variants.

Multiple PCR clones can be used to increase the quality of the representative sequence of the original DNA fragment. Since the fragment was duplicated before sequencing and then sequenced multiple times, multiple copies can be used to correct sequencing errors. By generating a consensus sequence from these duplicates, which relies on a majority vote for each position, we can then largely eliminate background noise.

This application then becomes particularly useful when searching for variants of very low allelic frequency, when sequencing circulating tumor DNA (ctDNA) for example, for which errors generated during PCR or sequencing can quickly become problematic.
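
Both advantages can be illustrated with a toy example: reads are grouped by alignment position and UMI instead of position alone, and a consensus is then built for each group by majority vote. The function names and the tuple layout of a read are hypothetical, and the sketch assumes equal-length reads with no UMI sequencing errors.

```python
from collections import Counter, defaultdict


def group_by_molecule(reads):
    """Group read sequences by (chrom, start, umi).

    reads: iterable of (chrom, start, umi, sequence) tuples.
    """
    families = defaultdict(list)
    for chrom, start, umi, seq in reads:
        families[(chrom, start, umi)].append(seq)
    return families


def consensus(seqs):
    """Majority vote at each position; assumes equal-length sequences."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))
```

Given four reads at the same position, three carrying one UMI and one carrying another, position-only deduplication would count one molecule, while UMI-aware grouping finds two. Within the three-read family, a single read with a sequencing error is outvoted by the other two.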

How to use SeqOne’s bioinformatics with UMIs

In practice, several methods of UMI processing are proposed on the platform when launching a compatible analysis:

  1. Standard mode (recommended): consensus sequences are generated from PCR duplicates with the same UMI when there are at least 2 of them. UMIs represented by a single sequence (singletons) are also preserved.
  2. High quality: consensus sequences are generated from PCR duplicates with the same UMI when there are at least 3 of them, and UMIs supported by 1 (singletons) or 2 reads are eliminated from the analysis. This mode is also more stringent on base quality after consensus generation, and allows the detection of variants with an allelic frequency below 1%. It is recommended when the sequencing depth is greater than 5000X, and for applications such as ctDNA sequencing.
  3. UMI disabled: UMIs are not used for deduplication, and they are trimmed from the end of sequences before analysis.
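
The difference between the two UMI-aware modes comes down to how UMI families are handled according to their size. The sketch below is an illustration with hypothetical names; the minimum of 3 reads for the high quality mode follows the comparison table further down, and base-quality filtering and the UMI disabled mode are not modeled.

```python
def apply_umi_mode(family_sizes, mode):
    """Split UMIs into those collapsed to a consensus and those kept as-is.

    family_sizes: dict mapping UMI -> number of reads carrying it.
    Returns (consensus_umis, kept_without_consensus), both sorted.
    """
    if mode == "standard":
        min_reads, keep_small_families = 2, True
    elif mode == "high_quality":
        min_reads, keep_small_families = 3, False
    else:
        raise ValueError(f"unknown mode: {mode}")
    consensus_umis = sorted(u for u, n in family_sizes.items() if n >= min_reads)
    small = sorted(u for u, n in family_sizes.items() if n < min_reads)
    return consensus_umis, small if keep_small_families else []
```

With families of 1, 2, 3 and 5 reads, standard mode builds three consensus sequences and keeps the singleton, while high quality mode builds two consensus sequences and discards the two smallest families.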

The quality control report of the SomaVar analysis now provides a more detailed view of the composition of the sample, in particular the distribution of UMIs according to the number of sequences carrying them. A better understanding of the sample’s profile can then guide the choice of the most appropriate analysis mode.

UMI distribution in terms of the number of reads carrying them within the sample

CNV analysis

Do you want to detect CNVs from your capture data with UMI? The SomaCNVCapture pipeline will now be available in your UMI projects as well.

Regardless of the configuration selected when launching your SomaVar analyses in this project, the analysis of CNVs will be based on a standard approach: consensus sequences will be generated from PCR duplicates with the same index, and singletons will be preserved.

Current limitations and backward compatibility

  • Only the following kits are currently supported on the SeqOne platform:

– QIAGEN QIAseq

– Agilent XTHS / Low input

– Agilent XTHS V2

– IDT xGen UDI-UMI

– Illumina TruSight Oncology 500 

If you use another protocol, contact us!

  • Only the SomaVar, SomaCNVCapture and SomaRNA worksets are compatible with UMI data.
  • Each of the two new UMI configurations (standard, high quality) differs from the previous SomaVar v1.4 implementation; the differences are summarized in the following table:

| | SomaVar v1.4 UMI | SomaVar v1.5 UMI standard | SomaVar v1.5 UMI high quality |
| Number of reads per consensus | 2 | 2 | 3 |
| Minimal base quality (phred score) | 30 | 30 | 40 |
| Reads outside consensus sequences filtered out | yes | no | yes |

Bibliography

[1] Kou, R. et al. Benefits and Challenges with Applying Unique Molecular Identifiers in Next Generation Sequencing to Detect Low Frequency Mutations. PLoS One 11, e0146638 (2016).

Multiple alignments in your integrated genome browser

The IGV visualization integrated into SeqOne is becoming more flexible and now lets you display multiple alignments in the same window.

October 2020 Newsletter

A modular display in IGV

To make navigation on the SeqOne platform smoother, and in particular the visualization of variants in the integrated genome viewer, we now generate a lightweight alignment centered on each sample’s variants at the end of the GermlineVar, GermlineFamily, SomaVar and SomaDuo analysis pipelines.

Schematic representation of how the minimal sample bam is generated

In practice, each sample now has two associated bam files:

  • the raw alignment file, available for download from the Files tab.
  • the new “minimal” bam file, or min.bam, generated at the end of the bioinformatic analysis and intended for IGV.

In addition, the integrated genome viewer is more flexible and now supports displaying multiple bam files. These can be alignments of different samples at the same position for comparison purposes, or alignments of the same sample (the raw alignment, or one generated by another analysis).

Usage

From the variant page, an options icon on the Genome browser tab gives you access to the alignment settings menu (Tracks settings).

Details of the options available in the genome browser

All the projects and samples in your account can then be selected from drop-down menus.

Finally, you can select the desired bam among those generated for the sample, whether it is its raw alignment (labeled sample BAM file) or the file generated by the latest pipeline run on the sample (labeled BAM from latest analysis).

Alignment selection menu in the Genome browser window