
Analysis of microsatellite instability (MSI) in a somatic context

Published on 28 December 2021

A workset dedicated to the analysis of microsatellite instability (MSI) in a somatic context is now available for DNA panels.

December 2021 Newsletter


A microsatellite region is a genomic region consisting of 5 to 50 repetitions of nucleotide motifs of 1 to 6 base pairs. These regions exist at multiple loci throughout the genome [1].

Microsatellite instability (MSI) refers to changes in the length of repeated sequences that sometimes occur in tumours. It arises from genomic instability in tumour cells that have a reduced ability to replicate their DNA accurately.

MSI is a frequent marker of functional inactivation of DNA mismatch repair (MMR) genes, namely MLH1, MSH2, MSH6, PMS1 and PMS2. DNA mismatch repair enzymes normally remove single or multiple misincorporated bases resulting from random errors during DNA replication or recombination, so functional loss of mismatch repair may lead to microsatellite instability. MSI is observed in many types of cancer, most often in colorectal, gastric and endometrial cancers [2]. Knowing the microsatellite status of a tumour may help in choosing the most appropriate treatment.

SeqOne now offers a tool dedicated to analyzing microsatellite regions in order to determine the MSI or MSS (microsatellite stable) status of a tumour sample.

How does it work?

Determining whether the microsatellite regions of a tumour sample are stable or unstable requires two preliminary steps:

  • identifying the microsatellite regions sequenced in the sample
  • determining the ‘normal’ microsatellite status

To this end, the regions covered by the manifest are first scanned to identify microsatellite regions.
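As an illustration of what such a scan can look like, here is a minimal sketch that applies the definition given earlier (motifs of 1 to 6 base pairs repeated 5 to 50 times) to a plain DNA string. It is not the scanner used by SeqOne or by MSIsensor; the function name `find_microsatellites` and the returned tuple layout are assumptions made for this example.

```python
import re

# Assumption: a motif of 1-6 bases followed by 4-49 further copies of itself,
# i.e. 5 to 50 repeats in total. The lazy quantifier makes the regex prefer
# the shortest motif, so "AAAAAAAA" is reported as 8 x "A", not 4 x "AA".
MICROSATELLITE = re.compile(r"([ACGT]{1,6}?)\1{4,49}")

def find_microsatellites(sequence):
    """Return (start, end, motif, repeat_count) for each repeat run found.

    Coordinates are 0-based and end-exclusive, relative to `sequence`.
    """
    hits = []
    for match in MICROSATELLITE.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), match.end(), motif, repeats))
    return hits

# Hypothetical target region containing a poly-A run and a CA dinucleotide repeat.
region = "GGC" + "A" * 8 + "TCG" + "CA" * 6 + "TTG"
for start, end, motif, repeats in find_microsatellites(region):
    print(f"{start}-{end}: {repeats} x {motif}")
```

In practice the scan would be restricted to the intervals listed in the panel manifest; a run longer than 50 repeats would be split by this pattern, which is a limitation of the sketch.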

Then a reference sample is generated from cohort samples presumed to be negative controls. Since the MSI status of the cohort samples is unknown, this selection relies on the deletion rate observed in microsatellite regions: the half of the cohort with the lowest deletion rates is selected to generate the reference sample, and each selected sample contributes a fraction of its microsatellite sequences.
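The selection step can be sketched as follows. This is only an illustration under stated assumptions: the deletion rate is taken to be the share of reads carrying a deletion among all reads covering microsatellite regions, an odd-sized cohort is rounded down, and the function name and input layout are hypothetical. The pooling of reads into the reference sample itself is not shown.

```python
def select_reference_samples(deletion_counts):
    """Pick the half of the cohort with the lowest microsatellite deletion rate.

    deletion_counts maps a sample name to a pair
    (reads_with_deletion, total_reads), both counted over microsatellite
    regions. This input layout is an assumption made for the example.
    """
    rates = {
        sample: deleted / total
        for sample, (deleted, total) in deletion_counts.items()
        if total > 0  # ignore samples without coverage
    }
    ranked = sorted(rates, key=rates.get)  # lowest deletion rate first
    keep = max(1, len(ranked) // 2)        # assumption: round down, keep at least one
    return ranked[:keep]

# Hypothetical cohort of four samples.
cohort = {
    "S1": (5, 100),
    "S2": (30, 100),
    "S3": (2, 100),
    "S4": (12, 100),
}
print(select_reference_samples(cohort))
```

The reasoning behind the heuristic is that unstable tumours tend to show more length changes in repeats, so the samples with the lowest rates are the most plausible stable controls.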

Each tumour sample is then compared to the reference sample using MSIsensor [3]. A score corresponding to the percentage of microsatellite regions exhibiting instability is calculated.
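The score described above is a simple percentage. The sketch below shows the arithmetic only; it does not call MSIsensor, and the function name `msi_score` and its two inputs are assumptions made for the example.

```python
def msi_score(unstable_sites, total_sites):
    """Percentage of evaluated microsatellite sites found to be unstable."""
    if total_sites == 0:
        raise ValueError("no microsatellite sites were evaluated")
    return 100.0 * unstable_sites / total_sites

# Hypothetical sample: 6 unstable sites out of 40 evaluated sites.
print(msi_score(6, 40))  # 15.0
```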

Based on this score, the MSI or MSS status of each sample is determined by comparison with a threshold calculated from the distribution of scores in the cohort. If all samples of the run have an MSS status, the threshold is set at 10 to avoid generating false positives.
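The final call can be sketched as a comparison against a threshold. How the threshold is derived from the cohort distribution is not described here, so the sketch takes it as an input and falls back to 10 when none is supplied. The function name, the use of `None` to represent "no cohort threshold", and the choice of `>=` for the comparison are all assumptions.

```python
FALLBACK_THRESHOLD = 10.0  # value quoted in the text for runs where all samples are MSS

def classify(scores, cohort_threshold=None):
    """Assign 'MSI' or 'MSS' to each sample from its score.

    scores maps a sample name to its MSI score (a percentage).
    cohort_threshold is the cohort-derived cut-off; None means that no such
    cut-off is available and the fallback value is used (assumption).
    """
    if cohort_threshold is None:
        threshold = FALLBACK_THRESHOLD
    else:
        threshold = cohort_threshold
    return {
        sample: "MSI" if score >= threshold else "MSS"
        for sample, score in scores.items()
    }

# Hypothetical run with a cohort-derived threshold of 12.
print(classify({"A": 3.2, "B": 27.5}, cohort_threshold=12.0))
# Hypothetical run without a cohort threshold: the fallback of 10 applies.
print(classify({"A": 3.2, "B": 8.0}))
```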

Using SomaMSI on SeqOne

SomaMSI is a full-fledged analysis workset for gene panel data.

Like SomaCNVCapture, SomaMSI is a cohort-based analysis.

Once the analysis is complete, the results for each sample are accessible within the SomaMSI analysis interface under the ‘Results’ tab, which is divided into two parts:

  • A graphical representation displays the MSI score of each sample. The threshold determined for the cohort of samples delimits two zones on the graph, respectively for samples considered as MSI and MSS.
  • Details about the score and status assigned to each sample are displayed in the right panel.


The SomaMSI analysis workset is compatible with data including UMIs, provided the UMI configuration was specified when the project was created.

In the case of UMI data, only “standard” and “disabled UMI” configurations will be available at the start of the analysis.


[1] Schlötterer C, Harr B (March 2004). Microsatellite Instability

[2] Boland CR, Goel A. Microsatellite instability in colorectal cancer. Gastroenterology. 2010; 138(6): 2073-2087.e3. doi: 10.1053/j.gastro.2009.12.064

[3] Niu B, Ye K, Zhang Q, Lu C, Xie M, McLellan MD, Wendl MC, Ding L. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics. 2014 Apr 1; 30(7): 1015-6. doi: 10.1093/bioinformatics/btt755. Epub 2013 Dec 25. PMID: 24371154; PMCID: PMC3967115.
