Evaluating Pathogenicity Scoring Systems for Missense Variants on Real-World Data: A Comparison Between AlphaMissense and the SeqOne DiagAI Pathogenicity Score


Predicting the impact of missense mutations remains a significant and complex challenge in human genetics. Approximately 98% of these amino acid substitutions remain of uncertain significance, which underscores how difficult it is to determine their effects on protein function and on health (Cheng et al., 2023).

Here, we provide a real-world comparison of scores that rank mutations according to their predicted pathogenic impact:

  • Google AlphaMissense, a novel pathogenicity scoring system rooted in AlphaFold’s predictive capabilities (Cheng et al., 2023)
  • The established scores REVEL (Ioannidis et al., 2016) and CADD (Rentzsch et al., 2018)
  • SeqOne DiagAI’s Universal Pathogenicity Prediction (“UPP”) score, which emulates ACMG guidelines with a machine learning model

Google recently unveiled AlphaMissense, a novel pathogenicity scoring system rooted in AlphaFold’s predictive capabilities (Cheng et al., 2023). AlphaFold, an AI system developed by DeepMind, accurately predicts protein structures from amino acid sequences (Jumper et al., 2021). Unlike ensemble methods that aggregate multiple existing scores, AlphaMissense builds on AlphaFold’s structural modeling to assess the pathogenic potential of missense variants, that is, mutations that change an amino acid in the protein sequence. The model combines insights from protein structure with the protein language modeling used in 3D protein structure prediction.

SeqOne has developed an advanced variant pathogenicity prediction score within its SeqOne DiagAI suite, designed to assist geneticists in diagnosing hereditary diseases. This model leverages a machine learning algorithm trained on a comprehensive dataset comprising over one million genetic variants from ClinVar. To enhance interpretability and closely align with expert evaluations, the model integrates features inspired by ACMG criteria, utilizing data from sources such as ClinVar, gnomAD, and dbNSFP. 


Whenever a new scoring metric is published, our team at SeqOne runs a preliminary benchmark on real-world datasets to assess its performance and to decide whether to integrate it into our platform and proprietary AI-based predictive models.

We assessed the performance of AlphaMissense alongside the established scores REVEL (Ioannidis et al., 2016) and CADD (Rentzsch et al., 2018), and our proprietary AI-driven pathogenicity score, “UPP” (Universal Pathogenicity Prediction), across 62 Whole Exome Sequencing (WES) analyses in which the diagnostic variant is a missense mutation. Because our UPP model is trained on ClinVar data, we excluded any diagnostic variant already cataloged in ClinVar. This exclusion limits potential benchmark biases favoring UPP or REVEL. The key performance indicator is the percentage of WES analyses in which the diagnostic variant appears among the top-ranked variants.
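The key performance indicator described above can be illustrated with a minimal Python sketch. It assumes each analysis is represented as a mapping from variant identifiers to scores, together with the identifier of the diagnostic variant. The function names, the tie-handling rule, and the toy data are illustrative assumptions, not SeqOne's implementation.

```python
from typing import Dict, List, Optional, Tuple

# One analysis = (scores per variant, identifier of the diagnostic variant).
Analysis = Tuple[Dict[str, float], str]


def rank_of_diagnostic(scores: Dict[str, float], diagnostic_id: str) -> Optional[int]:
    """Return the 1-based rank of the diagnostic variant, highest score first.

    Returns None when the diagnostic variant has no score.
    Assumption: tied variants share the best rank of the tie.
    """
    if diagnostic_id not in scores:
        return None
    target = scores[diagnostic_id]
    return sum(1 for s in scores.values() if s > target) + 1


def top_k_hit_rate(analyses: List[Analysis], k: int) -> float:
    """Percentage of analyses whose diagnostic variant is ranked in the top k."""
    if not analyses:
        raise ValueError("at least one analysis is required")
    hits = 0
    for scores, diagnostic_id in analyses:
        rank = rank_of_diagnostic(scores, diagnostic_id)
        if rank is not None and rank <= k:
            hits += 1
    return 100.0 * hits / len(analyses)


# Toy data (hypothetical scores, not taken from the benchmark).
toy_analyses: List[Analysis] = [
    ({"v1": 0.9, "v2": 0.4, "v3": 0.1}, "v1"),  # diagnostic variant ranked 1st
    ({"v1": 0.2, "v2": 0.8, "v3": 0.5}, "v1"),  # ranked 3rd
    ({"v1": 0.6, "v2": 0.7}, "v1"),             # ranked 2nd
    ({"v1": 0.3}, "v9"),                        # diagnostic variant not scored
]

print(top_k_hit_rate(toy_analyses, k=1))  # 25.0
print(top_k_hit_rate(toy_analyses, k=3))  # 75.0
```

Counting an unscored diagnostic variant as a miss is a deliberate choice in this sketch: a score that cannot rate the causal variant does not help the geneticist find it.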

Such a real-world benchmark helps clarify the tangible benefits these scores can offer biologists in clinical diagnostics and routine analysis workflows.


Benchmarking pathogenicity scores against the entirety of missense variants identified through Whole Exome Sequencing (WES) offers insights into their respective efficacies. 

REVEL is an ensemble method that combines numerous scores. For a large proportion of the WES samples, approximately 60%, REVEL ranks the diagnostic variant well. However, in terms of sensitivity, defined here as the proportion of diagnostic variants ranked in the top list of variants, AlphaMissense performs better.

CADD, previously regarded as the benchmark in this domain, does not surpass AlphaMissense in variant ranking, which suggests that the state of the art in scoring methods is shifting.

SeqOne’s proprietary “UPP” score outperforms the other prediction scores evaluated. This performance reflects UPP’s integration of a collection of predictive factors inspired by ACMG guidelines, including prediction scores, gnomAD allele frequencies, ClinVar mutation prevalence, and recognition of mutational hotspots. This holistic approach to pathogenicity scoring for missense variants ranks variants better than conventional methods.

Further analysis added an initial variant filtering step based on gnomAD frequency (population frequency < 0.01), a standard practice when searching for causal variants during variant interpretation. After gnomAD filtering, all methods perform significantly better, and the performance gap between them narrows. However, SeqOne’s UPP remains superior to all other prediction scores.
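As a rough illustration of the filtering step, the sketch below removes variants whose gnomAD allele frequency is 0.01 or higher before ranking the remainder by score. The `Variant` class, its field names, and the choice to keep variants absent from gnomAD are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    variant_id: str
    score: float                # pathogenicity score, higher = more damaging
    gnomad_af: Optional[float]  # population allele frequency, None if absent


def filter_rare(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    """Keep variants with population frequency strictly below max_af.

    Assumption: variants absent from gnomAD (gnomad_af is None) are treated
    as rare and kept.
    """
    return [v for v in variants if v.gnomad_af is None or v.gnomad_af < max_af]


def rank_variants(variants: List[Variant]) -> List[Variant]:
    """Sort variants from highest to lowest score."""
    return sorted(variants, key=lambda v: v.score, reverse=True)


# Toy data (hypothetical values).
variants = [
    Variant("common_high_score", 0.95, 0.12),
    Variant("diagnostic", 0.90, None),
    Variant("borderline", 0.50, 0.01),
    Variant("rare_low_score", 0.20, 0.0004),
]

before = [v.variant_id for v in rank_variants(variants)]
after = [v.variant_id for v in rank_variants(filter_rare(variants))]
print(before[0])  # common_high_score
print(after)      # ['diagnostic', 'rare_low_score']
```

In this toy case a common variant with a high score outranks the diagnostic variant until the frequency filter removes it, which is the kind of effect that would help every score after filtering.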

Figure: Percentage of analyses in which the diagnostic variant identified by a geneticist is in the top-ranked list of variants, according to different scoring systems: AlphaMissense, CADD, REVEL and SeqOne UPP.


A notable limitation of the UPP score stems from its dependency on a range of predictive and annotation metrics customary in clinical contexts. This dependency implies that while the model is proficient in evaluating variants with established clinical significance for medical diagnostics, its capability may diminish when encountering novel variants lacking recognized clinical importance. The continuous evolution of the ClinVar database, marked by regular and significant updates to variant classifications, necessitates an ongoing refinement of our model to remain in sync with the latest developments. The most recent update to our model aligned with the November 2022 release of ClinVar, and we have instituted a regimen of periodic updates to seamlessly integrate new information from ClinVar.

Moreover, our model is designed to identify variants not yet annotated in ClinVar that exhibit features consistent with those identified as pathogenic in the database. However, it’s important to note that while ClinVar categorizes certain variants as pathogenic, our evaluations have flagged some of these as potential “artifacts” or “false positive pathogenic” variants, employing specific criteria for such determinations.

To augment the model’s precision in distinguishing authentic pathogenic variants from potential artifacts and false positives, we have incorporated frequency data from gnomAD into our analyses. This integration aims to bolster the model’s discernment capabilities, thereby enhancing the accuracy and reliability of the diagnostic process.
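One way to picture this use of frequency data is a simple check that flags variants labeled pathogenic in ClinVar but observed at a high allele frequency in gnomAD. The sketch below is only an illustration: the 5% cutoff is borrowed from the ACMG BA1 stand-alone benign criterion as an assumption, and the function name and label handling are hypothetical. The criteria actually used by SeqOne are not described here.

```python
from typing import Optional

# Assumption: 5% cutoff, borrowed from the ACMG BA1 criterion for illustration.
BA1_LIKE_AF = 0.05

# Simplified set of ClinVar labels treated as "pathogenic" in this sketch.
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}


def is_suspect_pathogenic(
    clinvar_significance: str,
    gnomad_af: Optional[float],
    max_af: float = BA1_LIKE_AF,
) -> bool:
    """Flag a ClinVar-pathogenic variant that is too common to be plausible.

    Returns False when the variant is not labeled pathogenic or when no
    gnomAD frequency is available.
    """
    is_pathogenic = clinvar_significance.strip().lower() in PATHOGENIC_LABELS
    return is_pathogenic and gnomad_af is not None and gnomad_af > max_af


print(is_suspect_pathogenic("Pathogenic", 0.12))            # True
print(is_suspect_pathogenic("Likely pathogenic", 0.00001))  # False
```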


The introduction of AlphaMissense is poised to influence the predictive score landscape profoundly. Its standalone performance rivals that of established benchmarks.
AlphaMissense will likely soon be integrated into ensemble approaches such as REVEL and MetaRNN, improving their ability to predict pathogenicity for missense variants.

SeqOne has created UPP, a component of the SeqOne DiagAI suite, tailored for everyday clinical diagnostics. It follows the principles of explainable AI and is aligned with ACMG guidelines.

SeqOne is dedicated to equipping biologists with a sophisticated tool designed to streamline the decision-making process, thereby enhancing the efficiency of diagnostics and contributing to superior patient outcomes.

SeqOne’s proprietary AI-driven pathogenicity score, “UPP” (Universal Pathogenicity Prediction), is available as a standalone API and is part of the SeqOne clinical genomics cloud platform. To learn more, contact science@seqone.com.



Cheng, J., Novati, G., Pan, J., Bycroft, C., Žemgulytė, A., Applebaum, T., Pritzel, A., Wong, L. H., Zieliński, M., Sargeant, T., Schneider, R. G., Senior, A. W., Jumper, J., Hassabis, D., Kohli, P., & Avsec, Ž. (2023). Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science, 381(6664). https://doi.org/10.1126/science.adg7492

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., . . . Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2

Ioannidis, N. M., Rothstein, J. H., Pejaver, V., Middha, S., McDonnell, S. K., Baheti, S., Musolf, A. M., Li, Q., Holzinger, E. R., Karyadi, D. M., Cannon-Albright, L., Teerlink, C. C., Stanford, J. L., Isaacs, W. B., Xu, J., Cooney, K. A., Lange, E. M., Schleutker, J., Carpten, J. D., . . . Sieh, W. (2016). REVEL: An ensemble method for predicting the pathogenicity of rare missense variants. American Journal of Human Genetics, 99(4), 877–885. https://doi.org/10.1016/j.ajhg.2016.08.016

Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J., & Kircher, M. (2018). CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Research, 47(D1), D886–D894. https://doi.org/10.1093/nar/gky1016