ClinVar is a public archive of information submitted by genetic testing labs on the relationships between medically important genetic variants and their clinical characteristics found in patients. Now, all published variants, pathogenicity interpretations, and other key information within ClinVar’s database are included as part of Mastermind’s comprehensive body of evidence.
This new integration means that clinical users can keep their workflow contained within a single easily searchable interface. By streamlining variant interpretation in this way, Mastermind accelerates diagnosis and enriches clinical reporting with actionable insights not found in ClinVar.

Dr. Mark Kiel is the co-founder and chief scientific officer at Genomenon, where he oversees the company's scientific direction and product development. Mark received his MD/PhD in Clinical Pathology at the University of Michigan. He founded Genomenon to address the challenge of connecting researchers with evidence in the literature to help diagnose and treat patients with rare genetic diseases and cancer.

Anna has spent her training and early career helping patients to understand their genetic diagnoses and the health risks associated with those diagnoses. This effort continues at Genomenon, where she supervises a team of quality assurance specialists reviewing the curated variant data in our pursuit of full curation of the human genome.
WEBINAR TRANSCRIPT
GARRETT: Hello, everyone, and welcome to today’s webinar, where we’ll be discussing how Mastermind’s new ClinVar integration can be used to accelerate diagnosis in the clinical lab! My name is Garrett Sheets, and I’ll be your host. Let’s get started!
The Mastermind genomic search engine is the most comprehensive source of genomic evidence, and can be used to quickly identify papers for patient diagnosis and treatment decisions. In today’s webinar, our speakers will explore our latest integration with ClinVar, which allows you to keep your workflow within a single, easily-searchable interface. By streamlining variant interpretation in this way, Mastermind not only accelerates diagnosis, but also enriches clinical reporting with actionable insights that may not be found in ClinVar. So, a really interesting topic today, especially for those of you involved in clinical decision-making!
Keep in mind that today’s presentation is going to include Professional Edition features of Mastermind. If you don’t already have a Mastermind account, you can create one today with the bit.ly link that you see on your screen and start with a free trial of Mastermind Pro, so do take advantage of that. As always, we have a lot of great information to share today, so I’ll cover some housekeeping and then get into our introductions.
If you’re joining us live, feel free to drop your questions down into the Q&A, and if we have time, we’ll get to those at the end of our presentation. Also, please know, this webinar is being recorded and will be emailed to you once we’ve wrapped up. Without further ado, we’ll introduce our speakers today!
We’re joined by Dr. Mark Kiel, our co-founder and Chief Scientific Officer. Hi, Mark. Mark oversees scientific direction and product development, and founded Genomenon to address the challenge of connecting researchers with evidence in the literature to help diagnose and treat patients with rare diseases and cancer. It’s really awesome.
We’re also joined by Anna McGill, our product quality manager. Hi, Anna. Anna supervises a team of quality assurance specialists here at Genomenon that reviews our curated variant data and ensures that we maintain the highest data quality standards. It’s a really important job. Mark and Anna, thank you both for being here to share your expertise. I can’t wait to get started! Mark’s going to kick us off with an overview of Mastermind and some features of the ClinVar integration, and then Anna will join the conversation. Mark, I’ll stop talking, it’s all yours.
MARK: Alright, thank you so much, Garrett! I’m gonna go dark so that you can see the screen a little bit better. I hope I’m preserved there. I’ll advance the slide here and get started. Thank you, attendees, for joining. I haven’t given a webinar in a while — it’s always great to connect with our audience of users and folks who may be new to Mastermind and its capabilities. I wanted to start by describing, at a high level, what we’re here to talk about; what is ClinVar; very briefly describe what is Mastermind; but also, emphasize how they complement each other as a way to set Anna up for going through some illustrative examples of why this is such a valuable integration, and how it can streamline clinical variant interpretation workflows.
So, the first thing that I want to do, as I say, is describe to you what ClinVar is. Undoubtedly, many of you are familiar with ClinVar, but I think it’s instructive to take a step back and really go through some of the aspects of what ClinVar has and doesn’t have, and how Mastermind can complement some of those features and data aspects that ClinVar may be missing. Let me begin by emphasizing that ClinVar is an extremely valuable resource. It’s a resource that comprises clinically-encountered genetic variants, and it’s publicly available, intended to showcase the way those genetic variants, as they’re encountered in clinical practice, are associated with clinical phenotypes, diseases, and patient attributes that are seen by the physicians who are encountering these patients. It’s also intended to facilitate communication across the ecosystem about those relationships, and to keep that data as up-to-date as possible by accepting batch submissions of these reports from various different labs, large and small.
There are, however, some limitations to ClinVar just by nature of the design, and those limitations prevent it from being a stand-alone resource. This is where I want to highlight some of those limitations, given how it was designed and what its intended use is, but then, to speak specifically as to how Mastermind can fill those gaps in the capability of ClinVar, and together, through this integration, become a standalone resource for surfacing this evidence to maximize diagnostic yield and diagnostic accuracy.
So the first aspect of ClinVar’s limitations that I want to highlight is that it’s missing a substantial number of clinically-relevant variants. The number of variants that are in ClinVar is about one and a half million, which is a substantial number of variants, but as you’ll see here on subsequent slides, that is not the totality of the disease-causing variants. In fact, given how ClinVar is situated as being user-submitted or crowd-sourced, there’s a tendency for those variants to skew toward benign variants, or otherwise, variants that are incidentally found in individuals through sequencing, but that aren’t associated with disease causation. So there’s a skew in the nature of the variants that are found in ClinVar, they are more likely to be benign and more likely to be VUS. Obviously, this varies widely across genes. Different genes have different profiles of the nature of these variants, but that is a tendency across ClinVar variants: that there’s a healthy number of them that don’t cause disease.
Further, when there are variants that are associated with disease, many of them lack evidence citations. In particular, many of those that do have evidence citations lack relevant evidence citations. They lack variant-specific information, and instead, they point to the interpretation frameworks that were used to make the determination about the meaningfulness of that variant, but no such variant-specific information is present in those citations. That can be a real limitation, particularly if there’s a variant that’s associated with a case, but that lacks sufficient evidence to promote that interpretation from VUS to likely pathogenic. Lacking that sufficiency of evidence can be a real challenge in clinical workflows. I’ll leave it to Anna to highlight that with some relevant examples. Further, there are often situations where there are conflicting interpretations by different submitters of different backgrounds, with different criteria, different levels of evidence, and rigor of interpretation. Those conflicts can make it challenging to sort through the truth from the noise.
Lastly, and not unimportantly, there’s frequently a lack of consistency in disease annotations: either missing disease annotations, or otherwise lazy annotations, where the association with disease is just at the gene level for the variant that was submitted to ClinVar. In other words, there may be a benign variant that was submitted in a disease-causing gene, and then in the disease field, the disease that’s associated with that gene for pathogenic variants is put into that disease designation for an otherwise benign variant. That can cause confusion and create challenges, particularly when that disease field is populated by multiple diseases, making it really challenging to disaggregate what, in fact, was seen by the submitter, if it was in fact in a clinical circumstance.
So into this milieu, about nine years ago now, Genomenon introduced a capability and a database with a user interface that solved these problems. The way that we solve those problems is through one of our core technological capabilities, which we refer to as “genomic language processing.” I call this the order-from-chaos slide, where you can think about all of the evidence necessary to accurately interpret these genetic variants in a clinical workflow being dispersed widely across numerous publications (millions, in fact) with complicated and unstructured nomenclatures for diseases, phenotypes, therapies, and most importantly, for genes and the genetic variants in those genes. Genomenon’s genomic language processing, or GLP, has brought that chaos to order in a maximally sensitive way, recognizing all of the different ways that authors can describe those terms and all of the idiosyncrasies of variant nomenclature, whether or not they follow HGVS standard guidelines, and then organizing all of those results that are returned per paper, across variants and across genes, into what we call genomic associations.
The fruits of that effort are what are patterned into the Mastermind genomic search engine. I’m going to very quickly buzz through the treetops here about what Mastermind’s feature sets look like, in case attendees are not familiar, but Anna’s going to go into more depth with specific examples. What I really want to emphasize on this slide is how those deficits that we talked about, the intrinsic limitations of ClinVar, are perfectly complemented by Mastermind’s capability. Mastermind is really good at highly sensitively identifying clinically relevant information from the published literature, the full text, and supplemental data. Because our source of evidence is the published literature, if you look at the breakdown of variants that are found in Mastermind, they tend to skew toward the pathogenic and the case-related variants, in contrast to the variants submitted to ClinVar, which are encountered in routine clinical sequencing and tend to skew toward benign and VUS. In fact, the variants that are found in Mastermind are either pathogenic, likely pathogenic, or otherwise, variants of uncertain significance (but demonstrably associated with disease-related cases from the literature).
So Mastermind has a great burden of these genetic variants, a substantial fraction of which are not in ClinVar. The number of variants in ClinVar is around a million and a half, and as Garrett mentioned, the number of variants that are found in Mastermind is nearly 19 million. With that in mind, and appreciating the skew in ClinVar toward the benign and the VUS, and the skew in Mastermind toward the pathogenic and case-related variants, you can see a great wealth of information, a great number of variants that are just not captured in ClinVar. Each one of the variants that are found in Mastermind has a comprehensive catalog of evidence citations supporting your interpretation workflow for coming to those conclusions about the variant being likely path or pathogenic. In that context, the evidence is useful for providing unbiased insight to help reconcile some of those conflicts or to help associate those variants based on evidence with the diseases to make the necessary clinical designations on a per variant basis.
Again, for those of you who are unfamiliar, I’ll just very quickly go through the feature capabilities of Mastermind and leave the rest to Anna’s demo. At the top in the Mastermind interface are some search features: searching on genes, variants, keywords, diseases, phenotypes, etc. On the left is a complete catalog of all of those variants in the gene that has been searched, in visual form and in list form. Then, on the right is an itemization of the articles that are found for the gene, or for any one of those particular variants. In this view, I’ve searched on a variant, and what you see at the bottom is a highlighted title and abstract, and the full-text sentence fragment context where the variant is mentioned in any one of those references that you seek to examine in your workflow.
What we’re talking about today is the integration of ClinVar designations, which now appears at the top. It’s open by default. For this particular variant that I’ve searched on, which is a common variant in diseased patients who have ENPP1 deficiency (obviously, it’s rare or absent in the healthy general population), the ClinVar integration now displays what the ClinVar determination was, and provides a link out to ClinVar for this variant. The ease of searching in Mastermind can now be used for searching through ClinVar as well, in the Mastermind application itself. As I said, I’ll leave the details to Anna for her demo.
The other thing that I wanted to point out here is, for this particular variant, that’s extremely damaging when found in patients, and it’s common among patients with ENPP1 deficiency, this is a variant that we happen to have curated with Anna and my team of curators and QA specialists. We’ve gone through and reviewed all that evidence in Mastermind and made our determination based on ACMG guidelines: that, in fact, this variant is pathogenic in contrast to the ClinVar call. So that’s a little bit of foreshadowing of another webinar that we’re going to have in some weeks’ time, describing Genomenon’s capability of systematically and at high throughput and high accuracy, curating this information, gene by gene, with the ultimate goal of curating this information across the entire genome. We refer to this as disease-specific curated content. That’s just a little bit of a teaser for an upcoming webinar that the Genomenon team will be hosting.
The benefit of Mastermind, I’ll highlight briefly, is obviously to reduce turnaround time. Having the evidence pre-organized and ready at your beck and call when you execute a search can accelerate your time to results. I’ve alluded to the strong increase in diagnostic yield, with a significant number of additional variants, in many cases, per gene, it’s 3 to 10 times more, as well as the dramatic increase in the content, the number of reference citations per variant. That’s also a great feature for diagnostic workflows. Put these together, and that lends your workflow to more automation and more ready scalability. Specifically, what I want to speak to here, with respect to our ClinVar integration — I’ve already highlighted the ease of search of ClinVar, leveraging the genomic language processing capabilities in Mastermind that makes searching for variants dramatically more simple and more sensitive. It also lends itself to increased diagnostic yield, as Anna will go through, increasing the amount of evidence that you can see to promote higher specificity and greater accuracy of your variant interpretation, including overturning calls or reconciling conflicting calls critically, with the necessary evidence from the primary literature, the functional studies, and the clinical studies, which are necessary in clinical workflows.
With that, I’d like to turn it over to Anna. I’ll just say a quick word while we’re getting situated and Anna’s changing her screen. I can’t say enough how fortunate I feel to have Anna on our team. It was recently her one year anniversary with Genomenon, and she has a wealth of experience. She’s made a very significant impact not just on the operations of the team, but on the functioning and the quality and how much we all enjoy working together. Anna, I hope that I’ve ceded control to you, if you can take it away with your demonstrations, then I’ll return for some Q&A if we have time at the end.
ANNA: Okay, yeah. Thank you, Mark, for that lovely introduction! Now, we enter the demonstration portion of this webinar. I hope to demonstrate to you all of the use cases for which this ClinVar integration can really help your workflow. Before I go into the meat of these examples, I did want to review the user interface for those of you that may be unfamiliar. Up in the top corner here, we have the search bar. In this case, we have our gene of interest and our variant of interest, which we’ve used as search terms. You can also add additional keywords or phrases that can help refine your results. As Mark pointed out on one of his slides, we now have this ClinVar tab in the variant info section. I’ll go ahead and review that in further detail shortly. We have our variant diagram here over in this window, and you can see the residues of the protein that we’ve selected here on the x-axis, and the citations per variant here on the y-axis. This is an interactive tool. You can go to the right, to the left, and you can zoom in and out. It’s just really meant to show if there are any hot spots within the protein that are very well published. Down here in the variant list, we have the variant itself that’s been searched for. P691Q is the variant that was searched for, and that’s highlighted here in blue. You can see that Mastermind has identified four total articles with this gene and variant combination.
In this bottom left corner, we have the PubMed data, which is simply the title and the abstract of the article that’s selected over in the articles field here. We have the publication history in this window. We have the date of publication and the citations per journal of the articles that have been indexed by Mastermind for our gene and variant combination. You can see here that each bubble corresponds to a particular article, and the size of these bubbles is a little bit different. The size of the bubble does correspond to the relevance of the article for the search terms that we’ve provided. If we added additional search terms, the size of these bubbles could change. We’ll go down here to our articles list. These are the four articles that Mastermind has indexed for this particular variant, and you can see they’re sorted by relevance. Down here, on the bottom right corner, we have our full text matches. This is where the sentence fragments that contain the variant and the protein are seen, so we can quickly scan what that PubMed ID might actually contain and make an educated guess as to whether or not we should open that to review further, or if it may have important data for our classification purposes.
With that review, I’ll go ahead and jump into this first example. Here we have NPC1, and the variant of interest is P691Q. NPC1 is causative of Niemann-Pick disease type C, which is an abnormality of fat metabolism. There’s no cure for this disease, unfortunately, but there are treatment options. It’s really important to get these patients the correct diagnosis. I will just note here that the c. (cDNA-level) nomenclature has been used to search, but we’ve been able to index that to the one-letter amino acid abbreviation, as well as the three-letter amino acid abbreviation. Searching with any type of nomenclature can bring us to the correct result, which makes it easy to pull up the ClinVar record, since ClinVar itself is a little bit more difficult to search. In this case, we do have a ClinVar entry for this particular variant, and as I mentioned before, we have four articles within Mastermind.
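To make the nomenclature point concrete, here is a minimal Python sketch of converting a simple missense variant between the one-letter and three-letter protein forms. It is only an illustration and is not how Mastermind's genomic language processing works; the function name `protein_forms` is hypothetical, it handles plain substitutions only, and mapping a c. (cDNA-level) description to a protein change is deliberately left out because that requires a reference transcript.

```python
import re

# Standard one-letter to three-letter amino acid codes.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
AA1 = {three: one for one, three in AA3.items()}

# Accept forms such as "p691q", "P691Q", "p.P691Q" and "p.Pro691Gln".
SHORT = re.compile(r"^(?:p\.)?([A-Za-z])(\d+)([A-Za-z])$")
LONG = re.compile(r"^(?:p\.)?([A-Za-z]{3})(\d+)([A-Za-z]{3})$")


def protein_forms(text):
    """Return (one_letter, three_letter) forms of a simple missense variant.

    Hypothetical helper for illustration; raises ValueError for anything
    that is not a plain amino acid substitution.
    """
    text = text.strip()
    long_match = LONG.match(text)
    short_match = SHORT.match(text)
    if long_match:
        ref = AA1.get(long_match.group(1).capitalize())
        alt = AA1.get(long_match.group(3).capitalize())
        pos = long_match.group(2)
    elif short_match:
        ref = short_match.group(1).upper()
        alt = short_match.group(3).upper()
        pos = short_match.group(2)
    else:
        raise ValueError(f"not a simple substitution: {text!r}")
    if ref not in AA3 or alt not in AA3:
        raise ValueError(f"unknown amino acid in: {text!r}")
    return f"{ref}{pos}{alt}", f"p.{AA3[ref]}{pos}{AA3[alt]}"


if __name__ == "__main__":
    for query in ("p691q", "p.Pro691Gln", "T1088I"):
        print(query, "->", protein_forms(query))
```

Either spelling of the same change resolves to the same pair of forms, which is the property that lets one search term reach records written in a different style.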
Let’s look at this ClinVar data in detail. We see that it’s a likely pathogenic classification, and the info that’s taken from ClinVar is easily viewable here. We can see that it’s a single nucleotide variant, it has one star status from one submission, and it was fairly recently updated. As Mark alluded to, we can then directly link out to ClinVar, no need to do an additional search and figure out the correct nomenclature to be able to perform that search. We can just directly link out here. I’ll go ahead and scroll to the bottom just to look at this particular example. In here, as it showed on the Mastermind page, we have one interpretation, one submitter, and that’s a likely pathogenic interpretation. We can go ahead and actually expand the details here, and we can see that this submitter was really helpful in how they submitted their information: they actually provided all of the details for which they could base their classification on. However, we’ll notice here that they didn’t submit any citations for this variant, so they didn’t actually base this classification on any literature.
We can go back to our Mastermind example. We can see here that there were four articles that Mastermind did find that have this gene and variant match. Perhaps what’s in ClinVar is enough for us to feel confident, with a likely pathogenic call, but additionally, we know that there could be more evidence that could lead us to the correct classification or the most well-informed classification. In this case, even just looking at the surrounding information here, it looks like there could be some more case-level data that might even allow us to bring this up to pathogenic.
Now, I will move to my second example. This example is in the gene F8, which is causative of hemophilia type A. For this variant, T1088I, you’ll notice here that there’s no record for ClinVar in the variant info section. In this case, we can be confident, based on Mastermind’s genomic language processing, that there was not a record in ClinVar to be found. There’s no need to perform a separate search to confirm this, but that doesn’t mean that there’s not any usable information here that can help inform a classification above a baseline VUS classification.
If you notice here, Mastermind did find four articles that have this gene and variant match. Again, they’re sorted by relevance, so we’ll just go ahead and take a look at this very first article. We can see that there actually may be some relevant data that could help us with some case-level data to further inform what the overall classification of this should be. I’ll go ahead and pull up the PDF of this particular example. We can see that in table one, there’s a summary of mutations in 103 hemophilia patients. That is the phenotype that we were concerned about in this case, and we do see here that it was actually found in one patient with severe disease.
Based on what may be contained in these other three publications, the intrinsic properties of this variant itself, it could have damaging computational models. It could be absent from gnomAD, etc. This actually could be enough data to bring this up to pathogenic or likely pathogenic. At the very least, it gives us a little bit of suspicion that this could be causative for disease if we have a patient that’s seen with this particular variant.
My next example is in NF1. The particular variant is L58R. NF1 is a little bit unique in that many of the pathogenic variants found within this gene occur de novo, and they are not necessarily recurring de novo variants. They may be only seen in one family or in one patient, which leads ClinVar to have a little bit of an edge over Mastermind, in that it’s easy for labs to identify these variants and to push them to ClinVar, whereas they may not necessarily be published. While there are a lot of published variants in NF1, this is more often than not the case for this particular gene. In this one example, we have a ClinVar entry here. However, there are actually no Mastermind search results for this particular gene and variant match. This just goes to show that this can be a one-stop shop for you. You can go ahead and search the variant here, and if there is a ClinVar entry, it will pull up, even if there’s no Mastermind entry for that same variant.
We’ll go ahead and review this example: it’s a likely pathogenic classification. We’ll go ahead and pull it up in ClinVar to see what we can see. Here, we have a likely pathogenic classification, as stated, and the submitter was really nice here in that they submitted a lot of information to support this claim. We have case data here, in that it’s found in a patient with neurofibromatosis type 1. We have some additional evidence here, and perhaps that’s enough for us to be confident with a likely pathogenic call. This example is just meant to demonstrate that, really, you are getting the best of both worlds from ClinVar and Mastermind here.
My next example is in ATP7B. The variant of interest here is i-899-del, which is a deletion of eight amino acids. ATP7B, when it’s mutated, is causative of Wilson disease, which is an abnormality of the copper metabolism. Early diagnosis is extremely important for these patients, because early treatment can prevent long-term neurological defects. We really want to make sure we get these patients diagnosed. In this case, you’ll see we have a ClinVar entry for this variant, and Mastermind has indexed eight articles that contain this gene and variant match.
First and foremost, we’ll go ahead and view this in ClinVar, and we can see here that there’s a single submitter with a likely pathogenic call. Unfortunately, in this case, there are no evidence details to support that classification. In addition, you’ll see that there are no literature citations provided for this variant either, so we have a call of likely pathogenic, but we really don’t have any data supporting that call. We just have to trust the submitter that they’ve done their work, and that it is the correct call, but if we’re not comfortable with that, we have a different option here. That’s to review Mastermind, where there is data to review. As I mentioned, there are eight articles that Mastermind indexed.
We can go ahead and review those articles one by one, sorted by relevance, or as Mark alluded to, we have a new offering in our curated content. As he said, and I will reiterate, we do have a webinar that’s coming up in about a month that I encourage all of you to attend. Our mission is to curate the entire genome. We have a staff of expert curators working behind the scenes. They’re going gene by gene, variant by variant, pulling out the relevant information from each of these articles to make it easier for you to access and review. We’ll go ahead and look at the curated content entry for this particular example, and we’ll see that our curators have determined that there’s a likely pathogenic classification here. What that’s based on is their review of these eight articles within Mastermind. They found that there are multiple case reports with Wilson disease and their segregation with disease. We also are looking at gnomAD for population data, so it’s rare in population databases, and there’s just the intrinsic property of this particular variant in that it’s an in-frame deletion.
We’ll go ahead and briefly take you behind the scenes here to look at our curated content. You can see that our curators have pulled out sentence fragments from each of the relevant publications to support their ACMG classification. This will be the most informative information from that paper to support a PP1 classification. The same goes for multiple case reports here. We have our population data over here, and that PM4 for the intrinsic property of this particular variant, which then leads to a likely pathogenic call. We’ll go back to the main page here. In this case, our own interpretation of likely pathogenic agrees with ClinVar’s interpretation of likely pathogenic. However, with our curated content and with the information available in Mastermind, we’re able to actually show the evidence behind that classification, so you can be confident that you have the most well-informed call for your variant.
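For readers who want to see how individual criteria codes add up to an overall call, the following Python sketch applies the combining rules for pathogenic evidence from the 2015 ACMG/AMP guidelines. It is a simplified illustration and not Genomenon's curation logic: the function names are hypothetical, the strength of each criterion is inferred from its code prefix (PVS, PS, PM, PP), benign criteria are ignored, and the example inputs are illustrative combinations rather than the evidence applied to any variant in this demo.

```python
from collections import Counter

# Evidence strength implied by the prefix of a pathogenic criterion code.
# "PVS" must be checked before "PS", "PM" and "PP".
PREFIX_STRENGTH = (
    ("PVS", "very_strong"),
    ("PS", "strong"),
    ("PM", "moderate"),
    ("PP", "supporting"),
)


def strength_of(code):
    """Map a code such as 'PM4' to its default evidence strength."""
    code = code.strip().upper()
    for prefix, strength in PREFIX_STRENGTH:
        if code.startswith(prefix):
            return strength
    raise ValueError(f"not a pathogenic ACMG/AMP criterion: {code!r}")


def combine_pathogenic(codes):
    """Combine pathogenic criteria only (benign evidence is out of scope here)."""
    n = Counter(strength_of(c) for c in codes)
    vs, s, m, p = n["very_strong"], n["strong"], n["moderate"], n["supporting"]

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))
        or s >= 2
        or (s >= 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs >= 1 and m >= 1)
        or (s >= 1 and m >= 1)
        or (s >= 1 and p >= 2)
        or m >= 3
        or (m >= 2 and p >= 2)
        or (m >= 1 and p >= 4)
    )
    if likely_pathogenic:
        return "Likely pathogenic"
    return "Uncertain significance"


if __name__ == "__main__":
    # Illustrative combinations only.
    print(combine_pathogenic(["PM2", "PM4", "PP1", "PP4"]))
    print(combine_pathogenic(["PM2", "PP3"]))
```

The pathogenic rules are tested first, so the looser likely pathogenic conditions only apply to evidence sets that did not already qualify for the stronger call.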
The next example I would like to highlight is also in ATP7B. This has an entry in ClinVar, and it’s a variant of uncertain significance. You’ll see here that there are 20 articles that Mastermind has indexed, so we’ll go ahead and view this in ClinVar. We’ll scroll down to the bottom, and we’ll see that this is classified as a variant of uncertain significance by a single submitter. However, in this case, there are no evidence details provided. We don’t know what they’re basing this call of variant of uncertain significance on. There is one article citation for this variant, but this is the submitter’s own adaptation of the ACMG guidelines, so it doesn’t contain any relevant variant-level data. We’ll go back to our Mastermind page, and as I mentioned, there are 20 articles within Mastermind here, so likely, there’s something in there that can either move us to the pathogenic side or the benign side, just because this is so well-published. We can go ahead and review each article, sorted by relevance, manually, if we like. In this case, we do have the option of curated content for this particular gene.
I’ll go ahead and review the curated content because we do have it available. Based on these 20 publications, our curators have pulled out some important information to make an overall classification for this variant. There are clinical cases within the literature. We did run computational algorithms, particularly SIFT and PolyPhen, which predict that this is damaging. It’s rare in population databases. Missense variants within this gene are typically pathogenic, so it does get an additional criterion for that. Based on this information, we’re able to come up with a call of likely pathogenic. So, the moral of the story for this example is that there is increased evidence within Mastermind that can potentially help convert calls up to likely pathogenic if the evidence is there.
Our next example is also in ATP7B, G875R, and this, again, has a ClinVar entry, and it has 21 publications within Mastermind. This has a benign ClinVar entry. We can just see here that it’s a pretty outdated call; it was last updated in 2014. That was eight years ago; it was actually prior to the most recent iteration of the ACMG guidelines. It’s likely that this is quite outdated. Let’s review the submission within ClinVar. Here, we see, as advertised, this is a benign classification. There are no evidence details provided here by the submitter. In addition, there is no publication evidence provided either, so we have a benign classification, an older classification, but we don’t have any evidence to support that benign classification. So we’ll go back to our Mastermind page, and again, we can go through each and every one of the 21 articles, sorted by relevance, and pick out the most important information that can either agree with that benign call, or maybe we find something else that shifts that classification.
In this case, we do have the benefit of curated content, so we’ll go ahead and view what our curators pulled out of those 21 articles. In this case, they found that there’s functional data that support a functional consequence to this variant. There’s clinical data in that it segregates with disease in affected individuals. Computational algorithms predict that it’s damaging. It’s rare in population databases. In addition, missense variants within this gene are typically pathogenic. That leads to a provisional ACMG classification of likely pathogenic. In this case, if someone were relying on ClinVar alone for their variant interpretation, they would have misdiagnosed this patient, because the data available since the publication of that submission goes to show that this actually does have an effect on the protein, and that a patient carrying this variant likely has the disease.
Alright, I will move on now to my last example, and this example is in SCN2A. SCN2A causes an epileptic encephalopathy disorder, which is not curable, as with many of these rare diseases, but it is treatable, so an important step is to get these patients correctly diagnosed. This example is R1882Q, and we do have a ClinVar entry for this variant, and that’s of conflicting interpretations of pathogenicity. We do have a whopping 41 articles within Mastermind, so there’s just a lot of information to go through here. It’s probably not surprising, given the level of publication, that there are eight submissions for this variant. That’s why some of them are not agreeing with each other. We’ll go ahead and view this variant in ClinVar.
We’ll scroll down here, and we can see that even with the first three entries, none of these are agreeing. We have a pathogenic classification, a variant of uncertain significance classification and a likely pathogenic classification. If one were using ClinVar and ClinVar alone to make their calls for the variant pathogenicity, they may not know which one to rely on, based on the evidence. This is where Mastermind can come in to help disambiguate. You can go through each of these articles one by one in terms of relevance to search for the case-level data, the functional data, etc. that you may need to make a well-informed call for your variant. Again, in this case, we do have some curated content available, so we can go ahead and review that.
In this case, our curators have pulled out that there’s functional consequences for this variant within the literature. There’s some clinical data in that it’s de novo, and there’s multiple case reports of patients that are affected with this particular variant. There’s computational algorithms that predict that this is damaging, and it’s rare in population databases. In this case, based on all the evidence that we were able to review, pathogenic was the most well-informed classification for our variant, and that does agree with most of the submissions in ClinVar, but it really just helps bring that evidence behind that call, so you know that that’s the correct call for your patient.
Okay, that concludes my demonstration portion. I really hope this showed you that the integration of ClinVar and Mastermind gets the best of both worlds. You get all of the information within Mastermind, you can easily see what the classifications in ClinVar are, and you can easily link out to those submissions. I really hope that this will make your workflow much more streamlined, and with that, I will give it back to Garrett.
GARRETT: Yes, Mark and Anna, great conversation so far! Outstanding examples, very interesting and useful comparison for our viewers to see. Just really awesome. We do have some questions that have come in from the audience, so at this point we will move to the Q&A segment of our discussion today and give our speakers a chance to respond. Looking at our questions, the first one from the viewers is: What is the periodicity of updates to the ClinVar data integration?
MARK: Pardon me, this is Mark here. That’s a great question, Garrett! Thank you, attendee. We haven’t yet established the cadence for updates for ClinVar. However, that question gives me an opportunity to say that the data in Mastermind is updated on a weekly basis. As new content gets published, new variants and new evidence, Mastermind gets updated on that weekly basis, as it has for many years now. Since the ClinVar integration is so new, we’re working internally on deciding what that periodicity will be.
GARRETT: Outstanding. The next question is a longer one: So many of the variants in ClinVar aren’t supported with publications. Of those that are, many cite the Richards publication on ACMG guidelines, or a publication at the gene level without any variant-specific information. Of the 1.5 million variants in ClinVar, what percentage of these have evidence at the variant level needed to support the calls by ACMG guidelines?
MARK: I’m happy to take that, but I’ve looked at this more holistically; Anna has experience at the variant level, and so she’s got a good intuition. I’ve looked at it, on the whole, as a data set, and I’ll say that it varies pretty substantially gene by gene, but on average, the number of variants that have no variant-specific support in ClinVar is quite high. It’s between 66 percent and 80 percent, so that is to say, only a fifth to a third of the variants, depending on the gene, have that variant-specific support. That also affords an opportunity for me to say that there is that skew toward benign polymorphisms, and otherwise, variants of uncertain significance. There may not be any evidence for those variants. It’s not a deficiency of ClinVar, they just may have never been cited before, but also, it suggests that when you look at a variant designation in ClinVar, examining the evidence for it is really important. In particular, when variants do have citations that aren’t the Richards paper or the corresponding Nykamp paper, for ACMG and Sherloc, respectively, the citation is quite often at the gene level and not at the variant level.
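The arithmetic behind "a fifth or a third" can be checked in a couple of lines. This is only a sketch of the complement calculation using the two percentages quoted in the answer; the function name `supported_percent` is made up for illustration.

```python
# Percent of ClinVar variants with no variant-specific support, as quoted above.
unsupported_range = (66, 80)

def supported_percent(unsupported):
    """Percent of variants that do have variant-specific support."""
    return 100 - unsupported

for unsupported in unsupported_range:
    print(f"{unsupported}% unsupported -> {supported_percent(unsupported)}% supported")
```

So 66 percent unsupported leaves 34 percent (about a third) with support, and 80 percent unsupported leaves 20 percent (a fifth).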
To emphasize what Anna said in describing some of her examples, this integration of ClinVar and Mastermind really is highly complementary. The capability of Mastermind will now allow searchers of ClinVar to search through Mastermind directly to see all that evidence and make those informed determinations based on all the available evidence, using all the amenities in Mastermind, the features of which Anna walked through in her example. The short answer there, Garrett, is quite a lot have no variant-specific evidence, and the recommendation to the attendees is, that is one of the important facets of value behind this ClinVar integration into Mastermind. I gave an equally long-winded answer to that longer question.
GARRETT: Great, thanks, Mark. No, I think it’s really important to highlight the complementary component of ClinVar. I think that’s really, really critical, to shine a light on. Our next question: Does the ClinVar integration help us provide more value for somatic work, or is ClinVar mostly rare disease related?
ANNA: It’s a good question! So, ClinVar has somatic variants, just the same way that Mastermind does. Really, this integration can provide just as much benefit for those looking at somatic work as for those doing germline work. Very complementary in that sense.
MARK: We were reading each other’s minds, Anna.
GARRETT: Great. Looking at other questions… Okay, here’s one. Does your database also include therapy options and clinical trials available for a variant?
MARK: I’ll tackle that. We do not have clinical trial data, per se, from clinicaltrials.gov, so we haven’t integrated that data set yet, except, to say, for certain genes that we have curated in some examples that Anna showcased, and in others that we’ll showcase more in this upcoming webinar on October 20th. For a certain subset of genes, there is information about clinical trials directly through clinicaltrials.gov. With respect to the therapy information, I don’t remember if Anna specifically called out the ability to search categorically for therapy-related terms to prioritize the references that would be most relevant, but that is a feature capability: categorical keywords that help you prioritize the references that mention therapeutic options. That’s the first answer to the question. The second answer to the question would be that Mastermind allows you to do free-text searching. Vemurafenib, for instance, is a key term associated with somatic variants and their treatment that is not part of the categorized key terms in the dropdown, but that you can search on. That’s a very powerful capability. That gives the searcher a great deal of flexibility, particularly when you’re trying to parse through a lot of evidence, say, finding the difference between a somatic and a germline variant. If you’d like to parse those, if you’d like to find some functional aspect of the way the protein works, and you’re looking for a particular assay type — in answer to this specific question, for therapy, a named therapy that’s not in the canned keywords that Mastermind has, you’re allowed to search by free-text. That’s a very powerful tool that I’d recommend any attendee who hasn’t tried it yet to try.
GARRETT: Nice. Yeah, that’s a great feature to highlight, the free-text. Really valuable. Next question: Is there any case where variants are found in ClinVar and not found in Mastermind?
ANNA: I’ll go ahead and take this one. The answer is both yes and no. Before the ClinVar integration into Mastermind, Mastermind would only turn up a result if there were publications found within the literature for that variant. However, now, with the integration, if there is a ClinVar entry for that variant, it will show up even if there are no articles indexed. For a variant that has neither, there will not be a result returned, but you can be confident that if there is a ClinVar entry, regardless if there are articles found in Mastermind, that it will return a result.
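The rule Anna describes can be written as a small predicate. This is a sketch of the behavior as stated in the answer, not Mastermind's actual implementation; the function and parameter names are hypothetical.

```python
def returned_before_integration(indexed_articles: int) -> bool:
    """Before the integration: a result required at least one indexed article."""
    return indexed_articles > 0

def returned_after_integration(has_clinvar_entry: bool, indexed_articles: int) -> bool:
    """After the integration: a ClinVar entry alone is enough to return a result."""
    return has_clinvar_entry or indexed_articles > 0

# A variant with a ClinVar entry but no indexed articles.
print(returned_before_integration(0))        # False
print(returned_after_integration(True, 0))   # True
# A variant with neither still returns nothing.
print(returned_after_integration(False, 0))  # False
```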
MARK: Yeah, which just really underscores, as Anna highlighted with her examples, that this is now a one-stop shop for those searches. You can go to Mastermind search with its flexible search capability, and even if there are no references, you’d still see the ClinVar result for that variant. That’s why we’re so excited about the capability that this integration is now giving our searchers.
GARRETT: Nice, great answers. Our next question is a numbers question. We may have mentioned this, but I think it does bear repeating: How many total variants are currently integrated from ClinVar into Mastermind?
MARK: Oh yeah, good. I think that’s one and a half million, that’s the figure that I have. That compares, as I think I mentioned, to the 18.8 million variants that Mastermind has indexed. Anybody who knows me knows I love Venn diagrams, so there’s a Venn diagram for that information. As I suggested earlier, that number of variants in ClinVar has a differential representation toward the benign and toward the unsupported VUS. In the middle, there are variants in ClinVar that are disease-causing that lack evidence. There’s no evidence submitted in ClinVar, so there’s value in this integration to find those variants that, as Anna talked about for NF1, are de novo and have no publications, but are nevertheless in ClinVar. Again, ClinVar is a valuable resource for those clinically-encountered variants. You’ll see those in that crescent of the Venn diagram. You’ll see those variants that are found in common in both, but that, in ClinVar, lack evidence. You’ll see these in the integration, as well as the substantial crescent of variants in the Venn diagram that are only found in Mastermind, every one of which has a reference citation. Really, to highlight at a high level, the true value of this integration, as Anna said quite aptly, is that you get the best of both of these resources. ClinVar for the clinically-encountered variants, especially those de novo variants, and Mastermind for all those that are missing in ClinVar, as well as all the rich evidence citations for all of the variants in Mastermind.
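The three regions of the Venn diagram Mark describes correspond to ordinary set operations. The sketch below uses made-up placeholder identifiers, not real variants, and says nothing about the actual sizes of the regions, which the transcript does not give beyond the 1.5 million and 18.8 million totals.

```python
# Placeholder variant identifiers, invented purely for illustration.
clinvar = {"variant_A", "variant_B", "variant_C"}
mastermind = {"variant_B", "variant_C", "variant_D", "variant_E"}

only_clinvar = clinvar - mastermind      # clinically encountered, no publications
in_both = clinvar & mastermind           # ClinVar call plus literature evidence
only_mastermind = mastermind - clinvar   # published, never submitted to ClinVar
searchable = clinvar | mastermind        # what the integrated search can return

print(sorted(only_clinvar))
print(sorted(in_both))
print(sorted(only_mastermind))
print(len(searchable))
```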
GARRETT: Thank you! Next question: Do you curate somatic cancer genes?
ANNA: I can speak to this one. At this moment, we aren’t doing that for our curated content, but that is on the roadmap. Very soon in the future, we hope to be curating somatic variants.
MARK: I’ll underscore that we have that capability, and have done so in different contexts, but as Anna pointed out, quite rightly, they’re not yet in the Mastermind interface. But if there’s particular interest in specific genes, we’d love to talk to the submitter of that question.
GARRETT: Great. Let’s see what else. With the integration of ClinVar, how does Mastermind compare to other databases in terms of variant classification (and literature)?
MARK: Well, the answer is, obviously, we’re better. But jokes aside, our philosophy at Genomenon is sensitivity first, and have everything predicated on evidence. Those two are sort of our guiding principles. That’s really where I started the company and my own clinical work and research. I needed those two things to be true. When you’re gathering data at this scale, you first have to make sure that you’re not missing things, and then, when you’re refining for specificity, you have to do so based on evidence. Mastermind is a very exhaustive data repository of published variants and all the associated evidence. Now, with the integration of ClinVar, for those variants that aren’t published but are still important and are seen in clinical care, I’d like to say that, now, with this integration, we’re truly a superset. Again, for whoever submitted that question, I don’t want to call out specific databases, but I’ll be happy to address your particular questions, because whenever we’ve done comparisons to any data repository, our data compares extremely favorably, and that includes the data that comes out of large sequencing labs. Some of our customers have done a trial and looked at their own internally developed database, that doesn’t exist anywhere else in the world, and even within that circumstance, for groups that have been doing this for years, we still come out favorably because we have such a comprehensive approach: maximal sensitivity. Critically, it’s all predicated on the evidence. There’s value even in that overlap in data that Mastermind has that other databases have. We still have a feed of that necessary evidence to make the most accurate interpretations, and to ensure that you have the highest diagnostic yield.
GARRETT: Thank you, yes, great clarification around that. Our next question is kind of a two-part question. The first part is: What guidelines is Mastermind applying for your curated content? And, are you utilizing gene-specific guidelines, or the general ACMG guidelines?
MARK: Anna, that’s all you.
ANNA: That’s definitely me, yeah. Sorry, can you remind me of the first part of the question again? I got focused on the second one.
GARRETT: What guidelines is Mastermind applying for your curated content?
ANNA: Yes, so we are applying the generalized ACMG guidelines for our curated content. When there are gene-specific guidelines available such as ClinGen guidelines, we do review those ahead of time. We take them into consideration, and we will utilize them if we think that they will be helpful for our curation. With curated content, in addition, we have a provisional ACMG call available for you to review all the data that’s associated with it. If you want to modify those criteria in any way, you’re free to do so based on the evidence that’s provided.
MARK: If that was your question, listener, definitely come back for the follow-up webinar here that’s displayed. There will be a lot more discussion and examples around that.
GARRETT: Awesome. In the interest of time, we probably have time for one or two more questions. Our next question is: Can you search using gene coordinates?
MARK: Yes. The GLP allows for flexibility in indexing the content. Then, we sort of invert it with the same technology when you search. The examples are always fun; I recommend an automatic link out or a copy/paste mechanism so you don’t have to type a bunch of numbers, but if you’ve got the genomic coordinates for your variant, those can be searched on as well. There’s a lot of nuance to how to do that to the best effect, but you can definitely search on genomic coordinates. As I think, Anna, you showcased with cDNA, you can go deeper to the G-dot.
GARRETT: Okay, great. Well, we have about two minutes left, so I think that’s all the questions that we’re going to have time to get to today. For those of you who didn’t get your questions answered, our team will work to make that happen for you. At this point, Mark and Anna, I want to give you a chance to offer any closing thoughts that you may have.
MARK: Just keep the questions coming, I would say. Come back to the next webinar. Don’t be shy. We’ve got a fantastic support team, some of whom you’ll meet on the next webinar. There’s a lot of very positive and dramatic things happening here at Genomenon, so we want to make sure that you guys are kept in the loop, so keep the emails and communication lines open.
ANNA: Yeah, I would also like to agree with Mark on that one. I appreciate the opportunity to show this to you all. If I was still in the clinic, I think that this would be a really valuable tool. Very excited for this integration, and I’m very excited to showcase the clinical content for the next webinar coming up. I hope everyone can attend that.
GARRETT: Awesome. Well, Mark and Anna, thank you for your time! I think you’ve shared some really valuable insight for our users around patient care and streamlining their work. Great discussion, and thanks to everyone watching! As a reminder, you will receive a recording of our discussion later today, so watch your email. Also, if you don’t yet have a Mastermind account, you can sign up at the bit.ly link and start with a free trial of the Professional Edition. If you’re currently using basic, you can talk to us about upgrading to get access to all the features that you saw today in the presentation. As always, we want to help, so if you have any questions, don’t hesitate to reach out to us at support@genomenon.com.
Finally, as Mark and Anna have mentioned a few different times, be sure to put October 20th on your calendar. That will be our upcoming webinar on disease-specific curated content, where we will be exploring how Mastermind is used to support both clinical and pharma initiatives with our expertly curated data. You really won’t want to miss that one. Really excited about that. At this point, that concludes our event — Thanks again, everyone, and have a great rest of your day! Bye, now.