
Webinar

Future Proof Your Lab: Streamline Variant Interpretation with Automation

Comprehensive review of publications is a time-consuming process and a significant bottleneck in variant interpretation, resulting in longer turnaround times, higher costs, and increased risk of errors. Join our webinar to discover how integrating the world’s most comprehensive database of published genomic evidence can help you streamline your interpretation workflows.

Learn how Mastermind and the Cancer Knowledgebase data empower clinical labs to scale their interpretation efforts more efficiently, improve accuracy, and greatly reduce turnaround time. In this session, we’ll explore the latest and upcoming product offerings from Genomenon, highlighting recent technological advancements. Attendees will also hear directly from organizations like the CDC and the Hartwig Medical Foundation on how they use Genomenon data to accelerate their operations.

Key takeaways:

• Strategies for seamless data integration and API connectivity  
• Best practices for implementing evidence automation in clinical diagnostics
• Real-world success stories: Overcoming challenges and optimizing workflows
• Future-proofing your lab: Staying ahead of emerging trends and technologies

Complete the form below to watch the webinar recording:

Panelist
Mark J. Kiel, MD, PhD
Chief Scientific Officer & Co-Founder

Mark has extensive experience in genome sequencing and clinical data analysis underlying the vision and technology driving the Genomenon suite of software tools.

Panelist
Brittnee Jones, PhD
Principal Technical Product Manager, AI & Data Engineering

Data-driven product manager with expertise in clinical genomics, global team building, and driving customer-focused products forward through market analysis and cross-functional collaboration.

Panelist
Amy Gaviglio, MS, CGC
Public Health Genetics and Rare Disease Consultant, Founder and CEO

Amy Gaviglio is a genetic counselor and founder of Connetics Consulting, LLC which provides newborn screening, genomics, and rare disease services globally. She has worked in the NBS and rare disease space for the past 17 years. Amy currently works with the Centers for Disease Control and Prevention, the Association of Public Health Laboratories (APHL), Expecting Health, RTI International, and several other rare disease and genomics organizations. She is co-chair of APHL’s New Disorders in Newborn Screening Subcommittee and is a member of additional national groups including the Rare Disease Diversity Coalition and EveryLife Foundation’s Community Congress. She also serves on the Executive Board for the International Society of Neonatal Screening and is a member of the MPS Society’s Scientific Advisory Board. Finally, Amy serves as Chair of the NBS Expert Panel for the Clinical and Laboratory Standards Institute.

Panelist
Korneel Duyvesteyn
Development Lead

MARK: Hello, everyone, and welcome. Today we will be hosting a webinar about future-proofing your lab by streamlining variant interpretation with automation. I'm Mark Kiel, chief scientific officer and co-founder of Genomenon, and though this is the first such webinar where we've focused our attention on the automation aspect of what we offer, we have been doing this work for a number of years. We're quite privileged to have with us today not only some internal folks like myself at Genomenon, but also some customers of ours, who are going to go through some representative use cases of how they take advantage of the data that Genomenon provides in their projects and everyday operations.

The structure of the conversation is, first, I'll be highlighting some of the backstory for Genomenon and talking briefly about the nature of our offerings and the history behind the company. We'll go through a deeper discussion of the data that our product offerings comprise. Then, we'll go through those examples with some panelists that I'll be introducing here momentarily, and we'll end with, hopefully, a lively and informative panel discussion that I will lead, asking some questions we prepared in advance to get the discussion going — but we also welcome questions from the audience! With the time that remains after that panel discussion, we'll open the conversation up to the questions that you've entered.

With that in mind, let me take care of some housekeeping. All of the content that you're hearing about, and the workflows being discussed by our panelists, are available to our viewers through these links. We talked about the CKB offering, the Cancer Knowledgebase, as well as the Mastermind offering. They're separate products, both offered by Genomenon, with the data that's necessary to drive maximal automation of your interpretation workflows. You can follow those links to get access to those offerings.

So, I already introduced myself. I'm Mark, chief scientific officer and co-founder. You'll see the least of me for the first part here. I'll welcome to the stage shortly our VP of product strategy and product management, a good friend and colleague of mine, Brittnee Jones, who will be discussing the data and the offerings here in some depth. Then, Amy Gaviglio, who leads Connetics Consulting and is doing some great work that she'll showcase for the CDC, as well as Korneel Duyvesteyn, who is the head of the clinical innovation team at Hartwig and a great friend to Genomenon from many years back. So, thank you, panelists. I'll invite you back on the stage when we get to your topic.

Very briefly from me, by way of introduction: Genomenon, writ large, provides genomic intelligence in the service of streamlining clinical diagnostics, and separately to inform precision therapeutic development. We like to say very succinctly that we simplify complex genetic data into actionable insights for both of those two endeavors, diagnostics and drug development.

Very briefly, our history: we were founded about 10 years ago. We were born out of a need that I experienced in my own practice, that I knew was shared by many of my colleagues in clinical diagnostics, and that I was also aware of in drug development and drug research. We spent a number of years developing and refining our data indexing and organization capability before we launched our flagship product in that space. We began selling into pharma in addition to the clinical market space, as well as embarking on larger-scale projects, such as, highlighted there, curation of this data for the purposes of newborn sequencing and a newborn screening initiative.

Following on to that, we knew there was great value, as Britt will highlight, in augmenting the value of our data through human expert curation. We acquired the Boston Genetics team of curators, a hundred-strong group who are helping us live up to our mission of curating the entire human genome. We're doing that in two ways. One of those ways, highlighted there, is curating at the gene level to know how genes are associated with disease, along with curation of the evidence associated with those designations. The other is at the variant level, using ACMG and AMP criteria to curate the evidence needed to interpret those variants and help streamline the interpretation workflows that we'll be talking about.

Lastly, and very relevant to Korneel's conversation: we've recently acquired the Clinical Knowledgebase, now the Cancer Knowledgebase, from JAX, which has a focus on oncology and a great deal of valuable information not only for somatic oncology, but also for germline. Korneel will bring that to life by way of example.

I think this is my last slide, Britt, if you want to come back on. Actually, it's my penultimate slide, so Britt, you can come on because we're friendly here. Genomenon's offerings come in different forms. There's software, which you'll hear a great deal about. There's also data, the content that drives that software, which can be integrated into these workflows. You'll hear about that from our speakers and from Britt as well. Separately, especially when there's a need to address scaling challenges, we offer services useful for designing diagnostic workflows and executing those evidence curation and variant interpretation pipelines.

So in summary, as Britt and I will highlight here, we have genomic intelligence that powers these offerings, driven by both artificial intelligence and human intelligence, the genomic expertise that I alluded to earlier. I really want to emphasize that point before I pass it over to Britt. Britt will focus a lot on the technology, but it's important to note that behind every one of the things that we assert from our curations is a dedicated team of human experts, some of whom are shown here, those hundred-strong individuals at Genomenon who review all that evidence and present it to our users. That is married to the sophisticated computation that Britt will be describing here, including some exciting new developments. With that, Britt, I'll turn the floor over to you, and I'll go silent until the panel discussion.

BRITTNEE: Yep, thanks, Mark. So first, as Mark has just mentioned, we'll discuss our AI-based indexing. And then we're going to dive into how our expert curators use that information in our quest to curate the genome, as he said. The accurate and expedient assessment of clinical and functional evidence in variant curation is critical for accurate high-throughput pipelines. We're testing and finalizing a modern AI approach to pre-categorize articles into the categories needed for assessment. This will provide more value to our users more quickly. Our expert curations to date serve as a premier training data set, because we're using gold-standard ACMG guidelines and citing the most impactful articles. So we're using that to train an LLM to automatically categorize articles and surface the contents within those articles.

You can imagine, that's the clinical and functional information you want to know is in an article, so that you can prioritize those articles and assess them first. We know a lot of time is also spent deciphering somatic versus germline context within articles, because many genes and variants are discussed in both contexts. Surfacing that information, so that when you search a variant you know whether a given article discusses it in a somatic context or a germline context, will really help speed the workloads for our curators.

We're then expanding on that work to tag and categorize more information within the article: ACMG categories, which are subsets of, say, the clinical and functional types of information, as well as additional clinical significance, like case reports and association studies.

In all of this, we are showing our work. We are calculating and displaying, or will be displaying, probabilistic scores that will enable prioritization as well as filtering of those articles. That is, how probable, or how likely, is it that this article contains that clinical information, or discusses that variant in a somatic context? In this, we're remaining data driven. We've always been data driven. We're assessing these results as we go. Our curators are actually using this right now, and we will be transparent in displaying the high precision and accuracy of this data. This information is being stress tested with our curation team, and then eventually in the hands of our customers, because, as we all know, experts know best.
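
To make that concrete, here is a minimal Python sketch of how such per-article probability scores could drive prioritization and filtering downstream. The field names, categories, and threshold are illustrative assumptions, not Genomenon's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ArticleTag:
    pmid: str            # article identifier
    category: str        # e.g. "clinical", "functional", "somatic", "germline"
    probability: float   # model confidence that the article contains this evidence type

def prioritize(tags: list[ArticleTag], category: str, threshold: float = 0.8) -> list[str]:
    """Return PMIDs likely to contain the requested evidence type,
    highest confidence first, so those articles are reviewed first."""
    hits = [t for t in tags if t.category == category and t.probability >= threshold]
    return [t.pmid for t in sorted(hits, key=lambda t: t.probability, reverse=True)]

# Example: surface articles likely to discuss a variant in a germline context.
tags = [
    ArticleTag("12345678", "germline", 0.95),
    ArticleTag("23456789", "somatic", 0.91),
    ArticleTag("34567890", "germline", 0.62),
]
print(prioritize(tags, "germline"))  # ['12345678']
```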

So our AI and expert curation is there to aid our customers, right? We are in the trenches with you. We have curators. We know it's both hard and time consuming. 90% of labs report interpretation bottlenecks, so you're not alone in this. We know that this is a very difficult step, often very manual. For 75% of labs, these manual steps and the interpretation they require cause delays in their workflows, and 40% of their overall time is actually spent on the manual interpretation steps of the workflow. Automation, which is what we're going to be discussing here today, is designed to present the right information quickly, so that at least we can minimize that component. You can do the important work and find the right information to diagnose your patient most quickly.

I'm here to introduce two key pieces of technology today. We're going to do Mastermind first. Mastermind is really a database with over 10 million articles that you can search for many types of genomic and related information. When I say genomic, I mean variants and copy-number variants, which you can then search against phenotypes, diseases, or, as we've just mentioned, the clinical and functional evidence within the article. This is an exhaustive search. We search all known nomenclatures for variants and ensure you'll see all those results. For some variants, we've shown that there are 120 different nomenclatures, including things like shorthand and legacy nomenclatures. We take all that information and feed it to curators, so that's the sensitivity, to curate the data and offer those classified variants, but, most importantly, the supporting evidence. So again, we're showing our work. We're giving you back that supporting evidence. This is to ensure rapid insight for you and your team.

To access that data for automation, we offer both the curated and the indexed, AI-extracted data from Mastermind in three different ways. First, our experts have curated gene-disease relationships based on the ClinGen guidelines, so we don't curate only at the variant level. Second, we do curate at the variant level: we have curated variants following the ACMG guidelines. Finally, we have a file containing the indexed information, including counts per variant, for over 25 million published variants, to ensure you can apply that information to your samples as rapidly as possible.
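
As an illustration of how the indexed-counts file might be consumed in an automated pipeline, here is a short Python sketch. The two-column tab-separated layout (variant, article count) is an assumption for the example, not the actual file format:

```python
import csv

def load_variant_counts(path: str) -> dict[str, int]:
    """Load a hypothetical tab-separated index mapping each published
    variant to its article count, one 'variant<TAB>count' row per line."""
    counts: dict[str, int] = {}
    with open(path, newline="") as fh:
        for variant, n_articles in csv.reader(fh, delimiter="\t"):
            counts[variant] = int(n_articles)
    return counts

def annotate_sample(variants: list[str], counts: dict[str, int]) -> dict[str, int]:
    """Attach published-evidence counts to each variant in a sample;
    zero means no indexed literature was found for that variant."""
    return {v: counts.get(v, 0) for v in variants}
```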

The second core technology that we're discussing today is the Cancer Knowledgebase, or CKB. This contains the most up-to-date, highest-quality evidence for cancer information. You can search for comprehensive information across over 2,000 genes. The search returns information like FDA-approved drugs, therapies, and trials, and includes the NCCN guidelines. It's updated daily to ensure you find all the right options for your patients, and it's highly structured to support automation, with controlled vocabularies and standardized ontologies, including the tiering of evidence.

What's in that data that you can access? The CKB data files are also available for automation. They contain expert-curated information for over 45,000 variants, with all the supporting evidence, including the most relevant articles and diagnostic and prognostic implications. The value here, one singular combined offering of all that AI-indexed data together with both curated germline and curated somatic information, is key to reducing your costs while increasing your diagnostic yield and providing the best treatment for your patients.

How would you go about setting up this workflow in your lab? First, variants can be automatically queried and filtered using the curated datasets. That's both on the germline side as well as the oncology, somatic, or cancer side (I'll use those words interchangeably). These are classified variants, but again, with supporting, summarized evidence. And it's a growing dataset, updated frequently, that can provide your lab with the most up-to-date curated evidence as a first step.

When variants aren't found, or if your team wants to ensure the most up-to-date information since the curation of that singular variant or gene-disease relationship, you can then automatically search our AI-indexed database. This can identify the variants without evidence in one exhaustive search. Your team won't spend time on those, and you can focus on the third bucket: variants that do have new or uncurated evidence. The search then returns a user interface that shows the most relevant sentences within each article, with detailed location and nomenclature-type information, so you and your team can expediently review those articles.
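
A minimal sketch of that triage logic, with plain dictionaries standing in for the curated and AI-indexed datasets:

```python
from enum import Enum, auto

class Bucket(Enum):
    CURATED = auto()        # classification already available; use it directly
    NO_EVIDENCE = auto()    # exhaustive search found nothing; no manual review needed
    NEEDS_REVIEW = auto()   # new or uncurated evidence exists; route to a curator

def triage(variant: str,
           curated: dict[str, str],
           indexed_counts: dict[str, int]) -> Bucket:
    """First consult the curated dataset, then fall back to the
    AI-indexed literature search, per the two-step workflow above."""
    if variant in curated:
        return Bucket.CURATED
    if indexed_counts.get(variant, 0) == 0:
        return Bucket.NO_EVIDENCE
    return Bucket.NEEDS_REVIEW
```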

So then, how can that workflow fit into your lab? Several ways. First, if you're building your own pipeline, we can walk you through our files and offer best practices for integration. We support all workflows, everything from inputting a VCF to sending specific variants for reanalysis: "I would like new information since the last time I analyzed this." It really covers all different automation workflows.

If you're not building your own, we support off-the-shelf options. We are comprehensively integrated into several analysis partners. If you have a commercial software provider already in place and we're not integrated, we welcome your recommendations and welcome them to contact us. We'd be happy to provide a list of providers if you need one. You can contact us at hello@genomenon.com for any of that information. In all of this, we cover all different analysis types. With our speakers here today, we'll cover everything from panels to whole-genome sequencing. You've heard that we cover both germline, which Amy is going to speak to, and somatic, which Korneel is going to speak to.

Finally, we provide very frequent updates on all of this information to ensure that we are supporting you and your patients as rapidly as possible and with the best, most up-to-date information. We have a worldwide user base, and we support all levels of these clinical workflows, everything from research to clinical reporting to pharma. Come to us with your workflow, and we'd be happy to discuss it with you. Amy will now join us and describe the workflow that they've set up. It's a distributed model. It's really amazing, and so I will let her take over.

AMY: Amazing. Okay, thank you so much, Britt, and thank you so much for the invitation to add to this conversation by sharing our somewhat unique use case, which is really the growing need for efficient variant classification through automation within public health newborn screening programs, for the purpose of trying to detect rare diseases at birth. Very quickly, with my time, I will try to level set us with a very brief background on the newborn screening process, focusing especially on the need for timeliness, which then speaks to our need for automation. We'll talk, then, about the current and future integration of molecular technology, specifically sequencing and whole-genome sequencing, into newborn screening. Then I'll end by highlighting our particular platform, the CDC's development of a solution known as ED3N.

This schematic really illustrates the process of newborn screening. We are starting with a seemingly healthy population of newborns in the United States; we're talking about 3.6 million newborns who receive newborn screening. We collect a dried blood spot sample around 24-48 hours after birth, applying the blood to a filter paper matrix and drying it, and then look at a number of analytes. The screening tests we use, which are a mix of biochemical and molecular, really screen out or weed out those that we feel are at high risk of having the disease, and let those that are lower-risk pass through the screen.

The newborns that are identified as having a higher risk, those shown in orange at the top, must complete further or diagnostic testing to determine whether the child is indeed affected, a true positive result, or whether the child is actually unaffected, a false positive result. It's really important that we don't forget about the babies who screen negative, or pass through the screen, those shown in green at the bottom. While the vast majority will be truly unaffected, true negative results, this is screening, and there are some infants who may actually be affected and thus had false negative screening results.

One of the things that we're always trying to do within newborn screening is improve our performance metrics, reducing false positive and false negative results. This is where we're seeing much more utility and application of molecular technologies in this space. The last piece around newborn screening that I really want to highlight for the purpose of this conversation is timeliness. Owing to the early onset of many of the diseases that we screen for through newborn screening, timely provision of our screening results is of the essence. This diagram shows the timeliness goals for newborn screening programs in the United States. Again, we start with specimen collection, generally between 24-48 hours after birth.

We ask that birth facilities send their specimens to our public health laboratories as soon as possible, with ideal receipt by the public health lab within 24 hours of collection, and certainly no later than 48 hours after collection. In general, we only give ourselves around two to four days to complete testing, because we are aiming to report out our results, especially high-risk results, within five to seven days of birth. This timeline applies in a very high-throughput environment: newborn screening programs, depending on the state, receive anywhere from hundreds to thousands of specimens per day. I'm hoping this gives us a nice framework for why we really need automation in this space as we start to add sequencing.
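
For illustration, a small Python sketch deriving those deadlines from a birth timestamp. The cutoffs are simplified from the goals described above and vary by program:

```python
from datetime import datetime, timedelta

def nbs_deadlines(birth: datetime) -> dict[str, datetime]:
    """Illustrative newborn-screening timeliness targets from time of birth."""
    collect_by = birth + timedelta(hours=48)       # specimen collected at 24-48 h
    receive_by = collect_by + timedelta(hours=24)  # ideally at the lab within 24 h
    report_by = birth + timedelta(days=7)          # high-risk results out by day 5-7
    return {"collect_by": collect_by, "receive_by": receive_by, "report_by": report_by}

print(nbs_deadlines(datetime(2024, 3, 1, 8, 0))["report_by"])
# 2024-03-08 08:00:00
```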

Hopefully, now that we know a bit more about the screening process and our timing expectations, we can shift our focus to how we are incorporating molecular testing, inclusive of variant classification, into this paradigm. Truly, incorporation of molecular testing into newborn screening programs has proceeded at a rather slow-but-steady pace. By and large, the majority of our testing still examines biochemical analytes, with molecular primarily functioning in a reflex or tiered capacity. Much of this is done at the single-gene level, using either a targeted variant panel, as is often the case for screening for, say, cystic fibrosis, or single-gene sequencing, which we largely see in lysosomal storage disorder screening and leukodystrophy screening.

That being said, the use of NGS is really starting to evolve in programs as we explore the potential utility of multi-gene panels. For example, one of the diseases we screen for, severe combined immune deficiency, has a number of underlying genetic causes. We've looked at multi-gene panels for that disease, as well as the potential utility of exome and genome sequencing as a reflex test. At this time, the use of whole-genome sequencing, either as a first- or second-tier test, has really remained in the research and clinical sphere. It hasn't yet broken into the public health realm, but we absolutely expect that to change. I'll talk a little bit about that on the next slide.

In the screening realm, whole-genome sequencing in newborns can mean a number of different things, as outlined here. Certainly, on the far left, we hear a lot about the diagnostic application of rapid whole-genome sequencing for newborns, primarily in the NICU/PICU environment. I'm not going to touch on that today; I'm really going to focus on the screening perspective. Here we can see a spectrum of potential uses. I mentioned on the last slide the potential use as a reflex test in newborn screening programs, which is being examined already. The current research studies examining the utility and feasibility of this technology then move along the spectrum.

Right now, what we're seeing is what I'm going to call targeted whole-genome sequencing, primarily using a genomic backbone and then targeting genes and variants known to be associated with actionable diseases in childhood. This concept of targeted whole-genome sequencing is, in my opinion, the likely path forward for any implementation of first-tier genomics in public health.

So, understanding that we will absolutely see an expansion of the use of NGS and whole-genome sequencing within public health newborn screening programs, combined with the acknowledgment that we have a general public health workforce crisis in the United States, we have embarked on our project called ED3N. The idea behind this is really trying to be proactive in making sure that programs are prepared and able to incorporate this technology, without maybe having the resources and manpower that may otherwise be needed to try to do this manually. We really are focused on automation to streamline the workflows and continue to meet those timeliness goals.

That led us to the development of ED3N, which is our national newborn screening data platform. ED3N stands for Enhancing Data-driven Disease Detection in Newborns, and its objective is to increase the capacity and infrastructure to collect, aggregate, and analyze newborn screening data across the country, while removing the burden on public health programs to have to develop any of their own robust data, analytic, and bioinformatic tools. Aggregation is also really, really important for us in the realm of rare diseases, where any single state program may not generate enough data, may not pick up enough cases, to really understand things like genotype-phenotype correlation or longitudinal outcomes of screening.

ED3N will actually process all types of newborn screening data. We’ll apply things like machine learning algorithms to the biochemical data, again, in these efforts to improve our risk assessment. We’ll also obtain downstream clinical data after newborn screening to really better understand the correlation between the screening biochemical phenotype and the screening molecular genotype to the clinical outcomes and findings. Obviously, for the purpose of today, I’ll be focusing on the molecular module, but I just wanted to let you know that this information will all ultimately be linked to additional data types, so that we really can have a holistic vision of what is going on with each infant, and ultimately improve our screening performance while contributing to clinical knowledge in the rare disease space.

At the moment, ED3N really starts at the point of a VCF file, though we are certainly exploring incorporating tools for upstream wet lab and bioinformatic pipeline processes. We’ll start for today at the point of a VCF. So programs can upload their VCF file into ED3N, and ED3N will start by annotating and starting to pull the evidence needed to provide variant classification. So again, very automated, one-stop shopping. ED3N will automatically, based on predetermined workflows by the program, assign individuals to review the variant. It will also provide information on whether or not that variant has already been seen by the program, but also let them know if that variant was seen by another newborn screening program, and how that newborn screening program classified the variant. We think this is especially important as we think about variant identification in a screening space, where we do pick up a large number of novel variants that aren’t seen in places like ClinVar, and where we don’t really have a lot — or in most cases, any — phenotypic data at the time of variant classification.
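
Here is a minimal sketch of what that cross-program sharing could look like in code. The registry class and its methods are hypothetical, not ED3N's actual interface:

```python
from collections import defaultdict

class VariantRegistry:
    """Records each program's classification of a variant, so a reviewer
    can see whether another newborn screening program has already seen it."""

    def __init__(self) -> None:
        self._seen: dict[str, dict[str, str]] = defaultdict(dict)

    def record(self, variant: str, program: str, classification: str) -> None:
        self._seen[variant][program] = classification

    def prior_classifications(self, variant: str, program: str) -> dict[str, str]:
        """Classifications of this variant from *other* programs."""
        return {p: c for p, c in self._seen[variant].items() if p != program}

registry = VariantRegistry()
registry.record("NM_000152.5:c.-32-13T>G", "ProgramA", "Pathogenic")
print(registry.prior_classifications("NM_000152.5:c.-32-13T>G", "ProgramB"))
# {'ProgramA': 'Pathogenic'}
```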

Upon ingestion, ED3N will provide annotation, and will incorporate a number of APIs and data pulls to grab evidence needed for variant classification into this one platform. Of course, one of our key integrations is with Mastermind. I think Britt really set this up nicely. We heard the same things when we talked to programs about their biggest concern around incorporating molecular technology into their high-throughput, fast workflows: that variant classification was a very manual process. Going through the literature, in particular, was one of the biggest bottlenecks to actually doing efficient implementation and continuing to meet the timeliness goals that are set by the federal government. That really led us to wanting to integrate Mastermind into our own solution for the benefit of our programs. It’s been a great integration. Our programs love this.

We’re not using Mastermind so much as a standalone product; we’ve really integrated it into our system, with a link out to the full functionality of the Genomenon website. Again, our goal here was to have a one-stop shop for all newborn screening-related data. We don’t have the time to be going in and out and trying to gather evidence all over the web, so this is how we’re using it now. The curation of many of our genes has already been done, which is an added feature for us as well; we’ll be pulling that in eventually, too. We're just really happy with the way this is going in terms of reducing our turnaround time and enhancing the ability of our programs to actually implement this technology.

I’ll end with just some key take-home messages for our particular use case and our integration with Genomenon. Again, the way we’ve built this is really as a collaborative effort to allow programs to learn from each other and share variant classifications. We will also provide reclassification guidance and processes as more evidence is generated; if we get notice from Genomenon that something has been changed, we’ll alert our programs.

We do think that tying this to additional screening data is going to be very valuable, not only for improving our performance, but for contributing to other variant databases like ClinVar, and more holistically, just contributing to our knowledge base of many of the rare diseases that we’re screening for. So with that, I will end and turn it over to Korneel.

KORNEEL: Thank you, Mark and Brittnee, for the invite, for organizing the webinar, and for pronouncing my name correctly. We’ve been using CKB for about four years now in the cancer setting, exactly in the use case that Brittnee described. I think we're a textbook example. In the next ten minutes, I’ll give a few examples of how we use CKB, and I will give a brief introduction of what Hartwig does for the people who don’t know us. Hartwig is a not-for-profit organization, primarily based in the Netherlands, but we also have offices in Spain, Canada, and Australia. Our mission is quite broad: it is to improve cancer care for metastatic patients. We started around ten years ago, driven by the desire of a few key hospitals in the Netherlands to perform whole-genome sequencing for cancer patients. At that time, there was no lab in the Netherlands to do it, so we set up our own lab. We started sequencing and building up a database, and then, over the course of ten years, we gradually moved more towards diagnostics.

We now have a certified product that we can deliver for patients who get whole-genome sequencing, and we’re moving more into comprehensive treatment guidance. The clinical space is where CKB comes into play, and there will be examples of that later. We’re around 60 people, mostly tech-oriented, but we also have some clinicians in our company. For those who have tried to find me in the picture, I’m not there.

Our core expertise from the start has been whole-genome sequencing and analysis. Over time, we have built algorithms to analyze whole-genome sequencing tumor data, and we have made them independent of sequencing platforms and of infrastructure. We support not only whole-genome sequencing but also exome sequencing and panel sequencing, likewise independent of where the actual sequencing is done. This suite of algorithms is available open source, and quite a few labs use it alongside our own. There’s a lot of information about this on the internet if you’re interested, but I will skip it for now.

As mentioned, we started with whole-genome sequencing and collecting all the data in our database. By now, we have just over 7,000 samples in our database, all annotated with clinical data, and they are available to researchers globally. There are a couple hundred groups in the world using our data. Incidence follows fairly typical patterns, as you can see on the left. For those who may not have recognized the map, that is the Netherlands. It’s a small country; it fits into Lake Michigan. We have worked with 45 hospitals out of 90, all the dots on the picture, so quite broad coverage of the Netherlands.

Since about five years ago, we have had a product that we share with patients; it’s called OncoACT. This is where CKB comes in. It is simply a report where we combine the output of our genome analysis algorithms with interpretation from CKB and then share it with hospitals. This typically goes to pathology groups at hospitals, who then further determine treatment options based on the report. In addition, we’re moving towards a broader treatment guidance program, currently done with only a few hospitals in the Netherlands. It’s called ACTIN, and it very much focuses on automation. This is an application where CKB is even more relevant. What we try to do is capture all the relevant information from the hospital; all the DNA data that has been sequenced for a patient is analyzed using our algorithms and annotated with CKB. Then, we combine basically everything we know about a patient and push it directly into the patient record at the hospital. Ideally, no human has to do interpretation on the data. This is partly thanks to CKB, but I will come back to that later.

This is one example from a patient discussed in a GTB (genomic tumor board) at Erasmus MC, which is the main hospital we work with in the Netherlands. In this specific, typical example, there was a fusion found in the patient's tumor. The fusion doesn't have a high driver likelihood, meaning it doesn’t appear a lot in our database. But in CKB, it is annotated with gain-of-function evidence. So, this was presented to the GTB, and the people in the discussion can then decide whether they want to treat the patient based on this information. I could show hundreds more examples like this, but hopefully, this is clear.

So, more concretely, what do we use CKB for? The first challenge with whole-genome sequencing is that we find so many mutations that we have to decide which ones may actually be clinically relevant. For that purpose, CKB provides useful input because it contains a list of all genes potentially related to cancer, giving us a good initial filter on all the mutations we find. Then, we evaluate every single mutation against the CKB database, determine if it’s a variant of unknown significance or not, and annotate it with impact. The screenshots that you see here show a colorectal cancer patient that we processed with a pilot setup in the US. In this case, they can see whether there is evidence or knowledge about these mutations. We also annotate all the mutations with efficacy evidence. That shows what treatments are applicable for what mutations based on literature.
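
To illustrate the filter-then-annotate pattern described here, a hedged Python sketch follows. The data shapes and evidence strings are invented for the example and do not reflect CKB's actual schema:

```python
def filter_and_annotate(mutations, cancer_genes, ckb_evidence):
    """Keep only mutations in genes potentially related to cancer, then
    attach any efficacy evidence; mutations without evidence are flagged
    as variants of unknown significance (VUS) for review."""
    relevant = [m for m in mutations if m["gene"] in cancer_genes]
    for m in relevant:
        m["evidence"] = ckb_evidence.get((m["gene"], m["variant"]), [])
        m["vus"] = not m["evidence"]
    return relevant

mutations = [{"gene": "BRAF", "variant": "V600E"}, {"gene": "TTN", "variant": "A123T"}]
cancer_genes = {"BRAF", "APC", "KRAS"}
ckb_evidence = {("BRAF", "V600E"): ["vemurafenib: sensitive"]}
print(filter_and_annotate(mutations, cancer_genes, ckb_evidence))
# [{'gene': 'BRAF', 'variant': 'V600E', 'evidence': ['vemurafenib: sensitive'], 'vus': False}]
```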

In addition, we also use the trial database from the CKB database. This is from the same patient that we processed in the US. For this patient, there was apparently a trial running in Boston. This picture also shows what I think is one of the key strengths of CKB, which is categorized variants. In cancer, what typically happens in trials is that a trial demands a category variant, sort of a vague, not-completely-specified variant, in this case, it's "APC inact mut," which means any inactivating mutation, but CKB contains a tree that helps us determine which mutations are actually considered inactivating. We can walk that tree and then see whether the mutations that we found are actually considered inactivating mutations. That's super helpful.
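
A minimal sketch of that tree walk, with a toy hierarchy invented for illustration; CKB's real categorized-variant tree is far richer:

```python
# Toy category tree: each category maps to its children, which may be
# sub-categories or concrete mutations.
CATEGORY_TREE = {
    "APC inact mut": ["APC nonsense", "APC frameshift"],
    "APC nonsense": ["APC p.R1450*", "APC p.Q1367*"],
    "APC frameshift": ["APC p.T1556Nfs*3"],
}

def matches(category: str, mutation: str) -> bool:
    """Walk the category tree depth-first to decide whether a specific
    mutation satisfies a loosely specified trial requirement."""
    children = CATEGORY_TREE.get(category, [])
    return mutation in children or any(matches(c, mutation) for c in children)

print(matches("APC inact mut", "APC p.R1450*"))   # True
print(matches("APC inact mut", "APC p.T1493T"))   # False
```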

Furthermore, there are a few improvements we are working on. There's definitely information from CKB that we don't use yet in our system. One thing that we've recently been experimenting with is providing guidance on what has not been tested. Oncologists in the Netherlands often ask us: if a patient has had a panel, should we do whole-genome sequencing for this patient? CKB is actually quite a good resource for finding the evidence that might be relevant for the patient if they were whole-genome sequenced. Another thing that we're working on is trying to rank the different treatment options. Quite often, patients have multiple treatment options, and it's not trivial to decide which is best based on the evidence.

Overall, CKB improves efficiency and quality. We don't have the numbers, but I can imagine the numbers that Brittnee presented are applicable to our situation as well. I will skip the first three items because they were mentioned already. The fourth item I want to highlight, because the CKB team has also improved our understanding of molecular variant interpretation in general, which made some broader improvements possible in our system, and I think that's really a great thing.

Then, the final impact on our actual workflow: the fast turnaround time. Today, we reported a patient for Erasmus MC within two hours, but that's somewhat unusual. Normally, it takes a few days, which is still quite fast because there are no humans involved. There's also a reduction of time spent in genomic tumor boards, as mentioned by Brittnee as well. I think one really key aspect of automation, and a source like Genomenon, is that it leads to consistency in quality across hospitals. At least in the Netherlands, the belief is that it makes quite a big difference which hospital you enter. We hope that by at least providing all the information that is known about mutations to any doctor in any hospital, that difference will become smaller. Ultimately, it should lead to more comprehensive decision making. It's what patients sometimes tell us as well: the fact that their data has been comprehensively analyzed gives them comfort that no options have been missed, no evidence has been missed, no trials have been missed. That's the feedback we get; even when there are few or no good treatment options, that fact is still important. That's it.

MARK: Fantastic. Thank you so much, Korneel. At this time, I’d like to invite Amy back to the stage. Britt, if you’d like to participate in the panel discussion, please join us as well. We have about 15 minutes. As Korneel mentioned to me by email, we could spend many hours talking about individual aspects of the scale and interpretation challenge here. I’ll try to be succinct and guide some meaningful discussion.

It struck me, as I was listening to you both, that although there are differences in what you’re looking at — germline and somatic disease screening and diagnostics — there are critical similarities, even outside of your use of Genomenon data. There’s the scale challenge. Amy, I was struck by the 3.6 million lives that you’re going to need to impact with your ED3N program. But in addition to that, there are the systems you had to build: your own system, integrating the data we’ve been discussing, all in the service of improving turnaround times while maintaining accuracy. Just like sensitivity and specificity, there’s a tension between turnaround time and accuracy.

I wonder if you can comment on how you struck the balance between the depth of analysis required and making sure that you had optimal turnaround times, because both of those are important to adequate patient care. Maybe, Amy, if you could go first and speak to that from your perspective, and then, Korneel, I’d love to hear your response as well.

AMY: Yeah, it’s a great question. It’s something that I think we struggle with all the time in newborn screening. Can you hear me okay? Newborn screening programs, like most clinical programs, are CLIA labs, and they're CLIA- or CAP-accredited as well. This balance is really important. One of the things that we will be doing, and that each program will need to do, is validate their workflows. We have been doing side-by-side comparisons of manual curation with the automated curation, in an effort to verify and validate the process, which is, of course, a requirement. You’re right, there is this inherent tension between quality and quantity and speed. We certainly don’t want to be fast and inaccurate; that’s really not helpful for anyone. So, it will be a lot of upfront testing and verification, and all programs will have to validate the workflow as part of their process. That’s really how we’ll try to strike that balance as much as we can.

MARK: Korneel?

KORNEEL: I have a slightly different view, I think. My impression is that through automation, because things become more tangible, you can actually measure precision and sensitivity. When you compare it with a process that is largely manual… We have quite a few discussions in the Netherlands, at least, where people ask, can you demonstrate that your trial list is exhaustive? But then, if you talk to clinicians, they say, yeah, but if one of the doctors is on holiday, we will probably miss trials anyway, because we now rely on humans to carry the data. So I feel like, whenever we talk with clinicians, they say, your automation is already better than what we do manually. Now that we can measure precision and sensitivity, it feels like a bit of a weird comparison, comparing something with a manual process.

MARK: That’s interesting. In my experience, when Brittnee and I do these kinds of comparisons, you compare against the current standard. But what if there’s a new gold standard? We often have to make the argument about things being missed in current operations when you look at the totality of the data, if you want to advance the field, or your own work, or continue to optimize for accuracy. I’m also struck again by the differences: in my work with newborn sequencing, it’s different. It’s not a diagnosis, and you're drawing the line differently. Amy, you talked about true positives and true negatives, and false negatives are what you're trying to avoid, but screening is different, especially when you have pre-existing and very efficacious biochemical newborn screening methods.

That allows me to move into the next question. I was also struck — Amy, you didn’t talk about it, but you put it on the slide, and Korneel, you alluded to it as well. This is a multi-stakeholder enterprise. There are multiple inputs here, and you're trying to coalesce those inputs into one actionable output, whether it’s a screening result or a diagnostic decision. I imagine that was challenging. We don’t need to linger on how challenging it was, but if you could speak to which aspects were challenging, and share with the audience anything you learned in that process of coordinating all those activities and consolidating information from multiple stakeholders. So, Korneel, and then Amy.

KORNEEL: What stakeholders do you mean?

MARK: Well, perhaps, Amy, you can go first because you had that slide. I’m thinking about the clinicians, the clinical information, laboratory operations, and the mechanics. We’re talking about the interpretation in this webinar, but there are other pieces that have to go into this to make the whole apparatus functional.

AMY: Yeah, yeah, I can try. We’ve done a ton of data flow mapping, ad nauseam, in terms of really understanding the timing. For us, as I tried to indicate, timing is really important. We’re talking about very, very fast turnaround times, so we’ve done quite a bit of mapping: when do we expect biochemical data to come off? Would the molecular data come around the same time, or after? Who’s doing what, when? How do we best link all this data so that, at the time of variant classification within ED3N, they have what they need and don't need to go into their own database systems?

What makes it additionally challenging is trying to build a database that fits 50 different workflows. It involves a lot of asking, okay, how can we standardize some of the workflows? Which is truly one of the goals. I think where automation is going to help us a lot as well is that we may remove some of these minor disparities and discrepancies in the way things are done, and actually harmonize practices across the country, which I think has a ton of value. So we've acknowledged that different workflows exist, but we've tried to work with key knowledge leaders in the space to ask, okay, can we come up with an ideal workflow? and move everyone towards that. But you're right. Then, what does the report look like afterward? Who is it going to? Is it geared towards the family, a primary care provider, or a specialist? This is a piece of a giant system, and so it has taken a lot of focus groups, mapping, discussions, blood, sweat, and tears.

MARK: So, Korneel, in my experience in molecular pathology on the oncology side, particularly for hematopoietic disease, there are multiple lines of clinical information that you have to integrate. The molecular data is highly valuable, especially now, in dictating therapy. If you do incorporate those into your system, that would be useful to talk about. If you don’t, I wonder how you ensure that your molecular output gets married to those pieces seamlessly, because there's not just turnaround time in the molecular lab; there’s also the turnaround time to get that information into clinical hands to take root with the patient.

KORNEEL: Yeah, I get the question. I think you can break down the workflow into a few different pieces, and one piece that is done a lot, at least in the Netherlands, is the thing that Brittnee described. There's a lot of time spent on just finding data, essentially. In GTBs in the Netherlands as well, they ask, what is known about this mutation? No one can really argue with automation doing that better. Then there are the more complicated discussions; I think there are two. One is, how far do you want to take automation? Because in the end, as a cancer patient, it does sound comforting that a lot of specialists look at your case. If a doctor says, "I have no idea what treatment you're going to get. It's a fully automated setup. Good luck with it," then patients probably won't trust it a lot.

The other discussions that we have with molecular pathologists are like, well, what is my job then, if I don't have to look up evidence anymore? I think your job actually becomes a lot more exciting and challenging, because you don't have to do the boring stuff anymore. You can fully dedicate your time to things that computers are not that good at.

MARK: Perfect. I mentioned at the outset that Genomenon was born out of need. I've often said, there's the muscular work of going out and finding the data, but what the people who look at it were trained for is the cerebral work of interpreting those results. I like that idea: the things that automation can make processive, you should let it do, as long as you've been measuring and assessing and validating, and then tee up that data for maximal interpretation, whether it's predetermined or needs to be looked at anew in the context of the clinical circumstance.

So Britt, that calls to my mind some of the things that Korneel and Amy talked about with respect to the balance between speed and accuracy. Similar questions arise with AI and machine learning; Korneel, you brought it up. We, obviously, at Genomenon, are using AI, and we marry that with the curation team. I wonder, Britt, if you could speak to how we strike that balance, and then Korneel and Amy, if you could close that out by talking about how you view the output that comes from the AI married to the human curation. So Britt first, and then Amy and Korneel.

BRITTNEE: It's something that we've been working on a lot. First, as we just mentioned, we need the most comprehensive data sets to compare against. In order to know that you're getting everything, and serving the right information to the curators, we need to make sure that we have a complete data set. The good thing is, lots of those are starting to appear in the industry; people are recognizing this as a need across the board. We are doing all of those calculations, precision, accuracy, and recall (I didn't mention that one before), to ensure that our algorithms are presenting the right information with the highest quality metrics, and then that goes to the humans.

Making that information feed back is the last component. That is very key: it really has to feed back into the work that we're doing. As we said, we're stress testing this new AI that we're coming out with, with our own internal curators. They've had lots of feedback. (Thank you all!) No end of feedback. All of that feedback is going back into those algorithms, so that we can develop them even further.
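
As a concrete illustration of those metrics, here is a small Python sketch computing precision and recall from curator feedback on the model's article tags; the boolean encoding is a simplification:

```python
def precision_recall(predicted: list[bool], curator_says: list[bool]) -> tuple[float, float]:
    """Precision: of the articles the model tagged, how many did curators
    confirm? Recall: of the articles curators say should be tagged, how
    many did the model catch?"""
    tp = sum(p and t for p, t in zip(predicted, curator_says))
    fp = sum(p and not t for p, t in zip(predicted, curator_says))
    fn = sum(t and not p for p, t in zip(predicted, curator_says))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

print(precision_recall([True, True, False, True], [True, False, False, True]))
# (0.6666666666666666, 1.0)
```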

MARK: It's a nice recursive loop of computation and curation. Amy, Korneel, if you could succinctly summarize what you talked about, the benefits that you've gotten either from the Mastermind data set or the CKB data set.

AMY: Yeah, I can go first. I really liked what Korneel said in terms of automation not necessarily replacing the people: it is enhancing their ability to do more complex, more interesting, and, in some cases, more important work of really thinking cerebrally about this data. One of the comments I remember hearing at the last ACMG meeting, which I really liked, was that AI isn't going to replace clinicians or lab individuals, but those that use it will replace those that don't. I think that's where we're at. We do not have time in newborn screening programs to do a lot of this looking around and trying to pull things together. We're talking, like I said, hundreds to thousands of babies each and every day, including weekends and holidays. So having this available, so that our limited staff can really focus on getting these results out and thinking about everything in the context of the biochemical and clinical data, is vitally important for the sustainability of newborn screening, inclusive of molecular technology.

MARK: Great. Korneel, last word here.

KORNEEL: From the user perspective, we want a database that is a hundred percent accurate, and I imagine for newborn screening that's the same. That's what I wonder: if you use AI internally, how do you make sure of that? You can't really go for 99% precision. We struggle with that challenge ourselves as well. We use AI a lot to automatically structure clinical records, but we can't make a mistake in a treatment history for a patient. So at the moment, we validate; we check everything manually. It does help efficiency, but I'm wondering if there will come a point in time where we don't have to do that anymore.

MARK: Actually, this is a coda to bring everything together: Britt, Amy, and Korneel, your conversations. Britt, you said we show our work; Amy, you said you're not going to replace clinicians, but clinicians who use these tools will replace those who don't; and Korneel, you're bringing up those very things. What is the gold standard? How are we comparing against that gold standard? And are we getting better? Showing your work and ensuring that there's a human step in the process, in the context of that automation, I think that's the best marriage of the technology and the knowledge that we've accumulated as a field, to make sure that we're not missing anything, for the betterment of our patients.

I'm looking at the clock. We had a couple of questions from the audience that we weren't able to get to. Rest assured that we will reach out to you separately and answer those questions. In conclusion, I'd like to get back to sharing my screen and emphasize that the data sets and tools we've talked about are available to you for preview: the CKB software for somatic oncology, and the Mastermind software for both germline and somatic disease.

Britt, you mentioned that we're readily open to having conversations. You can reach out to us at hello@genomenon.com to take advantage of — Korneel, that was a very flattering comment — the expertise of both the CKB and the Mastermind teams at Genomenon. We'd love to chat. With that, I'll conclude this first, hopefully, of many automation webinars. Thank you, Amy. Thank you, Korneel. Thank you, Britt. Thank you, audience, for your kind attention. And with that, we'll adjourn.

The World’s Most Comprehensive Source of Genomic Evidence

Mastermind accelerates variant interpretation with immediate insight into the full text of millions of scientific articles. Prioritize your search results by clinical relevance and find what you are looking for 5-10 times faster.

Create your free account
The Most Comprehensive Source of Curated Genomic Evidence + Scientific Experts

We help provide insights into key genetic drivers of diseases and relevant biomarkers. By working together to understand this data, we enable scientists and researchers to make more informed decisions on programs of interest. To learn more about how we can partner together to find your genomic variant solutions, we invite you to click on the link below.

Contact Us