Auto-coding, the process of automatically assigning standardized codes to clinical trial data, has primarily relied on string matching. When a reported term exactly matches an entry in a dictionary such as MedDRA, coding is straightforward and accurate. When no exact match exists, fuzzy matching algorithms take over: they pair similar but not identical terms, so minor variations such as spelling mistakes can still be coded correctly.

While these approaches have been fundamental to auto-coding, they falter with more complex or unstructured data. For example, while the system might easily code “headache,” it could struggle with a more nuanced description like “dull head pain,” leading to errors, inconsistencies, or the need for manual coding—a time-consuming process that everyone in clinical trials aims to minimize.

Traditional systems perform adequately with structured and finite terms but are less effective when faced with unstructured text or new terminology. They lack the ability to understand context or nuance, which limits their effectiveness.
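To make the limitation concrete, here is a minimal sketch of exact-then-fuzzy matching in Python. It is an illustration only, not how any particular coding product works: the three-term dictionary, the placeholder codes, the `auto_code` helper, and the 0.85 threshold are all assumptions made for this example, and the fuzzy step uses the standard library's `difflib` rather than a production-grade algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical mini-dictionary. The codes are placeholders, not real MedDRA codes.
DICTIONARY = {
    "headache": "CODE-001",
    "nausea": "CODE-002",
    "dizziness": "CODE-003",
}


def similarity(a, b):
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def auto_code(verbatim, threshold=0.85):
    """Return (code, method) for a reported term.

    Tries an exact match first, then the closest fuzzy match.
    Terms that clear neither step are routed to manual coding.
    The threshold is an assumed value chosen for this illustration.
    """
    term = verbatim.strip().lower()
    if term in DICTIONARY:
        return DICTIONARY[term], "exact"
    best = max(DICTIONARY, key=lambda entry: similarity(term, entry))
    if similarity(term, best) >= threshold:
        return DICTIONARY[best], "fuzzy"
    return None, "manual"


if __name__ == "__main__":
    for reported in ["Headache", "headach", "dull head pain"]:
        print(reported, "->", auto_code(reported))
```

In this sketch the misspelling "headach" is still coded through the fuzzy step, but "dull head pain" shares too few characters with "headache" to clear the threshold and falls through to manual coding, even though the two mean nearly the same thing.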

Enter AI and Embeddings: A Game-Changer

AI, particularly through the use of embeddings, brings a new level of sophistication to auto-coding. Whereas traditional systems compare the characters in a term, embeddings represent each term as a numerical vector that captures its meaning, so terms with similar meanings sit close together. The shift can be likened to moving from matching individual puzzle pieces to seeing the entire picture.

For instance, if a term like “dull head pain” is encountered, an AI-powered system using embeddings can recognize its similarity to “headache” by understanding the relationship between the terms, even without an exact match. This capability extends beyond handling synonyms; embeddings can adapt to new terminology as it emerges, which is particularly important in the ever-evolving field of medical science. Traditional systems often struggle to keep up with new terms, but AI can learn and adapt, helping keep coding accurate and up to date.
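The idea can be sketched in a few lines of Python. Note the assumptions: the three-dimensional vectors below are hand-made stand-ins chosen for illustration, not the output of any real embedding model (real embeddings typically have hundreds of dimensions and come from a trained model), and `nearest_term` is a hypothetical helper. What the sketch does show accurately is the comparison step, cosine similarity, which measures how closely two vectors point in the same direction.

```python
import math

# Hand-made toy vectors standing in for real model output (an assumption).
# Illustrative axes: [head-related, pain-related, stomach-related]
TOY_EMBEDDINGS = {
    "headache": [0.9, 0.8, 0.0],
    "nausea": [0.1, 0.2, 0.9],
    "dull head pain": [0.8, 0.9, 0.1],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def nearest_term(query, candidates):
    """Return the candidate whose toy vector is most similar to the query's."""
    query_vector = TOY_EMBEDDINGS[query]
    return max(
        candidates,
        key=lambda term: cosine_similarity(query_vector, TOY_EMBEDDINGS[term]),
    )


if __name__ == "__main__":
    query = "dull head pain"
    for term in ["headache", "nausea"]:
        score = cosine_similarity(TOY_EMBEDDINGS[query], TOY_EMBEDDINGS[term])
        print(f"{query!r} vs {term!r}: {score:.2f}")
    print("nearest:", nearest_term(query, ["headache", "nausea"]))
```

With these toy vectors, "dull head pain" lands far closer to "headache" than to "nausea", although the strings share few characters. A real system would replace the lookup table with calls to an embedding model and search the full dictionary rather than two candidates.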

Moreover, traditional systems are inadequate for unstructured data such as patient-reported outcomes. With embeddings, however, free text can be converted into vectors and compared against dictionary terms, so it fits into existing coding systems.

Looking Ahead

We are on the cusp of a new era in auto-coding, where AI and embeddings have the potential to revolutionize the process. This shift is not just incremental but a significant leap forward. As clinical trials become more complex, the ability to accurately and efficiently code vast amounts of data will be increasingly critical.

While traditional auto-coding algorithms have served well in the past, the future clearly lies with AI-based embeddings. With recent advances in these technologies, faster, more accurate, and more adaptable coding is not just a possibility; it is available today.

I will take a deeper dive into using embeddings in a future article.

You can follow Clinical AI Pulse directly on LinkedIn or, if you prefer a monthly collection, subscribe through this link.


Weekly Clinical AI Pulse

FDA and CTTI Hold Joint Workshop on AI in Drug Development – AI: The Washington Report – This workshop highlighted the need for clearer regulatory guidelines for AI in drug development. Discussions emphasized public-private partnerships, multidisciplinary collaboration, and the importance of explainable AI models to build trust and innovation in clinical trials.

Citeline SmartSolutions Take AI to New Levels, Optimizing Clinical Trial Planning and Site Selection – Citeline’s SmartSolutions uses AI to enhance clinical trial planning and site selection. By analyzing vast datasets, the platform optimizes trial designs, improving efficiency and success rates in identifying suitable trial sites.

Veridix AI, part of Emmes Group, Announces New Protocol Digitization Capabilities – The company announced new protocol digitization capabilities aimed at streamlining clinical trial processes. The AI-driven technology automates the digitization of trial protocols, improving accuracy and efficiency. This advancement supports more precise trial designs and enhances the ability to adapt to real-time data, ultimately leading to more efficient drug development and regulatory compliance.

Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices – The U.S. Food and Drug Administration updated its list of Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. With this update, the FDA has authorized roughly 950 AI/ML-enabled medical devices so far, with 130, 150, and 221 devices approved in 2021, 2022, and 2023, respectively. Of these devices, 723 are in radiology and 98 are in cardiovascular medicine.

Development and Validation of a Natural Language Processing Algorithm for Extracting Clinical and Pathological Features of Breast Cancer From Pathology Reports – Researchers developed and validated a natural language processing (NLP) algorithm to extract clinical and pathological features from breast cancer pathology reports. This AI-driven tool improves data extraction accuracy and supports better clinical decision-making by automating the interpretation of complex medical texts.

Leading AI models struggle to identify genetic conditions from patient-written descriptions – The finding highlights the need for improved model training using diverse and representative datasets to enhance AI’s performance in clinical settings, particularly in interpreting patient-reported symptoms for genetic diagnoses.

Podcast: How AI Can Improve Patient Identification and Recruitment for Clinical Trials – This podcast explores how AI can significantly enhance patient identification and recruitment for clinical trials. By analyzing vast datasets, AI tools can identify suitable candidates more efficiently, improving trial diversity and enrollment rates. The discussion also touches on the ethical considerations and challenges of implementing AI in patient recruitment.

Medable Launches No-Code Platform for Clinical Trials – Medable has launched a no-code platform designed to simplify the design and execution of clinical trials. This platform enables researchers to create customized trials without extensive coding knowledge, accelerating the trial setup process and improving accessibility. By reducing the technical barriers, it aims to streamline clinical research and increase trial efficiency.

Innovation and challenges of artificial intelligence technology in personalized healthcare – This study discusses the innovation and challenges of AI technology in personalized healthcare. It emphasizes the potential of AI to tailor medical treatments to individual patients based on their unique genetic and clinical data. However, challenges like data privacy, algorithm transparency, and the integration of AI into clinical workflows remain significant hurdles that need to be addressed to fully realize AI’s potential in personalized medicine.

Free Conference: AI-Driven Cancer Solutions: From Basic Science to Standard of Care in Breast Cancer and Beyond – The Breast Cancer Program focuses on AI-driven solutions for cancer care, discussing advancements from basic science to standard clinical practices, particularly in breast cancer treatment. The event highlights the integration of AI in diagnostics, treatment planning, and personalized medicine, emphasizing the importance of interdisciplinary collaboration in advancing these technologies.