Large language models provide a leap in capabilities on understanding medical language and context – from passing the US medical licensing exam to summarizing clinical notes. They also suffer from a wide range of issues – hallucinations, robustness, privacy, bias – blocking many use cases. This session shares currently deployed software, lessons learned, and best practices that John Snow Labs has learned while enabling academic medical centers, pharmaceuticals, and health IT companies to build LLM-based solutions.

First, we cover benchmarks for new healthcare-specific large language models, showing how tuning LLM’s specifically on medical data and tasks results in higher accuracy on use cases such as question answering, information extraction, and summarization, compared to general-purpose LLM’s like GPT-4. Second, we share an architecture for medical chatbots that tackles issues of hallucinations, outdated content, privacy, and building a longitudinal view of patients. Third, we present a comprehensive solution for testing LLM’s beyond accuracy – for bias, fairness, representation, robustness, and toxicity – using the open-source nlptest library.

Talk by: David Talby

Here’s more to explore:
LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc
Facebook: https://www.facebook.com/databricksinc

Add comment

Your email address will not be published. Required fields are marked *

Categories

All Topics