Working Group Webinar Library
Behavioral Testing and Evaluation to Probe Language Models for Algorithmic Bias
With growing legal and scientific evidence for the importance of reducing model bias, both model developers and deployers need tools to quantify that bias. Unfortunately, algorithmic bias can take as many forms as there are implementations. In this talk, Paul M. Heider covers a range of clinical NLP use cases, such as de-identification and predicting diagnoses, highlighting the utility of behavioral testing and comparative evaluation methods to identify the scope of a model’s bias. These approaches can be leveraged at the training, testing, and evaluation stages to benefit researchers doing de novo model development and community members tasked with choosing between multiple third-party models to deploy.
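As a rough illustration of what a behavioral test for bias can look like in the de-identification use case, the sketch below fills one sentence template with names from different groups and compares how often each group's names are redacted. It is not the presenter's method: the `toy_deidentify` function, the name lists, the group labels, and the template are all hypothetical stand-ins, and a real test would call the model under evaluation instead.

```python
# Hypothetical toy de-identifier: it only redacts names found in a small lexicon.
# In practice this would be replaced by a call to the model being tested.
KNOWN_NAMES = {"John", "Mary", "Emily"}


def toy_deidentify(text):
    """Replace any token found in KNOWN_NAMES with the placeholder [NAME]."""
    tokens = text.split()
    return " ".join("[NAME]" if tok.strip(".,") in KNOWN_NAMES else tok for tok in tokens)


def redaction_rate_by_group(deidentify, template, names_by_group):
    """Fill the template with each name and report the share redacted per group."""
    rates = {}
    for group, names in names_by_group.items():
        # A name counts as redacted when it no longer appears in the output.
        hits = sum(name not in deidentify(template.format(name=name)) for name in names)
        rates[group] = hits / len(names)
    return rates


# Invented example data: the same sentence, with only the name varied.
template = "Patient {name} was seen for follow-up."
names_by_group = {
    "group_a": ["John", "Mary"],
    "group_b": ["Nguyen", "Emily"],
}

rates = redaction_rate_by_group(toy_deidentify, template, names_by_group)
print(rates)
```

Because only the name changes between inputs, a gap in redaction rates between groups points to the model treating those names differently rather than to differences in the surrounding text.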
ALNI/NIWG Joint Webinar: Using Python and an Open Source LLM to Facilitate Competency Mapping
This project leverages Python programming and an open-source large language model (LLM) to calculate semantic similarity scores between pairs of text strings. These scores are systematically recorded in an Excel workbook, enabling the organization of string pairings into ranked mappings. Dr. Macintosh employed this methodology to propose mappings between AACN graduate sub-competencies and course learning outcomes.
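The general shape of such a pipeline can be sketched as follows. This is a minimal sketch under stated assumptions, not Dr. Macintosh's implementation: the `embed` function here is a toy bag-of-words stand-in so the example runs without a model (a real version would return an embedding from the chosen open-source LLM), the competency and outcome strings are invented, and the output is written as CSV rather than an Excel workbook.

```python
import csv
import io
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_mappings(competencies, outcomes):
    """Score every competency/outcome pair and rank outcomes per competency."""
    rows = []
    for comp in competencies:
        scored = [(cosine(embed(comp), embed(out)), out) for out in outcomes]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for rank, (score, out) in enumerate(scored, start=1):
            rows.append(
                {"competency": comp, "outcome": out, "score": round(score, 3), "rank": rank}
            )
    return rows


# Invented example strings, not actual AACN sub-competencies or course outcomes.
competencies = ["Apply evidence to improve patient care quality"]
outcomes = [
    "Design a database schema for clinic scheduling",
    "Evaluate evidence to improve quality of patient care",
]

rows = rank_mappings(competencies, outcomes)

# Write the ranked pairings as CSV (a spreadsheet-friendly stand-in for a workbook).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["competency", "outcome", "score", "rank"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Each competency ends up with its candidate outcomes ordered by score, so a reviewer can inspect the top-ranked pairings first and accept or reject each proposed mapping.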
Levels of Clinical Evaluation for LLMs: Towards More Realistic Evaluations
Large language models (LLMs) hold immense promise for democratizing access to medical information and assisting physicians in delivering higher-quality care. However, realistic evaluations of LLMs in clinical contexts have been limited, with much focus placed on multiple-choice evaluations of clinical knowledge. In this talk, I will present a four-level framework for clinical evaluations, encompassing multiple-choice knowledge assessments, open-ended human ratings, offline human evaluations of real tasks, and online real-world studies within actual workflows. I will discuss the strengths and weaknesses of each approach and argue that advancing towards more realistic evaluations is crucial for realizing the full potential of LLMs.
A Gene Regulatory Switch Promotes a New Therapeutic Vulnerability in EGFR Inhibitor Drug Tolerant Persister Cells
This presentation is an introduction to Oncology Data Science at AstraZeneca and the important work being done to impact decision making to advance AstraZeneca’s oncology drug portfolio. Steven Criscione will also share a research vignette on our study of the EGFR inhibitor osimertinib drug tolerant persister cells.
Transforming Complex Information Into Compelling, Human Stories
Learn strategies for translating even the most technical information into compelling, human stories, so you can change the way people think, feel, and act. We’ll use sustainability communications as our lens for looking at messaging strategies that work (and those that don’t). We’ll also talk about the buzzwords and cliches you should avoid in your writing. Finally, we’ll practice what we’ve learned with a simple writing exercise that brings the concepts to life. This session requires audience participation, so bring your ideas, your questions, and something to write with.