Webinar Library

Behavioral Testing and Evaluation to Probe Language Models for Algorithmic Bias

With growing legal and scientific evidence for the importance of reducing model bias, both model developers and deployers need tools to quantify that bias. Unfortunately, algorithmic bias can take as many forms as there are implementations. In this talk, Paul M. Heider covers a range of clinical NLP use cases, such as de-identification and predicting diagnoses, highlighting the utility of behavioral testing and comparative evaluation methods to identify the scope of a model’s bias. These approaches can be leveraged at the training, testing, and evaluation stages to benefit researchers doing de novo model development and community members tasked with choosing between multiple third-party models to deploy.

ALNI/NIWG Joint Webinar: Using Python and an Open Source LLM to Facilitate Competency Mapping

This project leverages Python programming and an open-source large language model (LLM) to calculate semantic similarity scores between pairs of text strings. These scores are systematically recorded in an Excel workbook, enabling the organization of string pairings into ranked mappings. Dr. Macintosh employed this methodology to propose mappings between AACN graduate sub-competencies and course learning outcomes.
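The general shape of such a pipeline (embed each string, score every pairing, rank the results, write them to a spreadsheet) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the presenter's actual code: the `embed` function is a toy word-count stand-in for an LLM embedding model, the output is a CSV file (which Excel can open) rather than a native workbook, and all function names and sample strings are hypothetical.

```python
import csv
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_mappings(competencies, outcomes):
    """Score every competency/outcome pair and rank outcomes per competency."""
    rows = []
    for comp in competencies:
        comp_vec = embed(comp)
        scored = [(cosine_similarity(comp_vec, embed(o)), o) for o in outcomes]
        # Highest similarity first; ties keep their original order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for rank, (score, outcome) in enumerate(scored, start=1):
            rows.append({
                "competency": comp,
                "outcome": outcome,
                "score": round(score, 4),
                "rank": rank,
            })
    return rows


def write_csv(rows, path):
    """Write ranked pairings to a CSV file that a spreadsheet tool can open."""
    fields = ["competency", "outcome", "score", "rank"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical sample strings, for illustration only.
sample_competencies = ["Apply evidence to clinical practice"]
sample_outcomes = [
    "Design a marketing budget",
    "Apply research evidence to clinical practice decisions",
]
sample_rows = rank_mappings(sample_competencies, sample_outcomes)
```

In a real setting, `embed` would be replaced by a call to an actual embedding model, and the proposed mappings would still need human review, since a similarity score only suggests candidate pairings.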

Levels of Clinical Evaluation for LLMs: Towards More Realistic Evaluations

Large language models (LLMs) hold immense promise for democratizing access to medical information and assisting physicians in delivering higher-quality care. However, realistic evaluations of LLMs in clinical contexts have been limited, with much focus placed on multiple-choice evaluations of clinical knowledge. In this talk, I will present a four-level framework for clinical evaluations, encompassing multiple-choice knowledge assessments, open-ended human ratings, offline human evaluations of real tasks, and online real-world studies within actual workflows. I will discuss the strengths and weaknesses of each approach and argue that advancing towards more realistic evaluations is crucial for realizing the full potential of LLMs.

Transforming Complex Information Into Compelling, Human Stories

Learn strategies for translating even the most technical information into compelling, human stories, so you can change the way people think, feel, and act. We’ll use sustainability communications as our lens for looking at messaging strategies that work (and those that don’t work). We’ll also talk about the buzzwords and cliches you should avoid in your writing. Finally, we’ll practice what we’ve learned with a simple writing exercise that brings the concepts to life. This session requires audience participation, so bring your ideas, your questions, and something to write with.