Hi,
Are your computer vision models consistently delivering fair and accurate results across all patient groups and data sources? If you're not actively managing bias and batch effects, the answer might surprise you.
Join me May 15th at 11 am EDT for a webinar on "Bias and Batch Effects in Medical Imaging."
In this 30-minute webinar, I'll share:
How seemingly minor data variations can dramatically skew your model performance
Real-world case studies where bias led to critical AI failures
Practical strategies for data splitting, stratified metrics, and ongoing monitoring
A framework to ensure your models generalize properly and perform fairly
Whether you're a data scientist, ML engineer, product manager, or business leader implementing computer vision solutions, this session will equip you with actionable techniques to build more reliable and equitable AI systems.
Register here.
Heather
Research: Foundation Models -- A Panacea for Artificial Intelligence in Pathology?
A comprehensive study by Nita Mulliqi et al. provides important nuance to the discussion around foundation models in pathology. Their analysis of prostate cancer diagnosis and Gleason grading—spanning 100,000+ biopsies across 15 sites in 11 countries—offers valuable insights for healthcare AI implementation.
𝗪𝗵𝗮𝘁 𝗧𝗵𝗲𝘆 𝗙𝗼𝘂𝗻𝗱
Foundation models (FMs) excel at:
- Learning from limited training data (particularly when only 1-15% of labeled data is available)
- Adapting to new tasks with minimal additional training
- Demonstrating strong performance in data-scarce scenarios
𝗛𝗼𝘄𝗲𝘃𝗲𝗿, 𝘁𝗮𝘀𝗸-𝘀𝗽𝗲𝗰𝗶𝗳𝗶𝗰 𝗺𝗼𝗱𝗲𝗹𝘀 𝘀𝗵𝗼𝘄𝗲𝗱 𝗰𝗹𝗲𝗮𝗿 𝗮𝗱𝘃𝗮𝗻𝘁𝗮𝗴𝗲𝘀:
- Comparable or superior performance when sufficient labeled data was available
- Significantly lower computational costs (FMs consumed up to 35x more energy)
- Better performance on challenging morphologies and edge cases
- More consistent results across different whole slide image scanners
𝗜𝗺𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀 𝗳𝗼𝗿 𝗖𝗹𝗶𝗻𝗶𝗰𝗮𝗹 𝗜𝗺𝗽𝗹𝗲𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻
This research highlights important trade-offs that should inform AI deployment in healthcare:
- Resource Constraints: When computation and labeled data are limited, FMs offer practical advantages despite higher energy costs
- Clinical Reliability: For high-stakes applications where accuracy on edge cases is critical, task-specific models may provide better safeguards
- Environmental Impact: The substantial energy consumption of FMs deserves consideration in sustainability planning
𝗧𝗵𝗲 𝗪𝗮𝘆 𝗙𝗼𝗿𝘄𝗮𝗿𝗱
Rather than seeing these approaches as competing alternatives, the researchers advocate for hybrid strategies that combine the strengths of both foundation and task-specific models, suggesting that different clinical contexts may require different solutions.
Research: Bias - Investigation on potential bias factors in histopathology datasets
What if the AI diagnosing your biopsy isn't looking at your cells, but at how your hospital prepares its slides?
A recent study by Farnaz Kheiri et al. examines bias in deep learning models for histopathology analysis, particularly in The Cancer Genome Atlas (TCGA) dataset.
𝗞𝗲𝘆 𝗙𝗶𝗻𝗱𝗶𝗻𝗴𝘀
The researchers used KimiaNet and EfficientNet models to identify several sources of bias:
- Data imbalance between institutions in the dataset
- Variation in tissue preparation and staining techniques
- Image preprocessing inconsistencies
Their analysis showed that models could recognize which institution provided a sample, suggesting they were detecting processing artifacts rather than focusing solely on disease features.
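One way to audit this kind of leakage is a simple "site probe": train a lightweight classifier to predict the submitting institution from a model's features and see how far above chance it lands. The sketch below is illustrative only (the `features` and `site_labels` arrays are random stand-ins, not data or code from the study), but the same pattern applies to real embeddings.

```python
# Hypothetical site-leakage probe: do learned features predict the institution?
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
features = rng.normal(size=(600, 128))       # stand-in for embeddings from a trained model
site_labels = rng.integers(0, 5, size=600)   # stand-in for 5 source institutions

# If a linear probe predicts the site well above chance (~0.20 for 5 sites),
# the features are encoding institution-specific signal such as staining or scanner.
probe = LogisticRegression(max_iter=1000)
scores = cross_val_score(probe, features, site_labels, cv=5)
print(f"site-prediction accuracy: {scores.mean():.2f} (chance ~ 0.20)")
```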
𝗜𝗺𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀
This research highlights practical challenges for deploying AI in pathology workflows. When models are trained primarily on data from specific institutions, they may not generalize well to others using different protocols.
𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵𝗲𝘀 𝘁𝗼 𝗥𝗲𝗱𝘂𝗰𝗲 𝗕𝗶𝗮𝘀
The study tested several methods to mitigate these issues:
1. Stain normalization techniques
2. Grayscale transformations
3. Balanced sampling strategies
While these approaches showed improvements, institution-specific bias remained partially present. The Reinhard normalization method and Noise-Based Grayscale Normalization offered the best balance between bias reduction and maintaining diagnostic performance.
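If you want to experiment with the first of these ideas, here is a minimal sketch of Reinhard-style color normalization, assuming scikit-image is installed. It matches per-channel mean and standard deviation to a reference image in CIELAB (used here as a stand-in for the lαβ space of the original method); it is an illustration, not the exact pipeline evaluated in the paper.

```python
# Minimal Reinhard-style normalization sketch: match per-channel statistics
# of a source image to a chosen reference image in a perceptual color space.
import numpy as np
from skimage import color

def reinhard_normalize(source_rgb: np.ndarray, reference_rgb: np.ndarray) -> np.ndarray:
    """source_rgb, reference_rgb: float RGB arrays in [0, 1], shape (H, W, 3)."""
    src = color.rgb2lab(source_rgb)
    ref = color.rgb2lab(reference_rgb)
    out = np.empty_like(src)
    for c in range(3):
        src_mean, src_std = src[..., c].mean(), src[..., c].std() + 1e-8
        ref_mean, ref_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - src_mean) / src_std * ref_std + ref_mean
    return np.clip(color.lab2rgb(out), 0.0, 1.0)

# Hypothetical usage: normalized_patch = reinhard_normalize(patch, reference_patch)
```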
This work contributes to our understanding of how to develop more reliable AI tools for pathology by accounting for and addressing these sources of bias.
Insights: Annotation - Managing Annotator Variability in Large-Scale ML Datasets
During my recent webinar on machine learning challenges, an important question came up that many data scientists struggle with:
Q: How do you tackle distribution shift from different annotators for large-scale datasets with many annotators?
This is a critical issue that often gets overlooked until it's causing significant problems in model performance. Let me share some expanded thoughts on this challenge.
𝐓𝐡𝐞 𝐂𝐨𝐫𝐞 𝐏𝐫𝐨𝐛𝐥𝐞𝐦
When multiple annotators apply subjective judgments to your data:
- Different interpretations of what constitutes a positive vs. negative case
- Varying levels of expertise and attention to detail
- Inconsistent application of annotation guidelines
These differences can confuse your model during training. This is especially prevalent in medical imaging interpretation and any domain requiring subjective judgment.
𝐌𝐞𝐚𝐬𝐮𝐫𝐢𝐧𝐠 𝐭𝐡𝐞 𝐏𝐫𝐨𝐛𝐥𝐞𝐦
Before implementing solutions, quantify the extent of annotator variability:
- Select a representative sample (e.g., 100 images)
- Have multiple annotators (5+) label each item
- Calculate inter-rater reliability metrics (one way to do this is sketched below)
- Identify specific cases with high disagreement
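Here is a minimal sketch of that measurement step, assuming scikit-learn is available. The `labels` array is a random stand-in for a pilot sample of 100 items labeled by 5 annotators; with real data you would load your annotation table instead.

```python
# Quantify annotator variability on a pilot sample (illustrative data).
import numpy as np
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(100, 5))   # stand-in: 100 items x 5 annotators

# Average pairwise Cohen's kappa as a simple inter-rater reliability summary.
kappas = [cohen_kappa_score(labels[:, i], labels[:, j])
          for i, j in combinations(range(labels.shape[1]), 2)]
print(f"mean pairwise kappa: {np.mean(kappas):.2f}")

# Flag the items with the most disagreement (candidates for guideline review).
def disagreement(row: np.ndarray) -> float:
    _, counts = np.unique(row, return_counts=True)
    return 1.0 - counts.max() / row.size      # 0 = unanimous, higher = more split

scores = np.apply_along_axis(disagreement, 1, labels)
print("items to review:", np.argsort(scores)[::-1][:10])
```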
𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐚𝐥 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧𝐬 𝐈'𝐯𝐞 𝐈𝐦𝐩𝐥𝐞𝐦𝐞𝐧𝐭𝐞𝐝
1. Refine annotation guidelines:
- Analyze disagreement patterns to identify ambiguous areas
- Provide clearer definitions and more examples
- Create decision trees for edge cases
2. Implement tiered annotation processes:
- Have multiple annotators label challenging items
- Use majority voting for consensus (a short sketch after this list pairs this with confidence weighting)
- Escalate high-disagreement cases to expert reviewers
3. Annotator calibration sessions:
- Regular group reviews of challenging cases
- Feedback loops to align understanding
- Ongoing quality monitoring
4. Data-centric approaches:
- Focus on improving annotation quality rather than just model tweaking
- Consider weighted training based on annotation confidence
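As a rough sketch of two of these ideas (majority voting with escalation, and a per-item confidence weight that could later scale the training loss), the snippet below uses illustrative names and thresholds rather than anything prescribed above.

```python
# Illustrative consensus labeling with escalation and confidence weights.
import numpy as np

def consensus_and_confidence(labels: np.ndarray, escalate_below: float = 0.8):
    """labels: (n_items, n_annotators) categorical labels from multiple annotators."""
    consensus, confidence, needs_review = [], [], []
    for row in labels:
        values, counts = np.unique(row, return_counts=True)
        consensus.append(values[counts.argmax()])         # majority-vote label
        agreement = counts.max() / row.size               # fraction agreeing with majority
        confidence.append(agreement)
        needs_review.append(agreement < escalate_below)   # escalate to expert reviewers
    return np.array(consensus), np.array(confidence), np.array(needs_review)

# Example: 4 items labeled by 5 annotators.
labels = np.array([[1, 1, 1, 1, 1],
                   [0, 1, 0, 1, 1],
                   [0, 0, 0, 1, 0],
                   [1, 1, 0, 1, 1]])
y, w, review = consensus_and_confidence(labels)
print(y, w.round(2), review)   # w could serve as per-sample weights during training
```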
𝐓𝐡𝐞 𝐤𝐞𝐲 𝐢𝐧𝐬𝐢𝐠𝐡𝐭: tackling this issue from the data perspective often yields better results than trying to solve it purely through modeling techniques.
Enjoy this newsletter? Here are more things you might find helpful:
Team Workshop: Harnessing the Power of Foundation Models for Pathology - Ready to unlock new possibilities for your pathology AI product development? Join me for an exclusive 90-minute workshop designed to catapult your team's model development.
Schedule now
Did someone forward this email to you, and you want to sign up for more? Subscribe to future emails
This email was sent to _t.e.s.t_@example.com. Want to change to a different address? Update subscription
Want to get off this list? Unsubscribe
My postal address: Pixel Scientia Labs, LLC, PO Box 98412, Raleigh, NC 27624, United States