Distribution Shift: The Silent Killer of Computer Vision Models
As AI practitioners, we've all faced the frustrating reality: a model that performs brilliantly in the lab, only to falter when deployed in the real world. Why? Often, it's due to distribution shift.
I’m offering a private team workshop to dive deep into this critical issue:
- How to detect distribution shifts in your data
- Which types of distribution shift are affecting your models
- Which techniques are best suited to detecting and mitigating their effects
This isn't just theory: we'll focus on how distribution shift is affecting YOUR models, then examine practical strategies for building more robust models that can withstand the ever-changing landscape of real-world data.
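To make this concrete, one common way to flag drift is to compare a reference (training-time) sample of a feature or model score against production data with a two-sample statistical test. Below is a minimal sketch using synthetic data; the summary statistic and significance threshold are illustrative assumptions, not a recommendation for your pipeline:

```python
# Minimal drift check: compare reference vs. production samples of a
# per-image summary statistic with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Stand-ins for a per-image statistic (e.g., mean embedding activation).
reference = rng.normal(loc=0.0, scale=1.0, size=1000)   # training-time data
production = rng.normal(loc=0.5, scale=1.2, size=1000)  # shifted deployment data

statistic, p_value = ks_2samp(reference, production)
drift_detected = p_value < 0.01  # the threshold is a design choice

print(f"KS statistic={statistic:.3f}, p={p_value:.2e}, drift={drift_detected}")
```

Univariate tests like this are only a first-line monitor; shifts that matter in high-dimensional image embeddings often need richer detectors, which is exactly the kind of trade-off the workshop digs into.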
𝐖𝐡𝐲 𝐭𝐡𝐢𝐬 𝐰𝐨𝐫𝐤𝐬𝐡𝐨𝐩 𝐢𝐬 𝐚 𝐠𝐚𝐦𝐞-𝐜𝐡𝐚𝐧𝐠𝐞𝐫:
- 𝐓𝐚𝐢𝐥𝐨𝐫𝐞𝐝 𝐟𝐨𝐫 𝐲𝐨𝐮𝐫 𝐭𝐞𝐚𝐦: This isn’t a generic seminar. The content is customized to address your team’s specific challenges and goals.
- 𝐑𝐎𝐈-𝐟𝐨𝐜𝐮𝐬𝐞𝐝: Learn how tackling distribution shift can improve model accuracy and reliability, ensuring better outcomes for your AI projects.
- 𝐀𝐜𝐭𝐢𝐨𝐧-𝐨𝐫𝐢𝐞𝐧𝐭𝐞𝐝: Walk away with implementable strategies, not just theoretical knowledge.
- 𝐅𝐮𝐭𝐮𝐫𝐞-𝐩𝐫𝐨𝐨𝐟 𝐲𝐨𝐮𝐫 𝐭𝐞𝐚𝐦: Equip your team with the skills to handle evolving data distributions and stay ahead in the competitive AI landscape.
Whether you're a seasoned ML engineer or just starting out, understanding distribution shift is crucial for deploying AI systems that truly work in the wild.
Interested? Just reply and I'll send you details.
P.S. Not sure if this fits your needs? Happy to chat 1:1 about your team’s goals!
White Paper: AI Diagnostics
Owkin’s approach to building AI diagnostics: From Slide to Device
Last year I wrote a series of blogs for Owkin to help demystify the development of an AI diagnostic.
Owkin has now combined them into a white paper exploring how AI is revolutionizing pathology by enhancing diagnostic accuracy and improving patient outcomes.
Here are some key highlights:
𝐎𝐩𝐩𝐨𝐫𝐭𝐮𝐧𝐢𝐭𝐢𝐞𝐬 𝐚𝐧𝐝 𝐂𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞𝐬: AI offers opportunities like automating tasks and detecting subtle patterns, while addressing challenges such as integration and data quality.
𝐃𝐚𝐭𝐚 𝐐𝐮𝐚𝐥𝐢𝐭𝐲 𝐚𝐧𝐝 𝐌𝐨𝐝𝐞𝐥 𝐁𝐮𝐢𝐥𝐝𝐢𝐧𝐠: Diverse, high-quality data is critical for building machine learning models using both unsupervised and supervised learning.
𝐕𝐚𝐥𝐢𝐝𝐚𝐭𝐢𝐨𝐧 𝐚𝐧𝐝 𝐂𝐨𝐦𝐩𝐥𝐢𝐚𝐧𝐜𝐞: Rigorous validation and regulatory compliance are essential to ensure AI diagnostics are accurate, reliable, and ethically deployed.
Read the full white paper to learn more.
Research: Change Detection
OPTIMUS: Observing Persistent Transformations in Multi-Temporal Unlabeled Satellite-Data
Monitoring changes in the Earth's surface is critical for identifying environmental problems like wildfires, flooding, and deforestation.
Traditionally, training a machine learning model to detect these changes required a significant amount of labeled data.
Raymond Yu et al. presented a new approach using self-supervised learning.
Given a set of image time series, they trained OPTIMUS to recover the relative temporal order of the images: the model predicts which of two other images from the same time series is closer in time to a given anchor image.
OPTIMUS achieved more than a 20% increase in AUC in comparison to other methods.
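The ordering pretext task described above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' code: a shared encoder embeds an anchor image and two candidate images from the same time series, and a small head predicts which candidate is temporally closer. The backbone, layer sizes, and names are all placeholders:

```python
# Sketch of a temporal-ordering pretext task: classify which of two
# candidates is closer in time to the anchor image.
import torch
import torch.nn as nn

class TemporalOrderNet(nn.Module):
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        # Tiny CNN stand-in for a real satellite-image backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, embed_dim),
        )
        # Classify over {candidate A closer, candidate B closer}.
        self.head = nn.Linear(3 * embed_dim, 2)

    def forward(self, anchor, cand_a, cand_b):
        z = torch.cat([self.encoder(anchor),
                       self.encoder(cand_a),
                       self.encoder(cand_b)], dim=1)
        return self.head(z)  # logits over the two candidates

model = TemporalOrderNet()
imgs = torch.randn(4, 3, 64, 64)  # dummy batch of satellite patches
logits = model(imgs, imgs, imgs)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (4,)))
```

The labels here come for free from the timestamps, which is the whole appeal: no human annotation is needed to train the encoder.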
I love to read about creative uses of self-supervised learning like this. The innate structure of the data is often enough to build a powerful image representation.
Insights: Foundation Model Efficiency
Balancing Power and Efficiency in Pathology AI: The Foundation Model Dilemma
A question from my recent webinar on foundation models for pathology: are current foundation models in pathology over-reliant on massive computing resources rather than being optimized for computational efficiency?
Here's what recent research reveals:
1. 𝐓𝐫𝐞𝐧𝐝 𝐓𝐨𝐰𝐚𝐫𝐝𝐬 𝐌𝐚𝐬𝐬𝐢𝐯𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 Many foundation models in pathology are indeed relying on enormous datasets and computational resources. For example, Prov-GigaPath requires high-end GPUs like NVIDIA A100 for optimal performance.
2. 𝐓𝐡𝐞 𝐄𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐜𝐲 𝐂𝐨𝐮𝐧𝐭𝐞𝐫-𝐌𝐨𝐯𝐞𝐦𝐞𝐧𝐭 Some developers are focusing on optimizing data selection and model architecture rather than just scaling up. This approach aims to create more efficient models without sacrificing performance.
3. 𝐇𝐚𝐫𝐝𝐰𝐚𝐫𝐞 𝐋𝐢𝐦𝐢𝐭𝐚𝐭𝐢𝐨𝐧𝐬 𝐢𝐧 𝐂𝐥𝐢𝐧𝐢𝐜𝐚𝐥 𝐒𝐞𝐭𝐭𝐢𝐧𝐠𝐬 The substantial computational requirements of large models can be a barrier to adoption in resource-constrained medical facilities.
4. 𝐌𝐨𝐝𝐞𝐥 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 Larger foundation models trained on diverse datasets can be distilled into smaller models. For example, Virchow2G Mini is a 22-million-parameter distillation of the 1.9-billion-parameter Virchow2G.
5. 𝐅𝐮𝐭𝐮𝐫𝐞 𝐃𝐢𝐫𝐞𝐜𝐭𝐢𝐨𝐧𝐬 Research is ongoing to optimize models for less resource-intensive hardware, potentially broadening their accessibility.
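The distillation idea in point 4 can be sketched with the standard Hinton-style formulation: a small student is trained to match the temperature-softened output distribution of a frozen teacher. This is a generic illustration, not the actual Virchow2G Mini recipe; the models, sizes, and temperature are placeholders:

```python
# Generic knowledge distillation: minimize KL divergence between the
# student's and teacher's temperature-softened output distributions.
import torch
import torch.nn.functional as F

temperature = 4.0
teacher = torch.nn.Linear(32, 10).eval()  # stand-in for a large frozen model
student = torch.nn.Linear(32, 10)         # much smaller model to train

x = torch.randn(8, 32)  # dummy batch of tile embeddings
with torch.no_grad():
    teacher_logits = teacher(x)
student_logits = student(x)

# The T^2 factor keeps gradient magnitudes comparable across temperatures.
distill_loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=1),
    F.softmax(teacher_logits / temperature, dim=1),
    reduction="batchmean",
) * temperature ** 2
distill_loss.backward()
```

In practice this loss is usually combined with a task loss on labeled data, but the core mechanism of transferring a large model's "dark knowledge" into a compact one is just this.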
💡 𝐊𝐞𝐲 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲: While the trend towards larger models continues, there's a growing focus on developing more efficient and interpretable AI systems in pathology. The future likely lies in striking a balance between computational power and smart, efficient design.
Enjoy this newsletter? Here are more things you might find helpful:
Office Hours -- Are you a student with questions about machine learning for pathology or remote sensing? Do you need career advice? Once a month, I'm available to chat about your research, industry trends, career opportunities, or other topics. The next one is coming up on Thursday March 20! Register for the next session
Did someone forward this email to you, and you want to sign up for more? Subscribe to future emails
This email was sent to _t.e.s.t_@example.com. Want to change to a different address? Update subscription
Want to get off this list? Unsubscribe
My postal address: Pixel Scientia Labs, LLC, PO Box 98412, Raleigh, NC 27624, United States