
Research: Spatial Transcriptomics


Scaling up spatial transcriptomics for large-sized tissues: uncovering cellular-level tissue architecture beyond conventional platforms with iSCALE


Standard spatial transcriptomics platforms can analyze tissue samples up to about 25 square millimeters. But what if you need to study an entire tumor or organ section?

The context: Spatial transcriptomics has emerged as a powerful tool for understanding how gene expression varies across tissue space, providing insights into cell-cell interactions, tissue organization, and disease mechanisms. However, current commercial platforms face significant constraints: high costs, lengthy processing times, limited gene coverage, and crucially, small capture areas that restrict analysis to tissue fragments rather than whole organs or large anatomical structures.

This size limitation is particularly problematic for studying complex diseases like multiple sclerosis, where pathological changes occur across vast brain regions with heterogeneous patterns that can't be captured in small tissue sections.

Amelia Schroeder et al. developed iSCALE (inferring Spatially resolved Cellular Architectures in Large-sized tissue Environments), a computational framework that addresses this scale problem by leveraging the relationship between gene expression patterns and histological features visible in standard H&E stained slides.

Key technical approach:
- "Daughter capture" integration: Combines spatial transcriptomics data from multiple small captures taken at different regions of the same tissue
- H&E-guided prediction: Uses machine learning to predict gene expression patterns across entire large tissue sections based on histological features
- Semi-automatic alignment: Computationally stitches together data from adjacent tissue sections
- Cellular-resolution inference: Predicts super-resolution gene expression at near single-cell level
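The H&E-guided prediction step can be caricatured with a toy example: assume each H&E patch has already been summarized as a feature vector (in practice by a pretrained image model), fit a regression from features to the expression measured on the small "daughter" captures, then apply it to every patch of the large section. This is a minimal numpy sketch on simulated data, not the actual iSCALE model; all dimensions and the ridge regressor are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each H&E patch is summarized by a feature vector
# (in practice these would come from a pretrained image model).
n_measured, n_unmeasured, n_feats, n_genes = 200, 1000, 16, 5

# Latent feature-to-expression mapping, used only to simulate data here.
W_true = rng.normal(size=(n_feats, n_genes))
X_measured = rng.normal(size=(n_measured, n_feats))   # patches covered by ST captures
Y_measured = X_measured @ W_true + 0.1 * rng.normal(size=(n_measured, n_genes))
X_large = rng.normal(size=(n_unmeasured, n_feats))    # rest of the large section

# Ridge regression from histology features to expression (closed form).
lam = 1.0
W_hat = np.linalg.solve(X_measured.T @ X_measured + lam * np.eye(n_feats),
                        X_measured.T @ Y_measured)

# Predict expression across the whole large tissue section.
Y_pred = X_large @ W_hat
print(Y_pred.shape)  # one expression vector per unmeasured patch
```

The point of the sketch is only the data flow: a small measured region supervises a model that then generalizes across tissue far beyond the capture area.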

Validation and applications:
The method was tested on multiple sclerosis brain samples, where iSCALE uncovered lesion-associated cellular characteristics that were undetectable in conventional spatial transcriptomics (ST) experiments. The approach enables analysis of tissue areas far exceeding the physical capture limits of current platforms while maintaining spatial resolution.

Broader implications:
This work demonstrates a shift from hardware-limited to computationally-enabled spatial transcriptomics. Rather than requiring ever-larger capture arrays, iSCALE shows how intelligent integration of limited experimental data with computational inference can overcome physical platform constraints. The approach could enable spatial transcriptomics studies of whole organs, developmental processes requiring large-scale analysis, and disease contexts where pathology spans anatomical regions.

How might computational approaches like this change the scale of questions we can address in spatial biology?



Research: Multi-Sensor Foundation Models


PyViT-FUSE: A Foundation Model for Multi-Sensor Earth Observation Data


Earth observation relies on dozens of different sensors - optical, radar, thermal, hyperspectral - each capturing different aspects of our planet at varying resolutions and frequencies. The problem isn't lack of data; it's effectively combining these diverse data streams into coherent insights.

Manuel Weber and Carly Beneke developed PyViT-FUSE, a foundation model designed specifically to address this multi-sensor fusion challenge. Unlike previous approaches that often struggle with mixed-resolution inputs, this model uses attention mechanisms to intelligently weight and combine data from different sensors.

Key technical contributions:
- Handles arbitrary numbers of input bands at different resolutions
- Uses a pyramidal vision transformer architecture for processing fused data
- Trained via self-supervision on globally sampled datasets
- Provides interpretable attention visualizations showing decision-making process
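To make the fusion idea concrete, here is a minimal numpy sketch of attention-weighted combination of sensors resampled to a common grid. Everything here is a hypothetical stand-in (the sensor names, the nearest-neighbour resampling, and especially the scoring function, which in the real model is a learned pyramidal vision transformer, not a mean):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sensors at different native resolutions.
grid = 32
sensors = {
    "optical": rng.normal(size=(64, 64)),
    "radar":   rng.normal(size=(32, 32)),
    "thermal": rng.normal(size=(16, 16)),
}

def resample(img, size):
    """Nearest-neighbour resampling to a common grid (stand-in for real warping)."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

stack = np.stack([resample(v, grid) for v in sensors.values()])  # (3, 32, 32)

# Toy attention: score each sensor, softmax, then weighted combination.
scores = stack.mean(axis=(1, 2))              # stand-in for a learned scoring network
weights = np.exp(scores) / np.exp(scores).sum()
fused = np.tensordot(weights, stack, axes=1)  # (32, 32) fused representation
print(fused.shape, weights.round(3))
```

The softmax weights are also what makes the attention interpretable: inspecting them shows how much each sensor contributed to the fused output.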

Practical implications:
This approach could improve applications where sensor fusion is critical - from more accurate crop monitoring to better disaster response and urban planning. The self-supervised training also means less dependence on expensive manual labeling.

Insights: Batch Effects


Is More Data Really the Silver Bullet for Batch Effects in Medical Imaging AI?


When building AI models for medical imaging, we often hear the same advice: "Just get more data." But is diversity in your dataset truly enough to overcome the notorious challenge of batch effects?

During my recent webinar on bias and batch effects in medical imaging, someone asked whether the ultimate solution lies in simply collecting more diverse and comprehensive data.

My answer? Yes, but with a crucial caveat.

Gathering diverse, representative data will indeed take you further than implementing the latest state-of-the-art algorithm on a flawed dataset. This fundamental truth of machine learning holds especially true in medical imaging.

However, the reality is more complex. Medical center differences are just one type of batch effect. Consider the full spectrum of variables: scanner manufacturers, acquisition protocols, staining techniques, tissue thickness (in pathology), reconstruction algorithms, and countless other factors that vary across clinical settings.

Creating a dataset truly diverse across ALL these dimensions is technically challenging and potentially prohibitively expensive.

The pragmatic approach? Pursue diversity in your data as aggressively as resources allow, but simultaneously develop technical solutions and robust validation strategies that can identify and mitigate the batch effects you couldn't eliminate through data collection alone.
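One concrete validation strategy along these lines: if a classifier can predict the acquisition batch from your model's embeddings at well above chance accuracy, batch information is leaking into the representation. A toy numpy sketch with simulated embeddings, where the 0.5 intensity shift plays the role of a hypothetical scanner difference:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical embeddings from two scanners; scanner B has a small shift.
n = 300
emb_a = rng.normal(0.0, 1.0, size=(n, 8))
emb_b = rng.normal(0.5, 1.0, size=(n, 8))   # simulated batch effect
X = np.vstack([emb_a, emb_b])
y = np.array([0] * n + [1] * n)

# Shuffle and split into train/test.
idx = rng.permutation(2 * n)
X, y = X[idx], y[idx]
X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

# Nearest-centroid "batch classifier": can it tell the scanners apart?
c0, c1 = X_tr[y_tr == 0].mean(0), X_tr[y_tr == 1].mean(0)
pred = (np.linalg.norm(X_te - c1, axis=1) < np.linalg.norm(X_te - c0, axis=1)).astype(int)
acc = (pred == y_te).mean()
print(f"batch-prediction accuracy: {acc:.2f}")  # well above 0.5 here, flagging a batch effect
```

If this accuracy is near 0.5, the embedding carries little batch signal; the further above chance it climbs, the more aggressively you need harmonization or batch-robust training.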

Enjoy this newsletter? Here are more things you might find helpful:


Team Workshop: Mastering Distribution Shift in Computer Vision - Ready to transform your computer vision models into robust systems that thrive in real-world conditions? Join me for an exclusive 90-minute workshop designed to empower your team to identify, understand, and address distribution shift, one of the most critical challenges in building AI systems.

Schedule now
