Hi,
If you’ve been following me on LinkedIn for a while now, you’ve likely noticed that one of my favorite topics is using deep learning to predict molecular biomarkers from H&E. This was the focus of my PhD research, and it’s been exciting to follow the rapid advances since.
To assess tissue properties like receptor status, genomic subtype, mutational status, or other clinically relevant phenotypes, pathologists often rely on different types of molecular analysis -- immunohistochemistry or RNA sequencing, for example. These analyses are time-consuming and costly, so they are not routinely performed. However, they can provide key information for selecting an appropriate treatment.
Deep learning has demonstrated repeated success in predicting some of these complex and abstract biomarkers from H&E alone, even on datasets with fewer than 1000 patients. Larger training sets will likely enable improved prediction performance.
We might not need to wait for larger datasets, though -- Krause et al. just published a method that improves classification accuracy by augmenting the training set with synthetically generated images [1].
They trained a Conditional Generative Adversarial Network (CGAN) on their training set to create new sample images with and without microsatellite instability. Augmenting their training set with these synthetic images improved model accuracy.
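To make the augmentation step concrete, here is a minimal Python sketch of mixing real and synthetic labeled samples into one training set. It is not the authors' code: `fake_conditional_generator` is a hypothetical stand-in for a trained CGAN generator, the "tiles" are short lists of numbers rather than images, and `augment_with_synthetic` and `n_per_class` are names I made up for illustration.

```python
import random
from typing import Callable, List, Tuple

Tile = List[float]         # stand-in for an image tile (real tiles would be pixel arrays)
Sample = Tuple[Tile, int]  # (tile, label): 1 = MSI, 0 = microsatellite stable


def fake_conditional_generator(label: int, rng: random.Random) -> Tile:
    """Hypothetical placeholder for a trained CGAN generator.

    A real generator would map (noise vector, class label) to an image.
    Here the label only shifts the mean of a few random numbers.
    """
    return [rng.gauss(float(label), 1.0) for _ in range(4)]


def augment_with_synthetic(
    real: List[Sample],
    generator: Callable[[int, random.Random], Tile],
    n_per_class: int,
    seed: int = 0,
) -> List[Sample]:
    """Return the real samples plus n_per_class synthetic samples per label."""
    rng = random.Random(seed)
    synthetic = [
        (generator(label, rng), label)  # condition the generator on the label
        for label in (0, 1)
        for _ in range(n_per_class)
    ]
    combined = real + synthetic  # new list; the real set is left unchanged
    rng.shuffle(combined)        # mix real and synthetic before training
    return combined


real_tiles = [([0.1, 0.2, 0.0, -0.1], 0), ([1.2, 0.9, 1.1, 0.8], 1)]
train_set = augment_with_synthetic(real_tiles, fake_conditional_generator, n_per_class=3)
print(len(train_set))  # 2 real + 6 synthetic = 8
```

The key point is that the generator is conditioned on the class label, so every synthetic image arrives with a known label and can be added to the training set directly.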
GANs have advanced rapidly over the last few years. (You’ve probably heard about “deepfakes.” Many of these are created with GANs.) It’s great to see a practical use in histology!