Latest Artificial Intelligence (AI) Study from Harvard Finds Ways to Maximize the Accuracy of Image Segmentation by Machine Learning Algorithms in Multiplexed Tissue Images Containing Common Imaging Artefacts

On Nov 23, 2022

The cell types, basement membranes, and connective structures that organize tissues and tumors span length scales from subcellular organelles to whole organs (0.1 to >10^4 µm). Microscopy of tissue stained with hematoxylin and eosin (H&E) or by immunohistochemistry has long been the method of choice for investigating tissue architecture, and clinical histopathology remains the principal method for diagnosing and managing diseases such as cancer. Classical histology, however, does not provide enough molecular information to precisely characterize disease genes, analyze developmental pathways, or identify cell subtypes.

High-plex imaging of healthy and diseased tissues (also known as spatial proteomics) provides enough molecular detail to identify cell types, assess cell states (quiescent, proliferating, dying, etc.), and investigate cell signaling pathways. It also reveals the morphologies and locations of the acellular structures needed for tissue integrity, all within a preserved 3D environment. High-plex imaging techniques differ in resolution, field of view, and multiplicity (plex), but they all produce 2D images of tissue sections that are typically 5–10 µm thick.

The single-cell data produced by segmenting and quantifying multiplexed images naturally complement single-cell RNA sequencing (scRNASeq) data, which has significantly advanced our understanding of healthy and diseased cells and tissues. Unlike dissociative RNASeq, multiplexed tissue imaging preserves morphological and spatial information. However, high-plex imaging data are far more difficult to analyze computationally than images of cultured cells, which have until now been the main focus of biology-oriented machine vision systems.

Techniques for segmenting metazoan cells have undergone extensive development, but segmenting tissue images is a harder problem because of cell crowding and the variety of cell shapes. Mirroring the widespread use of convolutional neural networks (CNNs) in image recognition, object detection, and synthetic image generation, segmentation algorithms based on machine learning have recently become mainstream. Architectures like ResNet, VGG16, and, more recently, UNet and Mask R-CNN have gained wide acceptance for their capacity to learn millions of parameters and generalize across datasets.

Since most cell types have only one nucleus, and nuclear stains with high signal-to-background ratios are widely available, localizing nuclei is an ideal starting point for segmenting cultured cells and tissues. Earlier work proposed two random forest-based approaches, which use an ensemble of decision trees to assign class probabilities to each pixel and can draw on several image channels to do so. A significant drawback of random forest models, however, is that they have far less learning capacity than CNNs. The potential of using CNNs with multi-channel data to improve nucleus segmentation has so far received little attention.

The most popular method for extending training data to account for image artefacts is computational augmentation, which entails randomly rotating, shearing, flipping, or otherwise transforming images during pre-processing. This prevents algorithms from learning irrelevant properties of an image, such as its orientation. Focus artefacts have so far been addressed by adding computed Gaussian blur to the training data. Gaussian blur, however, is only a rough approximation of the blurring produced by any optical imaging system with a limited bandpass, such as a real microscope, and it does not capture the effects of mismatched refractive indices and light scattering.
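The conventional augmentation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the flip and blur probabilities, and the sigma range are all assumptions chosen for illustration, and images are represented as plain nested lists of grayscale intensities so that the example needs only the Python standard library.

```python
import math
import random


def flip_horizontal(img):
    """Mirror each row of a 2D image (list of rows)."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_kernel(sigma, radius):
    """1D Gaussian weights, normalized to sum to 1."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping at the image edges."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur: blur along rows, then along columns."""
    kernel = gaussian_kernel(sigma, max(1, int(3 * sigma)))
    blurred_rows = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(blurred_rows), kernel))


def augment(img, rng):
    """Apply a random flip, a random multiple of 90-degree rotation,
    and (half the time) a computed Gaussian blur.
    The probabilities and sigma range are illustrative assumptions."""
    out = img
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    for _ in range(rng.randrange(4)):
        out = rotate_90(out)
    if rng.random() < 0.5:
        out = gaussian_blur(out, sigma=rng.uniform(0.5, 2.0))
    return out


# Usage: produce several augmented copies of one training tile.
tile = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
copies = [augment(tile, random.Random(seed)) for seed in range(4)]
```

Each augmented copy would be paired with a ground-truth mask that has been flipped and rotated in the same way. The blur step here is exactly the kind of computed approximation that the study compares against images that were genuinely acquired out of focus.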

This research explores ways to improve the accuracy of machine learning-based image segmentation for multiplexed tissue images that contain typical imaging artefacts. By manually curating a variety of normal tissues and tumors, the researchers created training and test sets with ground-truth annotations. They then used these data to measure the segmentation accuracy of three deep learning networks, each trained and tested independently: UNet, Mask R-CNN, and Pyramid Scene Parsing Network (PSPNet). The resulting models form a family of Universal Models for Identifying Cells and Segmenting Tissue (UnMICST), each based on a different type of ML network but trained on the same data. The study identified two strategies that increase segmentation accuracy for all three networks. The first combines images of nuclear chromatin stained with DNA-intercalating dyes with images of nuclear envelope staining (NES). The second adds real augmentations, defined here as intentionally defocused and oversaturated images, to the training data to make models more robust to the kinds of artefacts seen in actual tissue images. The researchers found that real augmentation significantly outperforms conventional Gaussian blur augmentation, yielding a statistically significant improvement in model robustness. The benefits of adding NES data and real augmentations are cumulative and hold across a variety of tissue types.
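To make the first strategy concrete, the following minimal sketch shows one way a DNA stain image and an NES image of the same field could be combined into a single two-channel input for a segmentation network. It is an illustration under stated assumptions, not the UnMICST code: the function names, the min-max normalization, and the nested-list image representation are all hypothetical choices.

```python
def normalize(channel):
    """Min-max scale a 2D image to the range [0, 1].
    A constant image maps to all zeros."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in channel]
    return [[(v - lo) / span for v in row] for row in channel]


def stack_channels(dna, nes):
    """Combine two registered images into one image whose pixels are
    (dna_intensity, nes_intensity) pairs, i.e. a two-channel input."""
    same_shape = len(dna) == len(nes) and all(
        len(a) == len(b) for a, b in zip(dna, nes)
    )
    if not same_shape:
        raise ValueError("DNA and NES images must have the same shape")
    dna_n, nes_n = normalize(dna), normalize(nes)
    return [
        [(d, n) for d, n in zip(d_row, n_row)]
        for d_row, n_row in zip(dna_n, nes_n)
    ]


# Usage: two tiny 2x2 images of the same field of view.
dna_image = [[0, 10], [5, 10]]
nes_image = [[2, 2], [4, 6]]
two_channel = stack_channels(dna_image, nes_image)
```

Normalizing each channel separately keeps a bright DNA stain from swamping a dimmer envelope stain. A network given this input sees both the chromatin signal and the nuclear boundary signal at every pixel.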

Check out the paper and code. All credit for this research goes to the researchers on this project.

Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
