Multi-stage Synthetic Image Generation for the Semantic Segmentation of Medical Images

Published in Advances in Smart Healthcare Paradigms and Applications, Springer, 2023

Recommended citation: Paolo Andreini, Simone Bonechi, Giorgio Ciano, Caterina Graziani, Veronica Lachi, Natalia Nikoloulopoulou, Monica Bianchini, and Franco Scarselli. Multi-stage Synthetic Image Generation for the Semantic Segmentation of Medical Images. In Advances in Smart Healthcare Paradigms and Applications, pages 79–104. Springer International Publishing, Cham, 2023.

Abstract

Recently, deep learning methods have had a tremendous impact on computer vision applications, from image classification and semantic segmentation to object detection and face recognition. Nevertheless, training state-of-the-art neural network models usually depends on the availability of large sets of supervised data. Indeed, deep neural networks have a huge number of parameters which, to be properly trained, require a fairly large dataset of supervised examples. This problem is particularly relevant in the medical field, due to privacy issues and the high cost of image tagging by medical experts. In this chapter, we present a new approach that reduces this limitation by generating synthetic images together with their corresponding supervision. In particular, the approach can be applied to semantic segmentation, where the generated images (and label-maps) can be used to augment real datasets during network training. The main characteristic of our method, unlike other existing techniques, lies in a generation procedure carried out in multiple steps, based on the intuition that splitting the procedure into multiple phases simplifies the overall generation task. The effectiveness of the proposed multi-stage approach has been evaluated on two different domains, retinal fundus and chest X-ray images. In both domains, the multi-stage approach has been compared with a single-stage generation procedure. The results suggest that generating images in multiple steps is more effective and computationally cheaper, while still producing high-resolution, realistic images that can be used to train deep networks.
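To make the multi-stage idea concrete, here is a minimal sketch of the pipeline shape the abstract describes: a first stage produces a semantic label-map, and a second stage renders an image conditioned on that label-map, yielding an (image, label-map) pair ready to augment a real dataset. All names, shapes, and class intensities below are illustrative assumptions, not the chapter's actual models; in the paper both stages would be learned generators.

```python
import random

# Hypothetical label set for illustration (not from the chapter),
# e.g., background / tissue / structure of interest.
LABELS = (0, 1, 2)

def generate_label_map(height, width, rng):
    """Stage 1 (stand-in): sample a label-map.

    In the chapter this would be a learned generator producing
    plausible anatomical layouts; here it just draws random labels
    to demonstrate the pipeline structure.
    """
    return [[rng.choice(LABELS) for _ in range(width)] for _ in range(height)]

def render_image(label_map, rng):
    """Stage 2 (stand-in): synthesize intensities conditioned on labels.

    Each class gets a base intensity plus Gaussian noise, clipped to
    [0, 1]; a real second stage would translate the label-map into a
    realistic image (e.g., with a conditional generative model).
    """
    base = {0: 0.1, 1: 0.5, 2: 0.9}  # assumed per-class intensities
    return [[min(1.0, max(0.0, base[lab] + rng.gauss(0.0, 0.05)))
             for lab in row] for row in label_map]

def synthesize_pair(height, width, seed=0):
    """Full two-stage pipeline: returns (image, label_map).

    The pair is perfectly aligned by construction, which is what makes
    synthetic data usable as segmentation supervision.
    """
    rng = random.Random(seed)
    label_map = generate_label_map(height, width, rng)
    image = render_image(label_map, rng)
    return image, label_map

image, label_map = synthesize_pair(4, 4, seed=42)
```

Because the label-map is generated first and the image is derived from it, the supervision is exact by construction, which is the key property that lets such pairs augment real training sets without any manual annotation.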

You can find the full paper here.