AI scratching your car: Using diffusion models for training data generation in automotive damage detection
Julian Stritzel, M. Saquib Sarfraz, Rainer Stiefelhagen
Chapter/contribution from the book: Längle T. & Heizmann M. 2024. Forum Bildverarbeitung 2024.
Demand for reliable data remains a major issue when training machine learning models in computer vision. Datasets are frequently too small, imbalanced, insufficiently diverse, or of poor quality, which can result in models that are biased, inaccurate, non-robust, and poor at generalizing. Moreover, real-world training data can raise privacy concerns or be extremely expensive to gather, necessitating alternative solutions. This paper investigates the use of diffusion models for generative data augmentation in semantic image segmentation, specifically in the domain of vehicle damage detection. We propose a new approach that uses an existing diffusion model, ControlNet, to generate useful synthetic data depicting realistic vehicles with damage such as scratches, rim damage, and dents. Based on this, we provide an analysis and show how such generative data augmentation may help in scenarios where training data is scarce and of low quality.