Extensive clinical testing of Deep Learning Segmentation models for thorax and breast cancer radiotherapy planning
DOI:
https://doi.org/10.1080/0284186X.2023.2270152
Keywords:
Breast cancer, automatic target segmentation, deep learning segmentation, automatic organs at risk segmentation, artificial intelligence
Abstract
Background
The performance of deep learning segmentation (DLS) models for automatic organ extraction from CT images in the thorax and breast regions was investigated. Furthermore, the readiness and feasibility of integrating DLS into clinical practice were addressed by measuring the potential time savings and dosimetric impact.
Material and Methods
Thirty patients referred for radiotherapy for breast cancer were prospectively included. A total of 23 clinically relevant left- and right-sided organs were contoured manually on CT images according to ESTRO guidelines. Next, auto-segmentation was executed, and the geometric agreement between the auto-segmented and manually contoured organs was qualitatively assessed on a scale from 0 (not acceptable) to 3 (no corrections). A quantitative validation was carried out by calculating Dice similarity coefficients (DSC) and the 95th percentile of Hausdorff distances (HD95). The dosimetric impact of optimizing the treatment plans on the uncorrected DLS contours was investigated through a dose coverage analysis using DVH values of the manually delineated contours as references.
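As an illustration of the two quantitative metrics, the following is a minimal sketch of how DSC and HD95 could be computed for a pair of binary masks. It is not the implementation used in the study: the function names, the NumPy-only brute-force approach, and the choice to take HD95 as the larger of the two directed 95th-percentile surface distances are assumptions made here for clarity (other definitions, such as pooling both directions before taking the percentile, are also in use).

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks (1.0 = identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = np.ones(mask.shape, dtype=bool)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel is interior only if every face neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(mask_a, mask_b, spacing):
    """95th-percentile Hausdorff distance between two mask surfaces.

    Assumed definition: the maximum of the two directed 95th percentiles.
    Brute-force pairwise distances; suitable only for small masks.
    """
    pa = surface_points(mask_a, spacing)
    pb = surface_points(mask_b, spacing)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("HD95 is undefined for an empty mask")
    dists = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)  # each surface point of A to nearest of B
    b_to_a = dists.min(axis=0)  # each surface point of B to nearest of A
    return max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))
```

Here `spacing` is the voxel size per axis (for example in mm), so HD95 is returned in physical units while DSC is dimensionless. For full-resolution CT volumes, a distance-transform-based implementation would be more practical than the pairwise distance matrix used in this sketch.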
Results
The qualitative analysis showed that 93% of the DLS-generated OAR contours did not need corrections, except for the heart, where 67% of the contours needed corrections. The majority of DLS-generated CTVs needed corrections, whereas a minority were deemed not acceptable. Still, using the DLS model for CTV and heart delineation was on average 14 minutes faster than manual delineation. An average DSC = 0.91 and HD95 = 9.8 mm were found for the left and right breasts. Likewise, an average DSC in the range [0.66, 0.76] and HD95 in the range [7.04, 12.05] mm were found for the lymph nodes.
Conclusion
The validation showed that the DLS-generated OAR contours can be used clinically. Corrections were required for most of the DLS-generated CTVs, which therefore warrant more attention before the DLS models are possibly implemented clinically.