Hyperspectral Imaging for Non-invasive Diagnostics of Melanocytic Lesions
John Paoli1,2, Ilkka Pölönen3, Mari Salmivuori4,5, Janne Räsänen4,6, Oscar Zaar1,2, Sam Polesie1,2, Sari Koskenmies5, Sari Pitkänen5, Meri Övermark5, Kirsi Isoherranen5, Susanna Juteau7, Annamari Ranki5, Mari Grönroos4 and Noora Neittaanmäki1,8,9
1Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, 2Department of Dermatology and Venereology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden, 3Faculty of Information Technology, University of Jyväskylä, 4Department of Dermatology and Allergology, Päijät-Häme Social and Health Care Group, Lahti, 5Department of Dermatology and Allergology, University of Helsinki and Helsinki University Hospital, Helsinki, 6Department of Dermatology, Tampere University Hospital and Faculty of Medicine and Medical technology, Tampere University, Tampere, 7Department of Pathology, University of Helsinki and HUSLAB, Helsinki, Finland, 8Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg and 9Department of Clinical Pathology, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
Malignant melanoma poses a clinical diagnostic problem, since a large number of benign lesions are excised to find a single melanoma. This study assessed the accuracy of a novel non-invasive diagnostic technology, hyperspectral imaging, for melanoma detection. Lesions were imaged prior to excision and histopathological analysis. A deep neural network algorithm was trained twice to distinguish between histopathologically verified malignant and benign melanocytic lesions and to classify the separate subgroups. Furthermore, 2 different approaches were used: a majority vote classification and a pixel-wise classification. The study included 325 lesions from 285 patients. Of these, 74 were invasive melanoma, 88 melanoma in situ, 115 dysplastic naevi, and 48 non-dysplastic naevi. The study included a training set of 358,800 pixels and a validation set of 7,313 pixels, and the trained algorithm was then tested with a test set of 24,375 pixels. The majority vote classification achieved a high overall sensitivity of 95% and a specificity of 92% (95% confidence interval (95% CI) 0.024–0.029) in differentiating malignant from benign lesions. In the pixel-wise classification, the overall sensitivity and specificity were both 82% (95% CI 0.005–0.005). When divided into 4 subgroups, the diagnostic accuracy was lower. Hyperspectral imaging provides high sensitivity and specificity in distinguishing between naevi and melanoma. This novel method still needs further validation.
Key words: hyperspectral imaging; non-invasive diagnostic; machine learning; malignant melanoma.
Accepted Oct 25, 2022; Epub ahead of print Oct 25, 2022
Acta Derm Venereol 2022; 102: adv00815.
DOI: 10.2340/actadv.v102.2045
Corr: Noora Neittaanmäki, Department of Clinical Pathology, Sahlgrenska University Hospital, Gula Stråket 8, SE-41345 Gothenburg, Sweden. E-mail: noora.neittaanmaki@fimnet.fi
SIGNIFICANCE
To aid melanoma diagnostics, various non-invasive technologies have been developed. Hyperspectral imaging is a novel non-invasive technology, which combines digital imaging, spectroscopy and the use of machine learning. Its advantages include a large field of view and a rapid imaging process. This study assessed the accuracy of hyperspectral imaging in distinguishing between histopathologically verified naevi and melanomas. The results indicate that hyperspectral imaging is feasible for non-invasive diagnostics and provides high sensitivity and specificity. The novel method needs further validation with larger data-sets. The results will serve as a basis for future development of this novel imaging technique for commercial use.
INTRODUCTION
Invasive malignant melanoma (MM) is the deadliest type of skin cancer, and its prognosis is related to the invasion depth (Breslow depth) at the time of diagnosis (1). With increasing incidence, the healthcare costs of MM are expected to expand dramatically (2). MM is the cost driver of skin cancers, with total annual costs of > 90 million euros in Sweden (3). Earlier diagnosis could minimize these expenses, since the costs of melanoma in situ (MIS) without metastatic potential are significantly lower than those of advanced or metastasized MM (4).
Excisional biopsy and histopathological examination of suspicious pigmented lesions is the current gold standard for diagnosing MM. Even though early detection of MM is the best strategy to reduce mortality associated with melanoma, unnecessary excision of benign lesions increases morbidity and raises healthcare costs associated with melanoma screening (5). The number needed to excise, i.e. the number of excised lesions per diagnosed melanoma, varies from > 20 for primary care to 6 for pigmented lesion specialists (6). The main differential diagnostic problem for melanomas is posed by benign pigmented lesions, including dysplastic naevi (DN) and other non-dysplastic benign naevi (BN), which can resemble early MM or MIS clinically (7). There is therefore a demand for objective and non-invasive examination methods to aid clinicians in deciding which lesions to excise.
Hyperspectral imaging (HI) is a novel non-invasive imaging technique, which combines digital imaging, spectroscopy and the use of machine learning (ML) to provide automated diagnostic classifications. Unlike human colour vision, which is limited by the trichromatic colour system to detecting the wavelengths of visible light (380–740 nm), HI can provide information from wavelengths not visible to humans. A hyperspectral image is a stack of hundreds of overlapping images taken at different narrow wavebands of light. The resulting hyperspectral cube contains 2 spatial dimensions, and the spectral data for every pixel provide a third dimension. The technique can be used for diagnostic purposes (8, 9) or to visualize tumour borders (10, 11).
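The cube structure described above can be illustrated with a minimal NumPy sketch. The array size and the random values are hypothetical; the band count (120) and wavelength range (450–900 nm) are borrowed from the imager described in Materials and Methods, and the evenly spaced band centres are an assumption made only for this illustration.

```python
import numpy as np

# Hypothetical cube size, chosen small for illustration only.
height, width, n_bands = 4, 5, 120

# Synthetic values standing in for a real acquisition.
rng = np.random.default_rng(0)
cube = rng.random((height, width, n_bands))

# Assumed evenly spaced band centres in nm; real band centres may differ.
wavelengths = np.linspace(450.0, 900.0, n_bands)

# Spectrum of one pixel: a vector along the third (spectral) axis.
pixel_spectrum = cube[2, 3, :]

# Single waveband image: the 2D slice at the band closest to 600 nm.
band_index = int(np.argmin(np.abs(wavelengths - 600.0)))
band_image = cube[:, :, band_index]

print(pixel_spectrum.shape, band_image.shape)
```

Indexing along the first 2 axes selects a location on the skin, while indexing along the last axis selects a waveband.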
This study sought to determine the accuracy of a HI system combined with a novel 3-dimensional (3D) deep learning method in the non-invasive diagnosis of melanocytic lesions.
MATERIALS AND METHODS
Recruitment
The study protocol followed the principles of the Declaration of Helsinki and was approved by local ethics committees in both Gothenburg and Tampere (approval numbers 283-18 and R14120). Patients were recruited prospectively at 2 study centres in Finland (the Department of Dermatology of Helsinki University Hospital in Helsinki and the Päijät-Häme Central Hospital in Lahti) between June 2016 and October 2017 and at Sahlgrenska University Hospital in Gothenburg, Sweden between June 2018 and December 2019.
The inclusion criterion was any clinically atypical melanocytic lesion that was scheduled for excision and subsequent histopathological analysis.
Image acquisition and sampling for histopathology
The lesions were first photographed and evaluated with a dermatoscope by a dermatologist. Hyperspectral images were taken in vivo using 3 similar HI system prototypes (HSCP2, Revenio group, Finland). The system consists of a Fabry-Pérot interferometer (FPI) based hyperspectral imager and a diffuse illumination system (12). The use of an FPI enables fast scanning in the spectral domain. The imager captures 120 wavebands within seconds using the diffuse reflectance of visible and near-infrared light (wavebands 450–900 nm) within a large field of view (FOV) of 12 cm² (spatial resolution 6,400 pixels/cm). The imaging depth of HI depends on the wavelength (13). In the wavelength range used, the imaging depth varies between 0.5 and 5 mm as a function of wavelength. The full width at half maximum of each waveband varies from 5 to 15 nm. The camera used is capable of taking images at a resolution of 1,920×1,200 pixels. This corresponds to a spatial resolution of approximately 15 μm/pixel. Before each hyperspectral image was acquired, an image of a white reference standard was obtained. A detailed description of the HI technique is available elsewhere (8, 10, 11).
After imaging, the lesions were excised and processed for routine histopathological examination. The specimens were fixed in 4% formalin, embedded in paraffin, sectioned using the traditional vertical bread loaf technique and stained with haematoxylin-eosin (H&E). All samples were assessed by dermatopathologists and the histopathological diagnoses of the excised specimens were considered as the true label for the data-set.
Data processing
Based on the histopathological reports, a dermatologist (JP) and dermatopathologist (NN) manually annotated each image and categorized them according to 5 classes: healthy skin, BN, DN, MIS, and MM.
The mathematical modelling was performed by a mathematician (IP). A supervised ML approach was used to train a deep neural network algorithm to distinguish between the different lesion types. The neural network used in this study was modified from Hyper3Dnet, a network which utilizes both 3D and 2-dimensional convolutional layers, extracting features from both spectral and spatial domains (14). To increase the sample size, we chose to train a pixel-wise classification algorithm. The images were divided vertically in the middle of the annotated lesion, similarly to the method used by Räsänen et al. (9). The left side was used to train the algorithm and validate the training process. The right side was used to test classification performance. This ensured, firstly, that the training or validation set did not contain data-points from the image half currently being classified, and, secondly, that the training set contained a sufficient variation of different lesion types (15–17). In the pixel-wise analysis, this approach made the training and test pixel sets independent and reduced the effect of spatial autocorrelation (18). From the left side of each of the images, 50 pixels from the healthy area and 100 pixels from the annotated lesion area were randomly selected for training and validation. For each pixel, a 25×25×55 window was collected, where the first 2 dimensions correspond to the spatial domain and the last 1 to the spectral domain. The analysis included every second band in the spectral domain in order to reduce the amount of data for processing. This data-set was then randomly divided into the training set (41,437 samples) and the validation set (7,313 samples). There was a slight imbalance between classes in the training set. Classes were balanced using the random over-sampling method (19), which increased the amount of training data to 89,700 samples. Data augmentation was performed by rotating the images 3 times by 90° in the spatial domain (20). This multiplied the training set by a factor of 4 (358,800 samples). For testing, a total of 24,375 pixels from the right halves of the images were selected.
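The window extraction, class balancing and rotation steps described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study code: the function names, the toy image size, the toy labels and the 110-band cube (chosen only so that keeping every second band gives 55 bands) are all hypothetical, and over-sampling up to the size of the largest class is an assumed variant of random over-sampling.

```python
import numpy as np

def extract_window(cube, row, col, half=12):
    """Cut a (2*half+1) x (2*half+1) x bands window centred on (row, col)."""
    return cube[row - half:row + half + 1, col - half:col + half + 1, :]

def random_oversample(labels, rng):
    """Return sample indices in which every class is resampled (with
    replacement) up to the size of the largest class."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, n in zip(classes, counts):
        members = np.flatnonzero(labels == cls)
        extra = rng.choice(members, size=target - n, replace=True)
        picked.append(np.concatenate([members, extra]))
    return np.concatenate(picked)

def augment_rot90(windows):
    """Stack the original windows with their 90, 180 and 270 degree
    rotations in the spatial plane (axes 1 and 2)."""
    return np.concatenate(
        [np.rot90(windows, k=k, axes=(1, 2)) for k in range(4)]
    )

rng = np.random.default_rng(0)

# Hypothetical cube; keeping every second band leaves 55 bands.
cube = rng.random((40, 40, 110))[:, :, ::2]

# Imbalanced toy labels and random window centres that stay inside the image.
labels = np.array([0] * 6 + [1] * 3 + [2] * 1)
centres = rng.integers(12, 28, size=(labels.size, 2))
windows = np.stack([extract_window(cube, r, c) for r, c in centres])

keep = random_oversample(labels, rng)
balanced, balanced_labels = windows[keep], labels[keep]

augmented = augment_rot90(balanced)
augmented_labels = np.tile(balanced_labels, 4)

# Sample counts reported in the text.
assert 325 * (50 + 100) == 41_437 + 7_313   # pixels sampled from 325 images
assert 89_700 * 4 == 358_800                # effect of the 3 extra rotations
```

Rotating only in the spatial plane leaves each pixel's spectrum unchanged, which is why the class labels can simply be repeated 4 times.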
Before the actual training, the effects of vignetting and curvature of the skin surface were reduced by normalizing the reflectance of each pixel with respect to its spectral mean, taken over the different wavebands $\lambda_i$. Here $R(\lambda)$ is the reflectance spectrum of each pixel. Reflectance is $R(\lambda) = I(\lambda)/I_0(\lambda)$, where $I$ is the measured radiance and $I_0$ is the irradiance of the light source, which is measured by imaging a white Teflon target. $R(\lambda_i)$ is a single waveband image. This normalizes the data in such a way that the mean of each spectrum is 0.
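A minimal sketch of this preprocessing is given below. The reflectance step follows the definition in the text; the centring step assumes that the normalization subtracts each pixel's spectral mean, which is one form consistent with a zero-mean spectrum, and the exact expression used in the study may differ. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def reflectance(radiance, white_reference):
    """R(lambda) = I(lambda) / I_0(lambda), per pixel and waveband."""
    return radiance / white_reference

def centre_spectra(refl):
    """Assumed normalization: subtract each pixel's spectral mean so that
    every spectrum has mean 0."""
    return refl - refl.mean(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Synthetic radiance cube and white-reference cube (rows, cols, bands).
radiance = rng.uniform(0.1, 1.0, size=(6, 8, 55))
white_reference = rng.uniform(1.0, 2.0, size=(6, 8, 55))

refl = reflectance(radiance, white_reference)
normalized = centre_spectra(refl)

# Largest deviation of any per-pixel spectral mean from 0.
print(np.abs(normalized.mean(axis=-1)).max())
```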
Training was performed using Hyper3Dnet with modifications to the encoder part of the architecture (14), where only 16 filters were used and the dense layer had only 256 nodes. Training used an Adam optimizer with a learning rate of 0.0001, a momentum term β_1 of 0.9, a momentum term β_2 of 0.999, and an epsilon value of 10×10^−8. The algorithm was implemented using Python 3.6 (https://www.python.org) and TensorFlow 2.0 (tensorflow.org). For computing, a Tesla P100-PCIE-16GB general-purpose graphics processing unit (Tesla, Nvidia, UK) was used. For training, 60 epochs on mini-batches (size 64) were computed. Categorical cross-entropy was used as the loss function.
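To show what the reported hyperparameters control, the sketch below implements a textbook Adam update and categorical cross-entropy in plain NumPy. It is not the TensorFlow implementation used in the study, which may differ in detail (for example, in where epsilon enters the update). The epsilon value reads the reported 10×10^−8 literally as 1e-7, the toy objective is hypothetical, and the step count assumes that the final partial mini-batch is kept.

```python
import math
import numpy as np

def adam_step(param, grad, m, v, t,
              lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

def categorical_cross_entropy(probs, one_hot):
    """Mean negative log-likelihood of the true class."""
    return float(-np.mean(np.sum(one_hot * np.log(probs + 1e-12), axis=1)))

# Toy objective f(w) = sum(w**2), whose gradient is 2*w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_step(w, 2.0 * w, m, v, t=1)

# Toy loss for a single sample whose true class is class 0.
loss = categorical_cross_entropy(np.array([[0.7, 0.2, 0.1]]),
                                 np.array([[1.0, 0.0, 0.0]]))

# Mini-batches per epoch for 358,800 samples and batch size 64.
steps_per_epoch = math.ceil(358_800 / 64)
print(w, loss, steps_per_epoch)
```

On the first step the bias-corrected update is close to the learning rate times the sign of the gradient, so each weight moves by approximately 0.0001.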
Two different approaches were used: (i) a classification based on the majority of the pixels per lesion (“majority vote” classification); and (ii) a pixel-wise classification. In the majority vote classification, the class was determined by selecting pixels from the annotated right half of the lesional area and counting which class the majority of pixels in this area belonged to. The classification result was true positive if most predictions were in the same class as the ground truth (the annotated masks based on histopathology). In the pixel-wise classification, each tested pixel was seen as an independent sample. Both models were trained twice: for 2-class classification to distinguish benign (BN+DN) and malignant lesions (MIS+MM), and to classify the different lesion types. The pixel-wise classification also included pixels from the annotated healthy skin regions surrounding the lesional area and resulted in either 3 classes (healthy, benign or malignant) or 5 classes (healthy, BN, DN, MIS, MM), while the majority vote only analysed the lesional area and resulted in either 2 classes (benign vs malignant) or 4 classes (BN, DN, MIS and MM).
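The majority vote step can be sketched as below, assuming that the per-pixel class predictions for the annotated right half of a lesion are available as integer labels. The function name, the class coding and the toy predictions are hypothetical, and the handling of ties is not described in the text; here a tie falls to the lowest class index.

```python
import numpy as np

CLASS_NAMES = ("benign", "malignant")  # hypothetical coding: 0 and 1

def majority_vote(pixel_predictions, n_classes):
    """Return the most frequent predicted class and the per-class pixel
    counts. np.argmax resolves ties in favour of the lowest index."""
    counts = np.bincount(pixel_predictions, minlength=n_classes)
    return int(np.argmax(counts)), counts

# Toy per-pixel predictions for the tested half of one lesion.
pixel_predictions = np.array([1, 1, 0, 1, 0, 1, 1])
lesion_class, counts = majority_vote(pixel_predictions, len(CLASS_NAMES))

print(CLASS_NAMES[lesion_class], counts)
```

The same function applies to the 4-class setting by passing `n_classes=4`.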
RESULTS
In total, 364 melanocytic lesions were imaged in the 3 study centres. In 39 cases, the HI system settings were not optimal at the time of imaging and the images were excluded due to imaging artefacts. Thus, 325 lesions in 285 patients were included in the study.
Lesion characteristics and histopathological diagnoses
The mean diameter of the lesions was 9.4 mm (range 3–50 mm). All the lesions fitted the FOV and could be imaged in 1 session. The malignant lesions (MM+MIS) were larger than the benign lesions (DN+BN), with a mean diameter of 11.5 mm (range 4–50 mm) and 7.2 mm (range 3–16 mm), respectively (p < 0.05). The majority of the imaged lesions were located on the torso (53.5%, n = 174) and on the extremities (40.9%, n = 133), while only 5.5% (n = 18) were in the head and neck region.
Histopathological diagnoses were: 74 MMs, 88 MIS, 115 DN, and 48 BN. The mean Breslow thickness of the MMs was 1.2 mm (range 0.2–6.3). The MMs were of the following subtypes: superficial spreading (n = 52), lentigo maligna melanoma (n = 9), melanoma associated with a naevus (n = 6), nodular melanoma (n = 3) and unclassified (n = 3). Of the DN, 81 showed low-grade dysplasia and 21 high-grade dysplasia. In 6 cases, the grade was not reported. Among the BN, the diagnoses were mainly compound naevi (n = 27), junctional naevi (n = 5), intradermal naevi (n = 5) and blue naevi (n = 5). There was also 1 case each of: congenital naevus, Reed naevus, spindle cell naevus, deep penetrating naevus, and special site naevus.
Hyperspectral analysis
Majority vote classification. In the majority vote analysis, the annotated right half of each lesion was analysed, and the class to which the majority of pixels in this area belonged was determined. The classification result was true positive if most predictions were in the same class as the ground truth (the annotated masks based on histopathology). In the 2-class analysis for benign (BN+DN) vs malignant (MIS+MM), the overall sensitivity was 0.95 and specificity 0.92 (95% CI 0.024–0.029). Overall sensitivity for 4 classes (BN, DN, MIS, MM) was 0.88 and specificity 0.84 (95% CI 0.035–0.040) (Table I, Fig. 1).
Pixel-wise classification. In the pixel-wise classification, the model was trained based on independent pixels of the image. Each tested pixel on the right lesion halves was seen as an independent sample and the class was determined separately for each pixel. Pixels from the annotated lesional area and healthy skin surrounding the lesions were included. For 3 classes (healthy, benign, malignant), the overall sensitivity and specificity of the pixel-wise classification were both 0.82 (95% CI 0.005–0.005), while for 5 classes (healthy, BN, DN, MIS, MM) the overall sensitivity was 0.77 and specificity was 0.74 (95% CI 0.005–0.006) (Table II, Fig. 2).
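For multi-class results such as these, sensitivity and specificity are commonly derived per class from a confusion matrix in a one-vs-rest manner. The sketch below shows that calculation on a hypothetical 3-class confusion matrix; the numbers are invented for illustration and are not study data, and averaging the per-class values (macro-average) is an assumption, since the text does not state how the overall figures were aggregated.

```python
import numpy as np

def per_class_sens_spec(conf):
    """One-vs-rest sensitivity and specificity for each class.
    Rows of conf are true classes, columns are predicted classes."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    fn = conf.sum(axis=1) - tp
    fp = conf.sum(axis=0) - tp
    tn = conf.sum() - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion matrix (healthy, benign, malignant); not study data.
conf = [[90, 5, 5],
        [10, 80, 10],
        [5, 15, 80]]

sensitivity, specificity = per_class_sens_spec(conf)

# Assumed aggregation: unweighted mean over classes.
overall_sensitivity = sensitivity.mean()
overall_specificity = specificity.mean()
print(sensitivity, specificity)
```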
Both the pixel-wise and the majority vote approaches offered not only classification, but also delineation, and a map-like representation of the lesions as is shown in Figs 3–4. Mean spectra and standard deviation (SD) for healthy skin and different lesions are shown in Fig. 5.
DISCUSSION
In this study, HI showed its potential for the non-invasive diagnosis of naevi and melanoma. The most reliable results were achieved when using the majority vote method for differentiating benign (BN+DN) vs malignant (MIS+MM) melanocytic lesions. This analysis achieved a higher overall sensitivity (95%) and specificity (92%) than the pixel-wise analysis (82% overall sensitivity and specificity). The classifications into histological subclasses (BN, DN, MIS, MM) showed somewhat lower overall accuracy with both methods, which could be explained by a lower number of cases and pixels when dividing the material into subgroups.
Previously, we have shown HI to be useful in differentiating between MM and pigmented basal cell carcinoma (BCC), and thus capable of differentiating tumours of melanocytic and keratinocytic origin (9). Furthermore, HI can offer a tool for detecting invasive parts of larger melanocytic lesions, such as lentigo maligna melanoma, which could allow for targeted biopsies (8). We have also shown that map-like HI images can be used preoperatively in the delineation of both melanocytic and non-melanocytic malignant tumours (10, 11). The advantages of HI include: (i) a combination of digital imaging and spectroscopy, i.e. spectral data can be obtained from map-like images; (ii) a large imaging field (12 cm²); and (iii) a rapid imaging process (seconds) and automated analysis performed on visual data. Although there is no commercial HI device available currently, the technique could be especially useful for an inexperienced physician, since it is not user-dependent and could potentially provide an automated diagnosis. HI could potentially decrease the number needed to excise among general practitioners, and even among dermatologists, and help to determine which lesions should be excised and which could be followed.
To aid melanoma diagnostics, various other non-invasive imaging methods have been developed, including digital dermoscopy, reflectance confocal microscopy (RCM), high-resolution optical coherence tomography (HR-OCT), multiphoton laser scanning microscopy (MLT), electrical impedance spectroscopy, Raman spectroscopy and multispectral imaging (21–36). Many of these techniques are based on spectral technologies (37). The advantages of the proposed HI method compared with some other spectral techniques, such as Raman spectroscopy, include the possibility to obtain the spectral data from an image. The large FOV makes HI fast and enables imaging of even larger lesions at once. This is a clear advantage compared with RCM or MLT. The map-like images can also be used preoperatively in the delineation of the lesions (10, 11). Furthermore, HI provides automated analysis of visual data, which makes it simple to use and not user-dependent. This is an advantage compared with the extensive training and knowledge about histopathology needed for the use of some high-resolution techniques, such as RCM or HR-OCT. A disadvantage compared with the high-resolution techniques is the poorer resolution. While RCM, MLT and HR-OCT are able to detect intracellular structures, HI is limited to cell aggregates. However, we believe this resolution combined with the spectral data obtained with HI is good enough to achieve acceptable diagnostic accuracy for skin tumour diagnostics. Imaging depth with HI is approximately 2 mm, which is better than that of the high-resolution devices, which typically reach the papillary dermis. The technique most similar to HI is multispectral imaging (MI), which, unlike HI (which takes tens to hundreds of images with narrow continuous wavebands), takes 5–15 separate images using non-continuous wide spectral bands (37).
In previous studies, MI using commercial products, such as Melafind (MELA Sciences, Irvington, NY, USA) and Siascope (MedX Headquarters, ON, Canada), has achieved varying sensitivity of 83–98% and specificity of 8–91% in melanoma diagnostics (38). In a recently published study including 202 pigmented skin lesions, another HI system achieved 96.7% sensitivity in detecting MM and a specificity of 42.1% for benign lesions (39). The current study had a larger sample size and used 2 further developed analysis techniques, which resulted in increased accuracy compared with that study. Interestingly, there are HI systems that can be attached to light microscopes for obtaining hyperspectral histopathology images of H&E-stained melanoma tissue, both for distinguishing tumour from healthy skin and for measuring the invasion depth (Breslow thickness) (40, 41).
Study limitations
Limitations of the current study include the limited number of imaged lesions. Nevertheless, the use of a pixel-wise approach increased the data-set dramatically, since in this method every single pixel was seen as an independent “lesion”. The algorithm used for the data analysis was a novel modification of deep neural network analysis that still needs further development. In the majority vote analysis, the fact that the current study used half of each lesion for training/validation and the other half for analysis could potentially have affected the results. This approach was used due to the limited sample size and large variation in the imaged lesions. However, the pixel-wise analysis overcame this potential bias. Since some images had to be excluded due to low quality, the HI system prototypes also still need adjustments in order to avoid imaging artefacts. Even though there was variation in the lesion size, with naevi being smaller than MM on average, the approach of using pixel-wise classification overcame this potential bias. Furthermore, experience is needed in imaging different body parts, for example acral melanomas and naevi of special sites, since skin topography in these areas varies and can complicate image acquisition. The current study material also lacked Spitzoid lesions and rare melanoma subtypes. Furthermore, other pigmented lesions, including pigmented BCCs, seborrhoeic keratoses, dermatofibromas and benign lentigines, should be included in the differential diagnoses, and this needs further study. Moreover, patients in this investigation were recruited from 2 Nordic countries, meaning that most patients had fair skin phototypes. Another limitation is that the current study did not collect the specific skin phototypes and that the images were taken in different seasons of the year, which may have affected the skin pigmentation.
To examine the external validity of our findings, future investigations should also target other populations with more pigmented skin.
Both analytical methods have their limitations. The majority vote classification gives only 1 diagnosis per lesion and could potentially miss a small melanoma associated with a large naevus or a small invasive component within a larger MIS. The pixel-wise classification gives a more realistic picture, mimicking the histopathology of the lesion, showing a classification for each pixel and resulting in mixed lesion types. However, these mixed images may be confusing for clinicians to interpret.
HI cannot replace the gold standard of histopathological evaluation of melanoma, including assessment of the lesion thickness, ulceration, regression and mitotic rate. However, since image acquisition with HI is performed with different wavelengths and different penetration depths, imaging could be adjusted to also measure lesion thickness by providing 4-dimensional information (3 spatial dimensions and the spectral dimension). Interestingly, the classification maps showed a combination of spectra within the same lesion in some cases, as depicted in Fig. 4. It is therefore possible that HI could detect areas of interest that could have been missed in the histopathological analysis. Theoretically, it is therefore also possible that minimally invasive melanomas may have been missed in the routine histopathological sectioning of the specimen and that HI may actually have classified 1 or more invasive melanomas correctly. Further studies with larger data-sets are thus warranted.
Conclusion
In this study, HI showed its potential in the non-invasive diagnosis of melanocytic tumours with relatively high sensitivity, specificity, and overall accuracy.
ACKNOWLEDGEMENT
This study was funded by the Instrumentarium Foundation, by the Finnish Cancer foundation, by the Finnish Dermatopathology society, by the Hudfonden Foundation and by the Academy of Finland.
Conflicts of interest. The authors NN, IP and MG are patent holders for the patent US 10478071 regarding hyperspectral imaging. MG has received consultation fee from the Revenio group. The Revenio group kindly loaned the hyperspectral imaging devices for the study.
REFERENCES