Frequency of Publication of Dermoscopic Images in Inter-observer Studies: A Systematic Review
Sam Polesie1,2 and Oscar Zaar1,2
1Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, and 2Region Västra Götaland, Sahlgrenska University Hospital, Department of Dermatology and Venereology, Gothenburg, Sweden
Research interest in dermoscopy is increasing, but the complete dermoscopic image sets used in inter-observer studies of skin tumours are not often shared in research publications. The aim of this systematic review was to analyse what proportion of images depicting skin tumours are published in studies investigating inter-observer variations in the assessment of dermoscopic features and/or patterns. Embase, MEDLINE and Scopus databases were screened for eligible studies published from inception to 2 July 2020. For included studies the proportion of lesion images presented in the papers and/or supplements was extracted. A total of 61 studies (53 original studies and 8 shorter reports, i.e. research letters or concise reports) published in the period 1997 to 2020 were included. These studies combined included 14,124 skin tumours, of which 373 (3%) images were published. This systematic review highlights that the vast majority of images included in dermoscopy research are not published. Data sharing should be a requirement for future studies, and must be enabled and standardized by the dermatology research community and editorial offices.
Key words: data sharing; dermoscopy; inter-observer variation; photography; skin diseases/diagnostic imaging; systematic review as topic.
Accepted Dec 2, 2021; Epub ahead of print Dec 2, 2021
Acta Derm Venereol 2021; 101: adv00621.
doi: 10.2340/actadv.v101.865
Corr: Sam Polesie, Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gröna stråket 16, SE-413 45 Gothenburg, Sweden. E-mail: sam.polesie@vgregion.se
SIGNIFICANCE
A dermoscope is a loupe equipped with a light that assists dermatologists in the diagnosis of skin cancer. Several dermoscopic features have been described, suggestive of different skin tumours. However, in publications regarding agreement about specific features, researchers often include only a small number of example images. This systematic review investigated what proportion of dermoscopy images is shared in these publications. Following a literature review, 61 studies were included. Of these studies, images of only 373 out of 14,124 (3%) skin tumours were shared. This result should be a wake-up call for the promotion of data sharing in dermoscopy research.
INTRODUCTION
Research interest in dermoscopy has accelerated and dermoscopes are now pivotal tools, particularly in the evaluation of skin tumours (1, 2). Despite their usefulness, there is only moderate agreement among experts about many of the described dermoscopic features and patterns (3). In order to address clinical transferability and reliability, dermoscopy studies relating to specific features or patterns of skin tumours often include data on inter-observer agreement between different image readers. Authors often publish example images to highlight their findings, which prove particularly useful for clinical dermatologists. The images may be even more important than tabulated data and running text for understanding the message. Moreover, images are irreplaceable for continuous medical education and teaching, and are central to how dermatologists teach, learn and remember. While research studies frequently include example images, the complete data-sets analysed are not often published, which raises concern about reproducibility and scientific transparency. To increase transparency in research, data sharing in prospective trials has become the norm, and several journals now require a data sharing statement upon publication (4). The aim of this systematic review was to determine what proportion of skin tumours analysed is made available to the reader in publications on inter-observer studies of dermoscopic features and/or patterns.
MATERIALS AND METHODS
A systematic review was conducted, adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (5). The review was not registered and a formal study protocol was not prepared. The PRISMA checklist is available in Table SI.
Eligibility criteria
Exclusion criteria
The review was limited to inter-observer studies that included the level of agreement between ≥ 2 independent readers assessing dermoscopy features and/or patterns of skin tumours.
Information sources and search strategy
Embase (Ovid), MEDLINE (PubMed) and Scopus databases were searched for eligible studies published from inception to 2 July 2020. The search strings used were constructed in collaboration with 3 medical librarians (Appendix S1).
A standard method was used to remove duplicates when merging the data-sets obtained from the 3 databases (6).
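The idea of merging exports from several databases and dropping duplicate records can be sketched in a few lines. This is only an illustration and not the method cited in (6): the record fields (`doi`, `title`, `year`) and the matching rule (DOI first, otherwise normalised title plus year) are assumptions made for the example.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def record_key(record: dict) -> tuple:
    """Identify a record by its DOI when present, otherwise by title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record.get("title", "")), record.get("year"))


def merge_without_duplicates(*exports):
    """Merge several database exports, keeping the first copy of each record."""
    seen = set()
    merged = []
    for export in exports:
        for record in export:
            key = record_key(record)
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged
```

A rule this simple misses a duplicate when one database supplies the DOI and another does not, so in practice the merged set would still need manual checking.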
Selection and data collection process
Two reviewers (SP and OZ) screened the titles and abstracts of all studies. Any record with an eligible title but missing abstract was included for full-text review. Records with an abstract indicating that ≥ 2 readers analysed dermoscopic images of the skin tumours were included for full-text review. All full texts were reviewed independently by both authors. Only studies including the level of inter-observer agreement regarding dermoscopic features and/or patterns were included. Any disagreement regarding eligibility was resolved by consensus. Occasionally, when other diagnostic modalities, such as optical coherence tomography, reflectance confocal microscopy and histopathology, were assessed, the studies were included only if dermoscopic images were also reviewed. Whenever a subset of lesions was included for inter-observer agreement, the number of lesions in that subset was extracted. The data collected from the included studies were verified by both authors.
Data items
The following items were extracted from included studies: first author; year of publication; country/countries from which patients were included (if available); publication type; journal; digital object identifier (DOI)-number (if available); number of analysed lesions; number of available images in the manuscript (including supplementary material); and number of annotated images. All included studies were also accessed online by both authors to verify whether there was any supplementary material available. Data extraction was performed by both authors in collaboration. Several sequential or magnified images of a single lesion were considered as 1 image. A figure consisting of a panel depicting 4 tumours was considered as 4 images. The final data-set was verified by both authors.
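The counting rule above (several views of one lesion count once; a panel of 4 tumours counts as 4) amounts to counting distinct lesions rather than figures or panels. A minimal sketch follows, in which the `Panel` type and its `figure` and `lesion_id` fields are hypothetical names introduced only for illustration; the extraction in this review was done manually.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    figure: str     # hypothetical figure label, e.g. "Fig. 1"
    lesion_id: str  # hypothetical identifier of the tumour shown in the panel


def count_published_images(panels) -> int:
    """Count one image per distinct lesion, however many panels depict it."""
    return len({panel.lesion_id for panel in panels})
```

With this definition, a figure showing 4 different tumours contributes 4 images, while 3 sequential or magnified views of the same tumour contribute 1.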
Study risk of bias assessment
Due to the nature of this systematic review (i.e. dichotomous outcome), no quality assessment tools were used and publication bias was not tested.
Effect measures and statistical analysis
The measure for this review was binary (i.e. presence or absence of images). The proportions of shared images in each included investigation and for the complete data-set were determined. Two software packages, EndNote (Clarivate Analytics, Philadelphia, PA, USA) and Rayyan (Rayyan Systems Inc., Cambridge, MA, USA), were used throughout to compile and sort the records. All publications were handled manually and no automation tools were used. The EndNote libraries used for the review are available on request from the first author. Microsoft Excel (Microsoft, Redmond, WA, USA) was used for data tabulation. Fisher’s exact test was used to analyse whether more recent publications and original publications were more predisposed to data sharing.
RESULTS
Of the 1,225 records first identified, 686 studies were reviewed in full text. After exclusions, 61 studies published in the period 1997 to 2020 were included in the analysis (3, 7–66) (Fig. 1). Overall, 53 were original studies, whereas 8 were published in a more concise format.
Several studies, including those by Argenziano et al. (67), Moscarella et al. (68), Haenssle et al. (69) and Zalaudek et al. (70), reported on specific features of skin tumours, but were excluded, since they did not include any data on inter-observer agreement between the readers who analysed the dermoscopic images. In many instances disagreements between 2 independent readers were resolved by consultation with a third reader; however, if no data on the level of agreement of specific features and/or patterns were provided, the investigation was excluded.
When combining the 61 studies mentioned above, 14,124 lesions were analysed. In total, 373 images were published; an overall sharing rate of 3% (Table I). One investigation shared the complete data-set (47). Of the included images, 104 (28%) were annotated. The proportion of images shared up to 2015 (2.4%, 184 out of 7,486) did not differ significantly from the proportion shared from 2016 to the end of the study period (i.e. 2 July 2020) (2.8%, 189 out of 6,638, p = 0.16) (Fig. 2). Publications in shorter format (i.e. research letters or concise reports) shared a higher proportion of images (6.1%, 76 out of 1,229) compared with longer research articles (2.3%, 297 out of 12,895; p < 0.0001).
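The two comparisons in this paragraph can be re-computed from the reported counts. The sketch below is a plain standard-library implementation of a two-sided Fisher's exact test, summing the probabilities of all 2x2 tables that are no more likely than the observed one. It is an illustrative re-implementation under that convention, not the software used for the published analysis, which is not stated; the function names are introduced here for the example.

```python
from math import exp, lgamma


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denominator = log_choose(total, row1)

    def log_prob(x: int) -> float:
        # Hypergeometric log-probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return (log_choose(col1, x)
                + log_choose(total - col1, row1 - x)
                - log_denominator)

    log_observed = log_prob(a)
    tolerance = 1e-7  # guards against floating-point ties
    p_value = 0.0
    for x in range(max(0, row1 + col1 - total), min(row1, col1) + 1):
        lp = log_prob(x)
        if lp <= log_observed + tolerance:
            p_value += exp(lp)
    return min(1.0, p_value)


# Counts as reported above: shared versus not shared images in each group.
p_period = fisher_exact_two_sided(184, 7486 - 184, 189, 6638 - 189)
p_format = fisher_exact_two_sided(76, 1229 - 76, 297, 12895 - 297)
print(f"up to 2015 vs 2016 onwards: p = {p_period:.2f}")
print(f"shorter vs longer format:   p = {p_format:.2e}")
```

Working in log space avoids overflow in the binomial coefficients, which become very large for tables with more than 14,000 lesions.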
DISCUSSION
Of the studies included in this systematic review, 97% of the dermoscopic images analysed were not published. In the author instructions, editorial offices often restrict the number of images and/or tables that can be included in the main manuscript, but allow researchers to include supplementary material, either available online at the journal website or in a digital repository. Needless to say, sharing of data-sets of dermoscopic images expands the available image gallery for dermatologists. Other than for educational purposes, image data is instrumental in terms of external validation and critical appraisal of the findings by other researchers. Since it is also not certain that the readership will agree with the results and image interpretation, data sharing would enable a much-welcomed debate and nurture a healthy scientific community. Furthermore, sharing of data-sets of dermoscopic images is important when new features and/or patterns are discovered. Whenever new features and/or patterns are presented, older available data-sets could be re-used to critically review the reproducibility of the suggested findings. Moreover, since most of the lesions depicted in scientific publications have received a histopathological diagnosis and are also peer reviewed and quality checked, these data-sets could be of fundamental value for the development of machine learning algorithms, which are expected to have a bright future in our field (71, 72). Finally, sharing data-sets of dermoscopic images would diminish the risk of data duplication. Considering the arguments above, we are confident that sharing a greater proportion of dermoscopic images would increase the validity of the presented results and have an important auxiliary effect on the quality of inter-observer research in dermoscopy.
We acknowledge that sharing image data was impractical in the early 2000s, but today it is easy to include images as a supplement or, even better, to share them in an online repository, such as the Human Against Machine data-set, which has 10,000 training images (HAM10000) (73). In the current study, there was no difference in the proportion of images shared before or after 2015. However, shorter articles shared a greater proportion of dermoscopic image data compared with original studies. This was somewhat surprising, since supplements are not always allowed in these types of studies.
A limitation of this review is that the search was restricted to include only studies assessing inter-observer agreement. The main purpose of an inter-observer investigation is to evaluate the variation between results obtained by observers examining the same material. Consequently, the results of an inter-observer investigation should be reproducible among the population studied by the cohort of observers. While including only these types of studies may appear to be arbitrary, we found these studies particularly suitable, since agreement per se is more useful when the reader is given information regarding what content the observers in the study agreed or disagreed upon. Moreover, for practical reasons the current study search was limited to include only skin tumours, precluding inflammatory skin disorders. It is possible that broadening the inclusion criteria would yield different results. Finally, the current review was limited to English publications.
This review does not address the reasons why only a minority of images were shared. The obstacles to data sharing should be investigated in future studies. While clinical images may be difficult to anonymize, it is widely accepted that a patient cannot be recognized by simply viewing a dermoscopic image. Nonetheless, the theoretical risk of identifying a patient must, of course, be weighed against improving research transparency and the substantial educational benefit it entails. Regardless of the underlying reasons, this systematic review should be considered as a wake-up call for the dermatology community and for editorial offices of dermatology journals to focus on improving sharing of data-sets of dermoscopic images in dermoscopy research. As such, this review is an invitation to the dermoscopy research community to focus on standardizing the format in which dermoscopy studies should share data-sets. A consensus concerning the reporting of data, along with a checklist, could help future researchers. To enhance sharing of data-sets of dermoscopic images, we suggest including both annotated and unannotated images in a supplement. By placing these images side by side the reader will have a better opportunity to critically review the features and/or patterns. This would increase the educational value of the studies. Furthermore, along with the original images, it would be valuable to publish the annotated worksheets that the study readers used when deciding on the presence of specific features or patterns.
This systematic review highlights that the vast majority of dermoscopy research images are not shared. In our opinion the publication of such images should be a requirement for future studies.
ACKNOWLEDGEMENTS
The authors thank Ida Stadig at the medical library of Sahlgrenska University Hospital and Linda Hammarbäck and Helen Sjöblom at the Biomedical Library at the University of Gothenburg for helping us generate the search strings used for this investigation. We thank Martin Gillstedt at the Department of Dermatology and Venereology, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg for help with statistical analysis.
The authors have no conflicts of interest to declare.
REFERENCES