DIAGNOSTIC PERFORMANCE OF DEEP LEARNING ARCHITECTURES IN THE DIAGNOSIS OF SJOGREN'S SYNDROME: A SYSTEMATIC LITERATURE REVIEW
This article was originally published by Indira Gandhi Institute of Medical Science and was migrated to Scientific Scholar after the change of Publisher.
Sjogren's Syndrome (SS) is a chronic, multifactorial autoimmune disease characterized by clinical symptoms of dry mouth and dry eyes, due to chronic lymphocytic destruction of the salivary and lacrimal glands, respectively. Proper diagnosis is key to a better outcome. Recently introduced deep learning (DL) systems have the ability to reflect the complexity of the condition, with the aim of bringing personalized medicine closer to patients.
The aim of this systematic review is to compile evidence-based studies pertaining to the diagnostic performance of DL systems and their algorithms in the diagnosis and monitoring of SS.
Materials and method:
A computerized literature search was performed to select eligible articles from the following databases: PUBMED [MEDLINE], SCOPUS, SCIENCE DIRECT and COCHRANE DATABASE, using specific keywords. The search was limited to articles published as full text in English, which were screened by two authors for eligibility.
Four studies satisfied our inclusion criteria. They suggested that DL systems have high diagnostic accuracy compared with inexperienced radiologists, and accuracy equivalent to that of experienced radiologists. Two studies found the accuracy, sensitivity, and specificity of DL systems to be 89.5%, 90.0%, and 89.0%, respectively, for ultrasonography (USG) images of the salivary glands, whereas for CT images they were 96.0%, 100% and 92.0%, respectively; the diagnostic performance was significantly higher than that of inexperienced radiologists (p < 0.0001).
DL systems have the potential to provide useful diagnostic support to inexperienced radiologists in the assessment of images for the presence of characteristic features of SS. They could assist radiologists in automated segmentation of salivary glands and enable feature extraction in less time with a reduced risk of cognitive errors.
imaging and Sjogren's syndrome.
Sjogren's Syndrome (SS) is a chronic autoimmune disorder characterized by lymphocytic infiltration of exocrine glands, predominantly the salivary and lacrimal glands, resulting in symptoms of dry mouth (xerostomia) and dry eyes (keratoconjunctivitis sicca). SS has been reported to commonly involve the parotid glands of middle-aged females between 40 and 60 years.1 The etiopathogenesis of this disorder is still elusive, although genetic, environmental, immune system dysregulation and hormonal factors have been found to play a key role in its occurrence. In the majority of cases the patient complains of mucosal dryness, but in a few patients recurrent parotid gland enlargement is observed, which makes it challenging for the clinician to arrive at a proper diagnosis.1,2 Diagnosis of SS is usually based on the patient's history, clinical examination, the American European Consensus Group (AECG 2002) classification criteria, autoantibody tests, and minor labial salivary gland biopsy.2 Although this standard diagnostic work-up can assist the clinician in formulating a working diagnosis, most of these procedures are invasive and result in patient discomfort or pain; moreover, further confirmation by various imaging modalities is required to establish the exact cause of xerostomia.3
Imaging modalities are useful in assessing the functional activity of glands and in monitoring disease progression. Sialography is one of the widely used radiographic techniques; it involves retrograde injection of radiopaque contrast media into the gland, and the resultant sialographic images provide information on the ductal architecture. However, the procedure requires cannulation of the ducts, which is invasive, and retrograde injection of contrast media can exacerbate the existing inflammatory process.3,4 Therefore, to overcome these limitations, alternative sialographic techniques with isotropic voxel resolution, such as sialo-cone beam computed tomography (sialo-CBCT) and magnetic resonance (MR) sialography, have been introduced.5 In addition, cystic changes, which occur in the later stages due to fatty degeneration of the glandular parenchyma, can be better visualized on MRI.6
However, radiologists with less experience in image interpretation have difficulty assessing the cystic changes associated with advanced SS on CT or MRI images.6 With software innovation, intelligent systems, which include machine learning, deep learning systems and artificial intelligence (AI), are capturing the interest of researchers. Intelligent systems deal with computational models that can act and think like the human brain, and they have been studied to reflect the complexity of autoimmune diseases with the aim of bringing personalized medicine closer to patients.7,8 Deep learning (DL), a subset of AI learning methods, includes convolutional neural networks (CNNs), in which multiple layers of algorithms are interconnected, pooled together and stratified into more or less meaningful data; these multi-layered algorithms thus form a large artificial neural network. Large volumes of data are fed into the multi-layered neural network for training; the system then learns stepwise by means of supervised or unsupervised learning and extracts the characteristic features from the data set (auto-segmentation) to create an automated learning model [Figure 1].8
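To make the layered structure described above more concrete, the following is a minimal Python sketch of a single convolution, activation and pooling stage, the kind of block that a CNN stacks many times. It is illustrative only and is not taken from any of the reviewed studies: the 5x5 image patch, the hand-picked edge kernel and the function names are assumptions, and in a trained network the kernel weights are learned from data rather than chosen by hand.

```python
# Illustrative sketch only: one convolution -> ReLU -> max-pooling stage.
# The patch, kernel and function names are hypothetical, not from the reviewed studies.

def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of an image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical grey-scale patch with a bright vertical band on the right.
patch = [[0, 0, 0, 9, 9] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1],
               [-1, 1]]

features = max_pool(relu(conv2d(patch, edge_kernel)))
print(features)
```

The pooled feature map is non-zero only where the dark-to-bright edge lies, which is the sense in which successive layers condense an image into features that a later classification layer can use.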
Studies have documented high diagnostic performance of DL systems in clinical diagnostic imaging, such as in the location of radiographic anatomical landmarks and in the detection of oral cancer, periapical pathologies, temporomandibular joint osteoarthritis, and maxillary sinus pathologies.7-11 These systems have been found to be a promising diagnostic tool for radiologists in the accurate prediction and classification of oral diseases from large volumes of data in less time; in addition, they have reduced the radiological workload and have probably reduced the errors that may occur due to cognitive bias.8,9 The literature reveals a paucity of research on the clinical applications of intelligent systems in the detection and classification of salivary gland diseases. With this in mind, the aim of this systematic review is to summarize studies from the existing literature to evaluate the performance of DL systems in the detection of SS and in the differentiation of SS from other salivary gland pathologies, and to compare the diagnostic performance of DL systems with that of both experienced and inexperienced radiologists.
MATERIALS AND METHOD:
A systematic review of the scientific literature on the performance of DL systems in the detection of SS was carried out. Electronic retrieval systems and databases, namely PUBMED (MEDLINE), SCOPUS, SCIENCE DIRECT and COCHRANE DATABASE, were searched for relevant articles published from December 2018 to December 2020, using MeSH terms such as "artificial intelligence", "deep learning", "diagnostic performance", "imaging" and "Sjogren's syndrome". A computerized search strategy was employed to find relevant articles published in English. The search was based on the PICO model, that is, population/patient/problem, intervention/indicator, comparison, and outcome. Titles and abstracts of the selected articles were reviewed for inclusion in the systematic review.
Inclusion and exclusion Criteria
Cross-sectional, observational, case-control and cohort studies primarily focussed on applications of DL systems in the detection of SS and published as full text in the English language were included in this systematic review. The exclusion criteria were: review articles; non-peer-reviewed meeting abstracts or posters; case reports; studies with inadequately reported outcomes; studies that did not properly describe the data sets used to assess the model; and studies that did not pertain to DL systems.
Literature Search strategy and data extraction
In the initial selection, two authors with ten years of experience performed a computerized search for articles that met the selection criteria. The full titles and abstracts of the retrieved articles were reviewed, and duplicate articles and articles that did not match the inclusion criteria were discarded. In the next stage of selection, the authors tried to obtain the full papers for all potentially eligible studies. Any disagreement was resolved by discussion between the authors before final inclusion of studies in the systematic review.
On the basis of study characteristics (title of the paper, author information, year of study, aim and objectives, outcome and conclusion), two authors independently extracted data from the eligible studies using a standard data extraction form. Studies focusing on the clinical applicability of DL systems in the diagnosis of SS were analysed for bias by the authors. Differences were resolved by discussion between the authors.
Table 1 shows that the initial computerized search strategy yielded 248 titles. In the first selection, two authors with ten years' experience in the field screened the articles by reading titles and abstracts. After 116 duplicate articles were discarded, the full text of the remaining 132 articles was screened, and 123 articles were excluded because they did not meet the eligibility criteria. In the second stage of selection, 5 of the 9 eligible studies were discarded due to disagreement between the authors, and the final analysis included 4 studies. All the selected studies demonstrated high diagnostic performance of DL systems in the early diagnosis of SS and in the differentiation of this condition from other salivary gland pathologies or conditions that may cause xerostomia (dry mouth).6,13-15
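The selection counts reported above can be checked with simple arithmetic. The short Python sketch below is only a bookkeeping aid: the function name and stage labels are hypothetical, and the numbers are those stated in the text.

```python
# Bookkeeping sketch for the study-selection counts reported in the text.
# The function name and stage labels are hypothetical.

def screening_flow(identified, exclusions):
    """Return (stage, removed, remaining) after each exclusion step."""
    remaining = identified
    steps = []
    for stage, removed in exclusions:
        if removed > remaining:
            raise ValueError(f"cannot remove {removed} records at stage '{stage}'")
        remaining -= removed
        steps.append((stage, removed, remaining))
    return steps


flow = screening_flow(
    248,  # titles identified by the initial search
    [
        ("duplicates discarded", 116),
        ("excluded after full-text screening", 123),
        ("discarded after disagreement between authors", 5),
    ],
)

for stage, removed, remaining in flow:
    print(f"{stage}: -{removed}, {remaining} remaining")
```

Run as written, the remaining counts after each stage are 132, 9 and 4, which agrees with the figures given in the text.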
In two studies by Kise et al, CT and USG images of SS patients were analysed using a CNN-based DL architecture.6 For interpretation of USG images of the parotid glands, the accuracy, sensitivity, and specificity of the DL system were 89.5%, 90.0%, and 89.0%, respectively, whereas for CT images they were 96.0%, 100% and 92.0%, respectively, and the diagnostic performance differed significantly from that of inexperienced radiologists (p < 0.0001).6,12 These findings suggested that DL systems could serve as reliable and useful diagnostic support in the interpretation of CT and USG images selected by an experienced radiologist [Figure 2]. In another study, Vukicevic et al performed automated segmentation of salivary gland USG images using DL systems and compared the performance of four DL network architectures: i) fully convolutional network (FCN), ii) fully convolutional DenseNet (FCN-DenseNet), iii) U-Net, and iv) Link-Net. All of the networks were trained for 300,000 iterations for automated semantic segmentation of salivary gland images, and the FCN-DenseNet network showed the best performance, above the inter-observer agreement of 0.76 and slightly above the intra-observer agreement of 0.84 between clinical experts. Hence, it was suggested that the DL-based FCN-DenseNet network could be used for non-invasive assessment of salivary glands for SS and for primary SS (pSS) scoring.13 Later, di Scandalea et al performed auto-segmentation of in vivo confocal microscopy (IVCM) images using a U-Net network, and an accuracy of 81.1% was observed for SS patients.14 Based on the outcomes of the 4 studies shown in Table 2, DL systems were found to be a promising tool for the diagnosis of pSS patients and could be used as reliable, non-invasive diagnostic support in the interpretation of CT, USG and IVCM images.
|Author (Year)||Aim||Study group and data||Control group or comparison||Outcome|
|Kise et al (2019)||To assess the diagnostic performance of DL systems for diagnosis of SS on USG images||100 SS patients; training data: 80 USG images; test data: 20 USG images; USG scans of both the parotid gland (PG) and the submandibular (SM) gland, with images selected by an experienced radiologist||100 non-SS patients with a complaint of dry mouth; training data: 80 USG images; test data: 20 USG images; comparison with inexperienced radiologists||For PG: the accuracy, sensitivity, and specificity of the DL system were 89.5%, 90.0%, and 89.0%, respectively, and those of inexperienced radiologists were 76.7%, 67.0%, and 86.3%, respectively. For SM gland: the accuracy, sensitivity, and specificity of the DL system were 84.0%, 81.0%, and 87.0%, respectively, and those of inexperienced radiologists were 72.0%, 78.0%, and 66.0%, respectively. The AUC of the inexperienced radiologists was significantly different from that of the DL system for PG and SM gland (p < 0.0001, p = 0.0005); this finding indicated that DL systems have high clinical utility in the diagnosis of SS when static images are selected by an experienced radiologist|
|Kise et al (2020)||To assess the diagnostic performance of DL systems for detection of SS on CT images||250 CT images of 25 SS patients; training data set: 200 images; test data set: 50 images||250 CT images of 25 control subjects with no parotid gland abnormality; training data set: 200 images; test data set: 50 images; comparison of performance with three experienced and three inexperienced radiologists||The accuracy, sensitivity, and specificity of the DL system were 96.0%, 100% and 92.0%, respectively, comparable to those of experienced radiologists and higher than those of inexperienced radiologists. DL systems could be used as a diagnostic adjunct for interpretation of CT images|
|Vukicevic et al (2020)||Robust DL-based system for the automated segmentation of salivary gland ultrasonography (SGUS) images, using: i) Fully convolutional network (FCN); ii) Fully convolutional DenseNet (FCN-DenseNet); iii) U-Net; iv) Link-Net||1184 annotated SGUS images (287 patients)||Comparison between four DL convolutional network architectures||FCN-DenseNet showed the best performance, above the inter-observer agreement of 0.76 and slightly above the intra-observer agreement of 0.84 between clinical experts; it could be used for non-invasive assessment of salivary glands for SS and for pSS scoring|
|Di Scandalea et al (2020)||To use a CNN in a multiple instance learning (MIL) setting for diagnosis of SS from IVCM images; auto-segmentation was performed on IVCM images using a U-Net network, and the MIL framework performed the diagnostic task||IVCM images of 63 SS patients||IVCM images of 17 normal (non-SS) subjects||An accuracy of 81.1% and an average ROC AUC of 0.69 were found for the SS patients on the test set. The CNN showed promising diagnostic ability to detect and monitor SS patients from IVCM images|
Risk of Bias Assessment:
Overall, the studies were judged to be at low risk of bias using the Cochrane risk-of-bias tool. Randomization sequence bias was low for all 4 studies, whereas allocation concealment bias and blinding bias were high (Table 3).
|Author (Year)||Randomization Sequence||Allocation Concealment||Blinding of Participants||Blinding of outcome assessment||Incomplete Outcome data|
|Kise et al (2019)||_||+||+||+||_|
|Kise et al (2020)||_||+||+||+||_|
|Vukicevic et al (2020)||_||+||+||+||_|
|Di Scandalea et al (2020)||_||+||+||+||_|
Imaging plays a very important role in the diagnosis and functional assessment of salivary glands in SS. Due to the complexity of the condition and symptoms that overlap with those of other diseases such as chronic sialadenitis, sialolithiasis and sialadenosis, diverse imaging modalities have been developed.3 Sialography is the most commonly preferred technique for detecting salivary gland abnormalities; however, the recently introduced MR sialography and 3D CBCT sialography may replace conventional sialography due to high soft tissue resolution. 3D CBCT sialography is a way forward in dentistry because of its inherent ability to produce 3D reconstructed images with isotropic voxel resolution, which allows precise evaluation of the location of salivary calculi, including those smaller than 2 mm.4,5 In future, intelligent systems may transform imaging in detection, diagnosis and monitoring of treatment response through their unique advantage of auto-segmentation of salivary glands in a reduced time, providing radiologists with images that are highly reproducible for interpretation.
SS is a chronic autoimmune disorder that most commonly affects the parotid glands bilaterally, and patients often report clinical symptoms of dry mouth and dry eyes.2 Imaging plays a crucial role in diagnosis, and the imaging findings are largely influenced by the stage of the disease. In the initial stages the parotid glands appear normal, but as the condition progresses multiple small cystic areas are found within the parotids in the intermediate stage, and in the later stages fatty degeneration and large cystic or solid masses are observed. Cystic masses indicate destroyed gland tissue or collections of saliva, and solid masses are the lymphoid aggregates that result in glandular destruction.6 Interpretation of fatty changes within the glands is necessary to differentiate SS from other causes of xerostomia, such as medications or systemic diseases.5,6 Although these changes are readily appreciable on MRI and contrast-enhanced CT images, their detection is challenging for inexperienced radiologists, who have less experience in accurately interpreting disease characteristics on images, and as a result cases are either misdiagnosed or left untreated. Moreover, for manual image segmentation the radiologist should have adequate experience to extract appropriate images for disease classification.
Nowadays, intelligent system algorithms, particularly DL methods, are becoming popular. They can formulate algorithms for automated analysis of images, could assist radiologists in the classification of diseases without human assessment, and provide information on the functional performance of salivary glands. These systems are based on CNNs, in which multiple layers are pooled together to create a learning model from the training data sets [Figure 1].7,8 Studies identified by our literature search have shown high diagnostic performance of DL systems in the interpretation of CT and USG images for the presence of characteristic features of SS.6,13 Kise et al conducted training and validation for 300 epochs to create a learning model from the training samples; the test data were then input into the learning model, and each CT image was interpreted for the presence or absence of SS. They observed accuracy, sensitivity, and specificity of 96.0%, 100% and 92.0%, respectively, for the DL system, which were comparable to those of experienced radiologists (98.3%, 99.3% and 97.3%) and significantly higher than those of inexperienced radiologists (83.5%, 77.9% and 89.2%). These findings suggest that DL systems could be used as diagnostic support, especially by inexperienced radiologists, to classify salivary gland diseases based on CT images.7 Similarly, in another study of USG images, Kise et al observed accuracy, sensitivity, and specificity of 89.5%, 90.0%, and 89.0%, respectively, for the DL system for the parotid gland, higher than the 76.7%, 67.0%, and 86.3%, respectively, achieved by inexperienced radiologists interpreting the images for the characteristic features of SS. For the submandibular gland, the accuracy, sensitivity, and specificity of the DL system (84.0%, 81.0%, and 87.0%, respectively) were also higher than those of the inexperienced radiologists (72.0%, 78.0%, and 66.0%, respectively).13 These findings strongly suggest that automated segmentation by CNNs provides advanced diagnostic support to inexperienced radiologists in the selection of appropriate images for interpretation. In addition, training on large data sets improves the diagnostic efficiency of DL models and assists in the recognition of complex patterns.
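The accuracy, sensitivity and specificity values quoted above are all derived from the counts of a confusion matrix. The Python sketch below shows the standard definitions. The counts used in the example are an assumption: they are simply one set of counts that is consistent with the CT test set described in Table 2 (50 SS images and 50 control images) and with the reported 100% sensitivity and 92.0% specificity, and they are not figures reported by Kise et al.

```python
# Standard definitions of accuracy, sensitivity and specificity.
# The example counts are assumed for illustration; they are not reported in the source.

def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # proportion of SS images correctly called SS
    specificity = tn / (tn + fp)   # proportion of control images correctly called non-SS
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed split: all 50 SS test images detected, 46 of 50 control images correctly rejected.
accuracy, sensitivity, specificity = diagnostic_metrics(tp=50, fn=0, tn=46, fp=4)
print(f"accuracy={accuracy:.1%}, sensitivity={sensitivity:.1%}, specificity={specificity:.1%}")
```

Under these assumed counts the three values come out as 96.0%, 100.0% and 92.0%, matching the figures reported for the CT study. With equal numbers of SS and control images, accuracy is the mean of sensitivity and specificity.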
Later, Vukicevic et al suggested that FCN-DenseNet, a CNN-based network architecture, has wider applications in clinical practice and can automate and integrate both pSS scoring and salivary gland segmentation. They stated that the FCN-DenseNet network could effectively enhance the interpretation of USG images for the diagnosis of SS, because its dense connectivity allows the depth of the CNN to be increased while limiting the problem termed "vanishing gradient". The FCN-DenseNet network performs semantic segmentation using interconnected dense blocks and pooling operations followed by an upsampling path; the input received by each dense block is passed on to the subsequent layers, which increases information flow, to yield the output.14
Clinical applications of DL architectures have been studied for the location of radiographic landmarks and for the detection of periapical pathologies, caries, and oral cancer in its initial stages. Endres et al found a total predictive value (TPV) of 0.67 (±0.05) for DL models, compared with a TPV of 0.51 (±0.14) for oral and maxillofacial (OMF) surgeons, in the assessment of periapical radiolucencies such as cysts, granuloma, and osteomyelitis on 192 selected panoramic images. The results indicated that, on average, 49% of periapical radiolucencies were missed by OMF surgeons, which suggests that their ability to accurately identify periapical radiolucencies on panoramic radiographs may be limited. It was concluded that DL could guide OMF surgeons in the interpretation of periapical radiolucencies on panoramic radiographs for better clinical outcomes.11 Lee JH et al found high diagnostic performance of deep CNN architectures in the detection of three types of odontogenic cystic lesions (OCLs), namely odontogenic keratocysts, dentigerous cysts, and periapical cysts, on dental panoramic and CBCT images. The pretrained model using CBCT images showed higher diagnostic performance (AUC = 0.914, sensitivity = 96.1%, specificity = 77.1%) than models using panoramic images (AUC = 0.847, sensitivity = 88.2%, specificity = 77.0%) (p = 0.014).10
Accurate localization of oral lesions at risk of developing into cancer presents diagnostic difficulties, and currently developed AI models are a boon for both patients and cancer specialists in the detection of oral cancer in its early stages. Welikala et al suggested that the ResNet-101 CNN framework could extract characteristic features from well-annotated images of oral lesions; an F1 score (the harmonic mean of precision and recall) of 87.07% was obtained for the identification of images that contained lesions and 78.30% for the recognition of images that required referral.9 In the study by Fu et al, DL algorithms achieved a sensitivity and specificity of 94.9% and 88.7%, respectively, in the early and rapid detection of oral squamous cell carcinoma (OSCC) and allowed differentiation of OSCC from non-OSCC oral lesions.8 Studies have suggested outstanding diagnostic accuracy of DL systems in the detection of oral lesions, beyond the skill of clinical experts, but their usefulness in the early diagnosis of salivary gland diseases, particularly SS, is still at a nascent stage. Therefore, more research should be focussed on this domain in future for better diagnosis and treatment outcomes for patients with persistent xerostomia and chronic or recurrent salivary gland swelling.
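Since the F1 score is quoted above as the harmonic mean of precision and recall, a minimal Python sketch of that definition may be helpful. The precision and recall values in the example are hypothetical and are not taken from Welikala et al, whose underlying precision and recall are not given in this review.

```python
# F1 score as the harmonic mean of precision and recall.
# The example inputs are hypothetical, chosen only to show the calculation.

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


example = f1_score(precision=0.90, recall=0.80)
print(f"F1 = {example:.3f}")  # about 0.847
```

Because it is a harmonic mean, the F1 score is pulled towards the lower of the two inputs, so a model cannot reach a high F1 score by maximising precision at the expense of recall or the reverse.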
Analysis of the studies in this systematic review revealed that DL algorithms and CNN architectures have the potential to assist radiologists in performing this challenging task in a reduced time and to provide reproducible images of oral diseases. Applications of DL systems are still an active area of research, and promising results have been obtained for their performance in identifying the characteristic features of SS and in establishing a likely differential diagnosis. The diagnostic performance of CNNs increases with the volume of the data sets, which makes DL systems a reliable modality for non-invasive assessment of SS and for detection of salivary gland abnormalities in both the ductal system and the glandular parenchyma that are invisible to the naked human eye. In future, more research studies should be encouraged for better prognosis of SS patients.
- Deep Learning based diagnosis of Sjögren syndrome using In Vivo Confocal Microscopy. Investig Ophthalmol Vis Sci. 2020;61:1621.