
This research work focuses on interpretability in deep learning models for detecting cardiac abnormalities from ECG signals and on explaining the predictions to stakeholders. The thesis contributes in the following directions. Initially, the low-frequency baseline wander is removed from the ECG signal for effective analysis, followed by detecting heartbeats (QRS complexes) in the ECG signal and synthesizing irregular heartbeats to circumvent the imbalanced heartbeat distribution. In the first contribution, a penalty-induced prototype-based explainable ResNet is proposed for heartbeat classification; it addresses the black-box nature of deep neural networks and provides explanations in the form of candidate heartbeats of the corresponding class. In the second contribution, gradient backpropagation-based posthoc interpretability techniques are investigated to explain ventricular tachyarrhythmia prediction using ResNet. The techniques are validated through sanity checks, thereby providing reliable diagnosis explanations to clinicians.

In the third contribution, an attentive transformer neural network is proposed to classify varied-length ECG segments into atrial fibrillation, normal sinus rhythm, noisy rhythms, and other rhythms, and to interpret the predictions. In the fourth and last contribution, multichannel multilabel ECG classification and interpretation is performed using a channel-specific dynamic CNN with demographic features. This design eliminates manual effort and uses fewer trainable parameters for reduced-lead ECG configurations, making the model less prone to overfitting and memory-efficient in real-time processing. The contributions are summarised in Figure 1.2 and described in detail below.

TH-2764_156201001

1.3. CONTRIBUTIONS

Figure 1.2: Thesis Contributions. [Figure blocks: ECG Signal; Baseline Wander Removal using VMD; Synthesis, Classification, Explanation of Detected Heartbeats; Single Channel Single Label ECG Rhythm Classification and Interpretation; Multichannel Multilabel ECG Rhythm Classification and Interpretation.]

1. Baseline Wander (BW) Removal using Variational Mode Decomposition (VMD): ECG recordings get contaminated with BW during signal acquisition, making classification challenging. In this work, BW is removed using VMD, and results are shown for normal sinus rhythm and ventricular tachycardia. First, the ECG signal is decomposed into variational modes using VMD; the noisy modes are then removed, and the remaining modes are combined to obtain the clean signal. The performance of VMD is assessed using percentage root mean square difference, Pearson correlation, maximum absolute error, and signal decomposition time. In addition, VMD is compared with mean filtering, mean-median filtering, empirical mode decomposition, ensemble empirical mode decomposition, and complete ensemble empirical mode decomposition with adaptive noise.

2. Heartbeat Detection: The R-peaks indicate heartbeats in the ECG signal. A simple, reliable, and intuitive algorithm is proposed for real-time R-peak detection using the pattern recognition property of fractals. Mathematical morphological operators such as erosion and dilation are implemented using dynamic programming with memoization to calculate the area under the ECG signal, and peaks are extracted after resampling and thresholding the area curve. The algorithm achieved higher sensitivity and predictivity for non-arrhythmic records than for arrhythmic records. Predictivity was higher than sensitivity; that is, the detector produced fewer extra beats than missed beats. A similar pattern was observed in the detection error rate. The speed and low computational complexity of the fractal-based approach make it an effective real-time QRS detector.

3. Heartbeat Synthesis: A Deep Convolution Conditional Generative Adversarial Network is proposed for synthesizing different beat classes, namely, supraventricular ectopic beats (SVEB), ventricular ectopic beats (VEB), and normal beats (N), as recommended by the Association for the Advancement of Medical Instrumentation [7], to circumvent the imbalanced heartbeat distribution. The generated beats are evaluated quantitatively through five metrics, namely, Fréchet Distance, Dynamic Time Warping, Maximum Mean Discrepancy, Root Mean Square Error, and Time Warp Edit Distance. Qualitatively, the beats are visually analysed by plotting them against the original beats.

4. Heartbeat Classification and Explanation: A Penalty Induced Prototype-based eXplainable Residual Neural Network (PIPxResNet) is proposed to classify N, SVEB, and VEB. The model addresses the black-box nature of deep neural networks. PIPxResNet encodes the temporal variations of heartbeats by employing a pre-trained ResNet, following the concept of task transfer learning. The algorithm extracts the prototypes that are most representative of the training dataset; these explain the model predictions to general physicians, making the explanations clinically relevant. Prototypes of a particular class that closely resemble prototypes of another class are penalized, and their contribution towards the corresponding class is reduced. PIPxResNet is adapted from xDNN [31]. PIPxResNet takes into account the following conditions: (i) beats and prototypes of the same class should lie in close vicinity; and (ii) prototypes of other classes should lie farther away from the prototypes of the corresponding class. From the beats, PIPxResNet generates encoded and actual prototypes associated with each beat class, and the actual prototypes are used for explaining the predictions. PIPxResNet is tested on four publicly available standard datasets, namely, the Massachusetts Institute of Technology-Beth Israel Hospital Arrhythmia Database [32], the MIT-BIH Supraventricular Arrhythmia Database [33], the St. Petersburg INCART 12-lead Arrhythmia Database [34], and the China Physiological Signal Challenge 2020 [35]. The data is collected from multiple geographical locations and is diverse enough to account for a real-world scenario.
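The signal-quality measures named in item 1 (percentage root mean square difference, Pearson correlation, and maximum absolute error) can be sketched as follows. This is a minimal illustration using NumPy on a synthetic sinusoid with an added low-frequency drift; the function names and the test signal are assumptions made for the example, not the thesis implementation, and no VMD step is performed here.

```python
import numpy as np

def prd(reference, estimate):
    """Percentage root mean square difference between two equal-length signals."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    num = np.sum((reference - estimate) ** 2)
    den = np.sum(reference ** 2)
    return float(100.0 * np.sqrt(num / den))

def pearson(reference, estimate):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(reference, estimate)[0, 1])

def max_abs_error(reference, estimate):
    """Largest pointwise absolute deviation between the two signals."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.max(np.abs(diff)))

# Hypothetical test signal: a 1.2 Hz sinusoid standing in for a clean ECG,
# plus a 0.05 Hz drift standing in for baseline wander.
t = np.linspace(0.0, 10.0, 2000, endpoint=False)
clean = np.sin(2 * np.pi * 1.2 * t)
wander = 0.5 * np.sin(2 * np.pi * 0.05 * t)
noisy = clean + wander

# Scores of the contaminated signal against the clean reference; a good
# baseline-wander remover should lower PRD and the maximum error and raise
# the correlation relative to these values.
scores = {
    "prd": prd(clean, noisy),
    "pearson": pearson(clean, noisy),
    "max_abs_error": max_abs_error(clean, noisy),
}
```

In practice the second argument would be the output of the denoising method under test rather than the contaminated signal.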
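The morphological operators mentioned in item 2 can be illustrated with a deliberately simplified sketch. It uses flat erosion and dilation (sliding minimum and maximum) and thresholds their difference to locate peaks. This is not the fractal-based detector proposed in the thesis: the dynamic programming with memoization, the area computation, and the resampling step are omitted, and the window width, threshold, and function names are assumptions chosen for the example. It requires NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _sliding(x, width, reducer):
    """Apply a reducer over a centred sliding window, padding with edge values."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    padded = np.pad(np.asarray(x, dtype=float), width // 2, mode="edge")
    return reducer(sliding_window_view(padded, width), axis=1)

def dilate(x, width):
    """Flat dilation: sliding-window maximum."""
    return _sliding(x, width, np.max)

def erode(x, width):
    """Flat erosion: sliding-window minimum."""
    return _sliding(x, width, np.min)

def detect_peaks(x, width=5, rel_threshold=0.5):
    """Return one peak index per region where dilation minus erosion is large."""
    x = np.asarray(x, dtype=float)
    cover = dilate(x, width) - erode(x, width)
    mask = cover > rel_threshold * cover.max()
    # Locate the start (inclusive) and end (exclusive) of each run of True values.
    edges = np.flatnonzero(np.diff(np.concatenate(([0], mask.astype(int), [0]))))
    starts, ends = edges[0::2], edges[1::2]
    # Within each run, the sample with the largest amplitude is taken as the peak.
    return [int(s + np.argmax(x[s:e])) for s, e in zip(starts, ends)]

# Hypothetical test signal: three unit spikes standing in for R-peaks.
signal = np.zeros(300)
signal[[50, 150, 250]] = 1.0
peaks = detect_peaks(signal)
```

On a real ECG the threshold and window width would need tuning against annotated records.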
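Two of the similarity measures listed in item 3, Dynamic Time Warping and Root Mean Square Error, can be sketched in a few lines. The version below is the textbook dynamic-programming form of DTW with an absolute-difference local cost and no warping-window constraint; whether the thesis uses this exact variant is not stated, so the choice is an assumption.

```python
import numpy as np

def rmse(a, b):
    """Root mean square error between two equal-length beats."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] holds the cheapest alignment of a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(cost[i - 1, j],      # step in a only
                                     cost[i, j - 1],      # step in b only
                                     cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])
```

Unlike RMSE, DTW tolerates beats of different lengths and small timing shifts, which is why the two measures can rank the same synthetic beat differently.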

CVD indicators may appear in random episodes in the nonstationary and nonlinear ECG signal, making segment classification necessary. The subsequent two contributions cover diseases that occur in an episodic fashion, such as atrial and ventricular tachyarrhythmia. These are related to rapid, irregular contractions of the heart's muscle fibers, which arise from untreated SVEB and VEB beats.

5. Posthoc Interpretability Techniques for Explaining Ventricular Tachyarrhythmia Prediction using ResNet: Ventricular tachyarrhythmias such as ventricular tachycardia and ventricular fibrillation, along with normal sinus rhythm, are classified using ResNet, and the predictions are interpreted using gradient backpropagation-based posthoc interpretability techniques such as Guided Backpropagation (GBP) [36], Gradient Class Activation Map (Grad CAM) [37], and Guided Grad CAM [37], which highlight the signal timestamps responsible for a particular diagnosis. The models are tested on five datasets acquired from different geographical locations, which include Massachusetts Institute of Technology [38], Malignant Ventricular Arrhythmia Database [39], Creighton University Ventricular Tachyarrhythmia Database [40], Ideo Ventricular Arrhythmia Database [41], and American Heart Association Database [34]. Quantitatively, the number of ResNet layers and the ECG signal length were varied, and the effect of augmentation techniques, namely, the synthetic minority oversampling technique (SMOTE) [42], Borderline SMOTE using Support Vector Machine [43], and the adaptive synthetic sampling approach [44], was also analyzed to achieve optimum performance. Qualitatively, the saliency maps generated by the posthoc interpretability techniques are used for interpretation, and the techniques are validated using sanity checks, namely, weight randomization and training data permutation [45]. Grad CAM performed better than the other interpretability techniques and precisely highlighted the relevant signal timestamps responsible for or against the diagnosis. In most cases, GBP and Guided Grad CAM behaved as peak detectors, mainly highlighting heartbeats. The sanity checks validate the interpretability techniques by examining how the respective saliency maps respond to these perturbations, providing reliable diagnosis explanations to clinicians.

6. Attentive Transformer Neural Network for Atrial Fibrillation Classification and Interpretation: An end-to-end framework is developed for detecting atrial fibrillation from varied-length ECG segments using CNN, ResNet, Attentive Convolution Neural Network, and Transformer Neural Network. The framework is verified using the PhysioNet Computing in Cardiology Challenge 2017 database. The effect of ECG length is also analyzed. ResNet achieves good performance but is not interpretable. Interpretability is incorporated in DLM through the Attentive Convolution Neural Network and Transformer Neural Network models, which provide explanations through feature maps highlighting the clinically meaningful characteristic ECG waves responsible for and against the diagnosis.

7. Multichannel Multilabel ECG Classification and Interpretation: Cardiac abnormalities are often resolved less clearly on single-channel ECG than on multichannel (12-, 6-, 4-, 3-, and 2-channel) ECG. Moreover, the cardiac abnormality interpreted by one cardiologist may differ from that interpreted by another, leading to multiple possible cardiac abnormalities for a single ECG recording. This problem is tackled in three stages. Initially, 12-lead ECG classification is performed with a single label through convolution, recurrent, and attention-based models. The convolution-based ResNet outperformed the other models in terms of performance and time complexity, so convolution-based models are preferred in the later stages. Next, demographic and heartbeat features extracted from modified limb lead II are fused with a static CNN to perform multilabel classification. Multiple labels improved the performance substantially, but adding heartbeat features did not improve it significantly. The training and testing time of the feature-based models increased due to the extraction of heartbeat features. Therefore, heartbeat features were excluded in the next stage. Lastly, a demographic-feature-fused, channel-specific, dynamically built CNN is used that eliminates manual effort and requires fewer trainable parameters for reduced-lead ECG, making the model less prone to overfitting and memory-efficient in real-time processing. The methods are verified on six publicly available standard datasets [8, 34, 46]. In addition, the model introduces an interpretability mechanism that highlights the important leads and relevant signal timestamps responsible for cardiac pathology prediction, which builds trust in the model.
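The Grad CAM technique referred to in item 5 can be sketched for a one-dimensional signal as follows. The sketch assumes that the feature maps of a chosen convolutional layer and the gradients of the class score with respect to those feature maps have already been obtained from a deep learning framework; here they are passed in as plain arrays. The function name, the normalisation to [0, 1], and the linear upsampling are illustrative choices, not details taken from the thesis.

```python
import numpy as np

def grad_cam_1d(activations, gradients, signal_length):
    """Grad CAM saliency for a 1-D signal.

    activations: array of shape (channels, time), feature maps of one layer.
    gradients:   array of the same shape, d(class score) / d(activations).
    Returns a saliency curve of length signal_length with values in [0, 1].
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    # Channel weights: gradients averaged over the time axis.
    weights = gradients.mean(axis=1)
    # Weighted sum of feature maps, keeping only positive evidence (ReLU).
    cam = np.maximum(weights @ activations, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()
    # Linearly upsample from feature-map resolution to signal resolution.
    coarse_t = np.linspace(0.0, 1.0, cam.size)
    fine_t = np.linspace(0.0, 1.0, signal_length)
    return np.interp(fine_t, coarse_t, cam)

# Hypothetical toy input: two channels, three time steps.
toy_activations = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
toy_gradients = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
saliency = grad_cam_1d(toy_activations, toy_gradients, signal_length=5)
```

High values in the returned curve mark the timestamps that contributed most to the class score under this approximation.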
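The difference between the single-label and multilabel settings in item 7 comes down to how the network outputs are decoded. A minimal sketch, assuming one logit per diagnostic class: single-label decoding picks the one highest-scoring class, whereas multilabel decoding applies an independent sigmoid and threshold to each class so that several diagnoses can be reported for one recording. The 0.5 threshold and the example logits are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

def single_label_predict(logits):
    """Index of the single highest-scoring class."""
    return int(np.argmax(logits))

def multilabel_predict(logits, threshold=0.5):
    """Binary indicator per class; any number of classes may be active."""
    return (sigmoid(logits) >= threshold).astype(int)

# Hypothetical logits for three diagnostic classes of one recording.
logits = np.array([2.0, -1.0, 0.5])
single = single_label_predict(logits)   # one class only
multi = multilabel_predict(logits)      # first and third classes active
```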