

4.3.5 Evaluation of Prototype-Based Techniques

The prototype-based techniques are evaluated in three stages: (i) prototype extraction from raw heartbeats; (ii) eXplainable ResNet (xResNet); and (iii) PIPxResNet.

TH-2764_156201001

4. PENALTY INDUCED PROTOTYPE-BASED EXPLAINABLE RESNET FOR HEARTBEAT CLASSIFICATION

Figure 4.4: Neural Network Model Performance. The x-axis represents the models (C1 to C9, R1 to R9) and the y-axis represents the performance achieved for the corresponding metric (Accuracy, Specificity, F1 Score). Panels: (a) Original Data; (b) Original Data With Class Weights; (c) WWT Augmented Data; (d) ADASYN Augmented Data; (e) Augmented Beats at batch 255; (f) Augmented Beats at batch 525; (g) Augmented Beats at batch 649; (h) Augmented Beats at batch 757.

Prototype Extraction from Heartbeats: Extracting prototypes from raw heartbeats, without using pretrained neural networks as feature extractors, deteriorated the performance. However, the explanations and reliability scores provided evidence for misclassified beats, thereby allowing better model assessment. Figure 4.5b provides the performance on the original and augmented data. Since no neural network is employed, CWOD is skipped. Pr and Se are consistently low for all the models. The training time was around 100 seconds for the full training set, and the testing time (TeTi) was around 300 seconds for the testing set. This time (≈ seconds) is significantly less than that of training neural networks (≈ a day). The testing time is directly proportional to the number of prototypes extracted during the training phase.
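The link between testing time and prototype count can be illustrated with a minimal nearest-prototype classifier. This is only a sketch under stated assumptions, not the implementation used in this work: the Euclidean distance, the function name `classify_by_prototype`, and the toy arrays are all hypothetical.

```python
import numpy as np

def classify_by_prototype(beats, prototypes, proto_labels):
    """Label each beat with the class of its nearest prototype.

    Euclidean distance is an assumption made for this sketch.
    Returns the predicted labels and the number of distance
    computations needed per beat (one per prototype).
    """
    # Pairwise differences, shape (n_beats, n_prototypes, n_features).
    diffs = beats[:, None, :] - prototypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    nearest = np.argmin(dists, axis=1)
    return proto_labels[nearest], dists.shape[1]

# Toy 3-dimensional "beats" (hypothetical values, not real ECG data).
prototypes = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])
proto_labels = np.array(["N", "S", "V"])
beats = np.array([[0.1, 0.0, -0.1], [4.8, 5.1, 5.0]])

labels, comparisons_per_beat = classify_by_prototype(beats, prototypes, proto_labels)
```

Each test beat is compared against every stored prototype, so in this sketch the inference cost per beat grows linearly with the number of prototypes.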

Table 4.1 shows the number of prototypes extracted from each configuration without using pretrained neural networks. DCCGAN generated far fewer prototypes for all batches than the other configurations. The reason might be the high resemblance of the synthesised beats, which the other augmentation techniques did not generate with comparable quality. Majority-class beats (N) generate more prototypes than minority-class beats (SVEB and VEB). In terms of individual class performance, the method produces the highest Pr for Normal class beats, the highest Se for SVEB, and the highest Acc and Sp for VEB.
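The per-class metrics mentioned above can be computed from a confusion matrix. The sketch below assumes the standard one-vs-rest definitions of Pr, Se, Sp, and Acc; the function name `per_class_metrics` and the counts are hypothetical and are not results from this work.

```python
import numpy as np

def per_class_metrics(conf, classes):
    """One-vs-rest metrics; conf[i, j] counts true class i predicted as class j."""
    total = conf.sum()
    out = {}
    for k, name in enumerate(classes):
        tp = conf[k, k]
        fn = conf[k, :].sum() - tp   # beats of class k that were missed
        fp = conf[:, k].sum() - tp   # other beats predicted as class k
        tn = total - tp - fn - fp
        out[name] = {
            "Pr": tp / (tp + fp),     # precision
            "Se": tp / (tp + fn),     # sensitivity (recall)
            "Sp": tn / (tn + fp),     # specificity
            "Acc": (tp + tn) / total, # one-vs-rest accuracy
        }
    return out

# Hypothetical counts chosen only to keep the arithmetic easy to follow.
conf = np.array([[90, 5, 5],
                 [4, 4, 2],
                 [2, 2, 6]])
metrics = per_class_metrics(conf, ["N", "SVEB", "VEB"])
```

With an imbalanced matrix like this one, a majority class can reach high Pr while a minority class keeps high Sp, which is why the metrics are reported per class.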

4.3. EXPERIMENTAL SETUP AND RESULTS

Figure 4.5: Best performing models from each configuration. Each panel reports Precision, Accuracy, Sensitivity, and Specificity for the configurations OD, CWOD, WWTOD, ADAOD, 255, 525, 649, and 757. Panels: (a) Neural Network Models; (b) Prototype Extraction from Heartbeats (CWOD not included); (c) eXplainable ResNet; (d) Proposed Method (PIPxResNet).

Table 4.1: Number of individual class prototypes extracted from heartbeats without using neural networks.

Class  DCCGAN  OD  WWTOD  ADAOD
N      14      66  66     66
S      12      28  17     30
V      10      23  33     25

eXplainable ResNet (xResNet): Pretrained neural networks were used to extract features, and ResNet [31] was used for heartbeat classification and explanation without penalising the beats. Figure 4.5c provides the performance of all configurations. The model trained using DCCGAN augmented beats at batch 255 performed better than ADAOD and WWTOD. Table 4.2 shows the number of prototypes extracted from each configuration using xResNet. The number of prototypes for OD, WWTOD, and ADAOD decreased compared to the prototypes extracted from heartbeats without using neural networks (Table 4.1). For DCCGAN synthesised beats, the number of individual class prototypes follows the probability distribution of the corresponding class beats, i.e., N prototypes are the most numerous, followed by SVEB and then VEB. Other configurations also produce the highest number of N prototypes, followed by the irregular beat classes. The CWOD configuration produced many Normal class prototypes, which might be due to the weaker penalisation of misclassified Normal class beats during network pretraining. The results improved significantly over the prototype-only method, and the number of prototypes decreased, resulting in reduced testing time. The training and testing times decreased because the data dimension was reduced from 186 to 3. Moreover, whenever the data dimension is reduced, the method tends to perform better and faster. The model performed better on the CPSC and INCART data with good explanations, possibly because beats from the same patient (recorded at different periods) were present in both the train and test data.

Table 4.2: Number of individual class prototypes extracted using xResNet.

Class  255  525  649  757  OD  CWOD  WWTOD  ADAOD
N      21   15   16   14   8   45    15     17
S      10   13   15   14   10  9     8      12
V      8    11   9    12   16  16    11     9
All    39   39   40   40   34  70    34     38

PIPxResNet: A data-driven penalty is imposed on encoded and actual beats if they are nearer to the prototypes of another class than to the prototypes of their own class. For example, if an encoded N beat resembles SVEB, it contributes less during the prototype update, which reduces misclassification at inference. Table 4.3 shows the number of prototypes extracted from each configuration using the proposed PIPxResNet. Compared to xResNet, the number of prototypes decreased for the 649th batch of DCCGAN, OD, CWOD, and WWTOD and increased in the other configurations, with an increase in performance. Introducing the penalty helps extract better prototypes that are more representative of the training dataset, rather than influencing the number of prototypes. Figure 4.5d provides the performance of all configurations and shows that imposing the penalty increases the model performance in general. As shown in Figure 4.6, the testing time also decreased by a factor of 10, making the model suitable for mobile devices. Pr and Sp improved because penalising reduces the contribution of ambiguous beats, thereby reducing undesired overfitting effects and providing better model generalization. Normal class prototypes decreased without a major change in performance. SVEB prototypes increased, with an increase in Pr, Acc, and Sp. VEB prototypes nearly doubled, which reduced Pr but increased Acc and Se.
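The idea of letting ambiguous beats contribute less to a prototype can be sketched as a weighted mean. The exact penalty used by PIPxResNet is not restated in this section, so the weight below (the ratio of the nearest other-class distance to the nearest own-class distance, capped at 1) is purely an illustrative assumption, as are the function names and the toy 2-dimensional encodings.

```python
import numpy as np

def penalty_weights(encoded, labels, prototypes, proto_labels):
    """Weight in [0, 1]: 1 if a beat is nearest to its own class, smaller otherwise.

    The ratio d_other / d_own is an illustrative choice, not the thesis's penalty.
    """
    dists = np.linalg.norm(encoded[:, None, :] - prototypes[None, :, :], axis=2)
    same = labels[:, None] == proto_labels[None, :]
    d_own = np.where(same, dists, np.inf).min(axis=1)     # nearest own-class prototype
    d_other = np.where(~same, dists, np.inf).min(axis=1)  # nearest other-class prototype
    return np.minimum(1.0, d_other / np.maximum(d_own, 1e-12))

def weighted_prototype(encoded, weights):
    """Weighted mean of the beats: penalised beats pull the prototype less."""
    return (weights[:, None] * encoded).sum(axis=0) / weights.sum()

# Toy example (hypothetical): three N beats, the last one lies close to the S prototype.
prototypes = np.array([[0.0, 0.0], [4.0, 0.0]])
proto_labels = np.array(["N", "S"])
encoded = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
labels = np.array(["N", "N", "N"])

weights = penalty_weights(encoded, labels, prototypes, proto_labels)
updated_n_prototype = weighted_prototype(encoded, weights)
plain_mean = encoded.mean(axis=0)
```

In this toy case the third beat receives a weight of 1/3, so the updated N prototype stays closer to the unambiguous N beats than the plain mean does.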

Table 4.3: Number of individual class prototypes extracted using PIPxResNet.

Class  255  525  649  757  OD  CWOD  WWTOD  ADAOD
N      21   18   16   23   5   22    7      13
S      8    11   8    10   7   23    11     14
V      14   15   13   14   11  16    13     15
All    43   44   37   47   23  61    31     42

Figure 4.6: Training and testing time taken by xResNet and PIPxResNet for various data configurations. The plot shows time in seconds for the OD, CWOD, WWTOD, ADAOD, and DCCGAN configurations, with four series: xResNet Train, xResNet Test, PIPxResNet Train, and PIPxResNet Test.

Comparing xResNet and PIPxResNet: We compared the best performing xResNet with PIPxResNet and calculated the percentage increase in performance for all evaluation metrics, as described in Table 4.4. Inducing the penalty in the xResNet algorithm generated better quality prototypes and had less influence on the number of prototypes, improving the prediction performance of the classifier. Pr for DCCGAN augmented data at batch 757 improved by 9.68%, Acc for CWOD improved by 24.68%, Se for DCCGAN augmented data at batch 525 improved by 10.69%, and Sp for CWOD improved by 26.35%. The best performing PIPxResNet was obtained with DCCGAN augmented data at batch 255, which is further compared for individual classes with the best performing ResNet.

Table 4.4: Comparison between the best performing xResNet and PIPxResNet.

               xResNet                   PIPxResNet                Change (%)
Configuration  Pr    Acc   Se    Sp      Pr    Acc   Se    Sp      Pr    Acc    Se     Sp
OD             0.66  0.85  0.83  0.91    0.69  0.86  0.84  0.91    4.20  0.65   1.12   0.06
CWOD           0.59  0.69  0.61  0.69    0.59  0.86  0.66  0.87    0.15  24.68  8.52   26.35
WWTOD          0.66  0.86  0.86  0.92    0.68  0.88  0.87  0.93    4.25  2.40   0.50   1.10
ADAOD          0.72  0.89  0.84  0.92    0.77  0.91  0.87  0.93    5.71  2.80   3.35   0.42
255            0.76  0.92  0.86  0.94    0.78  0.93  0.88  0.94    1.68  0.68   2.12   0.25
525            0.68  0.89  0.80  0.92    0.74  0.92  0.88  0.94    8.69  2.87   10.69  1.66
649            0.73  0.91  0.84  0.94    0.74  0.92  0.88  0.94    0.87  0.80   4.38   0.34
757            0.68  0.89  0.81  0.92    0.75  0.92  0.88  0.93    9.69  3.75   8.66   1.09
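The "Change (%)" columns can be read as a relative change with respect to the xResNet score. The sketch below assumes that standard formula; the function name `percent_change` is hypothetical. Applied to the rounded two-decimal scores in the table it gives values close to, but not identical with, the reported ones, presumably because the reported changes were computed from unrounded scores.

```python
def percent_change(baseline, improved):
    """Relative change of `improved` with respect to `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0

# Rounded CWOD accuracy values taken from Table 4.4 (xResNet 0.69, PIPxResNet 0.86).
cwod_acc_change = percent_change(0.69, 0.86)
```

For the CWOD accuracy this yields roughly 24.6%, in line with the 24.68% reported in the table.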