Development of pancreatic CT – scan image dataset and retrieval process for diagnosis


*Author for correspondence E-mail: eswarijp@gmail.com

K Jayaprakash1* and R Anandan2

1Department of Biotechnology, 2Department of Computer Science & Engineering, Karpaga Vinayaga College of Engineering & Technology, Madhuranthagam, Chennai 603 308, India

Received 09 December 2011; revised 31 July 2012; accepted 01 August 2012

This study presents feature analysis of medical CT scan images, creation of a data bank and development of a data mining technique. A dataset of 50 known digital CT scan images of the pancreas was compiled together with their clinical diagnoses. Texture features (energy, entropy, contrast, homogeneity and correlation) were calculated for every image in the MATLAB numerical environment. The gray level co-occurrence matrix was used to characterize the 2D digital images. Feature values were compared and mean values were stored so that the image best matching a query image could be retrieved. Results are discussed along with future innovation and scope in digital medical image diagnosis.

Keywords: Biomedical image retrieval system, Development of digital image database & data mining, Feature extraction & CBIR in medical image

Introduction

Digital image feature extraction using statistical methods and content based image retrieval (CBIR) have been employed in medical imaging science. CBIR is a computational procedure that combines the physical principles of imaging with statistical analysis. Although a large number of systems and technologies have been described and implemented, a considerable gap in knowledge remains in this area. Enser1 has elaborated on image archives, various indexing methods and common searching tasks using text based queries on annotated images. Gupta & Jain2 highlighted the past, present and future of the retrieval process for medical images. Tang et al3 presented a well documented review of medical image retrieval systems in current use. Although many different techniques have been adapted for medical image retrieval, CBIR is the one considered most often. However, a lacuna still exists in the general application of image retrieval systems and programming tools4. CBIR has proved its worth, as it is based mainly on image feature extraction, feature storage, feature comparison and a query interface5. Existing image processing tools (Khoros/Cantata/VisiQuest1, Insight Toolkit (ITK)2, Visualization Toolkit (VTK)3 or ImageJ4) are available for feature extraction and comparison, but do not have a supporting system for generating a data bank. A processing tool that is not coupled with a storage system may therefore cause complications and the loss of new information. Further, the end user (clinician) may find the retrieval algorithms difficult to operate. This study presents abdominal CT scan digital images of the human pancreas.

Experimental Section

Out of 800 intra-abdominal CT scan images of the human pancreas, collected from various clinical and scan centres, 50 selected images were used for statistical interpretation. Each had been diagnosed by an experienced histopathologist, and the diseased conditions were also recorded.

Feature Extraction

Gray level co-occurrence matrices (GLCM) were used to estimate image properties as per the reported method6. Statistical and structural calculations were done as follows. The $G \times G$ GLCM $P_d$ is defined for a displacement vector $d = (dx, dy)$. Entry $(i, j)$ of $P_d$ is the number of occurrences of the pair of grey levels $i$ and $j$ that are a distance $d$ apart:

$P_d(i, j) = |\{((r, s), (t, v)) : I(r, s) = i,\ I(t, v) = j\}|$,

where $(r, s), (t, v) \in N \times N$, $(t, v) = (r + dx, s + dy)$, and $|\cdot|$ is the cardinality of a set. From the co-occurrence matrix, useful texture features (energy, entropy, contrast, homogeneity and correlation) were calculated6-8 (Table 1). Here $\mu_x$ and $\mu_y$ are the means and $\sigma_x$ and $\sigma_y$ the standard deviations of $P_d(x)$ and $P_d(y)$, where $P_d(x) = \sum_j P_d(x, j)$ and $P_d(y) = \sum_i P_d(i, y)$. Owing to the intensive computation involved, only $d = 1$ with angles 0°, 45°, 90° and 135° was considered. The syntax used for calculating the GLCM of an image was Glcm = graycomatrix(I, 'NumLevels', 8, 'G', [5], offset).
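The definitions above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes the image has already been quantised to integer grey levels 0 to `levels - 1`, and the names `OFFSETS`, `glcm` and `texture_features` are hypothetical. The feature formulas are the commonly used ones (entropy with a base-2 logarithm, homogeneity weighted by 1/(1 + |i - j|)). The formula column of Table 1 did not survive in this copy, so the exact expressions the authors used may differ.

```python
import math

# Offsets (row step, column step) for d = 1 at the four angles in the text.
# Rows are assumed to increase downward; this convention is an assumption.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, offset):
    """Count pairs of grey levels (i, j) whose pixels are separated by `offset`."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            t, v = r + dr, c + dc
            if 0 <= t < rows and 0 <= v < cols:
                counts[image[r][c]][image[t][v]] += 1
    return counts


def texture_features(counts):
    """Return the five texture features of the normalised co-occurrence matrix."""
    total = sum(map(sum, counts))
    p = [[n / total for n in row] for row in counts]
    g = range(len(p))
    mu_x = sum(i * p[i][j] for i in g for j in g)
    mu_y = sum(j * p[i][j] for i in g for j in g)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * p[i][j] for i in g for j in g))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * p[i][j] for i in g for j in g))
    cov = sum((i - mu_x) * (j - mu_y) * p[i][j] for i in g for j in g)
    return {
        "energy": sum(v * v for row in p for v in row),
        "entropy": -sum(v * math.log2(v) for row in p for v in row if v > 0),
        "contrast": sum((i - j) ** 2 * p[i][j] for i in g for j in g),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i in g for j in g),
        # Correlation is undefined for a constant image (zero standard deviation).
        "correlation": cov / (sd_x * sd_y) if sd_x and sd_y else float("nan"),
    }
```

Calling `texture_features(glcm(image, 8, OFFSETS[angle]))` for each of the four angles would give one feature set per orientation, which is the form of data the retrieval step compares.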

Minimum Distance Approach

The differences between the texture feature values of the query image and those of each image stored in the dataset were calculated. Difference values were determined for all features (energy, entropy, contrast, homogeneity and correlation) and sorted in ascending order. The image with the lowest difference (nearest to zero) was selected. This was done stepwise.
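The selection rule described here can be sketched in a few lines of Python. This is an illustration only: the function name `nearest_per_feature` and the dictionary layout (image number mapped to feature values) are assumptions, not part of the original MATLAB work.

```python
FEATURES = ("energy", "entropy", "contrast", "homogeneity", "correlation")


def nearest_per_feature(query, database):
    """For each feature, find the stored image closest to the query.

    `query` maps feature name -> value; `database` maps image number -> such a
    mapping. Returns feature name -> (image number, absolute difference).
    """
    result = {}
    for name in FEATURES:
        # Rank stored images by ascending absolute difference for this feature.
        ranked = sorted(database, key=lambda k: abs(database[k][name] - query[name]))
        best = ranked[0]
        result[name] = (best, abs(database[best][name] - query[name]))
    return result
```

Running this once per orientation would yield one retrieved image number per feature and angle, which is the layout reported in Table 4.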

Results and Discussion

Feature Extraction

A database consisting of 50 pancreas images was compiled (Table 2). The clinical diagnosis of each image was ascertained by a pathologist. Feature properties of all 50 pancreas CT scan images were calculated for the different angle orientations, and the statistical mean value of each feature was worked out for each case.

The structural features of the query image were compared with the same numerical properties of each image in the constructed database. The image (numbered 1-50) in the database with the lowest difference is selected automatically. The selected image may be regarded as the closest structural analogue of the query image.

Algorithm (Retrieval Procedure)

Step 0: Create a database with images and their details.
Step 1: Start from the first image of the database and proceed with Step 2 until the last image is reached.
Step 2: Extract features of the image in 4 directions (0°, 45°, 90° and 135°) with the syntax.
Step 3: While the stopping condition is false, do Step 4.
Step 4: Select a query image and proceed with Step 2.
Step 5: Repeat Step 4 until the stopping condition is true.
Step 6: If the query image is matched with a database image using the minimum distance algorithm, the resultant image is captured; otherwise the query image is added to the database as a new image for future analysis.
Step 7: Display the resultant image and the details of its diagnosis.
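Steps 6 and 7 can be sketched as follows. The paper does not state how a "match" is decided, so the `tolerance` parameter and the use of a summed absolute difference over all features are assumptions made only for illustration; `retrieve` is a hypothetical name.

```python
def retrieve(query, database, tolerance):
    """Return (image number, matched flag); an unmatched query is stored as new.

    `query` maps feature name -> value; `database` maps image number -> such a
    mapping. `tolerance` is an assumed matching threshold, not from the paper.
    """
    def distance(stored):
        # Sum of absolute feature differences (an assumed way to combine features).
        return sum(abs(stored[name] - query[name]) for name in query)

    best = min(database, key=lambda k: distance(database[k]), default=None)
    if best is not None and distance(database[best]) <= tolerance:
        return best, True
    # Step 6, second branch: keep the unmatched query for future analysis.
    new_id = max(database, default=0) + 1
    database[new_id] = dict(query)
    return new_id, False
```

A caller would then look up the stored diagnosis for the returned image number when the flag is true, corresponding to Step 7.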

The query image used to test the efficiency of the present retrieval process is shown in Fig. 1a. Its structural features were worked out in the MATLAB environment. Entering the query image data into the dataset of 50 pancreas images automatically brought out the nearest matching image. This was the 33rd of the images scanned, indicating that the 33rd image is the closest match to the query image (Fig. 1b). The textural features of the resultant image (Fig. 1b) were automatically compared with those of the query image, and the minimum difference values were calculated (Table 3). It was observed that the query image has a very small difference in the features of entropy,

Table 1—Texture feature determined

Sl no   Feature   Description   Formula adopted

1   Energy   Measures number of repeated pairs. Energy is expected to be high if occurrence of repeated pixel pairs is high. Energy is 1 for a constant image.

2   Entropy   Statistical measure of randomness that can be used to characterize texture of input image. Entropy is expected to be high if gray levels are distributed randomly throughout the image.

3   Contrast   Returns a measure of intensity contrast between a pixel and its neighbour over the whole image. Range = [0, (size(GLCM,1)-1)^2]. Contrast is expected to be low if gray levels of each pixel pair are similar. Contrast is 0 for a constant image.

4   Homogeneity   Measures local homogeneity of a pixel pair. Homogeneity is expected to be large if gray levels of each pixel pair are similar. Range = [0, 1].

5   Correlation   Returns a measure of how correlated a pixel is to its neighbour over the whole image. Range = [-1, 1]. Correlation is 1 or -1 for a perfectly positively or negatively correlated image. Correlation is expected to be high if gray levels of pixel pairs are highly correlated.


contrast, homogeneity and correlation. At the 45° orientation, entropy and correlation showed the minimum difference between the first sample image and the resultant image (No. 33), namely 0.002377 and 0.0000004. For the 90° orientation, contrast, homogeneity and correlation showed the minimum differences, recorded as 0.005491, 0.000641 and 0.00000042 respectively. Likewise, for the 135° orientation, the minimum differences occurred in the structural features of entropy

Table 2—Database of 50 human pancreatic CT scan images and their diagnosed clinical condition

Image order   Clinical condition

1   Pseudocyst compresses stomach and spleen
2   Cystadenocarcinoma of the pancreas
3   Normal
4   Cystadenocarcinoma of the pancreas
5   Cystadenocarcinoma of the pancreas
6   Normal
7   Carcinoma of tail of pancreas
8   Normal
9   Cystic fibrosis with dilated bowel
10   Normal
11   Pancreatic cancer
12   Stone in distal common bile duct
13   Normal
14   Cystadenoma pancreas
15   Chronic pancreatitis
16   Pancreatic cancer invades splenic vein
17   Adenocarcinoma of pancreas
18   VHL with multiple renal carcinomas with metastases to pancreas
19   Normal
20   Pancreatic cancer with vessel encasement
21   Pancreatic cancer with vessel encasement and liver metastases
22   Lymphoma infiltrates pancreas
23   Splenic artery aneurysm simulates a pancreatic mass
24   Pancreatic cystadenoma
25   Invasive pancreatic cancer recurrence
26   Small cell carcinoma of the pancreas with implants
27   Islet cell tumor of the pancreas
28   Cystic fibrosis (fat replaced pancreas)
29   Invasive pancreatic adenocarcinoma
30   Carcinomatosis with implants
31   Pancreatic cancer presents as an abdominal aortic aneurysm
32   Serous cystadenoma
33   Acute pancreatitis
34   Adenocarcinoma of the tail of pancreas
35   Pancreatic cancer invades splenic vein
36   Mucinous carcinoma of pancreas
37   Retained sponge simulates a pancreatic pseudocyst
38   Duodenal perforation with free air s/p an ERCP
39   Lymphoma infiltrates pancreas
40   Cystadenoma of tail of pancreas
41   Annular pancreas
42   Carcinoma of tail of pancreas
43   Normal
44   Infected pseudocysts
45   Carcinoma of pancreas invades splenic artery & vein & results in splenic infarction
46   Cystadenoma of pancreas
47   Hamoudi tumor
48   Normal
49   Acute pancreatitis
50   Cystic fibrosis involves pancreas

Fig. 1—Human pancreas CT scan image: a) Query pancreas image; and b) Resultant image retrieved from database (33rd image in order)



and correlation. The difference values were calculated as 0.00573 and 0.00000125 (Table 3).

These results indicate that image no. 33, the retrieved image, may be regarded as structurally equivalent to the query pancreas scan image. The resultant image (no. 33) appeared 3 times at the 0° orientation, 2 times at 45°, 3 times at 90° and again 2 times at 135° (Table 4).
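The occurrence counts quoted above can be reproduced from the image numbers reported in Table 4. The short Python sketch below is only an illustration of that tally; the names `TABLE_4` and `most_frequent` are hypothetical.

```python
from collections import Counter

# Image numbers retrieved per feature, copied from Table 4
# (column order: contrast, energy, entropy, homogeneity, correlation).
TABLE_4 = {
    0: (45, 36, 33, 33, 33),
    45: (46, 21, 33, 26, 33),
    90: (33, 21, 25, 33, 33),
    135: (45, 21, 33, 26, 33),
}


def most_frequent(per_angle):
    """Map each angle to (image number, count) of the most often retrieved image."""
    return {
        angle: Counter(images).most_common(1)[0]
        for angle, images in per_angle.items()
    }
```

With the Table 4 values, image 33 is the most frequent result at every orientation, with counts of 3, 2, 3 and 2 for 0°, 45°, 90° and 135°.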

Significantly, the sample image has the minimum distance in correlation at all 4 angles (Table 4). The present minimum distance approach may therefore be a suitable computational tool for retrieving medical images from a dataset and for diagnosing query images. The speed of image retrieval is also significant when compared with other commercial software (VTK)3. The accuracy of the present system is substantiated by other biochemical parameters, clinical symptoms and earlier known cases.

Computational determination of textural features and selection of the most probable matching image saves time. It can also help the clinician to initiate treatment procedures. Many similar studies6,9-12 have examined various anatomical tissue abnormalities through image characteristics. However, these works are based on characterization of image structural features by computer analysis and are not case specific.

This study and the pancreas image database developed here offer a new approach to image recognition using texture properties. Currently, the most active areas of image retrieval research appear to be the detection of features and topics within images using automated annotation methods13.

Conclusions

The minimum absolute difference approach for a CBIR system has been tested for efficiency. A database of 50 pancreas images and their 5 textural properties has been created. A sample query image has been tested to establish the concept. An odd image, which does not match any image, is entered automatically into the dataset as a new image. This can improve the efficiency of the retrieval system and help to develop a large data bank. Future scope in this area lies in state-of-the-art 3D visual perception and in increasing the speed of retrieval.

Acknowledgements

The authors thank the Director, Advisor, Principal and Dean of Karpaga Vinayaga College of Engineering and Technology, Chennai 603 308, India, for their encouragement.

References

1 Enser P G B, Pictorial information retrieval, J Doc, 51 (1995) 126-170.

2 Gupta A & Jain R, Visual information retrieval, Commun ACM, 40 (1997) 70-79.

3 Tang L H Y, Hanka R & Ip H H S, A review of intelligent content-based indexing and browsing of medical images, Health Inform J, 5 (1999) 40-49.

4 Carson C, Belongie S, Greenspan H & Malik J, Blobworld – Image segmentation using expectation-maximization and its application to image querying, IEEE Trans Pattern Anal Machine Intell, 24 (2002) 1026-1038.

5 Montagnat J, Breton V & Magnin I E, Using grid technologies to face medical image analysis challenges, in Proc Third IEEE ACM Int Symp on Cluster Computing and Grid, 2003, 588-593.

6 Haralick R, Statistical and structural approaches to texture, Proc IEEE, 67 (1979) 786-804.

Table 3—Difference between first sample image and resultant image (33rd image)

Sl no.   Feature       0°         45°         90°         135°

1        Energy        0.004292   0.000574    0.000384    0.000437
2        Entropy       0.001128   0.002377    0.006979    0.00573
3        Contrast      0.011776   0.008851    0.005491    0.002728
4        Homogeneity   0.000292   0.001818    0.000641    0.00482
5        Correlation   0.000304   0.0000004   0.0000042   0.00000125

Table 4—Result of image retrieval process for different orientation

Degree   Contrast   Energy   Entropy   Homogeneity   Correlation   Max no. of occurrences / image no.

0        45         36       33        33            33            3/33
45       46         21       33        26            33            2/33
90       33         21       25        33            33            3/33
135      45         21       33        26            33            2/33


7 Belongie S, Malik J & Puzicha J, Shape matching and object recognition using shape contexts, IEEE Trans Pattern Anal Machine Intell, 24 (2002) 509-522.

8 Gonzalez R C, Woods R E & Eddins S L, Digital Image Processing Using MATLAB (Pearson Education) 2005.

9 Miles K, Functional computer tomography in oncology, Eur J Cancer, 38 (2002) 2079-2084.

10 Wurflinger T, Stockhausen J, Meyer-Ebrecht D & Bocking A, Automatic co-registration, segmentation and classification for multimodal cytopathology, in Proc Med Inform Europe Conf (St Malo, France) 2003.

11 Antani S, Long L R & Thoma G R, Bridging the gap: Enabling CBIR in medical applications, in Proc 21st Int Symp on Comput based Med Syst (University of Jyvaskyla, Finland) 2008.

12 Smietanski J, Tadeusiewicz R & Luczynska E, Texture analysis in perfusion images of prostate cancer – A case study, Int J Appl Math Comput Sci, 20 (2010) 149-156.

13 Yavlinsky A & Ruger S, Efficient re-indexing of automatically annotated image collections using keyword combination, in Proc SPIE: Multimedia Content Access: Algorithms and Systems, vol 6506, 2007.
