SHARP: a shape recognition system and its parallel
implementation
C P Ravikumar* and Rajender Sethi t
A parallel algorithm for shape recognition is presented along with its implementation on a distributed memory muhiprocessor. Shape recognition is one of the fundamental problems of computer vision. We consider a shape to be composed of a set of small straight line segments tangential to the object. The recognition problem is to determine whether the test image contains a speci- fied reference shape or not. The straight line Hough transform (SLHT) has been used to detect reference shapes. A signature- based parallel algorithm called SHARP is developed for shape recognition using SLHT on a distributed memory multiprocessor system. In the SHARP algorithm, the (0, r) space is divided among processors. The SHARP algorithm has been implemented on a Meiko transputer with 32 nodes. We analyse the performance of the parallel algorithm using both theoretical and experimental techniques.
Keywords: shape recognition, parallel algorithm, Hough trans- form
for detection of straight lines in binary images 2'4's. A straight line in the two-dimensional plane can be repre- sented by an angle 0 of its normal and its distance r from the origin. The equation of the straight line is given by:
r = x c o s 0 + y s i n 0
The value of 0 is restricted to the interval [0, x] and r is restricted to the interval [--n*(cos 45 + sin 45), n*(cos 45 + sin 45)], where n * n is the size of the image. The (0, r) space is discretized and represented by a two-dimensional accumulator array. Each accumulator cell counts the number of edge points (xi, Yi) mapped to it. Thus, informa- tion about line segments is obtained by applying a thresh- old value to the accumulator array. A limitation of the Hough transform is that it does not preserve information about which pixel belongs to a particular line, i.e. the connectivity of the edge pixels is not taken into account.
In this paper we use a modified transformation called the
Shape recognition is one of the fundamental problems of computer vision. In many practical applications it is necessary to recognize shapes of various types from pictures. Given a reference shape, the recognition problem is to determine if the test image contains the reference shape or not. In this paper, we consider the problem of shape recognition in binary images.
Shape representation
An arbitrary shape can be considered to be composed of small tangential straight line segments as shown in F i g u r e 1. Hough transformation (HT) 1 3 is a well known method
*Electrical Engineering Department, Indian Institute of Technology, New Delhi 110016, India
fNational Informatics Centre, Labour Information Systems, Division 111, SS Bhavan, Rafi Marg, New Delhi 11000], India
Paper received: 15 December 199 ~ Figure 1 An arbitrary shape and its represemation
0141-9331/95/$09.50 '!~ 1995 Elsevier Science B.V. All rights reserved Microprocessors and Microsystems Volume 19 Number 3 April 1995 131
SHARP: a shape recognition system: C P Ravikumar and R Sethi
straight line Hough transform (SLHT) 5'~', where the conti- nuity of the edge pixels is checked; unrelated pixels are not mixed together. The line segments are collected in a two- dimensional array called slht. A threshold is applied to the length of line segments detected in each slht cell, rather than to the number of edge points in the accumulator. In SLHT, the (0, r) space is regarded as a two-dimensional Boolean array, say A, such that A[O,][r/1 will have a value of 1 if the corresponding slht cell contains one or more detected line segments; otherwise the entry will have a value 07 .
Each line segment in the input shape (see Figure 1 for an example) maps to a point (0, r). Alternatively, each point in (0, r) space corresponds to a line that is tangential to the curve. Thus the SLHT of an arbitrary shape is the set of lines that are tangential to it 5'6. A shape signature, called the scalable translation invariant rotation-to-shifting (STIRS) 7, is obtained from the SLHT by computing the distances between pairs of parallel tangents to the shape.
stored in the database. Each processor computes the partial SLHT of the test image, a partial signature of the test image, and matches the partial signature with the reference signa- ture to find a partial score. These partial scores are combined to draw a conclusion about the detection and orientation of the reference object. The only overhead of the parallel algorithm is the summation of partial scores, thereby resulting in a processor utilization that is close to unity. The details of the SHARP algorithm are given below.
Organization
The concept of signature-based shape recognition using the SLHT is explained in the following section. The parallel shape recognition algorithm (SHARP) developed for the signature-based technique is described in the section following. Experimental results of the parallel implementa- tion are then discussed, followed by conclusions.
Shape matching
In recognition of arbitrary shapes, we are concerned with the detection of the object as well as estimation of the angle of rotation of the object in the test image. To recog- nize the test object, we need to match the signature of the test object with the signature of the reference object. The score at every angle of orientation is computed for all points at which the reference signature matches with the test signature. The detection of a peak at some angle [~ in the score indicates the orientation of the reference shape by angle fi in the test image. If no significant peak is detected, we conclude that the test shape does not contain an instance of the reference object.
Computation requirements
Shape recognition is a computation-intensive task, espe- cially because the reference shape may appear in many orientations in the test image. If we imagine all possible 360 orientations then reference features have to be matched a corresponding number of times in order to locate the shape. This motivates us to consider parallel solutions for recognizing shapes in real-time applications, e.g. robotics.
Parallel implementation
In this paper, we present SHARP (SHApe Recognition in Parallel) which has been designed to obtain linear speed- ups. The algorithm has been implemented on a distributed- memory multiprocessor system. Our algorithm divides the 0 space nearly equally among the available number of processors p. Given a binary test image and a database of reference shapes, our algorithm considers each reference shape in succession and matches it with the test image. The signatures for the reference shapes are precomputed and
SLHT OF A CURVE
A closed curve can be considered ~' 5, 7 to be composed of small, tangential, straight line segments. A curved object can be described by an SLHT. The SLHT of an arbitrary shape, say C, can be described as a set of tangents, i.e. as a multiple-valued function r(fl) such that for a given 0i, say:
r(O) = { r l t h e l i n e / : r = x* cos 0 + y* sin 0 is tangent to C}
We now consider the effect of rotation on the SLHT of a straight line, say 8 / - (01, rl). Consider rotating the image by an angle q~ in a counter-clockwise direction. Then the point (01, r~) will be mapped to (02, r2) such that:
()2 - (01 + qS) mod 1 80 and r 2 - rr if 0 + # 5 < 180
r2= r~ i f 0 + ~ b > 1 8 0
SLHT representation
If we discretize the (0, r) space into m~ and mr levels respectively, the SLHT of a curve is represented by a Boolean array A[O...m~][O...mr], where A[OiJ[r i] will have a value 1 if the corresponding slht cell contains one or more detected line segments at (0i, rj); otherwise it will have a value 0.
STIRS signature of the curve
The distances between pairs of parallel tangential lines to C represent the signature of the curve C. The resultant trans- form (signature) has the following properties:
• It is invariant to the translation of the shape.
• Rotation of the shape corresponds to a circular shift of its signature in the (0, r) space.
SHARP: a shape recognition system: C P Ravikumar and R Sethi
• If the shape is scaled by a factor 5, then the signature is also scaled by the same factor.
Due to the above reasons, the signature is termed as a scalable translation-invariant, rotation-to-shifting (STIRS) signature 7. The STIRS signature is also represented by a Boolean array, say D [ 0 . . , m , ] 1 0 . . . m ~ ] such that D[Oi][r i] = 1 if the distance between a selected pair of points in 0, column is equal to rj; otherwise D[Oi][O] = 0.
PARALLEL ALGORITHM FOR SHAPE RECOGNITION
Parallel computing in shape recognition
The following three steps are involved in detecting a refer- ence shape in a test image:
(a) Computation of the SLHT for the test image.
(b) Computation of the STIRS signature of the test image.
(c) Matching the test signature with that of the reference shape.
The process of shape recognition demands a significant amount of CPU time, and speedup through parallel processing is desirable if we intend to use the shape recognizer in a real-time application.
Computational model
We assume a distributed-memory, multiple instruction multiple data (MIMD) computational model '~. This is because the machine available to us for experimentation, the Meiko transputer system ~°, is a distributed memory multiprocessor.
Parallelization scheme
First consider step (a), namely, the computation of the SLHT of a test image. A naive parallelization scheme is to divide the image row-wise or column-wise into as many partitions as the number of processors p. Each processor can compute the slht array independently on the partitioned image. However, this scheme calls for a globally accessible slht array, which cannot be supported in our computational model. Alternatively, a separate copy of the slht array can be maintained in the local memory of each processor, and these can be finally merged. This solution is also expensive, since it calls for an excessive amount of interprocessor communication. Furthermore, the memory requirement at each computing node is increased significantly, since the slht array is m, x m~ in dimension. In the SHARP algorithm, we consider a different form of data parallelism. The slht array is divided over the 0 space into p partitions, each consisting of m~/p angles. Processor i computes the signa- ture for the entire image for angles in the range Ai = [i*mJp, ( i + ] ) m ~ / p - 1 ] . We refer to this as the partial signature. Processor i also applies the matching
algorithm to the test image and the reference image for the angles in the range Ai for all m, orientations of the refer- ence shape. The score resulting from the partial match is termed the partial score, and consists of an array of m~ real values. The merging step now involw~s the computation of the total score by adding the respective elements of the partial score available with processors. The merging step can be carried out using a binary-tree reduction procedure whose complexity is only O(m,~ + log 2 p). This is further explained in the next section.
Li et al. ~ have proposed a scheme for speeding up the computation of the SLHT using a systolic array architecture.
Their scheme is restricted to the detection of straight-line segments in the test image. Further, the technique { uses fine-grain parallelism. The systolic array solution leads to a special-purpose machine for shape recognition. The emphasis in this paper is on using coarse-grain parallelism in the shape recognition algorithm; this leads to a multi- processor-based parallel algorithm which can be imple- mented on a variety of commercially available MIMD machines such as the Intel iPSC, Intel Touchstone-Delta, BBN Butterfly, NCUBE, PARAM and so on. Furthermore, with the popularity of massively parallel computers, it is expected that future generations of parallel computers will employ over 4096 processors working in a distributed- memory environment. These machines use advanced switching techniques such as wormhole routing to cut down the cost of interprocessor communication. Our parallel algorithm is ideally suited for such platforms.
A limitation of our parallel algorithm is that it requires that a local copy of the test image and the reference signa- ture be available with each processor. The space require- ment of the algorithm is highest during the computation of the SLHT; in addition to an n × n image, a reference signature of dimension m~ × mr, there is also the need for storing information regarding line segments which we need to keep track of the connectivity of pixels. If there are k bright pixels in the image, the worst-case memory require- ment for the connectivity information alone is no more than k*mJp*mr. In our implementation, we have used the Meiko transputer where each of the 32 processors has 4 MB of local memory. We have tested our algorithm on as many as three reference shapes being matched with a test image of size 256 × 256. With a 5 resolution or lower, we did not encounter any memory problems.
SHARP algorithm
We now describe SHARP, a parallel algorithm which is based on the ideas developed in the previous section. The algorithm given in Figure 2 is executed by each of the p processors. A list of reference objects is specified to the algorithm; the processors handle one reference shape at a time. A synchronization step is involved before the proces- sors can proceed to the next reference shape. A separate copy of the test image and the reference signature are loaded into the local memory of each processor. Ignoring file I/O time, the SHARP algorithm can be expected to achieve a linear speedup because of the negligible over-
Microprocessors and Microsystems Volume 19 Number 3 April 1995 133
SHARP: a shape recognition system: C P Ravikumar and R Sethi
head of the procedures participate in add and synchro- nize. The procedures to compute the partial SLHT and partial signature, matching, and addition of partial scores are described below.
P a r a l l e l a l g o r i t h m f o r S L H T
Each processor applies the partial_slht procedure on the test shape for bright pixels. To compute the SLHT, the 0 space is divided into p regions numbered 0, 1, 2 . . . . , p - 1. The region i corresponds to 0 values from i*3~m,/p to ( i + 1)*~mo/p. The processor i is assigned to region i and computes the partial SLHT denoted by slhl[t] [r], 0 <~ t <~ m~/p for the 0 values meant for region i. The algo- rithm to compute the SLHT is given in Figure 3. The complexity of applying this procedure is O ( n 2 * l * m J p ) for an n*n binary image, where / is the average number of line segments detected in each slht cell.
P a r a l l e l a l g o r i t h m f o r c o m p u t i n g t h e s i g n a t u r e The computation of the signature can proceed indepen- dently in each processor after the processor has computed the partial SLHT of the test shape. Processor i computes an
procedure SHARP (p. i. test_shape, reference_shapes) /* p is the number of processors and
i is the processor id */
begin
read the test shape into pixel array;
compute partial_slht;
compute partial_signature;
for each reference_shape do begin
read reference signature;
perform p a r t i a l m a t c h ;
participate in add; (* Add partial scores *) i f i = p - 1 t h e n
find peak in matching score;
synchronize;
end Figure 2
end
The SHARP algorithm
procedure partial_slht (i) /* i is the processor id */
begin
e ~ , = i * Se *me/p
e ~ , -- if+ 1)*Se*me/p - 1 forx = O t o n - l d o
for y = O t o n - l d o
if p i x e l [ x ] [ y ] = 1 then
for e = eJ,~. to ei,~,~ step 6e do begin
t = (e - e',,J /5e r = x * c o s e + y * s i n e update/append slhf[t][r].lines end
end
Figure 3 pixel is an n x n array containing the test shape, slhf contains line segments
accumulator A i after applying a line-length threshold value on its slht cells. The signature array D ~ is computed for all possible pairs of points in the accumulator A i. The algo- rithm to compute the partial test signatures is given in Figure 4. The time complexity of computing the Boolean accumulator and the signature are O(mr*l*m~/p) and O(m2*m~/p) respectively. Thus the time complexity for computation of the test signature is O(m2*m~/p).
P a r a l l e l a l g o r i t h m f o r m a t c h i n g
Since each processor i contains the test signature for the appropriate range of 0, the algorithm shown in Figure 5 matches a region of the test signature with the correspond- ing region of the reference signature. The reference signa- ture is rotated and matched m~ times (wrap around) in steps of ,~j. Each processor maintains its own copy of the match- ing score for all the orientations. The time complexity of computing the partial match is O ( m r * r n J p ) for each orien- tation of the reference shape. Therefore, the overall complexity of the matching step is O ( m r * m 2 j p ) .
procedure p a r t i a l s i g n a t u r e (i) /* i is the processor id */
begin
for E) = 0 t o m e / p - 1 do begin
f o r r = O t o m , - I do
for each line in slhf[®][r] do
if length o f line > threshold then Ai[®][r] = 1
f o r r = 0 t o m , - 1 do if Ai[E)][r] = I then
f o r r I = r + l t o m , - I do if Ai[$)][rl] = 1 then Di[E)][G-r] = I end
end
Figure 4 Computing the signature. Threshold is the line length threshold
procedure partial_match(i) /* i is the processor id */
begin
for 01 = 0 t o m e - 1 do begin
match = approx = miss = 0 for E) 2 = 0 t o m e / p - 1 do begin
t = (6) I + ®2) m o d m e f o r r = 0 t o m , - I do
if D,[t][r] = I then if LY~[®2][r] = I then
match = match + 1 else if D ' t [ e 2 ] [ r ± 1] = I then
approx = approx + 1 else miss = miss + I end
scord[E) 7] = match + approx/2- miss end
end
Figure 5 The array . ~ c o r e i contains the matching score for the /th proces- sol"
SHARP: a shape recognition system: C P Ravikumar and R Sethi
Parallel a l g o r i t h m f o r a d d i n g s c o r e s
The scores computed at the end of partialmatch are added to find the total score. Since the addition operator is asso- ciative, a binary-tree technique 9 has been applied to add the matching scores in Iog2p communication steps. The final matching score will be available with the processor labelled (p 1 ). The detection of peaks in the total score is carried out by processor p - 1. The algorithm to compute the total score is given in Figure 6. The computations involved in this procedure have O(m~*log~ p) complexity.
The communication overhead involved in this step is indi- cated by t(:,,rnm, which is the interprocessor communication time for transferring m0 real values. Finally, finding a peak in the matching score takes m~ operations in processor p - 1. Thus the overall time complexity of the procedure is O(m~l*log~ p ÷ m~ + t~ ... ), i.e. O(m~*log~ p + t~ . . . . ).
S p e e d u p o f t h e S H A R P a l g o r i t h m
The total parallel time taken by the SHARP algorithm,
TSHARP, is given by Equation (1).
TSHA~ O n2
+ O(m~log2p) + t~ ...
The time taken by the sequential algorithm for shape recognition, Ts~, is given by Equation (2).
TsE~2 -- O(n 2 Im~) + O(m~ m~) + O(Nm~m~ 2) (2) Thus the speedup of the SHARP algorithm is given by
S p e e d u p - TSE(,) TsuA,~ ~ p
Since the overhead O(m~*log~p)+ t(- ... << TSHARP, the speedup expected is p.
E X P E R I M E N T A L R E S U L T S
The SHARP algorithm has been implemented in C language on a Meiko transputer ~° of 32 nodes with a Sun- SPARC system as the host under the Unix environment. The core of the parallel algorithm requires about 700 lines of code. The entire software, including relevant I/O operations and options for generating and transforming reference images, etc., uses about 2000 lines of C code. The parallel procedure p a r t i c i p a t e in add(i)
/ * i is the processor id */
begin
fork = O t o / o g # ) - 1 d o if i__ 2 ' - 1then
if (i m o d 2 k+l = 2 ' - 1) tl}en send S c o r # to processor i + 2 k else if (i m o d 2 *+~ = 2 *+t - 1) then begin
receive S c o r e from processor i - 2 k update local S c o r #
end end
Figure 6 Procedure to add partial scores available in each processor
algorithm was tested using test reference shapes $1, $2 and
$3 (Figure 7) and test images T1, T2, T3 (Figure 8). The test object T1 consists of multiple shapes, i.e. instances of shapes $1 and $2. Test shapes T2 and T3 are rotated instances of shapes $2 and $3 respectively. The reference signatures were computed using the following parameters:
1. Shape size, n - : 256 2. 0~<0~<~-
3. - 3 6 3 ~ r ~< 363 4. ~r~ = 5" i.e. mc~ - 37 5. ~r = 1 i.e. m~ -- 727 6. Line length threshold = 2.0
S H A R P i m p l e m e n t a t i o n
The SHARP program along with a copy of the test shape is loaded in all the processors running in parallel. The discretized [0...m~] space is divided equally into a number of regions. Each region is associated with a processor, i.e. the /th processor performs the computation meant for the/th region.
The important data structures used in the SHARP algo- rithm are:
• the pixel array to store the test image;
• the slht ~ array to store line segments (s, e, /) obtained on applying the Hough transform for i* (~mo/p <~ 0 <~ (i + 1 )
*3~m~mjp (region i for the /th processor), where s, e and / are the starting point, end point and length of the line segment respectively;
• the accumulator array A i obtained after line-length thresholding for the i-th region;
• the test signature array, namely D' or D~, computed from accumulator array W, corresponding to the /th region;
• the reference shape signature array Dr;
• the scord array to store the matching score for the /th processor.
$1 $2 $3
Figure 7 Reference shapes
T1 T2 T3
Figure 8 Test shapes
Microprocessors and Microsystems Volume 19 Number 3 April 1995 135
SHARP: a shape recognition system: C P Ravikumar and R Sethi
The result of matching the reference shape $1 with test image T1 is shown in Figure 9a. As can be seen, a peak is obtained at 0 ~', indicating that the test object T1 contains an instance of the reference shape $1. Figure 9b shows the result of matching $2 and T2. It is clear that a peak with a score of 99.37 exists at 0 = 9 0 , indicating that the test shape contains a 90" rotated instance of the reference shape $2. When we apply the matching algorithm to two shapes that indeed do not match, the resulting score array does not show positive values. We have observed that proper thresholding is important to find out the exact orientation of the reference object in the test shape. On the other hand, if the orientation information is not crucial, the recognition algorithm is quite robust to changes in the threshold value.
I I t I
2 4 II 15 ] 2
l U m e r o f p,,~,.,,~lCrO
+ T:B v s 52 ¢ T i vw 5;I
In Figure 10a, the speedup of the SHARP algorithm is plot- ted as a function of the number of processors p. It can be seen that the speedup obtained grows more or less linearly up to p = 16, beyond which there is a drop in speedup.
This can be attributed to the domination of the overhead of
1(]0
Matching Scores
Tq v l $1 ~ 52
a
W I . g O - g ? - d O - G S - 9 4 - ( l a -
! ! " l - g D -
'1
6 O I t ? ~ e B~ l i d | o u u l i 2s u u n i u l l SD u i ; ~S u u i u u dOD i , u u i 12S ; u i i i 1S0 i u r u l ,iris nT h e t s
M e t c h i n g S c o r e s
1 2 vm B 2
m - M -
N -
E -
7 4 l U , l l , l l U , , l l l U l , l n l l l l n l l l U U l l l U r l
0 25 50 75 t l a 125 15D 17S
b
Figure 9 a, Matching score for $1 vs T1 ; b, score of matching $2 with T2
SloeecluD v s
Cornmunicet. i o n T[rne([in SecsD v s
W ~ t ~ r Of pfe<mnm~,*D
Computational results
4
i
2
t
a
N ~ ~ r i r m m
b + 12 vn E~ ¢ T3 vm E ]
Figure 10 a, Variation of speedup with number of processors; b, effect of communication with number of processors
interprocessor communication. We also notice that the speedup obtained when matching the test image T2 with the reference shape $2 is larger when compared to the case of T3 and $3. This is because the image T2 is more complex than T3, and computation of the SLHT is more time consuming. As a result, the computation time is rela- tively larger in comparison to the communication over- head. The interprocessor communication time (in seconds) is plotted in Figure 10b. In the present implementation, it is clear that p = 16 delivers maximum speedup for the same problem size. It was observed that speedup increases if the size of problem is increased by decreasing 6+j for higher angular resolution.
Issues in parallel i m p l e m e n t a t i o n
The chief factors which affect the speedup of the parallel algorithm are interprocessor communication time, load balancing among processors, and the problem size.
These are addressed below, along with the experimental results.
SHARP: a shape recognition system: C P Ravikumar and R Sethi
Communication Problem size
The CS tools software supported by Meiko t° provides a way for configuring the processor interconnection structure by programming the communication links of the transpu- ters. Since the SHARP algorithm requires a binary tree kind of interprocessor communication for summing the partial scores, the experiments have also been carried out by defining communication links appropriately. We also used two other interconnection schemes, namely:
• Default, i.e. link allocation is performed automatically by the Meiko operating system
• Forced, i.e. proximity of the processors is specified.
The results of these are shown in Figure 11.
Since the SHARP algorithm divides the () space to achieve parallelism, it can be expected that the speedup is sensitive to variation of m~. Two problem sizes were chosen to study the effect of increasing the angular resolution: (i) ~ = 5 i.e. m~ = 37 and (ii) ~ = 2 ~ i.e. m~ = 91. The results of this experiment are shown in Figure 13. It is clear that as the problem size is increased, there is increase in speedup as well. This is because (i) the problem has been distributed more equally with m~ = 91 than the first option, giving better results, (ii) more computations are involved in comparison with the communication time. Thus p must be chosen appropriately depending on the size of problem.
Load balancing
As we have seen in the previous section, the SHARP algo- rithm partitions the 0 space equally into p regions. When m~ is not a multiple of p, the partition is imbalanced. Two different schemes were tested for handling this problem:
(i) The remainder of the angles was distributed equally among the low-indexed processors.
(ii) The remainder was distributed among high-indexed processors.
The results of the experiments are shown in Figure 12.
Scheme (i) gives better results, since the critical path in our algorithm includes the computations on processor p - 1.
By loading the high-index processors more than the low- index processors, we increase the length of the critical path.
It is clear from our performance measurements that there is no effect on speedup when the number of processors is small, up to four. After this, the effect becomes significant.
The job is distributed almost equally among the processors when there are a small number of them. When the number of processors is increased, the job will not be distributed equally; e.g. if nb -- 37 is divided among 32 processors, it would imply that five processors will carry out twice the computations than others, thus affecting the overall speedup.
C O N C L U S I O N S
In this paper, we have described a parallel algorithm for the shape recognition problem which arises in a number of applications in robotics and machine vision. The parallel algorithm uses coarse-grained parallelism and is suitable for implementation on multiprocessors; we have discussed an implementation on a transputer-based multiprocessor.
E f f e c t . o f L o a d B a l a n c i n g T2 v s S2
11
G D
LO G 10
9 8 7 6 5 4 3 2 1
0 I I I I I I
1 2 4 8 16 32
N u m b e r o f p r o c e s s o r s - = - T y p e I --~-Type I I Figure 12 Effect of load balancing with number of processors
Commun i c a t . i o n E t l " e c t E f f e c t o f P r o b I em S [ z e
T2 v s 52 T3 v £ $3
11 12
0 1 2 4 8 18 32 I 0
N u m b e r o f p r o c e £ g o r s 1 2 4 8 16 3 2
-e-De-rau i t. --~-I=or-cecl N u m b e r o f p r o c e g ~ o r - s
- . ~ B l n & r y T r e e + P r o b l e m S i z e I - ~ - P r o b l e m S i z e I I Figure 11 Cornmunication effect with number of processors Figure 13 Effect of problem size on speedup with number of processors
Microprocessors and Microsystems Volume 19 Number 3 April 1995 137
S H A R P : a s h a p e r e c o g n i t i o n system: C P R a v i k u m a r and R Sethi
The shape recognition algorithm uses a signature-based technique; the signature of the test image is computed and matched with the precomputed signature of the reference shape. The matching procedure considers any possible rotation that the reference shape may have undergone in the test image. The signatures are based on the straight-line Hough transform and are robust for changes in the orienta- tion, translation and scaling of the reference shape. The data parallelism in the signature-based algorithm has been fully utilized in developing the parallel algorithm. In many image processing algorithms, the image can be partitioned into a number of regions and algorithms can be applied on these regions concurrently. In order to parallelize shape recognition algorithm, the 0 space, rather than the image space, has been partitioned across processors.
The performance analysis of our parallel algorithm has been presented using both theoretical and implementation techniques. Implementation issues such as load balancing, memory requirement of processing nodes, and inter- processor communication have been discussed for effective use of processors.
mentation of the Hough transformation for straight line detection' Part.
Recogn. Vol 22 No 6 (1989) pp697 706
6 Casasent, D and Krishanapuram, R 'Curved object location by Hough transformations and inversions' Patt. Recogn. Vol 2(1 No 2 (1987) pp 181-188
7 Pao, D C W, Li, H F and layakumar, R 'Shapes recognition using the straight line Hough transform: theory and generalization' IEEE Trans.
Patt. Anal. Machine Intell. Vol 14 No 11 (19921 pp 1076-1089 8 Mckenzie, D S and Protheroe, S R 'Curve description using the inverse
Hough transform' Part. Recogn. Vol 23 No 3 and 4 (19901 pp283 290 9 Hwang, K and Briggs, F A Computer Architecture and Parallel Proces-
sing, McGraw-Hill, New York (1985)
10 'Communicating sequential tools' in Meiko User's Manual Meiko Limited (1989)
11 Ravikumar, C P and Gandhi, R K 'Parallel algorithms for edge detection in intensity images' Technical Report, Department of Electrical Engi neering, liT Delhi (1992)
C P Ravikumar obtained his Bachelor's degree in Electronics from Bangalore Universily (1983), Master's degree in Computer Science from the Indian Institute of Science (1987), and Ph.D. in Computer Engineering from the University of Southern California (1991). Since 1991, he has been in the Electrical Engineering Department of liT Delhi as an Assistant Professor. His research interests include parallel processing, CAD for VESI, VLSI design, and heuristic search.
REFERENCES
1 Ballard, D H and Brown, C M Computer Vision, Prentice-Hall, Engle- wood Cliffs, N] (19821
2 Davies, E R Machine Vision: Theory, Algorithms, Practicalities Academic Press, London (1990)
3 Duda, R O and Hart, P E 'Use of the Hough transformation to detect lines and curves in pictures' Comm. ACM Vol 15 No 1 (1972) pp 11-15 4 IIlingworth, J and Kittler, J 'A survey of the Hough transform' Comp. Vis.
Graph. Image Proc. Vol 44 No 1 (1988) pp 87-116
5 Li, H F, Pao, D and Jayakumar, R 'Improvements and systolic imple-
R Sethi obtained an M.Tech. degree in Com- puter Applications from the Department of Mathematics, liT Delhi in 1993. He is currently a senior programmer at the National Informatics Center, New Delhi, India. His areas of interest are parallel processing, image processing, and databases.