Probabilistic Tracking of Soccer Players and Ball
Kyuhyoung Choi, Yongdeuk Seo, and Sang Wook Lee
Dept. of Media Technology, Sogang University, Shinsu-dong 1, Mapo-gu, Seoul 121-742, Korea
{Kyu, Yndk, Slee}@sogang.ac.kr
Abstract. This paper proposes an effective system for simultaneously tracking multiple players and a ball in broadcast soccer matches. The system uses a particle filter together with images synthesized from templates to track players of the same team under occlusion. The synthesized image, from which an adaptive color histogram is computed, serves as the expected image for each particle and gives a more precise likelihood evaluation of the particles. For ball tracking, when the ball is in ballistic motion without any interruption by players, an ordinary particle filter estimates the state of the ball. When the ball is considered to be possessed by one or more players, the tracker stops and waits for the ball to reappear in the area around the corresponding players. The tracker gives good performance on commonly broadcast soccer match videos.
1 Introduction
Analysis of soccer video sequences has been an interesting application in computer vision, as the abundance of recent papers shows, to say nothing of the popularity of soccer itself. This paper focuses on tracking the players and the ball in commonly broadcast soccer match video sequences. In this environment, the most challenging problem is occlusion between players of the same team, which causes the particles of one player to split and populate two or more players. A similar kind of input video was studied in [1], which tracked multiple players in a video of American football. Our approach is based on the particle filter [7, 8]. However, simply attaching a particle filter to each player does not perform well: through the iteration of likelihood evaluation and re-sampling, the particles of an object easily migrate to the regions of adjacent objects that give higher likelihoods, as has been observed in previous multi-object tracking papers. JPDAF (joint probabilistic data association filter) has been a solution for associating measurements with the identities of multiple tracked objects [9]. OAP (occlusion alarm probability) successfully controlled the particle population by probabilistically weighting the likelihood of a particle according to the distance to its neighbors [2]. The suggested tracker exploits an image synthesized from templates of players on the verge of occlusion to give a more reasonable measurement.

The sequential Monte-Carlo method is explained in Section 2. Section 3 deals with pre-image processing and Section 4 covers ordinary single player tracking. In Section 5, tracking of same-team players in occlusion is explained. The method of ball tracking is discussed in Section 6. Section 7 provides experimental results and finally Section 8 concludes this paper.
2 Sequential Monte-Carlo
In brief, the sequential Monte-Carlo (SMC) algorithm sequentially estimates a non-parametric representation of the posterior distribution $p(x_t \mid z_t)$, where $x_t$ is the state and $z_t$ is the measurement at time $t$, given a sequential dynamic system with a Gauss-Markov process. The posterior is represented by random particles, or samples, from the posterior distribution. When it is not possible to sample directly from the posterior, a proposal distribution $q$ with a known random sampler can be adopted to compute the posterior. In this case the posterior at time $t$ is represented by pairs of a particle $s$ and its weight $w$, updated sequentially:

$$ w_t = w_{t-1} \frac{p(z_t \mid x_t)\, p(x_t \mid x_{t-1})}{q(x_t \mid x_{0:t-1}, z_{1:t})} \tag{1} $$

After the weights $w_t^i$ are computed for the particles generated from $q$ and normalized so that $\sum_{i=1}^{N} w_t^i = 1$, where $N$ is the number of particles, the set of particles represents the posterior distribution. After re-sampling, all particles have the same weight $1/N$. The particle filter takes the proposal distribution to be the prediction of the system dynamics, $q = p(x_t \mid x_{t-1})$, which gives $w_t = w_{t-1}\, p(z_t \mid x_t)$; that is, the posterior can be estimated by evaluating the likelihoods at each time using the particles generated from the prediction process of the system dynamics. Combined with re-sampling, the weight update equation further reduces to $w_t = p(z_t \mid x_t)$, where weight normalization is implied afterwards.
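The predict, weight and re-sample cycle described above can be sketched as follows. This is a minimal illustration for a scalar state, not the authors' implementation: the function names, the random-walk dynamics and the Gaussian likelihood in `demo` are assumptions chosen only to make the example runnable.

```python
import math
import random


def particle_filter_step(particles, predict, likelihood, rng):
    """One predict / weight / re-sample cycle of a bootstrap particle filter."""
    # Prediction: draw each particle from the system dynamics (the proposal q).
    predicted = [predict(x, rng) for x in particles]
    # Weighting: previous weights are uniform after re-sampling, so the new
    # weight of a particle is just its likelihood p(z_t | x_t).
    weights = [likelihood(x) for x in predicted]
    total = sum(weights)
    if total <= 0.0:
        # Degenerate case: no particle explains the measurement.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    # Re-sampling: draw N particles with probability proportional to weight.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, weights, estimate


def demo():
    """Hypothetical scalar example: measurement z = 5 with unit noise."""
    rng = random.Random(0)
    z, sigma = 5.0, 1.0
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    predict = lambda x, r: x + r.gauss(0.0, 0.2)  # assumed random-walk dynamics
    likelihood = lambda x: math.exp(-((x - z) ** 2) / (2 * sigma ** 2))
    return particle_filter_step(particles, predict, likelihood, rng)
```

Calling `demo()` returns the re-sampled particle set, the normalized weights and the weighted-mean estimate, which should lie close to the assumed measurement.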
3 Pre-image processing
Fig. 1. Image processing
Figure 1 illustrates the image processing steps used to automatically detect players and identify their classes (e.g., player of team A, player of team B, goalie of team A, goalie of team B, and referee). From the original image $I^{ori}$, the ground area is segmented out according to the 3D histogram of the image to give $I^{sub}$, as shown in Figure 1. Then, applying morphological filtering ($I^{mor}$), connected component labelling ($I^{ccl}$) and size filtering ($I^{siz}$), we get $I^{pla}$, which contains only the candidate region blobs of players. The rest of the image is considered to be the ground, spectators or other facilities of the stadium, and is marked as black. Note that, in the application of the particle filter, only the non-black pixels are considered in the evaluation of likelihood, and from the second frame on only $I^{sub}$ is used for tracking, for both the players and the ball.
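The pipeline above can be sketched on a toy image as follows. This is a simplified illustration under stated assumptions: the image is given as a grid of already quantized color-bin indices, the ground is assumed to be the single most frequent bin, and the morphological filtering step is omitted. The function name `player_blobs` and its parameters are hypothetical.

```python
from collections import Counter, deque


def player_blobs(labels, min_size, max_size):
    """Return blobs (lists of (row, col)) of non-ground pixels within a size range.

    `labels` is a 2D list of quantized color-bin indices, one per pixel.
    """
    rows, cols = len(labels), len(labels[0])
    # Ground segmentation: assume the most frequent color bin is the ground.
    ground = Counter(v for row in labels for v in row).most_common(1)[0][0]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or labels[r][c] == ground:
                continue
            # Connected component labelling by breadth-first search
            # over the 4-neighbourhood of non-ground pixels.
            blob, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] != ground):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Size filtering: keep only blobs of plausible player size.
            if min_size <= len(blob) <= max_size:
                blobs.append(blob)
    return blobs
```

In a real frame the size bounds would depend on the camera zoom; here they are left as free parameters.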
4 Single Player Tracking
Tracking starts from the second frame. For the image of the $t$-th frame, $I^{sub}_t$, the states of the players ($p_t$) and the ball ($b_t$) are estimated by the particle filters assigned to them respectively. The state vector of a player $p$ is $(r^T, w, h)^T$, where $r = (r_x, r_y)^T$ is the center position of the rectangle that represents the player, and $w$ and $h$ are the half width and height of the rectangle. Constant velocity is assumed for the dynamics of the position, and no velocity for the width and height:

$$ r_t = 2 r_{t-1} - r_{t-2} \tag{2} $$
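As a small illustration of these dynamics, the sketch below draws one predicted particle from the states at the two previous frames. The Gaussian process noise and its standard deviations `pos_sigma` and `size_sigma` are assumptions; the text specifies only the constant-velocity mean of Eq. (2).

```python
import random


def predict_player(prev, prev2, rng, pos_sigma=2.0, size_sigma=0.5):
    """Sample a player state (rx, ry, w, h) at frame t from states at t-1 and t-2."""
    rx, ry, w, h = prev
    px, py, _, _ = prev2
    return (
        2 * rx - px + rng.gauss(0.0, pos_sigma),  # Eq. (2) plus assumed noise
        2 * ry - py + rng.gauss(0.0, pos_sigma),
        w + rng.gauss(0.0, size_sigma),           # no velocity for the size
        h + rng.gauss(0.0, size_sigma),
    )
```

With both noise levels set to zero the function reduces to the deterministic prediction of Eq. (2).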
For particle filtering, each player has $N$ samples, or particles, whose weights are determined by likelihood evaluation, that is, by histogram comparison. A class is assigned to each player as in Section 3, and each class has its model color histogram. Let $h_A^i$ be the color histogram of the region for $s_A^i$, the $i$-th sample ($i \in N$) of player A, and let $h_A$ be the model color histogram of the corresponding class. Then $L^i$, the likelihood of $s_A^i$, can be expressed with the total divergence $D$ [10]:

$$ L^i = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( \frac{-D(h_A, h_A^i)^2}{2\sigma^2} \right) \tag{3} $$

$$ D(h_i, h_j) = 2 \log 2 + \sum_{y \in Both} \left\{ h_i(y) \log \frac{h_i(y)}{h_i(y) + h_j(y)} + h_j(y) \log \frac{h_j(y)}{h_i(y) + h_j(y)} \right\} \tag{4} $$

where $y$ is the index of a bin and $Both$ is the set of $y$ satisfying both $h_i(y) > 0$ and $h_j(y) > 0$. The weights are obtained by normalizing $L$, and the weighted sum of the particles gives $\hat{p}$, the estimate of $p$ at this frame.
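Eqs. (3) and (4) and the final weighted-sum estimate translate almost directly into code. The sketch below assumes histograms are given as lists of bin values that each sum to one; the function names and the value of `sigma` used by a caller are illustrative choices, not taken from the paper.

```python
import math


def total_divergence(h_i, h_j):
    """Eq. (4): total divergence between two normalized histograms."""
    d = 2.0 * math.log(2.0)
    for a, b in zip(h_i, h_j):
        if a > 0 and b > 0:  # only bins in the set Both contribute
            d += a * math.log(a / (a + b)) + b * math.log(b / (a + b))
    return d


def likelihood(h_model, h_sample, sigma):
    """Eq. (3): Gaussian likelihood of a particle from its histogram divergence."""
    d = total_divergence(h_model, h_sample)
    return math.exp(-d * d / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def estimate_state(particles, likelihoods):
    """Normalize the likelihoods into weights and return the weighted sum of states."""
    total = sum(likelihoods)
    weights = [l / total for l in likelihoods]
    dim = len(particles[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, particles)) for k in range(dim))
```

Identical histograms give a divergence of zero and therefore the largest likelihood, while histograms with no common non-empty bin give the maximum divergence $2 \log 2$.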
5 Image Synthesis from Templates
As shown in Figure 2, the nature of the particle filter cannot prevent the features of a player from attracting the particles of another player of the same team while the two are close to each other. In Figure 2, during the occlusion between them, part of the image of player A is not visible because of the foreground player B. That is, for player A, that much color information is lost, and as a result the otherwise best particles fail to give the best matches to the model color histogram, which is derived from a player template showing the whole body. The more A is occluded by B, the harder A is to observe and the lower the confidence in the observation becomes. Therefore, it is more appropriate to treat the occluding part of player B as part of A's region and to compare that region with the corresponding synthesized image. This fully exploits the given image information and takes into account the changes around the tracking target. For this, an observation model as in Figure 3 is required.

Fig. 2. Particle distribution during occlusion

From the tracking results at every frame, we can anticipate occlusions between players of the same team, and for those candidates that are close enough to each other, the image templates are saved. During occlusion, a synthesized image is derived from the templates and its color histogram is used for likelihood evaluation of the particles. If the distance between players A and B of the same team at frame $t-1$ becomes smaller than a proper threshold during single player tracking as in Section 4, that is, if those players are expected to overlap each other in the coming frames, the template images $T_A$ and $T_B$, corresponding respectively to the expectations $E(p_A)$ and $E(p_B)$, are obtained and saved. To compute the likelihood of the $i$-th particle of A at frame $t$, the synthesized image $T_S^i$ is made out of $T_A$ and $T_B$. If the position expectation of B, $E(r_{B,t})$, is not yet available, the velocity $E_{t-1}(r_B) - E_{t-2}(r_B)$ is added to $E_{t-1}(r_B)$ to give the predicted position $r_{B,t^-}$. Otherwise, $E(r_{B,t})$ is used as $r_{B,t^-}$. $T_A$ and $T_B$ are aligned according to the relative positions of $r_{B,t^-}$ and $r^i_{A,t}$ of the particle $s^i_{A,t}$. The intersection areas of the two templates, $T_{A,sub}$ and $T_{B,sub}$, are derived through this alignment as in Figure 4.
$$ T_{A,sub} = \{ p : x_{A,from} \le p_x \le x_{A,to},\ y_{A,from} \le p_y \le y_{A,to} \} \tag{5} $$

$$ T_{B,sub} = \{ p : x_{B,from} \le p_x \le x_{B,to},\ y_{B,from} \le p_y \le y_{B,to} \} \tag{6} $$
Fig. 3. Observation model
where $p$ is a pixel that has RGB values as elements, and $(x_{from}, y_{from})$ and $(x_{to}, y_{to})$ are the upper-left and lower-right points for each template respectively. Then $\alpha_S^i$, the pixel constituting $T_S^i$, is defined according to $r_{y,B,t^-}$ and $r^i_{y,A,t}$ as follows. When $r^i_{y,A,t} > r_{y,B,t^-}$, that is, when A is occluding B since A is located lower than B in the image coordinates,

$$ \alpha_S^i(j,k) = \begin{cases} \alpha_B(j - x_{A,from},\ k + y_{B,from}), & \text{if } \alpha(j,k) \in T_{A,sub} \text{ and } V(\alpha(j,k)) \in B_{field} \\ \alpha_A(j,k), & \text{otherwise} \end{cases} \tag{7} $$

When $r^i_{y,A,t} < r_{y,B,t^-}$,

$$ \alpha_S^i(j,k) = \begin{cases} \alpha_B(j - x_{A,from},\ k + y_{B,from}), & \text{if } \alpha(j,k) \in T_{A,sub} \text{ and } V(\alpha(j,k)) \notin B_{field} \\ \alpha_A(j,k), & \text{otherwise} \end{cases} \tag{8} $$

where $V$ maps the color of a pixel to a bin of a color histogram. Unlike single player tracking, where no other players are around the target, the particle of the maximum likelihood is taken as the state of a player during the frames of occlusion. Examples of synthesized images are shown in Figure 4(b).

(a) Image synthesis
(b) Examples of synthesized images
Fig. 4. Image synthesis and example images
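One possible reading of the synthesis rule in Eqs. (7) and (8) is sketched below. Several things here are assumptions rather than statements from the text: templates are small grids of quantized pixel labels, the label `FIELD` stands for any pixel whose color falls in $B_{field}$ (interpreted as the field-colored bins), the alignment is reduced to a single offset `(dx, dy)` of B's upper-left corner relative to A's, and in the second case the field test is applied to B's pixel.

```python
FIELD = 0  # hypothetical label for field-colored (background) pixels


def synthesize(t_a, t_b, dx, dy, a_in_front):
    """Compose a synthesized template in the coordinate frame of template A.

    t_a, t_b: 2D lists of quantized pixel labels, indexed [row][column].
    (dx, dy): column and row offset of B's upper-left corner relative to A's.
    a_in_front: True when A is lower in the image than B, i.e. A occludes B.
    """
    out = [row[:] for row in t_a]
    for k, row in enumerate(t_a):        # k: row index (y)
        for j, pix_a in enumerate(row):  # j: column index (x)
            jb, kb = j - dx, k - dy      # same pixel in B's coordinates
            if not (0 <= kb < len(t_b) and 0 <= jb < len(t_b[0])):
                continue                 # outside the intersection: keep A
            pix_b = t_b[kb][jb]
            if a_in_front:
                # A occludes B: B is visible only where A's template shows field.
                if pix_a == FIELD:
                    out[k][j] = pix_b
            elif pix_b != FIELD:
                # B occludes A: B's non-field pixels cover A.
                out[k][j] = pix_b
    return out
```

The color histogram of the returned grid would then replace the single-player model histogram when evaluating the likelihood of the particle that produced this alignment.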
6 Ball Tracking
The overall model for ball tracking is similar to that for the players, except that the ball is considered as an ellipse with four parameters: the center position and the lengths of the long and short axes. Unlike player tracking, a contour tracking module is added to that of histogram matching [11]. That is, the intensity gradient along the normal of the circumference is measured as another observation cue, in addition to the existing color histogram of the inner area of the ellipse. The weight $w_i$ of the $i$-th particle among the $N_b$ ball particles, computed via the dual module of contour and color histogram, is

$$ w_i = \frac{Y_u(i)}{\sum_{i=1}^{N_b} Y_u(i)}, \qquad Y_u(i) = \frac{u_i - \min_{j \in N_b} u_j}{\max_{j \in N_b} u_j - \min_{j \in N_b} u_j} \tag{9} $$

where

$$ u_i = Y_{u_g}(i) + Y_{u_c}(i), \qquad u_g = \frac{1}{N_c} \sum_{k=1}^{N_c} g(k), \qquad u_c = D(h, h_b) \tag{10} $$

with $N_c$ the number of pixels on the circumference, and

$$ g(k) = \begin{cases} 1, & \text{if } c(k) \in E \\ 0.5, & \text{if } c^-(k) \in E \text{ or } c^+(k) \in E \\ 0, & \text{otherwise} \end{cases} \tag{11} $$
where $c(k)$ is the pixel corresponding to $k$ ($k \in N_c$), $c^-(k)$ and $c^+(k)$ are the pixels in front of and behind $c(k)$ along its normal respectively, $E$ is the set of Canny edge pixels, and $h$ and $h_b$ are the histograms of a particle and the model ball template respectively.

As in Figure 5, the states of the ball can be classified into two cases: one in which a player has the ball, and the other in which the ball is in ballistic motion as an elastic body. An ordinary particle filter is applied for the latter. For the former, the filter stops tracking and the position of the player possessing the ball is taken as the estimate of the ball position. In the following frames, the tracker waits for the ball to separate from the player, and the particle filter is reinitialized after the reappearance of the ball to resume tracking. That is, if $E(b_t)$, the resulting estimate of the ball at frame $t$, shows that one or more players have the ball, that is, if the distance between the ball and the player(s) is smaller than a proper threshold, the mean position $E(r_t)$ of $p_i$ ($i \in N_c$), the $N_c$ players within a proper distance $d$ from the estimated ball, is considered as the ball position. From the next frame, rectangular bounding boxes of proper size around those players are searched to detect the reappearance of the ball. In the image $I^{ccl}$, a proper image blob of the separating ball is searched for, and its likelihood is evaluated in $I^{sub}$. The blob of maximum likelihood above some threshold is taken as the reappeared ball and is assigned a new particle filter.

Fig. 5. Ball tracking
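The two pieces of this section, the dual-cue particle weights of Eqs. (9) to (11) and the possession test, can be sketched as follows. The function names are hypothetical. One assumption deserves emphasis: the color scores passed to `ball_weights` are taken to be oriented so that a larger value means a better match (for example, a negated divergence), since the text gives $u_c = D(h, h_b)$ without stating how the divergence is turned into a score. Equal scores are mapped to 1 to avoid a division by zero, which is also an assumption.

```python
import math


def min_max(values):
    """Y(.) of Eq. (9): rescale scores to [0, 1]; all-equal scores map to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def gradient_cue(on_edge, near_edge):
    """u_g of Eqs. (10)-(11) for one particle.

    on_edge[k]: the contour pixel c(k) is a Canny edge pixel.
    near_edge[k]: one of its neighbours along the normal is an edge pixel.
    """
    g = [1.0 if on else 0.5 if near else 0.0 for on, near in zip(on_edge, near_edge)]
    return sum(g) / len(g)


def ball_weights(gradient_scores, color_scores):
    """Eq. (9): combine the normalized cues and normalize the result to sum to 1."""
    u = [a + b for a, b in zip(min_max(gradient_scores), min_max(color_scores))]
    y = min_max(u)
    total = sum(y)  # at least 1, because the best particle always maps to 1
    return [v / total for v in y]


def possession_estimate(ball, players, d):
    """Mean position of the players within distance d of the ball, or None."""
    near = [p for p in players if math.hypot(ball[0] - p[0], ball[1] - p[1]) < d]
    if not near:
        return None  # ball is free: keep the ordinary particle filter running
    n = len(near)
    return (sum(p[0] for p in near) / n, sum(p[1] for p in near) / n)
```

When `possession_estimate` returns a position, the tracker would stop the ball filter and use that position until the ball is detected again near those players.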
7 Experiments
Experiments were carried out on video sequences of size 640 × 360 and 960 × 540. Figure 6 shows some frames of the results, the details of which are contained in the accompanying video clip. The initial position of the ball was given by the user, and no two players were supposed to appear as a connected blob in the initial frame. In the goal-scoring situation of Figure 6(a), the ball is lost at the moment the goalkeeper appears to catch it, because the ball actually passes by him without being touched. In Figure 6(b), two players in white tops overlap each other and the trackers keep tracking them through the occlusion.
8 Conclusion
This paper presents an effective system to track the players and the ball in a soccer match video sequence. To deal with objects of the same class, namely the players of the same team during occlusion, we exploited an image synthesized from template images saved before the occlusion for the likelihood evaluation of the particles. Simple color histogram matching with the proposed method gave lower performance than expected, since the spatial and local information of colors is not considered in the histogram.

Fig. 6. Examples of results: (a), (b)
References
1. Intille, S., Bobick, A.: Closed-world tracking. In: Proc. Int. Conf. on Computer Vision. (1995)
2. OK, H., Seo, Y., Hong, K.: Multiple soccer players tracking by condensation with occlusion alarm probability. In: Int. Workshop on Statistically Motivated Vision Processing, in conjunction with ECCV 2002, Copenhagen, Denmark. (2002)
3. Reid, I., Zisserman, A.: Goal-directed video metrology. In: Proc. European Conf. on Computer Vision. (1996)
4. Kim, T., Seo, Y., Hong, K.: Physics-based 3d position analysis of a soccer ball from monocular image sequences. In: Proc. Int. Conf. on Computer Vision. (1998) 721–726
5. Inamoto, N., Saito, H.: Immersive observation of virtualized soccer match at real stadium model. In: IEEE and ACM International Symposium on Mixed and Augmented Reality. (2003)
6. Taki, T., Hasegawa, J., Fukumura, T.: Group motion features for teamwork evaluation and its application to soccer games. In: 14th International Conference on Pattern Recognition. (1998)
7. Blake, A., Isard, M.: Active Contours. Springer-Verlag (1997)
8. Doucet, A., Godsill, S., Andrieu, C.: On sequential monte-carlo sampling methods for bayesian filtering. (2000)
9. Bar-Shalom, Y., Fortmann, T.: Tracking and Data Association. Academic Press (1998)
10. Lee, L.: Similarity-Based Approaches to Natural Language Processing. PhD thesis, Harvard University, Cambridge, MA (1997)
11. Birchfield, S.: Elliptical head tracking using intensity gradients and color histograms. In: Proc. IEEE Conf. Computer Vision and Pattern Recognition. (1998)