Transcript
2012 IEEE Workshop on Signal Processing Systems
ARCHITECTURAL STUDY OF HOG FEATURE EXTRACTION PROCESSOR FOR REAL-TIME OBJECT DETECTION Kosuke Mizuno1, Yosuke Terachi1, Kenta Takagi1, Shintaro Izumi1, Hiroshi Kawaguchi1 and Masahiko Yoshimoto1,2 1
Department of Information Science, Kobe University, 1-1 Rokkodai-Cho, Nada-Ku, Kobe, Hyogo 657-8501, Japan 2 JST CREST, Japan and an FPGA-GPU architecture [4] have been proposed for real-time applications. Cao et al. [7] realized an FPGA implementation with the best performance compared with other implementations. However, this study particularly targets stop-sign detection. HOG features are adaptable to widely various applications. Consequently, next-generation HOG feature extraction processors must provide higher expandability and higher performance. Therefore, our goal is to develop design techniques for a real-time HOG feature extraction processor for HDTV resolution video.
ABSTRACT This paper describes a Histogram of Oriented Gradients (HOG) feature extraction processor for HDTV resolution video (1920 × 1080 pixels). It features a simplified HOG algorithm with cell-based scanning and simultaneous Support Vector Machine (SVM) calculation, cell-based pipeline architecture, and parallelized modules. To evaluate the effectiveness of our approach, the proposed architecture is implemented onto a FPGA prototyping board. Results show that the proposed architecture can generate HOG features and detect objects with 40 MHz for SVGA resolution video (800 × 600 pixels) at 72 frames per second (fps). The proposed schemes are easily expandable to HDTV resolution video at 30 fps with 76.2 MHz if a highresolution camera and higher operating frequency are available.
70 [7] T. P. Cao (2008)
Frame Rate [fps]
60
[9] K. Negi (2011)
50 40
[5] H. Hiromoto (2009) [6] R. Kadota (2009)
Our target
30 20 10
[8] Y.Yazawa (2011)
[3] S.Bauer (2009) [4] S.Bauer (2010)
[2] Li. Zhang (2008)
0
Index Terms— HOG, FPGA, VLSI, HDTV
QVGA (320x240)
SVGA VGA (640x480) (800x600)
HDTV (1920x1080)
Image resolution
1. INTRODUCTION
Fig. 1. Previous works of HOG feature extraction processor.
Real-time object detection has been a key technology in various application domains such as surveillance, automotive systems, and robotics. An important algorithm used in object detection systems, Histogram of Oriented Gradients (HOG) [1], has robustness to change of illumination and attains high computational accuracy in detection of variously textured objects. Recent high-performance general-purpose processors can achieve real-time object detection at a heavy computational cost. However, the processor requires high power consumption and is therefore unsuitable for mobile systems under limited battery conditions. Consequently, a low-power and high-performance HOG feature extraction processor is necessary to widen the range of applications. Figure 1 presents the image resolution versus frame rate for several published descriptions of HOG hardware. Zhang et al. [2] proposed efficient object detection using GPGPU. Some FPGA implementations [3], [5], [6], [8], [9]
Most conventional processors employ a window-based approach. For the window-based approach, a workload of 447.7 GOPS and memory bandwidth of 55 Gbps are required for HDTV resolution because of repetitive computations. The workload and memory bandwidth are greatly reduced by reusing calculated data or adopting efficient computation. However, data reuse causes an increase of memory capacity and circuit area. Consequently, a cooperative design between algorithm and architecture is necessary. To achieve real-time and low-power HOG feature extraction for HDTV resolution video, we propose the following three techniques. • Simplified HOG algorithm with cell-based scanning and simultaneous Support Vector Machine (SVM) calculation for workload reduction. • Cell-based algorithm and architecture for memory bandwidth reduction.
978-0-7695-4856-2/12 $26.00 © 2012 IEEE DOI 10.1109/SiPS.2012.57
197
•
Parallelized architectures for cell histogram generation, histogram normalization, and SVM classification to reduce the necessary cycle count. As described in this paper, details of the simplified HOG algorithm are described in Section 2. The proposed architecture is addressed in Section 3. Then, these are followed by FPGA implementation in Section 4. Section 5 concludes this paper.
Figure 4 portrays the workload analysis of HOG-based object detection. Simplified HOG algorithm with cell-based scanning and simultaneous SVM calculation reduces the workload to 10.6 GOPS, as portrayed in Fig. 4. However, the workload of 10.6 GOPS is still heavy for a processor with low operating frequency. To accommodate the workload in real time, our architecture has parallelized modules for cell histogram generation, histogram normalization, and SVM classification.
2. ALGORITHM 500
Workload [GOPS]
447.7
2.1. Original HOG Algorithm Figure 2 portrays a flow diagram of object detection using the original HOG algorithm [1]. Scanning on the input image is based on detection window. The window is divided into cells, for each cell accumulating a histogram of gradient orientations over the pixels of the cell. For better invariance to illumination, histogram normalization can be done by accumulating a measure of the local histogram energy over blocks and using the results to normalize all cells in the block. The normalized histograms (HOG features) are collected over the detection window. The collected features are fed to a linear SVM for object/nonobject classification.
Linear SVM Classification
(640x480)
HOG
Algorithm
for
HDTV (1920x1080)
Detection window-based approach
Gradient calc. + Magnitude & Orientation calc. + Histogram generation + Histogram normalization + Classification = 0.12 + (2.11 + 1.43) + 0.25 + 0.33 + 6.34 = 10.6 [GOPs]
HDTV (1920x1080)
Our approach
Fig. 4. Workload analysis. 2.3. Cell-based Scanning Method Object detection with the HOG feature is executed by the scanning detection window on an input image, as presented in Fig. 5 left. The detection window size is compliant with original algorithm [1]. When one window is finished, the next window is scanned using an offset of 1 cell. The memory bandwidth is increased by reloading input pixels for the next window. Consequently, extensive data reuse is desirable for memory bandwidth reduction. Figure 5 right shows cell-based scanning approach. HOG feature is extracted from cell-based calculations. No cell overlaps with other cells. Consequently, sharing and reuse of a cell have a great impact on memory bandwidth reduction.
Classification Results
Fig. 2. Original HOG algorithm flow. 2.2. Simplified Implementation
VGA
Gradient calc. + Magnitude & Orientation calc. + Histogram generation + Histogram normalization + Classification = 13.7 + (233 + 158) + 5.6 + 30.3 + 6.34 = 447.7 [GOPs]
Our approach
52.6 10.6
Input Image Histogram Normalization
200
0
Scan next detection window
Cell Histogram Generation
Workload reduction by the simplified algorithm
300
100
Collect HOG feature over detection window
Image Scanning & Gradient Calculation
Detection window-based approach
400
Hardware
Input Image Detection Window (64 × 128 pixels)
Cell-based scanning (Section 2.3) Gradient calculation using CORDIC
Orientation Histogram Generation Approximation of weighted voting for spatial & orientation anti-aliasing
Histogram Normalization
Linear SVM Classification
Newton method with approximated initial value
Simultaneous SVM calculation (Section 2.4)
Window 1 Window 2
Input Image
Cell 1 Cell 2
No cell overlaps
Overlap scanning Window-based scanning
Cell-based scanning
Fig. 5. Image scanning methods.
Input Image Image Scanning & Gradient Calculation
Block (2 × 2 cells)
...
Cell
Cell (8 × 8 pixels)
Raster Scan
...
A simplified HOG algorithm for VLSI implementation is introduced in this subsection. Figure 3 shows a flow diagram of object detection using simplified HOG algorithm. This flow is modified from the original flow using the following five techniques. 1. Cell-based scanning (Section 2.3) 2. Gradient calculation using CORDIC [10] 3. Approximation of weighted voting for spatial and orientation anti-aliasing 4. Newton method with approximated initial value 5. Simultaneous SVM calculation (Section 2.4)
2.4. Simultaneous SVM Calculation
Classification Results
In the window-based approach, HOG features of 105 blocks are collected. Then the features are multiplied by SVM coefficients corresponding to one window. However, the cell-based approach provides partial HOG features after normalization for one block; then the features are
Parameter optimization (Section 2.5)
Fig. 3. Simplified HOG algorithm flow.
198
multiplied by SVM coefficients corresponding to 105 windows. Figure 6 presents simultaneous SVM calculations for cell-based processing. Partial HOG feature belongs to 105 windows maximally and are located at different positions in each window. Partial HOG features are multiplied and accumulated by the SVM coefficients of each window. The accumulation result is stored and reused in the subsequent SVM calculation. Simultaneous SVM calculation is suitable for parallel computing in hardware. SVM Weight 0 SVM Weight 1 (Window 1) (Window 0)
SVM Weight 6 (Window 6)
SVM Weight 7 (Window 7)
...
a Normalized Block Partial HOG feature
15 blocks
2.6. Simulation Results A simulation was conducted using software for object detection to estimate performance and accuracy degradation by the simplified algorithm. The software was produced using Microsoft Visual C++ 2008 Express Edition (Microsoft Corp.) with the INRIA Person Dataset [11], which includes several people in various backgrounds. Figure 8 presents a graph of false positives per window (FPPW) versus the miss rate. The simulation results with the simplified algorithm and the optimized bit width show that the miss rate degradation is 3% at 0.0001 FPPW. The algorithm that was used provides sufficient performance for general-purpose applications.
Window A
Nth feature for Window A
1
Simplified algorithm Original
Partial HOG feature
0.1
7 blocks
Miss Rate
SVM Weight 98 (Window 98)
SVM Weight 104 (Window 104) Window B
Fig. 6. Simultaneous SVM classification.
r tte Be
Mth feature for Window B
0.01
0.001
# of test samples : 12457 (positive : 1132, negative : 11325)
2.5. Parameter Optimization 0.0001 0.00001
Figure 7 shows parameters of each process in object detection using HOG features. In general, a software implementation employs floating-point calculations to provide high accuracy. However, the floating-point unit uses hardware resources to a great degree. Therefore, fixedpoint operation is often used for hardware implementation. The accuracy of fixed-point operation depends on the bit width itself, although the bit width affects the memory capacity and the circuit area. Optimized parameters provide reasonable classification accuracy and minimize hardware costs. Table 1 presents the results of parameter adjustment.
Scan Image & Compute Gradients Gradient Magnitude & Orientation
Orientation Histogram Generation Orientation Histogram
Histogram Normalization
Linear SVM Classification
L2-norm
Weight Vector
Normalization divisor
Classification Result
0.1
1
3. ARCHITECTURE 3.1. Cell-based Pipeline Architecture Figure 9 depicts a block diagram of the cell-based pipeline architecture and external peripherals for a demonstration system detailed in Section 4. The proposed architecture comprises a controller, a cell histogram generation module, a histogram normalization module, an SVM classification module, SRAMs for several image data, a CPU interface, and a memory interface. The HOG feature extraction processor is controlled by an external CPU, and the input grayscale image is loaded to a cell-line buffer from an external SRAM via a memory interface. The CPU receives a detection result from HOG feature extraction processor; then it draws the result on an LCD display.
Classification Results
Table 1. Optimized bit width Gradient magnitude Gradient orientation Orientation histogram 1st L2-norm 1st normalization divisor 2nd L2-norm 2nd normalization divisor HOG feature SVM coefficient Classification buffer
0.01
Fig. 8. Accuracy degradation by the simplified algorithm.
Fig. 7. HOG algorithm parameters.
Parameter
0.001
FPPW
HOG feature
Input Image
0.0001
Bit width [bit] Sign Integer Fractional 1 9 0 1 3 3 0 11 0 0 25 0 0 0 11 0 0 14 0 3 7 0 0 4 1 3 7 1 4 8
199
Detection-window annotated frame LCD display
Detection-window drawing module
External Input image SDRAM
LCD controller
SDRAM controller
Grayscale conversion
External Gray image SRAM
CPU 32bit
Detection result CPU BUS
Input image
32bit
MEMORY BUS CPU I/F
SRAM for intermediate cell histogram
MEMORY I/F
Input image
Start signal
for HDTV resolution requires memory bandwidth of 55 Gbps. In general, the mobile system under limited battery conditions adopts a lower operating frequency. Therefore, the memory bandwidth must be reduced as low as possible for low-power and real-time operation. Our approach adopts a cell-based algorithm and architecture to reduce the memory bandwidth to 0.499 Gbps.
Detection result SRAM controller
Cell histogram generation
Cell line buffer
Control signal
Detection result
SRAM for intermediate results
Cell histogram
HOG features
Histogram normalization
SVM SRAM for SVM classification coefficients Classification result
Memory bandwidth [Gbps]
Camera
Control signal Control signal
Controller
Control signal
HOG feature extraction processor
Fig. 9. HOG feature extraction architecture. Our architecture adopts a cell-based pipeline flow, as presented in Fig. 10. Figure 10 above shows a relation between cells, blocks, windows, and a frame. One cell contains 8 × 8 pixels. One block is composed of 2 × 2 cells. One window is made up of 7 × 15 blocks. Each block overlaps with neighboring blocks. Cell-based pipeline processing is conducted as follows: 1. A cell histogram is generated with cell-based scanning. 2. When the process described above reaches the block level, a block-level cell histogram is normalized; then the block-level HOG feature is extracted. 3. Block-level HOG features and SVM coefficients corresponding to each window are multiplied and accumulated. 4. An accumulation result of window level is compared with the SVM threshold. Then the detection result is obtained. The cell-based pipeline architecture greatly reduces the memory bandwidth because it prevents reloading of input pixels in different detection windows.
60
55
Detection window-based approach
50
# of Windows/frame × Window size × Color depth × fps = 27960 × (64 × 128) × 8 ×30 = 55 [Gbps]
40 30 20 10 0
Our approach
6.5
0.499
VGA (640x480)
HDTV (1920x1080)
Detection window-based approach
HDTV (1920x1080)
Our approach
Buffer size × # of lines (Image height) × Color depth × fps = ((1920+2) × (1080+2) )× 8 × 30 = 0.499 [Gbps]
Fig. 11. Memory bandwidth analysis. 3.2. Cell Histogram Generation In cell histogram generation, a magnitude and an orientation of a pixel gradient are calculated; then a weighted magnitude is voted into a bin corresponding to its orientation. Figure 12 presents the architecture for cell histogram generation. Four-way architecture is adopted because one cell is commonly used for four blocks maximally. One processing element (PE) executes weighted voting and binning to generate a histogram of one cell. Spatial anti-aliasing is conducted in four processing elements corresponding to one block. Magnitude SHIFTER
Cell line buffer
Weighted voting
Weighted magnitude 2-way gradient calculation
ADD
ADD
ADD ...
Block 0 Block 1 Cell 1 Cell 241 Cell 481
Cell 2 Cell 242 Cell 482
...
Block 238
......
Cell 7 Cell 247 Cell 487
Window 232 Cell 239
Block 1
Block 0
PE
PE
PE
PE
PE
Four cell histogram generation (Block 3)
PE
PE
Block 3
PE
Block 2
Bin 8
PE
Cell
Four cell histogram generation (Block 2)
Cell
REG
Cell histogram
Four cell histogram generation (Block 1)
Cell
PE
Cell
PE
Cell of interest
PE
Cell
PE
Cell 32399
Cell
PE
...... Cell 32160
Cell
PE
Cell 3607
Cell
REG
Bin 0 Bin 1 Concatenate
2-way orientation anti-aliasing
PE
...
REG
2-way CORDIC
Four cell histogram generation (Block 0)
Cell Cell Cell 3600 3601 3602
Frame (1920 × 1080 pixels)
......
......
1080 pixels
Cell 0 Cell 240 Cell 480
Window 0
1920 pixels Cell histogram generation Histogram normalization SVM classification
Cell 0
Cell 1
...
Cell 241
Cell Cell Cell 242 243 244 Block Block Block 0 1 2 Block Block 0 1
... ...
Cell 3607 Block 3351
......
Cell 3608 Block 3352 Block 3351
Cell 3609 Block 3353 Block 3352
Cell 3610 Block 3354 Block 3353
Controller
Cell 3611 Block 3355 Block 3354
Dataflow
Initial value = 0
Initial load
Fig. 12. Block diagram and processing flow of cell histogram generation.
WindowWindow 0 1
Result output
SRAM for intermediate cell histogram
Time
3.3. Histogram Normalization
Fig. 10. Cell-based pipeline flow.
Figure 13 shows the architecture for histogram normalization. The architecture consists of two stages to
Figure 11 portrays the memory bandwidth analysis of HOG-based object detection. The window-based approach
200
implement L2-Hys normalization [12]. The first stage includes four Cell MAC modules, an approximation module, a Newton method module, and a threshold module. The second stage comprises four Cell MAC modules and a Newton method module. In the first stage, Cell MAC modules first calculate the sum of squares of input cell histogram. Secondly, an initial value for Newton method is approximated to bit shift operation. Thirdly, Newton method calculates an inverse number of square roots. Furthermore, then Cell MAC modules normalize a cell histogram. Finally, a normalized cell histogram is compared with a threshold and outputted to the second stage. In the second stage, the sum of squares and an inverse number of square roots is calculated as in the first stage. Furthermore, then Cell MAC modules normalize a cell histogram and extract 36-dimension HOG features.
Initial value = 0
12 cycles
Threshold (0.2)
Approximation to shift operation Newton method (Three times iterations)
MUX
MUX
REG
REG MUL
Normalize coefficient ADD Second stage
REG Cell MAC
Concatenate
Cell MAC
17 cycles
Cell MAC
Cell MAC
Classification core 14
Classification core 13
Intermediate result from neighbor MAC HOG feature 99th coefficients
MAC (99th SVM coefficients)
Intermediate result from neighbor MAC HOG feature 104th coefficients
MAC (104th SVM coefficients)
COMPARATOR
Classification core
Classification result
MAC result corresponding to one detection window
The number of cycle counts was estimated using a VerilogHDL simulator. The proposed architecture was compared with architecture without parallelization and without a pipeline. Estimation results are presented in Fig. 15, which demonstrates the superiority of the proposed architecture for HDTV resolution. The parallelization in the cell histogram generation and histogram normalization contributes to reduction of the cycle counts. Introduction of the proposed simultaneous SVM calculation architecture enables the reuse of intermediate results, allowing the cycle count reduction. Results show that the number of cycle counts in cell histogram generation, histogram normalization, and SVM classification are reduced by 85%, 65%, and 99%, respectively, compared with the number of cycle counts of architecture without parallelization and without pipeline. In the proposed architecture, the overall process requires 2.54 × 106 cycles per frame. Therefore, it is inferred that the proposed architecture can accommodate HDTV resolution video at 30 fps with 76.2 MHz.
LINE_BUFFER
Cell MAC
SRAM for intermediate results
MAC (98th SVM coefficients)
3.5. Performance Evaluation
Cell Normalize histogram coefficient
Cell MAC
4 cycles
Cell MAC
...
Intermediate result HOG feature 98th coefficients
Fig. 14. Block diagram of SVM classification module.
First stage
Cell MAC
Classification core 1
Classification core 0
HOG feature
...
Cell histogram generation 36-dimension cell histogram
SRAM for SVM coefficients
Controller
Cell MAC
Newton method (Four times iterations) Normalize coefficient
36-dimension HOG feature SVM classification
# of cycle counts / frame [106 cycles / frame] 5
Fig. 13. Block diagram of histogram normalization.
8.8 7.1
Architecture w/o parallelization and w/o pipeline
3.4. SVM Classification In SVM classification, extracted features and SVM coefficients are multiplied and accumulated until the operations reach window level. Then the accumulation result is compared with an SVM threshold to judge whether the window includes a target object. Figure 14 shows a block diagram for simultaneous SVM classification. This architecture includes 15 classification cores. One classification core manages MAC operations of 7 blocks. Consequently, the architecture is able to handle 105 blocks corresponding to one detection window. Sufficient parallelism reduces the required cycle count to manage the workload of 10.6 GOPS.
Proposed architecture
10
105
110
120
Image resolution : HDTV
107 122.9 1.30 2.47 1.02 2.54
Cell histogram generation Histogram normalization SVM classification Overall process
Fig. 15. Reduction of cycle count. 4. FPGA IMPLEMENTATION To evaluate the effectiveness of our approach, we implemented the proposed architecture onto a prototyping board (tPad Multimedia Development Kit; Terasic Technologies Inc.). The board has DE2-115 with Cyclone IV EP4CE115 (Altera Corp.), a 5-megapixel digital image sensor module, and an 8-inch LCD touch screen module. Figure 16 portrays a demonstration system of real-time object detection to verify the proposed technique.
201
6. ACKNOWLEDGMENTS
Resource utilization and comparison to conventional FPGA implementations are presented in Table 2. Our FPGA implementation can generate HOG features and detect objects with 40 MHz for SVGA resolution video at 72 fps. The FPGA resource utilizations are as follows: 34,403 LEs, 68 embedded multipliers, and 0.34 Mbit block RAMs. Our implementation shows the best performance with minimum memory usage and minimum operating frequency. If a high-resolution camera, 0.63 Mbit block RAMs, and the operating frequency of 76.2 MHz are available, then the proposed schemes are readily expandable to HDTV resolution video at 30 fps.
This work was supported by the VLSI Design and Education Center (VDEC), The University of Tokyo, in collaboration with Cadence Design Systems Inc. and Synopsys Inc. 7. REFERENCES [1] N. Dalal, et al., “Histograms of Oriented Gradients for Human Detection,” in Proceedings of the 2005 International Conference on Computer Vision and Pattern Recognition, vol. 2. Washington, DC, USA: IEEE Computer Society, 2005, pp. 886–893.
5-megapixel camera 8-inch LCD touch screen
[2] Li Zhang, et al., “Efficient Scan-Window Based Object Detection using GPGPU,” IEEE, CVPRW, 2008. [3] S. Bauer, et al., “FPGA Implementation of a HOGbased Pedestrian Recognition System,” MPC-Workshop, July, 2009.
Detection window
Fig. 16. Architecture verification by FPGA implementation.
[4] S. Bauer, et al., “FPGA-GPU Architecture for Kernel SVM Pedestrian Detection,” IEEE CVPRW 2010.
Table 2. Resource utilization
[5] M. Hiromoto, et al., “Hardware Architecture for HighAccuracy Real-Time Pedestrian Detection with CoHOG Features,” IEEE ICCVW 2009.
[3] [5] [6] [7] [8] [9] Ours Cyclone IV FPGA Spartan 3 Virtex-5 Stratix II Virtex-4 Cyclone III Virtex-5 # of LUTs 42,435 28,495 37,940 8,921 34,838 17,383 34,403 # of registers N/A 5,980 66,990 4,221 22,612 2,181 23,247 68 # of DSP blocks 18 2 120 3 N/A N/A Working memory (Mbits) 1.08 2.196 N/A 1.584 2.094 1.296 0.34 0.63 Resolution 800×600 320×240 640×480 752×480 640×480 640×480 800×600 1920×1080 Frame rate (fps) 20 38 30 60 20 62.5 72 30 Operating frequency (MHz) 63 167 127.49 N/A 70 44.85 40 76.2 Image scanning method Window-based Cell-based in HOG feature extraction Cell-based Image scanning method Window-based in classification
[6] R. Kadota, et al., “Hardware Architecture for HOG Feature Extraction,” in Proceedings of the 2009 International Conference on Intelligent Information Hiding and Multimedia Signal Processing. Washington, DC, USA: IEEE Computer Society, 2009, pp. 1330–1333.
5. CONCLUSION
[7] T. P. Cao, et al., “Real-Time Vision-Based Stop Sign Detection System on FPGA,” in Proceedings of Digital Image Computing: Techniques and Applications. Los Alamitos, CA, USA: IEEE Computer Society, 2008, pp. 465–471.
This paper presents a proposal of a novel architecture of real-time HOG feature extraction for HDTV resolution video. The proposed scheme has a simplified HOG algorithm with cell-based scanning, simultaneous SVM calculation, cell-based pipeline architecture, and parallelized modules. The simplified algorithm contributes to reduction of the workload from 447.7 GOPS to 10.6 GOPS with 3% accuracy degradation. The cell-based algorithm and pipeline architecture provide memory bandwidth of 0.499 Gbps at HDTV resolution. The memory bandwidth of 0.499 Gbps can be handled by a 32-bit memory bus with reasonably low operating frequency. Parallelized modules greatly accelerate HOG feature extraction and object detection. The proposed architecture on FPGA prototyping board shows the best performance with minimum memory usage and minimum operating frequency, compared with the performance of conventional processors. The proposed schemes provide expandability to HDTV resolution video (1920 × 1080 pixels) at 30 fps at 76.2 MHz.
[8] Y. Yazawa, et al., “FPGA Hardware with TargetReconfigurable Object Detector by Joint-HOG,” in Proceeding of SSII. Yokohama, Japan, 2011. [9] K. Negi, et al., “Deep pipelined one-chip FPGA implementation of a real-time image-based human detection algorithm,” IEEE FPT 2011. [10] J. E. Volder, “The CORDIC Trigonometric Computing Technique,” IRE Trans. Electron. Comput. EC-8:330-334, 1959. [11] INRIA Person Dataset. http://pascal.inrialpes.fr/data/human/ [12] D. G. Lowe, “Distinctive image features from scale invariant keypoints,” International Journal of Computer Vision, Vol.60, No.2, pp.91-110, 2004.
202