Intel Perceptual Computing SDK Samples Manual

INTEL® PERCEPTUAL COMPUTING SDK
Samples
Version 1.0

LEGAL DISCLAIMER

THIS DOCUMENT CONTAINS INFORMATION ON PRODUCTS IN THE DESIGN PHASE OF DEVELOPMENT.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.

INTEL MAY MAKE CHANGES TO SPECIFICATIONS AND PRODUCT DESCRIPTIONS AT ANY TIME, WITHOUT NOTICE. DESIGNERS MUST NOT RELY ON THE ABSENCE OR CHARACTERISTICS OF ANY FEATURES OR INSTRUCTIONS MARKED "RESERVED" OR "UNDEFINED." INTEL RESERVES THESE FOR FUTURE DEFINITION AND SHALL HAVE NO RESPONSIBILITY WHATSOEVER FOR CONFLICTS OR INCOMPATIBILITIES ARISING FROM FUTURE CHANGES TO THEM. THE INFORMATION HERE IS SUBJECT TO CHANGE WITHOUT NOTICE. DO NOT FINALIZE A DESIGN WITH THIS INFORMATION.

THE PRODUCTS DESCRIBED IN THIS DOCUMENT MAY CONTAIN DESIGN DEFECTS OR ERRORS KNOWN AS ERRATA WHICH MAY CAUSE THE PRODUCT TO DEVIATE FROM PUBLISHED SPECIFICATIONS. CURRENT CHARACTERIZED ERRATA ARE AVAILABLE ON REQUEST. CONTACT YOUR LOCAL INTEL SALES OFFICE OR YOUR DISTRIBUTOR TO OBTAIN THE LATEST SPECIFICATIONS AND BEFORE PLACING YOUR PRODUCT ORDER.

COPIES OF DOCUMENTS WHICH HAVE AN ORDER NUMBER AND ARE REFERENCED IN THIS DOCUMENT, OR OTHER INTEL LITERATURE, MAY BE OBTAINED BY CALLING 1-800-548-4725, OR BY VISITING INTEL'S WEB SITE HTTP://WWW.INTEL.COM.

ANY SOFTWARE SOURCE CODE REPRINTED IN THIS DOCUMENT IS FURNISHED UNDER A SOFTWARE LICENSE AND MAY ONLY BE USED OR COPIED IN ACCORDANCE WITH THE TERMS OF THAT LICENSE.

INTEL, THE INTEL LOGO, INTEL CORE, INTEL MEDIA SOFTWARE DEVELOPMENT KIT (INTEL MEDIA SDK) ARE TRADEMARKS OR REGISTERED TRADEMARKS OF INTEL CORPORATION OR ITS SUBSIDIARIES IN THE UNITED STATES AND OTHER COUNTRIES.

MPEG IS AN INTERNATIONAL STANDARD FOR VIDEO COMPRESSION/DECOMPRESSION PROMOTED BY ISO. IMPLEMENTATIONS OF MPEG CODECS, OR MPEG ENABLED PLATFORMS MAY REQUIRE LICENSES FROM VARIOUS ENTITIES, INCLUDING INTEL CORPORATION.

*OTHER NAMES AND BRANDS MAY BE CLAIMED AS THE PROPERTY OF OTHERS.

COPYRIGHT © 2011-2013, INTEL CORPORATION. ALL RIGHTS RESERVED.

SDK Samples | Intel Corporation Version 1.0

Optimization Notice

Intel compilers, associated libraries and associated development tools may include or utilize options that optimize for instruction sets that are available in both Intel and non-Intel microprocessors (for example SIMD instruction sets), but do not optimize equally for non-Intel microprocessors. In addition, certain compiler options for Intel compilers, including some that are not specific to Intel micro-architecture, are reserved for Intel microprocessors. For a detailed description of Intel compiler options, including the instruction sets and specific microprocessors they implicate, please refer to the “Intel Compiler User and Reference Guides” under “Compiler Options.” Many library routines that are part of Intel compiler products are more highly optimized for Intel microprocessors than for other microprocessors.
While the compilers and libraries in Intel compiler products offer optimizations for both Intel and Intel-compatible microprocessors, depending on the options you select, your code and other factors, you likely will get extra performance on Intel microprocessors.

Intel compilers, associated libraries and associated development tools may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.

While Intel believes our compilers and libraries are excellent choices to assist in obtaining the best performance on Intel and non-Intel microprocessors, Intel recommends that you evaluate other compilers and libraries to determine which best meet your requirements. We hope to win your business by striving to offer the best performance of any compiler or library; please let us know if you find we do not.

Notice revision #20110307

Table of Contents

SDK Samples
    List of Samples and Tools
    Building Samples
    Sample: attribute_detection
    Sample: audio_recorder
    Sample: camera_uvmap
    Sample: camera_viewer
    Sample: depth_smoothing
    Sample: face_detection
    Sample: face_recognition
    Sample: gesture_viewer
    Sample: gesture_viewer.cs
    Sample: gesture_viewer_simple
    Sample: gesture_viewer_simple.cs
    Sample: landmark_detection
    Sample: simple_module
    Sample: voice_recognition
    Sample: voice_recognition.cs
    Sample: voice_synthesis
    Sample: voice_synthesis.cs
    Tool: camera_info
    Tool: capture_viewer
    Tool: sdk_info

SDK Samples

The Intel® Perceptual Computing SDK is a library of pattern detection and recognition algorithm implementations exposed through standardized interfaces. This document describes the SDK samples that demonstrate how to use the SDK APIs.

List of Samples and Tools

The following samples show how to use the SDK APIs. The samples with a .cs suffix are in C#.

Category             Sample                     Description
Raw Data Processing  camera_viewer              The sample demonstrates how to capture color and depth images from the camera device and render them on the screen.
                     camera_uvmap               The sample shows how to map depth pixel coordinates to color pixel coordinates.
                     depth_smoothing            The sample shows how to smooth raw depth data for more stable depth information.
                     audio_recorder             The sample demonstrates how to record audio data to WAVE files.
SDK Interface        attribute_detection        These samples show how to use the face analysis interface.
                     face_detection
                     landmark_detection
                     face_recognition
                     gesture_viewer             These samples show finger tracking, pose/gesture recognition, and event notification.
                     gesture_viewer.cs
                     gesture_viewer_simple
                     gesture_viewer_simple.cs
                     voice_recognition          The samples show how to use the voice recognition interface for voice command and control and dictation.
                     voice_recognition.cs
                     voice_synthesis            The samples show how to use the voice synthesis interface.
                     voice_synthesis.cs
Module Development   simple_module              The sample illustrates how to develop an SDK module.

The SDK provides the following tools for troubleshooting purposes:

Sample               Description
capture_viewer       The tool visualizes any color, depth, and audio streams from the input devices.
camera_info          The tool shows essential camera information for maintenance purposes.
sdk_info             The tool shows essential SDK setup information for troubleshooting purposes.

The SDK provides the following framework and game engine samples:

Sample               Description
Hellounity           The sample shows how to use the SDK with the Unity* game engine. See $(PCSDK_DIR)\framework\Unity\hellounity\README.txt for details.
Helloprocessing      The sample shows how to use the SDK with the Processing* framework. See $(PCSDK_DIR)\framework\Processing\README.txt for details.
ofxPCSDK             The sample shows how to use the SDK with openFrameworks*. See $(PCSDK_DIR)\framework\openFrameworks\README.txt for details.

Building Samples

Most samples come with source code. They can be built with Microsoft* Visual Studio* 2008 or Microsoft Visual Studio 2010. The building steps are as follows:

- Click on the sample solution file $(PCSDK_DIR)\sample\\.sln. For example, the camera_viewer sample's solution file is $(PCSDK_DIR)\sample\camera_viewer\camera_viewer.sln.
- Because the sample is under the privileged directory C:\Program Files, Microsoft Visual Studio prompts for elevated privileges. Accept the prompt to continue. Alternatively, you can copy the sample directory to any non-privileged location.
- The sample solution file is for Microsoft Visual Studio 2008. Microsoft Visual Studio 2010 runs a conversion process when it opens the file.
- Click Build > Rebuild to build the sample.

The prebuilt sample binaries are under $(PCSDK_DIR)/bin/$(PlatformName).

Sample: attribute_detection

The attribute_detection sample demonstrates how to use the SDK face analysis interface for attribute detection. The sample supports the following command line options:

    attribute_detection [-sdname ] [-nframes ] [-file ] [-record -file ] [-load ]

    -sdname     Specify an input device name. The default value is DepthSense Device 325.
    -nframes    Specify the maximum number of frames to render before the sample exits.
    -file       Specify a file name for recording (used together with -record) or playback (used alone).
    -record     Enable the recording mode. Use together with -file to specify the recording file name.
    -load       Load a specific input SDK module into the SDK session.

The sample displays the color image from the input device. If a face is detected, the sample draws a rectangle around the face, labels it with the face identifier (which uniquely identifies the face), and shows the detected face attributes, as illustrated in Figure 1. The attributes detected are age group, gender, smile, and blink.

Figure 1: Sample attribute_detection Render Window

Sample: audio_recorder

The audio_recorder sample records audio data from any microphone device to a .wav file. The sample supports the following command line options:

    audio_recorder [-nframes ] [-sdname ] [-ml ] [-file ]

    -sdname     Specify an input device name. The default value is DepthSense Device 325.
    -nframes    Specify the maximum number of frames to record before the sample exits.
    -ml         Specify the color picture resolution, for example, 640x480.
    -file       Specify the output file name. The default is Default_DSAudio.wav.

Use the -file option to specify an output location other than the SDK installation directory, because the SDK installation directory is privileged and not writable.

Example:

    audio_recorder -nframes 200 -nchannels 1 -smprate 48000 -sdname "DepthSense Device 325" -file c:\temp\temp.wav

Sample: camera_uvmap

The camera_uvmap sample shows how to retrieve synchronized color and depth pictures and map coordinates between them. The sample supports the following command line options:

    camera_uvmap [-csize ] [-dsize ]

    -csize      Specify the color picture resolution, for example, 640x480.
    -dsize      Specify the depth picture resolution, for example, 640x480.

The sample calculates the color pixel coordinates for each depth pixel and draws a red dot on the color image, as illustrated in Figure 2. The application developer can click a specific location on the depth image to cross-check the corresponding coordinates on the color image.
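
The depth-to-color lookup described above can be sketched in a few lines of C++. This is a minimal illustration and not the sample's source code: it assumes the UV map holds one normalized (u, v) pair per depth pixel in row-major order, and the names UV, Pixel, and depthToColor are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout (an assumption, not taken from the SDK headers):
// one normalized (u, v) pair per depth pixel, stored row by row.
struct UV { float u; float v; };
struct Pixel { int x; int y; };

// Map depth pixel (dx, dy) to a color pixel. Returns false when the depth
// pixel is out of range or has no valid color coordinate.
bool depthToColor(const std::vector<UV>& uvmap,
                  int depthWidth, int depthHeight,
                  int colorWidth, int colorHeight,
                  int dx, int dy, Pixel& out) {
    if (dx < 0 || dy < 0 || dx >= depthWidth || dy >= depthHeight) return false;
    const std::size_t index = static_cast<std::size_t>(dy) * depthWidth + dx;
    if (index >= uvmap.size()) return false;
    const UV& uv = uvmap[index];
    // Entries outside [0, 1) are treated as "no corresponding color pixel".
    if (uv.u < 0.0f || uv.u >= 1.0f || uv.v < 0.0f || uv.v >= 1.0f) return false;
    out.x = static_cast<int>(uv.u * colorWidth);   // scale to color columns
    out.y = static_cast<int>(uv.v * colorHeight);  // scale to color rows
    return true;
}
```

A caller would run this for every depth pixel and plot a dot at each returned color coordinate, which is the effect the render window shows.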

Figure 2: Sample camera_uvmap Render Window

Sample: camera_viewer

The camera_viewer sample renders color and/or depth images from a camera device. The sample supports the following command line options:

    camera_viewer [-sdname ] [-nframes ] [-csize ] [-dsize ] [-file ] [-record -file ] [-load ]

    -sdname     Specify an input device name. The default value is DepthSense Device 325.
    -nframes    Specify the maximum number of frames to render before the sample exits.
    -csize      Specify the color picture resolution, for example, 640x480.
    -dsize      Specify the depth picture resolution, for example, 640x480.
    -file       Specify a file name for recording (used together with -record) or playback (used alone).
    -record     Enable the recording mode. Use together with -file to specify the recording file name.
    -load       Load a specific input SDK module into the SDK session.

The sample searches the specified input device for the color and/or depth streams and renders them for the specified number of frames, as illustrated in Figure 3.

Figure 3: Camera_Viewer Render Windows

Sample: depth_smoothing

The depth_smoothing sample shows how to access the raw depth stream without any filtering. The sample displays the raw depth stream and the filtered depth stream side by side. Figure 4 shows the depth image after filtering.

Figure 4: Sample depth_smoothing Render Window

Sample: face_detection

The face_detection sample demonstrates how to use the SDK face analysis interface for face detection. The sample supports the following command line options:

    face_detection [-sdname ] [-nframes ] [-file ] [-record -file ] [-load ]

    -sdname     Specify an input device name. The default value is DepthSense Device 325.
    -nframes    Specify the maximum number of frames to render before the sample exits.
    -file       Specify a file name for recording (used together with -record) or playback (used alone).
    -record     Enable the recording mode. Use together with -file to specify the recording file name.
    -load       Load a specific input SDK module into the SDK session.

The sample displays the color image from the input device. If a face is detected, the sample draws a rectangle around the face and shows the face identifier, which uniquely identifies the face, as illustrated in Figure 5. If face tracking is lost, the sample increments the face identifier for the next detected face. The sample stops either after the specified number of frames is processed or when the rendering window is closed.

Figure 5: Sample face_detection Render Window

Sample: face_recognition

The face_recognition sample demonstrates how to use the SDK face analysis interface for face recognition. The sample supports the following command line options:

    face_recognition [-sdname ] [-nframes ] [-file ] [-record -file ] [-load ]

    -sdname     Specify an input device name. The default value is DepthSense Device 325.
    -nframes    Specify the maximum number of frames to render before the sample exits.
    -file       Specify a file name for recording (used together with -record) or playback (used alone).
    -record     Enable the recording mode. Use together with -file to specify the recording file name.
    -load       Load a specific input SDK module into the SDK session.

The sample displays the color images from the input device. If a person is recognized, the sample shows his/her name on the face rectangle, as illustrated in Figure 6. To add a name to the database, press the 'a' key and then enter the person's name. Adding more than one view of the same person increases recognition robustness.

Figure 6: Face recognition matching the registered profile

Sample: gesture_viewer

The gesture_viewer sample shows how to use the SDK finger tracking interface.
The sample supports the following command line options:

    gesture_viewer [-nframes ] [-file ] [-record -file ] [-load ] [-iuid ]

    -nframes    Specify the maximum number of frames to render before the sample exits.
    -file       Specify a file name for recording (used together with -record) or playback (used alone).
    -record     Enable the recording mode. Use together with -file to specify the recording file name.
    -load       Load a specific input SDK module into the SDK session.
    -iuid       Specify the finger tracking module identifier.

The sample shows two rendering windows: the depth map image and the label map blob image, as illustrated in Figure 7 with more details in Figure 8:

- The sample draws red circles at fingertip positions (the thumb is drawn as a green or blue circle, depending on whether it belongs to the left or right hand). The circle radius represents the fingertip mass volume.
- If there are two hands in the camera view, they are marked in different colors.
- The sample draws the special geometric nodes, SDK labels LABEL_HAND_UPPER, LABEL_HAND_MIDDLE, and LABEL_HAND_LOWER, as blue (or yellow) dots. These nodes are designed for grabbing or turning a virtual object.
- The sample draws the fingertip geometric node, SDK label LABEL_HAND_FINGERTIP, as a yellow (or blue) dot. The fingertip node is designed for pointing to the screen.
- The sample shows a blue bar at the right that is proportional to the hand openness value.
- If any pose or gesture is recognized, the sample shows the pose or gesture icon in the upper-left corner of the depth window.

Figure 7: Sample gesture_viewer Depth and Labelmap Windows

Figure 8: Sample gesture_viewer Depth Windows

Sample: gesture_viewer.cs

The gesture_viewer.cs sample is the simplified C# version of the gesture_viewer sample. As shown in Figure 9, the sample prints out the tracked palm center positions and the time stamps.

Figure 9: Sample gesture_viewer.cs Output

Sample: gesture_viewer_simple

The gesture_viewer_simple sample implements the same functionality as the gesture_viewer sample using the UtilPipeline interface. For simplicity, the sample does not draw the label map image, as illustrated in Figure 10.

Figure 10: Sample gesture_viewer_simple Render Window

Sample: gesture_viewer_simple.cs

The gesture_viewer_simple.cs sample is the C# version of the gesture_viewer_simple sample. Figure 11 shows the sample output.

Figure 11: Sample gesture_viewer_simple.cs Output

Sample: landmark_detection

The landmark_detection sample implements facial landmark detection using the UtilPipeline interface. Click the sample to run it. There are no command line options. The sample draws dots on recognized facial feature points (left and right eye corners and mouth corners), as illustrated in Figure 12.

Figure 12: Sample landmark_detection Render Window

Sample: simple_module

The simple_module sample shows how to extend the SDK functionality to perform simple math. The sample extends the SDK interfaces with the arithmetic reciprocal operation, that is, 1/N, where N is the input value. The sample extends three interfaces: a synchronous function, an asynchronous function, and an asynchronous function with power management.

The sample does not take any input parameters. Run it from the command prompt window. The reciprocal of 0.3 is shown in Figure 13.

Figure 13: Sample simple_module Output

Sample: voice_recognition

The voice_recognition sample demonstrates how to use the voice recognition module interface for voice command and control and dictation. The sample supports the following command line options:

    voice_recognition [-iuid ] [-file ] [-grammar ] [-sdname ] [-eos ] [-realtime on|off] [-nframes ]

    -iuid       Specify the voice recognition module identifier.
    -file       Specify a .WAV file name as input. The default is to obtain input from the microphone.
    -grammar    Specify a string of comma-delimited words as the command list and enable the command and control mode. For example, the command list can be one,two,three,four. The default is to enable the dictation mode.
    -sdname     Specify the input device name.
    -eos        Specify the silence interval between phrases in milliseconds. The default is 200 milliseconds. For faster speech, choose a smaller value.
    -realtime   If enabled, data is read from the file at the same rate as input data arrives from the microphone device.
    -nframes    Specify the maximum number of audio frames to process before the sample exits.

The sample prints dots to indicate that it is working. If any commands or dictation phrases are recognized, the sample prints them. Press any key to exit the sample.

Command-line example:

    voice_recognition.exe -grammar "one,two,three,four,five,six,seven,eight,nine"

Sample: voice_recognition.cs

The voice_recognition.cs sample is a C# sample for voice dictation. There are no command line options. The sample prints dots to indicate that it is listening and prints out the recognized phrase.

Sample: voice_synthesis

The voice_synthesis sample shows how to use the voice synthesis interface for text-to-speech translation. The sample supports the following command line options:

    voice_synthesis [-iuid ] [-file ] [-text ]

    -iuid       Specify the voice synthesis module identifier.
    -file       Specify a .WAV file name as output. The default is tts_output.wav.
    -text       Specify the text string to be synthesized. For example, the string can be "I can fly like a bird".

The sample writes the synthesized speech to the output .wav file. The user can listen to the output by using any media player application.
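
For readers unfamiliar with the .wav container that the audio and voice samples read and write, the following C++ sketch writes 16-bit PCM samples with the standard 44-byte RIFF/WAVE header. It is an illustration under stated assumptions and not the SDK's own writer: the function names writeLE and writeWav are hypothetical, and the audio format actually produced by the samples is not specified in this document.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write the low `bytes` bytes of `value` in little-endian order.
static void writeLE(std::ostream& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.put(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: store interleaved 16-bit PCM samples as a .wav file.
bool writeWav(const std::string& path,
              const std::vector<std::int16_t>& samples,
              std::uint32_t sampleRate, std::uint16_t channels) {
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) return false;
    const std::uint32_t dataBytes = static_cast<std::uint32_t>(samples.size() * 2);
    const std::uint32_t blockAlign = static_cast<std::uint32_t>(channels) * 2;
    out.write("RIFF", 4);
    writeLE(out, 36 + dataBytes, 4);          // size of everything after this field
    out.write("WAVE", 4);
    out.write("fmt ", 4);
    writeLE(out, 16, 4);                      // size of the PCM format chunk
    writeLE(out, 1, 2);                       // audio format 1 = uncompressed PCM
    writeLE(out, channels, 2);
    writeLE(out, sampleRate, 4);
    writeLE(out, sampleRate * blockAlign, 4); // bytes per second
    writeLE(out, blockAlign, 2);              // bytes per sample frame
    writeLE(out, 16, 2);                      // bits per sample
    out.write("data", 4);
    writeLE(out, dataBytes, 4);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        writeLE(out, static_cast<std::uint16_t>(samples[i]), 2);
    }
    return static_cast<bool>(out);
}
```

A file written this way should open in any media player, which is how the voice_synthesis output is meant to be checked.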

Sample: voice_synthesis.cs

The voice_synthesis.cs sample shows how to use the voice synthesis interface for text-to-speech translation in C#. The sample has no command line options. It renders a fixed sentence: "I am speaking. Is it nice?".

Tool: camera_info

The camera_info tool shows the camera information for the Creative* interactive gesture camera for maintenance purposes, as illustrated in Figure 14. The user can use SaveAs to save the camera information to a file.

Figure 14: Tool camera_info Window

Tool: capture_viewer

The capture_viewer tool visualizes color, depth, vertices, and audio streams from any input device. The tool presents a tree view of devices and their streams and configurations, as illustrated in Figure 15.

Figure 15: Tool capture_viewer Window

The user can select any number of streams by clicking the checkbox in front of each stream. The user can then use Control > Display to play back the selected streams and Control > Stop to stop streaming. The tool displays each stream in a separate window.

The user can use File > Open to add one or more existing recorded files to the playback list, and then select certain streams in the files for playback, as illustrated in Figure 16.

Figure 16: Tool capture_viewer Window with Recorded Files

If the device list changes for any reason, the user can use File > Rescan to rescan the devices.

The following are specific actions available on some streams:

- For vertices data, the developer can press F1/F2/F6 to switch between different views.
- For audio data, the developer can use the mouse wheel to adjust amplitudes.

About frame rate: If the user selects two depth streams on the same DepthSense Device 325 device and their configurations differ only in the frame rate, the tool will render the depth stream using the frame rate of the selected stream.

Tool: sdk_info

The sdk_info tool shows the SDK setup information for troubleshooting purposes, as illustrated in Figure 17. The user can use SaveAs to save the setup information to a file.

Figure 17: Tool sdk_info Window