
HP Autonomy OCR for WorkSite - Data Sheet


Data sheet

HP Autonomy OCR for WorkSite

Find all hidden information across the enterprise.

In today’s enterprise environment, often the most critical information is the 10% of information that has not been indexed. There are thousands of valuable documents such as contracts, signed agreements, court documents, and other scanned content that are not full-text searchable because they were created by processes that do not include character recognition capabilities. Image-only scanned documents can be generated by:

• Ad-hoc desktop scanners that cannot generate text for indexing
• Third parties who provide non-OCRed scanned content as email attachments
• Desktop OCR processes that lack enterprise throughput, failover, and error handling
• Internet research downloads or imports

These image-only documents create a huge challenge, because the information is only visible via navigation and metadata searches in WorkSite; image-only documents are not returned in full-text search result sets because their content has not been indexed by IDOL. Not only does this represent a large, unquantifiable risk, it also leaves many critical documents, a rich source of business information, underutilized.

Powered by HP Autonomy IDOL, the Optical Character Recognition (OCR) module extracts the content from these documents into the IDOL index, making it searchable and enabling your organization to fully leverage this once-lost content.

Low cost of ownership

The plug-in nature of the module allows organizations to leverage existing IDOL infrastructure, eliminating the need for workstations or software on the desktop. Zero desktop footprint also means less maintenance for IT overall.

Powerful server-side processing

Installed as a back-end service, the OCR module performs two important functions:

• Back file OCR: The OCR module identifies image-only WorkSite documents, generates OCR text, and indexes the content
• OCR all incoming documents: As part of the indexing process, the OCR module continuously extracts text from new and revised WorkSite documents

No more hidden documents

Extracting the content from image files, including email attachments, enables smart business decisions by giving the right users search access to this critical content. Important enterprise knowledge captured in these documents can be found, regardless of how they are searched within WorkSite.

Fast and powerful seamless automation

Because the HP Autonomy OCR module is an IDOL plug-in that leverages the existing IDOL indexing process, no middleware processes are involved, resulting in fast OCR processing. Documents can be made available for searching in minutes. Image files added to WorkSite, and any already present, automatically become searchable as part of the normal IDOL indexing process, without any additional input or work from the end user or IT staff.

Key benefits

• Boosts efficiency by finding hidden content, including email attachments
• Seamless server-side integration with WorkSite; no desktop footprint
• Powered by IDOL with support for over 1000 file types and over 120 languages
• Shares existing IDOL infrastructure

About HP Autonomy

HP Autonomy is a global leader in software that processes human information, or unstructured data, including social media, email, video, audio, text, and web pages. Autonomy’s management and analytic tools for structured information, together with its ability to extract meaning in real time from all forms of information regardless of format, give companies a powerful way to get the most out of their data.

Autonomy’s product portfolio helps power companies through enterprise search analytics, business process management, and OEM operations. Autonomy also offers information governance solutions in areas such as eDiscovery, content management, and compliance, as well as marketing solutions that help companies grow revenue, such as web content management, online marketing optimization, and rich media management. Please visit autonomy.com to find out more.

Accuracy across formats and languages

IDOL’s ability to understand content in over 120 languages gives the HP Autonomy OCR module the power to extract information from practically any document, regardless of its origin or language, with an unparalleled level of accuracy. OCR is performed on all graphic files and documents (PDF, TIFF, JPEG, or GIF) regardless of size; a document can be a single page or a collated set. The OCR process is also performed in place, so document integrity is always maintained.

Get connected

hp.com/go/getconnected
Current HP driver, support, and security alerts delivered directly to your desktop

www.autonomy.com

Copyright © 2013 HP Autonomy. All rights reserved. Other trademarks are registered trademarks and the properties of their respective owners.

20130204_RL_DS_HP_Autonomy_OCR_WorkSite