Online Guide
Adobe® Acrobat® Catalog
• Commands
• Using Acrobat Catalog
• Preparing PDF document collections for indexing
• Building an index
• Distributing and maintaining an index
• Troubleshooting
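Commands

Windows
Index menu:
• New...: To define and build an index
• Open...: To change an index definition
• Build...: To build an already-defined index
• Schedule...: Scheduling automatic builds
• Purge...: Purging and rebuilding an index

Macintosh
File menu:
• New: To define and build an index
• Open: To change an index definition
• Quit
Edit menu:
• Preferences: Setting preferences
Index menu:
• Build: To build an already-defined index
• Schedule: Scheduling automatic builds
• Purge: Purging and rebuilding an index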
The Macintosh version has a Preferences command. Windows preferences are set in the acrocat.ini file.
Using Acrobat Catalog
You use Adobe Acrobat Catalog to build full-text indexes of PDF document collections. A full-text index is a searchable database of all text in a document or set of documents. Readers of the documents you have indexed use the Acrobat Search plug-in for Acrobat Exchange or Acrobat Reader to search them. Acrobat Search appears in Exchange or Reader as the Search command on the Tools menu.
Index building has three phases:
1 Preparing documents for indexing
2 Building the index for the documents
3 Maintaining the index
The third phase is particularly important with dynamic indexes, that is, indexes for constantly changing information such as PDF business documents stored on a network file server.
Preparing PDF document collections for indexing
Before you index a document collection, you need to organize the documents on the disk drive or network server volume. If the documents have chapters or sections, consider breaking them up into smaller files. Consider using the Document Info fields in the documents to make them easier to search. If you are building the index on one platform and it will be used on another, name the documents carefully.
• Organizing a PDF collection and its index
• Separating PDF documents into parts
• Adding information to PDF documents for efficient searching
• Naming PDF documents
Organizing a PDF collection and its index
When you define and build an index, Catalog gives the index-definition file a .pdx extension and creates an index support folder that has the same name as the PDX file and contains nine subfolders. It places the PDX file and the support folder in the same folder.
The simplest organization is to have the index itself—the PDX file and the support folder containing the nine subfolders—in the folder that contains the indexed document collection:
Legal
    leglindx.pdx
    leglindx
    torts
    breaches
(leglindx.pdx and the leglindx support folder are the index for the documents in torts and breaches.)
This structure is the easiest for users to understand and the easiest to move to another drive or server volume.
• If this structure isn’t feasible, any structure in which the index and the indexed documents are in a folder branch that can be moved as a single unit simplifies the move.
• However, almost any organization is possible. There is only one restriction on the Macintosh and three on other platforms. On all platforms, the entire index (both the PDX file and the support folder containing the nine subfolders) must be in a single folder. On all platforms except the Macintosh, two further restrictions apply: the indexed documents must reside on a single disk drive or network server volume, and the index must be on the same drive or volume as the indexed documents.
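The three restrictions above can be checked mechanically before a build. The following is a minimal sketch, not part of Catalog: the function names are hypothetical, the paths are assumed to be Windows-style, and the drive letter or UNC share reported by Python's pathlib stands in for "disk drive or network server volume".

```python
from pathlib import PureWindowsPath


def volume_of(path: str) -> str:
    """Return the drive letter or UNC share of a Windows-style path, lowercased."""
    return PureWindowsPath(path).drive.lower()


def check_index_layout(pdx_file, support_folder, document_folders):
    """Return a list of layout problems; an empty list means all checks pass."""
    problems = []
    pdx = PureWindowsPath(pdx_file)
    support = PureWindowsPath(support_folder)

    # All platforms: the PDX file and the support folder share one folder.
    if pdx.parent != support.parent:
        problems.append("PDX file and support folder are in different folders")

    # The support folder has the same name as the PDX file, minus the extension.
    if support.name.lower() != pdx.stem.lower():
        problems.append("support folder name does not match the PDX file name")

    # Non-Macintosh platforms: documents on one volume, index on that volume.
    volumes = {volume_of(folder) for folder in document_folders}
    if len(volumes) > 1:
        problems.append("documents span more than one drive or volume")
    if volume_of(pdx_file) not in volumes:
        problems.append("index is not on the same drive or volume as the documents")
    return problems


if __name__ == "__main__":
    print(check_index_layout(
        r"C:\Legal\leglindx.pdx",
        r"C:\Legal\leglindx",
        [r"C:\Legal\torts", r"C:\Legal\breaches"],
    ))
```

The example call uses the Legal folder layout shown earlier; any other folder names are placeholders.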
Separating PDF documents into parts
Consider creating a separate PDF file for each chapter or section. When you separate a document into parts and then search it, the search results are more sharply focused. For example, searching a travel book separated in this way for beach would return chapters about locations with beaches and probably give them a high relevance ranking. If the search returned chapters about locations without beaches at all, it would give them a low ranking. See Interpreting relevance ranking in the Acrobat Search Guide for details.
Adding information to PDF documents for efficient searching
When PDF documents are indexed, Acrobat Search users can limit searches to just those documents that contain specific Document Info field values. For example, a search could be limited to just those documents whose author is “Bob Jones” and that list “Status report” as the subject.
For this reason, encourage document publishers who use PDF Writer to enter Document Info field values for all their documents during conversion to PDF. If the documents are already converted, encourage them to use Acrobat Exchange.
The standard Document Info fields are Title, Subject, Author, and Keywords. PDF documents created with Acrobat Distiller already have Author and Title information. See Tips on filling in Document Info fields.
You can also define custom data fields such as Document Type, Document Number, and Document Identifier. But note that such fields appear only in custom versions of Acrobat Exchange and that you need a good understanding of the PDF format to customize Exchange. For details, see Supporting custom Document Info fields.
Tips on filling in Document Info fields
It’s a good idea to standardize usage in the Document Info fields across your organization. For example:
• Always put a descriptive title in the Title field. Even though the filename of the document appears in the Search Results dialog box if the title field is empty, filenames are often not very descriptive.
• Always use the same field for category information. For example, don’t use the Subject field for some documents and the Keywords field for others.
• Always use the same word for the same category. For example, don’t use biology for some documents and life sciences for others.
• You might use the Author field to identify the group responsible for the document. For example, the author of a hiring policy document might be the Human Resources department.
• If documents are identified by part numbers, add the numbers as keywords. For example, add something like doc#=m234 to the Keywords field.
• To categorize documents by type, use the Subject or Keywords field, or both. For example, you might use status report as a Subject value and monthly or weekly as a Keywords field value for a single document.
• If you are publishing a large number of documents, make a table that shows which values are assigned to which documents. While you are developing the index, use the table to maintain consistency. When you publish the index, include the table as part of the documentation.
Supporting custom Document Info fields
To support a custom data field used in a customized version of Acrobat Exchange, you declare the field in the acrocat.ini file (Windows) or in the Catalog Preferences dialog box (Macintosh).
Note: For information on customizing Exchange, see the Acrobat Software Development Kit. (You can find the Kit at http://www.adobe.com/acrobat/moreinfo.) For information on the integer, date, and string data types mentioned in this topic, see Portable Document Format Reference Manual, version 2.
• To define a custom field (Windows)
• To define a custom field (Macintosh)
• Sample custom fields
To define a custom field (Windows)
1 After the line containing [Fields] in the acrocat.ini file, insert a line with the following syntax:
Field0=CustomFieldName,DataType
where CustomFieldName is the name of the field and can be up to 64 characters and where DataType is one of the following:
int for integer fields
date for date fields
str for string fields
For more details, see Sample custom fields.
2 To declare additional fields, insert similar lines. Begin each line with Field1, Field2, Field3, and so on. Do not skip field numbers.
3 Save the acrocat.ini file, and then create a new index-definition (PDX) file; existing PDX files won’t work. The new index definition will contain the custom field definitions.
To define a custom field (Macintosh)
1 Choose Edit > Preferences, and select Custom Fields in the left panel of the Preferences dialog box.
2 Type the name of the custom field in the Field Name text box. (For details on this step and the next, see Sample custom fields.)
3 Select the Field Type option (integer, date, string) from the Field Type menu.
4 Click Add to include the custom field in the scroll box.
5 Click OK to accept all of the changes you have made to the Catalog Preferences dialog box, including Custom Fields. Click Cancel to cancel all of the changes to this dialog box.
Note: You cannot edit a custom field. To change a field, remove it and add it again.
Sample custom fields
If you defined two custom fields, DocumentIdentifier and DocumentType, they might appear as follows in the Windows acrocat.ini file:
Field0=DocumentIdentifier,int
Field1=DocumentType,str
On a Macintosh, custom fields would appear as follows in the Custom Fields group of preferences:
DocumentIdentifier (Integer)
DocumentType (String)
The DocumentIdentifier field takes integer values from 0 to 65,535, and the DocumentType field takes strings from 0 to 256 characters. If you want to use document identifiers with letters and special characters, you must declare the DocumentIdentifier field as a string.
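The rules for Windows field declarations (numbering that starts at Field0 and skips nothing, names of at most 64 characters, and one of the three data types) lend themselves to a quick pre-flight check. The sketch below is an illustration only: the function name is hypothetical, and it assumes you pass in just the lines that follow [Fields].

```python
import re

VALID_TYPES = {"int", "date", "str"}
MAX_NAME_LENGTH = 64

# Matches lines such as "Field0=DocumentIdentifier,int".
FIELD_LINE = re.compile(r"Field(\d+)\s*=\s*([^,]+?)\s*,\s*(\w+)")


def parse_custom_fields(lines):
    """Parse FieldN=Name,Type lines; raise ValueError on a rule violation."""
    fields = []
    for expected_index, line in enumerate(lines):
        match = FIELD_LINE.fullmatch(line.strip())
        if match is None:
            raise ValueError(f"not a field declaration: {line!r}")
        index = int(match.group(1))
        name = match.group(2)
        data_type = match.group(3)
        # Field numbers must run 0, 1, 2, ... with no gaps.
        if index != expected_index:
            raise ValueError(f"expected Field{expected_index}, found Field{index}")
        if len(name) > MAX_NAME_LENGTH:
            raise ValueError(f"field name longer than {MAX_NAME_LENGTH} characters: {name!r}")
        if data_type not in VALID_TYPES:
            raise ValueError(f"unknown data type {data_type!r} for field {name!r}")
        fields.append((name, data_type))
    return fields


if __name__ == "__main__":
    print(parse_custom_fields([
        "Field0=DocumentIdentifier,int",
        "Field1=DocumentType,str",
    ]))
```

Running the check before you create the new PDX file catches a skipped field number early, since existing PDX files cannot pick up corrected definitions.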
Naming PDF documents
When you name PDF documents and build indexes for cross-platform document collections, the safest approach is to observe MS-DOS filenaming conventions. Acrobat Catalog and Acrobat Search use unique document identifiers as well as pathnames to identify indexed documents, so MS-DOS filenames may not be absolutely necessary. However, ambiguities caused when names created for one platform are mapped to names usable on another can slow searches and even prevent documents from being located.
• If you are using the Macintosh version of Catalog to build a cross-platform indexed document collection and you don’t want to change long PDF filenames to MS-DOS filenames, check Make Include/Exclude Folders DOS Compatible in the Index group of preferences before you build your index. If you check this preference, you must use MS-DOS filenaming conventions for the folder names. You do not have to use these conventions for the names of the files that the folders contain.
• If you are using the Macintosh version with OS/2 LAN Server but want to be sure that the indexed files are searchable on all PCs, either configure LAN Server Macintosh (LSM) to enforce MS-DOS filenaming conventions or index only FAT volumes. (HPFS volumes may contain unretrievable long filenames.)
• Do not alternate between using a Windows and a Macintosh version of Catalog when you build or update an index if you are indexing PDF documents with long filenames that will be truncated for Windows use.
• Even for documents that will be searched only by Macintosh users, do not use deeply nested folders or pathnames longer than 256 characters.
• If you are planning to deliver the document collection and index on an ISO 9660-formatted CD-ROM, you should use ISO 9660 filenames. With the Macintosh version of Catalog, check Log Compatibility Warnings in the Logging group of preferences to be warned of noncompliant filenames.
• Avoid using high ANSI characters, such as some non-English characters, in the names of files and folders used for the index or the indexed files. The font used by Catalog does not support character codes 133 through 159.
For information on MS-DOS and ISO 9660 filename conventions, see Naming conventions.
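Several of the naming tips above can be turned into a simple pre-build scan. The sketch below is hypothetical and deliberately conservative: it treats an MS-DOS name as 1 to 8 characters plus an optional 1 to 3 character extension drawn from letters, digits, underscore, and hyphen (a narrower set than MS-DOS actually permits), and it assumes that "ANSI" character codes correspond to Windows code page 1252.

```python
import re

# Conservative 8.3 pattern; the allowed character set is an assumption,
# narrower than the full MS-DOS rules described in Naming conventions.
DOS_NAME = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")
MAX_PATH_LENGTH = 256


def name_warnings(pathname, separator="/"):
    """Return warnings for one pathname; an empty list means no issue found."""
    warnings = []
    if len(pathname) > MAX_PATH_LENGTH:
        warnings.append("pathname longer than 256 characters")
    for part in pathname.split(separator):
        if not part:
            continue
        # Assumed mapping: ANSI codes are byte values in code page 1252.
        ansi_bytes = part.encode("cp1252", errors="replace")
        if any(133 <= byte <= 159 for byte in ansi_bytes):
            warnings.append(f"{part!r}: character code in the range 133-159")
        if DOS_NAME.fullmatch(part) is None:
            warnings.append(f"{part!r}: not an 8.3 name")
    return warnings


if __name__ == "__main__":
    for path in ("legal/torts/case01.pdf", "legal/Quarterly Report.pdf"):
        print(path, name_warnings(path))
```

A scan like this only flags candidates for renaming; the conventions themselves are the ones described in Naming conventions.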
Building an index
Before you can build an index, you need to ensure enough free disk space to accommodate the index and the temporary files created during the build. The index files require 10 to 30% of the space required by the documents being indexed, closer to 10% if the documents contain many graphics. The temporary files require 10 to 30% of the space required by the index files.
• Building an index
• Choosing options for an index
• Setting preferences for Acrobat Catalog
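The disk-space rule of thumb at the start of this section is easy to work through for a given collection. The following sketch simply applies the two 10 to 30% ranges in sequence; the function name is hypothetical, and the result is in whatever unit you pass in.

```python
def estimate_build_space(collection_size):
    """Return (low, high) free space needed for the index plus temporary files.

    Index files: 10 to 30% of the collection size.
    Temporary files: 10 to 30% of the index size.
    Integer arithmetic keeps the result in the caller's unit (for example, MB).
    """
    index_low = collection_size * 10 // 100
    index_high = collection_size * 30 // 100
    temp_low = index_low * 10 // 100
    temp_high = index_high * 30 // 100
    return index_low + temp_low, index_high + temp_high


if __name__ == "__main__":
    low, high = estimate_build_space(500)  # a 500 MB collection
    print(f"Plan for {low} to {high} MB of free space")
```

For a 500 MB collection this gives an index of 50 to 150 MB and temporary files of 5 to 45 MB, so 55 to 195 MB in total. Collections with many graphics should land near the low end.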
Building an index
You have to define an index before Acrobat Catalog can build it. The definition lists the folders containing the documents to be indexed and indicates any changes to the default settings for index options. It should also include an index title and a description of the index. These will be available, and useful, to users of the index. On the Macintosh, you can simplify index building by dragging and dropping the folders rather than listing them.
• To define and build an index
• To build an already-defined index
• To change an index definition
• Drag-and-drop index building (Macintosh)
• Stopping a build
To define and build an index
1 Choose Index > New (Windows) or File > New (Macintosh), and give the index a useful title.
2 Provide useful information about the index (up to 250 characters) in the Index Description text box.
3 For each folder that contains the documents to be indexed, click Add in the Include Directories box and use the Select dialog box to locate and select the folder. If an included folder contains a subfolder with PDF files that you do not want to index, click Add in the Exclude Directories box and select the folder.
To select a folder:
• In Windows, open the folder by double-clicking the folder name; then click OK.
• On a Macintosh, select the folder by clicking the folder name; then click Select folder name.
If you don’t plan to move the index and document collection, you can add folders from multiple servers or disk drives. Before doing so, however, select Allow Indexing On a Separate Drive in the Index group of preferences. (Choose Edit > Preferences to reach the group.)
• To change the default index options for this index, click Options and select the index options you want. You can exclude specific words (stopwords) from the index, exclude numbers, remove some of the user’s search options (Case Sensitive, Sounds Like, Word Stemming), or adapt the index to documents created in Acrobat 1.0 or to CD-ROM use.
4 Click Build to display the Save Index File dialog box. The build begins only after you have saved the index definition (next step).
5 Name the index-definition file, select a folder for it, and click OK (Windows) or Save (Macintosh). For the filename, retain the .pdx extension provided. The pathname of the folder should not contain high ANSI characters (such as some foreign characters) or the slash (/) character.
Acrobat Catalog begins building the index. The folder you select for the PDX file will also contain the index support folder, whose nine subfolders hold the index data files being built.
• In Windows, this folder must be on the disk or network server volume where the documents to be indexed are stored.
• On the Macintosh, if you don’t plan to move the index and documents, you can put the folder on a different disk or network server volume from the disk or volume where the documents to be indexed are stored. Before doing so, select Allow Indexing On a Separate Drive.
As Catalog builds the index, it displays messages that report the progress of the build. You can stop the build at any time by clicking Stop. If the message displayed reports an error, see Catalog error messages for help with correcting it. All messages are saved, with date and time stamps, in a log file.
About Catalog log files
Every time Acrobat Catalog builds or updates an index, it displays messages that report on the progress or failure of the build and writes these messages to a log file. In the log file, each message is date and time stamped. Over time, the file compiles a record of every document that is indexed and records the dates and times when the index is updated. The file is deleted when it reaches a maximum size (1 MB by default).
The log file for an index is created the first time the index is built:
• In Windows, Catalog creates the log file in the same folder as the index-definition (PDX) file and gives it the same name, except that the extension is .log rather than .pdx.
• On the Macintosh, Catalog creates the log file in the Catalog application folder by default. You can use the Logging group of preferences to save the log file in the same folder as the index or in a folder you select. You can also change the name of the log file.
To build an already-defined index
1 Choose Index > Build to open the Select Index File to Build dialog box.
2 Locate and select the PDX file for the index you want to build, and click OK.
Acrobat Catalog builds the index and places it in the selected folder.
Note: On the Macintosh, you can also build an already-defined index by dragging and dropping. For information on building already-defined indexes in batches, see Purging and rebuilding an index.
CATALOG Page 31 Wednesday, September 25, 1996 1:14 PM
To change an index definition
1 Choose File > Open (Macintosh) or Index > Open (Windows).
2 Locate and select the PDX file you want to revise, and click OK.
3 Make your changes in the dialog box called Edit Index Definition (Windows) or Index Definition (Macintosh).
4 If the changes are minor, click Build to rebuild the index. If the changes are major (for example, adding or removing support for search options), skip the build step at this time. Purge and rebuild after you have saved the revised definition.
5 Click Save to save the revised definition.
Drag-and-drop index building (Macintosh)
You can control the details of a drag-and-drop index build by providing the index definition yourself or by altering Drop Folders preferences. Alternatively, you can leave the details to Acrobat Catalog.
To build an index: Drag a folder containing PDF documents to the Catalog application icon. Or drag multiple folders or an entire disk. When you release the mouse button, Catalog begins building the index.
• If a folder contains a PDX file, Catalog uses that definition to index the documents in the folder and in any other folders listed in the definition.
• If a folder does not contain a PDX file, Catalog places a new default index (named index.pdx) in the folder and uses it to index the documents in the folder.
To specify a new name for the PDX file and index support folder: Type a name in the Default Index Name text box of Drop Folders Preferences.
To save the PDX file and index support folder outside of the document folder: Select Outside Dropped Folder from the Save Index menu of Drop Folders Preferences.
To build an index from the contents of only the folder or folders that you just dropped: Check Delete Existing Indexes in Drop Folders Preferences. If you do not select this option, Catalog adds the dropped folder or folders to the Include Directories list of the PDX file, indexes the files that have not yet been indexed, and merges the indexes.
Stopping a build
When a build is in progress, you can stop it at any time. After a build is stopped, Acrobat Catalog is ready to build a new index, update an existing index, or process scheduled builds. When you stop a build, Catalog maintains the partial results of the build. This way, the next time you update the index, the work that has already been done will be preserved.
To stop a build: Click Stop in the Catalog window. When you click Stop, an error message appears in the Message text box and is written to the log file as follows:
Search Engine Message: E3-0024 (VDK): [specific to error]
This message is normal and indicates only that the build was not completed. If necessary, the partial index can be searched.
To restart a stopped build:
1 Select Schedule from the Index menu.
2 Click Start to restart the scheduled builds.
Even if the messages that appear indicate that the build can’t be restarted, give the process some time before stopping it again. It may be successful despite the messages.
Choosing options for an index
• Choosing options for an index
• Excluding words (stopwords) from an index
• Excluding numbers from an index
• Choosing not to support search options
• Optimizing for CD-ROM
• Adding unique document identifiers to 1.0 PDF files
Choosing options for an index
You use the Options dialog box to change Acrobat Catalog defaults for a particular index definition. You can exclude specified terms (stopwords) and numbers, and disable support for Acrobat Search’s Match Case, Sounds Like, and Word Stemming features. If the collection contains PDF files created by version 1.0 of Acrobat PDF Writer or Acrobat Distiller, select the Add IDs to Acrobat 1.0 PDF Files option.
In Windows, the defaults are fixed. You can change them for a particular definition, but not permanently. On the Macintosh, you can change the defaults for most of the options in the Index Defaults group of preferences.
To choose any index option:
1 In the dialog box called New Index Definition or Edit Index Definition (Windows) or Index Definition (Macintosh), click Options. You display the dialog box by choosing New or Open from the Index menu (Windows) or File menu (Macintosh).
2 Make additions or changes in the Options dialog box.
3 Click OK. The changes apply the next time the index is built or updated.
Excluding words (stopwords) from an index
You can exclude (“stop”) up to 500 words from an index. You might exclude articles such as “the” and “a”, conjunctions such as “but” and “or”, and prepositions such as “for” and “by.”
The advantage of excluding stopwords from an index is that it makes the index smaller, typically by 10 to 15%. The disadvantage is that users of the index cannot find phrases that contain the stopwords. In all their searches, they have to work around them. To help them, you should provide a list of the stopwords with the index.
To exclude stopwords from an index:
• To exclude a stopword, type the word in the Word text box of the Options dialog box and click Add. Stopwords can be up to 24 characters long and are case sensitive. (To stop the, you need to enter both “The” and “the.”)
• To remove a word from the list of stopwords, select a word in the Word To Not Include In Index text box and click Remove.
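Because stopwords are case sensitive, a list prepared for users usually needs both the lowercase and the capitalized form of each word. The sketch below shows one way to prepare such a list outside Catalog while respecting the limits of 500 words and 24 characters; the function name is hypothetical, and all-uppercase variants are left out on purpose.

```python
MAX_STOPWORDS = 500
MAX_STOPWORD_LENGTH = 24


def expand_stopwords(words):
    """Return each word in lowercase and capitalized form, without duplicates."""
    expanded = []
    for word in words:
        for variant in (word.lower(), word.capitalize()):
            if len(variant) > MAX_STOPWORD_LENGTH:
                raise ValueError(f"stopword longer than {MAX_STOPWORD_LENGTH} characters: {variant!r}")
            if variant not in expanded:
                expanded.append(variant)
    # Every case variant counts as a separate entry toward the limit.
    if len(expanded) > MAX_STOPWORDS:
        raise ValueError(f"more than {MAX_STOPWORDS} stopwords after expansion")
    return expanded


if __name__ == "__main__":
    print(expand_stopwords(["the", "a", "but"]))
```

The expanded list doubles as the stopword list you distribute with the index.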
Excluding numbers from an index
To reduce the size of an index, typically by 10 to 20%, you can exclude numbers from it. In Windows, the default is to include numbers; you need to specify exclusion in the index definition. On a Macintosh, you can change the default in the Index Defaults group of preferences as well as specify exclusion for a particular index.
The disadvantage of excluding numbers from an index is that users of the index cannot find phrases that contain numbers. For this reason, you should inform users of an index when numbers are excluded from it.
To exclude numbers from an index: Check Do Not Include Numbers in the Options dialog box.
Choosing not to support search options
The three Acrobat Catalog word options (Case Sensitive, Sounds Like, and Word Stemming) support options used with the Search command in Acrobat Exchange and Reader. But they increase the time required for index updates, the time required for searches, and the size of the index. (The Case Sensitive and Sounds Like options increase the size of the index by 5 to 10% apiece, and the Word Stemming option increases it by 10 to 20%.)
You can disable support for any or all of the options. In Windows, the Catalog options are enabled by default and you need to disable them in the index definition. On a Macintosh, you can change option defaults in the Index Defaults group of preferences as well as disable them for a particular index.
• Case Sensitive supports the Match Case option in Acrobat Search.
• Sounds Like supports the Sounds Like option in Acrobat Search. The option expands searches for proper names.
• Word Stemming supports the Word Assistant in Acrobat Search when it previews a search with the Word Stemming option. The Word Stemming option finds words that share a word stem with the search term. (Searching for manage also locates managed and managing, but not manager.) The option works even if it is not supported; only its Word Assistant preview is eliminated.
To change a word option in an index: To disable support for an Acrobat Search option, deselect it in the Options dialog box. To enable support for an option, select it.
Optimizing for CD-ROM
The Optimize for CD-ROM option arranges index files for the fastest possible access on a CD-ROM. In addition, the option makes it easier for you to modify Document Info fields or security settings after you have indexed a document. Normally, when a user searches a document that has been modified after it was indexed, a message indicates that the document was changed and the user has to choose whether to use the index nevertheless. When you select the option, the message and choice are bypassed.
To optimize an index for CD-ROM: Check Optimize for CD-ROM in the Options dialog box.
Adding unique document identifiers to 1.0 PDF files
You may need to add unique document identifiers to PDF documents created with version 1.0 of Acrobat Distiller or PDF Writer and used in cross-platform environments. Version 2 and later of these programs add the identifiers themselves. The need arises when Macintosh or UNIX filenames are shortened to become DOS filenames and filenaming ambiguities result. Acrobat Search uses the unique identifiers to resolve the ambiguities.
To add Acrobat identifiers to 1.0 PDF files: Check Add IDs to Acrobat 1.0 PDF Files in the Options dialog box.
Setting preferences for Acrobat Catalog
• Setting preferences (Macintosh)
• Setting preferences (Windows)
• Tips on settings for efficient indexes
Setting preferences (Macintosh)
The Edit > Preferences dialog box has five groups of preferences:
• The Index group has general Acrobat Catalog preferences.
• The Index Defaults group has preferences for options used in index definitions.
• The Logging group has preferences for logging index builds.
• The Drop Folders group has preferences for building an index by dragging and dropping.
• The Custom Fields group has preferences for custom document information fields used with customized versions of Acrobat Exchange.
To change preferences (Macintosh):
1 Choose Edit > Preferences.
2 Click the icon for one of the five groups of preferences in the left panel to access the group. You can also use the Up and Down Arrow keys to select an icon.
3 Change the preference settings by typing a new value in the text box or by clicking an option to select or deselect it.
4 To complete the task, choose one of these options:
• Click OK to accept the changes and close the dialog box.
• Click Default; then click OK to restore the preferences to their default settings and close the dialog box.
• Click Cancel to revert to the previous values and close the dialog box.
Setting preferences (Windows)
You change preferences by editing the acrocat.ini file located in the Windows folder. For a list of Windows preferences and defaults, see The acrocat.ini file.
To change preferences (Windows):
1 Using a text editor such as Notepad, or a word processor, open the acrocat.ini file in the Windows directory. If you use a word processor, open the file as a text file.
2 Edit the preference settings you want to change.
3 Save the file. If you are using a word processor, save the file as a text file.
4 Restart Acrobat Catalog.
The acrocat.ini file
Windows preferences appear in the [Options] section of the acrocat.ini file. Settings that might require changing follow. With each item, the value following the equal sign (=) is the default.
• DocumentWordSections=1. 0=small, 1=medium, and 2=large. Selects which of the three DocumentWordSections size settings in the file (Small, Medium, or Large) applies. Determines the maximum size (in words and terms) of a document before Acrobat Catalog creates two or more indexes for the document. Consider changing for small- or large-memory machines.
• IndexAvailableGroupSize=1024. The number of PDF files Catalog processes before making a partial index available or before updating the current index with entries for new and changed documents. The larger the number, the faster the index is built or updated and the faster searches with the index are completed. The smaller the number, the more frequently the index is updated with the partial results of the current build and the more current is the information available to Acrobat Exchange users.
• WindowsOnlyFilenames=No. Yes is appropriate only if all searchers use Windows 3.1.
• MemoryPercent=20. If the percentage of memory available at the start of a build drops below this figure, the build stops.
• DocumentWordSectionsSmall=200000
• DocumentWordSectionsMedium=400000
• DocumentWordSectionsLarge=800000
• MaxLogFileSize=1000000 (1024K). When the maximum is reached, the log file is deleted and a new file created.
• GroupSizeForCDROM=4000. Anything above 4000 documents would become unreliable.
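To see how the settings above fit together, the following sketch reads an [Options] section and resolves which section-size value is in effect. It is an illustration under assumptions: the helper names are hypothetical, and it assumes the file follows the plain key=value layout that Python's configparser can read. The defaults are the ones listed above.

```python
import configparser

# Defaults as listed in this section.
DEFAULTS = {
    "DocumentWordSections": "1",
    "IndexAvailableGroupSize": "1024",
    "WindowsOnlyFilenames": "No",
    "MemoryPercent": "20",
    "DocumentWordSectionsSmall": "200000",
    "DocumentWordSectionsMedium": "400000",
    "DocumentWordSectionsLarge": "800000",
    "MaxLogFileSize": "1000000",
    "GroupSizeForCDROM": "4000",
}


def load_options(ini_text):
    """Return the [Options] settings from ini_text, with defaults filled in."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    options = dict(DEFAULTS)
    if parser.has_section("Options"):
        options.update(parser["Options"])
    return options


def section_size_in_words(options):
    """Resolve DocumentWordSections (0, 1, or 2) to the matching word count."""
    size_name = {"0": "Small", "1": "Medium", "2": "Large"}[options["DocumentWordSections"]]
    return int(options["DocumentWordSections" + size_name])


if __name__ == "__main__":
    sample = "[Options]\nDocumentWordSections=2\nIndexAvailableGroupSize=100\n"
    options = load_options(sample)
    print(section_size_in_words(options), options["IndexAvailableGroupSize"])
```

The sample text stands in for a real acrocat.ini; after editing the real file, restart Catalog for the changes to take effect.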
Tips on settings for efficient indexes
• For small indexes and fast searches, specify the largest possible build-group size, 1024 files, with Index Available in the Index group of preferences (Macintosh) or with IndexAvailableGroupSize in acrocat.ini (Windows).
• To make partial indexes available quickly during large updates, specify a small build-group size (100 or fewer) with Index Available in the Index group of preferences (Macintosh) or with the IndexAvailableGroupSize option in acrocat.ini (Windows). However, note that decreasing this setting slows the update and the execution of search queries.
• For fast updates, use the largest setting for the Document Section Size preference in the Index group of preferences (Macintosh) or for the DocumentWordSections preference in acrocat.ini (Windows). The memory required to process a document, in bytes, is about 10 times the number of words in the document. For example, the largest setting for a computer with 24 megabytes of memory would be 2,400,000 (2.4 million) words.
Note: The Document Section Size/DocumentWordSections setting determines the maximum size of a document before Acrobat Catalog creates two or more indexes for the document.
• For fast updates on a Macintosh, increase the Index Disk Cache Size in the Index group of preferences as much as possible.
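The memory rule in the fast-updates tip can be worked through as simple arithmetic. The sketch below assumes the rule means roughly 10 bytes of memory per word and counts a megabyte as 1,000,000 bytes, which reproduces the 24-megabyte example; the function name is hypothetical.

```python
BYTES_PER_WORD = 10        # assumption drawn from the "10 times the number of words" rule
BYTES_PER_MEGABYTE = 1_000_000


def max_section_words(memory_megabytes):
    """Return the largest document section size, in words, for a given memory size."""
    return memory_megabytes * BYTES_PER_MEGABYTE // BYTES_PER_WORD


if __name__ == "__main__":
    for megabytes in (8, 16, 24):
        print(megabytes, "MB ->", max_section_words(megabytes), "words")
```

By the same arithmetic, the default large setting of 800,000 words corresponds to about 8 megabytes.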
Distributing and maintaining an index
• Providing useful information about the index
• Keeping dynamic indexes up to date
• Moving a document collection and its index
• Deleting an index
Providing useful information about the index
Index users need the following information:
• The kind of documents indexed
• The search options supported
• The person to contact or phone number to call with questions
• Whether numbers or stopwords are excluded from the index, and a list of any stopwords
• The location of any index-description document
The primary index-description document is the index-definition (PDX) file itself. When you define an index, you can put up to 250 characters in the Index Description text box. When index users list available indexes, they can read these descriptions.
Even if you can fit all the necessary information into 250 characters, consider providing a separate index-description document. Such a document could do the following:
• List the folders containing documents included in a LAN-based index, or list the documents included in a CD-ROM-based index. You might also include a brief description of the contents of each folder or document.
• If Document Info field values are assigned to indexed documents, list the values for each document.
You can place index-description documents in the same folders as the indexes they describe. Alternatively, place all the index-description documents in a central location. That way users can easily find descriptions of all the indexes without knowing where the indexes themselves are located.
Keeping dynamic indexes up to date
You can schedule one-time updates, schedule updates at regular intervals, or arrange for updating to go on continuously. Acrobat Catalog updates are incremental, to minimize updating time and permit searching to go on uninterrupted during updates. This technique causes the index to grow with each update, however, and you need to purge and rebuild the index periodically to reclaim disk space and speed up searches.
• Scheduling automatic builds
• Purging and rebuilding an index
• Tips on updating
Scheduling automatic builds
You can set up Acrobat Catalog to build (or update) an index or a batch of indexes automatically. You can arrange for the build to take place at any of the following times:
• At a specified interval, such as every hour or every seven days, optionally starting at a particular time
• Once only, and immediately
• Continuously
Use the once-only method, rather than the normal way to build immediately, to build several indexes in a single batch.
To schedule an automatic build:
1 Choose Index > Schedule.
2 For each index you want to build, click Add, and select the name of the index-definition (PDX) file. Then click OK (Windows) or Open (Macintosh).
3 Click Every, Once, or Continuously.
4 If you clicked Every, select a measure such as hours for the interval and enter the interval. If you want to delay processing the selected indexes until a specified time, select Starting At and use the up and down arrows to select the time.
5 Click Start.
If you selected the Starting At option, Catalog waits for the time you specified before building or updating the indexes. If you selected Once or Continuously, it starts immediately. If you selected Continuously, it updates the indexes in the order they are listed in the Indices to Build list. It continues to update the indexes until you click Stop.
To add a new index to the existing schedule:
1 Create a new index definition.
2 Click Save As.
3 Before you save the PDX file, select Add Index to Schedule.
Purging and rebuilding an index
When you update an index simply by rebuilding it, entries for deleted documents and for the original versions of changed documents remain in the index but are marked as invalid. This incremental updating slightly increases the time required for searches that use the index, and it can greatly increase the disk space required by the index. (For example, if every document indexed has changed since an initial build, the space required for the index is doubled.) Because these increases accumulate over time, you occasionally need to purge the index before rebuilding it.
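As a rough illustration of the growth described above, the following Python sketch estimates the size of an index after several incremental updates without a purge. It assumes, purely for illustration, that index space is proportional to the number of document versions recorded, valid or invalid. The function name and parameters are hypothetical and are not part of Acrobat Catalog.

```python
def estimated_index_size(initial_size, changed_fraction, updates):
    """Estimate index size after incremental updates with no purge.

    Assumption (not taken from Acrobat Catalog itself): every update adds
    new entries for the changed fraction of the documents, while the old
    entries stay in the index marked as invalid.
    """
    if not 0.0 <= changed_fraction <= 1.0:
        raise ValueError("changed_fraction must be between 0 and 1")
    if updates < 0:
        raise ValueError("updates must not be negative")
    return initial_size * (1.0 + changed_fraction * updates)


# Every document has changed once since the initial build,
# so the space required is doubled.
print(estimated_index_size(100.0, 1.0, 1))  # 200.0
```

The units are whatever you pass in (for example, megabytes). A real index may grow differently; the sketch only shows why the increases accumulate until you purge.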
You should also purge and rebuild if you change the optional search features supported by an index or change the stopwords list used to build an index. Otherwise, search performance may be slowed or search results distorted.
To purge and rebuild an index:
1 Choose Index > Purge.
2 Locate and select the name of the index-definition (PDX) file for the index you want to purge and rebuild.
3 Click OK (Windows) or Open (Macintosh).
Acrobat Catalog purges the existing index. If the index is currently in use, users are given time to complete queries in progress before the purge begins. (The default “time before purge” is 905 seconds; that is, about 15 minutes.) Users receive an “Index unavailable for searching” message if they attempt to enter a new query. If a message indicating that the purge has failed to complete appears, look up the message in Troubleshooting for help.
4 After the purge completes, choose Index > Build.
5 Locate and select the PDX file for the index, and click OK (Windows) or Open (Macintosh).
Catalog rebuilds the index.
Tip: A faster way to purge an index is simply to delete the nine subfolders of the index folder: assists, morgue, parts, pdd, style, temp, topicidx, trans, and work. Do this only if you are sure the index is not in use.
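The tip above can be scripted. The following Python sketch deletes the nine named subfolders from an index folder and leaves everything else in place. The function name is hypothetical, and the script is an illustration rather than an Adobe-supplied tool. It does not check whether the index is in use, so that remains your responsibility.

```python
import shutil
from pathlib import Path

# The nine support subfolders named in the tip above.
INDEX_SUBFOLDERS = (
    "assists", "morgue", "parts", "pdd", "style",
    "temp", "topicidx", "trans", "work",
)


def purge_index_folder(index_folder):
    """Delete the nine support subfolders and return the names removed.

    Hypothetical helper: it does NOT verify that the index is idle.
    Other files and folders in index_folder are left untouched.
    """
    index_folder = Path(index_folder)
    removed = []
    for name in INDEX_SUBFOLDERS:
        subfolder = index_folder / name
        if subfolder.is_dir():
            shutil.rmtree(subfolder)
            removed.append(name)
    return removed
```

After running a helper like this, you would still rebuild the index with Index > Build, as in the procedure above.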
Tips on updating
You must update an index if either of the following occurs:
• Documents are added to or removed from the collection.
• The hierarchy of the indexed folders changes.
Consider updating an index when either of the following occurs:
• Documents in the indexed document collection change.
• A new Document Info field is defined for the documents in the collection, and data values for the new field are added to the documents.
Here are some ways to cut updating time:
• Don’t support the Sounds Like, Case Sensitive, or Word Stemming search options.
• Use stopwords and exclude numbers.
• Install Acrobat Catalog on the computer where the indexed documents are stored; make sure this computer is the fastest available. If the program and documents are on different computers and it is feasible to move the documents temporarily, move them to the Catalog computer for updating and then move them back.
Note: For information on Catalog preferences that affect updating time, see Tips on settings for efficient indexes.
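The rules earlier in this topic for when you must update and when you should consider updating can be sketched as a comparison of two snapshots of the collection. The Python below is an illustration only: the function names are hypothetical, and using file modification times to detect changed documents is an assumption, not something Acrobat Catalog is documented to do.

```python
from pathlib import Path


def snapshot_collection(collection_root):
    """Map each PDF's path (relative to the root) to its modification time."""
    root = Path(collection_root)
    return {
        path.relative_to(root).as_posix(): path.stat().st_mtime
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".pdf"
    }


def update_status(old_snapshot, new_snapshot):
    """Classify the difference between two snapshots.

    "required": documents were added or removed, or the folder hierarchy
                changed (the set of relative paths differs).
    "consider": the same documents exist but at least one has changed,
                for example because Document Info values were added.
    "none":     nothing has changed.
    """
    if set(old_snapshot) != set(new_snapshot):
        return "required"
    if any(old_snapshot[key] != new_snapshot[key] for key in old_snapshot):
        return "consider"
    return "none"
```

You might take a snapshot after each build and compare it with a fresh one before deciding whether to schedule an update.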
Moving a document collection and its index
You can develop and test an indexed document collection on a local hard drive and then move the finished document collection to a network server or CD-ROM. Or you can move a collection from one server to another, for example because the first server is full.
An index definition contains relative paths between the index-definition (PDX) file and the folders containing the indexed documents. You don’t have to rebuild the index after moving the indexed document collection if you have maintained these relative paths. If the PDX file and the folders containing the indexed documents are in the same folder, you can maintain the relative path simply by moving that folder.
If the relative path changes, you must create a new index after you move the indexed document collection. However, you can still use the original PDX file. To use the original PDX file, first move the indexed documents. Then copy the PDX file to the folder where you want to create the new index and update the Include and Exclude lists as necessary.
On a Macintosh, if the index resides on a drive or server volume separate from any part of the collection it applies to, moving either the collection or the index will break the index.
If you intend to move a document collection either to another network location or onto a CD, create and build the index in the same location as the collection.
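To check before a move whether the relative paths described in this topic will be preserved, you could compare them at the old and new locations. The following Python sketch does that. The function names and the example paths are hypothetical, and the script does not read the PDX file; you supply the folder lists yourself.

```python
import os


def relative_paths(pdx_file, document_folders):
    """Return the sorted paths of the folders relative to the PDX file's folder.

    Note: os.path.relpath raises ValueError on Windows when the two paths
    are on different drives, a case in which no relative path exists.
    """
    pdx_dir = os.path.dirname(os.path.abspath(pdx_file))
    return sorted(os.path.relpath(folder, pdx_dir) for folder in document_folders)


def move_preserves_paths(old_pdx, old_folders, new_pdx, new_folders):
    """True if every indexed folder keeps its path relative to the PDX file."""
    return relative_paths(old_pdx, old_folders) == relative_paths(new_pdx, new_folders)


# Moving the whole parent folder keeps the relative paths intact.
print(move_preserves_paths(
    "/local/manuals/sales.pdx", ["/local/manuals/docs"],
    "/server/vol1/manuals/sales.pdx", ["/server/vol1/manuals/docs"],
))  # True
```

If the function returns False, the move changes a relative path and, as described above, you would need to create a new index at the new location.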
Deleting an index
If you need to delete an index entirely rather than just purging it, delete the index-definition (PDX) file, the log file for the index, and the index folder and all of its nested subfolders. Use the normal file-deletion procedures for your operating system.
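The deletion described above can also be scripted. The Python sketch below is a hypothetical helper, not an Adobe tool. It relies on the index folder having the same name as the PDX file (as stated earlier in this guide) and assumes, unless you pass a different path, that the log file has the same base name with a .log extension; that log file name is an assumption.

```python
import shutil
from pathlib import Path


def delete_index(pdx_file, log_file=None):
    """Delete the PDX file, its log file, and the index folder.

    Assumptions: the index folder sits beside the PDX file and has the
    same name without the extension; the log file defaults to the same
    base name with a .log extension (hypothetical, override via log_file).
    Returns the list of paths that were actually deleted.
    """
    pdx = Path(pdx_file)
    index_folder = pdx.with_suffix("")
    log = Path(log_file) if log_file is not None else pdx.with_suffix(".log")
    deleted = []
    if index_folder.is_dir():
        shutil.rmtree(index_folder)  # removes all nested subfolders too
        deleted.append(index_folder)
    for file_path in (log, pdx):
        if file_path.is_file():
            file_path.unlink()
            deleted.append(file_path)
    return deleted
```

As with manual deletion, run something like this only when you are sure nobody is searching the index.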
© 1996 Adobe Systems Incorporated. All rights reserved.
Adobe Acrobat 3.0 Catalog Online Guide
This manual, as well as the software described in it, is furnished under license and may be used or copied only in accordance with the terms of such license. The content of this manual is furnished for informational use only, is subject to change without notice, and should not be construed as a commitment by Adobe Systems Incorporated. Adobe Systems Incorporated assumes no responsibility or liability for any errors or inaccuracies that may appear in this book.
The copyrighted software that accompanies this manual is licensed to the End User for use only in strict accordance with the End User License Agreement, which the Licensee should read carefully before commencing use of the software. Except as permitted by such license, no part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, recording, or otherwise, without the prior written permission of Adobe Systems Incorporated.
Adobe, the Adobe logo, Acrobat, Acrobat Capture, the Acrobat logo, Distiller, Acrobat Exchange, Adobe Type Manager, PostScript, and the tagline “If you can dream it, you can do it” are trademarks of Adobe Systems Incorporated. Microsoft and Windows are registered trademarks and ActiveX and Windows NT are trademarks of Microsoft Corporation in the U.S. and other countries. Apple, Macintosh, Power Macintosh, and QuickTime are registered trademarks and AppleScript and TrueType are trademarks of Apple Computer, Inc. Lotus Notes is a registered trademark of Lotus Development Corporation. Netscape and Netscape Navigator are trademarks of Netscape Communications Corporation. UNIX is a registered trademark in the U.S. and other countries, licensed exclusively through X/Open Company, Ltd. Pentium is a trademark of Intel Corporation. All other products or name brands are trademarks of their respective owners.
This product contains an implementation of the LZW algorithm licensed under U.S. Patent 4,558,302.
This software includes software licensed from Verity, Inc., copyright 1994. All rights reserved. The address of Verity, Inc., is 894 Ross Drive, Sunnyvale, California 94089. Verity ® and TOPIC ® are registered trademarks of Verity, Inc. in the United States and other countries.
English Electronic Thesaurus copyright 1993 by INSO Corporation. Adapted from the Oxford Thesaurus copyright 1991 by Oxford University Press and from Roget's II: The New Thesaurus copyright 1980 by Houghton Mifflin Company. All rights reserved. Reproduction or disassembly of embodied programs and databases prohibited.
1994 This software includes software licensed from RSA Data Security, Inc.
Written and designed at Adobe Systems Incorporated, 345 Park Ave, San Jose, CA 95110-2704.
Adobe Systems Europe Limited, Adobe House, 5 Mid New Cultins, Edinburgh EH11 4DU, Scotland, United Kingdom
Adobe Systems Co., Ltd., Yebisu Garden Place Tower, 4-20-3 Ebisu, Shibuya-ku, Tokyo 150, Japan
For defense agencies: Restricted Rights Legend. Use, reproduction, or disclosure is subject to restrictions set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at 252.227-7013.
For civilian agencies: Restricted Rights Legend. Use, reproduction, or disclosure is subject to restrictions set forth in subparagraphs (a) through (d) of the commercial Computer Software Restricted Rights clause at 52.227-19 and the limitations set forth in Adobe’s standard commercial agreement for this software. Unpublished rights reserved under the copyright laws of the United States. (9/96)
How to use this online guide
The navigation controls in this guide do the following:
• Page back or page forward.
• Undoes a change of page or view, or redoes a change (Go Back/Go Forward).
• Go to the Contents.
• Go to the Index.
• Go to the how-to page (this page).
• Go to the “parent” of the current topic.
• Linked text: Go to the indicated topic.
• Go to the next page of a continued topic.
• End of a continued topic.
For instructions on printing this guide, go to the next page.
How to print this online guide
You can print separate topics or the entire guide. Since the pages of the guide have been made small for online viewing, Windows and Macintosh users may prefer to print them two to a page of paper (“two up”).
To print pages two up:
1 Choose File > Print Setup (Windows) or File > Page Setup (Macintosh).
2 Follow the instructions for your platform:
• In Windows, click Options, select 2 up on the Paper tab, click OK to return to the Print Setup dialog box, and click OK again to close it.
• On a Macintosh, choose 2 Up from the Layout menu and click OK.
Note: If you can’t perform step 2, you may not be using an Adobe or PostScript printer driver. If you are and you still can’t perform the step, install the Adobe printer driver on the Acrobat CD-ROM. See the Acrobat Getting Started guide for installation instructions.
3 Choose File > Print.
4 Indicate the page range. Click OK (Windows) or Print (Macintosh).