Solr Ref Guide 3.5
Solr Reference Guide
Jan 10, 2012

Table of Contents

Solr and Lucene
Lucid Imagination
About This Guide
Further Assistance
Getting Started
Installing Solr
Got Java?
Installing Solr
To install Solr
Running Solr
Start the Server
Add Documents
Ask Questions
A Quick Overview
A Step Closer
Using the Solr Administration User Interface
Overview of the Solr Admin UI
Configuring the Admin UI in solrconfig.xml
The Solr Section
Displaying the Solr Schema
Displaying the Solr Configuration File
Running Field Analysis to Test Analyzers, Tokenizers, and TokenFilters
Using the Schema Browser
Displaying the Configuration of a Field
Displaying Additional Details about a Parameter
Exploring the Most Popular Terms for a Field
Displaying Statistics of the Solr Server
Displaying Start-up Time Statistics about the Solr Server
Displaying Information about a Distributed Solr Configuration
Pinging the Solr Server to Test Its Responsiveness
Viewing and Configuring Logfile Settings
The App Server Section
Displaying Java Properties
Displaying the Active Threads in the Java Environment
Enabling or Disabling the Server in a Load-balanced Configuration
The Make a Query Section
Using the Full Interface to Submit Queries
The Assistance Section
Documents, Fields, and Schema Design
Overview of Documents, Fields, and Schema Design
How Solr Sees the World
Field Analysis
Solr Field Types
Field Type Definitions in schema.xml
Field Types Included with Solr
Working with Dates
Working with External Files
Field Type Properties
Field Properties by Use Case
Defining Fields
Copying Fields
Dynamic Fields
Other Schema Elements
Unique Key
Default Search Field
Query Parser Operator
Putting the Pieces Together
Choosing Appropriate Numeric Types
Working With Text
Understanding Analyzers, Tokenizers, and Filters
Overview of Analyzers, Tokenizers, and Filters
What Is An Analyzer?
Analysis Phases
What Is A Tokenizer?
What Is a Filter?
Tokenizers
Standard Tokenizer
Classic Tokenizer
Keyword Tokenizer
Letter Tokenizer
Lower Case Tokenizer
N-Gram Tokenizer
Edge N-Gram Tokenizer
ICU Tokenizer
Path Hierarchy Tokenizer
Regular Expression Pattern Tokenizer
UAX29 URL Email Tokenizer
White Space Tokenizer
Filter Descriptions
ASCII Folding Filter
Classic Filter
Common Grams Filter
Collation Key Filter
Edge N-Gram Filter
English Minimal Stem Filter
Hunspell Stem Filter
Hyphenated Words Filter
ICU Folding Filter
ICU Normalizer 2 Filter
ICU Transform Filter
Keep Words Filter
KStem Filter
Length Filter
Lower Case Filter
N-Gram Filter
Numeric Payload Token Filter
Pattern Replace Filter
Phonetic Filter
Porter Stem Filter
Position Filter Factory
Remove Duplicates Token Filter
Reversed Wildcard Filter
Shingle Filter
Snowball Porter Stemmer Filter
Standard Filter
Stop Filter
Synonym Filter
Token Offset Payload Filter
Trim Filter
Type As Payload Filter
Word Delimiter Filter
CharFilterFactories
solr.MappingCharFilterFactory
solr.HTMLStripCharFilterFactory
solr.PatternReplaceCharFilterFactory
Language Analysis
KeyWordMarkerFilterFactory
StemmerOverrideFilterFactory
Dictionary Compound Word Token Filter
Unicode Collation
Sorting Text for a Specific Language
Sorting Text for Multiple Languages
Sorting Text with Custom Rules
Searching
ICU Collation
ISO Latin Accent Filter
Arabic
Brazilian Portuguese
Bulgarian
Chinese
Chinese Tokenizer
Chinese Filter Factory
Simplified Chinese
CJK
Czech
Dutch
Finnish
French
Elision Filter
French Light Stem Filter
Galician
German
Greek
Hindi
Indonesian
Italian
Lao, Myanmar, Khmer
Latvian
Persian
Persian Filter Factories
Polish
Portuguese
Russian
Russian Letter Tokenizer
Russian Lower Case Filter
Russian Stem Filter
Spanish
Swedish
Swedish Stem Filter
Thai
Turkish
Running Your Analyzer
Indexing and Basic Data Operations
What Is Indexing?
The Solr Example Directory
The curl Utility for Transferring Files
Uploading Data with Solr Cell using Apache Tika
Key Concepts
Trying out Tika with the Solr Example Directory
Input Parameters
Order of Operations
Configuring the Solr ExtractingRequestHandler
Multi-Core Configuration
Metadata
Examples of Uploads Using the Extraction Request Handler
Capture and Mapping
Capture, Mapping, and Boosting
Using Literals to Define Your Own Metadata
XPath
Extracting Data without Indexing It
Sending Documents to Solr with a POST
Sending Documents to Solr with Solr Cell and SolrJ
Uploading Data with Index Handlers
XMLUpdateRequestHandler for XML-formatted Data
Configuration
Adding Documents
Commit and Optimize Operations
Delete Operations
Rollback Operations
Using curl to Perform Updates with the Update Request Handler
A Simple Cross-Platform Posting Tool
XSLTRequestHandler to Transform XML Content
CSVRequestHandler for CSV Content
Parameters
Using the JSONRequestHandler for JSON Content
Examples
Update Commands
Indexing Using SolrJ
Uploading Structured Data Store Data with the Data Import Handler
Concepts and Terminology
Configuration
Data Import Handler Commands
Parameters for the full-import Command
Data Sources
ContentStreamDataSource
FieldReaderDataSource
FileDataSource
JdbcDataSource
URLDataSource
Entity Processors
The SQL Entity Processor
The XPathEntityProcessor
The FileListEntityProcessor
LineEntityProcessor
PlainTextEntityProcessor
Transformers
ClobTransformer
The DateFormatTransformer
The HTMLStripTransformer
The LogTransformer
The NumberFormatTransformer
The RegexTransformer
The ScriptTransformer
The TemplateTransformer
Special Commands for the Data Import Handler
The Data Import Handler Development Console
Detecting Languages During Indexing
Configuring Language Detection
Configuring Tika Language Detection
Configuring LangDetect Language Detection
langid Parameters
UIMA Integration
Configuring UIMA
Content Streams
Stream Sources
RemoteStreaming
Debugging Requests
Searching
Overview of Searching in Solr
The Velocity Search UI
Relevance
Query Syntax and Parsing
Common Query Parameters
The defType Parameter
The sort Parameter
The start Parameter
The rows Parameter
The fq (Filter Query) Parameter
The fl (Field List) Parameter
The debugQuery Parameter
The explainOther Parameter
The timeAllowed Parameter
The omitHeader Parameter
The wt Parameter
The cache=false Parameter
The Standard Query Parser
Standard Query Parser Parameters
The Standard Query Parser's Response
Sample Responses
Specifying Terms for the Standard Query Parser
Term Modifiers
Wildcard Searches
Fuzzy Searches
Proximity Searches
Range Searches
Boosting a Term with ^
Specifying Fields in a Query to the Standard Query Parser
Boolean Operators Supported by the Standard Query Parser
The Boolean Operator +
The Boolean Operator AND (&&)
The Boolean Operator NOT (!)
Escaping Special Characters
Grouping Terms to Form Subqueries
Grouping Clauses within a Field
Differences between Lucene Query Parser and the Solr Standard Query Parser
Specifying Dates and Times
The DisMax Query Parser
DisMax Parameters
The q Parameter
The q.alt Parameter
The qf (Query Fields) Parameter
The mm (Minimum Should Match) Parameter
The pf (Phrase Fields) Parameter
The ps (Phrase Slop) Parameter
The qs (Query Phrase Slop) Parameter
The tie (Tie Breaker) Parameter
The bq (Boost Query) Parameter
The bf (Boost Functions) Parameter
Examples of Queries Submitted to the DisMax Query Parser
The Extended DisMax Query Parser
Extended DisMax Parameters
The boost Parameter
The lowercaseOperators Parameter
The pf2 Parameter
The pf3 Parameter
The stopwords Parameter
Examples of Queries Submitted to the Extended DisMax Query Parser
Local Parameters in Queries
Basic Syntax of Local Parameters
Query Type Short Form
Specifying the Parameter Value with the 'v' Key
Parameter Dereferencing
Function Queries
Using FunctionQuery
Example of Function Queries Using the top Function
Sort By Function
Highlighting
Using Boundary Scanners with the Fast Vector Highlighter
The breakIterator Boundary Scanner
The simple Boundary Scanner
MoreLikeThis
Common Parameters for MoreLikeThis
Parameters for the StandardRequestHandler
Parameters for the MoreLikeThis Request Handler
Faceting
General Parameters
The facet Parameter
The facet.query Parameter
Field-Value Faceting Parameters
The facet.field Parameter
The facet.prefix Parameter
The facet.sort Parameter
The facet.limit Parameter
The facet.offset Parameter
The facet.mincount Parameter
The facet.missing Parameter
The facet.method Parameter
The facet.enum.cache.minDf Parameter
Range Faceting
The facet.range Parameter
The facet.range.start Parameter
The facet.range.end Parameter
The facet.range.gap Parameter
The facet.range.hardend Parameter
The facet.range.include Parameter
The facet.range.other Parameter
Date Faceting Parameters
LocalParams for Faceting
Tagging and Excluding Filters
Changing the Output Key
Result Grouping
Request Parameters
Examples
Grouping Results by Field
Grouping by Query
Distributed Result Grouping
Spell Checking
Configuring the SpellCheckComponent
Define Spell Check in solrconfig.xml
Add It to a Request Handler
Spell Check Parameters
The spellcheck Parameter
The spellcheck.q or q Parameter
The spellcheck.build Parameter
The spellcheck.reload Parameter
The spellcheck.count Parameter
The spellcheck.onlyMorePopular Parameter
The spellcheck.extendedResults Parameter
The spellcheck.collate Parameter
The spellcheck.maxCollations Parameter
The spellcheck.maxCollationTries Parameter
The spellcheck.maxCollationEvaluations Parameter
The spellcheck.collateExtendedResult Parameter
The spellcheck.dictionary Parameter
The spellcheck.accuracy Parameter
The spellcheck.

Capture and Mapping
The command below captures <div> tags separately, and then maps all the instances of that field to a dynamic field named foo_t.
curl "http://localhost:8983/solr/update/extract?literal.id=doc2&captureAttr=true&defaultField=text&fmap.div=foo_t&capture=div" -F "tutorial=@tutorial.html"
Capture, Mapping, and Boosting
The command below captures <div> tags separately, maps the field to a dynamic field named foo_t, then boosts foo_t by 3.
curl "http://localhost:8983/solr/update/extract?literal.id=doc3&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&boost.foo_t=3" -F "tutorial=@tutorial.html"
Using Literals to Define Your Own Metadata
To add your own metadata, pass in the literal parameter along with the file:
curl "http://localhost:8983/solr/update/extract?literal.id=doc4&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&boost.foo_t=3&literal.blah_s=Bah" -F "tutorial=@tutorial.html"
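The extract URLs above carry many parameters and are easy to break when wrapped across lines or when a value contains reserved characters. A minimal sketch, assuming a hypothetical helper class named ExtractUrlBuilder (not part of Solr or SolrJ), that assembles the same query string from an ordered parameter map and percent-encodes each name and value using only the JDK:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
public class ExtractUrlBuilder {

    // Appends the parameters to the base URL, percent-encoding names and values.
    static String build(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Same parameters as the literal-metadata example above, in the same order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("literal.id", "doc4");
        params.put("captureAttr", "true");
        params.put("defaultField", "text");
        params.put("capture", "div");
        params.put("fmap.div", "foo_t");
        params.put("boost.foo_t", "3");
        params.put("literal.blah_s", "Bah");
        System.out.println(build("http://localhost:8983/solr/update/extract", params));
    }
}
```

For these parameters the encoded URL is identical to the one in the curl command, because none of the values contain reserved characters. A value such as an XPath expression would have its "/" and ":" characters written as %2F and %3A.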
XPath
The example below passes in an XPath expression to restrict the XHTML returned by Tika:
curl "http://localhost:8983/solr/update/extract?literal.id=doc5&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&boost.foo_t=3&literal.id=id&xpath=/xhtml:html/xhtml:body/xhtml:div/descendant:node()" -F "tutorial=@tutorial.html"
Extracting Data without Indexing It
Solr allows you to extract data without indexing it. You might want to do this if you're using Solr solely as an extraction server or if you're interested in testing Solr extraction. The example below sets the extractOnly=true parameter to extract data without indexing it.
curl "http://localhost:8983/solr/update/extract?extractOnly=true" --data-binary @tutorial.html -H 'Content-type:text/html'
The output includes XML generated by Tika, which is further escaped by Solr's XML. The example below uses a different output format to make it more readable:
curl "http://localhost:8983/solr/update/extract?extractOnly=true&wt=ruby&indent=true" --data-binary @tutorial.html -H 'Content-type:text/html'
Sending Documents to Solr with a POST
The example below streams the file as the body of the POST, so Solr receives no information about the name of the file.
curl "http://localhost:8983/solr/update/extract?literal.id=doc5&defaultField=text" --data-binary @tutorial.html -H 'Content-type:text/html'
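The same raw-body POST can be issued from Java without curl. The following is a minimal sketch using only the JDK's HttpURLConnection; the class name RawPost and its post method are hypothetical names chosen for illustration and are not part of Solr or SolrJ. It sends the bytes as the request body with an explicit Content-type header, as the curl command does, and returns the HTTP status code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
public class RawPost {

    // Sends the bytes as the POST body and returns the HTTP status code.
    static int post(String url, byte[] body, String contentType) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-type", contentType);
        conn.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: RawPost <extract-url> <html-file>");
            return;
        }
        // Read the whole file and stream it as the request body.
        byte[] html = Files.readAllBytes(Paths.get(args[1]));
        System.out.println("HTTP status: " + post(args[0], html, "text/html"));
    }
}
```

Assuming a Solr instance is running locally, the sketch could be invoked with the extract URL from the curl command and tutorial.html as its two arguments. As with curl, the file name itself is not transmitted, only its contents.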
Sending Documents to Solr with Solr Cell and SolrJ
SolrJ is a Java client that you can use to add documents to the index, update the index, or query the index. You'll find more information on SolrJ in Client APIs.
Here's an example of using Solr Cell and SolrJ to add documents to a Solr index. First, let's use SolrJ to create a new SolrServer, then we'll construct a request containing a ContentStream (essentially a wrapper around a file) and send it to Solr:
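import java.io.File;
import java.io.IOException;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.handler.extraction.ExtractingParams;

public class SolrCellRequestDemo {
  public static void main(String[] args) throws IOException, SolrServerException {
    // Connect to the Solr instance over HTTP.
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    // Wrap the file in a request addressed to the Solr Cell handler.
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
    req.addFile(new File("apache-solr/site/features.pdf"));
    // extractOnly=true returns the extracted content without indexing it.
    req.setParam(ExtractingParams.EXTRACT_ONLY, "true");
    NamedList<Object> result = server.request(req);
    System.out.println("Result: " + result);
  }
}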
This operation streams the file features.pdf to Solr. Because the sample sets extractOnly to true, Solr returns the extracted content instead of adding it to the index; omit that parameter to index the document. The sample code above calls the extract command, but you can easily substitute other commands that are supported by Solr Cell. The key class to use is the ContentStreamUpdateRequest, which makes sure the ContentStreams are set properly. SolrJ takes care of the rest. Note that the ContentStreamUpdateRequest is not specific to Solr Cell. You can also send CSV to the CSV Update handler, or content to any other Request Handler that works with Content Streams for updates.
Uploading Data with Index Handlers
Index Handlers are Update Handlers designed to add, delete, and update documents in the index. Solr includes several of these to allow indexing documents in XML, CSV, and JSON. The example URLs given here reflect the handler configuration in the supplied solrconfig.xml. If the name associated with a handler is changed, the URLs will need to be modified accordingly. The same handler can be registered under more than one name, which is useful if you wish to specify different sets of default options.
Index Handlers covered in this section:
XMLUpdateRequestHandler for XML-formatted Data
XSLTRequestHandler to Transform XML Content
CSVRequestHandler for CSV Content
Using the JSONRequestHandler for JSON Content
Indexing Using SolrJ
XMLUpdateRequestHandler for XML-formatted Data
Configuration
The supplied solrconfig.xml configures the update request handler by default.
Adding Documents
Documents are added to the index by sending an XML message to the update handler. The XML schema recognized by the update handler is very straightforward:
The <add> element introduces one or more documents to be added.
The <doc> element introduces the fields making up a document.
The <field> element presents the content for a specific field.
For example:
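<add>
  <doc>
    <field name="authors">Patrick Eagar</field>
    <field name="subject">Sports</field>
    <field name="dd">796.35</field>
    <field name="numpages">128</field>
    <field name="price">12.40</field>
    <field name="title">Summer of the all-rounder: Test and championship cricket in England 1982</field>
    <field name="isbn">0002166313</field>
    <field name="yearpub">1982</field>
    <field name="publisher">Collins</field>
  </doc>
  <doc>
  ...
  </doc>
</add>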
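An add message of the kind described above can also be assembled programmatically. The following is a minimal sketch that uses only the Java standard library; the class AddMessageSketch, its methods, and the field names isbn and publisher are hypothetical and chosen for illustration, not part of Solr or SolrJ. It shows why values should be XML-escaped before they are placed inside a field element.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class AddMessageSketch {

    // Escape the characters that are significant in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Build an <add> message holding a single <doc> with one <field> per map entry.
    static String buildAddMessage(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("isbn", "0002166313");   // assumed field name
        fields.put("publisher", "Collins"); // assumed field name
        System.out.println(buildAddMessage(fields));
    }
}
```

The resulting string can be sent as the body of a POST to the update handler, for example with curl's --data-binary option.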
If the document schema defines a unique key, then an /update operation silently replaces a document in the index with the same unique key, unless the <add> element sets the allowDups attribute to true. If no unique key has been defined, indexing performance is somewhat faster, as no search has to be made for an existing document.
Each element has certain optional attributes which may be specified.
Command | Command Description | Optional Parameter | Parameter Description
<add> | Introduces one or more documents to be added to the index. | commitWithin=number | Add the document within the specified number of milliseconds.
<doc> | Introduces the definition of a specific document. | boost=float | Default is 1.0. Sets a boost value for the document. To learn more about boosting, see Searching.
<field> | Defines a field within a document. | boost=float | Default is 1.0. Sets a boost value for the field.
Other optional parameters for <add>, including allowDups, overwritePending, and overwriteCommitted, are now deprecated.
Commit and Optimize Operations
The <commit> operation writes all documents loaded since the last commit to one or more segment files on the disk. Before a commit has been issued, newly indexed content is not visible to searches. The commit operation opens a new searcher and triggers any event listeners that have been configured.
Commits may be issued explicitly with a <commit/> message, and can also be triggered from parameters in solrconfig.xml.
The <optimize> operation requests Solr to merge internal data structures in order to improve search performance. For a large index, optimization will take some time to complete, but by merging many small segment files into a larger one, search performance will improve. If you are using Solr's replication mechanism to distribute searches across many systems, be aware that after an optimize, a complete index will need to be transferred. In contrast, post-commit transfers are usually much smaller.
The <commit> and <optimize> elements accept these optional attributes:
Optional Attribute | Description
maxSegments | Default is 1. Optimizes the index to include no more than this number of segments.
waitFlush | Default is true. Blocks until index changes are flushed to disk.
waitSearcher | Default is true. Blocks until a new searcher is opened and registered as the main query searcher, making the changes visible.
expungeDeletes | Default is false. Merges segments and removes deleted documents.
Here are examples of <commit> and <optimize> using optional attributes:
<commit waitFlush="false" waitSearcher="false"/>
<commit waitFlush="false" waitSearcher="false" expungeDeletes="true"/>
<optimize waitFlush="false" waitSearcher="false"/>
Delete Operations
Documents can be deleted from the index in two ways. "Delete by ID" deletes the document with the specified ID, and can be used only if a UniqueID field has been defined in the schema. "Delete by Query" deletes all documents matching a specified query. A single delete message can contain multiple delete operations.
<delete>
  <id>0002166313</id>
  <id>0031745983</id>
  <query>subject:sport</query>
  <query>publisher:penguin</query>
</delete>
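A delete message that combines several delete-by-ID and delete-by-query operations can be generated from two lists. This is a sketch only, using the Java standard library; the class DeleteMessageSketch and its methods are hypothetical names, not part of Solr or SolrJ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class DeleteMessageSketch {

    // Escape characters that are significant in XML text ('&' must be handled first).
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Build one <delete> message with an <id> per ID and a <query> per query.
    static String buildDeleteMessage(List<String> ids, List<String> queries) {
        StringBuilder xml = new StringBuilder("<delete>");
        for (String id : ids) {
            xml.append("<id>").append(escapeXml(id)).append("</id>");
        }
        for (String query : queries) {
            xml.append("<query>").append(escapeXml(query)).append("</query>");
        }
        return xml.append("</delete>").toString();
    }

    public static void main(String[] args) {
        System.out.println(buildDeleteMessage(
                Arrays.asList("0002166313", "0031745983"),
                Arrays.asList("subject:sport", "publisher:penguin")));
    }
}
```

As with adds, the deletions only become visible to searches after a commit.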
Rollback Operations
The rollback command rolls back all adds and deletes made to the index since the last commit. It neither calls any event listeners nor creates a new searcher. Its syntax is simple: <rollback/>.
Using curl to Perform Updates with the Update Request Handler
You can use the curl utility to perform any of the above commands, using its --data-binary option to append the XML message to the curl command and generate an HTTP POST request. For example:
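curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary '
<add>
  <doc>
    <field name="authors">Patrick Eagar</field>
    <field name="subject">Sports</field>
    <field name="dd">796.35</field>
    <field name="isbn">0002166313</field>
    <field name="yearpub">1982</field>
    <field name="publisher">Collins</field>
  </doc>
</add>'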
For posting XML messages contained in a file, you can use the alternative form:
curl http://localhost:8983/solr/update -H "Content-Type: text/xml" --data-binary @myfile.xml
Short requests can also be sent using an HTTP GET command, URL-encoding the request, as in the following. Note the escaping of "<" and ">":
curl "http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E"
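The URL-encoding step can be delegated to the standard library rather than done by hand. The sketch below is illustrative only: StreamBodyUrlSketch and updateUrl are hypothetical names, and the base URL is an assumption based on the handler path in the supplied configuration. Note that java.net.URLEncoder also encodes "/" as %2F, which decodes to the same message as the hand-written form above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class StreamBodyUrlSketch {

    // Append an XML update message to the handler URL as a URL-encoded stream.body parameter.
    static String updateUrl(String baseUrl, String xmlMessage) {
        try {
            return baseUrl + "?stream.body=" + URLEncoder.encode(xmlMessage, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base URL; adjust it to match your handler configuration.
        System.out.println(updateUrl("http://localhost:8983/solr/update", "<commit/>"));
    }
}
```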
Responses from Solr take the form shown here:
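<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">127</int>
  </lst>
</response>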
The status field will be non-zero in case of failure. The servlet container will generate an appropriate HTML-formatted message in the case of an error at the HTTP layer.
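A client can check the status field before assuming an update succeeded. The following sketch parses a response with the JDK's built-in XML parser. It assumes the response carries the status as an int element whose name attribute is "status"; the class UpdateResponseSketch and the method parseStatus are hypothetical names, not part of Solr or SolrJ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class UpdateResponseSketch {

    // Return the value of the <int name="status"> element, or throw if none is found.
    static int parseStatus(String responseXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
            NodeList ints = doc.getElementsByTagName("int");
            for (int i = 0; i < ints.getLength(); i++) {
                Element e = (Element) ints.item(i);
                if ("status".equals(e.getAttribute("name"))) {
                    return Integer.parseInt(e.getTextContent().trim());
                }
            }
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse update response", ex);
        }
        throw new IllegalStateException("No status field in response");
    }

    public static void main(String[] args) {
        String response = "<response><lst name=\"responseHeader\">"
                + "<int name=\"status\">0</int><int name=\"QTime\">127</int>"
                + "</lst></response>";
        int status = parseStatus(response);
        System.out.println(status == 0 ? "update succeeded" : "update failed, status " + status);
    }
}
```

Errors raised at the HTTP layer arrive as HTML from the servlet container, so a real client would also need to check the HTTP status code before attempting to parse the body as XML.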
A Simple Cross-Platform Posting Tool
For demo purposes, the file example/exampledocs/post.jar includes a cross-platform Java tool for POST-ing XML documents. Open a window and run:
java -jar post.jar
By default, this will contact the server at localhost:8983. The "-help" option outputs the following information on its usage:
SimplePostTool: version 1.2
This is a simple command line tool for POSTing raw XML to a Solr port. XML data can be read from files specified as command line args; as raw commandline arg strings; or via STDIN. Examples:
java -Ddata=files -jar post.jar *.xml java -Ddata=args -jar post.jar '