Lucene Syntax Tutorial
Lucene provides a powerful search syntax that can help you create more accurate and efficient search queries. In this tutorial, we will cover some of the basic Lucene syntax expressions and provide examples for each one.
Querying Fields
You can specify which document fields to search by using the field name followed by a colon (:). For example, to search for documents containing the word "apple" in the title field, you can use the following query:
title:apple
Phrase Queries
Phrase queries search for an exact phrase in the indexed documents. The syntax for phrase queries is to enclose the phrase in double quotes. For example, to search for documents containing the exact phrase "big apple" in the title field, you can use the following query:
title:"big apple"
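When queries are assembled in application code, the field and phrase rules above can be wrapped in a small helper. The following is a minimal sketch, not part of any Lucene library: the function name field_query is hypothetical, and it assumes the value contains no special characters that would need escaping.

```python
def field_query(field, value):
    """Return a field:value clause, quoting the value when it is a phrase.

    Hypothetical helper for illustration; special characters are not escaped.
    """
    # Multi-word values are wrapped in double quotes so they match as an exact phrase.
    if " " in value:
        return f'{field}:"{value}"'
    return f"{field}:{value}"


print(field_query("title", "apple"))      # title:apple
print(field_query("title", "big apple"))  # title:"big apple"
```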
Boolean Queries
Boolean queries allow you to combine multiple search expressions using Boolean operators such as AND, OR, and NOT. The syntax for Boolean queries is as follows.
AND
The AND operator is represented by the uppercase word AND and requires both terms to be present in the documents. For example, to search for documents containing both the words "apple" and "pie" in the title field, you can use the following query:
title:apple AND title:pie
title:apple AND description:apple
The plus sign (+) is an alternative representation of the AND operator and can be used as follows:
+title:apple +title:pie
OR
The OR operator is represented by the uppercase word OR and retrieves documents that contain either term. For example, to search for documents containing either the word "apple" or the word "pie" in the title field, you can use the following query:
title:apple OR title:pie
title:apple OR description:apple
You can use parentheses and commas as an alternative representation of the OR operator, for example:
title:(apple, pie)
NOT
The NOT operator is represented by the uppercase word NOT and retrieves documents that do not contain the specified term. For example, to search for documents containing the word "apple" but not the word "pie", you can use the following query:
title:apple AND NOT title:pie
The minus sign (-) is an alternative representation of the NOT operator and can be used as follows:
+title:apple -title:pie
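One way to reason about the three Boolean operators is as set operations over the documents that match each term. The sketch below uses a made-up, three-document corpus and a hypothetical docs_with helper; it illustrates the semantics only and says nothing about how a search engine evaluates queries internally.

```python
# Toy corpus: document id -> words in the title field (made-up data).
titles = {
    1: {"apple", "pie"},
    2: {"apple", "tart"},
    3: {"pumpkin", "pie"},
}


def docs_with(term):
    """Return the ids of documents whose title contains the term."""
    return {doc_id for doc_id, words in titles.items() if term in words}


apple = docs_with("apple")
pie = docs_with("pie")

both = apple & pie        # title:apple AND title:pie
either = apple | pie      # title:apple OR title:pie
apple_only = apple - pie  # title:apple AND NOT title:pie

print(sorted(both), sorted(either), sorted(apple_only))  # [1] [1, 2, 3] [2]
```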
Wildcard Queries
Wildcard queries allow you to search for terms with varying characters using wildcards. The syntax for wildcard queries is to use an asterisk (*) as a placeholder for zero or more characters. For example, to search for documents containing the word "apples" or "apple" in the title field, you can use the following query:
title:apple*
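To get a feel for which terms a trailing asterisk covers, the pattern can be tried against a word list with Python's standard fnmatch module, whose * also matches any run of characters, including an empty one. This is only an analogy for the matching rule, consistent with the example above where the pattern covers both "apple" and "apples"; the word list is made up.

```python
from fnmatch import fnmatchcase

terms = ["apple", "apples", "applesauce", "pineapple"]

# The pattern is anchored at the start of the term, so "pineapple" does not match.
matches = [term for term in terms if fnmatchcase(term, "apple*")]

print(matches)  # ['apple', 'apples', 'applesauce']
```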
Fuzzy Queries
Fuzzy queries allow you to search for terms with similar spellings using the tilde (~) operator. The syntax for fuzzy queries is to append the tilde to the end of the term, followed by a number representing the maximum edit distance. For example, to search for documents containing the word "apple" or words with similar spellings such as "aple" or "aplle", you can use the following query:
title:apple~1
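The edit distance in a fuzzy query counts single-character changes between two terms. As a rough illustration, the sketch below computes the plain Levenshtein distance (insertions, deletions, substitutions) and keeps the candidate words within distance 1 of "apple". The function name and the candidate list are made up for this example, and the search engine's own distance measure may differ in details, for example in how it counts two swapped adjacent letters.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


candidates = ["apple", "aple", "aplle", "maple", "banana"]
within_one = [word for word in candidates if edit_distance("apple", word) <= 1]

print(within_one)  # ['apple', 'aple', 'aplle']
```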
Range Queries on Date Fields
You can search for documents within a specific date range by using range queries on date fields. The syntax involves using the field name followed by either curly braces { } or square brackets [ ], with dates in formats like:
YYYY-MM-DD (e.g., 2022-12-31)
YYYY-MM-DDTHH:MM:SS (e.g., 2022-12-05T16:00:20-05:00)
Curly braces { } exclude the boundary values, while square brackets [ ] include them.
Search Across Dates
For example, to search for documents created between January 1, 2020, and December 31, 2020, including both boundaries, you can use the following query:
createdAt:[2020-01-01 TO 2020-12-31]
The search result includes documents created on 2020-01-01 and 2020-12-31. If you use curly braces instead, the result excludes documents created on 2020-01-01 and 2020-12-31.
Search Across Times
You can also search across times. For example, to search for documents with a created time between 4 PM and 5 PM on December 5, 2022, you can use the following query:
createdAt:[2022-12-05T16:00:00 TO 2022-12-05T17:00:00]
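The difference between square brackets and curly braces comes down to whether the comparison at each boundary is inclusive or strict. The sketch below mimics that rule with Python's datetime types; in_range is a hypothetical helper used only to illustrate the boundary behaviour described above.

```python
from datetime import date, datetime


def in_range(value, low, high, inclusive=True):
    """Mimic [low TO high] when inclusive is True, {low TO high} otherwise."""
    if inclusive:
        return low <= value <= high
    return low < value < high


start, end = date(2020, 1, 1), date(2020, 12, 31)

print(in_range(date(2020, 1, 1), start, end))                    # True: [ ] keeps the boundary
print(in_range(date(2020, 1, 1), start, end, inclusive=False))   # False: { } drops the boundary
print(in_range(date(2020, 6, 15), start, end, inclusive=False))  # True: strictly inside

# The same comparison works for timestamps, as in the 4 PM to 5 PM example.
four_pm = datetime(2022, 12, 5, 16, 0, 0)
five_pm = datetime(2022, 12, 5, 17, 0, 0)
print(in_range(datetime(2022, 12, 5, 16, 30, 0), four_pm, five_pm))  # True
```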
Range Queries on Number Fields
You can search for documents that fall within a specific numerical range by using range queries on number fields. The syntax for range queries on number fields is to use the field name followed by curly braces { } or square brackets [ ] with the range values. Curly braces exclude the boundaries from the search, while square brackets include them.
For example, to search for documents with earnings per share between 1 and 5.90, including both boundaries, you can use the following query:
eps:[1 TO 5.90]
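If range queries are generated from code, the choice of bracket can be driven by a single flag. The following sketch assumes a hypothetical range_query helper; it only formats the query string and does not validate the field name or the values.

```python
def range_query(field, low, high, inclusive=True):
    """Build field:[low TO high] (inclusive) or field:{low TO high} (exclusive)."""
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{opening}{low} TO {high}{closing}"


print(range_query("eps", 1, 5.9))                   # eps:[1 TO 5.9]
print(range_query("eps", 1, 5.9, inclusive=False))  # eps:{1 TO 5.9}
print(range_query("createdAt", "2020-01-01", "2020-12-31"))
# createdAt:[2020-01-01 TO 2020-12-31]
```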
Nested Search Conditions
Nested search conditions can help you create more complex queries by grouping expressions together with parentheses. This allows you to control the order in which the expressions are evaluated. For example, you can group expressions together to specify that certain conditions should be evaluated before others, or to ensure that the right combination of expressions is used to produce the desired search results.
The syntax for using nested search conditions with parentheses is as follows:
(expression_1) AND (expression_2 OR expression_3)
In this example, the search engine will first evaluate expression_1. If expression_1 is true, then it will evaluate (expression_2 OR expression_3). If expression_1 is false, then it will not evaluate (expression_2 OR expression_3).
Here is an example of how to use nested search conditions with parentheses:
(title:apple OR title:pie) AND (createdAt:[2020-01-01 TO 2020-12-31] AND eps:[1 TO 5.90])
In this example, the search engine will first evaluate the expression (title:apple OR title:pie). If a document has "apple" or "pie" in the title, then it will evaluate the second expression (createdAt:[2020-01-01 TO 2020-12-31] AND eps:[1 TO 5.90]). If a document does not have "apple" or "pie" in the title, then it will not evaluate the second expression.
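The grouping in this query can be mirrored with ordinary Boolean expressions applied to a single document. The sketch below uses made-up sample documents and a hypothetical matches function; Python's and operator skips its right-hand side when the left-hand side is false, which parallels the evaluation order described above.

```python
from datetime import date


def matches(doc):
    """Apply (title:apple OR title:pie) AND (createdAt range AND eps range) to one document."""
    title_ok = "apple" in doc["title"] or "pie" in doc["title"]
    # The right-hand group is only evaluated when title_ok is True.
    return title_ok and (
        date(2020, 1, 1) <= doc["createdAt"] <= date(2020, 12, 31)
        and 1 <= doc["eps"] <= 5.90
    )


# Made-up sample documents for illustration.
crumble = {"title": {"apple", "crumble"}, "createdAt": date(2020, 3, 14), "eps": 2.5}
late_pie = {"title": {"cherry", "pie"}, "createdAt": date(2021, 2, 1), "eps": 2.5}
tart = {"title": {"lemon", "tart"}, "createdAt": date(2020, 3, 14), "eps": 2.5}

print(matches(crumble))   # True: title, date, and eps all satisfy the query
print(matches(late_pie))  # False: created outside 2020
print(matches(tart))      # False: neither "apple" nor "pie" in the title
```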