Full Text Search

How to use full text search in PostgreSQL.

Postgres has built-in functions to handle Full Text Search queries. This is like a "search engine" within Postgres.

Preparation

For this guide we'll use the following example data:

id	title	author	description
1	The Poky Little Puppy	Janette Sebring Lowrey	Puppy is slower than other, bigger animals.
2	The Tale of Peter Rabbit	Beatrix Potter	Rabbit eats some vegetables.
3	Tootle	Gertrude Crampton	Little toy train has big dreams.
4	Green Eggs and Ham	Dr. Seuss	Sam has changing food preferences and eats unusually colored food.
5	Harry Potter and the Goblet of Fire	J.K. Rowling	Fourth year of school starts, big drama ensues.

Usage

The functions we'll cover in this guide are:

`to_tsvector()`

Converts your data into searchable tokens. to_tsvector() stands for "to text search vector." For example:


_10select to_tsvector('green eggs and ham');
_10-- Returns 'egg':2 'green':1 'ham':4

Collectively these tokens are called a "document" which Postgres can use for comparisons.

`to_tsquery()`

Converts a query string into tokens to match. to_tsquery() stands for "to text search query."

This conversion step is important because we will want to "fuzzy match" on keywords. For example if a user searches for eggs, and a column has the value egg, we probably still want to return a match.

Match: `@@`

The @@ symbol is the "match" symbol for Full Text Search. It returns any matches between a to_tsvector result and a to_tsquery result.

Take the following example:


_10select *
_10from books
_10where title = 'Harry';

The equality symbol above (=) is very "strict" on what it matches. In a full text search context, we might want to find all "Harry Potter" books and so we can rewrite the example above:


_10select *
_10from books
_10where to_tsvector(title) @@ to_tsquery('Harry');

Basic full text queries

Search a single column

To find all books where the description contain the word big:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description)
_10  @@ to_tsquery('big');

Search multiple columns

Right now there is no direct way to use JavaScript or Dart to search through multiple columns but you can do it by creating computed columns on the database.

To find all books where description or title contain the word little:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description || ' ' || title) -- concat columns, but be sure to include a space to separate them!
_10  @@ to_tsquery('little');

Match all search words

To find all books where description contains BOTH of the words little and big, we can use the & symbol:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description)
_10  @@ to_tsquery('little & big'); -- use & for AND in the search query

Match any search words

To find all books where description contain ANY of the words little or big, use the | symbol:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description)
_10  @@ to_tsquery('little | big'); -- use | for OR in the search query

Notice how searching for big includes results with the word bigger (or biggest, etc).

Partial search

Partial search is particularly useful when you want to find matches on substrings within your data.

Implementing partial search

You can use the :* syntax with to_tsquery(). Here's an example that searches for any book titles beginning with "Lit":


_10select title from books where to_tsvector(title) @@ to_tsquery('Lit:*');

Extending functionality with RPC

To make the partial search functionality accessible through the API, you can wrap the search logic in a stored procedure.

After creating this function, you can invoke it from your application using the SDK for your platform. Here's an example:


_10create or replace function search_books_by_title_prefix(prefix text)
_10returns setof books AS $$
_10begin
_10  return query
_10  select * from books where to_tsvector('english', title) @@ to_tsquery(prefix || ':*');
_10end;
_10$$ language plpgsql;

This function takes a prefix parameter and returns all books where the title contains a word starting with that prefix. The :* operator is used to denote a prefix match in the to_tsquery() function.

Handling spaces in queries

When you want the search term to include a phrase or multiple words, you can concatenate words using a + as a placeholder for space:


_10select * from search_books_by_title_prefix('Little+Puppy');

Creating indexes

Now that we have Full Text Search working, let's create an index. This will allow Postgres to "build" the documents pre-emptively so that they don't need to be created at the time we execute the query. This will make our queries much faster.

Searchable columns

Let's create a new column fts inside the books table to store the searchable index of the title and description columns.

We can use a special feature of Postgres called Generated Columns to ensure that the index is updated any time the values in the title and description columns change.


_10alter table
_10  books
_10add column
_10  fts tsvector generated always as (to_tsvector('english', description || ' ' || title)) stored;
_10
_10create index books_fts on books using gin (fts); -- generate the index
_10
_10select id, fts
_10from books;

Search using the new column

Now that we've created and populated our index, we can search it using the same techniques as before:


_10select
_10  *
_10from
_10  books
_10where
_10  fts @@ to_tsquery('little & big');

Query operators

Visit PostgreSQL: Text Search Functions and Operators to learn about additional query operators you can use to do more advanced full text queries, such as:

Proximity: `<->`

The proximity symbol is useful for searching for terms that are a certain "distance" apart. For example, to find the phrase big dreams, where the a match for "big" is followed immediately by a match for "dreams":


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description) @@ to_tsquery('big <-> dreams');

We can also use the <-> to find words within a certain distance of each other. For example to find year and school within 2 words of each other:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description) @@ to_tsquery('year <2> school');

Negation: `!`

The negation symbol can be used to find phrases which don't contain a search term. For example, to find records that have the word big but not little:


_10select
_10  *
_10from
_10  books
_10where
_10  to_tsvector(description) @@ to_tsquery('big & !little');

Resources

PostgreSQL: Text Search Functions and Operators

Edit this page on GitHub

Full Text Search

How to use full text search in PostgreSQL.

Preparation#

Usage#

to_tsvector()#

to_tsquery()#

Match: @@#

Basic full text queries#

Search a single column#

Search multiple columns#

Match all search words#

Match any search words#

Partial search#

Implementing partial search#

Extending functionality with RPC#

Handling spaces in queries#

Creating indexes#

Searchable columns#

Search using the new column#

Query operators#

Proximity: <->#

Negation: !#

Resources#