Set-up Site-wide Searching in Tripal v3

Author: laceysanderson


Tripal Version: 


Search API Module
Search API Database Service


Short Description: 

The following tutorial will guide you through setting up a search page with a single keyword search box that searches across entity types and fields. For example, suppose your user enters the keyword "Lentil" and you have two organisms with "Lentil" in the common name, a number of projects with "Lentil" in either the title or description, genes with "Lentil" as the source organism, etc. All of these results would be shown to the user.

Note: Unfortunately, this tutorial will not show you how to use the default Drupal search box to get these results (since that is not possible), nor how to create a search block that imitates the Drupal search box (which is possible, but the tutorial was already too long).

Note: This tutorial saves the index in the Drupal database. This was done for ease of set-up and display in a tutorial. While the Drupal database solution can be sufficient for smaller sites (or even larger sites which are configured optimally, which is beyond the scope of this tutorial), you may find you want to extend your site-wide search to use Elasticsearch or Apache Solr. This will hopefully be covered in a later tutorial.

Installing Drupal Search API

  • Search API: This module provides an interface for much more powerful, efficient searching than the Drupal core search module. Specifically, it allows you to use more powerful engines such as Elasticsearch and Apache Solr, as well as advanced features such as facets (for narrowing down search results based on fields or entity type), fuzzy search, etc.
  • Search API Database Service: This module provides a search backend/server, which defines how your search index is stored. Specifically, it stores the index in your current Drupal database.

Simply install the above two modules as you would any other Drupal module. For instructions, refer to the Tutorial.

Screenshot: Modules enable page with Database Search, Search API and Search views enabled.

Define your Search Backend/Server

First up, we need to tell the Search API where we want our index stored. This tutorial covers using a basic Drupal database storage backend for your search. For large sites, you will want to look into using Elasticsearch or Apache Solr. To get to the configuration page for the Search API, you can either click on the Configure link shown in the above screenshot of the enable modules page or navigate to Configuration > Search API through the administrative toolbar. You should see the following screen:

The first thing I always do is delete the "Default node index". We don't need it and I don't like seeing red X's on configuration screens (they make me think I've done something wrong ;-) ). Next, click on "Add Server". Since we are configuring a basic Drupal database search server, we don't actually have to install any third-party software or set up an external server. Instead, we just fill out the following configuration form to tell the Search API to use its own database to store the index we will create in the next step. Simply give this server a name ("Drupal Database") and select "Database service" for the "Service Class". I also always enable "Search on parts of a word", since I feel that if the search is slow due to this feature, it's time to upgrade to a different service class (ie: Elasticsearch or Apache Solr). Click "Create Server" to finish configuring the search backend.
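
To make the "Search on parts of a word" option a bit more concrete, here is a minimal Python sketch that contrasts whole-word matching with partial-word matching. It is only an illustration of the idea, not the Database service's actual implementation; the function names and the sample title are made up for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_whole_word(keyword, text):
    """True only if the keyword equals one of the indexed words."""
    return keyword.lower() in tokenize(text)

def matches_partial_word(keyword, text):
    """True if the keyword appears anywhere inside an indexed word."""
    needle = keyword.lower()
    return any(needle in word for word in tokenize(text))

# Hypothetical content title, for illustration only.
title = "Lentils drought tolerance project"

print(matches_whole_word("lentil", title))    # False: "lentil" != "lentils"
print(matches_partial_word("lentil", title))  # True: "lentil" is inside "lentils"
```

Partial matching has to look inside every word rather than compare whole words, which is one plausible reason it can be slower on a large index.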

Screenshot: Add Server config form

You should see the following screen assuming all went well. Click on "Search API" at the top of the screen (circled) to get back to the main configuration screen.


Define a Search Index

Now that we've told the Search API where we want to store our index, we have to define the index itself :-). Back at the main Search API configuration page (2nd screenshot of this post), click on "Add index". The resulting page should look like the following screenshot. Name your index something descriptive with the word "search" in it, such as "Tripal Content Search" (this name will be used when setting up the search form/listing view), and select "Tripal Content" as the "Item Type", which defines what entities should be indexed. One thing to keep in mind here is that the Search API currently doesn't support multi-entity search (ie: both Tripal and Node content) without the Search API Multi-index Search extension module, which I have not personally tried. Notice that we didn't check any of the Bundles; this ensures that all Tripal Content will be indexed by the search. Finally, select the database backend you created in the first step as the "Server" and click "Create Index".

Next you need to configure which fields should be indexed. You will be presented with a very long list of fields (how long depends on how many Tripal Content types you have) and you need to check the checkbox beside each field you would like to be searched using the single search box. The very first thing you should do is scroll to the bottom of the list and expand the "Add Related Fields" fieldset. If you are interested in any related fields, add them first, since doing so after you've checked a ridiculous number of checkboxes causes them all to be cleared...

The first ~10 fields will be general for all Tripal Content Types (ie: Content Id, Type, Bundle; boxed in the screenshot below). I usually only select the Title and Bundle, and I usually give a pretty generous boost to the title (Title Boost=5 in the screenshot below). The boost drop-down influences the "Relevance" that a search result will have. By increasing the boost for the title I'm saying "if the user's keywords are in the title, it is more likely this content is the one they're looking for".

Next we get down to the list of entity-type-specific fields. What you select here depends completely on your own site and content, but I like to select most (if not all) of these fields just to be safe. Keep in mind this directly affects the size of your index, so if you know there is no useful information in a given field then don't select it. You can always come back and edit this at a later date (although it does require re-indexing your site). The most important thing to consider at this point is what boost to apply to the various fields. As a rule of thumb, I like to give a high boost (but not as high as the title; ie: 3) to name fields and the default boost otherwise. You may even want to apply a reduced (below-default) boost to fields that users are extremely unlikely to search on (but that you may want to use in facets) or that are likely to produce false positives (ie: analysis program version). Once you are done overthinking this step, click on "Save Changes".
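
If it helps to picture how boosts shift relevance, the following toy Python model multiplies the number of keyword hits in each field by that field's boost and adds them up. This is an assumption made purely for illustration: the real scoring is done by the search backend and will differ. The field names, boost values and sample items are hypothetical, loosely following the suggestions above (title 5, name 3, everything else 1).

```python
# Hypothetical boosts, mirroring the rule of thumb in the text.
FIELD_BOOSTS = {"title": 5.0, "name": 3.0, "description": 1.0}

def toy_relevance(keyword, item, boosts=FIELD_BOOSTS):
    """Sum boost * (number of keyword occurrences) over the indexed fields."""
    needle = keyword.lower()
    score = 0.0
    for field, boost in boosts.items():
        text = item.get(field, "").lower()
        score += boost * text.count(needle)
    return score

# Made-up content items, for illustration only.
items = [
    {"title": "Lentil", "description": "Organism record."},
    {"title": "Drought study",
     "description": "A project on lentil and lentil relatives."},
]

for item in items:
    print(item["title"], toy_relevance("lentil", item))
# Lentil 5.0
# Drought study 2.0
```

Even though the second item mentions the keyword twice, the single hit in the boosted title ranks the first item higher, which is the effect the title boost is meant to have.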

Finally (last step for creating the index!), you just need to pick the extra features you would like supported :-). I don't bother selecting any of the "Data Alterations", but I get a little overexcited when it comes to "Processors". Keep in mind that the order of the Processors is important (ie: if you have the HTML filter after highlighting, it will remove your highlighting ;-) ). I think "Ignore case", "HTML Filter" and "Highlighting" are the most useful, and I select them in that order. It may also be a good idea to add "Tokenizer" if you are indexing any long text fields, since you can get errors if the default tokenizing fails, resulting in "overly long" words. Click "Save Configuration".
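
The point about processor order can be shown with a small Python sketch. The two functions below are simplified stand-ins written for this example (they are not the Search API's processors): one strips HTML tags and the other wraps keyword matches in <strong> tags. Running them in the two possible orders shows why the HTML filter has to come first.

```python
import re

def html_filter(text):
    """Strip anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)

def highlight(text, keyword):
    """Wrap each case-insensitive keyword match in <strong> tags."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return pattern.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)

# Made-up field value containing markup.
raw = "<p>Lentil genome assembly</p>"

# HTML filter first, then highlighting: the highlight survives.
good = highlight(html_filter(raw), "lentil")
# Highlighting first, then HTML filter: the highlight is stripped again.
bad = html_filter(highlight(raw, "lentil"))

print(good)  # <strong>Lentil</strong> genome assembly
print(bad)   # Lentil genome assembly
```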

Your index will have been scheduled for indexing! Depending upon the amount of content you have, this could take a while, as it will only index 50 pieces of Tripal content per Drupal Cron run. If you click on the "View" tab at the top of the index config page you can see the progress of the indexing process. You can get back to this screen from the main Search API configuration page by clicking on the name of the index.
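
For a rough idea of how long "a while" might be, here is a back-of-the-envelope Python sketch. It assumes the batch size of 50 items per cron run mentioned above and a cron job that fires on a fixed interval; the content count and the hourly interval in the example are made-up numbers, so substitute your own.

```python
import math

def cron_runs_needed(item_count, batch_size=50):
    """Number of cron runs needed to index item_count items."""
    return math.ceil(item_count / batch_size)

def hours_to_index(item_count, cron_interval_minutes, batch_size=50):
    """Rough wall-clock estimate, assuming cron fires on a fixed interval."""
    runs = cron_runs_needed(item_count, batch_size)
    return runs * cron_interval_minutes / 60

# Hypothetical site with 12,000 pieces of Tripal content and hourly cron.
print(cron_runs_needed(12000))    # 240
print(hours_to_index(12000, 60))  # 240.0
```

With numbers like these, running cron more often while the initial index builds would shorten the wait considerably.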


Creating a Search Interface for your users

At this point you should have an index of your Tripal Content. However, you still haven't created any functionality for end users: the data might be indexed, but they can't search it yet. To create the search page we are going to use Views. Start by going to the Views administration UI (Structure > Views) and click on "Add new view".

Name it something descriptive, as this will show up in the administrative listing, and for the view type (the drop-down beside "Show") select the name of the index you created in the last step (ie: "Tripal Content Search"). Name the page something helpful to the user (I avoid the word Tripal and describe the data instead; ie: "Search Biological Data") and change the path (ie: search/biological-data). Click "Continue & edit".

You will be taken to the Edit Views UI, which can be intimidating even if you've been introduced to it before. With that in mind, the following screenshot attempts to orient you to the parts of the UI we will use for a search form/results. I will go through Fields, Filters and Sorts as we need them, so don't feel the need to accomplish anything from this screenshot; just note that you should focus on the left side of the UI when looking for the parts I'm going to mention later.

Make sure to save your view periodically by clicking on the "Save" button at the top of the page.

Configuring What is displayed for each Search Result

First off, we are going to change what is displayed for each result. By default just the unique identifier is displayed, which of course is not useful to the user. We want to hide that field by clicking on its name, "Indexed Tripal Content: Tripal content id", which opens the configuration pop-up, and then checking "Hidden". Since we will be using this field to create our link, we also want to change the "Thousands marker" to "- None -". Click "Apply (all displays)" to save these changes.

Then click on the "Add" button beside the "Fields" title to open the add fields pop-up shown in the next screenshot. For this tutorial my search results are going to include the title linked to the content and the highlighted "context" of the search. To add the title, filter the fields list to "Indexed Tripal Content" and then click the checkbox beside "Indexed Tripal Content: Title". Click "Apply (all displays)" to add this field to the view.

Once you add it you will be shown a configuration form for the field. We don't want a label, so uncheck that checkbox. We do want to make it a link, so expand the "Rewrite Results" fieldset, check "Output this field as a link" and set the link path to "bio-data/[id]". This uses tokens to fill in the unique identifier (the hidden Tripal content id field) and uses it to create the path to the entity for each search result.
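
The following Python sketch imitates the token replacement to show why the "Thousands marker" on the hidden id field needs to be set to "- None -". The two helper functions are hypothetical and written only for this example (Views does the real work for you); the id 1234 is made up.

```python
def format_id(entity_id, thousands_marker=""):
    """Render a numeric id as text, optionally with a thousands separator."""
    if thousands_marker:
        return f"{entity_id:,}".replace(",", thousands_marker)
    return str(entity_id)

def rewrite_link(path_pattern, tokens):
    """Replace each [token] in the pattern with its rendered value."""
    for name, value in tokens.items():
        path_pattern = path_pattern.replace(f"[{name}]", value)
    return path_pattern

# With no thousands marker the path points at the entity.
print(rewrite_link("bio-data/[id]", {"id": format_id(1234)}))
# bio-data/1234

# With a comma as the thousands marker the path is broken.
print(rewrite_link("bio-data/[id]", {"id": format_id(1234, ",")}))
# bio-data/1,234
```

Because the token is filled with the rendered field value, any formatting applied to the id ends up in the link path.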

Next we want to add the Highlighted search context. To do this click on the "Add" fields button again but this time filter to "Search" and check "Search: Excerpt". On configuration, again, remove the label.

At this point, if you click on the "Update Preview" button you will see a list of titles for your content and then emptiness underneath each title since there was no keyword entered yet so an excerpt could not be generated.

Adding the Keywords Search Box

Click on the "Add" button beside "Filter Criteria" and in the resulting pop-up, select "Search" for the filter and then check "Search: Fulltext Search". Click "Apply (all displays)" to add the filter.

In order to let the users see it, we need to expose this filter. We do that by clicking the checkbox beside "Expose this filter to visitors..." on the filter configuration form. We also want to change the Label to "Keywords". Other than those two changes, the defaults will work, so click "Apply (all displays)".

If you save your view now and then go to the page, you should see a nice little Keywords box, and if you enter a keyword and click Apply it should show you filtered results with context highlighting!!

Sort by "Relevance"

Click on the "Add" button beside "Sort Criteria" and check "Search: Relevance". Apply it to the display and configure it to "Sort descending" so that higher-scoring results are shown first.
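
In other words, the listing is ordered from the highest relevance score to the lowest. A tiny Python sketch of that ordering, using made-up titles and scores:

```python
# Hypothetical search results with relevance scores from the backend.
results = [
    {"title": "Drought study", "relevance": 2.0},
    {"title": "Lentil", "relevance": 5.0},
    {"title": "Lens culinaris gene", "relevance": 3.0},
]

# "Sort descending": the highest-scoring result comes first.
ranked = sorted(results, key=lambda r: r["relevance"], reverse=True)

print([r["title"] for r in ranked])
# ['Lentil', 'Lens culinaris gene', 'Drought study']
```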

Only Show results when user clicks Search

On the right side of the Views UI (you may have to expand the "Advanced" fieldset), under "Exposed Form", click on "Exposed form style: Basic" and change it to "Input Required". This ensures that the user doesn't see results until they click "Search". In the resulting configuration form, change the button label to "Search" and uncheck "Expose sort order".

Now save your view. You're done!