Enhancing search result relevance in Drupal can be achieved by configuring Solr Query Re-Ranking and utilizing the Search API. Follow these steps to fine-tune your search results:
Prerequisites
- Drupal installed and configured.
- Apache Solr installed and running.
- Search API and Search API Solr modules installed in Drupal.
Steps
- Configure Solr Server:
- Navigate to Configuration -> Search and Metadata -> Search API.
- Add a new server and select Solr as the backend.
- Enter the required details such as the Solr endpoint URL.
- Create an Index:
- Go to Configuration -> Search and Metadata -> Search API and add a new index.
- Assign the index to the Solr server you configured earlier.
- Select the content types to be indexed.
- Save the configuration and start indexing your content.
- Configure Query Re-Ranking:
- Edit the Solr configuration files (e.g., solrconfig.xml and schema.xml) to enable query re-ranking.
- Use the rerank parser and its reRankQuery parameter to configure re-ranking behavior in your queries.
- Use Search API to Fine-Tune Results:
- In Drupal, configure the Search API settings to leverage Solr’s re-ranking capabilities.
- Adjust the weights of fields and other ranking criteria to improve result relevance.
- Test and Optimize:
- Perform search queries and analyze the results.
- Iteratively adjust the configuration to achieve the desired search relevance.
Additional Tips
- Regularly update your Solr schema and configuration to leverage new features and improvements.
- Monitor search performance and tweak the settings as your content evolves.
The path to better search results
Today, we still maintain content for both legacy and modern Drupal. The use of our legacy Drupal content has steadily decreased, yet it still has a substantial presence in search results. This often leads to confusion, especially when members trying to learn about features in modern Drupal find themselves on a legacy Drupal tutorial.
We have always enabled faceted searching, which allows members to narrow down the results to a specific version of Drupal after performing the initial search. Recently, based on member feedback, we decided to explore additional methods to better surface content relevant to modern Drupal, which is what the majority of our members now use.
Initially, I thought setting the “Drupal 10” facet as the default could address this issue. However, after spending an entire day exploring this, I realized this approach was impractical. Facets filter based on values present in the result set, so if a search only returns “Drupal 7” content, the “Drupal 10” facet won’t appear, and you cannot select an option that does not exist.
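To make the limitation concrete, here is a minimal plain-PHP sketch of how facet options are derived from a result set. The function name facetCounts() and the sample data are hypothetical and are not part of the Facets module or Solr; the only point is that a value absent from the results produces no option to select.

```php
<?php
declare(strict_types=1);

/**
 * Counts how often each value of a field occurs in a result set.
 *
 * Hypothetical helper for illustration; real facet counts come from Solr.
 *
 * @param array<int, array<string, mixed>> $results
 * @return array<string, int>
 */
function facetCounts(array $results, string $field): array {
  $counts = [];
  foreach ($results as $doc) {
    foreach ((array) ($doc[$field] ?? []) as $value) {
      $counts[$value] = ($counts[$value] ?? 0) + 1;
    }
  }
  return $counts;
}

// A search that only matched legacy content.
$results = [
  ['title' => 'Views in Drupal 7', 'versions' => ['Drupal 7']],
  ['title' => 'Theming in Drupal 7', 'versions' => ['Drupal 7']],
];

$facets = facetCounts($results, 'versions');

// "Drupal 10" never appears as a key, so there is nothing to preselect.
var_dump(isset($facets['Drupal 10']));
```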
First attempt: boosting
I then explored using boosting. Boosting adjusts the relevance score of an item based on query-time criteria. For example, you might give higher relevance to a result if the keywords in the search query appear in the title field rather than in the body. The Search API Solr module already supports this, and we use it to rank courses and guides higher than tutorials.
I considered creating a Search API processor plugin similar to the existing solr_boost_more_recent to add a Lucene expression to the document boost factors. This Lucene function runs as part of the query and uses the result as a multiplier to boost the document. For instance, you could boost the relevance of any document tagged with “Drupal 10” in the versions field by 5, regardless of the query.
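// Start from any boost factors other processors have already set.
$boosts = $query->getOption('solr_document_boost_factors', []);
// Boost documents whose versions field contains $term; others get 0.0.
$boosts['taxonomy_versions'] = sprintf('if(termfreq(%s,"%s"),%2F,0.0)', SolrBackendInterface::FIELD_PLACEHOLDER, $term, $boost_factor);
if ($boosts) {
  $query->setOption('solr_document_boost_factors', $boosts);
}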
This results in a Solr query like:
{!boost b=sum(if(termfreq(tm_X3b_und_version,"Drupal 10"),5.0,0.0))} (tm_X3b_und_body:+"testing" ...)
However, this method does not work with multi-word phrases like “Drupal 10”. While I could get it working with “10”, this solution was not ideal.
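A likely explanation, offered here as an assumption rather than something the experiments above confirmed: termfreq() looks up a single indexed term, while a tokenized full-text field stores “Drupal 10” as two separate tokens. The plain-PHP sketch below imitates that with a deliberately naive tokenizer; tokenize() and termFreq() are hypothetical stand-ins, not Solr APIs.

```php
<?php
declare(strict_types=1);

/** Naive stand-in for a text analyzer: lowercase and split on whitespace. */
function tokenize(string $text): array {
  return preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
}

/** Counts exact matches of one term among the indexed tokens. */
function termFreq(array $indexedTokens, string $term): int {
  return count(array_keys($indexedTokens, strtolower($term), true));
}

// The versions field value is indexed as two tokens: "drupal" and "10".
$indexed = tokenize('Drupal 10');

echo termFreq($indexed, 'Drupal 10'), "\n"; // No single token equals the phrase.
echo termFreq($indexed, '10'), "\n";        // The token "10" is found.
```

Under that assumption, a single-token value such as “10” can match, but the match is against any token “10” in the field, which is part of why the workaround is not ideal.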
I also tried other variations, such as indexing the term ID field as an integer to see if the term ID is in the multi-value integer field:
if(itm_taxonomy_versions:1461,5.0,0.0)
or
if(exists(query({v=itm_taxonomy_versions:"1045"})),10,0)
After much experimentation and consultation in Slack, I was directed towards query re-ranking as a possible alternative.
Using query re-ranking to influence sort order in Solr search results
Query re-ranking in Solr adjusts the order of search results after the initial query execution. A re-ranking query applies a secondary scoring phase to the top N results. Unlike boosting, re-ranking modifies the order based on additional criteria post-query.
For example: run the query for the keyword “views”, then take the top results and run a second query against them, applying a ranking weight to any document that matches the second query.
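The two-phase idea can be sketched in plain PHP. This is a simplified model, not Solr's implementation: the rerank() function, the sample documents, and the weight of -5.0 are all hypothetical, and the model adds a flat weight to each matching document, whereas Solr multiplies the weight by the re-rank query's own score.

```php
<?php
declare(strict_types=1);

/**
 * Re-scores the top $topN documents and re-sorts only those.
 *
 * @param array<int, array{id: string, score: float, version: string}> $docs
 *   Results of the first query, already sorted by descending score.
 * @param callable $matchesRerankQuery
 *   Stand-in for the second query; returns TRUE for matching documents.
 */
function rerank(array $docs, int $topN, float $weight, callable $matchesRerankQuery): array {
  $top = array_slice($docs, 0, $topN);
  $rest = array_slice($docs, $topN);
  foreach ($top as &$doc) {
    if ($matchesRerankQuery($doc)) {
      // Additive re-ranking: a negative weight demotes matching documents.
      $doc['score'] += $weight;
    }
  }
  unset($doc);
  usort($top, fn(array $a, array $b): int => $b['score'] <=> $a['score']);
  // Documents beyond the top N keep their original order.
  return array_merge($top, $rest);
}

// Hypothetical results of the first query for "views".
$results = [
  ['id' => 'views-d7', 'score' => 9.0, 'version' => 'Drupal 7'],
  ['id' => 'views-d10', 'score' => 8.5, 'version' => 'Drupal 10'],
  ['id' => 'views-ui-d7', 'score' => 6.0, 'version' => 'Drupal 7'],
];

$isLegacy = fn(array $doc): bool => $doc['version'] === 'Drupal 7';
$reranked = rerank($results, 1000, -5.0, $isLegacy);

echo implode(', ', array_column($reranked, 'id')), "\n";
// Prints: views-d10, views-d7, views-ui-d7
```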
I wrote an event subscriber that adds a hard-coded re-ranking query to all searches on Drupalize.Me:
<?php

declare(strict_types=1);

namespace Drupal\dme\EventSubscriber;

use Drupal\search_api_solr\Event\PreQueryEvent;
use Drupal\search_api_solr\Event\SearchApiSolrEvents;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

class DmeSearchApiSolrSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents(): array {
    return [
      SearchApiSolrEvents::PRE_QUERY => 'preQuery',
    ];
  }

  public function preQuery(PreQueryEvent $event): void {
    $solarium_query = $event->getSolariumQuery();
    // Re-rank the top 1000 results; the negative weight demotes matches.
    $rq = '{!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=-7 reRankOperator=add}';
    $solarium_query->addParam('rq', $rq);
    // The re-rank query that the rq parameter references as $rqq.
    $solarium_query->addParam('rqq', '(itm_taxonomy_versions:1045 OR itm_taxonomy_versions:1044)');
  }

}
This approach allows specifying a negative reRankWeight, which effectively demotes Drupal 7 content, and avoids the need for future adjustments when new Drupal versions are added.
While I chose a hard-coded approach to minimize custom code, I also developed a proof-of-concept showing that re-ranking could be implemented in a Search API processor plugin with a configuration form.
In the preprocessSearchQuery() method of the plugin, you can set Solr query string parameters using:
$query->setOption('solr_param_rq', $rq);
This uses the solr_param_* prefix, which the Search API Solr module recognizes and applies to the Solarium query.
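As a rough illustration of what a configurable plugin would have to assemble from its form values, the sketch below builds the two parameter values in plain PHP. The function name buildRerankParams() and its arguments are hypothetical; the parameter names (rq, rqq, reRankQuery, reRankDocs, reRankWeight, reRankOperator) are the ones used in the event subscriber above.

```php
<?php
declare(strict_types=1);

/**
 * Builds the values for Solr's rq and rqq request parameters.
 *
 * Hypothetical helper for illustration, not part of Search API Solr.
 *
 * @return array{rq: string, rqq: string}
 */
function buildRerankParams(string $rerankQuery, int $docs, float $weight, string $operator = 'add'): array {
  // Single quotes keep $rqq literal; Solr resolves it to the rqq parameter.
  $rq = sprintf(
    '{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%s reRankOperator=%s}',
    $docs,
    $weight,
    $operator
  );
  return ['rq' => $rq, 'rqq' => $rerankQuery];
}

$params = buildRerankParams(
  '(itm_taxonomy_versions:1045 OR itm_taxonomy_versions:1044)',
  1000,
  -7
);

echo $params['rq'], "\n";
echo $params['rqq'], "\n";
```

In a processor, the rq value would be passed with the setOption() call shown above. Passing rqq the same way, through a solr_param_rqq option, is an assumption based on the prefix convention and is worth verifying against the module.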
Now, when you perform a search on our site, the results should be more relevant.
Next, I plan to explore adding support for Solr 9’s DenseVector searching feature to perform semantic searches. This will help return relevant tutorials for queries such as “How do I put fields on the page?” even if those specific phrases are not in the text.
Learn more about Solr and Drupal
If you’re interested in learning more about integrating Apache Solr and Drupal, check out our Search API and Solr in Drupal course.