Multisite Apache Solr Search with Domain Access
This post is more than 10 years old. I do not delete posts, because even old information is still useful, but please know that some material on this page may be outdated or incorrect. Thanks!
Using one Apache Solr search core with more than one Drupal website isn't too difficult: you can use a module like Apache Solr Multisite Search, or a technique like the one described in Nick Veenhof's post, Let's talk Apache Solr Multisite. This approach can save you time (and even money!), since one Hosted Apache Solr subscription can serve multiple sites. The only caveat: any site using the Solr core can see any other site's content (which shouldn't be a problem if you control all the sites and don't expose private data through Solr).
There are two ways to make Apache Solr Search Integration work with Domain Access (one of which works similarly to the methods mentioned above for multisite), and which method you use depends on how your site's content is structured.
Solr Search with Domain Access - Siloed Content
If the content you are indexing and searching is unique per domain (just like it would be unique per multisite Drupal instance), then you can set up Domain Access to index content with a different Apache Solr hash per site, like so:
First, in a custom module, use hook_domain_batch() to tell Domain Access to add variable configuration for the apachesolr_site_hash variable per-domain (this requires the Domain Configuration module to be enabled, as well as, obviously, the Apache Solr module):
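A minimal sketch of such a hook implementation, assuming the batch-definition keys documented in Domain Access's domain.api.php for Drupal 7. The form title, descriptions, and group name are illustrative placeholders, and MODULENAME stands for your module's machine name:

```php
<?php
/**
 * Implements hook_domain_batch().
 *
 * Sketch only: labels, descriptions, and the group name are illustrative.
 */
function MODULENAME_domain_batch() {
  $batch = array();
  $batch['apachesolr_site_hash'] = array(
    // Form element shown for each domain on the batch update page.
    '#form' => array(
      '#title' => t('Apache Solr site hash'),
      '#type' => 'textfield',
      '#size' => 10,
      '#description' => t('A hash that identifies this domain in the Solr index.'),
    ),
    // Store the value per domain through the Domain Configuration module.
    '#domain_action' => 'domain_conf',
    // Pre-fill with the site-wide hash until a domain gets its own value.
    '#system_default' => variable_get('apachesolr_site_hash', ''),
    '#variable' => 'apachesolr_site_hash',
    '#meta_description' => t('Set the Apache Solr site hash for each domain.'),
    '#data_type' => 'string',
    '#update_all' => FALSE,
    '#weight' => 0,
    '#group' => t('Apache Solr'),
  );
  return $batch;
}
```

The array key and '#variable' both use apachesolr_site_hash, which is what makes the batch form available at a path ending in that name.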
Second, visit /admin/structure/domain/batch/apachesolr_site_hash and enter a different hash for each domain.
Third, use hook_apachesolr_query_alter() to alter solr queries to search using the site-specific hash:
<?php
/**
 * Implements hook_apachesolr_query_alter().
 */
function MODULENAME_apachesolr_query_alter($query) {
  // Get the current domain.
  $domain = domain_get_domain();
  $hash = domain_conf_variable_get($domain['domain_id'], 'apachesolr_site_hash');
  // Add the current domain's apachesolr site hash to the query.
  $query->addFilter('hash', $hash);
}
?>
At this point, if you reindex all your content on all your domains, searches on each domain will return only that domain's content. (This method was discussed in this issue in Domain Access's issue queue.)
Problem: Single node, multiple domains
There's a major issue that I've seen a few times with this setup: what if one or more nodes are published to multiple domains (shared across more than one domain)? In this case, the content will show up only when searching on the domain where Solr indexing ran first. So, if a piece of content is published to domain A and domain B, but Solr indexes the node on domain A, the content won't show in results for domain B, because the apachesolr site hash for that content was set to domain A's hash.
So, to avoid this issue, we can't actually use Apache Solr's site hash when indexing nodes (or at least, we can't only use it). Instead, we need to add an array of assigned domains for each document in Apache Solr's index, and use that array to filter search results when searching on each individual domain.
Solution: Adding domain access info to the index for shared content
The fix involves three parts:
First, when indexing a document in solr, we need to add domain information to the index so we can filter our query with it later. We'll do this with hook_apachesolr_index_document_build():
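A minimal sketch of that indexing hook. It assumes Domain Access populates $node->domains with the IDs of the node's assigned domains when the node is loaded, and that those IDs match the domain_id returned by domain_get_domain() at search time. The field name im_domain_id relies on the im_ prefix, which the Apache Solr module's schema treats as a multi-valued integer field:

```php
<?php
/**
 * Implements hook_apachesolr_index_document_build().
 */
function MODULENAME_apachesolr_index_document_build(ApacheSolrDocument $document, $entity, $entity_type, $env_id) {
  // Only nodes carry Domain Access assignments.
  if ($entity_type == 'node' && !empty($entity->domains)) {
    foreach ($entity->domains as $domain_id) {
      // Add one value per assigned domain to the multi-valued field.
      $document->setMultiValue('im_domain_id', (int) $domain_id);
    }
  }
}
```

A node assigned to domains 1 and 3 is then indexed once, with both IDs stored on the same document, so a filter on either ID matches it.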
Second, we need to filter the search query sent to Apache Solr using hook_apachesolr_query_alter() so it filters based on the domain where the search is being performed:
<?php
/**
 * Implements hook_apachesolr_query_alter().
 */
function MODULENAME_apachesolr_query_alter($query) {
  // Add domain key to filter all queries.
  $domain = domain_get_domain();
  $query->addParam('fq', 'im_domain_id:' . $domain['domain_id']);
}
?>
Third, we need to change the 'url' passed into the search result so it's a relative URL that works on all your domains. By default, Apache Solr seems to use an absolute URL to the domain on which the node was indexed, meaning some links will send users from one domain to another! You can do this using template_preprocess_search_result() in your theme (or in a custom module, using MODULENAME in place of THEMENAME as the function prefix).
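A minimal sketch of that preprocess function. It assumes the Apache Solr module exposes the document's indexed fields under $variables['result']['fields'], and that the site-relative Drupal path is available there as 'path'; check what your results actually contain before relying on it:

```php
<?php
/**
 * Implements template_preprocess_search_result().
 */
function THEMENAME_preprocess_search_result(&$variables) {
  $result = $variables['result'];
  // Assumption: 'path' holds the site-relative path stored in the index.
  if (!empty($result['fields']['path'])) {
    // url() builds the link against the domain serving the current request.
    $variables['url'] = url($result['fields']['path']);
  }
}
```

Results without a path are left untouched, so they keep whatever URL the search module already provided.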
Once all this is done, you need to reindex all your content (across all domains) before everything will start working correctly. Once that's done, you'll have multi-domain Apache Solr working with shared content. Nice!
The inspiration behind this method of making solr work well filtering content across multiple domains comes from the Domain Access Solr Facet module (https://drupal.org/project/domain_solr), which doesn't yet have a Drupal 7 release, but is relatively simple, and has a patch for the D7 port (https://drupal.org/node/1213296#comment-8492829) in the issue queue.