Solr Search API

Last modified by Admin on 2024/10/28 16:48

Solr engine management: configuration, indexing, listeners, script service, etc. This module does not handle the search queries.
Type: JAR
Developed by: XWiki Development Team
License: GNU Lesser General Public License 2.1
Bundled with: XWiki Standard

Installable with the Extension Manager

Description

Check out the Solr Core to understand what information is indexed.

Configuration

The following properties can be configured in the xwiki.properties file for the Solr API:

#-------------------------------------------------------------------------------------
# Solr Search
#-------------------------------------------------------------------------------------

#-# [Since 4.5M1]
#-# The Solr server type. Currently accepted values are "embedded" (default) and "remote".
# solr.type=embedded

#-# [Since 4.5M1]
#-# The location where the embedded Solr instance home folder is located.
#-# The default is the subfolder "store/solr" inside folder defined by the property "environment.permanentDirectory".
# solr.embedded.home=/var/local/xwiki/store/solr

#-# [Since 12.2]
#-# The URL of the Solr server (the root server and not the URL of a core).
#-# The default value assumes that the remote Solr server is started in a different process on the same machine, using the default port.
# solr.remote.baseURL=http://localhost:8983/solr

#-# [Since 5.1M1]
#-# Elements to index are not sent to the Solr server one by one but in batch to improve performances.
#-# It's possible to configure this behavior with the following properties:
#-#
#-# The maximum number of elements sent at the same time to the Solr server
#-# The default is 50.
# solr.indexer.batch.size=50
#-# The maximum number of characters in the batch of elements to send to the Solr server.
#-# The default is 10000.
# solr.indexer.batch.maxLength=10000

#-# [Since 5.1M1]
#-# The maximum number of elements in the background queue of elements to index/delete
#-# The default is 10000.
# solr.indexer.queue.capacity=100000

#-# [Since 6.1M2]
#-# Indicating if a synchronization between SOLR index and XWiki database should be run at startup.
#-# Synchronization can be started from search administration.
#-# The default is true.
# solr.synchronizeAtStartup=false

#-# [Since 12.5RC1]
#-# Indicates which wiki synchronization to perform when the "solr.synchronizeAtStartup" property is set to true.
#-# Two modes are available:
#-#   - WIKI: indicate that the synchronization is performed when each wiki is accessed for the first time.
#-#   - FARM: indicate that the synchronization is performed once for the full farm when XWiki is started.
#-# For large farms and in order to spread the machine's indexing load, the WIKI value is recommended, especially if
#-# some wikis are not used.
#-# The default is:
# solr.synchronizeAtStartupMode=FARM
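
To illustrate how the two batch limits (solr.indexer.batch.size and solr.indexer.batch.maxLength) might interact, here is a minimal Python sketch. It is not XWiki's indexer code: it assumes a batch is sent as soon as either limit is reached, and the names batches, max_size and max_length are hypothetical.

```python
from typing import Iterable, Iterator, List


def batches(elements: Iterable[str], max_size: int = 50,
            max_length: int = 10000) -> Iterator[List[str]]:
    """Group elements into batches bounded by count and total characters.

    Assumption: a batch is emitted once it holds max_size elements or
    at least max_length characters, whichever happens first.
    """
    batch: List[str] = []
    length = 0
    for element in elements:
        batch.append(element)
        length += len(element)
        if len(batch) >= max_size or length >= max_length:
            yield batch
            batch, length = [], 0
    if batch:  # send whatever is left at the end
        yield batch
```

Under these assumptions, 120 short elements are split into batches of 50, 50 and 20, while a few very large elements are flushed early because the character limit is reached before the element limit.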

Set up a remote Solr server

Solr does not handle backward compatibility of core schemas well, so it is safer to run the same Solr version that your version of XWiki embeds. Here is a compatibility matrix to help with the choice:

XWiki version    | Solr version
11.4 to 11.5     | 7.7.x (XWiki embeds 7.7.1)
11.6 to 12.2     | 8.1.x (XWiki embeds 8.1.1)
12.3 to 13.0     | 8.5.x (XWiki embeds 8.5.1)
13.1 to 14.7     | 8.8.x (XWiki embeds 8.8.0)
14.8 to 16.1.0   | 8.11.x (XWiki embeds 8.11.2)
16.2.0+          | 9.4.x (XWiki embeds 9.4.1)

Download and install Solr. With XWiki 16.6.0+, you will also need to enable the analysis-extras module.

Debian-based system

If your Solr instance is installed on a Debian/Ubuntu system, take a look at InstallationViaAPT.

Manual install

The Solr REST API is unfortunately too limited, so you will need to create several cores on your Solr instance yourself. For each one, download the zip file matching your version of XWiki and unzip its content into a new folder, located alongside the other Solr cores, with the following names:

XWiki <16.2.0

Solr8:

Indicate in the xwiki.properties file that you want to use a remote Solr instance, and give its URL:

solr.type=remote

solr.remote.baseURL=http://solrhost/solr

When using solr.remote.baseURL, you can control the name of the search core (and the prefix for the other cores) using the solr.remote.corePrefix property. By default, the main core is "xwiki" and the others are prefixed with "xwiki_".
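
The naming rule can be sketched in Python as follows. This is only an illustration: core_name and core_url are hypothetical helpers, the "events" suffix is an assumed example rather than a confirmed core name, and the URL layout relies on Solr serving each core under the base URL followed by the core name.

```python
def core_name(prefix: str = "xwiki", suffix: str = "") -> str:
    """Return the main core name (the prefix itself) or '<prefix>_<suffix>'."""
    return f"{prefix}_{suffix}" if suffix else prefix


def core_url(base_url: str, name: str) -> str:
    """Return the URL of a core, assuming it lives at '<base URL>/<core name>'."""
    return f"{base_url.rstrip('/')}/{name}"


# Default prefix: the search core is "xwiki"
print(core_url("http://solrhost/solr", core_name()))
# Custom prefix (as set through solr.remote.corePrefix), illustrative suffix
print(core_url("http://solrhost/solr", core_name("mywiki", "events")))
```

With the default prefix the first call prints http://solrhost/solr/xwiki, and with the prefix "mywiki" the second prints http://solrhost/solr/mywiki_events.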

Data transfer when moving an existing instance from the embedded Solr to a remote Solr

TODO: add a note about how to move data for data cores (ratings & events) from the embedded Solr to the remote Solr

Backup remote Solr data

TODO: add a note about what and how to backup the data from the external Solr server.

Performance

By default, XWiki ships with an embedded Solr. This is mostly for ease of use: the embedded mode is not really recommended by the Solr team, so you might want to externalize Solr once your wiki starts to have a lot of pages. Solr uses a lot of memory, and a standalone Solr instance is generally faster than the embedded one. The difference should not be very noticeable in a small wiki, but if you start to see memory issues and slow search results, you should probably install and set up an external Solr instance, following the remote Solr server setup guide above.

The speed of the drive where the Solr index is located can also be very important, because Solr/Lucene is quite filesystem intensive. For example, putting the index on an SSD might give a noticeable boost.

You can also find more Solr-specific performance details on https://wiki.apache.org/solr/SolrPerformanceProblems. Standalone Solr also comes with a very nice UI, along with monitoring and test tools.

Size on disk

The size of the index on disk depends on the size of each document, but an instance like the http://www.myxwiki.org farm (mostly standard documents in lots of wikis) uses 3.2GB of disk space to store around 180000 documents, which gives approximately 18KB per document.
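
The arithmetic behind that figure, and a rough extrapolation to other wikis, can be sketched as follows. This assumes decimal units (1KB = 1000 bytes, 1GB = 10^9 bytes) and documents similar in size to those of the farm above; the function names are hypothetical and the result is only an order of magnitude.

```python
def kb_per_document(index_gb: float, documents: int) -> float:
    """Average index size per document, in KB (decimal units)."""
    return index_gb * 1e9 / documents / 1000


def estimated_index_gb(documents: int, kb_per_doc: float = 18.0) -> float:
    """Rough index size in GB for a given number of documents."""
    return documents * kb_per_doc * 1000 / 1e9


# Figures quoted above: 3.2GB for around 180000 documents
print(round(kb_per_document(3.2, 180000), 1))
# Extrapolation to a hypothetical wiki with one million documents
print(estimated_index_gb(1_000_000))
```

The first value is about 17.8, which rounds to the 18KB quoted above. At that rate, a wiki with one million similar documents would need roughly 18GB.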

Prerequisites & Installation Instructions

We recommend using the Extension Manager to install this extension. If the text "Installable with the Extension Manager" is displayed at the top right of this page, the extension can be installed that way.

You can also use the manual method, which involves dropping the JAR file and all its dependencies into the WEB-INF/lib folder and restarting XWiki.

Dependencies

Dependencies for this extension (org.xwiki.platform:xwiki-platform-search-solr-api 16.9.0):
