Title: How can I manually modify the configuration set TeamPage uses for Solr?

This document explains how to install a new configuration set in your TeamPage Solr server, including how to manually delete collections and the existing "traction" configuration set.

(This is an advanced procedure which should not generally be necessary unless you have been instructed to perform it by Traction support staff. If you're a TeamPage customer having problems with your Solr setup, or you think you need to use a custom configuration set for Solr, please file a support request by clicking the "New Support Request" button on this page.)

Solr uses ZooKeeper to maintain metadata about both configuration sets and collections / cores / shards. The configuration set that comes with TeamPage's Solr installers is currently the default for English language data. To use an alternative configuration set, all collections that use the existing one must be deleted, and then the configuration set must be deleted. (Formerly, it seemed that when the last collection that referenced a configuration set had been deleted, the configuration set was also automatically deleted, but this doesn't seem to be the case, at least at the time of writing using Solr 6.3.0.)

Deleting existing collections



There are a few ways to delete the collections. Which one makes the most sense depends on your circumstances.

Before doing anything else



Whichever method you choose, you should first perform one additional step to ensure that the rest of this procedure goes smoothly.

Normally, TeamPage creates a new Solr collection when you first successfully finish setting up and applying the connection details in the Solr Setup dialog. It records that it has created this collection, and will not try to do so again unless you delete the collection from within the Index Administration dialog, or unless you tell it that you have done so by taking TeamPage offline and modifying the appropriate journal configuration setting. However, if TeamPage is still set up to dispatch documents to the Solr index and knows that the collection no longer exists in Solr, it will recreate the collection the first time it tries to dispatch a document. This would be inconvenient in the middle of a procedure to remove all collections and install a new configuration set.

To avoid this, change the "Indexing Mode" setting in the Solr Setup dialog to "Off - Do Not Track Changes or Update Index". You will revert this change later when you create the new collection in the Solr server, at which time you'll have to re-feed your TeamPage corpus. In the meantime, it will prevent TeamPage from doing any unnecessary work, and from attempting to recreate the collection that you're about to delete.

You may also elect to set the "Solr Enabled" setting to "no" to prevent that feature from being exposed to users while you're performing the rest of this procedure.

Option 1 (preferred): Use the Solr Index Administration dialog to delete the collection



This is the preferred method because if you delete the collection outside of TeamPage's admin UI, you'll have to later shut down your TeamPage server and manually modify db.properties to tell TeamPage that the collection has been deleted so that it can later be recreated.

The "Delete Collection" button is at the bottom of the dialog. Click it to delete the collection in Solr.

Option 2: Use Solr's admin UI to delete the collection



Assuming your Solr service is running and listening on the default port, 8983, you can connect to the admin interface over HTTP on that port, select the collection you want to delete, and delete it.



Option 3: Use the ZooKeeper command line interface tool to delete the collection



Assuming Solr is installed in /solr and is currently running with ZooKeeper listening on port 9983, you can use the ZK CLI tool to remove a collection by its ID. First, use the "list" command to see all the data being tracked:

/solr → ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd list


/ (9)
DATA:
    
 /configs (1)
  /configs/traction (9)
   /configs/traction/currency.xml (0)
   DATA: ...supressed...
   /configs/traction/protwords.txt (0)
   DATA: ...supressed...
   /configs/traction/solrconfig.xml (0)
   DATA: ...supressed...
   /configs/traction/synonyms.txt (0)
   DATA: ...supressed...
   /configs/traction/stopwords.txt (0)
   DATA: ...supressed...
   /configs/traction/schema.xml (0)
   DATA: ...supressed...
   /configs/traction/_rest_managed.json (0)
   DATA:
       {"initArgs":{},"managedList":[]}
       
   /configs/traction/lang (4)
    /configs/traction/lang/stoptags_ja.txt (0)
    DATA: ...supressed...
    /configs/traction/lang/stopwords_en.txt (0)
    DATA: ...supressed...
    /configs/traction/lang/stopwords_ja.txt (0)
    DATA: ...supressed...
    /configs/traction/lang/userdict_ja.txt (0)
    DATA: ...supressed...
   /configs/traction/mapping-FullNumber.txt (0)
   DATA: ...supressed...
 /zookeeper (1)
 DATA:
     
 /overseer (6)
 DATA:
     
  /overseer/collection-queue-work (0)
  /overseer/queue-work (0)
  /overseer/collection-map-failure (0)
  /overseer/collection-map-completed (0)
  /overseer/queue (0)
  /overseer/collection-map-running (0)
 /aliases.json (0)
 /live_nodes (1)
  /live_nodes/10.0.1.15:8983_solr (0)
 /collections (1)
  /collections/1462825620166 (3)
  DATA:
      {"configName":"traction"}
   /collections/1462825620166/leaders (1)
    /collections/1462825620166/leaders/shard1 (1)
     /collections/1462825620166/leaders/shard1/leader (0)
     DATA:
         {
           "core":"1462825620166_shard1_replica1",
           "core_node_name":"core_node1",
           "base_url":"http://10.0.1.15:8983/solr",
           "node_name":"10.0.1.15:8983_solr"}
   /collections/1462825620166/state.json (0)
   DATA:
       {"1462825620166":{
           "replicationFactor":"1",
           "shards":{"shard1":{
               "range":"80000000-7fffffff",
               "state":"active",
               "replicas":{"core_node1":{
                   "core":"1462825620166_shard1_replica1",
                   "base_url":"http://10.0.1.15:8983/solr",
                   "node_name":"10.0.1.15:8983_solr",
                   "state":"active",
                   "leader":"true"}}}},
           "router":{"name":"compositeId"},
           "maxShardsPerNode":"1",
           "autoAddReplicas":"false"}}
   /collections/1462825620166/leader_elect (1)
    /collections/1462825620166/leader_elect/shard1 (1)
     /collections/1462825620166/leader_elect/shard1/election (1)
      /collections/1462825620166/leader_elect/shard1/election/101749641657450496-core_node1-n_0000000001 (0)
 /overseer_elect (2)
  /overseer_elect/leader (0)
  DATA:
      {"id":"101749641657450496-10.0.1.15:8983_solr-n_0000000001"}
  /overseer_elect/election (1)
   /overseer_elect/election/101749641657450496-10.0.1.15:8983_solr-n_0000000001 (0)
 /security.json (0)
 DATA:
     {}
 /clusterstate.json (0)
 DATA:
     {}



For a collection ID like "1462825620166", you can issue the "clear" command pointing to that collection as "/collections/1462825620166":

/solr → ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd clear /collections/1462825620166


You will also still have to delete the collection folder, which will be in the SOLR_HOME directory with a name like "1462825620166_shard1_replica1", as in Option 4 below.
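The "list" output is long, so it can help to filter it down to just the collection IDs before issuing any "clear" commands. The following is a minimal sketch, assuming the output has the same layout as the listing above; the function name list_collection_ids is hypothetical and not part of Solr or TeamPage.

```shell
# Print the collection IDs found in zkcli "list" output read from stdin.
# Only top-level nodes such as "/collections/1462825620166 (3)" are matched;
# child nodes like "/collections/1462825620166/leaders (1)" are ignored.
list_collection_ids() {
  sed -n -E 's|^[[:space:]]*/collections/([^/ ]+) \([0-9]+\).*$|\1|p'
}

# Possible usage from the Solr installation directory:
#   ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd list | list_collection_ids
```

Each ID printed this way corresponds to one "/collections/[collection ID]" path that can be passed to the "clear" command.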

Option 4: Shut down Solr and manually delete the collections and ZooKeeper data



This may be the easiest way to delete many collections at the same time. If you also delete the ZooKeeper database, Solr will re-initialize it the next time it starts.

With the Solr service shut down, find the SOLR_HOME directory that contains all the "[collection ID]_shard[n]_replica[n]" folders and the "zoo_data" folder. Delete all of those, leaving the solr.xml and zoo.cfg files in place.

For example, if SOLR_HOME is set to /solr/traction:

/solr/traction → rm -rf 1462825620166_shard1_replica1 zoo_data


Then you can start your Solr service again.
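Because "rm -rf" is unforgiving, you may want to preview what would be removed before deleting anything. The following is a minimal sketch, assuming the folder naming described above; the function name list_solr_home_targets is hypothetical. It only prints paths and deletes nothing.

```shell
# List the directories directly under a SOLR_HOME that Option 4 would delete:
# the "[collection ID]_shard[n]_replica[n]" folders and the "zoo_data" folder.
# Files such as solr.xml and zoo.cfg are never listed.
list_solr_home_targets() {
  find "$1" -mindepth 1 -maxdepth 1 -type d \
    \( -name '*_shard*_replica*' -o -name 'zoo_data' \) | sort
}

# Possible usage:
#   list_solr_home_targets /solr/traction
```

If the printed list contains only the folders you expect, you can delete those same paths with the Solr service shut down.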

Notes for Options 2-4



If you do not delete the collections from Solr through TeamPage's Solr Index Administration dialog, TeamPage will not know that the collection no longer exists, and so will not know that it needs to recreate it. At the time of writing, the only way to address this is to shut down the TeamPage servers in question, find the db.properties file for the journal being used with each server, and remove the solr_collection_[collection ID]_exists=true property in each case.

For example, for collection ID "1462825620166", with the hosting TeamPage server shut down, after finding the right journal folder, you can find and remove the property solr_collection_1462825620166_exists=true from that journal's db.properties file. Then you can start your TeamPage servers again and move to the next step.
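The db.properties edit can also be scripted. The following is a minimal sketch, assuming the property appears on its own line exactly as written above, with no surrounding whitespace; the function name forget_solr_collection is hypothetical. It keeps a backup copy named db.properties.bak next to the original, and should only be run while the TeamPage server is shut down.

```shell
# Remove the "solr_collection_<ID>_exists=true" line from a db.properties file.
# $1 = path to db.properties, $2 = collection ID.
forget_solr_collection() {
  props_file="$1"
  collection_id="$2"
  # Keep a backup so the original file can be restored if needed.
  cp "$props_file" "$props_file.bak" || return 1
  # -F: fixed string, -x: whole-line match, -v: keep every other line.
  grep -v -x -F "solr_collection_${collection_id}_exists=true" \
    "$props_file.bak" > "$props_file"
  # grep exits with 1 when every line was filtered out; only 2+ is an error.
  [ "$?" -le 1 ]
}

# Possible usage, where the journal path is a placeholder:
#   forget_solr_collection /path/to/journal/db.properties 1462825620166
```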

Deleting the existing "traction" configuration set



After deleting the collections that refer to the "traction" configuration set, you can delete the configuration set itself. (If you used Option 4 above, you have already deleted the ZooKeeper data and you don't need to do anything for this step.)

If you used any other option, delete the existing "traction" configuration set using the ZK CLI with a command like this:

/solr → ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd clear /configs/traction


Now your Solr server will be ready to accept a new configuration set called "traction".

Installing the new "traction" configuration set



Currently, for TeamPage to use your configuration set, you must install it under the name "traction".

You can install it using the ZK CLI via the "upconfig" command. Assuming all the files for your configuration set are contained in a "conf" folder at the path /home/user/solr/configsets/my_custom_config/conf, the command will look something like this:

/solr → ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig -confname traction -confdir /home/user/solr/configsets/my_custom_config/conf
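Before running a command like the one above, it can be worth confirming that the folder you pass to -confdir really is a configuration folder. The following is a minimal sketch that only checks for solrconfig.xml, the one file every Solr configuration set needs; the function name check_confdir is hypothetical, and it does not validate the schema or any other file.

```shell
# Return 0 if the given folder looks like a Solr "conf" folder, 1 otherwise.
# This checks only that the folder exists and contains solrconfig.xml.
check_confdir() {
  confdir="$1"
  if [ ! -d "$confdir" ]; then
    echo "not a directory: $confdir" >&2
    return 1
  fi
  if [ ! -f "$confdir/solrconfig.xml" ]; then
    echo "missing solrconfig.xml in $confdir" >&2
    return 1
  fi
  return 0
}

# Possible usage:
#   check_confdir /home/user/solr/configsets/my_custom_config/conf && echo "looks OK"
```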


The "upconfig" command, like all the ZK CLI commands, does not report any status when it finishes. If you don't see any error messages, confirm that the new configuration set has been installed by running the "list" command again, like this:

/solr → ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd list


You should see the new configuration set data near the top of the output:

/ (9)
DATA:
    
 /configs (1)
  /configs/traction (8)
   /configs/traction/currency.xml (0)
   DATA: ...supressed...
   /configs/traction/protwords.txt (0)
   DATA: ...supressed...
   /configs/traction/solrconfig.xml (0)
   DATA: ...supressed...
   /configs/traction/synonyms.txt (0)
   DATA: ...supressed...
   /configs/traction/stopwords.txt (0)
   DATA: ...supressed...
   /configs/traction/schema.xml (0)
   DATA: ...supressed...
   /configs/traction/_rest_managed.json (0)
   DATA:
       {"initArgs":{},"managedList":[]}
       
   /configs/traction/lang (1)
    /configs/traction/lang/stopwords_en.txt (0)
    DATA: ...supressed...
 /zookeeper (1)
 DATA:
     
 /overseer (6)
 DATA:
     
  /overseer/collection-queue-work (0)
  /overseer/queue-work (0)
  /overseer/collection-map-failure (0)
  /overseer/collection-map-completed (0)
  /overseer/queue (0)
  /overseer/collection-map-running (0)
 /aliases.json (0)
 /live_nodes (1)
  /live_nodes/10.0.1.15:8983_solr (0)
 /collections (0)
 /overseer_elect (2)
  /overseer_elect/leader (0)
  DATA:
      {"id":"101749641657450496-10.0.1.15:8983_solr-n_0000000001"}
  /overseer_elect/election (1)
   /overseer_elect/election/101749641657450496-10.0.1.15:8983_solr-n_0000000001 (0)
 /security.json (0)
 DATA:
     {}
 /clusterstate.json (0)
 DATA:
     {}
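Rather than scanning the full listing by eye, you can test the "list" output for the configuration set node. The following is a minimal sketch, assuming the output has the same layout as the listing above; the function name has_configset is hypothetical. The name is used as part of a regular expression, so it suits simple names such as "traction".

```shell
# Return 0 if zkcli "list" output read from stdin contains a
# "/configs/<name> (n)" node for the given configuration set name.
has_configset() {
  grep -q -E "^[[:space:]]*/configs/$1 \([0-9]+\)"
}

# Possible usage from the Solr installation directory:
#   ./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd list \
#     | has_configset traction && echo "traction configuration set is installed"
```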



Creating new collections using the new configuration set



If you shut down your TeamPage servers to use any option other than 1 above for deleting collections, and you've already edited the db.properties files for the affected journals to tell TeamPage that the collections no longer exist in Solr, you're ready to start the servers again now.

Back in your Solr Setup dialog, you can switch the "Indexing Mode" setting back to "On - Track Changes and Update Index". When you click the "Apply" button, since TeamPage should now know that the collection no longer exists, it will create a new collection based upon your newly installed "traction" configuration set. You should see a confirmation message that the Solr setup has been completed, and if you click on the "Index Administration" tab, you should see that dialog load without any errors, and with a "Searchable" count of 0.

Re-feeding your corpus to Solr for indexing



When you're ready, click the "Start" button under "Feed All Documents", just as you would after configuring your TeamPage server's connection to Solr for the first time. This will set the status for all documents in all active spaces to "Needs Indexing", and the external search indexer daemon will gradually feed them in batches to your Solr service. Depending upon the size of your corpus, this process may take just a few minutes or many hours. There will be some additional load on both the TeamPage and Solr services during this time, but if they are configured with enough heap memory and have sufficient processing power, this should not be disruptive for your users.

As long as the "Solr Enabled" setting is set to "yes" in the Solr Setup dialog, and Solr is designated as the preferred search engine (server settings > General > Search Settings), users will be able to immediately start getting search results from whatever documents have already been indexed in Solr. Since journal entries and attachments are set to "Needs Indexing" in reverse chronological order by original entry creation date, the most recently created entries will generally be available earliest, which can be convenient. Note that space share folders have their files' statuses set to "Needs Indexing" after all journal entries and attachments, so those files will likely be added to the index later.

In any case, if you set "Solr Enabled" to "no" in the Solr Setup dialog earlier, don't forget to change it back to "yes" now, or whenever you're ready for users to have access to Solr search features again.



Attachments:
solr-admin-delete.png
Article: solrsearch55
Categories: :solrsearch:FAQ
Date: March 14, 2019; 12:03:47 PM Eastern Daylight Time

Author Name: Dave Shepperton
Author ID: shep