#%env/templates/metas.template%# #%env/templates/header.template%# #%env/templates/submenuIndexControl.template%#

Index Sources & Targets

YaCy supports multiple index storage locations. A deep-embedded multi-core Solr serves as the internal indexing database, and a remote Solr can be attached as well.

Solr stores the main search index. It hosts two cores: the default 'collection1' core for documents and the 'webgraph' core for a web structure graph. The Solr fields in use can be reviewed and edited in the Schema Editor.
Lazy Value Initialization. If checked, only non-zero values and non-empty strings are written to Solr fields.
 
 
Use deep-embedded local Solr

This writes to the YaCy-embedded Solr index, which is stored within the YaCy DATA directory.
The Solr native search interface is accessible at
/solr/select?q=*:*&start=0&rows=3&core=collection1 for the default search index (core: collection1) and at
/solr/select?q=*:*&start=0&rows=3&core=webgraph for the webgraph core.
If you switch off this index, a remote Solr must be activated.
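
The two query paths above differ only in the core parameter. As a minimal sketch, the full URLs could be assembled like this; the helper name solr_select_url and the peer address http://localhost:8090 are assumptions for illustration and must be adapted to your own peer.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query="*:*", start=0, rows=3):
    """Build a URL for the Solr native search interface of a YaCy peer.

    base_url is the address of the peer (an assumption, not given on this page).
    core is either 'collection1' or 'webgraph'.
    """
    # Keep '*' and ':' unescaped so the query reads like the examples above.
    params = urlencode(
        {"q": query, "start": start, "rows": rows, "core": core}, safe="*:"
    )
    return f"{base_url.rstrip('/')}/solr/select?{params}"


if __name__ == "__main__":
    # Hypothetical peer address; replace host and port with your own.
    for core in ("collection1", "webgraph"):
        print(solr_select_url("http://localhost:8090", core))
```

Other query strings can be passed through the query argument; characters outside '*' and ':' are percent-encoded.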
Use remote Solr server(s)

Here you can define a single remote Solr server or a set of them. If both an internal and an external Solr are used, the two are mirrored: every write request goes to both the internal and the external Solr, but a read request goes to the internal index only. The remote Solr is queried only if the internal index returns no result for a search request.
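
The mirroring rule described above can be illustrated with a small model. This is only a sketch of the described behaviour, not YaCy's implementation: the class MirroredIndex and the helper _match are hypothetical, and plain dictionaries stand in for the internal and remote Solr.

```python
def _match(index, term):
    """Return the sorted ids of all documents whose text contains term."""
    return sorted(doc_id for doc_id, text in index.items() if term in text)


class MirroredIndex:
    """Toy model of an internal index mirrored to a remote one."""

    def __init__(self, internal, remote):
        self.internal = internal  # stand-in for the embedded Solr
        self.remote = remote      # stand-in for the remote Solr

    def write(self, doc_id, text):
        # Every write request goes to both indexes.
        self.internal[doc_id] = text
        self.remote[doc_id] = text

    def search(self, term):
        # Read requests go to the internal index first ...
        hits = _match(self.internal, term)
        if hits:
            return hits
        # ... and only fall back to the remote index if nothing was found.
        return _match(self.remote, term)


if __name__ == "__main__":
    index = MirroredIndex({}, {})
    index.write("doc1", "yacy solr index")
    print(index.search("solr"))
```

A consequence of this rule is that documents which exist only on the remote side become visible only for queries that find nothing in the internal index.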
#(table)#::
 
Solr Hosts
#{list}# #{/list}#
Solr Host Administration Interface
Index Size
#[url]# #[size]#
#(/table)#
Solr URL(s)

You can set one or more Solr targets here, which are accessed as shards. To use several targets, list them separated by a ',' (comma). The set of remote targets is used as the shards of one complete index. The host part of the URL is used as the key for a hash function which selects one of the shards (one of your remote servers). When a search request is made, all servers are accessed synchronously and the results are combined.
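
To make the shard selection concrete, here is a minimal sketch under stated assumptions. It assumes the URL in question is the URL of the document being indexed, and it uses MD5 modulo the number of targets as the hash function; the hash that YaCy actually uses is not specified on this page. The function names parse_targets and select_shard are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit


def parse_targets(setting):
    """Split the comma-separated 'Solr URL(s)' setting into a list of targets."""
    return [target.strip() for target in setting.split(",") if target.strip()]


def select_shard(document_url, targets):
    """Pick one target for a document, keyed by the host part of its URL.

    MD5 is an assumed stand-in for the real hash function.
    """
    host = urlsplit(document_url).hostname or ""
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return targets[int.from_bytes(digest[:4], "big") % len(targets)]


if __name__ == "__main__":
    # Hypothetical remote servers, for illustration only.
    targets = parse_targets("http://solr1:8983/solr, http://solr2:8983/solr")
    print(select_shard("http://example.org/index.html", targets))
```

Because only the host is hashed, all documents from one host end up on the same shard in this sketch.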
Sharding Method
An external Solr installation can be set up with the following steps (shown here for Solr 4.1.0):
  • Download solr-4.1.0.tgz from http://lucene.apache.org/solr/
  • Decompress solr-4.1.0.tgz (with 'tar xfz solr-4.1.0.tgz') and put solr-4.1.0 into ~/
  • Assume that YaCy is already running and installed in ~/yacy/
  • To configure the multi-core configuration of YaCy, execute:
    mkdir ~/solr-4.1.0/example/solr/webgraph
    cp -R ~/solr-4.1.0/example/solr/collection1/conf ~/solr-4.1.0/example/solr/webgraph/conf
    ~/yacy/bin/apicat.sh '/api/schema.xml?core=collection1' > ~/solr-4.1.0/example/solr/collection1/conf/schema.xml
    ~/yacy/bin/apicat.sh '/api/schema.xml?core=webgraph' > ~/solr-4.1.0/example/solr/webgraph/conf/schema.xml
  • Edit ~/solr-4.1.0/example/solr/solr.xml and put in the following content:
    <?xml version="1.0" encoding="UTF-8" ?>
    <solr persistent="true">
      <cores adminPath="/admin/cores" defaultCoreName="collection1">
        <core name="collection1" instanceDir="collection1" />
        <core name="webgraph" instanceDir="webgraph" />
      </cores>
    </solr>
  • Finally, start the external Solr with:
    cd ~/solr-4.1.0/example/ && java -jar start.jar
  • Open http://localhost:8983/solr/ to visit Solr's administration console.
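
Before starting the external Solr, it can help to confirm that solr.xml declares both cores. The following is a small sketch of such a check using Python's standard XML parser; the function name core_names is hypothetical, and the embedded XML simply repeats the content from the steps above.

```python
import xml.etree.ElementTree as ET

# Same content as the solr.xml from the installation steps.
SOLR_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1" />
    <core name="webgraph" instanceDir="webgraph" />
  </cores>
</solr>
"""


def core_names(xml_text):
    """Return the names of all <core> elements, in document order."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    return [core.get("name") for core in root.iter("core")]


if __name__ == "__main__":
    names = core_names(SOLR_XML)
    missing = {"collection1", "webgraph"} - set(names)
    print("cores:", names, "missing:", sorted(missing))
```

To check the real file, read its text from ~/solr-4.1.0/example/solr/solr.xml and pass it to core_names instead of the embedded string.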
Web Structure Index

The web structure index is used for host browsing (to discover the internal file/folder structure), ranking (counting the number of references) and file search (there are about forty times more links from loaded pages than documents in the main search index).
use citation reference index (lightweight and fast)
use webgraph search index (rich information in second Solr core)
Peer-to-Peer Operation

The 'RWI' (Reverse Word Index) is necessary for index transmission in distributed mode. For portal or intranet mode this must be switched off.
support peer-to-peer index transmission (DHT RWI index)
#%env/templates/footer.template%#