yacy_search_server/source/de/anomic/yacy/dht/PartitionScheme.java
orbiter 5892fff51f introduction of dht-burst modes: this can expand the number of target peers in some cases where a better heuristic is needed. The problematic cases are either when a multi-word search is made (still a hard case for our term-oriented DHT) or when a network operator wants all robinson peers to be asked. We therefore introduced two new network steering values that switch on more peers during the peer selection. Because the number of peers can now be very large, the maximum number of httpc connections was also increased.
Please see the new comments in yacy.network.freeworld.unit for details of the new DHT selection methods.
The maximum number of peers is no longer fixed to a specific value but may increase with
- the partition exponent
- the number of redundant peers
- the robinson burst percentage
- the multiword burst percentage
The maximum can then be the number of senior peers (all visible peers).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7479 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-13 17:37:28 +00:00


// PartitionScheme.java
// ------------------------------
// part of YaCy
// (C) 2009 by Michael Peter Christen; mc@yacy.net
// first published on http://yacy.net
// Frankfurt, Germany, 28.01.2009
//
// $LastChangedDate: 2009-01-23 16:32:27 +0100 (Fr, 23 Jan 2009) $
// $LastChangedRevision: 5514 $
// $LastChangedBy: orbiter $
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
package de.anomic.yacy.dht;
import de.anomic.yacy.yacySeed;
/**
* A PartitionScheme is a calculation of index storage positions in the network of peers.
* Depending on the size of the network and the requirements of use cases the network
* may be defined in different ways. A search network can be constructed in two different ways:
* * partition by document
* * partition by word
* Both schemes have different advantages and disadvantages:
* A partition by document has the following properties:
* + all documents are distinct from each other; no duplicate occurrences of documents and document infos
* + good performance scaling up to about 100 peers
* + no index distribution necessary
* - poor scaling for a very large number of peers (1000+ servers cannot be queried simultaneously)
* A partition by word has the following properties:
* + can scale to a very large number of peers
* + can scale almost without limit in the number of documents, because the number of peers may grow very large
* + a search needs to ask only a very small number of peers
* - duplicate occurrences of documents cannot easily be avoided
* - index distribution is necessary and is IO- and bandwidth-intensive
* - a search request does not scale well, because many references accumulate at a single peer
* The partition by word may be enhanced: vertical scaling makes search scale better,
* but further increases the complexity of the distribution process.
*
* In YaCy we implement a word partition with vertical scaling. To organize the complexity of the
* index distribution and peer selection for a search request, this interface is needed to
* implement different distribution schemes.
*
* @author Michael Christen
*
*/
public interface PartitionScheme {
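    // number of vertical partitions in this scheme
    public int verticalPartitions();
    // DHT position of a word reference; the URL hash selects the vertical partition
    public long dhtPosition(final byte[] wordHash, final String urlHash);
    // DHT position of a word inside the given vertical partition
    public long dhtPosition(final byte[] wordHash, final int verticalPosition);
    // vertical partition that a URL hash falls into
    public int verticalPosition(final byte[] urlHash);
    // DHT positions of a word, one per vertical partition
    public long[] dhtPositions(final byte[] wordHash);
    // DHT distance between a word reference and the given peer
    public long dhtDistance(final byte[] word, final String urlHash, final yacySeed peer);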
}
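
The class comment above describes a word partition with vertical scaling. The following is a minimal, self-contained sketch of how such a scheme could compute positions. It is not YaCy's implementation. The class name ToyVerticalPartition, the FNV-1a based ringPosition helper, the 63-bit ring, and the bit layout (vertical partition in the high bits, word position in the low bits) are all assumptions made for illustration. The class mirrors most method names of PartitionScheme but does not implement the interface, so that it compiles without yacySeed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a word partition with vertical scaling; not YaCy code.
final class ToyVerticalPartition {

    private final int partitionExponent; // 2^exponent vertical partitions
    private final int shiftLength;       // bits left for the word position

    ToyVerticalPartition(final int partitionExponent) {
        if (partitionExponent < 0 || partitionExponent > 16) {
            throw new IllegalArgumentException("exponent must be in 0..16");
        }
        this.partitionExponent = partitionExponent;
        // assumption: positions live on a non-negative 63-bit ring
        this.shiftLength = Long.SIZE - 1 - partitionExponent;
    }

    // Assumed stand-in hash: 64-bit FNV-1a, folded to 63 bits.
    static long ringPosition(final byte[] hash) {
        long h = 0xcbf29ce484222325L;
        for (final byte b : hash) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h & Long.MAX_VALUE;
    }

    int verticalPartitions() {
        return 1 << partitionExponent;
    }

    // The URL hash alone decides the vertical partition (top bits of its ring position).
    int verticalPosition(final byte[] urlHash) {
        return (int) (ringPosition(urlHash) >>> shiftLength);
    }

    // High bits: vertical partition. Low bits: word position scaled into the partition.
    long dhtPosition(final byte[] wordHash, final int verticalPosition) {
        if (verticalPosition < 0 || verticalPosition >= verticalPartitions()) {
            throw new IllegalArgumentException("vertical position out of range");
        }
        final long horizontal = ringPosition(wordHash) >>> partitionExponent;
        return (((long) verticalPosition) << shiftLength) | horizontal;
    }

    long dhtPosition(final byte[] wordHash, final String urlHash) {
        final byte[] url = urlHash.getBytes(StandardCharsets.UTF_8);
        return dhtPosition(wordHash, verticalPosition(url));
    }

    // A search for one word has to visit one position per vertical partition.
    long[] dhtPositions(final byte[] wordHash) {
        final long[] positions = new long[verticalPartitions()];
        for (int i = 0; i < positions.length; i++) {
            positions[i] = dhtPosition(wordHash, i);
        }
        return positions;
    }

    public static void main(final String[] args) {
        final ToyVerticalPartition scheme = new ToyVerticalPartition(2);
        final byte[] word = "exampleWord".getBytes(StandardCharsets.UTF_8);
        // storing one reference: a single position, chosen by word and URL
        System.out.println("store at " + scheme.dhtPosition(word, "exampleUrl"));
        // searching the word: one position in each of the 4 vertical partitions
        for (final long p : scheme.dhtPositions(word)) {
            System.out.println("search at " + p);
        }
    }
}
```

In this sketch the references for one word are spread over verticalPartitions() positions, so no single peer accumulates all references for a frequent word. The cost is that a search must ask one position per vertical partition, and distribution must compute a position per (word, URL) pair.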