theli
5e0b6f8f83
*) sorting peer name list on Blacklist_p.html
...
*) restructuring of sharedBlacklist_p.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 13:29:50 +00:00
theli
6c8366aea1
*) Bugfix for blacklist import function
...
- wrong property name
- list was accidentally imported into a new blacklist file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 09:20:43 +00:00
theli
eee44be602
*) adding an interface for customized blacklist classes
...
- now it's possible to use a customized blacklist engine
instead of the default one
- this can be done by configuring the property BlackLists.class
See: http://www.yacy-forum.de/viewtopic.php?t=2108
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 14:28:14 +00:00
theli
66f1eb07d9
*) Bugfix for IllegalArgumentException in transferURL
...
See: http://www.yacy-forum.de/viewtopic.php?p=24560
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 10:54:19 +00:00
theli
d2e8e76218
*) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
...
See: http://www.yacy-forum.de/viewtopic.php?t=2541
http://www.yacy-forum.de/viewtopic.php?p=24516
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 02:42:10 +00:00
orbiter
f43c90fa98
fixed handling of null referer in crawlOrder
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 21:46:34 +00:00
orbiter
abf22f6e60
removed url normalform computation from htmlFilterContentScraper.
...
This method was implemented in de.anomic.net.URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:09:22 +00:00
orbiter
ec5149ff3b
fix for busyCacheFlush detection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 22:28:09 +00:00
orbiter
f58283def2
better control of index flush
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2364 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 22:07:17 +00:00
orbiter
80b6c90d54
enhancements to prevent blocking during dht transfer receive
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 21:49:39 +00:00
hermens
d56f06401e
- Cache known URLs during indexReceive to avoid getting blocked during loadedURL.exists() whenever possible
...
- Small logging updates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:42:00 +00:00
theli
c7b6389ca1
*) renaming indexDistribution.dhtReceiptLimitEnabled property to indexDistribution.transferRWIReceiptLimitEnabled
...
so that the default value is taken over by all peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:01:01 +00:00
orbiter
9183d21f25
renamed new index class to old name
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-02 20:01:59 +00:00
orbiter
c4e922885a
replaced indexURLEntry by new class that uses a kelondroRow.Entry object
...
to store the index entry. This is another step to move to the new database structure.
A side effect of this change is that index storage uses much less RAM,
which affects the index RAM cache.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-02 19:59:28 +00:00
orbiter
5f72be2a95
some redesign of EURL storage
...
* store() is now called explicitly
* more urls are written to the EURL table
* the EURL stack does not store the complete entry any more, now only the URL hash
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-24 15:25:47 +00:00
orbiter
58df8b7bbf
a large collection of different changes
...
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused some elements to be iterated twice
* very strong speed enhancement for url/domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-23 22:39:41 +00:00
hydrox
8ba8e2b7d9
*) added cache for blacklisted URL hashes received via DHT. DHT does not request URLs listed in this cache.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-28 08:51:34 +00:00
hermens
53cbcc6d6e
Implement emergency break in index receive when the limit of the ramCache is exceeded by more than cacheLimit
...
See: http://www.yacy-forum.de/viewtopic.php?p=22911#22911
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-27 11:14:30 +00:00
theli
b20496e42b
*) make DHT DoS check configurable (requested by KoH)
...
- check can be disabled via property indexDistribution.dhtReceiptLimitEnabled
- upper bound can be configured via indexDistribution.dhtReceiptLimit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-21 19:28:42 +00:00
hermens
38a1410361
Don't test a remote peer's seed during hello.respond as its IP might not be proper, especially while still virgin
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2187 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-07 23:59:45 +00:00
orbiter
5041d330ce
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2150 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:44:50 +00:00
orbiter
90d569d70f
refactoring of index management:
...
url storage is part of index management; moved plasmaURL to indexURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:50:55 +00:00
orbiter
a930be4ba3
refactoring of index management:
...
generalized the index entry
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2121 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:19:20 +00:00
orbiter
7dd57a3828
added a busy-time estimation at DHT/RWI-Receive
...
to be done: usage of this value on client-side
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2116 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 14:52:00 +00:00
theli
fcec40fcc6
*) don't accept messages without subject or payload
...
See: http://www.yacy-forum.de/viewtopic.php?p=21656
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2115 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 11:57:17 +00:00
orbiter
82b2bc6932
patch for index-transfer DoS problem
...
see http://www.yacy-forum.de/viewtopic.php?p=21627#21627
note that this function will make the index-transfer functionality void
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-18 22:24:51 +00:00
orbiter
a474669338
start with refactoring of index management
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-16 16:11:55 +00:00
allo
799c04091d
Bugfix for Spam-Bug (Header manipulation)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2057 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-04 16:41:30 +00:00
orbiter
dbe96e6541
added hand-over of search filter and prefer ranking to yacy protocol
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2029 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 10:15:00 +00:00
orbiter
00a5d435e2
- fixed some bugs with domain filter
...
- added new ranking filter "prefermask": urls that match the filter are ranked better
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-13 23:19:36 +00:00
orbiter
bd283b8443
fixed bugs:
...
- null pointer exception during startup of a robinson-configured peer
- wrong time calculation of default value of re-crawl option
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2005 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-06 16:28:28 +00:00
orbiter
0a4c2e89ed
remote crawl orders are now only accepted if sum over all
...
queues is less than 100 (the indexing queue was not measured before)
see also: http://www.yacy-forum.de/viewtopic.php?p=19374#19374
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 14:23:24 +00:00
orbiter
1f4412a146
adapted isListed to the new behavior as discussed (url, getFile)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1940 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-20 22:31:59 +00:00
orbiter
3286b1f498
re-organisation of lurl-creation and -stacking
...
this was necessary to prevent useless writes to the database
when the url appears in the blacklist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 10:16:07 +00:00
hermens
289da326e5
*) Bugfix: remove blacklisted URL from loadedURL, when received via DHT transfer
...
see: http://www.yacy-forum.de/viewtopic.php?p=18976#18976
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1904 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 23:58:44 +00:00
rramthun
9f979d4fa5
Domain-lists gzip-compressible and sendable via cr-send/receive
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-13 20:12:31 +00:00
orbiter
f188611fc6
apply blacklist on rwis during dht receive
...
very experimental!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1865 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 10:46:02 +00:00
theli
5ee0125046
*) adding possibility to configure the server port for seed uploading via scp.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1861 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 16:34:05 +00:00
allo
7afa5c1b8e
staticIP fix
...
tried to solve http://www.yacy-forum.de/viewtopic.php?p=18663#18663
D 2006/03/08 07:08:20 YACY yacyClient.publishMySeed mySeed error - not proper: IP is not proper: -UNRESOLVED_PATTERN-
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1859 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 12:23:26 +00:00
theli
f108048a2c
*) Bugfix for NullPointerException in hello.java
...
*) Correcting for loop in hello.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1854 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 06:40:38 +00:00
orbiter
bae3783d38
added a snippet marking
...
(search words are now bold in snippets)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1823 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-05 01:11:06 +00:00
allo
f73d51f94b
reverted last change
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1810 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-03 19:20:35 +00:00
allo
8997b83806
store the staticIP(dyndns) in seed, not the real IP
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1809 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-03 17:33:05 +00:00
allo
7c5f8f997a
some more staticIP fixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1784 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-28 12:20:19 +00:00
orbiter
d31a4e0b4f
some small enhancements with cache flushing parameters and data structures
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1767 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-25 16:10:31 +00:00
hermens
3208fe14ed
*) log exceptions in crawlOrder.java to the logfile instead of stdout
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-22 01:04:38 +00:00
orbiter
7eb10675b3
re-organization of index management
...
this was done to be prepared for new storage algorithms
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-14 00:12:07 +00:00
theli
d0f76fc9bc
*) setting logging level for thread pools to info
...
*) new layout for bookmark list
(Allo: please take a look if it's acceptable for you)
*) crawlReceipt.java: displaying peer name in logging message
*) Network.html: adding button for manual peer ping
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-09 08:29:07 +00:00
orbiter
fb7411d7bb
re-structuring of ranking application:
...
concentration of all ranking attributes in the
plasmaSearchRankingProfile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-05 01:47:51 +00:00
orbiter
d98418390b
- introduced rankingProfile Class
...
- selection of ranking and timing profiles for each search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-04 23:51:00 +00:00