Commit Graph

903 Commits

Author SHA1 Message Date
borg-0300
7a48090fcf - fix for "uk" language
- svn attributes added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5803 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-14 11:40:44 +00:00
orbiter
c3aff2521e fix for NPE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5789 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-09 13:32:56 +00:00
orbiter
44e01afa5b - refactoring
- a little bit more abstraction
- new interfaces for index abstraction

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5783 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-07 09:34:41 +00:00
orbiter
c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index.
Moved to version 0.74

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 13:23:45 +00:00
orbiter
5b138ada16 fixes to web structure reference collection and url construction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5775 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 08:29:40 +00:00
orbiter
7ba078daa1 - added fast site-operator
- refactoring merge into BLOBArray

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-02 13:26:47 +00:00
orbiter
587838bd09 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5758 6c8d7289-2bf4-0310-a012-ef5d649a1542 2009-03-30 21:13:53 +00:00
orbiter
96eaecda3e - added migration class to go from index collections to the index cell data structure.
- added better control over file deletion, because this sometimes fails, especially on windows

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5756 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-30 15:31:25 +00:00
lulabad
df87e4dbf6 missing count of send Index and URLs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5747 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-28 20:49:58 +00:00
orbiter
9a90ea05e0 added a merge operation for IndexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-18 16:14:31 +00:00
orbiter
83792d9233 more refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5722 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-16 16:24:53 +00:00
borg-0300
cdbdc731c5 small updates: unescape, isCGI
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5720 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-16 08:49:49 +00:00
orbiter
209f25f5f5 refactoring to integrate indexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5718 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-16 00:18:37 +00:00
borg-0300
359a238acf faster isCGI()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5717 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-14 13:47:49 +00:00
borg-0300
f75628e53b some corrections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-14 11:08:32 +00:00
orbiter
7dff1cba62 removed option to use different primary keys in kelondro tables
this option was never used and there is also no use to set other columns but the first as the primary key. as a result, access methods to the key do not need to compute key positions, and they work faster.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5711 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 16:52:31 +00:00
orbiter
7f67238f8b refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5710 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 14:56:25 +00:00
orbiter
14a1c33823 refactoring of wordIndex class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 10:34:51 +00:00
orbiter
13c666adef performance hack to ObjectIndex put() method:
Java standard classes provide a Map Interface, that has a put() method that returns the object that was replaced by the object that was the argument of the put call. The kelondro ObjectIndex defined a put method in the same way, that means it also returned the previous value of the Entry object before the put call. However, this value was not used by the calling code in the most cases. Omitting a return of the previous value would cause some performance benefit. This change implements a put method that does not return the previous value to reflect the common use. Omitting the return of previous values will cause some benefit in performance. The functionality to get the previous value is still maintained, and provided with a new 'replace' method. 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5700 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-11 20:23:19 +00:00
hermens
8c60d6d117 In DHT selection delete only those references that were actually selected
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-10 13:56:30 +00:00
orbiter
3b28daab40 code-beautification (to be consistent with external documentation paper)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5686 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-09 10:24:15 +00:00
orbiter
efcd95dc37 simplification of (internal) query process / refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 15:53:20 +00:00
orbiter
aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 11:04:13 +00:00
orbiter
6ffc6e3389 more refactoring of indexer and kelondro classes;
- integrating the indexer into kelondro as package 'text'
- renaming of classes in kelondro.index

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 10:00:32 +00:00
orbiter
76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-01 23:58:14 +00:00
orbiter
c728879ab8 fixes to yacyURL - more exceptions in case that urls are strange
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5657 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 22:33:47 +00:00
orbiter
7542336ae5 performance enhancement to yacyURL: omit second processing of resolveBackpath. This method is already applied during initialization of the object and was called a second time when the url was exportet.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5656 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 21:52:32 +00:00
orbiter
e521e81148 bugfix in yacyURL (for latest performance hack)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 07:46:47 +00:00
orbiter
54625360f7 performance update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5653 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-25 23:27:21 +00:00
orbiter
46632f4385 performance update to yacyURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-25 00:17:34 +00:00
orbiter
8444357291 added new row interator in kelondro tables files that enumerates rows
without an order by the primary key. The result is a very fast enumeration of the Eco table data structure. Other table data types are not affected.
The new enumerator is used for the url export function that can be accessed from the online interface (Index Administration -> URL References -> Export). This export should now be much faster, if all url database files are from type Eco
The new enumeration is also used at other functions in YaCy, i.e. the initialization of the crawl balancer and the initialization of YaCy News.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 10:40:20 +00:00
lotus
6117e083e5 option to customize tray label (tooltip) with tray.label
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-23 21:07:08 +00:00
lotus
9519d84372 changed "dooble" variable to "browserintegration" to be less specific
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-22 17:32:17 +00:00
lotus
8429083972 adjusted tray for dooble:
you can now set dooble=true in yacy.init to disable the menu and browser popups by default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-22 15:11:44 +00:00
orbiter
c852d2d70e - reject too old seeds
- do not store the complete seed in the reverse name cache, only the hash of the peer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-21 00:39:44 +00:00
orbiter
4f9dae2571 remove reference in crawl entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5623 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 22:58:00 +00:00
orbiter
c12bb8a6d0 - refactoring of the http client
- added a protection against memory leaks for the access tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 16:24:46 +00:00
orbiter
62505bb3cb more bugfixes as recommendet by findbugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-17 09:12:47 +00:00
orbiter
94c42691d8 - reject less transmissions as transmission receiver
- do not flag too much receiver when something goes wrong during transmission as sender

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 21:28:48 +00:00
orbiter
26978b2a25 - better memory protection in kelondro caches: computation of needed memory for cache grow
- removed excessive gc calls
- step to 16 vertical DHT partitions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-15 23:35:59 +00:00
orbiter
be0c492ae5 fix for memory leak bug in new dht transmissions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-14 00:01:05 +00:00
orbiter
40d9849aa4 - better control of chunk size in dht selection
- more restrict values in selection
- step to 4 vertical partitions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-12 14:47:32 +00:00
orbiter
411f2212f2 more memory leak fixing hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-11 13:31:10 +00:00
orbiter
985d421f91 found and fixed some memory leaks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-11 11:24:15 +00:00
orbiter
6c627dbdff update to the server core
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-10 13:26:26 +00:00
orbiter
5393f356aa fix for termination problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-10 01:08:06 +00:00
orbiter
6a876ecb88 first fixes to the DHT transmission process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-10 00:48:54 +00:00
orbiter
c25c334b75 replaced old DHT transmission method with new method. Many things have changed! some of them:
- after a index selection is made, the index is splitted into its vertical components
- from differrent index selctions the splitted components can be accumulated before they are placed into the transmission queue
- each splitted chunk gets its own transmission thread
- multiple transmission threads are started concurrently
- the process can be monitored with the blocking queue servlet
To implement that, a new package de.anomic.yacy.dht was created. Some old files have been removed.
The new index distribution model using a vertical DHT was implemented. An abstraction of this model
is implemented in the new dht package as interface. The freeworld network has now a configuration
of two vertial partitions; sixteen partitions are planned and will be configured if the process is bug-free.
This modification has three main targets:
- enhance the DHT transmission speed
- with a vertical DHT, a search will speed up. With two partitions, two times. With sixteen, sixteen times.
- the vertical DHT will apply a semi-dht for URLs, and peers will receive a fraction of the overall URLs they received before.
  with two partitions, the fractions will be halve. With sixteen partitions, a 1/16 of the previous number of URLs.
BE CAREFULL, THIS IS A MAJOR CODE CHANGE, POSSIBLY FULL OF BUGS AND HARMFUL THINGS.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-10 00:06:59 +00:00
orbiter
65a1de6c05 longer timeout for remote crawl queries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-04 16:55:09 +00:00
orbiter
78b7361937 fixed problem with logging
YOU MUST DELETE DATA/LOG TO MAKE THIS WORK! (sorry..)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-01 00:21:08 +00:00
orbiter
94110df85a moved logging partially to kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 01:06:56 +00:00
orbiter
024da2916b refactoring of logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 23:33:47 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- seperated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00
orbiter
bf93767ec6 refactoring of kelondro database classes
(to be continued)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 15:33:00 +00:00
orbiter
fc27bf8c4c refactoring of kelondro classes:
kelondro shall become independent from other packages.
moved bytebuffer, date and memory to kelondro

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 14:48:11 +00:00
f1ori
aaafe05c02 * revert debug change
* contains instead of startsWith, because there might me localizied strings
* decode punycode for every domainpart seperately (see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1749)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-24 00:33:38 +00:00
orbiter
419469ac27 added more methods to control the vertical DHT (not yet active .. )
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-23 15:32:27 +00:00
orbiter
dedfc7df7f removed distinction between DHT-in and DHT-out. This is necessary to make room for the new cell data structure, which cannot use this this distinction in the first place, but will enable the same meaning with different mechanisms (segments, later)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-22 00:03:54 +00:00
orbiter
5080fc33bf fix for http://forum.yacy-websuche.de/viewtopic.php?p=12247#p12247
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5506 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-20 22:57:58 +00:00
low012
78778df464 *) this should adjust the Dev/Main detection of the updater to the new version numbers (0.7x is Dev, if x != 0)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-18 21:48:03 +00:00
orbiter
9d119c6b61 migration of auto-update rules to new release strategy:
next stable will be 0.7, development releases are 0.*x, experimental will be if x = 1, 2, 3

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5458 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-09 10:08:11 +00:00
lotus
c8451614f3 fix for overflow
http://forum.yacy-websuche.de/viewtopic.php?p=11696#p11696

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:28:27 +00:00
orbiter
c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5437 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 02:27:29 +00:00
orbiter
6072831235 no cr transmission for robinson peers
see also: http://forum.yacy-websuche.de/viewtopic.php?p=10290#p10290

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 23:44:42 +00:00
orbiter
e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
- refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:22:13 +00:00
f1ori
2d2ce24011 * remove all encoding-stuff from proxy
encoding is handled by parsers or browser, proxy only passes through


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 19:14:54 +00:00
lotus
449e697436 fix for null-seed in seedfile
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1653

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 12:10:01 +00:00
orbiter
2802138787 - refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck)
- fix of shallBeOwnWord target computation heuristic


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-15 00:02:58 +00:00
orbiter
1779c3c507 - added a read cache to the RAFile interface to RandomAccessFile
- added a write buffer to BLOBHeap
- modified the BLOBBuffer (is now only to buffer non-compressed content)
- added content compression to the HTCache
The new read cache will decrease the start/initialization time of BLOB files,
like the HTCache, RobotsTxt and other BLOBHeap structures.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-10 11:15:19 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
orbiter
c6525ab75f fix for NPE in seed handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5371 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-01 23:08:27 +00:00
danielr
538359a0ff simple fix to get DHT working again (maybe something more has to be done ;)
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1578



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5327 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-11 18:55:16 +00:00
f1ori
7e1fe05e3c * added utf8-encoding to many getBytes-calls
* utf8 should work now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-08 20:24:31 +00:00
orbiter
3f746be5d4 - consolidation and refactoring of many DHT target - computing methods
- implemented vertical DHT acceptance ("my own DHT") to accept new targets
- added new target computation for global search: addresses vertical targets also
- enhanced remote crawling: collection of remote crawl urls if queue has less than 100 entries (was: 0 entries)
- better performance value computations for PPM selection in network configuration

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-06 10:07:53 +00:00
orbiter
d014b2728a Design-check, Extension and Refactoring of DHT target position computation:
- two different computations (but mathematical equivalent) of the DHT distance had been consolidated
- moved from 0.0 .. 1.0 double-range position computation to 0 .. Long.Max range for DHT targets
- added fast Long - to - hash computation
- high-precision target computation of gaps for new peers
- added new target computation for horizontal and vertical DHT targets (not yet in use)
- old horizontal-only DHT targets will be upwards compatible to new horizontal and vertical DHT positions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-03 00:27:23 +00:00
orbiter
ffed5fc415 fixed problem with lost peers in database
migrated seedDB from BLOBTree to BLOBHeap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-10 14:40:02 +00:00
orbiter
6fb865fbdc - fix of bug in iterator in kelondroBLOBHeap which caused bug in crawl profile listing
- some refactoring of classes that use kelondroMap (Map instead of HashMap)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5262 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-10 08:39:11 +00:00
orbiter
820a03f9d6 - removed some warnings
- used fix in SVN 5233 for ysearch.java and search.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5237 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 20:20:39 +00:00
lotus
dda771db9d - search result layout
- tray only for windows

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 12:39:57 +00:00
lotus
31c31e54e4 new tray icon image for different icon sizes (e.g. linux)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5216 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-28 08:54:33 +00:00
f1ori
9589dfe080 * removed trayicon popupmenu title
* added some menu items to trayicon


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-27 08:25:16 +00:00
lotus
5a637f004d localized tray
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-26 11:09:54 +00:00
lotus
9d4f0325e1 - removed shutdown from search page (we have it in tray now!)
- fixed doubleclick action for tray

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5211 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-26 10:55:08 +00:00
lotus
214277dad6 - revert r5202
- cleanup
- installer checks for JRE 1.6 only

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-26 06:52:36 +00:00
f1ori
7afa084207 * add nativ java trayicon, using reflections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5209 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-25 19:36:49 +00:00
orbiter
6e7d113eac fix for wrong index initialization after network switch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5203 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-23 23:30:25 +00:00
orbiter
00c1535f84 added ranking and evaluation of language type in a search
the wanted language is taken from the browser user-agent string

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-21 00:04:42 +00:00
orbiter
bfcf9b7aa3 - added language detection using metadata from documents: html and odt documents provide this information
- metadata and results from statistical analysis are compared and result is printed out as debug lines
- added ranking profile for wanted language
- added class with ISO 639 table, a list of all valid country codes that will be used for the language identification

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5187 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-19 22:19:11 +00:00
orbiter
ce2a7ed116 integrated language detection classes into condenser environment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-18 13:12:33 +00:00
orbiter
0cd0fee546 fixed bug with wrong proxy result enqueueing. See:
http://forum.yacy-websuche.de/viewtopic.php?p=8130#p8130
- removed the online status property. This influenced the proxy behavior and created some complexity that was not needed because the online status was never used as it was ceated for (offline browsing)
- checked all proxy identification procedures during crawling and enhanced transparency and error checking
- fixed a proxy identification routine that caused the wrong selection of the proxy result queue

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-16 21:56:23 +00:00
orbiter
1eb813bd43 shifted index deletion-on-exit rule to the class where the errors are produced
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5141 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-12 11:51:48 +00:00
orbiter
3f3673b6e5 extended balancer:
- added automatic time delay in case that a large number of urls come from the same domain
- added additional time delay in case that an url is a dynamic (CGI) url. This shall cause less IO on targets


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-08 21:50:37 +00:00
orbiter
d09ddabd09 corrected a design mistake (5-byte hashes not necessary)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-04 21:28:00 +00:00
orbiter
77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet
- added 'delete all' button to all results of such a domain statistic output which causes that all urls to this domain are deleted
- extended stack cleaner to clean also the statistics: they are not completely destroyed, only the smallest counting domains are removed


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-04 19:41:57 +00:00
lotus
423a89ebe8 * fix if yacy was installed to a path with whitespace
* show nice dots when waiting for restart/update

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 18:49:02 +00:00
orbiter
ead39064c5 fixed problem with wrong result number calculation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 10:04:46 +00:00
orbiter
05dbba4bab added logging conditions to all fine and finest log line calls
this will prevent an overhead for the generation of the log lines in case that they then are not printed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 00:30:21 +00:00
orbiter
536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-25 18:11:47 +00:00
borg-0300
08cdf6db8a fix for wrong "VegaYacyB" peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5077 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-24 11:30:00 +00:00
danielr
753a1ae430 - changed default browser from netscape to firefox
- fixed "Inefficient use of keySet iterator instead of entrySet iterator" [WMI_WRONG_MAP_ITERATOR, FindBugs]
- fixed some possible null pointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-20 07:54:56 +00:00
danielr
be28af50f5 - fixed "yacy2yacy no proxy"-problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-17 10:16:32 +00:00
danielr
a087090bbb fixed starting crawl results in "No parser available to parse mimetype 'application/octet-stream'"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-10 11:31:40 +00:00
hermens
3ac1988059 Add some sanity checks for invalid seeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-08 13:56:29 +00:00
danielr
621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
- removed unnecessary code (unused variables, String.toString)
- corrected some calculations (cast int to double or long ;)
- improved little performance (using Integer.valueOf() instead of new Integer)
- log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions
- finalized some (more) fields
- finally close some streams
- made inner classes static if not using environment
- generalized some equals (from specificClass to Object)
- fixed some potential nullpointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-06 19:43:12 +00:00
danielr
17b7845eb5 * refactoring
- moved constants from plasmaSwitchboard to own class (all 232 ;)
- moved remoteProxy-Methods to httpRemoteProxyConfig, better names
- removed some unnecessary code (else-statements)
* formatting (correct indentation)
* minor bugfixes (due to findbugs.sf.net)
* hopefully fixed "missing quote" (announcing StringParts as UTF-8)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 13:57:00 +00:00
danielr
3bb870bfcd added final where possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 12:12:04 +00:00
lotus
694084c570 fix for NPE on shutdown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5021 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-27 06:59:56 +00:00
lotus
d42eae25f8 yacyTray:
fix for unproper shutdown
some messages

installer:
start shortcuts minimized

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5014 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-24 06:49:30 +00:00
orbiter
c3d461d191 - removed superfluous copyright statement
- updated my email address

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-20 17:14:51 +00:00
lotus
62afea0c9f some improvements for yacyTray
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5008 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-18 14:17:52 +00:00
lotus
28c39e2aa4 fix for new starter files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5002 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-14 18:41:11 +00:00
lotus
fa695c2d9f tray is now only shown on Windows and doesn't block on linux
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4997 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-13 19:03:38 +00:00
lotus
f8a1e3175e new yacyTray
this will make a YaCy icon in the tray area on supported platforms
enabled by default
the search page will open on double click

used JDIC 0.9.4 from https://jdic.dev.java.net/

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-13 07:51:45 +00:00
orbiter
38eb5bd1ee fixed a bug in kelondroBLOBHeap. The following files are probably inconsistent and should be deleted:
DATA/HTCACHE/responseHeader.heap
DATA/PLASMADB/crawlRobotsTxt.heap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4988 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-11 10:38:20 +00:00
orbiter
0f5fe8cc53 refactoring of method calling for objects from kelondroMapDataMining
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4985 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-11 07:15:46 +00:00
orbiter
01d1ae6676 patch for negative time in case that the time of the computer is changed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4984 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-11 07:05:08 +00:00
orbiter
4acf0a61cd refactoring of kelondroObjects (mainly renaming to kelondroMap)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4982 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-10 22:08:16 +00:00
orbiter
1400cdc91e - refactoring of resourceObserver (moved it to crawler)
- partly redesign of diskUsage: little bit more functional behavior, less side effects, better error case handling
- the resourceObserver can now show a error message if the diskUsage is 'out of order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 00:03:37 +00:00
orbiter
a6719dfd2b - refactoring of robots parser
- no more keep-order parameter in remove (it was not possible to make this strict, and not useful)
- some small enhancements in balancer
- robots parser without references in switchboard
- changes synchronization in robots

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-05 00:35:20 +00:00
orbiter
e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
the most important fix was the addition of the yacybot user-agent for robots.txt loading,
because web masters look for that access to see if the crawler behaves correctly.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-04 11:03:03 +00:00
orbiter
474659a71f - modified and enhanced the crawl balancer: better list export, fixing of damaged crawl queue at start-up, re-sorting at start-up to enhance domain order
- added option to set minimum crawl delta for domains in balancer
- added default values to crawl deltas in yacy.init
- added configuration for these deltas in performance queues
- enhanced performance setting computation (more time for indexing queue for a faster flush
- remote crawling is now enabled during local crawling if indexer has space and time for more links
- added database stub for new distributed file system
- refactoring of time computation to get an abstraction level that will be used by a TTL rule in new distributed file system

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4966 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-03 13:08:37 +00:00
orbiter
080cda97ef added another peer selection rule:
- select also non-robinson (dht-) peers if their peer tags match with search words
- the peer tag '*' can now act as catch-all rule: shall be selected always

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 23:04:32 +00:00
orbiter
d37fd064f9 changed peer selection for search targets:
- less dht targets are selected
- more other peers are selected: all robinson peers with more than one million urls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 22:42:52 +00:00
orbiter
69aac0d74c modified the diskUsage class regarding the following two aspects:
1. The usage and dependency of the plasmaSwitchboad was used many times in the past but this was
a bad mistake. The classes should be independent from the switchboard to support a better abstraction. Therefore the object was removed. The parameters from the switchboard are computed outside and then handed over.
2. the class is considered as a tightly connected to hardware resources. Classes which handle data that cannot be replicated because it would need to replicate hadware should not support dynamic object allocation, but should be coded as collection of private static methods. Therefore all class objects had been transformed into static private objects.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 21:47:53 +00:00
danielr
0c1dc703e4 - set staticIP at startUp
- added setting for reduced menu (simpleMenu)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-29 18:35:15 +00:00
lotus
ac85c52bae better readability for MIN_FREE_DISK_SPACE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-21 10:20:36 +00:00
danielr
6b7e873962 resourceObserver refactoring and some synchronisation for console output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4939 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-19 12:40:44 +00:00
lotus
877299cc74 better installer on Windows Vista
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4927 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 18:34:12 +00:00
danielr
68c38c2d34 - WatchCrawler shows status without JavaScript
- Performance can be scaled + DHT-profile
- names for pool-threads
- some small refactorings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 10:24:58 +00:00
lotus
fc79f013c4 better solution to update shortcut
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4920 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-12 20:04:32 +00:00
lotus
48edbef5c7 * fix: display proper port on 1st startup
* new message on portchange
* first implementation of external link-update for search page (still inactive)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4915 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 19:04:39 +00:00
det
0727bb1e63 rework of console message handling; add of debugging output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4914 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 18:43:12 +00:00
orbiter
f5ef7f222e - fixed a bug in parser (directory paths had not been recognized)
- no access check when a search is made only local without snippet fetch
- added comment and status message in resourceObserver (this takes very long at startup time!)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4911 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 09:54:58 +00:00
hermens
75b4a5ced4 reinstate old timeout values for transferRWI and transferURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4903 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-07 23:48:46 +00:00
orbiter
3330181aa0 refactoring:
find a better way to store BLOBs; generalize current BLOG data structure (kelondroDyn)
and prepare it to replace it with something better. The best candidate is the kelondroHeap,
which will become the kelondroBLOBHeap;
removed also some never-used classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-07 23:12:24 +00:00
lotus
260553c3a5 better messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4897 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-07 14:24:11 +00:00
danielr
4b71912e76 fixed wrong class name
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4894 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 17:13:31 +00:00
danielr
7feae906aa - organize imports
- removed potential null pointer accesses
- removed unnecessary casts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 16:01:27 +00:00
det
f597185026 Initial import of the resource observer framework
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4892 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 13:10:21 +00:00
orbiter
e0e7f86f82 some bugfixes for the peer-ping process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4885 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 12:52:27 +00:00
danielr
cbe722c480 small code cleanUp
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4884 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 12:41:55 +00:00
orbiter
b21598bdd0 - enhanced handling of own IP address inside seed
- prevention of false information of own IP address
- enabled searching before an own IP address is assigned (before first ping happened)
- removed warning about limited search function
- added better time-out settings for peer-ping process (10 seconds complete, 5 seconds for back-ping)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 11:01:20 +00:00
orbiter
40d7f485f3 - fixed several NPE bugs
- fixed loosing of own seed hash (hopefully)
- fixed a bug with crawl start s beginning with (bookmark) files
- added better IP recognition during hello process


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4882 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-04 22:24:00 +00:00
orbiter
2f381b8d7a - fixed at least two causes for a NPE after a use case switch.
A large refactoring was neccessary
- added another crawl start option: automatic restriction to sub-path
- removed crawlStartSimple and renamed crawl start expert
   to crawl start (without expert)
- some changes to texts in crawl start
- added some more deletions when an web index is deleted:
   delete also queues and robots cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-04 21:34:57 +00:00
orbiter
74e3a547db more logging for mySeed loading error cases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4871 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-01 11:24:46 +00:00
orbiter
698293ef32 fix for lonely peers in networks with only one peer, especially intranet peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-26 20:48:14 +00:00
orbiter
8a0e401320 - fix for bad code in peer actions
- fix for bad images in basic config

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4854 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 22:56:28 +00:00
orbiter
0b52ef3e4b - update of grafics
- check for startsWith("127.") instead of equals("127.0.0.0")

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 20:30:37 +00:00
orbiter
18ad12eceb added another fix for localhost addresses in seeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 18:41:56 +00:00
orbiter
11e00a0849 - refactoring of seedURL handling
- additional check for seedURL pointing to localhost: deny such peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4851 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 18:35:38 +00:00
lotus
f284386b63 update deploy improvements for windows - ready for release now :-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4849 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-24 20:03:11 +00:00
orbiter
25192e0d36 added a deletion button to indexControlRWIs that deletes the complete web index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4847 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-24 12:30:50 +00:00
lotus
eac62a6882 * ported restart on Windows to unix-style, works on _noconsole now
* removed Win9x scripts from build for more tidiness and less decisions for newbies

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4835 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-22 15:48:51 +00:00
orbiter
03438ee977 added missing implementation of network-path reference
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4834 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-22 00:08:14 +00:00
orbiter
2ba7914f0b fix for NPE exception while fetching remote crawl jobs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4833 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-21 23:07:37 +00:00
lotus
4a48717017 * automatic update for windows
pleas disable before release because 2nd update fails at the moment
and commandline handling has to be improved for windows
* update via new unTar class
please review stream- and exceptionhandling because I'm fairly new to Java
maybe it can be done concurrent
* updated windows startscripts to values from yacy.init

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4832 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-21 15:23:56 +00:00
orbiter
2f29ab8779 more target server access security
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4809 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-16 19:50:28 +00:00
orbiter
cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
from the ConfigNetwork online interface
- to make this possible, a large refactoring and reorganisation of data structures was necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4803 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-14 21:36:02 +00:00
orbiter
78087da287 - changed seed file storage to clear text
- fixed kill script
- fixed saving of seed file (had been corrupted by latest changes)
- some refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4799 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-14 20:30:44 +00:00
orbiter
db032fb6de - added RWI transmissions to the event terminal
- fixed bug in Collage
- added 'embedded mode' to collage
- integrated Collage to terminal_p as iframe in embedded mode (Pictures now visible in terminal_p)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-13 11:46:20 +00:00
hermens
5bfc02ccfb Repair publishThread
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4781 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-09 11:14:32 +00:00
orbiter
b32736762c enhanced rssTerminal
- 3 lines possible
- distinguishing of private and public data, if not authorized only public data is shown
- shows now more events, including local searches in clear text if user is logged in
- simplyfied peer events
- better recognition of 'real' new peers
- presentation of peer pings from other peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4771 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-06 23:05:48 +00:00
orbiter
1689030ee8 refactoring: moved all crawler classes into their own package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4768 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-06 00:32:41 +00:00
orbiter
d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back)
This change is inspired by the need to see a network connected to the index it creates in a indexing team.
It is not possible to divide the network and the index. Therefore all control files for the network was moved to the network within the INDEX/<network-name> subfolder.
The remaining YACYDB is superfluous and can be deleted.
The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy).
The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT).
No other functional change has been made. The next steps to enable network switcing are:
- shift of crawler tables from PLASMADB into the network (crawls are also network-specific)
- possibly shift of plasmaWordIndex code into yacy package (index management is network-specific)
- servlet to switch networks 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-05 23:13:47 +00:00
danielr
d7b21bc90c re-added gzip POST for transferRWI/URL (HTTP/1.1 compliant)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-04 10:53:04 +00:00
danielr
d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-03 09:06:00 +00:00
orbiter
d0678f7ab9 refactoring as result of
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=959&p=7560#p7560

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-01 22:40:42 +00:00
orbiter
483e9a2066 - shifted tld recognition methods from yacyURL to serverDomains
- changed isLocal Property in such a way that it is possible to see if a domain is in the internet (and not intranet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-30 23:06:42 +00:00
orbiter
5813cc149f fix for bad rssTerminal behavior
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4744 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-28 20:34:37 +00:00
orbiter
d0b893523e - protection against RAM overflow caused by new peer rss news
- more XSS protection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4742 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-26 22:53:04 +00:00
orbiter
cf042e6957 reverted change by mistake in yacyVersion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-26 01:08:59 +00:00
orbiter
9935e83c86 added new news window into the status page. At this moment it is just a test.
The news inside the window are about peer arrivals and departures, remote search accesses and crawls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-26 01:00:10 +00:00
orbiter
bac38cfa18 added very rudimentary peer news as rss feed. An example can be retrieved with
http://localhost:8080/xml/feed.rss?channel=PEERNEWS
to be extended and integrated in interface ...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 23:30:13 +00:00
orbiter
724bbdf9b2 refactoring of RSS reader
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4736 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 21:31:07 +00:00
orbiter
b9a2a2d287 more search performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 15:09:06 +00:00
orbiter
e024e3b9cf added new default profiles to distinguish snippet fetch for local and global search
the difference is, that a local search will no not cause a re-indexing of loaded pages

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 08:42:08 +00:00
danielr
9b03310f8a bin jetzt wach :/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4729 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-23 07:50:21 +00:00
danielr
7bd8601f04 delete old releases compatible with java 1.5 ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4728 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-23 07:22:20 +00:00
danielr
da386a1924 fixed deleteOldDownloads if there are no downloads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-22 21:36:52 +00:00
danielr
21418a22a3 removed DEBUG output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4725 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-22 17:14:34 +00:00
danielr
79a3edeeef deleting downloaded releases after x days (default 30)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4724 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-22 16:53:53 +00:00
orbiter
1995faef8d - refactoring of Colage back-end: move to plasma package
- renamed also the plasmaCrawlResults to have a consistent naming for url and image queues
- added a double-check for the images
- added additional queues for the images: all worse-quality images go there, so the queue can be used also if no sizes are given; no image is lost
- added a cleanup for the stacks so they cannot flood the memory

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4722 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-21 22:42:49 +00:00
orbiter
5e3ce46339 - better logging when rejecting a url because it is not in declared domain
- more XSS attack protection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4720 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-20 21:36:25 +00:00
orbiter
2f629d20a7 - tried to fix the '4217666-problem'
- removed more unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4715 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-19 04:24:29 +00:00
orbiter
04c1226c80 added/fixed missing integrity-test else-case during deploy in case that we update with a tar file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4700 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-15 15:20:35 +00:00
orbiter
8fe39ebd74 -fixed file transmission with POST. The only usage was in ranking transmission, therefore:
-fixed ranking transmission

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4681 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-12 08:12:51 +00:00
orbiter
82a9861779 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4680 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-11 12:55:43 +00:00
orbiter
5d1fbb25e7 fix for bad deploy:
- the name of downloaded release files is adopted if the httpc delivers uncompressed tar.gz files (the .gz is removed from the file name)
- the deploy method is able to handle tar-file (not tar.gz-files)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4679 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-11 12:37:17 +00:00
orbiter
202a3adb3e refactoring of HttpClient Writer processes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-10 22:47:05 +00:00
orbiter
e356625b22 - refacotring of stream copy handling to support time-consuming operations
- made usage of BufferedStreams explizit to distinct different copy method in serverFileUtils (byte-by-byte and using an own buffer)
- introduced another timeout setting (java internal property)
- more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-10 09:53:07 +00:00
orbiter
c3342e1178 - removed class with only one static method
- removed connection method with too long time-out

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-09 23:35:20 +00:00
orbiter
f97971b63b fixed NPE problems doing a shutdown from command-line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-09 22:59:17 +00:00
danielr
7a35126e91 http timeouts von alten httpc wieder gesetzt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4670 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-09 11:02:14 +00:00
orbiter
2c1c3bb6eb - some refactoring (sorry Daniel, hab in deinem Code rumgewütet)
- fixed broken downloads (flush was missing)
- different problem handling when download is corrupted
- different default values in yacy.init

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4669 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 21:36:33 +00:00
danielr
d96e2badc7 - fixed POST in proxy
- prepared http connection tracking
- refactoring (mainly moving StreamTools to serverFileUtils)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 21:17:40 +00:00
danielr
ab330cfdca Network.html: removed ; from location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 08:13:38 +00:00
danielr
5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:17:16 +00:00
orbiter
7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
- refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling
- removed unused code parts from condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 15:37:49 +00:00