Commit Graph

820 Commits

Author SHA1 Message Date
theli
24a02cbeef *) Bugfix for unparsable application/xhtml+xml resources if
a URL has no extension
   See: http://www.yacy-forum.de/viewtopic.php?p=23687

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2280 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-07 05:36:19 +00:00
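Every entry in this log carries a `git-svn-id:` trailer of the form `<url>@<revision> <repository-uuid>`, so the original SVN revision can be recovered from the message text. The following is a minimal sketch of that extraction; the class name `GitSvnId` and the helper `revision` are hypothetical and not part of YaCy or git.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class GitSvnId {
    // Matches "git-svn-id: <url>@<revision> <repository-uuid>".
    private static final Pattern TRAILER =
        Pattern.compile("^git-svn-id:\\s+(\\S+)@(\\d+)\\s+([0-9a-f-]+)$");

    /** Returns the SVN revision number, or -1 if the line is not a git-svn-id trailer. */
    static int revision(String line) {
        Matcher m = TRAILER.matcher(line.trim());
        return m.matches() ? Integer.parseInt(m.group(2)) : -1;
    }

    public static void main(String[] args) {
        String line = "git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2280 "
            + "6c8d7289-2bf4-0310-a012-ef5d649a1542";
        System.out.println(revision(line));
    }
}
```

Returning -1 for non-matching lines lets a caller scan a whole commit message line by line without handling exceptions.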
orbiter
b0ca5fa784 some correction algorithm for preload time computation during assortment open
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2279 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-05 09:20:59 +00:00
orbiter
e22cbaee97 - extended logging for preload
- reduced preload-time for IndexImport_p.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2278 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-05 09:02:58 +00:00
orbiter
671fd9a5c9 work towards new indexing database structure
(no effect on current functionality yet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2277 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-04 14:47:27 +00:00
orbiter
92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
the start-up delay is used to pre-load the database node cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 23:57:33 +00:00
orbiter
6643da3fbd bugfix for http://www.yacy-forum.de/viewtopic.php?p=23463#23463
(affected URL DB Cleaner)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-01 09:51:00 +00:00
hydrox
8ba8e2b7d9 *) added cache for blacklisted URL hashes received by DHT. DHT does not request URLs listed in this cache.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-28 08:51:34 +00:00
hermens
53cbcc6d6e Implement emergency break in index receive when the limit of the ramCache is exceeded by more than cacheLimit
See: http://www.yacy-forum.de/viewtopic.php?p=22911#22911



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-27 11:14:30 +00:00
orbiter
66964dc015 removed high/med/low from kelondroRecords cache control.
this was done because testing showed that cache-delete operations
slowed down record access most, even more than actual IO operations.
Cache-delete operations appeared when entries were shifted from low-priority
positions to high-priority positions. During a fill of x entries to a database,
x/2 delete situations happened, which caused two or more delete operations.
removing the cache control means that these delete operations are not
necessary any more, but it is more difficult to decide which cache elements
shall be removed in case that the cache is full. There is not yet a stable
solution for this case, but the advantage of a faster cache is more important
than the flush problem.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-25 10:31:38 +00:00
borg-0300
4c6083b264 network picture;
back to old version

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-24 02:52:24 +00:00
borg-0300
955915385a network picture;
small changes;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-23 15:37:59 +00:00
borg-0300
027fa8ab1c network picture;
bigger; 
more dot steps; 
small other;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-23 13:53:29 +00:00
theli
b20496e42b *) make DHT DoS check configurable (requested by KoH)
   - check can be disabled via property indexDistribution.dhtReceiptLimitEnabled
   - upper bound can be configured via indexDistribution.dhtReceiptLimit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-21 19:28:42 +00:00
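A configurable check of this kind could look roughly like the sketch below. Only the two property names come from the commit message. The class `DhtReceiptCheck`, the default values, and the reading of the limit as a maximum number of entries per received transfer are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical sketch; not the actual YaCy implementation.
class DhtReceiptCheck {
    private final boolean enabled;
    private final int limit;

    DhtReceiptCheck(Properties config) {
        // Defaults ("true", "1000") are placeholders chosen for this sketch.
        this.enabled = Boolean.parseBoolean(
            config.getProperty("indexDistribution.dhtReceiptLimitEnabled", "true"));
        this.limit = Integer.parseInt(
            config.getProperty("indexDistribution.dhtReceiptLimit", "1000"));
    }

    /** Returns true if a transfer with the given entry count should be accepted. */
    boolean accept(int entryCount) {
        // With the check disabled, every transfer passes.
        return !enabled || entryCount <= limit;
    }
}
```

Keeping the switch and the bound as separate properties means the bound can stay configured while the check is temporarily turned off.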
orbiter
12af69dd86 cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 11:49:31 +00:00
allo
67a8c74be3 Fix for dynamic login with static password.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 08:04:51 +00:00
allo
ef9eb50c3c fix for adminlogin
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2209 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-17 11:15:16 +00:00
allo
6fe2fed87e cookieauth works with static Admin.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-16 08:04:02 +00:00
theli
45b39ee1be *) solving unpacking problems with too long filenames by
   a) renaming the parent folder in the tgz file to yacy
      (can be configured via build properties file)
   b) reconfiguring build file to throw an error if a file
      name is too long
Please note that currently there is _no_ problem with too long
class names because of step a.

      

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-14 15:18:41 +00:00
theli
fb090652df *) use a more compact name for plasmaWordIndexAssortmentImporter.java because the long name
caused problems during untar operation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-14 14:28:46 +00:00
theli
4ca0857c0c *) Index transfer now considers the pause time sent by busy peers during
index transfer / index distribution
   See: http://www.yacy-forum.de/viewtopic.php?p=22647#22491

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2205 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-14 09:40:42 +00:00
orbiter
75ed507d39 some debugging of new kelondroFlexTable class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-09 12:52:57 +00:00
orbiter
370c481fa7 bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2171 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-02 22:46:32 +00:00
orbiter
c36e9fc8d3 full integration of kelondroRow
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2167 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-02 12:45:57 +00:00
orbiter
c75cacda95 added a flex-width-array: this is a table where it is
possible to add columns to an existing table

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-01 16:01:24 +00:00
orbiter
4a907a570f 1st step to migrate kelondroTree to usage of kelondroRow instead of byte[][]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-31 23:31:46 +00:00
orbiter
09f780df27 more bugfixes for the new row/stack handling changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2160 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-30 21:24:01 +00:00
orbiter
3c3c047d0a integrated kelondroRow into kelondroStack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2156 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-30 15:28:05 +00:00
orbiter
5bb565944f integration of new kelondroRow into some parts of kelondro,
especially into the array storage

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-30 14:36:20 +00:00
orbiter
eaa6f012f0 refactoring: better naming for classic DB (files in WORDS)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:59:16 +00:00
orbiter
5041d330ce refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2150 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:44:50 +00:00
orbiter
7b3b12888c refactoring: integrated indexContainer abstraction layer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 01:09:31 +00:00
orbiter
cb295fbbdc refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2147 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 23:55:30 +00:00
rramthun
bc94a714b2 Better explanation for the auto-dom-filter.
Some javadoc.
Small change to DetailedSearch.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2146 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 12:18:12 +00:00
orbiter
196b8abb30 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2144 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 09:32:50 +00:00
hermens
b48327904a Don't disconnect peers that report 'busy' during index transfer.
These peers are already being marked as not accepting remote index transmissions by yacyClient.transferIndex. That should be enough to prevent further transfer attempts until newer seed information is received.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2142 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-25 00:59:14 +00:00
orbiter
4d8f8ba384 added cache-performance analysis for node caches
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-24 09:40:08 +00:00
orbiter
bd057b44dd - automatic setting of peer-does-not-accept-remote-crawl
- increased percentage of object cache to node cache to 30%

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2136 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-23 22:03:09 +00:00
orbiter
81e79f2caf fixed new cache behaviour changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-23 20:04:48 +00:00
orbiter
cda087f43b - integrated cache miss storage into object cache
- removed cache-miss handling from indexURL
todo: new Monitoring in PerformanceMemory_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-23 16:43:28 +00:00
orbiter
757ec28430 refactoring: better data encapsulation for indexURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-23 08:59:45 +00:00
theli
61078b3885 *) adding support for delayed shutdown
- needed by Ismael to receive the Steering page properly on shutdown
   - now the steering page should always be displayed properly in the web browser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-22 08:02:35 +00:00
orbiter
90d569d70f refactoring of index management:
url storage is part of index management; moved plasmaURL to indexURL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:50:55 +00:00
orbiter
a930be4ba3 refactoring of index management:
generalized the index entry

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2121 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:19:20 +00:00
hermens
df7e1d9df3 Changes to plasmaURL and subclasses:
- Improve performance of plasmaURL.exists() by remembering URL-hashes that are not present
- Use a more realistic estimation of memory usage by the existsIndex cache
- Routine cleanup of the existsIndex to limit its memory usage



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-17 13:08:57 +00:00
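Remembering hashes that are known to be absent is a negative cache: a repeated lookup for a missing URL hash can be answered without touching the database. The sketch below illustrates the idea under stated assumptions. `ExistsCache`, its fields, and the bounded `LinkedHashMap` eviction are hypothetical and are not taken from `plasmaURL`; a `Set` stands in for the URL database.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a bounded negative cache; not the YaCy code.
class ExistsCache {
    private final Set<String> store;            // stands in for the URL database
    private final Map<String, Boolean> misses;  // hashes known to be absent
    int lookups = 0;                            // counts real store lookups

    ExistsCache(Set<String> store, final int maxMisses) {
        this.store = store;
        // Insertion-ordered map that drops its oldest entry once it grows
        // beyond maxMisses, which keeps memory usage bounded.
        this.misses = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxMisses;
            }
        };
    }

    boolean exists(String urlHash) {
        if (misses.containsKey(urlHash)) return false; // answered from the miss cache
        lookups++;
        boolean found = store.contains(urlHash);
        if (!found) misses.put(urlHash, Boolean.TRUE);
        return found;
    }

    void put(String urlHash) {
        store.add(urlHash);
        misses.remove(urlHash); // a stored hash must no longer count as missing
    }
}
```

The `put` method has to invalidate the miss entry; otherwise a hash written after a failed lookup would still be reported as absent.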
orbiter
a474669338 start with refactoring of index management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-16 16:11:55 +00:00
rramthun
f08e33680c Added Blog-news-symbol as requested.
I think I will change the character distance a little bit later.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-15 15:42:06 +00:00
theli
f331def5d8 *) Bugfix for distribution. Incorrect behavior if peerCount == selectedCount
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2098 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-15 10:03:24 +00:00
auron_x
55ea4cbfe6 *)reverted patch for memory-display issue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-14 18:09:28 +00:00
theli
5048b05bc6 *) Index Transfer should only restart at the beginning if the delete
option is configured. Otherwise we have an endless loop

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-14 13:27:50 +00:00
auron_x
53d9ab6db7 *)fixed bug in PerformanceMemory_p.java which caused negative memory-values on big peers
see http://www.yacy-forum.de/viewtopic.php?t=2370

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-14 08:42:41 +00:00
theli
ddfe0f0e27 *) don't try to parse referer string if it's null
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-13 15:45:04 +00:00
theli
bcc950c533 *) Bugfix for Index Transfer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-13 15:28:57 +00:00
orbiter
015d044c25 tried to fix some problems with latest changes to httpc
very experimental!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-10 16:01:14 +00:00
orbiter
3e31820c3d - corrections to PerformanceMemory display of object cache
- configuration of object cache size in kelondroTree initializer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-10 09:08:42 +00:00
orbiter
461548698c configuration of index transfer chunk size
see http://www.yacy-forum.de/viewtopic.php?p=20951#20951
new properties in yacy.init:
indexDistribution.minChunkSize = 5
indexDistribution.maxChunkSize = 1000
indexDistribution.startChunkSize = 50

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 11:43:10 +00:00
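The three properties bound and seed the chunk size used for index transfer. As a minimal sketch, assuming the bounds are applied by simple clamping (the commit message does not say how the size is adapted at run time), reading them could look like this. `ChunkSizeConfig` and `clamp` are hypothetical names; the defaults repeat the values listed in the commit message.

```java
import java.util.Properties;

// Hypothetical sketch; only the property names and values come from the commit message.
class ChunkSizeConfig {
    final int min;
    final int max;
    final int start;

    ChunkSizeConfig(Properties config) {
        min = Integer.parseInt(config.getProperty("indexDistribution.minChunkSize", "5"));
        max = Integer.parseInt(config.getProperty("indexDistribution.maxChunkSize", "1000"));
        start = Integer.parseInt(config.getProperty("indexDistribution.startChunkSize", "50"));
    }

    /** Keeps a requested chunk size within [min, max]. Clamping is an assumption. */
    int clamp(int requested) {
        return Math.max(min, Math.min(max, requested));
    }
}
```

A transfer loop could begin with `start` and pass every adjusted size through `clamp` before using it.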
orbiter
29b1b0823c added monitoring of new object cache to performanceMemory page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 10:03:12 +00:00
theli
9104001e7c *) Better error handling for assortment import
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2067 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-08 07:58:22 +00:00
hermens
51e3bb576f Don't increase dhtTransferIndexCount when the last transferred index was smaller
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2064 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-07 17:44:33 +00:00
hermens
a0ca4c5fb8 Remove a possible race condition between DHT transfer and deQueue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-05 13:17:00 +00:00
hermens
0cfba8950f Removing unnecessary and possibly dangerous synchronization of the wordIndex
when deleting transferred indexes



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-05 12:48:17 +00:00
orbiter
d6213f8a85 quickfix for http://www.yacy-forum.de/viewtopic.php?p=19482#19482
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-25 15:35:25 +00:00
orbiter
b0036249c1 added some attributes to network picture
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-21 21:21:35 +00:00
hermens
cbcf7418ef Cleanup synchronization in plasmaWordIndex
-  only synchronize when changing data in more than one database
see: http://www.yacy-forum.de/viewtopic.php?t=2167



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 14:00:47 +00:00
orbiter
60e5aff9fc some enhancements to the remote crawl trigger
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 11:53:15 +00:00
orbiter
dbe96e6541 added hand-over of search filter and prefer ranking to yacy protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2029 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 10:15:00 +00:00
rramthun
0604203bce Updated and corrected German language file
Changed Italian language file for an Italian/English interface and not Italian/German

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2024 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-18 11:37:03 +00:00
orbiter
00a5d435e2 - fixed some bugs with domain filter
- added new ranking filter "prefermask": urls that match the filter are ranked better


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-13 23:19:36 +00:00
orbiter
14d6e476c9 tried to solve some problems with new picture viewer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-10 22:34:47 +00:00
orbiter
9324425165 fix for remote crawl reject
see http://www.yacy-forum.de/viewtopic.php?p=20075#20075

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-10 21:34:17 +00:00
borg-0300
30e4fc39a5 HTCache extended
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-08 13:05:04 +00:00
orbiter
d0dd8b14d2 fixed picture tag and presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2014 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-07 22:09:59 +00:00
borg-0300
da6a8bafa2 rename currCacheSize -> curCacheSize;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2010 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-07 13:15:15 +00:00
borg-0300
92110aea32 nullpointer fix for profile(); other minor change;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-07 12:43:59 +00:00
orbiter
f0833b0328 introduced simple search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2007 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-06 21:48:24 +00:00
orbiter
47b541b2d1 added better option handling in yacysearch
added depth option for image presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-05 10:34:24 +00:00
orbiter
c9e16bfd48 first try to insert image search (does not work yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2000 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-04 23:12:10 +00:00
orbiter
f77775220b fixed parser error
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-04 22:28:46 +00:00
orbiter
22de954a57 added some log output to parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1996 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-04 15:01:21 +00:00
orbiter
83e0e765ec redesigned some parts of the html scanner & parser
to better support image tags

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1995 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-04 14:36:01 +00:00
orbiter
ac114d69c0 tried to fix some problems with time-outs during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-03 23:26:08 +00:00
orbiter
e2e8d0c188 some kind of refactoring of yacysearch:
made 'room' for new picture search result presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-03 22:47:59 +00:00
orbiter
6b63e26cbb - removed search function from index.html/java, only input left
- added media fetcher/crawler class (not ready yet)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-03 15:36:53 +00:00
orbiter
bc3e80fe42 quickfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 23:03:17 +00:00
orbiter
d8d0ac29c3 added image-viewer servlet that can do:
- each image that is requested is stored in the cache
- the image is taken from the cache if it exists there
- the image can be scaled
Scaled images are created because of copyright problems.
In a further step, the retrieval of unscaled images will be restricted
to either access from localhost or access with given authentication.
This servlet can be used for image previews after an image search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1989 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 22:59:53 +00:00
orbiter
ddc6394d9b fixed bug about auto-depth 0
see http://www.yacy-forum.de/viewtopic.php?p=19751#19751

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1988 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 21:30:04 +00:00
orbiter
60351fa3f7 small fix to previous commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1987 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 20:52:34 +00:00
orbiter
a469874e3f added and fixed time-out behaviour during search
see also: http://www.yacy-forum.de/viewtopic.php?p=19823#19823

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 20:40:07 +00:00
orbiter
1d0b0d6e2a synchronized local searches to prevent several searches from being performed at the same time
see also: http://www.yacy-forum.de/viewtopic.php?p=19761#19761

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1985 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 18:51:18 +00:00
hermens
22b9d03bbf Correcting remaining time issue in getContainers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1984 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-31 09:52:55 +00:00
orbiter
d58788b753 added some synchronisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1982 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-30 15:38:46 +00:00
orbiter
e566d1d8d6 some bugfixes regarding new crawling options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1980 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-28 22:54:36 +00:00
orbiter
c7f1300300 -fixes for last commit
-some more ranking attributes (comments only)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1979 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-28 15:37:45 +00:00
orbiter
f2421f6a47 some small attribute changes regarding cache flush
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1974 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-27 23:14:04 +00:00
orbiter
7a650d0023 several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1971 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-27 16:45:29 +00:00
orbiter
59d52fb4a9 fixed some problems with crawl profiles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1967 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-26 14:52:01 +00:00
orbiter
708cc6c8d9 fixed some bugs for auto-filter and added monitor in profile list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-24 00:38:40 +00:00
rramthun
250864406f ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1955 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-23 20:24:53 +00:00
orbiter
e82899ba57 fixed missing urls map initializer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-23 16:13:23 +00:00
orbiter
63f39ac7b5 added 3 new crawling steering options:
- re-crawl by age of page (enter in minutes)
- auto-domain-filter
- maximum number of pages per domain
NOT YET TESTED!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-23 16:05:16 +00:00
orbiter
1fc3b34be6 some pre-work (without function yet) to implement:
- re-crawl (by age of last crawl)
- auto-crawl-filter by crawl depth (to be explained..)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 15:28:17 +00:00
theli
c9e6b5e391 *) check size of indexing-queue and crawler pool before processing remote triggered crawl jobs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1946 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 14:19:03 +00:00
orbiter
1509314ea6 set tighter control during DHT index and peer selection
see http://www.yacy-forum.de/viewtopic.php?p=19329#19329

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1945 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 13:54:35 +00:00
hydrox
fcc0683200 *) undoing last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1944 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 09:01:25 +00:00
hydrox
9411961eec *) another little fix for DHT-Transfer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1943 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-22 08:49:39 +00:00
hydrox
8b14a0c833 *) little fix for DHT-Transfer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1941 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-21 10:35:57 +00:00
orbiter
1f4412a146 adapted isListed to the new behavior as discussed (url, getFile)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1940 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-20 22:31:59 +00:00
orbiter
063ef4660a bug?
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1936 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 22:06:15 +00:00
orbiter
82358677a9 added another shiftK2W to flushCacheSome
this should fix the bug that the DHT cache is not flushed if there is no indexing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 21:33:31 +00:00
orbiter
128e4ab199 - in serverSystem: maxPathLength is now a variable, not a method
- upon startup the calculated maximum path length is shown

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1932 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 01:33:20 +00:00
orbiter
30e3e3a0fd adapted MAXPATHLENGTH to host system capabilities
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 00:29:27 +00:00
borg-0300
85bb8e32a1 Bugfix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1928 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 19:05:56 +00:00
borg-0300
3fe402069f try to fix
see: http://www.yacy-forum.de/viewtopic.php?p=19175#19175

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1927 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 18:51:52 +00:00
orbiter
f16f1f15cd bugfix for 100% CPU bug; thanks to Matthias for analysis
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 16:14:24 +00:00
borg-0300
254a13efd9 MAXPATHLENGTH used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1925 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 15:20:50 +00:00
borg-0300
8865948e4e Cleanup;
Method replaceRegex added;
Constant MAXPATHLENGTH added;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 13:34:32 +00:00
orbiter
6c70f4a0cf renamed wordHashes for a word hash set generation to wordHashSet
This was done because the wordHashes iterator will get another integer
parameter and then conflicts with the wordHashes set generation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1921 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 01:04:16 +00:00
orbiter
d5f8f40c31 removed correcting iterator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1920 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 01:01:00 +00:00
orbiter
488a0ed580 replaced old keyIterator and rowIterator by buffered iterators
that are synchronized with database access
The main change is in kelondroTree; the other classes are only adapted to it

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1918 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 23:43:24 +00:00
hermens
4e9a8f41fd rwiDBCleaner + dbImporter: Iterate over small excerpts of
word hashes instead of the whole DB especially while changing
the DB in the process.
see http://www.yacy-forum.de/viewtopic.php?p=19136#19136



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1917 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 23:39:10 +00:00
hermens
474379ae63 remove TABs from plasmaDbImporter.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1916 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 21:52:36 +00:00
orbiter
dba02f399f starting of re-design of kelondroTree iterator
- new access to iterator
- added IOException handling in many other classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1914 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 20:52:43 +00:00
orbiter
f02b426073 made kelondroTree.nodeIterator private
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1910 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 18:10:48 +00:00
borg-0300
5f6fdf1786 Bugfix for getCachePath(URL url)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1909 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 16:39:29 +00:00
orbiter
303b6463a8 added debug line to URL storage for testing
see http://www.yacy-forum.de/viewtopic.php?p=19129#19129

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1908 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 16:30:31 +00:00
orbiter
91dca2cd8d fixed a bug in last commit: LURL entries cannot be written,
because a stored property was not set to false (but true)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1906 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 13:07:34 +00:00
orbiter
3286b1f498 re-organisation of lurl-creation and -stacking
this was necessary to prevent useless write to the database
in case of blacklist appearance of the url

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 10:16:07 +00:00
orbiter
0b903c5317 removed usage of kelondroNaturalOrder from plasmaCondenser to experimentally
rule it out as the cause of a 100% CPU bug.
see http://www.yacy-forum.de/viewtopic.php?p=19076#19076

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1900 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 22:26:04 +00:00
orbiter
4239db0d1c fixed new ordering for backup iterator TreeSet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1899 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 22:20:28 +00:00
orbiter
33eba5ecb8 temporary disabling last change, does not work (cannot debug right now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1896 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 16:50:56 +00:00
orbiter
f0464042fc fix for latest iterator-replacement-fix:
the iterator generated a TreeSet which did not respect rotations
this has now been implemented using kelondroOrder objects
and by adding these rotation rules to the ordering

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1895 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 16:44:29 +00:00
borg-0300
ec21c585cb try to fix path too long
see http://www.yacy-forum.de/viewtopic.php?p=19079

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-15 20:16:15 +00:00
orbiter
a6a3f4b694 fix for svn 1888
this is a redesign of the no-iterator solution

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1892 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-15 16:01:42 +00:00
hydrox
8da13088e9 *)removed multiple DHT_Distribution_Threads
*)boosted DHT_Distribution sending chunk parallel to multiple peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1890 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-15 11:27:43 +00:00
orbiter
283a7181c6 try to fix new 100% cpu bug, possibly caused by iterator method
see http://www.yacy-forum.de/viewtopic.php?p=18900#18900

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1888 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-14 23:22:49 +00:00
orbiter
f588c0724f removed cache flush in case of DHT receive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1885 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-13 23:23:45 +00:00
orbiter
e94b374d56 update to cache flush method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1884 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-13 23:06:06 +00:00
orbiter
bcd99fe83e introduced a second RAM cache for DHT transfer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1880 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-13 10:43:12 +00:00
hydrox
360a460da8 *)URL-Cleaner: moved logging-statement to correct position
*)plasmaURLPattern: host is now added to the hashset in lowercase

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1879 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-12 18:08:48 +00:00
orbiter
02f9765013 quickfix for time problem during cache restore
see http://www.yacy-forum.de/viewtopic.php?p=18810#18810

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1878 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-11 21:55:32 +00:00
hermens
ad119f06af *) Don't overwrite new entries with older ones
see: http://www.yacy-forum.de/viewtopic.php?t=2015



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1874 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-10 16:28:01 +00:00
orbiter
be88687d8c fixed some problems with new cache flush grace period
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1873 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-10 13:57:30 +00:00
theli
d3da7c9a08 *) Adding support for robots Allow directive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1872 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 14:03:54 +00:00
hydrox
f046e1814a *) fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1869 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 12:35:50 +00:00
hydrox
c55c51e2a8 *)added keywords to IndexCleaner_p.java
*)updated Logging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1868 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 12:23:05 +00:00
orbiter
ddbeda738e added minimum age of word in cache to performance menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1866 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 11:31:17 +00:00
orbiter
f188611fc6 apply blacklist on rwis during dht receive
very experimental!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1865 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-09 10:46:02 +00:00
orbiter
0ec28b8f8e added DBCleaner from Hydrox
see http://www.yacy-forum.de/viewtopic.php?p=18093#18093
The servlet is now named IndexCleaner_p.
See http://localhost:8080/IndexCleaner_p.html
The servlet was adapted to fit in the overall architecture

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1863 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 22:06:11 +00:00
theli
fb4100d47b *) undoing last commit.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1856 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 07:59:32 +00:00
theli
a84cc71218 *) removing getTotalRuntime
- not needed anymore

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1855 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 07:37:21 +00:00
auron_x
dce08771d1 *) Fix for wrong estimated and elapsed times when import was paused
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1850 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-07 22:51:18 +00:00