Commit Graph

1305 Commits

Author SHA1 Message Date
karlchenofhell
0c7b8cf632 - added first version of new wiki-parser
- added blacklist support to manual URLFetcher stack fill
- fix for NPE: http://www.yacy-forum.de/viewtopic.php?t=3559

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 22:31:36 +00:00
orbiter
f7803a6ce4 enhanced crawl balancer
- new domains now get a chance to get crawled early
- less IO operations
- new balancing method
- better dump order at shutdown time
- bugfixes regarding not found url hashes (no more superfluous cache kill)
- domain access time is now shared over all balancer stacks
- viewing the stack no longer disturbs the balancing algorithm as much
- intelligent selection of best next domain using domain access times
- extra double-check (to double-check the double-check)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 16:23:31 +00:00
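The "selection of best next domain using domain access times" mentioned in the commit above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the balancer code from this commit: the class `DomainPicker` and its methods are hypothetical names, and treating a never-accessed domain as "accessed at time 0" is an assumed way to give new domains an early chance.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative and not taken from the YaCy sources.
class DomainPicker {
    // Last access time per domain in milliseconds, shared by all balancer stacks.
    private final Map<String, Long> lastAccess = new HashMap<>();

    void recordAccess(String domain, long timeMillis) {
        lastAccess.put(domain, timeMillis);
    }

    // Returns the candidate whose last access lies furthest in the past.
    // Assumption: a domain that was never accessed counts as time 0,
    // so new domains are preferred over known ones.
    String bestNext(List<String> candidates) {
        String best = null;
        long bestTime = Long.MAX_VALUE;
        for (String domain : candidates) {
            long t = lastAccess.getOrDefault(domain, 0L);
            if (t < bestTime) {
                bestTime = t;
                best = domain;
            }
        }
        return best; // null if there are no candidates
    }
}
```

Keeping the access times in one shared map, rather than one map per stack, mirrors the note that domain access time is shared over all balancer stacks.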
orbiter
39b0658839 Redesign of Webinterface menu structure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 14:35:29 +00:00
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
enhanced database seek-time by avoiding write operations at distant
positions of a database file. Until now, a USEDC counter was written
at the head-section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records that are contained in the database. Now, this
value is computed from the database file size. This is done either
only once at start-time, or continuously when run with assertions enabled.
The counter is then updated only in RAM, and written at close of the
file. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number it is strong
evidence that YaCy was not shut down properly.
To preserve consistency, the complete storage routine had to be rewritten.
Another change speeds up reading of nodes in some cases, where the data-tail
can be read together with the data-head. This saves another IO lookup during
each DB node fetch.
Also includes many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
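The idea of deriving the record count from the file size, as described in the commit above, can be illustrated with a small calculation. This is a sketch under assumptions, not the kelondroRecords implementation: it assumes a fixed-size header followed by fixed-size records, and `RecordCount`, `headerSize`, and `recordSize` are hypothetical names.

```java
// Hypothetical sketch; the header/record layout is an assumption,
// not the actual kelondroRecords file format.
class RecordCount {
    // Derives the number of records from the total file size.
    static long fromFileSize(long fileSize, long headerSize, long recordSize) {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("recordSize must be positive");
        }
        long payload = fileSize - headerSize;
        if (payload < 0 || payload % recordSize != 0) {
            throw new IllegalStateException("file size does not match record layout");
        }
        return payload / recordSize;
    }

    // Compares the counter stored at close time with the size-derived count.
    // A mismatch hints that the file was not closed properly.
    static boolean wasClosedCleanly(long storedCount, long fileSize,
                                    long headerSize, long recordSize) {
        return storedCount == fromFileSize(fileSize, headerSize, recordSize);
    }
}
```

With a 1024-byte header and 64-byte records, a file of 1664 bytes would hold 10 records; a stored counter of 9 would then indicate an unclean shutdown.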
hydrox
5af76fccd7 *) peer-search on Network.html now is case-insensitive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 13:00:41 +00:00
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bugfix for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
karlchenofhell
65af9d3215 - continue shifting even in the case the stacked URL could not be found
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3372 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-18 01:50:41 +00:00
karlchenofhell
d114a0136e - crawl profile: don't add null-values
- added some settings and statistics for url-fetcher 'server'-mode
- added own stack for fetchable URLs
- added possibility to fill stack via shift from peer's queues, via POST (addurls=$count and url$num=$url) or via file-upload
- added "htroot" to classpath of linux start-script

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-17 19:16:53 +00:00
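The POST parameters named in the commit above (addurls=$count and url$num=$url) suggest a form body along these lines. This is a hedged sketch only: the class `AddUrlsBody` is hypothetical, and both the zero-based numbering of `url$num` and the form-urlencoded encoding are assumptions that the commit message does not confirm.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

// Hypothetical sketch; only the parameter names addurls and url<num>
// come from the commit message.
class AddUrlsBody {
    // Builds "addurls=<count>&url0=<url>&url1=<url>..." (zero-based numbering is assumed).
    static String build(List<String> urls) {
        StringBuilder body = new StringBuilder("addurls=").append(urls.size());
        for (int i = 0; i < urls.size(); i++) {
            body.append("&url").append(i).append('=').append(encode(urls.get(i)));
        }
        return body.toString();
    }

    // Form-encodes a value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java runtime.
            throw new IllegalStateException(e);
        }
    }
}
```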
karlchenofhell
a46dc43f45 - added lock symbol for restart- and shutdown-buttons on Status-page (see http://www.yacy-forum.de/viewtopic.php?p=31444#31444)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-16 00:25:45 +00:00
karlchenofhell
b2a9d32f29 why do I always forget some lines? sorry...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-14 15:11:03 +00:00
karlchenofhell
e6ddf135bb - enabled fetching new crawls via /yacy/list.html?list=queueUrls for testing purposes
- sent URLs are taken off the limit-stack (of the global crawl trigger) (may be moved somewhere else in future versions)
- added option to set the requested chunk-size

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-14 14:50:55 +00:00
karlchenofhell
67d96249b4 - fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-13 21:17:43 +00:00
karlchenofhell
c5a2ba3a23 - prepared URL fetch from other peers
- more feedback for user

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-13 20:18:12 +00:00
auron_x
5ba531a722 *) higher precision for QPH also on status-page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3363 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-11 09:33:39 +00:00
karlchenofhell
4e5eda6ef9 oops...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-10 20:25:45 +00:00
karlchenofhell
50b59e312f - added experimental CrawlURLFetch_p-Servlet to fetch new URLs from a specified location (\n-separated list). Requested by Theli.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3361 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-10 20:20:00 +00:00
karlchenofhell
6c6375577e - fix for http://www.yacy-forum.de/viewtopic.php?t=3523
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3360 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-10 11:50:18 +00:00
karlchenofhell
ea20d8d7c5 - return to edited wiki-page after submit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-09 19:47:21 +00:00
orbiter
30d79d69a6 fix for wrong display of search statistics
see http://www.yacy-forum.de/viewtopic.php?p=31242#31242

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 10:42:35 +00:00
hydrox
faad869865 *) added peer-search to Network.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-07 11:20:31 +00:00
orbiter
c464157a6e replaced some toString()
see http://www.yacy-forum.de/viewtopic.php?p=31151#31151

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:26:56 +00:00
orbiter
d25caa07bf redesigned some parts of http authentication
added another access check for peer hops

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 19:46:50 +00:00
low012
588e48ce0b *) Part II of last commit. Note to myself: check svn commandline syntax :-(
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3339 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 18:21:11 +00:00
low012
0d2431d6f7 *) removed printed out '<br />' in row Hit-Size Miss-Size by moving <br /> from Java file to HTML file.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3338 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 18:16:01 +00:00
hydrox
ff829e97f8 *) fixed headlines in blog (see: http://www.yacy-forum.de/viewtopic.php?t=3442 )
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 14:40:02 +00:00
hydrox
9184113284 *) fixed News deletion. News are now removed if they are no longer in a news-stack. This does not affect News entries in the news-db that have no stack-entries.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3336 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 13:35:36 +00:00
karlchenofhell
a647a7ca8d - <tt>-tags look like <span class="tt">-tags now, fix for EDIT 3 of http://www.yacy-forum.de/viewtopic.php?t=3485
- Typo: crawl depth 0 indexes the given url, 1 indexes all links on it

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3335 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-04 18:22:53 +00:00
karlchenofhell
3bafd643c0 - fix for http://www.yacy-forum.de/viewtopic.php?t=3483
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3330 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 20:22:39 +00:00
karlchenofhell
6667930352 - old versions may be reviewed and restored
- removed explicit replacement of '<' and '>', fix for first bug in: http://www.yacy-forum.de/viewtopic.php?t=3485

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3329 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 19:48:18 +00:00
karlchenofhell
bf8f120340 - reduced margin of headlines in wiki (someone has to create a nice base.css urgently)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 16:04:43 +00:00
karlchenofhell
f2e6f19b90 - added versioning to Wiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3327 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 15:20:12 +00:00
karlchenofhell
2401e748a3 - fixed wrong replacement of POST-parameters in httpd ('<' and '>' are still replaced, don't know why): http://www.yacy-forum.de/viewtopic.php?t=3466
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3324 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 01:26:05 +00:00
orbiter
ceff987dd7 higher precision display for qmh
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 01:18:38 +00:00
orbiter
b2f4087400 redesign of last-seen field inside seed:
the field now contains a time in UTC+0 (instead of relative to the local UTC offset)
this fixes a bug in peer selection, where an iteration over all seeds
ordered by lastseen did not work correctly.
Problems may occur because the new meaning of this field may mix with
the different meaning of that field in older peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 23:54:27 +00:00
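Why ordering by last-seen fails when each value is relative to a peer's local offset can be shown with a short example. This is an illustrative sketch, not code from the commit: `LastSeen` and `toUtcMillis` are hypothetical names, and times are assumed to be plain millisecond values.

```java
// Hypothetical sketch; not taken from the YaCy seed implementation.
class LastSeen {
    static final long HOUR = 60L * 60L * 1000L;

    // Converts a local wall-clock time plus the peer's UTC offset into UTC.
    static long toUtcMillis(long localMillis, long utcOffsetMillis) {
        return localMillis - utcOffsetMillis;
    }

    public static void main(String[] args) {
        // Peer A: local 12:00 at UTC+2. Peer B: local 08:00 at UTC-5.
        long aLocal = 12 * HOUR, bLocal = 8 * HOUR;
        long aUtc = toUtcMillis(aLocal, 2 * HOUR);
        long bUtc = toUtcMillis(bLocal, -5 * HOUR);
        // Compared by local values, A looks more recent; in UTC, B is.
        System.out.println("A newer by local time: " + (aLocal > bLocal));
        System.out.println("A newer by UTC:        " + (aUtc > bUtc));
    }
}
```

Storing the value in UTC at the source makes timestamps from different peers directly comparable, which is what an iteration ordered by last-seen needs.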
karlchenofhell
e68cdeeeb3 - reverted parseArg(String) to use a byte-array to handle correct UTF-8 parsing
- arguments aren't passed html-escaped to the servlets anymore, bug-fix for http://www.yacy-forum.de/viewtopic.php?p=30573

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3321 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 21:20:53 +00:00
orbiter
e00e850a98 removed constants (no connection with yacySeed.dna identifier)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 14:52:54 +00:00
orbiter
fcc11391a8 some redesign attempts because sorting of lastseen does not work correctly
not finished yet
target: better selection of peer-ping targets, which should enhance stabilization of the net

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 13:12:31 +00:00
netbude
c80707ccd5 File is again valid XHTML
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 15:00:38 +00:00
orbiter
c2d6edf21d integrated number of remote targets as 'partitions' into remote search protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 13:27:23 +00:00
orbiter
f696d3c1eb added double computation to kelondroMapObjects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3316 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 09:48:31 +00:00
theli
8a0ed1ce50 *) bugfix template variable qph for local peer renamed to my-qph
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3315 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 08:52:20 +00:00
theli
8d4dc11d38 *) adding qph to network.xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 08:48:55 +00:00
orbiter
819ff21c92 fixed QPM output
QPM is temporarily called QPH (until more search requests are present?)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3313 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 00:17:35 +00:00
auron_x
a480bb7afa *) changed stats-display on status-page and added QPM
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3312 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 23:43:25 +00:00
theli
8c9ee3b442 *) missing table cell added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 17:26:35 +00:00
orbiter
4f6eed5623 QPM increment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 16:21:20 +00:00
orbiter
306c50ac40 QPM (queries per minute) statistic stub
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:39:11 +00:00
michitux
48da3184f0 Two fixes: escaped some &s in QuickCrawlLink_p.html and added correct id to the skype-field in ConfigProfile_p.html (note: ids must be unique in an (x)html-document - in most cases you can simply use the same value as for name)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3307 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:11:39 +00:00
orbiter
7598e1243e removed unused variables/imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 09:28:47 +00:00
orbiter
47ab83a7c0 added flag for YaCyHop - proxy access for all paths that start with /yacy/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3304 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 00:09:51 +00:00