Commit Graph

3233 Commits

Author SHA1 Message Date
theli
bd03c6b874 *) bugfix in bookmarksDB:
- NullPointerException when trying to get an unknown bookmark
- bookmarks can start with either http or https

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-03 11:56:46 +00:00
orbiter
b466baa574 added some memory protection
Collection arrays that are too large are now avoided. By default, the biggest
collection index is 7. Larger collections are dumped into a commons
directory, but cannot yet be used. Before doing a dump, the collection
is split into a part which has only root-references, which is stored back
to the collection; the remaining part goes to commons.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-03 00:55:51 +00:00
low012
ce360ef43e *) no more HTML in plasmaCrawlProfile.java
*) <br> will not be displayed in items in Auto Filter Content on WatchCrawler_p.html anymore
*) removed unnecessary replaceHTML()


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3425 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-02 21:09:28 +00:00
karlchenofhell
93e1ad2bca - fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-02 01:50:21 +00:00
karlchenofhell
88245e44d8 - improved version of robots.txt (delete your old htroot/robots.txt before updating):
- robots.txt is a servlet now
  - no need to rewrite the whole file each time a section is added or removed
  - user-defined disallows, added manually, won't be overwritten anymore
- new config-setting: httpd.robots.txt, holding names of the disallowed sections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-02 01:19:38 +00:00
karlchenofhell
9623bf7bbe - removed call to Java 1.5 method
- added config servlet for local robots.txt
- removed YPStats_p as it is of no use anymore
- supertemplates use XHTML now
- quick-fix for http://www.yacy-forum.de/viewtopic.php?p=32296#32296

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-01 13:54:14 +00:00
daburna
f4c13b422c *updated translation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-01 09:36:59 +00:00
theli
9b33562ed1 *) adding mimetype application/x-rar
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 13:11:59 +00:00
orbiter
51e12049fa third generation of R/W head path optimization
- data from collection arrays are read in order
- merged data is written in order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 11:13:23 +00:00
karlchenofhell
1fe505f0b0 - adapted User_p to general web-interface style (and removed status-only page on changes)
- beautified WikiHelp.html + typos
- IP hasn't been set correctly in Blog.xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 09:22:31 +00:00
karlchenofhell
92b6bc0ad2 - fixed wrongly applied replacement of "<" and ">" in Blog and simplified the code a bit
- added check whether the active blacklist engine is supported by the blacklist cleaner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 00:04:32 +00:00
karlchenofhell
a1d68fe092 - use .class rather than Class.forName for classes in class-path
- added Bost's patch for Diff.findDiagonale() from: http://www.yacy-forum.de//files/patch_685.txt
- fixed minor bugs in Blog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 22:52:22 +00:00
orbiter
10a3c20b8d some more enhancements to R/W Head path optimization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 15:54:02 +00:00
orbiter
f4cfd19835 second Generation of collection R/W head path optimization:
- permanent cache flush is switched off. The optimized cache flush
  works better if it is a large number of collections that is flushed
  together
- the flush size can be configured instead of the flush divisor. There is
  only one size for all flushes
- collection records that shall be removed during collection transition
  (jump from one collection file to another) are now not really removed
  but only marked in RAM. add-operations to the collection use these
  marked collection spaces
- index bulk write operations are now separated for each file of a kelondroFlex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 13:01:22 +00:00
hydrox
e92e8b2ae3 *) added RSS-Feed for blog
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3413 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 10:05:46 +00:00
hydrox
a107961099 *) fixed: blog-comment deletion without admin rights is no longer possible
*) fixed: empty blog comments are no longer possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 08:37:47 +00:00
daburna
ea2dbcb034 *updated translation for
-blacklistcleaner
-blogcomments
-header.template
*small changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 07:52:14 +00:00
(no author)
cf47075855 CSS corrections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 23:03:10 +00:00
orbiter
1fda50fd3c correct R/W head positioning in kelondroFlex
and some enhancements

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3409 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 22:25:39 +00:00
hydrox
116fc016d0 *) fix for Blogcomment-Preview
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3408 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 16:18:43 +00:00
orbiter
304412a049 first generation of collection index R/W head path optimization
- collections are now handed over as collection lists to the collection index for merge operations
- collection index lists are separated into 'new' and 'extend' lists
- lists are written separately
- write operations are done into array sets and array indexes. These are now serialized
- write operations into index files are sorted by index;
  that means that a R/W head does not need to go forward
  and backward, only forward
More enhancements are possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 15:49:23 +00:00
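The entry above (304412a049) says that writes into index files are sorted by index so that the R/W head only moves forward. A minimal sketch of that idea follows, in Python rather than YaCy's Java; `plan_sorted_writes`, `head_travel`, and the sample record indexes are hypothetical names and values chosen for illustration, not code from the commit.

```python
from typing import Dict, List, Tuple


def plan_sorted_writes(pending: Dict[int, bytes]) -> List[Tuple[int, bytes]]:
    """Order pending record writes by ascending record index (hypothetical helper)."""
    return sorted(pending.items(), key=lambda item: item[0])


def head_travel(order: List[int], start: int = 0) -> int:
    """Total distance a head starting at `start` travels when visiting `order` in sequence."""
    travel, position = 0, start
    for index in order:
        travel += abs(index - position)
        position = index
    return travel


# Writes as they happened to arrive: record index -> payload.
pending = {40: b"d", 3: b"a", 17: b"c", 9: b"b"}
plan = plan_sorted_writes(pending)
sorted_order = [index for index, _ in plan]
```

With these sample indexes, visiting the records in arrival order (40, 3, 17, 9) costs 99 units of head movement under this simple distance model, while the sorted order costs 40, because the head never moves backward.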
hydrox
54fef3574f *) missing files for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3406 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 14:38:34 +00:00
hydrox
cb89c74d52 *) added blog-comments
*) removed debug-output when deleting news

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 14:36:01 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
32867580ee update to kelondroRecords needed for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3403 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 11:55:36 +00:00
orbiter
e3480d4ad3 fix for warning in crawl balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3402 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 11:54:43 +00:00
daburna
ed021a3f70 bugfix, see http://www.yacy-forum.de/viewtopic.php?t=3573
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 10:05:28 +00:00
karlchenofhell
31ad42535a - added buttons to add complete domain or single URL to blacklist to IndexControl_p
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 23:14:45 +00:00
orbiter
8668ac5d91 preparations for collection index cache flush optimization
(hand-over commit, no functional change to current code)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3399 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 21:06:26 +00:00
allo
42e9747650 fixed /path/forwarding. uncomment, if you want to use it.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 20:07:35 +00:00
karlchenofhell
e0decf4653 - added support for changing invalid entries in blacklist cleaner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 19:36:05 +00:00
karlchenofhell
c58ef48e1c - increased size of subject text-field
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3396 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 18:48:25 +00:00
karlchenofhell
1d31ebbeec - added experimental PHP script which redirects from a vhost to a peer, using a public seed-file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3395 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 15:18:22 +00:00
auron_x
9cbf94222f *) added seedurl to network.xml as requested by lulabad
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3394 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 10:24:42 +00:00
karlchenofhell
39a2000d8b - added support for [[Bookmark:$bookmarkTag|description]]-link-listings (requested by theli) to wiki-parser
- added support for <pre>-tags to wiki-parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3393 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-24 21:26:48 +00:00
karlchenofhell
619653c054 - fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-24 15:40:56 +00:00
karlchenofhell
26f5757b40 - added support for multiple paths per domain to default-blacklist
warning: an interface change was necessary:
- remove(String, String) has been renamed to removeAll(String, String), because it removes all path-entries for the specified host
- remove(String, String, String) has been added to delete only a path-entry
- geBlacklistType(String) has been renamed to getBlacklistType(String)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-24 13:56:32 +00:00
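The entry above (26f5757b40) distinguishes removing every path entry for a host from removing a single path entry. The sketch below illustrates that distinction in Python; it is not the YaCy Java interface. The class `PathBlacklist`, its method names, and the reading of the first argument as a blacklist type are assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Set


class PathBlacklist:
    """Hypothetical blacklist holding several path patterns per host."""

    def __init__(self) -> None:
        # blacklist type -> host -> set of path patterns
        self._entries: DefaultDict[str, Dict[str, Set[str]]] = defaultdict(dict)

    def add(self, blacklist_type: str, host: str, path: str) -> None:
        self._entries[blacklist_type].setdefault(host, set()).add(path)

    def remove_all(self, blacklist_type: str, host: str) -> None:
        """Drop every path entry for the host (the renamed two-argument form)."""
        self._entries[blacklist_type].pop(host, None)

    def remove(self, blacklist_type: str, host: str, path: str) -> None:
        """Drop only one path entry; the host disappears once it has no paths left."""
        paths = self._entries[blacklist_type].get(host)
        if paths is None:
            return
        paths.discard(path)
        if not paths:
            del self._entries[blacklist_type][host]

    def paths(self, blacklist_type: str, host: str) -> Set[str]:
        return set(self._entries.get(blacklist_type, {}).get(host, ()))
```

Keeping the two operations under separate names makes the broader deletion explicit at the call site, which is the reason the commit message gives for the rename.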
karlchenofhell
3d6ab19f7e - remove double entries in blacklist as well
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3390 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-23 18:27:56 +00:00
karlchenofhell
bf7a69197d - fix for possible NPE in queues_p
- WatchCrawler_p:
  - display crawler traffic
  - pause/resume local- and global crawler


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-22 22:26:11 +00:00
allo
9702d3abba further supertemplate test
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3388 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-22 11:01:16 +00:00
karlchenofhell
a5a36d9252 - hopefully last fix for 1.5 methods (sorry for that, Eclipse isn't that helpful in identifying those methods)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-22 08:04:09 +00:00
karlchenofhell
e97b6f0458 - we still use Java 1.4 ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 22:43:31 +00:00
karlchenofhell
0c7b8cf632 - added first version of new wiki-parser
- added blacklist support to manual URLFetcher stack fill
- fix for NPE: http://www.yacy-forum.de/viewtopic.php?t=3559

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 22:31:36 +00:00
orbiter
f7803a6ce4 enhanced crawl balancer
- new domains now get a chance to get crawled early
- less IO operations
- new balancing method
- better dump order at shutdown time
- bugfixes regarding not found url hashes (no more superfluous cache kill)
- domain access time is now shared over all balancer stacks
- viewing the stack no longer disturbs the balancing algorithm as much
- intelligent selection of best next domain using domain access times
- extra double-check (to double-check the double-check)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 16:23:31 +00:00
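Two points in the entry above (f7803a6ce4) are that new domains get a chance to be crawled early and that the best next domain is selected using domain access times. One way such a selection could work is sketched below in Python; the function `pick_next_domain`, the shared `last_access` map, and the rule "least recently accessed wins, never-accessed domains first" are assumptions for illustration and are not taken from the YaCy balancer code.

```python
from typing import Dict, Iterable, Optional


def pick_next_domain(pending: Iterable[str], last_access: Dict[str, float]) -> Optional[str]:
    """Return the pending domain that was accessed longest ago (hypothetical rule).

    Domains missing from `last_access` are treated as never accessed, so they
    sort before every known domain and are crawled early.
    """
    return min(
        pending,
        key=lambda domain: last_access.get(domain, float("-inf")),
        default=None,
    )


# Access times shared across all balancer stacks (seconds, arbitrary origin).
last_access = {"a.example": 100.0, "b.example": 50.0}
```

Sharing a single `last_access` map between stacks mirrors the "domain access time is now shared over all balancer stacks" line, so a domain hit from one stack is not immediately hit again from another.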
low012
801eea8849 *) Fixed bug where pairReplace() got caught in infinite recursion. (http://www.yacy-forum.de/viewtopic.php?t=3466)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 22:07:59 +00:00
theli
c8862e47fb *) adding mimetype for svg
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 15:25:46 +00:00
orbiter
39b0658839 Redesign of Webinterface menu structure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 14:35:29 +00:00
orbiter
c3e8c23f5d fix for 'CANNOT FETCH ENTRY: hash is null' bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 13:53:21 +00:00
orbiter
badab8d924 fixed some more bugs in new db handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 12:29:12 +00:00
orbiter
e72d253577 fixed problem with initial cache load
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 11:20:48 +00:00