Commit Graph

383 Commits

Author SHA1 Message Date
orbiter
a9e73b6852 fixed great mess with localization paths. the problem was:
automatic re-translation after update did not work. hopefully it does now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-04 10:32:30 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
low012
2158f83d43 *) cosmetics, changed a character to get rid of "warning: unmappable character for encoding UTF8" during compilation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3946 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-01 17:17:29 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
allo
6074264267 dynamic rights.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3847 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-09 19:34:09 +00:00
allo
854eb1492f .yacy /.yacyh urls for the feedreader
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3844 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-09 12:56:08 +00:00
allo
7a5b22a0b8 Integration of FeedReader in Bookmarks.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3841 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 23:27:42 +00:00
allo
7921f07c9d userDB fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3837 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 16:11:10 +00:00
allo
7b2e1bb8f2 Feedparser with reflection.
TODO: This needs a special build.xml entry


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3832 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 14:31:09 +00:00
karlchenofhell
8bff810d19 - fixed logging output of serverMemory.request()
- don't start up if DATA/yacy.running exists as this is usually a sign of an already started yacy-instance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 12:45:03 +00:00
karlchenofhell
f05ca43780 - the wiki-parser works for remote wiki-code now, not displaying links anymore as if they were local (ViewProfile comment)
- fixed wrong link to CrawlStart on Status-page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3816 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 11:35:48 +00:00
karlchenofhell
30c3d909b1 - fixed charset problem in ConfigProfil_p.html (use accept-charset="UTF-8" in forms)
- fixed wrong XML output if no peers are known in Network.xml
- simplified parsing of table properties in wikiCode and ZTableToken
- reimplemented GC heuristics. They are needed to constantly ensure that an amount of free memory is available which is higher than Java's max. limit for performing a Full GC (please use serverMemory.request(long, boolean) rather than serverMemory.available(long, boolean) to provide data for averaging over the last GCs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3793 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-05 11:37:19 +00:00
allo
4392ee0c51 BugFix for typo and wrong include
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3789 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-04 16:06:58 +00:00
allo
d1e1580223 Surftips Blacklist
Blacklists List Hardcoded instead of only updated on firststart / migration.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3788 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-04 15:36:10 +00:00
hydrox
44bac7dea1 *) blog-comments can now be moderated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3778 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-01 06:02:55 +00:00
allo
957a25afff getRight(rightName) instead of get...Right()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-31 14:48:20 +00:00
low012
a0149317ac *) fixed bug where headlines were added to directory of a wiki page multiple times (http://www.yacy-forum.de/viewtopic.php?t=4034)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3762 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-25 16:36:09 +00:00
karlchenofhell
baa9402b97 - wiki-parser is now configurable via the config setting wikiParser.class which holds the class-name for the parser to use
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3742 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 16:19:25 +00:00
karlchenofhell
601fc7d1c5 - added source to J7Zip-modifed.jar and its license (changelog is still to come)
- moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools
- prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 13:29:12 +00:00
theli
b1680ab71f *) bugfix for ArrayIndexOutOfBoundsException in robots-parser (thanks to low012)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-18 13:39:08 +00:00
theli
9a4375b115 *) robots.txt: adding support for crawl-delay
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3737 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-18 13:00:42 +00:00
allo
65a8a9fc58 fix for nullpointer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-14 16:56:13 +00:00
orbiter
139c59ebbd - fixed dht selection problem: the seed tables used a wrong ordering
- cleaned some code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 17:59:36 +00:00
theli
cb43ae11ba *) Bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-06 12:57:22 +00:00
theli
0b5fc3c28c *) moving date functions to serverDate class
*) Sitemap-parser
   - logging added
   - parsing of modDate added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3667 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-06 12:36:49 +00:00
theli
6f46245a51 *) Bookmarks: Ajax icon is displayed while loading title
*) First version of a sitemap parser added
   - currently only autodetection of sitemap files is supported
*) DB-Import restructured
   - pause/resume should work again now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-06 09:52:04 +00:00
orbiter
e48189c710 enhanced cluster routing
- cluster definitions can now contain an addition for local ip addresses
- cluster-cluster communication uses the local ip address instead of the global address, if one is given

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3624 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-29 22:05:34 +00:00
theli
2399ed817c *) robots.txt parser now extracts the sitemap-URL (will be used later)
*) some javadoc added
*) junit testclass for robots.txt parser added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-26 15:42:38 +00:00
rramthun
e6fb6426a3 *) Some cosmetic changes and corrections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 16:16:54 +00:00
orbiter
40c14a4f0e - better implementation of search query properties
- basic protection against start-up problems when database files are corrupted
- auto-delete of not-critical databases during startup when load error occurs
- on-the-fly reset option for all database tables
- automatic on-the-fly reset for seed tables during enumeration exceptions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 10:14:48 +00:00
allo
f4af360f7c bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-20 15:37:19 +00:00
hydrox
9b5fb3908d *) a peer-message is now created when a blog-comment is written
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-15 12:58:17 +00:00
orbiter
6ad39bae1e fixed shutdown problem
this fixes the 'inconsistency' messages during start-up

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 08:48:47 +00:00
karlchenofhell
264a82eec8 - fix for http://www.yacy-forum.de/viewtopic.php?t=3657
- fix for http://www.yacy-forum.de/viewtopic.php?p=32758#32758
- Diff takes any objects now, not only strings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3455 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 22:04:15 +00:00
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the new split index, a huge number of classes needed to be adapted for new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
orbiter
1cba31de43 redesigned ram organization for database caches
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced old performance memory monitor by new one
- added supervision methods as static functions into the classes that provide cache functionality
- steering of ram allocation is done with two simple limits that are ram availability-relative


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 22:43:32 +00:00
theli
bd03c6b874 *) bugfix in bookmarksDB:
   - NullpointerException when trying to get an unknown bookmark
   - bookmarks can either start with http or https

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-03 11:56:46 +00:00
karlchenofhell
9623bf7bbe - removed call of java 1.5 method
- added config servlet for local robots.txt
- removed YPStats_p as it is of no use anymore
- supertemplates use XHTML now
- quick-fix for http://www.yacy-forum.de/viewtopic.php?p=32296#32296

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-01 13:54:14 +00:00
karlchenofhell
a1d68fe092 - use .class rather than Class.forName for classes in class-path
- added Bost's patch for Diff.findDiagonale() from: http://www.yacy-forum.de//files/patch_685.txt
- fixed minor bugs in Blog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 22:52:22 +00:00
hydrox
54fef3574f *) missing files for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3406 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 14:38:34 +00:00
hydrox
cb89c74d52 *) added blog-comments
*) removed debug-output when deleting news

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 14:36:01 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
e3480d4ad3 fix for warning in crawl balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3402 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 11:54:43 +00:00
karlchenofhell
39a2000d8b - added support for [[Bookmark:$bookmarkTag|description]]-link-listings (requested by theli) to wiki-parser
- added support for <pre>-tags to wiki-parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3393 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-24 21:26:48 +00:00
karlchenofhell
619653c054 - fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-24 15:40:56 +00:00
karlchenofhell
a5a36d9252 - hopefully last fix for 1.5 methods (sorry for that, eclipse isn't that helpful in identifying those methods)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-22 08:04:09 +00:00
karlchenofhell
e97b6f0458 - we still use Java 1.4 ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 22:43:31 +00:00
karlchenofhell
0c7b8cf632 - added first version of new wiki-parser
- added blacklist support to manual URLFetcher stack fill
- fix for NPE: http://www.yacy-forum.de/viewtopic.php?t=3559

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 22:31:36 +00:00
low012
801eea8849 *) Fixed bug where pairReplace() got caught in infinite recursion. (http://www.yacy-forum.de/viewtopic.php?t=3466)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 22:07:59 +00:00
karlchenofhell
d114a0136e - crawl profile: don't add null-values
- added some settings and statistics for url-fetcher 'server'-mode
- added own stack for fetchable URLs
- added possibility to fill stack via shift from peer's queues, via POST (addurls=$count and url$num=$url) or via file-upload
- added "htroot" to classpath of linux start-script

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-17 19:16:53 +00:00
orbiter
c464157a6e replaced some toString()
see http://www.yacy-forum.de/viewtopic.php?p=31151#31151

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:26:56 +00:00
(no author)
e218940293 The copyright sign "\u00A9" is already replaced by "&copy;". String "(C)" is not a unicode sequence!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3334 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-04 18:16:27 +00:00
low012
1bc4d8d470 *) If there is more than one pair of patterns in a line, all of them (and not only one pair) will be replaced.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3333 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-04 15:53:40 +00:00
low012
ea7a8cf7aa *) <hr> and <br> tags are XHTML compliant now.
*) Avoid superfluous trailing blank in non-proportional sections.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-04 15:03:13 +00:00
karlchenofhell
f2e6f19b90 - added versioning to Wiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3327 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 15:20:12 +00:00
karlchenofhell
02a73dce87 - added Diff-class for wiki-versioning (forthcoming, first need suitable serverObjects.put() for it)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3325 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 05:24:44 +00:00
orbiter
e4910f03d1 tag storage fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3302 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-30 11:52:15 +00:00
orbiter
991182b29b more space for bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3299 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-30 00:20:03 +00:00
orbiter
88fa764b64 implemented new kelondroObjects into bookmarkDB
- Bookmark-Objects are stored inside the kelondroObjects cache
- removed superfluous classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-30 00:17:55 +00:00
orbiter
9c05e2a820 re-design of kelondroMap
- this class is replaced by an object that can hold any type of object
- this object must be defined as a class that implements kelondroObjectsEntry
- the kelondroMap is now implemented as kelondroMapObjects

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 23:51:10 +00:00
allo
669c21db05 first version of abstracted kelondroMap Cache.
get returns a kelondroCachedObject(or in most cases a subclass of it),
or a map, which can be used to construct a kelondroCachedObject.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 19:10:55 +00:00
allo
14f2068daf some more bookmark changes towards multiuser bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3291 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-28 17:38:43 +00:00
allo
ff79c52fc0 bookmark users can now edit bookmarks.
TO COME: tag bookmarks with username, list bookmarks of a special user, filter private bookmarks for users.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3274 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-23 10:24:26 +00:00
allo
f40169fcd7 preparing multiuser bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 19:42:50 +00:00
orbiter
c0851ee943 refactoring: moved and renamed de.anomic.data.searchResults to plasma package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 00:38:03 +00:00
allo
c39dda2374 finished refactoring of searchtemplates.
now plasmaSwitchboard.searchFromLocal calculates a searchResults structure,
which is parsed in the yacysearch/detailedSearch Servlets.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-18 10:42:36 +00:00
allo
35039982da refactoring of search process: store results in a searchResults structure. At the moment, it's just stored in it, and read from it again.
Next step: return searchResults instead of serverObjects, and parse the results in the servlets.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-18 07:41:15 +00:00
karlchenofhell
3c43e605ba - don't accept malformed bookmarks, fix for: http://www.yacy-forum.de/viewtopic.php?t=3414 (first report)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 23:39:03 +00:00
orbiter
d07b132a0d - fixed colors of network graphic
- added option to activate write cache for seed-db
- did not activate write cache because it did not work

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 19:39:31 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
(no author)
37e53b4a6a replaced tree database structure for seed db by flex data structure
I don't know if this helps, we will find out...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3177 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-07 23:34:13 +00:00
allo
4cb688018d wikiAdmin right
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-24 11:32:00 +00:00
low012
29fa17bd40 *) simplified some code in wikiCode.java
*) deleted outdated text in Settings_p-html (see http://www.yacy-forum.de/viewtopic.php?p=28027)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3005 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-24 02:52:38 +00:00
orbiter
30888e7a2f implementation of search constraints
Such constraints may formulate specific restrictions to web searches
This is implemented by scraping information for constraints from a web
page during parsing, and storing flags to the pages within the web index.

In this first step, only information for index pages ("index of", directory listings)
is scraped and stored in flags
- added new flag class kelondroBitfield
- added scraper method in condenser
- added bitfield structure for all scrape types (see also condenser)
- added bitfield structure for appearance locations (see RWIEntry)
- added handover protocol for remote search and index distribution
- extended kelondroColumn class to hold bitfield types
- added another search attribute on search page (index.html)
- extended search-filter to enable filtering of non-matching constraints
- set all new database types to be default
- refactoring: moved word hash generation to condenser class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 02:16:30 +00:00
orbiter
497428c8ec refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-10 01:13:33 +00:00
theli
f37e2041e8 *) adding soap function to import yacy bookmarks from xml or html (transferred via soap attachments)
*) soapHandler: code cleanup for service deployment

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2915 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-05 09:56:39 +00:00
(no author)
0e79f2fd7e name of the file to translate appears ahead of its translation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2868 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-27 23:51:57 +00:00
allo
5a6488256d catch the "username too short" exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2844 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-22 21:40:16 +00:00
allo
1d0c0edda3 first version of posts/get from the del.icio.us api
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-07 22:16:09 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
borg-0300
f18304ddd3 unused/not needed imports removed;
properties added;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 22:21:18 +00:00
theli
97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
   - serverFileUtils.java:
      -- adding methods to copy from stream to writer and readers to writers
      -- moving httpc writeX methods into serverFileUtils class
   - serverCharBuffer.java: removing inheritance from Writer class
   - replacing htmlFilterOutputStream by htmlFilterWriter class which handles
     content as char stream
   - htmlFilterContentTransformer.java: deactivating getText mode
     (still needs to be migrated to use char streams instead of byte streams)
   - changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
   - changes in Scraper and Transformer classes to operate on chars instead of bytes
   - httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 10:12:11 +00:00
low012
cd636eb00e *) Fix for the fix...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 01:24:26 +00:00
low012
f9a5b55a9e *) Fixed bug described in http://www.yacy-forum.de/viewtopic.php?p=25448#25448
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 01:19:54 +00:00
low012
8a30c5343d *) Fixed bug where exclamation marks could get lost between [=...=] and <pre>...</pre>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-17 23:42:36 +00:00
low012
d8f4b17e31 *) Hopefully fixed bug described in http://www.yacy-forum.de/viewtopic.php?t=2825.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-17 22:57:10 +00:00
theli
f3ac4dbbb9 *) better handling of server shutdown
See: e.g. http://www.yacy-forum.de/viewtopic.php?t=2584

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2468 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-03 14:59:00 +00:00
orbiter
db1eae0227 * simplified initialization of database objects
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care, may be very buggy
please finish crawls before updating. crawls will be lost.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 02:19:25 +00:00
orbiter
23dd972608 fixed memory calculation in performanceMemory web page
fixed also maximum cache size computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-20 01:20:34 +00:00
orbiter
6ad471ef96 * applied many compiler warning recommendations
* cleaned up code
* added unit test code
* migrated ranking RCI computation to kelondroFlex and kelondroCollectionIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 19:49:31 +00:00
allo
cf1186597b utf fix from theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 15:26:04 +00:00
theli
5e0b6f8f83 *) sorting peer name list on Blacklist_p.html
*) restructuring of sharedBlacklist_p.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 13:29:50 +00:00
orbiter
6d2f15971a there is a very strange error that corrupts the kelondroRecords
structure. The cause is that the deleted-records-chain has wrong entries,
and one of the pointers in that chain points to a place behind the file end.
This causes an IndexOutOfBoundsException within an IO operation.
I currently don't know why the deleted-records-chain is
corrupted, but the error can be caught. If this now happens with the
assortment database, the database is deleted.
See also:
http://www.yacy-forum.de/viewtopic.php?p=24586#24586

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2396 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 13:45:23 +00:00
theli
d2e8e76218 *) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
See: http://www.yacy-forum.de/viewtopic.php?t=2541
        http://www.yacy-forum.de/viewtopic.php?p=24516

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 02:42:10 +00:00
orbiter
cfbacbbf08 reverted change in robotsParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:29:29 +00:00
orbiter
abf22f6e60 removed url normalform computation from htmlFilterContentScraper.
This method was implemented in de.anomic.net.URL


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:09:22 +00:00
allo
4e9f02c8ec integration of Michael's string-extraction.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 23:11:15 +00:00
allo
1b2ea58ee9 wrong substring invocation.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2313 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-20 13:49:38 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
orbiter
92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
the start-up delay is used to pre-load the database node cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 23:57:33 +00:00