Commit Graph

3000 Commits

Author SHA1 Message Date
orbiter
f7aaeb3fad created new main menu entry 'Customization and Integration'
- moved some already existing servlets to this menu
- renamed the skin servlet to appearance
- added a set-to-default-button to the search page appearance setting
- removed the peer profile servlet which is now replaced by a field in the new appearance servlet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4980 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-10 19:57:09 +00:00
lotus
5488543b8f disabled disk usage logpoints
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4979 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-10 07:30:50 +00:00
orbiter
1e6d12f146 Major update to BLOB data structures:
- introduced a new BLOB file format: kelondroBLOBHeap. This is a flat file with an index in RAM.
  very similar to the eco-tables, but with flexible value sizes. It will replace the kelondroBLOBTree,
  which is based on a kelondroTree, a file-AVL-based index data structure.
- the HTCACHE header file was replaced by the new blob heap file structure
- the robots.txt file was replaced by the new blob heap file structure
- the robots parser was enhanced (bug fix for double-loading of the same robots.txt)
- other BLOB-dependent data structures were prepared to also use the new BLOB heap
- fixed a bug in the snippet fetch process: the file header was not written to the header index
There should now be less IO during snippet fetch and during crawling


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4978 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-10 00:47:37 +00:00
orbiter
81f75f5056 - removed unnecessary classes (these objects are much easier to handle using generics)
- generalized BLOB referencing. This is the preparation to use another BLOB class, the kelondroHeap


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 23:52:53 +00:00
orbiter
b38f467e3c better SRU compliance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4976 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 21:50:24 +00:00
orbiter
7052f2f61f - added copyright header of ResourceObserver
- commented/removed some code to eliminate code warnings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4974 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 00:40:45 +00:00
orbiter
1400cdc91e - refactoring of resourceObserver (moved it to crawler)
- partial redesign of diskUsage: slightly more functional behavior, fewer side effects, better error-case handling
- the resourceObserver can now show an error message if the diskUsage is 'out of order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 00:03:37 +00:00
f1ori
b6301a54fa * added class ListDirs to provide generic listing of directories in system directories and jar-files
* yacy now runs when classes are in a jar-file (-> build-jar ant-target)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4971 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-06 14:11:40 +00:00
lotus
f2e2d09916 - fix for index transfer
- imported a random startpoint function from plasmaDHTChunk
if there was already a gap at the beginning of the index, the transfer process kept selecting endlessly from the first startpoint
tested & working on my index

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4970 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-06 13:16:17 +00:00
orbiter
a6719dfd2b - refactoring of robots parser
- no more keep-order parameter in remove (it was not possible to make this strict, and not useful)
- some small enhancements in balancer
- robots parser without references in switchboard
- changes synchronization in robots

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-05 00:35:20 +00:00
orbiter
e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
the most important fix was the addition of the yacybot user-agent for robots.txt loading,
because web masters look for that access to see if the crawler behaves correctly.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-04 11:03:03 +00:00
orbiter
474e29ce4a added options to configure the 'corporate identity'-icons, the home page link and the greeting line from
the skin menu. Additionally, an example there shows how to integrate a search page with an iframe.
Please see the skin menu.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4967 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-03 23:37:04 +00:00
orbiter
474659a71f - modified and enhanced the crawl balancer: better list export, fixing of damaged crawl queue at start-up, re-sorting at start-up to enhance domain order
- added option to set minimum crawl delta for domains in balancer
- added default values to crawl deltas in yacy.init
- added configuration for these deltas in performance queues
- enhanced performance setting computation (more time for indexing queue for a faster flush)
- remote crawling is now enabled during local crawling if indexer has space and time for more links
- added database stub for new distributed file system
- refactoring of time computation to get an abstraction level that will be used by a TTL rule in new distributed file system

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4966 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-03 13:08:37 +00:00
orbiter
080cda97ef added another peer selection rule:
- also select non-robinson (dht-) peers if their peer tags match the search words
- the peer tag '*' can now act as a catch-all rule: such peers are always selected

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 23:04:32 +00:00
orbiter
d37fd064f9 changed peer selection for search targets:
- fewer dht targets are selected
- more other peers are selected: all robinson peers with more than one million urls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 22:42:52 +00:00
orbiter
69aac0d74c modified the diskUsage class regarding the following two aspects:
1. The class used and depended on the plasmaSwitchboard in many places in the past, but this was a bad mistake. The classes should be independent from the switchboard to support a better abstraction. Therefore the object was removed. The parameters from the switchboard are computed outside and then handed over.
2. The class is considered tightly connected to hardware resources. Classes which handle data that cannot be replicated, because that would require replicating hardware, should not support dynamic object allocation, but should be coded as a collection of private static methods. Therefore all class objects were transformed into static private objects.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 21:47:53 +00:00
danielr
da917cf4b1 undo reduced menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 07:11:13 +00:00
danielr
0c1dc703e4 - set staticIP at startUp
- added setting for reduced menu (simpleMenu)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-29 18:35:15 +00:00
danielr
f7f9ceb967 diskUsage: replaced blocking sleep with semaphore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4957 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-26 12:05:12 +00:00
lotus
4a53649ee7 fixed dht-urls and ranking distribution log statistics
* NOTE: please keep in mind that pathnames can contain whitespace

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4956 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-26 07:12:03 +00:00
lotus
8d83185cb4 fixed dht-chunks/protocol log statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4955 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-25 08:15:07 +00:00
danielr
63eadfdf84 fixed unlimited FileSizeLimit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-24 19:11:27 +00:00
lotus
2dc7c00c1c fixed indexing log statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4953 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-24 07:01:04 +00:00
danielr
dba7ba079e fixed NPE seen with queues_p.xml (serverClassLoader finds already loaded classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-23 16:55:46 +00:00
det
273fb01142 revert last fix; was wrong
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-21 21:07:28 +00:00
det
b6f50851fa fix memory requirement calculation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-21 20:58:57 +00:00
lotus
ac85c52bae better readability for MIN_FREE_DISK_SPACE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-21 10:20:36 +00:00
lotus
54a73b58cf fixed restart on Windows when directory had spaces in its name
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-21 09:19:26 +00:00
det
609aaf0df3 rework of the windows part
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4943 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-20 12:13:06 +00:00
det
1a4f26ba30 exclude HTDOCS from recursive scan
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4942 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-20 10:03:49 +00:00
det
6c07e894d9 add needed sleep
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4941 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-20 09:53:23 +00:00
hermens
d742cc080c Fix for RAMCache not flushing
see: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1255



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4940 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-19 18:27:48 +00:00
danielr
6b7e873962 resourceObserver refactoring and some synchronisation for console output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4939 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-19 12:40:44 +00:00
orbiter
6bdd99e065 - more asserts to solve the ooB-problem
- better caching (?), let's see how it behaves

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-18 21:08:56 +00:00
orbiter
b928ae492a some code-cleanup and possible speed enhancements in different core methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-17 23:56:39 +00:00
danielr
6a9cc29cdd workaround for IndexOutOfBoundsException in ResultURLs.getExecutorHash() seen @ CrawlResults.html?process=4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4934 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-17 18:56:04 +00:00
orbiter
c998dc6556 - added security functions to flush url and search caches in case that memory is full
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4933 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-16 21:39:58 +00:00
orbiter
f4ae8082c3 - better error analysis for ooRange Exception in kelondroBase64Ordering
- quadcore support for kelondroRowSet array ordering

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4932 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-15 23:25:57 +00:00
orbiter
84cbe75005 more asserts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-15 00:04:59 +00:00
orbiter
e269c12710 small changes in partition routine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4929 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 23:17:56 +00:00
orbiter
31efb8fbee - fix for LOG path generation when the DATA/LOG does not exist (fix for bug introduced in SVN 4923)
- some more/better asserts
- slight performance enhancements in remove method in index management. Works for all who do not run using asserts (the majority)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4928 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 22:51:47 +00:00
lotus
877299cc74 better installer on Windows Vista
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4927 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 18:34:12 +00:00
orbiter
21c87c36e3 added a log line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4925 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 11:57:46 +00:00
danielr
68c38c2d34 - WatchCrawler shows status without JavaScript
- Performance can be scaled + DHT-profile
- names for pool-threads
- some small refactorings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 10:24:58 +00:00
lotus
fc79f013c4 better solution to update shortcut
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4920 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-12 20:04:32 +00:00
det
c0dfe49743 also exclude collection.0028.commons and RANKING at startup check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-12 15:17:01 +00:00
det
11656741f1 exclude LOCALE and RELEASE at startup check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4917 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-12 11:25:25 +00:00
lotus
48edbef5c7 * fix: display proper port on 1st startup
* new message on portchange
* first implementation of external link-update for search page (still inactive)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4915 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 19:04:39 +00:00
det
0727bb1e63 rework of console message handling; added debugging output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4914 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 18:43:12 +00:00
lotus
43c47218ef fix for opening the browser on Windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4912 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 10:18:01 +00:00