Commit Graph

3023 Commits

Author SHA1 Message Date
orbiter
6e4d2f0800 fix for the network image sync bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-21 10:59:21 +00:00
orbiter
e10cd115a9 - added a new RSS reader interface. This is not finished but you can now load and look at RSS feeds. It will be used to index RSS feeds in a way that is appropriate for this kind of data.
- refactoring of Mediawiki and PHPBB3 loader interface names (just renamed)
- removed two old unused RSS loader interfaces
- fixed a bug in RSS parser library of cora
- added a new RSS parser component to the set of yacy document parsers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-20 11:30:02 +00:00
orbiter
933dc1a600 removed old rss parser (will be replaced with parser from cora package)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7052 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-20 07:42:38 +00:00
orbiter
70dd26ec95 added the new crawl scheduling function to the crawl start menu:
- the scheduler extends the option for re-crawl timing. Many people misunderstood the re-crawl timing feature because that was just a criterion for the url double-check and not a scheduler. Now the scheduler setting is combined with the re-crawl setting and people will have the choice between no re-crawl, re-crawl as was possible so far and a scheduled re-crawl. The 'classic' re-crawl time is set automatically when the scheduling function is selected
- removed the bookmark-based scheduler. This scheduler was not able to transport all attributes of a crawl start and therefore did not support special crawl starts, e.g. for forums and wikis
- since the old scheduler was not able to crawl special forums and wikis, the must-not-match filter was statically fixed to all bad pages for these special use cases. Since the new scheduler can handle these filters, it is possible to remove the default settings for the filters
- removed the busy thread that was used to trigger the bookmark-based scheduler
- removed the crontab for the bookmark-based scheduler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-19 23:52:38 +00:00
orbiter
5a994c9796 added a scheduler based on API actions
- every process that is monitored with the API Steering interface can now be scheduled!
- added input methods in Steering interface to set a scheduling time
- added a view on the steering api that shows only crawl jobs inside the Crawl Profile servlet
- added a scheduling call process in the cleanup process handler that triggers the scheduled processes
As a result, the cleanup now also looks for scheduled processes. Such processes are therefore not executed exactly at
the target execution time, but within the cleanup process time window.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-19 12:13:54 +00:00
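
The timing behaviour described in commit 5a994c9796 above (scheduled processes are triggered from the cleanup handler, so they run in the first cleanup pass after their target time rather than at the target time itself) could look roughly like the following. This is a minimal sketch under assumptions: `CleanupSchedulerSketch`, `schedule` and `runDueCalls` are hypothetical names, not taken from the YaCy source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and method names are not taken from the YaCy source.
final class CleanupSchedulerSketch {

    // One scheduled call: a target execution time and the action to run.
    private static final class ScheduledCall {
        final long targetTimeMillis;
        final Runnable action;
        boolean executed = false;

        ScheduledCall(long targetTimeMillis, Runnable action) {
            this.targetTimeMillis = targetTimeMillis;
            this.action = action;
        }
    }

    private final List<ScheduledCall> calls = new ArrayList<>();

    void schedule(long targetTimeMillis, Runnable action) {
        calls.add(new ScheduledCall(targetTimeMillis, action));
    }

    // Meant to be called from the periodic cleanup job. Runs every call whose
    // target time has passed and returns how many calls were executed.
    int runDueCalls(long nowMillis) {
        int count = 0;
        for (ScheduledCall call : calls) {
            if (!call.executed && call.targetTimeMillis <= nowMillis) {
                call.action.run();
                call.executed = true;
                count++;
            }
        }
        return count;
    }
}
```

A call scheduled for time 1000 is skipped by a cleanup pass at 900 and executed by the next pass, for example at 1500, which is the delay the commit message points out.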
orbiter
189a986ebd - modified api-call interface to record api calls with references to api-call database (carries pk)
- added recording date, last execution date and next execution date for a scheduler (scheduler to be implemented next)
- extended database access methods for more data formats, especially for date insert/retrieval
- extended 'Steering' interface to show new database fields
- migrated Steering to new http client
- extended cora http client to transmit authentication and also added some convenience methods (http response code)
- simplified database back-end (fewer specialized methods for multiple properties)
- extended date formatter to produce a special format to show dates in html (&nbsp; in place of spaces in the date format)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7049 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-18 15:56:38 +00:00
orbiter
f616cdfce4 better resistance of NetworkImage generation against heavy load
this is needed for the network image on the yacy.net home page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-18 09:51:00 +00:00
mikeworks
2f8ff8ec02 de.lng: Added some German translation for Config* pages that I have found untranslated
ConfigNetwork_p.html: Updated Javascript for P2P <-> Robinson selection to use the new ID values - sorry for breaking this in 6996 (undoing id -> name changes again in 7041 and 7042 because the name attribute is not allowed in XHTML 1.0 Strict)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-18 05:55:11 +00:00
orbiter
86d7f8a989 - the web visualization can now be generated in custom color
- added input fields in WatchWebStructure_p.html
- introduced enum classes for Draw Mode and Filter Mode

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-17 10:44:00 +00:00
orbiter
7fdb17bb96 redirect uncaught exceptions to logging + small other changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-16 12:33:06 +00:00
orbiter
237cfc44b0 fixed auto-set values for robinson selection; this reverts a single line from SVN 6996
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-15 18:23:47 +00:00
orbiter
171f2bd84e - removed unused network oanet
- added new network definition 'allip' which can be used in networks where intranet and internet-addresses shall be indexed
- added an auto-switch-off for global search if there are no global peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-09 23:41:17 +00:00
orbiter
388aa021c2 - concurrent loading of OSM tiles
- added a 4-time re-try in case the tile server does not respond

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7025 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-08 23:14:08 +00:00
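
Commit 388aa021c2 above combines concurrent tile loading with a bounded re-try. A minimal sketch of that pattern, assuming "4-time re-try" means one initial attempt plus up to four further attempts; `TileRetrySketch`, `loadWithRetry` and `loadAll` are hypothetical names and the pool size is an arbitrary choice, none of it taken from the YaCy source:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the YaCy implementation.
final class TileRetrySketch {

    // Calls the loader once and, on failure, up to maxRetries more times.
    // Any exception or null result is treated as "server did not respond".
    static <T> Optional<T> loadWithRetry(Callable<T> loader, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                T result = loader.call();
                if (result != null) {
                    return Optional.of(result);
                }
            } catch (Exception e) {
                // fall through and try again
            }
        }
        return Optional.empty();
    }

    // Loads all tiles concurrently; results keep the order of the loaders.
    static <T> List<Optional<T>> loadAll(List<Callable<T>> loaders, int maxRetries)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<Callable<Optional<T>>> tasks = new ArrayList<>();
            for (Callable<T> loader : loaders) {
                tasks.add(() -> loadWithRetry(loader, maxRetries));
            }
            List<Optional<T>> results = new ArrayList<>();
            for (Future<Optional<T>> future : pool.invokeAll(tasks)) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add(Optional.empty());
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

A tile whose loader never succeeds ends up as an empty `Optional`, so one unresponsive server does not block the remaining tiles.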
orbiter
6388a58fc7 better memory management and slightly less (in total and temporary) RAM allocation:
- confirm that database objects that are not supposed to grow do not have an index memory management that is designed for growth
- changed index sorting method in such a way that it allocates fewer objects during quicksort
- database classes renaming (shorter, naming addresses that objects hold in RAM)
- added a large number of asserts to check if objects actually take the RAM that they should have


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-04 13:33:12 +00:00
orbiter
5924a0d851 - enhanced concurrency in database index access for multicore
- added statistics about database index caches in PerformanceMemory_p.html
- adapted many classes to use the new statistics
- added missing close statements

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7018 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-03 04:58:48 +00:00
orbiter
55a2536bcf enhancement in drawing speed and reduction of object allocation during drawing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-03 02:44:08 +00:00
low012
1bfa21f973 *) HTML fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7013 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-02 17:31:05 +00:00
low012
ced07970c1 *) fix for last commit
*) HTML fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7010 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-01 19:04:57 +00:00
low012
4e60c69f84 *) HTML fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-28 15:54:44 +00:00
orbiter
66266a288e better network image cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7008 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-28 14:54:56 +00:00
orbiter
e7ea3b3cc5 added a buffer for network images to reduce load on yacy.net network image server
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7007 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-28 12:45:53 +00:00
orbiter
d5c65b17a6 added another network activity visualization: show strong query activity as radiation around peer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-28 11:40:58 +00:00
sixcooler
15e8c13526 ... migrating to HttpComponents-Client-4.x ...
(gzip decompression, httploader, robots, ...)

+ enable proxy-crawling while log is fine

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-27 01:16:26 +00:00
orbiter
a55af783bf healing for color blindness
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7000 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-26 22:54:06 +00:00
mikeworks
6b13101d18 Collage.html: Fixed problem where the German translation broke the action that contained Collage in a form
build.xml: Fixed check for existing private.key, added check for non-existing release in target sign and changed the include filenames for changed libs
Added log4j.properties file, with parameters for one console appender, to eliminate the warning about an uninitialized log4j subsystem

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6998 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-26 20:18:19 +00:00
orbiter
63c5634b0f added online documentation for ranking configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6997 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-26 10:08:21 +00:00
mikeworks
aa663cda4d ConfigUpdate_p.html and ConfigUpdate_p.java: Added check for downloaded releases and disabled buttons in case no new releases are available
de.lng: Updated German translation for additional String in ConfigUpdate_p.html
XHTML 1.0 Strict fixes for all the other .html files
yacy/ui/css/yacyui-portalsearch.css: added .hidden class that was removed from ConfigProperties_p.html
Switchboard.java: Added URL for thread Remote Crawl Job and set URL for Remote Crawl URL Loader to null to fix empty href=""

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6996 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-25 13:19:16 +00:00
low012
afd1cd7979 *) HTML fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6995 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-24 15:42:09 +00:00
mikeworks
0f248e7433 ConfigBasic.html: XHTML 1.0 Strict fixes
DictionaryLoader_p.html: Filled <dt> elements to eliminate warnings
Moved CSS for portalsearch field from header to metas template because it belongs in the <head>er
yacyui-portalsearch.css: Added #yacylivesearch form { display: inline; } because XHTML 1.0 Strict does not allow <form><input> and the added <p> would otherwise provoke a line break
de.lng: Updated translations for added <dt> elements and deactivated statement in DictionaryLoader_p.html


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-24 15:26:53 +00:00
low012
0b89fa2c8d *) HTML fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-24 12:28:48 +00:00
low012
ad96a14d0a *) jump to Crawl Profile editor if a profile is selected to be edited
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6991 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-23 17:35:09 +00:00
mikeworks
e4ced6484b yacyinteractive.html:
- added type="text/javascript" to script resource
- removed unintentional "\" from <a> link
- changed "name" attribute in <form> element to "id" for XHTML 1.0 Strictness
(remaining warnings come from script elements writing end tags like </tr> that might confuse some validators)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-23 06:36:06 +00:00
sixcooler
b7102eff92 ... migrating to HttpComponents-Client-4.x ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6989 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-22 23:08:37 +00:00
sixcooler
52718e6dcb ... migrating to HttpComponents-Client-4.x ...
monitoring: replaced unused 'idletime' by uploading bytes
added some kind of 'upload-throttling' at dht-out :-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6983 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-22 00:51:41 +00:00
mikeworks
b143f6b169 ConfigHeuristics_p.html: XHTML 1.0 Strict Changes
- added empty action attribute to form
- replaced name attributes with id (name is not a valid attribute there in XHTML 1.0 Strict)
- changed label for target (so now clicking on the labels also activates the checkboxes)
de.lng: Test with Subversion properties #2

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6982 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-21 22:40:34 +00:00
sixcooler
5fa8038f10 ... migrating to HttpComponents-Client-4.x ...
monitoring and first try to use remoteProxy

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6979 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-20 01:14:28 +00:00
low012
2d2771a12e *) more HTML fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6976 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-18 19:21:59 +00:00
low012
eb8550526d *) fixed small HTML bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6975 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-18 18:40:41 +00:00
mikeworks
b4d5bb6a3e Steering.html: Changed link from Settings_p.html to ConfigAccounts_p.html for setting an Administrator password that does not exist yet
de.lng: Added missing translations for Steering.html during restart/update

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-18 13:31:44 +00:00
low012
0e6fed1fb6 *) fewer HTML errors (according to https://addons.mozilla.org/de/firefox/addon/249/)
*) followed some suggestions by PMD

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6970 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-18 09:10:46 +00:00
low012
2d263a7157 *) fewer HTML errors (according to https://addons.mozilla.org/de/firefox/addon/249/)
*) Is line 112 there on purpose or can it be deleted?

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-17 19:00:54 +00:00
low012
2de0ded377 *) trying to fix bug described in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2900
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6964 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-10 18:03:07 +00:00
mikeworks
d851758dc6 Added German translation for ConfigHeuristics_p.html to de.lng
Fixed Network -> Heuristics title tag of the page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-05 22:58:51 +00:00
orbiter
43e6ce62af use heuristics only if user is authenticated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-05 21:52:02 +00:00
suessthomas
7feb549ce6 Small HTML-Fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-04 22:16:58 +00:00
orbiter
aa66da5135 corrected hint for debian installation update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-30 14:31:16 +00:00
orbiter
b6fb239e74 redesign of parser interface:
some file types are containers for several files. These containers had been parsed in such a way that the set of resulting parsed content was merged into one single document before parsing. With that parser infrastructure it was not possible to parse document containers that contain individual files. An example is an RSS file where the RSS messages can be treated as individual documents with their own url reference. Another example is a surrogate file, which was treated with a special operation outside of the parser infrastructure.
This commit introduces a redesigned parser interface and a new abstract parser implementation. The new parser interface now has only one entry point and always returns a set of parsed documents. In the case of a single document the parser method returns a set of one document.
To be compliant with the new interface, the zip and tar parsers have also been completely redesigned. All parsers are now much simpler and cleaner in their structure. The switchboard operations have been extended to operate with sets of parsed files, not single parsed files.
Additionally, parsing of jar manifest files has been added.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6955 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-29 19:20:45 +00:00
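
The interface shape described in commit b6fb239e74 above (one entry point that always returns a collection of parsed documents, with a one-element result for plain files and one element per member for containers) can be illustrated as follows. This is only a sketch under assumptions: `ParsedDocument`, `DocumentParser`, `TextParser` and `LineContainerParser` are hypothetical names, and the line-based "container" is a toy stand-in for formats such as RSS, zip or tar; none of it is taken from the YaCy source.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result type: parsed text plus the URL it is attributed to.
final class ParsedDocument {
    final String url;
    final String text;

    ParsedDocument(String url, String text) {
        this.url = url;
        this.text = text;
    }
}

// Single entry point: every parser returns a list, never a bare document.
interface DocumentParser {
    List<ParsedDocument> parse(String url, byte[] content);
}

// A plain file yields a list with exactly one document.
final class TextParser implements DocumentParser {
    @Override
    public List<ParsedDocument> parse(String url, byte[] content) {
        String text = new String(content, StandardCharsets.UTF_8);
        return Collections.singletonList(new ParsedDocument(url, text));
    }
}

// Toy container: each non-empty line becomes its own document
// with its own URL reference (container URL plus line index).
final class LineContainerParser implements DocumentParser {
    @Override
    public List<ParsedDocument> parse(String url, byte[] content) {
        String[] lines = new String(content, StandardCharsets.UTF_8).split("\n");
        List<ParsedDocument> docs = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (!line.isEmpty()) {
                docs.add(new ParsedDocument(url + "#" + i, line));
            }
        }
        return docs;
    }
}
```

Because both parsers share the same return type, calling code can loop over the result without distinguishing single files from containers, which matches the commit's note that the switchboard now operates on sets of parsed files.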
orbiter
59c894029b removed confusing double set button in ConfigHeuristics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-28 22:27:20 +00:00
orbiter
11b7853940 added a configuration page for search heuristics. currently you can switch on there:
- a site-operation heuristic that loads all direct links from a portal page if the site-operator is used
- a direct crawl for search results from scroogle for the given search terms
The configuration page can be found directly beside the network configuration page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6951 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-27 21:38:16 +00:00
orbiter
5d00888c95 - added animated visualization for DHT-in and DHT-out in network graphic
- found and fixed a possible memory leak in YaCy internal RSS feed system
- some refactoring in RSS feed mechanisms to make this possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-27 10:45:20 +00:00