Commit Graph

4487 Commits

Author SHA1 Message Date
orbiter
cb6f709a16 - enhancements in surrogate reading
- better display of map in location search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-02 00:11:37 +00:00
low012
1ff9947f91 *) added new user right: extended search right (allows to define users who can query more results than anonymous users)
*) cleaned up code a little bit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-01 23:32:40 +00:00
orbiter
156cf02703 - added an index constraint 'has location' to the condenser
- added evaluation of the 'has location' constraint to search using the /location operator


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-31 09:41:30 +00:00
orbiter
0430a94eaa the location search no longer shows re-evaluated locations, only locations that are attached as metadata to web pages
- added parser for in-text appearing geo-locations
- added geo-locations to rss search result
- added evaluation of metadata-attached geo-locations in yacysearch_location to show search results within a map


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 23:26:36 +00:00
orbiter
9b25d07295 - added geo information parsing to html parser
- extended metadata information in index with geolocalisation
- added display of location in yacydoc and ViewFile

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 00:49:47 +00:00
f1ori
efcf37a953 * show info in log, if robots.txt is rejected due to wrong mime-type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-28 19:55:15 +00:00
low012
16cd919795 *) fixed Exceptions which caused 500 error when entering invalid URL mask or invalid prefer mask, invalid masks are ignored, error message is displayed on yacysearch.html (what about yacysearch.rss and yacysearch.json?)
*) fixed "more options" link on yacysearch.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7623 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-23 00:48:19 +00:00
low012
1a24917cea *) fixed NPE which occurred when an empty String was entered as search word
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7622 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-23 00:44:38 +00:00
orbiter
b1a8d0c020 enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
orbiter
f3baaca920 - enhancements to DNS IP caching and crawler speed
- bugfixes (NPEs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 09:34:10 +00:00
low012
e7860b1239 *) <mode="Homer">D'oh!</Homer>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 22:23:20 +00:00
low012
82f1580a60 *) trying to fix ConcurrentModificationException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 22:20:19 +00:00
low012
9f0286b380 *) fixed potential "java.lang.IllegalArgumentException: Illegal group reference" which occurred if special characters that are also used as metacharacters in regular expressions were used inside of <pre>...</pre> (see: http://veerasundar.com/blog/2010/01/java-lang-illegalargumentexception-illegal-group-reference-in-string-replaceall/)
The class still contains a potential ConcurrentModificationException which occurs when the List which contains the elements of the table of contents is modified during a recursion of tagReplace(). Will try to fix this later today.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 18:02:09 +00:00
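The "Illegal group reference" error described in commit 9f0286b380 can be reproduced and avoided as in the following minimal sketch. It does not show the actual YaCy change; the class and method names are hypothetical. It relies only on the standard `Matcher.quoteReplacement`, which escapes `$` and `\` so that `String.replaceAll` treats the replacement text literally.

```java
import java.util.regex.Matcher;

// Hypothetical demo class, not part of the YaCy code base.
final class QuoteReplacementDemo {

    // Replaces every match of regex with the given text taken literally,
    // even if that text contains '$' or '\' characters.
    static String replaceLiteral(String input, String regex, String literal) {
        return input.replaceAll(regex, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        String preContent = "a $ b"; // '$' followed by a non-digit
        try {
            // Unquoted: '$' is parsed as the start of a group reference.
            "<pre>X</pre>".replaceAll("X", preContent);
        } catch (IllegalArgumentException e) {
            System.out.println("unquoted replacement failed: " + e.getMessage());
        }
        System.out.println(replaceLiteral("<pre>X</pre>", "X", preContent));
    }
}
```

Quoting the replacement is one common remedy; whether the commit used this call or a different approach is not stated in the message.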
orbiter
78d4c45d09 enhancement during search process: fast fail of search in case that all index feeders have terminated.
This change should affect filtering and navigators and should cause that search navigation gets faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 13:05:51 +00:00
orbiter
ba03ca8620 added more configuration options for search:
- removed configuration button for 'search only for admin' from index.html and added this to ConfigPortal
- added configuration of link verification options (iffresh, cacheonly, nocache, ifexist) to ConfigPortal
- added configuration of navigation options to ConfigPortal
- added an option to switch off automatic index cleaning in case that a link verification method fails


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 07:50:34 +00:00
f1ori
e0c7d490f9 * fix bug #6
* exclude signature files from auto-deletion of unknown files in DATA/RELEASE


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-20 17:59:58 +00:00
orbiter
a50f28e6e7 - fixed missing save operation for peer name change
- fixed import of mediawiki dump files
- added script to add mediawiki dump files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-19 23:52:09 +00:00
orbiter
2b5f8585bf performance hack for Balancer and ip address parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 21:09:18 +00:00
low012
2861d0888a *) simplified code
*) fixed potential NumberFormatExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 01:03:35 +00:00
orbiter
1989ebc24b removed more warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 22:52:30 +00:00
orbiter
b62b79675b removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:08:18 +00:00
orbiter
8f11d3a5bb redesigned the ScoreMap classes:
- new concurrent score map using atomic operations from the Java concurrency classes
- redesigned the difference between StaticScore and DynamicScore into ScoreMap and ReversibleScoreMap; this allows many classes to use simple ScoreMap objects, which work better in concurrent environments using the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
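As an illustration of the idea in commit 8f11d3a5bb, a score map built on atomic operations might look like the sketch below. This is not YaCy's ConcurrentScoreMap; the class name `SimpleConcurrentScoreMap` and its methods are assumptions made for this example. It uses only `ConcurrentHashMap` and `AtomicLong` from `java.util.concurrent`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a lock-free score map; not the YaCy implementation.
final class SimpleConcurrentScoreMap<E> {

    private final ConcurrentHashMap<E, AtomicLong> scores =
            new ConcurrentHashMap<E, AtomicLong>();

    // Adds delta to the score of key without any synchronized block.
    void inc(E key, long delta) {
        AtomicLong counter = scores.get(key);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong(0);
            // putIfAbsent returns the existing counter if another thread won the race.
            counter = scores.putIfAbsent(key, fresh);
            if (counter == null) counter = fresh;
        }
        counter.addAndGet(delta);
    }

    // Returns the current score, or 0 for unknown keys.
    long get(E key) {
        AtomicLong counter = scores.get(key);
        return counter == null ? 0L : counter.get();
    }
}
```

Because each counter is updated with `addAndGet`, concurrent callers of `inc` do not lose updates and do not block each other on a shared lock.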
orbiter
a564230c48 more enhancements against blocked threads that occurred in seed age evaluation (blocks httpd in some cases)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-12 22:54:41 +00:00
orbiter
dc0db3550e avoid string conversion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-11 00:59:27 +00:00
orbiter
694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
- changed menu structure slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 23:25:07 +00:00
orbiter
30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
orbiter
1214615185 fix for 'invisible entry', see http://forum.yacy-websuche.de/viewtopic.php?p=22133#p22133
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 17:04:34 +00:00
orbiter
3820525464 more memory protection: auto-flush of caches in case of memory shortage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 16:32:34 +00:00
orbiter
7962d35425 - removed file upload function in crawl start and replaced it with an input field for a file path where the crawl start file is loaded. This was necessary to support the API steering for file crawl starts, for two reasons:
1) if the file is changed for a re-crawl this is not reflected in the steering because it would take the previously uploaded crawl start file
2) browsers do not submit the full path of the selected file even if this path is shown in the input field because of security reasons. There is no work-around or hack to make the submission of the full path possible

- fixed deletion of crawl start point urls in crawl stack and balancer double-check
- fixed a problem with steering self-call (no resolving of localhost)
- added more logging for the crawler to supervise why crawl urls are not taken by the loader
- added a javascript onload-function to select domain restriction in all cases where a crawl is started from a file or from a url
- fixed the restrict-to-domain pattern computation, added a 'www.'-prefix and added this functionality also to a crawl start from file 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7574 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 12:50:39 +00:00
orbiter
e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:29:05 +00:00
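The advice in commit e1b6916423 can be sketched as follows: estimate the final length first and pass it to the `StringBuilder` constructor, so the internal buffer does not have to grow repeatedly. The class and method names here are hypothetical and chosen only for illustration.

```java
import java.util.List;

// Hypothetical helper showing a pre-sized StringBuilder.
final class PresizedBuilderDemo {

    // Joins the parts with the separator, sizing the builder up front.
    static String join(List<String> parts, String separator) {
        int estimated = 0;
        for (String part : parts) {
            // Slight over-estimate: one separator per part.
            estimated += part.length() + separator.length();
        }
        StringBuilder sb = new StringBuilder(estimated);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) sb.append(separator);
            sb.append(parts.get(i));
        }
        return sb.toString();
    }
}
```

A rough over-estimate is usually fine; the goal is only to avoid several grow-and-copy steps of the internal array.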
low012
3b40b98256 *) set SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 01:51:51 +00:00
orbiter
2af8e33773 better performance computing search targets with index abstracts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 23:32:01 +00:00
orbiter
619b561a4a enhanced secondary search: index abstracts decompression is now much faster and does not cause strong CPU load after several searches with more than one word
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 23:12:39 +00:00
orbiter
27ecdb5444 use less peers for remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 21:24:46 +00:00
orbiter
cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:36:40 +00:00
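The technique described in commit cb1f49d0f2 can be sketched like this: resolve the UTF-8 `Charset` once, keep it in a constant, and pass that constant to the `String` constructor and to `getBytes` instead of the name "UTF-8". The class `Utf8Demo` below is a hypothetical stand-in and is not the YaCy UTF8 class.

```java
import java.nio.charset.Charset;

// Hypothetical helper; illustrates a pre-defined Charset constant.
final class Utf8Demo {

    // Looked up once at class initialization instead of on every conversion.
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Encodes a String to UTF-8 bytes using the constant.
    static byte[] getBytes(String s) {
        return s.getBytes(UTF8);
    }

    // Decodes UTF-8 bytes to a String using the constant.
    static String string(byte[] bytes) {
        return new String(bytes, UTF8);
    }
}
```

A side benefit of the `Charset` overloads is that they do not declare `UnsupportedEncodingException`, so callers need no try/catch around each conversion.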
orbiter
7138f4036b less synchronization, better thread dump tool
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 15:29:45 +00:00
orbiter
8d14916c74 more patches for a better out-of-memory management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 01:45:11 +00:00
orbiter
c2c5b12882 - even less memory for circle tool
- background thread for bookmark initialization: this uses a DNS lookup which may cause long waiting times during startup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7554 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-06 12:30:22 +00:00
orbiter
799c534935 one more patch against OOM during secondary remote search
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=3202

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-05 19:52:34 +00:00
orbiter
77b1e921a9 this assert prevents a network operation in case of sabotage and must therefore be removed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7550 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-04 14:34:37 +00:00
orbiter
bed79402be introduction of a new remote search load control: the remote search has taken 10 results per peer with a time-out of 3 seconds so far. The attributes of number of results per peer and time-out time can now be configured.
This has two aspects: the user who searches may want to increase these values to get more results, at the cost of more load on the remote side, and the operator of the server which is accessed for this search may want to restrict the load. Both sides can now be configured. The server-side maximum load parameters are defined by a network definition, and the client-side search request load can be defined by each user individually, but when the remote search is done the requested service is limited to the network definition.

You can now find in the network definition file:
network.unit.remotesearch.maxcount and network.unit.remotesearch.maxtime
and in the yacy.conf file:
remotesearch.maxcount and remotesearch.maxtime

There is currently no web interface to define the client-side remote search attributes, please set them manually
    

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7548 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-04 13:44:00 +00:00
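The limiting rule from commit bed79402be (the client request is capped by the network definition) can be sketched as below. Only the four property names come from the commit message. The class, the method, the fallback values and the use of milliseconds for the time-out are assumptions for this example, not YaCy code.

```java
import java.util.Properties;

// Hypothetical sketch of the client/network limit rule.
final class RemoteSearchLimits {

    // Returns the value actually used: the client request, capped by the network maximum.
    // name is "maxcount" or "maxtime"; fallback is used when a key is not set.
    static int effective(Properties network, Properties client, String name, int fallback) {
        int networkMax = Integer.parseInt(
                network.getProperty("network.unit.remotesearch." + name, String.valueOf(fallback)));
        int requested = Integer.parseInt(
                client.getProperty("remotesearch." + name, String.valueOf(fallback)));
        return Math.min(requested, networkMax);
    }

    public static void main(String[] args) {
        Properties network = new Properties();
        network.setProperty("network.unit.remotesearch.maxcount", "20");
        Properties client = new Properties();
        client.setProperty("remotesearch.maxcount", "50");
        // The client asks for 50 results per peer, the network allows 20.
        System.out.println(effective(network, client, "maxcount", 10));
    }
}
```

Taking the minimum keeps a single peer from requesting more than the network operator permits, while still letting users lower their own limits.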
orbiter
6dfaf6fef7 fix for bug in deletion of old seeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7547 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-04 10:00:37 +00:00
orbiter
993b9bc1a8 memory/performance hacks, less synchronization, better concurrency
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-03 11:30:04 +00:00
orbiter
42d90664f3 - fixed a memory leak in the httpc.post method (no finish)
- patched some more memory-saving relevant code
- some more minor bug fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-01 09:03:33 +00:00
orbiter
38dce547c0 better concurrency (less locking on date formatting), more logging and minor bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-28 06:28:29 +00:00
f1ori
59dea3a284 * implement url proxy, a proxy via the url http://peer:port/proxy.html?url=http://domain.tld/path
* enable with proxyURL = true
* could be useful to browse specific pages with proxy or use own improvements in proxy

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7538 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-27 21:39:38 +00:00
mikeworks
8b7b783c49 Tray.java: Broke the build with a wrong, non-UTF-8 encoded file containing French umlauts (unmappable character for encoding UTF8)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-27 15:01:46 +00:00
mikeworks
db65ada467 Tray.java: Added localization for French tray icon command, although this can probably also be done better than with if statements (preferably also from the locales file).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-27 11:42:33 +00:00
orbiter
89d337841c more logging for OOMs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7534 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-26 09:27:15 +00:00
orbiter
b1781d7aae some more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7533 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-26 01:24:49 +00:00