Commit Graph

7285 Commits

Author SHA1 Message Date
orbiter
f3baaca920 - enhancements to DNS IP caching and crawler speed
- bugfixes (NPEs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 09:34:10 +00:00
low012
e7860b1239 *) <mode="Homer">D'oh!</Homer>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 22:23:20 +00:00
low012
82f1580a60 *) trying to fix ConcurrentModificationException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 22:20:19 +00:00
f1ori
df71776929 * fix bug #7
* the log needs a poison element to finish, so the Base64Order main function did not terminate when called from the Debian configure script


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 19:42:22 +00:00
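The "poison" mentioned in the commit above reads like the usual poison-pill pattern for a queue-backed log writer: the worker thread only stops when it takes a special marker from the queue, and until then a non-daemon worker keeps the JVM alive. A minimal sketch of that pattern follows; the class `PoisonPillLog` and all of its members are hypothetical and not taken from the YaCy sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of a poison-pill shutdown; not YaCy's logger.
final class PoisonPillLog {
    // A distinct String instance, recognised by identity rather than by content.
    private static final String POISON = new String("POISON");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private final List<String> written = new ArrayList<String>();

    // Non-daemon worker: without the poison element it blocks in take() forever.
    private final Thread worker = new Thread(new Runnable() {
        public void run() {
            try {
                while (true) {
                    String line = queue.take();
                    if (line == POISON) return; // the only regular way out of the loop
                    written.add(line);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    });

    PoisonPillLog() {
        worker.start();
    }

    void log(String line) {
        queue.add(line);
    }

    // Sends the poison element, waits for the worker and returns what it consumed.
    List<String> shutdown() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written;
    }
}
```

A main function that logs but never calls something like `shutdown()` would hang at exit, which matches the symptom the commit describes.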
low012
9f0286b380 *) fixed potential "java.lang.IllegalArgumentException: Illegal group reference" which occurred if special characters that are also regular-expression metacharacters were used inside <pre>...</pre> (see: http://veerasundar.com/blog/2010/01/java-lang-illegalargumentexception-illegal-group-reference-in-string-replaceall/)
The class still contains a potential ConcurrentModificationException, which occurs when the List holding the table-of-contents elements is modified during a recursion of tagReplace(). Will try to fix this later today.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 18:02:09 +00:00
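The "Illegal group reference" problem described in the commit above comes from `String.replaceAll` interpreting `$` and `\` in the replacement text. One common way to avoid it is `Matcher.quoteReplacement`; whether the commit used exactly this call is an assumption, and the helper `ReplacementDemo.replaceLiteral` below is hypothetical.

```java
import java.util.regex.Matcher;

// Hypothetical helper, not taken from the YaCy wiki code.
final class ReplacementDemo {
    // Replaces every match of the regex with the replacement taken literally.
    static String replaceLiteral(String input, String regex, String replacement) {
        // quoteReplacement escapes '$' and '\' so they are not parsed as group references.
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }
}
```

With the quoting in place, replacement text such as `price $x` is inserted as-is instead of being parsed as a reference to a capture group.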
orbiter
78d4c45d09 enhancement during search process: fast fail of search in case that all index feeders have terminated.
This change should affect filtering and navigators and should make search navigation faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 13:05:51 +00:00
orbiter
ba03ca8620 added more configuration options for search:
- removed configuration button for 'search only for admin' from index.html and added this to ConfigPortal
- added configuration of link verification options (iffresh, cacheonly, nocache, ifexist) to ConfigPortal
- added configuration of navigation options to ConfigPortal
- added an option to switch off automatic index cleaning in case that a link verification method fails


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 07:50:34 +00:00
f1ori
e0c7d490f9 * fix bug #6
* exclude signature files from auto-deletion of unknown files in DATA/RELEASE


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-20 17:59:58 +00:00
orbiter
18ec7fe53c added a clearall.sh script that deletes the complete index and everything else that belongs to crawling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-20 08:36:29 +00:00
orbiter
d98884f1d5 added script for importmediawiki.sh in build.xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7610 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-19 23:58:11 +00:00
orbiter
a50f28e6e7 - fixed missing save operation for peer name change
- fixed import of mediawiki dump files
- added script to add mediawiki dump files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-19 23:52:09 +00:00
orbiter
2b5f8585bf performance hack for Balancer and ip address parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 21:09:18 +00:00
orbiter
43e1660512 fix/enhancement in Crawler: do not generate domain match pattern if crawl depth is 0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 21:07:44 +00:00
orbiter
b1d133b69f another enhancement to the ThreadDump function: better multiple dumps and filtering out of uninteresting dump parts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 20:48:39 +00:00
orbiter
a35d513bd8 fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
orbiter
a6935e7dc8 fix for active dns resolving: do not resolve in case that the dns server is not available (offline mode)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-16 07:05:10 +00:00
orbiter
859c99886c fix for multiple thread dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 23:05:51 +00:00
orbiter
61acf55da4 avoided using synchronized(this) for the hash computation so that the lock on the object cannot be (accidentally) taken by another thread; the synchronization now uses the protocol object instead. Also made the protocol object final.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 09:52:39 +00:00
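The idea in the commit above, locking on something other than `this` so that outside code cannot contend for the same monitor, can be sketched as follows. Note that this sketch swaps in a dedicated private lock object rather than the protocol field the commit mentions; the class `HashedUrl` and the placeholder hash are assumptions for illustration only.

```java
// Hypothetical class; not YaCy's DigestURI.
final class HashedUrl {
    // Private and final: no other code can synchronize on it, and the reference never changes.
    private final Object lock = new Object();
    private final String url;
    private String hash; // computed lazily, guarded by 'lock'

    HashedUrl(String url) {
        this.url = url;
    }

    String hash() {
        synchronized (lock) {
            if (hash == null) {
                // Placeholder computation standing in for the real digest.
                hash = Integer.toHexString(url.hashCode());
            }
            return hash;
        }
    }
}
```

The field used as a lock has to be final; otherwise two threads could end up synchronizing on different objects.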
orbiter
c2a968c23f fix for bug in formatting in ThreadDump
and added hint for linux/Mac users that they may use the LOCKED feature using the start option -l

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7601 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 08:39:05 +00:00
low012
2861d0888a *) simplified code
*) fixed potential NumberFormatExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 01:03:35 +00:00
orbiter
078ecacf61 avoid synchronization in DigestURI hash requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 00:47:30 +00:00
orbiter
1989ebc24b removed more warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 22:52:30 +00:00
orbiter
0324de1467 removed debug line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7597 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:34:42 +00:00
orbiter
1aba7869bf patch for Windows: do not use the thread lock feature from previous commit if used on Windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:33:36 +00:00
orbiter
0a11727374 added new feature for Thread dump:
"THREADS WITH STATES: LOCK FOR OTHERS"
will show only such threads that lock other threads. This is the 'opposite part' of the blocked threads.
Because this uses a thread dump that is produced with a kill -3 on the PID of the process, and such thread dumps are written by the Java core outside of System.out and System.err, it is necessary to read the dump from a log in the file system. Such a log is only written if YaCy is started with startYACY.sh on a Linux system. That means:
this feature is only available on Linux and Mac OS X if YaCy is started with ./startYACY.sh -l


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:32:20 +00:00
orbiter
b62b79675b removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:08:18 +00:00
orbiter
a07a1a8b1e removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:07:15 +00:00
orbiter
8edaccfedf removed unused variables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:03:37 +00:00
orbiter
e6c3507b17 disabled some of the previous changes (did not work in openjdk)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:48:36 +00:00
orbiter
f9e5c21083 update to thread dump logs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7590 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:46:04 +00:00
pca
4a237bfa5d Windows Installer:
- add support for Windows Firewall on Win XP (SP2/SP3), Vista and Win 7 (open port 8090) - this should cover almost every Windows installation at home

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 19:32:08 +00:00
sixcooler
9199b9e3c6 also putting jcifs-1.3.15 into classpath
(let me build YaCy again :-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 22:44:50 +00:00
suessthomas
9956dc9dce Update jcifs-library to Version 1.3.15. Small Changes, read: http://jcifs.samba.org/ - "Minor adjustments have been applied to DcerpcHandle locking routines in the SID class to fix sporadic occurances of "All pipe instances are busy" errors under high load."
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 20:09:52 +00:00
orbiter
8f11d3a5bb redesigned the ScoreMap classes:
- new concurrent score map using atomic operations from the Java concurrency classes
- redesigned the difference between StaticScore and DynamicScore into ScoreMap and ReversibleScoreMap; many classes can now use simple ScoreMap objects, which work better in concurrent environments via the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
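A score map built on atomic operations, as the commit above describes, could look roughly like the sketch below. It is not the actual ConcurrentScoreMap from YaCy; the class name `SimpleConcurrentScoreMap` and its two methods are assumptions chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a lock-free score counter.
final class SimpleConcurrentScoreMap<K> {
    private final ConcurrentMap<K, AtomicLong> scores = new ConcurrentHashMap<K, AtomicLong>();

    // Increments the score of a key without any explicit locking.
    void inc(K key) {
        AtomicLong counter = scores.get(key);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            // putIfAbsent returns the existing counter if another thread was faster.
            counter = scores.putIfAbsent(key, fresh);
            if (counter == null) counter = fresh;
        }
        counter.incrementAndGet();
    }

    // Returns the current score, or 0 for an unknown key.
    long get(K key) {
        AtomicLong counter = scores.get(key);
        return counter == null ? 0L : counter.get();
    }
}
```

Each increment is an atomic operation on a per-key counter, so concurrent callers never block each other on a shared monitor.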
orbiter
a564230c48 more enhancements against blocked threads that occurred in seed age evaluation (blocks httpd in some cases)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-12 22:54:41 +00:00
orbiter
dc0db3550e avoid string conversion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-11 00:59:27 +00:00
orbiter
694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
- changed menu structure slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 23:25:07 +00:00
lotus
bbb7aea8f3 fix basic config change in portal mode
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 20:04:15 +00:00
pca
df68bf6001 Windows Installer:
- check Windows-Version on startup, support only Windows 2000 and newer (necessary for Sun-JRE and as preparation for firewall section)
- little changes in JRE section handling

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7581 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 19:30:05 +00:00
orbiter
30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
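The point of the commit above is that a bare `getBytes()` depends on the platform default charset, whereas a central helper always encodes the same way. A minimal sketch of such a helper follows; the class `Utf8Bytes` is hypothetical and only mirrors the idea of the `UTF8.getBytes()` named in the commit, and `StandardCharsets` is a later JDK addition than the code of that time could have used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a central UTF-8 conversion helper.
final class Utf8Bytes {
    // Always encodes as UTF-8, independent of the platform default charset.
    static byte[] getBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Passing a `Charset` object also avoids the checked `UnsupportedEncodingException` that the string-based `getBytes("UTF-8")` overload declares.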
lotus
cb6d307bba adding extension for parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 20:36:01 +00:00
pca
61a64bdbef Windows Installer:
- detect JRE at startup, showing install-option depends on result
- hide window for external call "attrib"
- some cleanup and restructure for readability

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7578 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 20:19:34 +00:00
orbiter
4d733608fb fix for broken JSON, see: http://forum.yacy-websuche.de/viewtopic.php?p=22162#p22162
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7577 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 20:08:20 +00:00
orbiter
1214615185 fix for 'invisible entry', see http://forum.yacy-websuche.de/viewtopic.php?p=22133#p22133
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 17:04:34 +00:00
orbiter
3820525464 more memory protection: auto-flush of caches in case of memory shortage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 16:32:34 +00:00
orbiter
7962d35425 - removed file upload function in crawl start and replaced it with an input field for a file path where the crawl start file is loaded. This was necessary to support the API steering for file crawl starts, for two reasons:
1) if the file is changed for a re-crawl this is not reflected in the steering because it would take the previously uploaded crawl start file
2) browsers do not submit the full path of the selected file even if this path is shown in the input field because of security reasons. There is no work-around or hack to make the submission of the full path possible

- fixed deletion of crawl start point urls in crawl stack and balancer double-check
- fixed a problem with steering self-call (no resolving of localhost)
- added more logging for the crawler to supervise why crawl urls are not taken by the loader
- added a javascript onload-function to select domain restriction in all cases where a crawl is started from a file or from a url
- fixed the restrict-to-domain pattern computation, added a 'www.'-prefix and added this functionality also to a crawl start from file 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7574 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 12:50:39 +00:00
orbiter
96bb33ed9b added default size to StringBuffer in logger (and it is not possible to replace the StringBuffer with a StringBuilder...)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:53:57 +00:00
orbiter
e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:29:05 +00:00
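Guessing the size of a StringBuilder up front, as the commit above recommends, means passing an estimated capacity to the constructor so the internal buffer does not have to grow repeatedly. A small sketch, with the hypothetical helper `SizedJoin.join`:

```java
import java.util.List;

// Hypothetical example of pre-sizing a StringBuilder; not taken from YaCy.
final class SizedJoin {
    static String join(List<String> parts, String separator) {
        // Upper bound for the final length: every part plus one separator each.
        int capacity = 0;
        for (String p : parts) {
            capacity += p.length() + separator.length();
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) sb.append(separator);
            sb.append(parts.get(i));
        }
        return sb.toString();
    }
}
```

The estimate only has to be close; a slightly too large capacity costs a few bytes, while a too small one merely falls back to the normal growth behaviour.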
low012
bea8137997 *) minor changes
*) fixed potential NPE in suggest.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 23:27:41 +00:00
low012
3e03963b1c *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 22:37:17 +00:00