Commit Graph

345 Commits

Author SHA1 Message Date
orbiter
deda54d684 - relaxed matching of string-search (this is now case-insensitive)
- added transport of string-search pattern to remote search protocol
- fixed a problem parsing snippets with a '-' inside

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7700 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-05 22:37:06 +00:00
orbiter
b77b8cac0c - enhanced html parser: recognized much more details in the content
- added more properties to solr index
- refactoring
- more constants in switchboard
- fix for some NPEs
- recognition of more images
- removed synchronization in HandleMap (obviously not necessary?)
- added a nolocal configuration to remove excessive dns lookup (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as beeing local.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-21 13:58:49 +00:00
orbiter
17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 12:20:20 +00:00
orbiter
0430a94eaa the location search shows now not re-evaluated locations but only such locations that are attached as metadata to web pages
- added parser for in-text appearing geo-locations
- added geo-locations to rss search result
- added evaluation of metadata-attached geo-locations in yacysearch_location to show search results within a map


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 23:26:36 +00:00
orbiter
8412f8787d fix for http://bugs.yacy.net/view.php?id=8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7630 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 08:17:25 +00:00
orbiter
9b25d07295 - added geo information parsing to html parser
- extended metadata information in index with geolocalisation
- added display of location in yacydoc and ViewFile

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 00:49:47 +00:00
orbiter
b1a8d0c020 enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
f1ori
df71776929 * fix bug #7
* log requires poison to finish, so Base64Order main-function doesn't finish, when called from debian configure script


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 19:42:22 +00:00
orbiter
78d4c45d09 enhancement during search process: fast fail of search in case that all index feeder have terminated.
This change should affect filtering and navigators and should cause that search navigation gets faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 13:05:51 +00:00
orbiter
2b5f8585bf performance hack for Balancer and ip address parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 21:09:18 +00:00
orbiter
b1d133b69f another anhancement to the ThreadDump function: better multiple dumps and filtering out of not interesting dump parts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 20:48:39 +00:00
orbiter
a35d513bd8 fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
orbiter
859c99886c fix for multiple thread dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 23:05:51 +00:00
orbiter
61acf55da4 avoided using a synchronized(this) for the hash computation to prevent that the lock on the object is (accidently) stolen by another thread and replaced this synchronization using the protocol object. Made also the protocol object final.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 09:52:39 +00:00
orbiter
c2a968c23f fix for bug in formatting in ThreadDump
and added hint for linux/Mac users that they may use the LOCKED feature using the start option -l

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7601 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 08:39:05 +00:00
orbiter
078ecacf61 avoid synchronization in DigestURI hash requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 00:47:30 +00:00
orbiter
1989ebc24b removed more warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 22:52:30 +00:00
orbiter
0324de1467 removed debug line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7597 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:34:42 +00:00
orbiter
1aba7869bf patch for Windows: do not use the thread lock feature from previous commit if used on Windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:33:36 +00:00
orbiter
0a11727374 added new feature for Thread dump:
"THREADS WITH STATES: LOCK FOR OTHERS"
will show only such threads that lock other threads. This is the 'opposite part' of the blocked threads.
Because that this uses a thread dump that is produced with a kill -3 on the PID of the process and such thread dumps are written by the Java core outside of System.out and Sytem.err it is necessary to read the dump from a log in the file system. Such a log is only written if YaCy is started with startYACY.sh on a linux system. That means:
this feature is only available on linux and Mac OS X if YaCy is started with ./startYACY.sh -l


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:32:20 +00:00
orbiter
a07a1a8b1e removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:07:15 +00:00
orbiter
e6c3507b17 disabled some of the previous changes (did not work in openjdk)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:48:36 +00:00
orbiter
f9e5c21083 update to thread dump logs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7590 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:46:04 +00:00
orbiter
8f11d3a5bb redesigned the ScoreMap classes:
- new concurrent score map using atom operation from java concurrency classes
- redesigned difference beween StaticScore and Dynamic Score into ScoreMap and ReversibleScoreMap allowed that many classes can now use simple ScoreMap Objects which can be used better in concurrent environments using the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
orbiter
a564230c48 more enhancements against blocked threads occurred in seed age evaluation (blocks httpd in some cases)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-12 22:54:41 +00:00
orbiter
dc0db3550e avoid string conversion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-11 00:59:27 +00:00
orbiter
694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
- changed menu structure slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 23:25:07 +00:00
orbiter
30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
orbiter
3820525464 more memory protection: auto-flush of caches in case of memory shortage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 16:32:34 +00:00
orbiter
96bb33ed9b added default size to StringBuffer in logger (and it is not possible to replace the StringBuffer with a StringBuilder...)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:53:57 +00:00
orbiter
e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:29:05 +00:00
low012
3b40b98256 *) set SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 01:51:51 +00:00
orbiter
619b561a4a enhanced secondary search: index abstracts decompression is now much faster and does not cause strong CPU load after several searches with more than one word
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 23:12:39 +00:00
low012
bf27a72d53 *) set SVN properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7564 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 23:05:23 +00:00
low012
b649ce2dd7 *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7563 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 22:59:19 +00:00
orbiter
70a996a06c reverted SVN 7557 because these classes are called using reflection. The class declaration is in the log configuration. Without these classes you get errors during runtime and a non-formatted log output, i.e.:
STARTUP: Trying to load logging configuration from file /Data/workspace1/yacy/DATA/LOG/yacy.logging
Can't load log handler "net.yacy.kelondro.logging.ConsoleOutErrHandler"
java.lang.ClassNotFoundException: net.yacy.kelondro.logging.ConsoleOutErrHandler
java.lang.ClassNotFoundException: net.yacy.kelondro.logging.ConsoleOutErrHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	at java.util.logging.LogManager$3.run(LogManager.java:359)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:346)
	at java.util.logging.LogManager.initializeGlobalHandlers(LogManager.java:898)
	at java.util.logging.LogManager.access$900(LogManager.java:130)
	at java.util.logging.LogManager$RootLogger.getHandlers(LogManager.java:979)
	at java.util.logging.Logger.log(Logger.java:454)
	at java.util.logging.Logger.doLog(Logger.java:480)
	at java.util.logging.Logger.log(Logger.java:503)
	at net.yacy.kelondro.logging.Log$logRunner.run(Log.java:332)
Can't load log handler "net.yacy.kelondro.logging.LogalizerHandler"
java.lang.ClassNotFoundException: net.yacy.kelondro.logging.LogalizerHandler
java.lang.ClassNotFoundException: net.yacy.kelondro.logging.LogalizerHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	at java.util.logging.LogManager$3.run(LogManager.java:359)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:346)
	at java.util.logging.LogManager.initializeGlobalHandlers(LogManager.java:898)
	at java.util.logging.LogManager.access$900(LogManager.java:130)
	at java.util.logging.LogManager$RootLogger.getHandlers(LogManager.java:979)
	at java.util.logging.Logger.log(Logger.java:454)
	at java.util.logging.Logger.doLog(Logger.java:480)
	at java.util.logging.Logger.log(Logger.java:503)
	at net.yacy.kelondro.logging.Log$logRunner.run(Log.java:332)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:42:19 +00:00
orbiter
cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:36:40 +00:00
low012
9d366ee9d7 *) removed unused code (I assume that most of the code was really dead, but if you need any of the classes, tell me and I will put it back in.)
*) minor code cleanup in ViewLog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7557 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 18:55:11 +00:00
orbiter
7138f4036b less synchronization, better thread dump tool
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 15:29:45 +00:00
orbiter
8d14916c74 more patches for a better out-of-memory management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 01:45:11 +00:00
orbiter
ce0c8247fc removed (most probably!?!) superfluos System.err output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-06 01:01:29 +00:00
orbiter
799c534935 one more patch again OOM during secondary remote search
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=3202

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-05 19:52:34 +00:00
orbiter
f8d0454c53 small bug fixes and experiments with search speed enhancement
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-04 14:29:22 +00:00
orbiter
993b9bc1a8 memory/performance hacks, less synchronization, better concurrency
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-03 11:30:04 +00:00
orbiter
42d90664f3 - fixed a memory leak in the httpc.post method (no finish)
- patched some more memory-saving relevant code
- some more minor bug fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-01 09:03:33 +00:00
orbiter
38dce547c0 better concurrency (less locking on date formatting) more logging and minor bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-28 06:28:29 +00:00
orbiter
b1781d7aae some more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7533 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-26 01:24:49 +00:00
orbiter
b2f147d28e performance hack: excluded map encoding in many cases from synchronization block, especially when doing an iteration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7532 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:29:55 +00:00
orbiter
5e186e0122 continuing the fight against deadlocks during time formatting: better caching.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:11:53 +00:00
orbiter
1110d16af9 performance hack: replaced generic row.getColBytes() call with row.getPrimaryKeyBytes() where the column is 0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 12:41:27 +00:00