Commit Graph

807 Commits

Author SHA1 Message Date
sixcooler
9ab0ba41e2 using GzipDecompressingEntity from httpclient instead of our own
(was just fixed there in httpclient-4.1.2 and does a proper job)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7877 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 17:51:30 +00:00
sixcooler
07f5954570 try better handling of corrupt blobs
@developer: please revert if I'm wrong
see http://forum.yacy-websuche.de/viewtopic.php?f=8&t=3334

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7872 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 13:27:11 +00:00
orbiter
f970670a7c - bugfix in ServerScannerList
- speed up of generation of scanner list avoiding forced dns lookup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7871 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 13:21:18 +00:00
orbiter
8e03b8ee8b better integration of server list in interactive search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7870 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 12:25:45 +00:00
orbiter
0a3ab7da1b do not sort the same array concurrently
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7868 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 08:06:21 +00:00
sixcooler
eb14111200 encapsulate potentially expensive objects in TextSnippet so that they can be garbage-collected as soon as possible
this reduces the chance of OOMs during massive search & snippet-fetching

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7865 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-11 21:07:52 +00:00
orbiter
0d33cf352b removed synchronization in DNS resolve (solves a problem when loading snippets but in the past concurrent dns requests also caused deadlocks. but this is many years ago and we will give it another try)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7863 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-03 19:42:18 +00:00
orbiter
44d74f8f89 performance hacks for seed generation (because thread dumps showed multiple occurrences at these code points)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7861 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-03 18:32:11 +00:00
sixcooler
5cd07d7f84 early freeing resources on deleting index reference if search-verification fails (aka Switchboard.cleanupJob)
doing same thingy on other methods of touched files as well

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-02 15:52:33 +00:00
sixcooler
a311596881 finishing up my commits (7855-7858) which could be helpful for
not declaring inside loops (helps GC of some VMs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7859 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-01 23:35:24 +00:00
sixcooler
9170a434ed throwing an exception again in FileUtils.copy(reader, writer)
OOMs could occur here and should not be ignored

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-01 23:32:58 +00:00
sixcooler
ce248cc8dd fewer byte-arrays of response-content, fewer byte-array <-> stream conversions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7856 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-01 23:31:08 +00:00
sixcooler
59b767eebd stop loading via http at a defined maximum number of bytes - even if the size is unknown before loading
using max-file-size of type int for parsing documents
(since content is used as byte-arrays, 'integer' should be maximum)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7855 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-01 23:28:23 +00:00
sixcooler
916d79111e Runtime.maxMemory() DOES change @ runtime:
I was surprised to see Total-ram > Max-ram and MemoryControl.available() < 0
MemoryControl.available() < 0 causes errors where its value is used, e.g. for the dimension of buffers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-19 12:48:50 +00:00
orbiter
299af4943c added another memory protection hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7849 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-17 17:55:08 +00:00
orbiter
1f300217f8 more protection for the cleanup thread
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7848 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-17 08:39:39 +00:00
orbiter
d13103a0a7 changed the way how the index cache is flushed: do not flush when a put was made because that could cause that many put calls synchronize for a long time when the dump or a merge is performed. Instead a watchdog thread is doing the dump and therefore puts cannot block any more which is good when a put happens during a search result preparation.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7847 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-17 00:02:42 +00:00
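The flush strategy described in commit d13103a0a7 above can be illustrated with a minimal sketch. It is not YaCy's index cache: the class name, the in-memory `dumped` map standing in for the on-disk dump, and the size limit are all assumptions made for illustration. The point it shows is that put() never flushes, while a separate watchdog thread does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not YaCy's index cache: all names here are assumptions.
class WatchdogFlushSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> dumped = new HashMap<>(); // stands in for the on-disk dump
    private final int maxEntries;

    WatchdogFlushSketch(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // put() only writes to the in-memory cache; it never triggers a flush itself
    void put(String key, String value) {
        cache.put(key, value);
    }

    int cacheSize() {
        return cache.size();
    }

    // Called by the watchdog: moves the cached entries to the dump once the limit is reached.
    synchronized int flushIfNeeded() {
        if (cache.size() < maxEntries) return 0;
        int moved = 0;
        for (String key : cache.keySet()) {
            String value = cache.remove(key);
            if (value != null) {
                dumped.put(key, value);
                moved++;
            }
        }
        return moved;
    }

    // Starts a daemon thread that checks the cache at a fixed interval.
    Thread startWatchdog(final long intervalMillis) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    flushIfNeeded();
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "cache-watchdog");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Because only the watchdog holds the flush lock, a put() issued while search results are being prepared does not wait for a dump in this sketch.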
orbiter
b06faab9d3 do not allocate a StringBuilder object in case that there is not enough memory for that
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7846 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-16 23:17:19 +00:00
orbiter
6a6f27eaf3 do not sort arrays again if arrays are already sorted
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-16 19:21:39 +00:00
orbiter
3d043ce9d6 - refactoring
- do not start worker threads in Array class if concurrency is not used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7844 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-16 19:13:30 +00:00
orbiter
48b78e9ff4 disabling concurrency in new sort since that is not yet working correctly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7843 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-16 11:54:47 +00:00
orbiter
62ac73a108 fixed bugs and deadlocks in core database indexing structures:
- added new Array class that contains an abstraction of the java Arrays class which replaces the home-brew quicksort algorithm.
- the new class is about four times slower than the old one, but it works correctly (the old one had errors)
- fixed a synchronization problem

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-16 10:08:43 +00:00
orbiter
1912d0cccc changed handling of RowSet element retrieval: until today all elements had been copied from the underlying byte[] arrays into a new Entry object that again had a copy of a portion of that byte[] in its own byte[]. There was an option to just refer to the underlying byte[] with a pointer but that was almost never used. This commit now changes an interface to the Row class where it is now necessary to tell if a copy is always required. Fortunately the copy is only needed in very rare cases. That means that this change should cause much less memory allocation; it is expected that this happens especially during search situations.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7840 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-15 08:38:10 +00:00
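The copy-versus-reference choice described in commit 1912d0cccc above can be sketched as follows. The class and the `forceCopy` flag are hypothetical names chosen for illustration; YaCy's real Row and Entry classes have a different interface.

```java
import java.util.Arrays;

// Hypothetical sketch; YaCy's Row/Entry classes have a different, richer interface.
class RowEntrySketch {
    private final byte[] data;
    private final int offset;
    private final int length;

    private RowEntrySketch(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    // forceCopy = true: the entry owns a private copy of its bytes.
    // forceCopy = false: the entry only points into the shared backing array.
    static RowEntrySketch of(byte[] backing, int offset, int length, boolean forceCopy) {
        if (forceCopy) {
            return new RowEntrySketch(Arrays.copyOfRange(backing, offset, offset + length), 0, length);
        }
        return new RowEntrySketch(backing, offset, length);
    }

    byte get(int i) {
        if (i < 0 || i >= length) throw new IndexOutOfBoundsException("index " + i);
        return data[offset + i];
    }
}
```

The non-copying variant allocates no new array, but later writes to the backing array become visible through the entry, which is why the caller has to state whether a copy is required.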
orbiter
bb8e3f8523 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7839 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-14 21:42:30 +00:00
orbiter
11dc653de3 added a visualization of peer pings to the performance graphic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7837 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-14 07:07:06 +00:00
orbiter
3a191cdf14 because newbies are scared by the memory consumption shown in the performance graph, and because of arguments about high memory consumption that stem from poor knowledge of java garbage collection techniques, the memory display has been removed from the performance graph shown on the Status.html page. The memory graph can still be seen on the Performance page, where it is unchanged.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7836 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-14 03:25:57 +00:00
orbiter
52d799e7c8 fix for solr auth
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7833 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-05 09:21:30 +00:00
orbiter
9eb8e9acd9 no error message about missing browser in headless environments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7832 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-05 06:54:05 +00:00
orbiter
d3c89b90ce temporarily adding the old httpclient-3.1 again because the solrj classes need it. should be removed as soon as solrj supports httpclient-4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-04 17:04:49 +00:00
orbiter
bd99969758 fixed bad query
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7830 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-04 16:53:18 +00:00
orbiter
768c59740c - replaced solrj 3.1 with solrj 3.3
- updated also slf4j
- added authentication for solrj


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7829 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-04 16:35:30 +00:00
low012
c7b95e8c81 *) Invalid crawl profiles (containing invalid mustmatch/mustnotmatch filters) will be moved from active crawls to invalid crawls (new file: DATA/INDEX/freeworld/QUEUES/crawlProfilesInvalid.heap). This file cannot be edited yet, but it should be easy to extend the CrawlProfileEditor accordingly.
*) Corrupt crawlProfilesPassive.heap would cause crawlProfilesActive.heap to be deleted. Don't know if this ever happened, but it will not happen anymore.
*) Cleaned up a little bit.
*) Added some comments.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7827 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-03 23:55:55 +00:00
orbiter
6d2e252bcf fix for:
java.lang.NullPointerException
	at net.yacy.kelondro.index.RowCollection.<init>(RowCollection.java:97)
	at net.yacy.kelondro.index.RowSet.<init>(RowSet.java:48)
	at net.yacy.kelondro.rwi.ReferenceContainer.<init>(ReferenceContainer.java:58)
	at net.yacy.kelondro.rwi.ReferenceIterator.next(ReferenceIterator.java:69)
	at net.yacy.kelondro.rwi.ReferenceIterator.next(ReferenceIterator.java:43)
	at net.yacy.kelondro.blob.ArrayStack.merge(ArrayStack.java:1023)
	at net.yacy.kelondro.blob.ArrayStack.mergeWorker(ArrayStack.java:922)
	at net.yacy.kelondro.blob.ArrayStack.mergeMount(ArrayStack.java:869)
	at net.yacy.kelondro.rwi.IODispatcher$MergeJob.merge(IODispatcher.java:267)
	at net.yacy.kelondro.rwi.IODispatcher$MergeJob.access$300(IODispatcher.java:239)
	at net.yacy.kelondro.rwi.IODispatcher.run(IODispatcher.java:180)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7822 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-03 20:44:33 +00:00
orbiter
2d4bb139d3 - added counting of links with noindex tag for solr index
- bugfixes for solr index

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7820 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-03 06:40:05 +00:00
orbiter
892caccdca added default configuration in ConfigurationSet in case of new values
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7814 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-02 00:09:49 +00:00
orbiter
bda3eec0ff added parsing of canonical link element to html parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-01 16:38:01 +00:00
orbiter
b6f09a475d - added an index profile editor in the /indexFederated_p.html servlet for solr indexes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7811 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-30 15:49:21 +00:00
orbiter
b666a929e7 fixed Semaphore handling in case of interruptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7809 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-30 15:37:14 +00:00
orbiter
de7a054d77 added parser for such files like the new solr.key.list
it parses text files with the following syntax:
 - all lines beginning with '##' are comments
 - all non-empty lines not beginning with '#' are keyword lines
 - all lines beginning with '#' and where the second character is not '#' are commented-out keyword lines


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7808 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-29 15:35:45 +00:00
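The three syntax rules listed in commit de7a054d77 above can be sketched as a small parser. This is an illustration under assumptions, not the parser from the commit: the class name, the boolean "enabled" flag per keyword, and the decision to skip a bare "#" line are all invented here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the described syntax; not the parser added in the commit.
class KeywordListSketch {
    // Maps each keyword to true (active line) or false (commented-out line).
    static Map<String, Boolean> parse(List<String> lines) {
        Map<String, Boolean> keywords = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("##")) continue; // blank line or comment
            if (line.charAt(0) == '#') {
                // single '#': a commented-out keyword line
                String key = line.substring(1).trim();
                if (!key.isEmpty()) keywords.put(key, Boolean.FALSE); // assumption: skip a bare "#"
            } else {
                keywords.put(line, Boolean.TRUE); // active keyword line
            }
        }
        return keywords;
    }
}
```

Keeping commented-out keywords in the result, rather than dropping them, would let an editor page show them as switched off; whether YaCy does that is not stated in the commit message.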
orbiter
267290a821 removed the semaphores from the cache dump process because I believe some of the semaphores may be lost somewhere, which means that the cache is never flushed and the peer then dies from an OOM. The re-introduced synchronization may not be the best solution but should ensure that the caches are flushed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-26 21:45:04 +00:00
orbiter
d8072d1866 added more info to DNS cache in /PerformanceMemory_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7798 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-24 08:27:36 +00:00
orbiter
f803da8aae code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-24 00:24:00 +00:00
orbiter
84c9658644 added a file type navigator
added a protocol navigator

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7795 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-23 15:39:52 +00:00
orbiter
31283ecd07 - added a search option to filter only specific network protocols. i.e. get only results from ftp servers. Just add '/ftp' to your search.
for example search for "passwd /ftp". This can also be done with /http /https and /smb
- fixed some search throttling processes that should protect your peer against search DoS or strong search load

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7794 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-23 11:57:17 +00:00
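The protocol modifier from commit 31283ecd07 above ('/ftp', '/http', '/https', '/smb' appended to a query) can be sketched as a query splitter. The class and method names are hypothetical, and the real query parser in YaCy may treat modifiers differently, for example when several are given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the real YaCy query parser is not shown in the commit log.
class ProtocolModifierSketch {
    static final List<String> PROTOCOLS = Arrays.asList("http", "https", "ftp", "smb");

    // Returns { query without modifier, protocol name or null }.
    static String[] split(String query) {
        String protocol = null;
        List<String> words = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            String candidate = token.startsWith("/")
                    ? token.substring(1).toLowerCase(Locale.ROOT)
                    : null;
            if (candidate != null && PROTOCOLS.contains(candidate)) {
                protocol = candidate; // assumption: the last modifier wins
            } else if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return new String[] { String.join(" ", words), protocol };
    }
}
```

With this sketch, the example query from the commit message, passwd /ftp, yields the search word passwd and the protocol ftp.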
orbiter
7db208c992 performance hacks: more pre-allocated StringBuilder
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7790 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-21 23:10:50 +00:00
orbiter
07e89a7ae5 added @Deprecated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7788 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-20 22:33:45 +00:00
orbiter
9706fc55aa enhanced content scraper (should discover urls much faster in case of very large plain texts)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7787 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-20 22:29:45 +00:00
orbiter
996f0a8764 disabled assert in Base64Order which eats away too much performance during testing with -l
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7786 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-19 13:34:55 +00:00
orbiter
f667b9c289 enhanced identificator: using AtomicInteger for counter
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7785 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-19 13:31:10 +00:00
orbiter
16327d1cbe unwrapping of call depth (one call less for UTF8.String)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7784 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-19 13:15:01 +00:00
orbiter
f30d36b101 enhanced template engine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7783 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-19 13:02:06 +00:00
orbiter
aa6c32d753 enhanced UTCDiffString
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7782 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-19 12:38:06 +00:00
f1ori
f87865a50b always shutdown log, fixes zombie processes in init stop script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7780 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-15 09:14:51 +00:00
orbiter
115abc8917 - more attributes for search progress bar
- moved cache strategy to cora package

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7778 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-13 21:44:03 +00:00
orbiter
77fe69395d added jempbox-1.5.0.jar which is required by pdfbox-1.5 as stated in http://pdfbox.apache.org/dependencies.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-05 20:04:41 +00:00
sixcooler
df1725ef43 re-enable POST over proxy, which didn't work since update to httpcore-4.1.1
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-04 13:25:03 +00:00
orbiter
2683162ec5 - added more options to access grid picture, web structure picture and network graphics
- remove test class


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-02 23:27:26 +00:00
orbiter
0c1b29f3c9 - applied many small performance hacks
- added a memory limitation in the zip parser and the pdf parser
- added a search throttling: if too many search queries are still to be computed, then new requests are not accepted for some time. if after one second there is still no space to perform another search, the search terminates with no results. this case should only happen in case of DoS-like situations and in case of strong load on a peer like if it is integrated in metager.
- added a search cache deletion process that removes search requests in case that throttling happens

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7766 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-01 19:31:56 +00:00
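The throttling behaviour described in commit 0c1b29f3c9 above (wait a bounded time for a free slot, otherwise return no results) can be sketched with a counting semaphore. The class, the slot count and the use of `java.util.concurrent.Semaphore` are assumptions for illustration; the commit log does not say how YaCy implements it.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch; slot count and wait time are illustrative parameters.
class SearchThrottleSketch {
    private final Semaphore slots;
    private final long waitMillis;

    SearchThrottleSketch(int maxConcurrentSearches, long waitMillis) {
        this.slots = new Semaphore(maxConcurrentSearches);
        this.waitMillis = waitMillis;
    }

    // Runs the search if a slot becomes free within waitMillis, else returns no results.
    List<String> search(Supplier<List<String>> work) {
        boolean acquired;
        try {
            acquired = slots.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
            return Collections.emptyList();
        }
        if (!acquired) return Collections.emptyList(); // throttled: empty result
        try {
            return work.get();
        } finally {
            slots.release(); // always free the slot, even if the search throws
        }
    }
}
```

A wait time of 1000 milliseconds would correspond to the one-second limit mentioned in the commit message.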
orbiter
fe0c08455b more concurrency (enhancement) hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-30 08:53:58 +00:00
orbiter
87082f407e less String object creation during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7756 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-30 04:19:20 +00:00
orbiter
3c2b994bd6 write access/load time to solr index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 12:35:08 +00:00
orbiter
a36fda991e hack to increase speed of url hash computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 12:34:38 +00:00
orbiter
dbea40d536 - changed snippet fetch strategy logic: do not check if entry is in cache. This should reduce IO load on the HTCACHE which is a showstopper during large number of search requests
- forced a possible short memory status when a search is started to flush caches that may cause search-heaps with resource contention effects

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7747 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 09:32:03 +00:00
orbiter
4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
used an ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes).
The new ASCII String <-> byte[] conversion methods have less computation overhead than the UTF8 conversion.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 08:24:54 +00:00
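The ASCII shortcut described in commit 4bea3f9714 above can be sketched as a plain char-to-byte loop that bypasses the charset encoder. The class and method names are hypothetical, and the sketch assumes the input really is 7-bit ASCII, as the commit states for base64 hashes.

```java
// Hypothetical sketch; only valid for Strings that contain 7-bit ASCII characters.
class AsciiSketch {
    // Converts without a charset encoder: one char becomes one byte.
    static byte[] getBytes(String s) {
        byte[] b = new byte[s.length()];
        for (int i = 0; i < b.length; i++) {
            b[i] = (byte) s.charAt(i); // assumption: every char is below 128
        }
        return b;
    }

    // Converts back: one byte becomes one char, masked to 7 bits.
    static String string(byte[] b) {
        char[] c = new char[b.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = (char) (b[i] & 0x7F);
        }
        return new String(c);
    }
}
```

For non-ASCII input this loop silently loses information, so it is only a replacement for UTF-8 decoding where the content is known to be ASCII.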
orbiter
746e3c3b06 Replaced a widely-used Property Object in the httpd with HashMap<String, Object> which is not synchronized like Properties
A synchronization is not needed here and applies an overhead to the httpd process which is now removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7745 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 16:34:35 +00:00
orbiter
e28bd0d038 fix for some possible causes of memory leaks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 14:35:32 +00:00
orbiter
09ba6814c0 - non-blocking word hash computation with dynamic digest object generation (this was important!)
- (very) small performance enhancement in did-you-mean


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 12:58:11 +00:00
orbiter
10e2f588f8 - enhanced ybr ranking computation
- many speed/performance hacks
- added solr sharding and new sharding web interface
- added option to switch off the yacy index when using solr
- added new fail-url categories which are used to make a distinction which fail-urls to be sent to solr
- refactoring/renaming of some method names to distinguish host/url hashes better
- a large number of bug/npe fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 10:57:02 +00:00
orbiter
bd55dcee50 - commented out experimental distributed ranking loading
- fewer threads for blocking threads
- disable all threads for DHT transmission for networks with zero peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7737 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-24 21:08:01 +00:00
orbiter
98c4d25185 fix for endless loop in FTP crawling, see http://bugs.yacy.net/view.php?id=32
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7736 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-24 10:06:20 +00:00
orbiter
3ed4a09368 small features, some bug fixes and performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-23 21:08:04 +00:00
orbiter
b45701d20f this is a re-implementation of the YaCy Block Rank feature
This time it works like this:
- each peer provides its ranking information using the yacy/idx.json servlet
- peers with more than 1 GB ram will load this information from all other peers, combine that into one ranking table and store it locally. This happens during the start-up of the peer concurrently. The new generated file with the ranking information is at DATA/INDEX/<network>/QUEUES/hostIndex.blob
- this index is then computed to generate a new fresh ranking table. Peers which can calculate their own ranking table will do that every start-up to get latest feature updates until the feature is stable
- I computed new ranking tables as part of the distribution and commit them here as well
- the YBR feature must be enabled manually by setting the YBR value in the ranking servlet to level 15. A default configuration for that is also in the commit but it does not affect your current installation only fresh peers
- a recursive block rank refinement is implemented but disabled at this point. it needs more testing

Please play around with the ranking settings and see if this helped to make search results better.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7729 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-18 14:26:28 +00:00
orbiter
d27a0a67ff fix in log initialization according to hint from Dominic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7728 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-17 15:53:59 +00:00
orbiter
205cc75157 abstraction of surrogate main element (xmlns:geo was missing for wiki extracts)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-17 08:57:49 +00:00
orbiter
021840e5ba removed (almost) deadlocks and unnecessary CPU load
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-17 00:00:01 +00:00
orbiter
123375bfba added a new yacy protocol servlet 'idx'. This returns an index to one of the data entities that is stored in YaCy.
This servlet currently only serves for indexes to the web structure hosts. It can be tested by calling
http://localhost:8090/yacy/idx.json?object=host
This yacy protocol servlet is the first one that returns JSON code and that also shows index entries in a readable format. This will make the development of API applications much easier. This is also an example implementation for possible json versions of the other existing YaCy protocol interfaces.

The main purpose of this new feature is to provide a distributed block rank collection feature. Creating a block rank is very difficult if the forward-link data is first collected and then one peer must create a backward-link index. This interface provides already a partial backward index and therefore a collection of all these indexes needs only to be joined which is very easy. The result should be the computation of new block rank tables that all peers can perform.

To reduce load from peers this servlet buffers all data and refreshes it only once in 12 hours. This very slow update cycle is needed because the interface will be called round-robin from all peers once after start-up.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7724 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-15 22:57:31 +00:00
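The buffering described in commit 123375bfba above (data is computed once and refreshed at most every 12 hours) can be sketched as a time-limited buffer. The class, the injected clock and the loader are hypothetical; the servlet's real caching code is not shown in the commit log.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of a buffer that is rebuilt only after a maximum age.
class TimedBufferSketch<T> {
    static final long TWELVE_HOURS_MILLIS = 12L * 60 * 60 * 1000;

    private final Supplier<T> loader;   // computes the (expensive) data
    private final LongSupplier clock;   // current time in milliseconds, injectable for tests
    private final long maxAgeMillis;
    private T value;
    private long loadedAt;
    private boolean loaded = false;

    TimedBufferSketch(Supplier<T> loader, LongSupplier clock, long maxAgeMillis) {
        this.loader = loader;
        this.clock = clock;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Returns the buffered value; reloads only on first use or when it is too old.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= maxAgeMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

In production use the clock could be `System::currentTimeMillis`; passing it in as a parameter is a design choice of this sketch that keeps the refresh logic testable without waiting.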
orbiter
5c981762c6 added bigrange option for network scan
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7721 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-14 09:13:16 +00:00
orbiter
bade61696f speed-up of network port scanner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7719 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-14 09:03:16 +00:00
orbiter
1d8b0f74f4 one more fix for SVN 7713
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 15:31:24 +00:00
orbiter
0960261769 fix for svn 7713
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7715 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 15:20:57 +00:00
orbiter
5b579e21a3 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 06:21:40 +00:00
orbiter
039126cfaf better handling of on/off switched solr indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-08 22:47:20 +00:00
orbiter
dc54915df4 fix for very bad compare
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7708 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-08 08:45:58 +00:00
orbiter
9248a4eef4 reduce the effect of 'Bildersuche findet generierte HTML-Seiten als Bilder' (image search finds generated HTML pages as images)
see http://bugs.yacy.net/view.php?id=9

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7705 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-07 07:37:46 +00:00
orbiter
76f2817e00 a fix for the snippet computation and hopefully better snippets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-05 23:05:38 +00:00
orbiter
deda54d684 - relaxed matching of string-search (this is now case-insensitive)
- added transport of string-search pattern to remote search protocol
- fixed a problem parsing snippets with a '-' inside

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7700 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-05 22:37:06 +00:00
orbiter
15e3a57b4e removed unused functions in condenser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7698 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-05 09:23:10 +00:00
orbiter
6e42d4de88 - added full-String search function: find things that match exactly what is quoted in the query
- re-structuring authentication methods to fix a problem with API steering

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7697 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-05 00:25:14 +00:00
orbiter
8e10b82280 small fix for solr export
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7696 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-03 22:21:45 +00:00
orbiter
6fa439c82b - refactoring of robots
- added option to crawler to send error-URLs to solr
- changed solr scheme slightly (no multi-value fields where no multi values are)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-02 14:05:51 +00:00
orbiter
e3d19d0a90 fix in Document inboundlinks/outboundlinks sorting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7690 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-01 15:49:04 +00:00
orbiter
4e8fa03514 added more attributes to html evaluation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7688 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-29 15:36:44 +00:00
orbiter
528da7c9ea removed unused class and added license header for new class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7680 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-28 13:14:30 +00:00
orbiter
f6077b3cc0 added more attributes for html parser and enhanced data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7679 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-28 13:09:01 +00:00
sixcooler
4eb9c1e7c3 not setting userAgent from Constructor as default for following calls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-26 17:39:16 +00:00
orbiter
d8e934c085 better abstraction of http client identification
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7675 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-26 13:35:29 +00:00
sixcooler
a3e707283d not using HTTPConnector anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-26 11:46:31 +00:00
orbiter
9f1f47ec67 added some comments to explain the isLocal patch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7673 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-21 21:59:56 +00:00
orbiter
b77b8cac0c - enhanced html parser: recognized much more details in the content
- added more properties to solr index
- refactoring
- more constants in switchboard
- fix for some NPEs
- recognition of more images
- removed synchronization in HandleMap (obviously not necessary?)
- added a nolocal configuration to remove excessive dns lookup (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as being local.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-21 13:58:49 +00:00
orbiter
3d5104d357 - fixed a bug in crawl start with file name (npe in new url)
- added deletion of solr index in IndexControlRWIs
- added asynchronous adding of large url lists (happens when crawls are started with file)
- fixed npe in Image display
- replaced language warning with fine logging
- added a domain name cache in Domains that helps to speed up the isLocal property (less DNS lookups)
- added a new storage class for this new cache: KeyList. The domain key list is stored in DATA/WORK/globalhosts.list
- added concurrent solr updates and chunked transfers (50 documents until a commit is done) for high-speed feeding (> 40000 ppm)
- fixed a bug in content scraper that chopped off large parts of crawl lists (using crawl start from file)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-18 16:11:16 +00:00
orbiter
958ff4778e enhanced location search:
search is now done using verify=false (instead of verify=cacheonly), so that many more targets can be found.
This showed a bug where no location information was used from the metadata (and other metadata information) if cache=false is requested. The bug was fixed.

Added also location parsing from wikimedia dumps. A wikipedia dump can now also be a source for a location search.
Fixed many smaller bugs in connection with location search.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7657 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-15 15:54:19 +00:00
sixcooler
8d63f3b70f just cosmetics - keeping my baby clean :-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7656 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-15 00:48:39 +00:00
orbiter
e402622584 removed httpclient-3.1 (this was added with last commit which was a mistake)
the httpclient is required by solrj but no class from solrj is used which references to httpclient-3.1
Instead the YaCy http client library based on the apache http client 4.1 is used using a wrapper class
which is in net.yacy.cora.services.federated.solr.SolrHTTPClient

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7655 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-14 20:12:14 +00:00
orbiter
19fd13d3bc Added federated index storage to solr.
YaCy now supports storage to remote solr indexes.
More federated storage (and search) methods may follow.

The remote index scheme is the same as produced by the SolrCell; see
http://wiki.apache.org/solr/ExtractingRequestHandler
Because this default scheme is used, the default example scheme can be used as solr configuration.
This is also the same scheme that solr uses if documents are imported with apache tika.

federated solr storage is switched off by default.

To use this, do the following:
- set federated.service.solr.indexing.enabled = true
- download solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/
- extract the solr (3.1) package, 'cd example' and start solr with 'java -jar start.jar'
- start yacy and then start a crawler. The crawler will fill both the YaCy and the solr indexes.
- to check what's in solr after indexing, open http://localhost:8983/solr/admin/

So far it is not possible to use YaCy to search in that solr index.
The storage functionality is made available now for two reasons:
1) to compare the functionality of Solr and YaCy and to compare the search speed
2) to use YaCy as a search appliance for people who need a crawler or the other source harvesting methods
   that YaCy provides (like dublin core reading, wikimedia dump reading, rss feed reader etc.) but still
   want to use solr instead of YaCy.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-14 20:05:04 +00:00
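The first setup step in the commit above is a single configuration switch. As a minimal sketch, assuming the setting is kept in Java properties syntax, the flag could be read as shown below. The class SolrFlagSketch is hypothetical; only the key name is taken from the commit message.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: read the solr indexing switch from properties text.
final class SolrFlagSketch {
    static final String KEY = "federated.service.solr.indexing.enabled";

    static boolean solrIndexingEnabled(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            // not expected for an in-memory reader
            throw new IllegalStateException(e);
        }
        // a missing key means "off", matching the default named in the commit
        return Boolean.parseBoolean(props.getProperty(KEY, "false"));
    }
}
```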
orbiter
c17d102bd8 enhanced speed for OrderedScoreMap inc method and size comparison in concurrent environments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7653 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-13 22:04:23 +00:00
orbiter
b788182954 some enhancements to scoring speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7652 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-13 15:17:00 +00:00
orbiter
01690eab86 fix for mediawiki importer and wikicode parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-13 13:22:27 +00:00
orbiter
4c013d9088 more UTF8 getBytes() performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-12 05:02:36 +00:00
cominch
9ac02caf00 different initialization of empty variables in alternative constructor. This leads to wrong interpretation of user credentials, resulting in unnecessary "@" in front of host, and different urlhash values.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-06 10:59:31 +00:00
orbiter
57ce1fb491 reverted synchronization from SVN 7641
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7643 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 20:31:02 +00:00
orbiter
17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 12:20:20 +00:00
orbiter
7c8e764201 removed synchronization again...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 10:13:30 +00:00
orbiter
96c32e87b0 fixes to crawler and new user-agent crawl-delay handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 09:47:18 +00:00
orbiter
cb6f709a16 - enhancements in surrogate reading
- better display of map in location search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-02 00:11:37 +00:00
low012
1ff9947f91 *) added new user right: extended search right (allows defining users who can query more results than anonymous users)
*) cleaned up code a little bit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-01 23:32:40 +00:00
orbiter
564184909a enhanced the surrogate parser: better reading of UTF-8 characters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7634 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-01 11:05:42 +00:00
orbiter
156cf02703 - added an index constraint 'has location' to the condenser
- added evaluation of the 'has location' constraint to search using the /location operator


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-31 09:41:30 +00:00
orbiter
41b8d7f655 fix for url normalization (no backpath resolving in post parameters)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7632 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-31 09:40:01 +00:00
orbiter
0430a94eaa the location search now shows only locations that are attached as metadata to web pages, not re-evaluated locations
- added parser for in-text appearing geo-locations
- added geo-locations to rss search result
- added evaluation of metadata-attached geo-locations in yacysearch_location to show search results within a map


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 23:26:36 +00:00
orbiter
8412f8787d fix for http://bugs.yacy.net/view.php?id=8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7630 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 08:17:25 +00:00
orbiter
9b25d07295 - added geo information parsing to html parser
- extended metadata information in index with geolocalisation
- added display of location in yacydoc and ViewFile

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-30 00:49:47 +00:00
lotus
cbf87fe72f write PID to yacy.running
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-26 15:11:29 +00:00
orbiter
b1a8d0c020 enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
orbiter
f3baaca920 - enhancements to DNS IP caching and crawler speed
- bugfixes (NPEs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 09:34:10 +00:00
f1ori
df71776929 * fix bug #7
* log requires poison to finish, so the Base64Order main function doesn't finish when called from the debian configure script


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 19:42:22 +00:00
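The "poison" in the commit above refers to the common poison-pill pattern: a worker thread reads from a queue until it receives a special end marker, and the program cannot finish until that marker is sent. The following is a minimal sketch of the pattern, not the YaCy logging code; PoisonPillSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a queue consumer that stops on a poison pill.
final class PoisonPillSketch {
    // unique instance, recognized by identity rather than by content
    private static final String POISON = new String("POISON");

    static List<String> drain(List<String> messages) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final List<String> written = new ArrayList<>();
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String m = queue.take();
                    if (m == POISON) break; // end marker: leave the loop
                    written.add(m);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            for (String m : messages) queue.put(m);
            queue.put(POISON); // without this the writer would wait forever
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written;
    }
}
```

If the poison is never enqueued, a non-daemon writer thread keeps the JVM alive, which is the kind of hang the commit describes.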
orbiter
78d4c45d09 enhancement during search process: fast fail of search in case all index feeders have terminated.
This change should affect filtering and navigators and should make search navigation faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 13:05:51 +00:00
orbiter
a50f28e6e7 - fixed missing save operation for peer name change
- fixed import of mediawiki dump files
- added script to add mediawiki dump files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-19 23:52:09 +00:00
orbiter
2b5f8585bf performance hack for Balancer and ip address parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 21:09:18 +00:00
orbiter
b1d133b69f another enhancement to the ThreadDump function: better multiple dumps and filtering out of uninteresting dump parts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 20:48:39 +00:00
orbiter
a35d513bd8 fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
orbiter
a6935e7dc8 fix for active dns resolving: do not resolve in case that the dns server is not available (offline mode)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-16 07:05:10 +00:00
orbiter
859c99886c fix for multiple thread dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 23:05:51 +00:00
orbiter
61acf55da4 avoided using a synchronized(this) for the hash computation to prevent the lock on the object from being (accidentally) stolen by another thread; the synchronization now uses the protocol object instead. Also made the protocol object final.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 09:52:39 +00:00
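The idea in the commit above is to lock on a final field instead of on this, so that outside code that synchronizes on the object cannot hold the same monitor. The sketch below illustrates this with a dedicated private lock object; that is an assumption made for clarity (the commit itself reuses the protocol field), and LazyHashSketch with its placeholder hash function is hypothetical.

```java
// Hypothetical sketch: lazy hash computation guarded by a private final lock.
final class LazyHashSketch {
    private final Object lock = new Object(); // not reachable from outside
    private final String url;
    private volatile String hash;             // computed on first request

    LazyHashSketch(String url) {
        this.url = url;
    }

    String hash() {
        String h = hash;
        if (h == null) {
            // callers that synchronize on this LazyHashSketch instance
            // do not interfere, because a different monitor is used here
            synchronized (lock) {
                h = hash;
                if (h == null) {
                    h = Integer.toHexString(url.hashCode()); // placeholder hash
                    hash = h;
                }
            }
        }
        return h;
    }
}
```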
orbiter
c2a968c23f fix for bug in formatting in ThreadDump
and added hint for linux/Mac users that they may use the LOCKED feature using the start option -l

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7601 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 08:39:05 +00:00
orbiter
078ecacf61 avoid synchronization in DigestURI hash requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 00:47:30 +00:00
orbiter
1989ebc24b removed more warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 22:52:30 +00:00
orbiter
0324de1467 removed debug line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7597 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:34:42 +00:00
orbiter
1aba7869bf patch for Windows: do not use the thread lock feature from previous commit if used on Windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:33:36 +00:00
orbiter
0a11727374 added new feature for Thread dump:
"THREADS WITH STATES: LOCK FOR OTHERS"
will show only those threads that lock other threads. This is the 'opposite part' of the blocked threads.
Because this uses a thread dump that is produced with a kill -3 on the PID of the process, and such thread dumps are written by the Java core outside of System.out and System.err, it is necessary to read the dump from a log in the file system. Such a log is only written if YaCy is started with startYACY.sh on a linux system. That means:
this feature is only available on linux and Mac OS X if YaCy is started with ./startYACY.sh -l


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:32:20 +00:00
orbiter
b62b79675b removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:08:18 +00:00
orbiter
a07a1a8b1e removed type cast warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:07:15 +00:00
orbiter
8edaccfedf removed unused variables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 21:03:37 +00:00
orbiter
e6c3507b17 disabled some of the previous changes (did not work in openjdk)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:48:36 +00:00
orbiter
f9e5c21083 update to thread dump logs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7590 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-14 20:46:04 +00:00
orbiter
8f11d3a5bb redesigned the ScoreMap classes:
- new concurrent score map using atomic operations from the java concurrency classes
- redesigned the difference between StaticScore and Dynamic Score into ScoreMap and ReversibleScoreMap, so that many classes can now use simple ScoreMap objects, which work better in concurrent environments using the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
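A score map built on atomic operations, as named in the commit above, can be sketched with ConcurrentHashMap and AtomicLong. This is an illustration under assumptions, not the ConcurrentScoreMap class from YaCy; the class ScoreSketch and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: lock-free counting of scores per key.
final class ScoreSketch<E> {
    private final ConcurrentHashMap<E, AtomicLong> scores = new ConcurrentHashMap<>();

    // Increase the score of a key by one; safe to call from many threads.
    void inc(E key) {
        scores.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    // Current score of a key, or 0 if the key was never counted.
    long get(E key) {
        AtomicLong v = scores.get(key);
        return v == null ? 0L : v.get();
    }
}
```

No synchronized block is needed: the map creates each counter atomically and the counter itself is incremented atomically.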
orbiter
a564230c48 more enhancements against blocked threads that occurred in seed age evaluation (blocks httpd in some cases)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-12 22:54:41 +00:00
orbiter
dc0db3550e avoid string conversion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-11 00:59:27 +00:00
orbiter
694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
- changed menu structure slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 23:25:07 +00:00
orbiter
30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
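The motivation for the commit above is that String.getBytes() without an argument uses the platform default charset, so results can differ between machines. A helper with a fixed encoding avoids that. The sketch below is an assumption about the shape of such a helper, not the YaCy UTF8 class; Utf8Sketch is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: always encode with UTF-8, never the platform default.
final class Utf8Sketch {
    static byte[] getBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String string(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```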
lotus
cb6d307bba adding extension for parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 20:36:01 +00:00