Commit Graph

4443 Commits

Author SHA1 Message Date
orbiter
2f181d0027 introduced concurrency in HTCACHE storage compression
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6806 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 16:22:09 +00:00
orbiter
2e26744f4e more concurrency when normalizing RWI entries + cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6805 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 14:47:57 +00:00
orbiter
555b333041 fix for wrong count of server processes. may fix non-access problems in some cases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6804 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 14:34:16 +00:00
orbiter
aa083fc45c try to get a fix for OOM problem in case that there is no real problem with missing memory.
See also http://forum.yacy-websuche.de/viewtopic.php?p=19835#p19835

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 11:39:54 +00:00
orbiter
70e6222978 more concurrency during search requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6801 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 11:12:36 +00:00
orbiter
4917f96729 fixes for some changes in SVN 6797 that caused NPEs when the bookmarks initialized
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6800 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 10:14:08 +00:00
low012
dff660441a *) changes for better code readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6799 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 01:31:16 +00:00
low012
15d9ea8375 *) changes for better code readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6798 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 01:25:15 +00:00
low012
2bc459252e *) changes for better code readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 01:16:09 +00:00
low012
dc93cec3a8 *) Java 1.5 compatibility (see http://forum.yacy-websuche.de/viewtopic.php?f=8&t=2764)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6796 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-13 00:25:46 +00:00
orbiter
67ec58d8e7 search performance enhancement
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6795 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-12 07:31:43 +00:00
hermens
4ec0092677 more null == proxy fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6794 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-10 18:31:12 +00:00
hermens
2f90f0ad56 Remove asserts blocking proxy use cases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6793 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-10 15:12:39 +00:00
hermens
ef467a0303 Another workaround for the second part of http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2770
This should prevent URLs with bad referrer entries from being dropped by transferURL or even crashing the whole Transmission$Chunk


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6792 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-10 13:57:46 +00:00
sixcooler
eb2a4bb555 workaround(?) for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2770&start=0&st=0&sk=t&sd=a&hilit=DefaultCharsetStringPart
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6791 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-10 00:21:07 +00:00
orbiter
25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6790 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-08 00:11:32 +00:00
low012
b97ad0f380 *) some minor changes for better code readability
*) added more SVN properties

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6787 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-05 12:37:33 +00:00
orbiter
ba51d140e1 added more info in assert in balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6782 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-28 22:59:19 +00:00
orbiter
a85c5bb8a7 added support for multiple (fail-over) network definition locations when http-locations are given. multiple locations can be given with a comma-separated list of urls pointing to the network definition file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6780 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-27 23:15:15 +00:00
orbiter
9b3840cb66 performance hacks for the template engine + cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6778 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-27 22:52:48 +00:00
orbiter
5c10f8bc5f enhanced latest hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-27 07:19:49 +00:00
orbiter
b3238bec83 performance hack for httpd
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6776 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-27 07:09:55 +00:00
orbiter
1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
pass value as byte[], not as String. This should cause that less
byte[] <-> String conversions are made during time-critical tasks.
This redesign is not yet complete, more to come ..

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6775 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 18:33:20 +00:00
orbiter
72d8e9897b removed unnecessary cache flush call in backend of BufferedRecords
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 12:44:13 +00:00
orbiter
749ffbd642 - added another catch case for the index dump and index merge process that should cause non-blocking behavior in case that index dump and/or index merge caused any unexpected exception.
- reverted SVN 6766, this is too dangerous (may cause unexpected memory usage) and should not be necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6773 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 10:46:40 +00:00
orbiter
9ddb8e4a43 set an option for the java-internal image parser that prevents the image from being cached in a temporary file on the file system. This should speed up image parsing during image indexing dramatically and should also give better performance when showing the yacy banner and OSM tiles.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 10:43:31 +00:00
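
A minimal sketch of the kind of option described in 9ddb8e4a43, assuming the switch in question is `ImageIO.setUseCache` from the standard `javax.imageio` package (the commit message does not name the call). The class name `ImageCacheOption` is hypothetical.

```java
import javax.imageio.ImageIO;

class ImageCacheOption {
    // Assumption: disabling the ImageIO disk cache is the option meant by the commit.
    // With the cache disabled, ImageIO buffers stream data in memory
    // instead of writing it to a temporary file.
    static void configure() {
        ImageIO.setUseCache(false);
    }

    public static void main(String[] args) {
        configure();
        System.out.println(ImageIO.getUseCache()); // false
    }
}
```

The setting is global for the JVM, so it only needs to be applied once at start-up.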
orbiter
312ca5d917 removed flush at end of every rwi entry since this reduces the write performance.
This should speed up RWI cache dump and RWI merge operations and should cause less blocking time during these processes for the indexer.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6771 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 10:41:20 +00:00
orbiter
0018163c07 moved table row/column matching method from front-end to back-end
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 10:01:27 +00:00
orbiter
e12f1fd821 - added setting of access rights for executable scripts after auto-installation
The correct access right was missing especially for bin/apicall.sh

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6769 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-25 09:51:01 +00:00
orbiter
31e29a8831 - removed synchronization during index dump and index cleaning
- added semaphores to synchronize index dump and index cleaning for each process separately

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6767 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-25 07:09:53 +00:00
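
The idea in 31e29a8831 (one semaphore per process instead of a shared lock) can be sketched as follows. This is an illustration under assumptions, not the YaCy code: the class `IndexMaintenance` and its method names are hypothetical.

```java
import java.util.concurrent.Semaphore;

class IndexMaintenance {
    // One permit per process: a dump excludes other dumps, a clean excludes
    // other cleans, but dump and clean do not block each other.
    private final Semaphore dumpPermit = new Semaphore(1);
    private final Semaphore cleanPermit = new Semaphore(1);

    boolean tryDump(Runnable dumpJob) {
        return runExclusive(dumpPermit, dumpJob);
    }

    boolean tryClean(Runnable cleanJob) {
        return runExclusive(cleanPermit, cleanJob);
    }

    // Runs the job only if the permit is free; returns false without blocking otherwise.
    private static boolean runExclusive(Semaphore permit, Runnable job) {
        if (!permit.tryAcquire()) return false;
        try {
            job.run();
            return true;
        } finally {
            permit.release();
        }
    }

    public static void main(String[] args) {
        IndexMaintenance m = new IndexMaintenance();
        m.tryDump(() -> {
            System.out.println(m.tryDump(() -> {}));  // false: a dump is already running
            System.out.println(m.tryClean(() -> {})); // true: cleaning has its own permit
        });
    }
}
```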
orbiter
95f31da8da increase dump cache queue length from 1 to 2
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6766 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-24 20:36:35 +00:00
orbiter
6c093d6aed - enhanced domain navigator computation
- fixed domain navigator content in case that a mustmatch constraint was given

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6763 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 13:41:41 +00:00
orbiter
bb63c5d075 using a Pattern object with precompiled regular expressions to apply must-match constraints to search results: should speed up pre-sorting of search results and should cause richer search result sets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6762 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 10:17:28 +00:00
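
A small sketch of the technique named in bb63c5d075: compile the must-match expression once and reuse the `Pattern` for every candidate result. The class `MustMatchFilter` and the sample URLs are hypothetical; only `java.util.regex.Pattern` is taken from the commit message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class MustMatchFilter {
    // Compiled once in the constructor, reused for every URL.
    private final Pattern mustMatch;

    MustMatchFilter(String regex) {
        this.mustMatch = Pattern.compile(regex);
    }

    // Keeps only the URLs that match the constraint completely.
    List<String> filter(List<String> urls) {
        List<String> accepted = new ArrayList<>();
        for (String url : urls) {
            if (mustMatch.matcher(url).matches()) accepted.add(url);
        }
        return accepted;
    }

    public static void main(String[] args) {
        MustMatchFilter f = new MustMatchFilter(".*\\.org/.*");
        System.out.println(f.filter(Arrays.asList(
                "http://yacy.org/a", "http://example.com/b"))); // [http://yacy.org/a]
    }
}
```

Calling `String.matches` per result would recompile the expression each time, which is the cost the precompiled object avoids.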
orbiter
e0da0a84b0 performance fix in http parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-22 09:12:52 +00:00
orbiter
90dd197ae7 - no latency for local crawls
- catch interrupted exception during 'fast' crawls in workflow processor

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-22 09:12:18 +00:00
orbiter
bfb518cd47 some refactoring to get the LoaderDispatcher a little bit more independent from the switchboard
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:28:03 +00:00
orbiter
36bd843ece fix for RFC5322 conformance as suggested by Quix0r in http://forum.yacy-websuche.de/viewtopic.php?p=19585#p19585
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6754 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:23:47 +00:00
orbiter
c855fc48c6 only load robots.txt for http and https protocols
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6753 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:15:11 +00:00
orbiter
748abfcffa added patches to prevent yacy-protocol DoS settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 15:31:15 +00:00
orbiter
e820ed061a avoiding excessive DNS lookups to determine localhost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6750 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 14:28:25 +00:00
orbiter
11983bc936 redesigned some parts of the parser entry point:
- whenever the parser is entered, a whole set of possible parsers is computed according to the given mime type and file extension,
that means that all parsers are considered where the registered mime acceptance and extension acceptance match.
As a result, several parsers may be tried for the same file, which leads to success in cases where previously only the mime type was used to choose the parser and the mime type was given wrongly by the host httpd.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6749 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 13:04:42 +00:00
orbiter
de88200e11 - added Byte Order Mark recognition to serverObjects
The BOM character FEFF may appear at the beginning of strings if some browsers append the characters %EF%BB%BF to input values.
see http://en.wikipedia.org/wiki/Byte_order_mark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6748 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 10:58:40 +00:00
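
The BOM case described in de88200e11 can be reproduced with a short sketch. The helper `stripBom` and the class name are hypothetical and not taken from serverObjects; the example only shows that the bytes EF BB BF decode to the single character U+FEFF, which can then be removed.

```java
import java.nio.charset.StandardCharsets;

class BomStrip {
    // U+FEFF is the Byte Order Mark; its UTF-8 encoding is EF BB BF.
    static final char BOM = '\uFEFF';

    // Hypothetical helper: drop one leading BOM character, if present.
    static String stripBom(String value) {
        if (value != null && !value.isEmpty() && value.charAt(0) == BOM) {
            return value.substring(1);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'y', 'a', 'c', 'y'};
        String decoded = new String(raw, StandardCharsets.UTF_8);
        System.out.println(decoded.length());  // 5: the BOM plus "yacy"
        System.out.println(stripBom(decoded)); // yacy
    }
}
```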
orbiter
89b4fff1c2 adopted ant script for new exif library
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-12 12:36:38 +00:00
orbiter
24e5faee75 added exif parsing for jpg images
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6745 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-12 12:23:38 +00:00
orbiter
82f76e1296 removed log line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6744 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 20:31:38 +00:00
orbiter
0f8004f9da enhanced html parser to recognize a href tags inside header tags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6743 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 17:52:07 +00:00
orbiter
3300930fc5 - (almost) fixed FTP crawler
- integrated/fixed SMB crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6742 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 15:43:06 +00:00
orbiter
1198b9989d bugfixes, more sorttable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 15:39:36 +00:00
orbiter
9623d9e6d2 added a smb loader component for the YaCy crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6737 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 08:55:29 +00:00
orbiter
ae2f3f000f better handling of table copy abandon .. prevent memory leak
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6734 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 13:32:15 +00:00
orbiter
0769517129 added a robots.txt monitor in the crawler monitor submenu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 11:31:15 +00:00
orbiter
48995e71c4 added soft-auth to general authentication scheme
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6732 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 00:07:17 +00:00
orbiter
72f00dee59 removed never-used server access account function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-08 22:30:45 +00:00
orbiter
57e1eae95e longer time-out for url fetching .. may help to show all the links that the statistics report for a search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 22:23:08 +00:00
orbiter
9e639603e3 after frequent occurrences of 100% CPU usages and permanent blockings I try to disable a function in a method that may cause the problem when calling an external library (apache http client 3.x). The thread dump that shows the problem is attached here.
at java.lang.StringCoding.encode(StringCoding.java:266)
	at java.lang.String.getBytes(String.java:946)
	at org.apache.commons.httpclient.util.EncodingUtil.getAsciiBytes(EncodingUtil.java:237)
	at org.apache.commons.httpclient.methods.multipart.Part.sendDispositionHeader(Part.java:220)
	at org.apache.commons.httpclient.methods.multipart.Part.send(Part.java:308)
	at org.apache.commons.httpclient.methods.multipart.Part.sendParts(Part.java:385)
	at org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity.writeRequest(MultipartRequestEntity.java:164)
	at de.anomic.http.client.Client.zipRequest(Client.java:364)
	at de.anomic.http.client.Client.POST(Client.java:339)
	at de.anomic.yacy.yacyClient.wput(yacyClient.java:285)
	at de.anomic.yacy.yacyClient.transferURL(yacyClient.java:1053)
	at de.anomic.yacy.yacyClient.transferIndex(yacyClient.java:942)
	at de.anomic.yacy.dht.Transmission$Chunk.transmit(Transmission.java:200)
	at de.anomic.yacy.dht.Dispatcher.storeDocumentIndex(Dispatcher.java:397)
	at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:103)
	at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:66)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:637)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 21:19:23 +00:00
orbiter
4144927d94 show less errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6725 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 21:02:08 +00:00
orbiter
b88f5fbb4b slightly changed crawling policy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6723 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 01:46:08 +00:00
orbiter
de01fe0e6d fix for bug in url parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6722 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 01:33:18 +00:00
orbiter
7684a575c4 fix for deletion of error database each time when YaCy starts up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6721 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 00:33:39 +00:00
orbiter
f561e340c6 show more results of single domains when not authorized fully (up to 100)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6720 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 00:12:58 +00:00
orbiter
c4bdb1e7f2 added one more option in ViewFile to show an iframe like for the original web page content, but using the cache rather than the direct link to the content in the web. Upgraded the very old and no longer used CacheResource_p servlet to a new and working version.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6719 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-06 23:41:51 +00:00
orbiter
c09a995930 better logging of double occurrences of urls in the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6718 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-06 20:31:30 +00:00
orbiter
1bbe14d23f SVN 6716 unfortunately contained parts of the unfinished SMB integration. To fix compile errors the remaining parts of the SMB implementation stub are added with this commit.
This adds the jcifs smb library.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6717 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-05 21:46:22 +00:00
orbiter
884b262130 - added a new Wiki Namespace Navigator
- some redesign of Navigator data structures

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-05 21:25:49 +00:00
orbiter
617dfbbd06 allow 'authorization by encoded password' also if requesting client is not from localhost but from the same host as yacy is running on.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6714 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-05 16:03:55 +00:00
orbiter
270fb38674 - fixed some bugs in Table viewer
- added 'select all' feature in Tables_p
- enhanced ViewFile.html: has now an input field to load arbitrary resources from the web and analyze them (!!!)
- included the ViewFile servlet into the Index Administration menu
- show in ViewFile if resource is in url-db and/or in Web cache
- bugfixes to BEncodedHeap and Tables management

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-05 15:41:15 +00:00
orbiter
599c3766c4 added authentication to automated API call
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6711 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-04 14:10:03 +00:00
orbiter
727dd9b193 - fixed a bug in robots.txt parser
- moved storage of robots.txt entries to WorkTables, so it is now possible to browse the robots entries with the table browser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6710 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-04 11:58:07 +00:00
orbiter
54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request
- added evaluation and indexing prevention in case that a noindexing is given in a html file

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-03 23:32:56 +00:00
orbiter
46c4f8b68a better look-ahead into the crawl queue: show more on crawl monitor
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6699 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-24 23:11:58 +00:00
lotus
7b546415dc added svn6695 for windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6697 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-24 14:58:53 +00:00
orbiter
f175f9a2d3 changed the way the number of search requests is counted:
so far only search requests at the remote search interface had been counted.
This was done to protect the privacy of searchers, because searches at a peer's own search interface were neither counted nor published.
As a result, no search requests of robinson peers had been counted, because they cannot be counted at a remote peer.
This change introduces a distinction between search requests made locally at the local search interface and search requests that arrive at the local interface but were submitted from a remote IP without authentication.
Now 3 counters are maintained:
- partial count of remote searches
- total count of local searches on robinson peers from non-authenticated clients
- total count of local searches on robinson peers from localhost or authenticated clients
In the global statistic of search requests, the first two of the three counters are now added.
Because we have a large number of robinson peers with a large number of remote non-authenticated requests, the statistic should show at least three times the number of search requests.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6696 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-24 13:53:55 +00:00
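
The three counters from f175f9a2d3 and the rule for the published total can be sketched like this. The class and field names are hypothetical; only the rule "add the first two counters" comes from the commit message.

```java
import java.util.concurrent.atomic.AtomicLong;

class SearchCounters {
    // Searches received through the remote search interface.
    final AtomicLong remote = new AtomicLong();
    // Local-interface searches from non-authenticated remote clients.
    final AtomicLong localAnonymous = new AtomicLong();
    // Local-interface searches from localhost or authenticated clients.
    final AtomicLong localAuthenticated = new AtomicLong();

    // Only the first two counters enter the global statistic;
    // authenticated and localhost searches are left out.
    long publishedTotal() {
        return remote.get() + localAnonymous.get();
    }

    public static void main(String[] args) {
        SearchCounters c = new SearchCounters();
        c.remote.addAndGet(4);
        c.localAnonymous.addAndGet(10);
        c.localAuthenticated.addAndGet(3);
        System.out.println(c.publishedTotal()); // 14
    }
}
```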
orbiter
84222e3b4f fix for auto-updater: delete old libraries before copy of new one
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6695 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-24 13:46:50 +00:00
sixcooler
cd6de83905 next try for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703
(reverted 6692)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-23 15:59:58 +00:00
sixcooler
bfe4693e9a fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-23 13:46:56 +00:00
orbiter
93b7ddc27d fix for http://forum.yacy-websuche.de/viewtopic.php?p=19376#p19376
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6684 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-21 22:49:35 +00:00
orbiter
8030ed3319 self-healing for lost crawl profile handles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6680 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-18 21:55:45 +00:00
orbiter
e3e5e05ec2 fix for problem in ranking setting which was caused by the introduction of a toString() method in serverObjects
see also: http://forum.yacy-websuche.de/viewtopic.php?p=19310#p19310

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-17 21:31:08 +00:00
orbiter
e3ccfb54aa fix for display problem in Firefox on MacOS X
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6677 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-17 09:08:16 +00:00
orbiter
564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6675 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-16 23:06:04 +00:00
orbiter
30c8185139 fix for sid check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6673 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 23:31:32 +00:00
orbiter
ef62d017e5 integrated session id filtering for crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 23:15:17 +00:00
orbiter
d8d9984913 added framework for session id filtering (not ready yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 22:30:41 +00:00
orbiter
2bc36de336 - fix for bug in svn 6669
- cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6670 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 22:06:13 +00:00
orbiter
d378ca4604 better handling of concurrency in seed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6669 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 15:57:35 +00:00
orbiter
6538043d89 fix for http://forum.yacy-websuche.de/viewtopic.php?p=19189#p19189
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 15:45:31 +00:00
sixcooler
e071d71f19 fix for yacy-banner-network-values
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2521

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6659 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-09 18:22:36 +00:00
lotus
945e0ba5a5 allow global search if res. observer disabled index transmission
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-09 17:14:16 +00:00
lotus
8faeedd99a not a fix! for:
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2679

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6657 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-09 09:33:30 +00:00
sixcooler
787b588c33 reverted a part of svn6636:
- didn't work on blobs >2GB
- should be obsolete since svn6651
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2652&sid=7fa98fd3edfc2a03f26394d545e3e3c1&p=19172#p19172

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6655 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-07 19:32:46 +00:00
lotus
11188cd7eb resource observer now uses the Java 6 method to check for free space. thus, disk observing now needs Java 6 installed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6652 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-06 18:48:06 +00:00
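
A sketch of a free-space check with the `java.io.File` API that arrived in Java 6. Commit 11188cd7eb does not say which method the resource observer uses; `getUsableSpace` is an assumption here, and the class `DiskCheck` is hypothetical.

```java
import java.io.File;

class DiskCheck {
    // Assumption: File.getUsableSpace() (available since Java 6) is the method meant.
    // It returns the number of bytes available to this JVM on the partition of 'path'.
    static boolean hasFreeSpace(File path, long minBytes) {
        return path.getUsableSpace() >= minBytes;
    }

    public static void main(String[] args) {
        File dataDir = new File(".");
        long minFree = 100L * 1024 * 1024; // hypothetical threshold: 100 MB
        System.out.println(hasFreeSpace(dataDir, minFree));
    }
}
```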
sixcooler
089877f32c my first commit - hopefully fix for merge problem
- http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2652

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-05 19:38:00 +00:00
orbiter
be18b5d8cd fix for 'cannot switch back to default language'-bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 23:53:02 +00:00
orbiter
d6391f2537 better handling of rewrite cases where the resulting rewrite blob entry is equal in size
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6648 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 23:17:47 +00:00
orbiter
ef9473d92c added another sixcooler suggestion: recycle corrupted records
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 16:25:05 +00:00
orbiter
fe78edac32 - view API calls in correct date-order
- execute recorded API calls in date-order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 15:51:54 +00:00
orbiter
74e736c903 missing file for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6645 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 14:52:58 +00:00
orbiter
308a973503 refactoring of tables data organisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6644 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 11:26:23 +00:00
lotus
85ca96227f fix for re-enable parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6643 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-03 19:33:59 +00:00
orbiter
8a76f38d26 Added a new steering servlet that can be used to repeat actions that had been made on the yacy interface. This can be used to:
- start again a previously started crawl
- submit settings (again). This option will be used to transmit
  all settings of one peer to another peer if the remote-peer
  steering function is ready
This steering framework will also be used for a 'schedule-everything'
which will also include a new scheduler for crawling.
  

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-03 09:31:12 +00:00
orbiter
840527689b more simplification of bookmark class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6639 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 23:04:52 +00:00
orbiter
d77782a8d5 removed bookmark tags file, tags are now stored only in RAM
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6638 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 22:44:59 +00:00
orbiter
ada0ce9de3 refactoring of bookmarks: there is a big performance problem in the bookmarks code and furthermore the bookmarks
will lose their leading role for the re-crawl function when the new api tables work. To be prepared for a replacement
of such functions the bookmark class is re-organised.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6637 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 22:18:56 +00:00
orbiter
3751ab4ae2 added sixcooler's patch and more checks/removed unnecessary code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 16:11:00 +00:00
orbiter
d8d8562c59 fill key with zeros during normalization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 15:40:16 +00:00
orbiter
a131ebbcb5 one more fix for NPE, see
http://forum.yacy-websuche.de/viewtopic.php?p=19010#p19010

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6634 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 11:28:37 +00:00
orbiter
24060885b6 - added Tables abstraction in data.Tables.java
fix for
http://forum.yacy-websuche.de/viewtopic.php?p=18910#p18910
http://forum.yacy-websuche.de/viewtopic.php?p=18894#p18894
http://forum.yacy-websuche.de/viewtopic.php?p=18814#p18814


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-29 18:02:09 +00:00
orbiter
7fdf59a77f misc NPE check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6630 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-29 15:59:24 +00:00
orbiter
a512aef6ad fix for http://forum.yacy-websuche.de/viewtopic.php?p=18918#p18918
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-29 10:33:20 +00:00
lotus
38a3d55afd added more possible php extensions for html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6621 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-24 20:04:31 +00:00
orbiter
4403304957 bugfix for list()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 16:00:56 +00:00
orbiter
3889438db6 fix for bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 14:20:24 +00:00
orbiter
23bcca07a3 removed directly linked servlets that had been there to test memory failures that appeared in those servlets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 13:30:20 +00:00
orbiter
69c29acb6e no exception thread dump if parser cannot parse because that mime-type/extension is in the deny-set
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 13:21:37 +00:00
orbiter
0098e6e859 bugfix for heap iterator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6610 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 10:26:50 +00:00
orbiter
db19a941cf added new image index storage classes (not integrated yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-21 22:12:05 +00:00
orbiter
c8aece34a4 update to yacy/ai (just more testing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-21 22:07:32 +00:00
orbiter
8ce936bcdd added an api recording function: it shall be possible to record
all operations on YaCy in a database that should make it possible
1) to re-create a setting on fresh peers
2) to transmit a setting from one peer to another
3) to re-create crawl starts after a complete deletion of the index
This functionality will also support
4) scheduled re-crawls (new implementation)
To implement this, a new database structure has been created that stores maps into blob heaps. To encode maps the b-encoding technique was used (this is the same encoding that torrent files use)
- added a b-encoder
- enhanced the b-decoder
- added a b-encoded map heap data structure
- added a table organisation based on b-encoded heaps
- added a servlet to maintain such tables (see Tables_p.html)
- integrated the servlet into the Advanced Settings menu
- added an api recording based on the new tables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-21 22:06:03 +00:00
orbiter
56e0d9bd01 - testings with image parser
- added image size as part of parsed text in images
- avoid unnecessary error messages if parsing of documents failed but one succeeded


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6597 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-19 14:59:58 +00:00
orbiter
e80e060ca6 - increased thread priority for server threads
- decreased thread priority for crawler threads

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-19 11:29:22 +00:00
orbiter
234f733a3d - relocation of seed db is better for network switch than re-initialization because of the embedding of the peers object in other objects
- small refactoring of blacklist interface code to remove PMD warnings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-18 00:07:20 +00:00
orbiter
473b11033d fixed network switch process - crawling did not work after a switch before this fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-17 23:33:15 +00:00
orbiter
fd7b348973 some fixes for the network switch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-17 22:07:08 +00:00
orbiter
7d400b17d0 html parser support for .cfm files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6590 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-17 16:29:49 +00:00
orbiter
f6731c6240 more logging etc.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-17 00:41:50 +00:00
orbiter
007f8297de added php3 as extension type for html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-16 15:53:18 +00:00
orbiter
4f1f4863c4 fix for deadlock when initializing a SplitTable with a file of size 0, see also:
http://forum.yacy-websuche.de/viewtopic.php?p=18594#p18594

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-14 23:03:48 +00:00
orbiter
cc5dcf69ff missing change for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-14 14:20:18 +00:00
orbiter
ca1ef9a079 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18584#p18584
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-14 13:38:14 +00:00
orbiter
d9169cc6c3 increased proxy load time-out from 30000 to 60000 milliseconds
according to http://forum.yacy-websuche.de/viewtopic.php?p=17782#p17782

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-14 10:42:05 +00:00
orbiter
938e806182 tried to fix date problem that may have prevented foreign peers from staying in the network
- removed unused code
- removed possibly wrong utc difference correction

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6581 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 20:01:46 +00:00
orbiter
bd05e57d3b fix for http://forum.yacy-websuche.de/viewtopic.php?p=18563#p18563
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 18:41:33 +00:00
orbiter
5df628a2a4 - added BEncoder class
- added BEncodedHeap class that encodes B data structures and stores that to a heap
- refactoring of MapView, this is now named MapHeap to fit into the naming scheme of the BEncodedHeap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 16:21:37 +00:00
orbiter
82f57f79e5 more PMD enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 00:23:07 +00:00
orbiter
5d930c96f0 more fixes to search result page navigation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 00:04:37 +00:00
orbiter
8c520f128d reverted a change in ranking process committed this afternoon
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 20:56:37 +00:00
orbiter
a06f7ddb33 more PMD recommendations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 20:53:19 +00:00
orbiter
eb79ceb3ff update to kelondro data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 15:37:34 +00:00
orbiter
18172451a0 better search computation:
- increased sort limit, now 3000 entries, before: 1000
  this should allow more results to be shown in case
  of strongly limiting constraints, like domain navigation
- enhanced the sort process
- check against domain navigator bugs
- fix in sort stack
- showing now all navigation pages at first search (not only next page)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6569 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 15:01:44 +00:00
orbiter
d126d6c1b5 renamed the servlet WatchCrawler_p to Crawler_p
this was done because that servlet may be used for wget/cronjob
triggered crawl starts and it appears to be confusing that the
name of the crawl start servlet looks like a pure monitoring tool.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 10:05:28 +00:00
orbiter
66c0a8e849 more PMD recommendations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-11 22:18:38 +00:00
orbiter
909a4f91c7 added a logging output for crawl starts that shows the URL that can be used to start the crawl again
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-11 18:10:39 +00:00
orbiter
bc96d74813 - clean-up of robots.txt parser
- added 'yacybot' as key to recognize robots.txt entries for YaCy
- removed unused method to get robots.txt from database

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-11 16:36:30 +00:00
orbiter
2113fcd7e5 - fixed usage of isEmpty() which is not available in java 1.5
- increased visibility of some methods

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6564 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-11 12:33:40 +00:00
orbiter
dd459281c8 applied code changes that are recommended by PMD
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6563 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 23:09:48 +00:00
lotus
eac2daf2e8 * re-enable DHT if enough memory is available again
* reset threshold on reconfiguration
(thanks to sixcooler)

* display status message in web interface

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 19:04:43 +00:00
lotus
0752634b8b log YaCy version on startup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 16:10:11 +00:00
orbiter
d77a8f3b3e added some modifications recommended by PMD for better performance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6560 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 01:40:26 +00:00
orbiter
d1973bae2a code cleanup: removed unused code and unused methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 00:42:49 +00:00
orbiter
a3b8b7b5c5 some redesign of the main menu structure:
- moved all index generation servlets to their own main menu item, including proxy indexing
- removed external index import because this operation is not recommended any more. Joining an index can simply be done by moving the index files from one peer to the other peer; they will be merged automatically
- fix to prevent endless loops when disconnecting http sessions
- fix to prevent application of bad blacklist entries that can cause a 'Dangling meta character' exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 00:10:43 +00:00
lotus
ab3cf60dbe fix for npe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6557 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-09 14:10:27 +00:00
orbiter
7f20963b41 add-on to last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-09 00:17:39 +00:00
orbiter
eeca2ded92 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18500#p18500
- catch uncaught OOM
- less wasting of memory

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-09 00:08:16 +00:00
lotus
32972139af added nice configuration for the resource observer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6554 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-07 17:19:50 +00:00
orbiter
bb2e03761c - fix for deadlock with 100% CPU during search
- fix for failure of ranking because of a ConcurrentModificationException

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-07 12:41:43 +00:00
orbiter
3f771d2a16 fix for rss parser: be lazy when rss is not well-formed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-07 01:02:23 +00:00
orbiter
dff4f95c78 some patches to get the torrent parser working
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-07 00:42:12 +00:00
hermens
574f49903e Prevent blob merge from possibly losing the last container
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-05 01:34:46 +00:00
orbiter
83d05e9176 added sixcooler's hack with some modifications:
http://forum.yacy-websuche.de/viewtopic.php?p=15004#p15004
old index blobs where deletions have been made because of DHT transmission should be melted down to new blobs. This uses sixcooler's methods from the forum thread but modifies the process in such a way that the blobs are not merged with themselves but simply rewritten to smaller files.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6548 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-04 18:12:03 +00:00
orbiter
fbd24c2d84 integrated the torrent parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6547 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-04 16:07:31 +00:00
orbiter
bd32f8b8cb added a torrent metadata file parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6546 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-03 23:26:15 +00:00
orbiter
610e3ffffb Added new classes for the implementation of concurrent greedy algorithms.
These classes can be used to produce an abstract worker process that can be used for common problems in artificial intelligence, such as game playing and problem solving. These classes will be used as an abstraction layer for a new search process in YaCy. They were created while searching for an abstraction of the current search process. It turned out that the abstraction of the YaCy search process is also an abstraction for problems in artificial intelligence, and therefore the classes were designed in such a way that they cover not only the YaCy-specific problem but also the more generic problems in AI. To test the classes, they were used in a ConnectFour implementation (game playing).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-03 22:51:14 +00:00
orbiter
d0b7bf9ca2 added a decoder class for Bencoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-03 22:44:09 +00:00
low012
028657f019 *) adding more SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 13:11:07 +00:00
low012
82d740050f *) adding more SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 12:26:50 +00:00
low012
e04cb8cef0 *) adding more SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 12:16:40 +00:00
low012
dcb1096fb0 *) adding more SVN properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 11:22:32 +00:00
low012
7d610e0063 *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6538 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 11:20:34 +00:00
low012
82198acc06 *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 11:06:49 +00:00
low012
b75547fc60 *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-28 11:05:50 +00:00
lotus
9bee0ac780 more logging for DHTrule
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6533 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-21 14:02:00 +00:00
orbiter
57d729e377 fix for negative numbers in network statistic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6532 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-21 11:36:48 +00:00
orbiter
4ac4fe952c patch for npe in bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-18 10:14:05 +00:00
orbiter
c14233a933 fix for an OOM in MapView that can cause unavailability of
- seed list
- bookmarks
during very low memory configuration

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-15 12:23:37 +00:00
orbiter
d548bd41ad fix for a npe during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-15 12:15:39 +00:00
orbiter
37245430c3 fix for NPE during DHT RWI selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6527 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-15 00:02:10 +00:00
orbiter
959b38b61b fix for memory tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-14 20:23:11 +00:00
orbiter
a37878b7d5 url parser regex performance hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-10 14:40:32 +00:00
orbiter
b527d2ebfa fix for media search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-09 23:47:37 +00:00
orbiter
362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6521 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-09 23:27:26 +00:00
orbiter
8281e29963 - more configuration for profiling graph (number of events)
- more logging for a shutdown: print reason and accessing IP into log


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6520 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-08 14:25:51 +00:00
f1ori
5f0f6b71b4 * revert last commit, something is more broken than before
* UTC timestamps and lastseen-properties still need some debugging


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-06 21:54:32 +00:00
f1ori
8c8b642eba * fix timezone problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6518 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-06 21:03:12 +00:00
lotus
713cb26a27 update for memory observer algorithm
disable dht if memory is less than threshold
after 4 times, maximum 11 minutes between each detection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-06 17:45:48 +00:00
orbiter
4782d2c438 fix for search bug that appeared when looking at page 3 of results or further
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6515 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-03 12:25:03 +00:00
orbiter
29fde9ed49 better control of ranking order in sort stack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-03 00:36:07 +00:00
orbiter
93caa38d55 fix for bug in SortStack (did not appear to shrink according to required size) - caused bad and insufficient search results
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6513 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-03 00:20:53 +00:00
orbiter
66923ebc6c - modified method in RequestHeader that delivers the host name of requester: no more reverse domain lookup (may have killed interface performance in some cases)
- added logging output for shutdown servlet: show ip of requester of the shutdown

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6512 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-02 22:56:28 +00:00
orbiter
e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-02 14:01:19 +00:00
orbiter
4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-02 00:37:59 +00:00
orbiter
f4946eaf27 - better thread dump
- suppressed one server exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6509 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-01 22:53:36 +00:00
orbiter
9743b70d1c disabled keep-alive of server, not really needed for speed but a cause for much trouble and memory occupancy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6508 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-01 19:14:16 +00:00
orbiter
491ba6a1ba - some refactoring in workflow
- some refactoring in search process
- fixed image search for json and rss output
- search navigation on bottom of search result page in cases where there are more than 6 results on page
- fixes for number of displayed documents
- disabled pseudostemming

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-24 11:13:11 +00:00
orbiter
969123385b added json and rss output for image search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6503 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-23 16:10:50 +00:00
orbiter
d183f8d980 refactoring (moved code from ContentTransformer to TemplateEngine)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-20 14:57:00 +00:00
orbiter
23aef43786 - better synchronization in SortStack
- better ThreadGroup organization
- fewer worker threads for media search (64 was too many...)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-20 14:35:33 +00:00
orbiter
7b1f5b0430 - better media search ranking
- better concurrency with enhanced synchronization in sort stack

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6496 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-20 13:19:12 +00:00
orbiter
4df88a4e7a - fixes for missing or bad hashCode computation
- fixes for bad equals() methods that had not been used by hash maps and therefore some classes did not work as objects in hash maps.
- this may also affect some cases where double-checks should have been, but did not work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6495 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-20 12:11:56 +00:00
orbiter
dbdf2570ba added comparator and more fixes for SortStack/SortStore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-20 03:30:48 +00:00
orbiter
d2938c44a1 - added bmp parser to the document parsers
- image parsers that implement the document parser interface return themselves in the list of images of the document, so that the parsed images contribute to the image search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-19 23:22:53 +00:00