Commit Graph

5751 Commits

Author SHA1 Message Date
lotus
aec3e7995a autoconfig.pac can be used to browse .yacy-domains only
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6077 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 19:48:11 +00:00
orbiter
4e825852d2 added stub for phpBB3 search integration guide
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6076 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 11:52:57 +00:00
orbiter
bc6dd8194b refactoring: moved search query class to new search package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 11:49:00 +00:00
orbiter
a4805defdd added stub for new search process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 11:46:23 +00:00
orbiter
b8e738a7be a collection of
- small bug fixes
- better/more comments
- more asserts
- fixed synchronization
- test case enhancements
- code cleanup
- performance hacks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-14 22:09:08 +00:00
apfelmaennchen
39779e4796 DidYouMean: as I moved to only 8 consumer and 4 producer threads, I removed poison pills as it does not make sense anymore - threads are interrupted directly. Having a consumer thread per test case just didn't make sense either (see svn 6070) due to the massive overhead.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-14 16:31:31 +00:00
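The shutdown approach named in this commit message (interrupting consumer threads directly instead of queueing poison pills) can be sketched as follows. This is a minimal illustration, not the actual DidYouMean code: the class name, the method `runConsumers`, and the lower-casing "work" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class InterruptConsumersDemo {

    /**
     * Hypothetical helper: starts consumerCount threads that drain the queue,
     * waits timeoutMillis, then interrupts every consumer directly.
     */
    static List<String> runConsumers(List<String> candidates, int consumerCount, long timeoutMillis)
            throws InterruptedException {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>(candidates);
        final Queue<String> accepted = new ConcurrentLinkedQueue<>();

        List<Thread> consumers = new ArrayList<>();
        for (int i = 0; i < consumerCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String word = queue.take();        // blocks; throws when interrupted
                        accepted.add(word.toLowerCase());  // stands in for the real test case
                    }
                } catch (InterruptedException e) {
                    // interrupted directly: no poison pill needed, the thread simply ends
                }
            });
            t.start();
            consumers.add(t);
        }

        Thread.sleep(timeoutMillis);                   // let the consumers work until the time-out
        for (Thread t : consumers) t.interrupt();      // stop signal for every consumer
        for (Thread t : consumers) t.join();           // all consumers have terminated after this
        return new ArrayList<>(accepted);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> input = java.util.Arrays.asList("Yacy", "Search", "Peer");
        System.out.println(runConsumers(input, 2, 500));
    }
}
```

With a poison pill, one marker object per consumer has to pass through the queue before the threads stop. An interrupt reaches a thread blocked in `take()` immediately, which is presumably why the pills became unnecessary once the thread count was fixed.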
apfelmaennchen
c3c4dd0933 DidYouMean - changed to much simpler LinkedBlockingQueue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-14 15:25:57 +00:00
apfelmaennchen
01ac1b5d7e - blocking queue implementation of DidYouMean
- timeout is set to 500ms

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6070 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-14 11:53:09 +00:00
orbiter
b8bb1bb364 join with a timeout does not stop the corresponding thread after the time-out; it only stops the waiting. Here we additionally need to signal the thread to stop after we have finished waiting.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6069 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 23:54:52 +00:00
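The behaviour described in this commit message can be reproduced with a small sketch. `Thread.join(long)` and `Thread.interrupt()` are standard Java. The `Worker` class and the `joinOrStop` helper are hypothetical names used only for illustration, and using an interrupt as the stop signal is an assumption, since the commit does not say which signal was chosen.

```java
import java.util.concurrent.TimeUnit;

public class JoinWithTimeoutDemo {

    /** Hypothetical worker: keeps looping until it is interrupted. */
    static final class Worker extends Thread {
        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    TimeUnit.MILLISECONDS.sleep(10); // stands in for one unit of work
                } catch (InterruptedException e) {
                    // sleep() clears the interrupt flag; restore it so the loop exits
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /**
     * Waits up to timeoutMillis for the worker. If it is still running afterwards,
     * signals it to stop and waits until it has terminated.
     * Returns true if the worker had to be interrupted.
     */
    static boolean joinOrStop(Thread worker, long timeoutMillis) throws InterruptedException {
        worker.join(timeoutMillis);          // bounds only OUR waiting time
        if (!worker.isAlive()) return false; // worker finished on its own
        worker.interrupt();                  // the additional stop signal
        worker.join();                       // the worker now terminates promptly
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        Worker w = new Worker();
        w.start();
        boolean interrupted = joinOrStop(w, 100);
        System.out.println("had to interrupt: " + interrupted + ", alive: " + w.isAlive());
    }
}
```

Without the `interrupt()` call the worker would keep running after `join(100)` returns, which is the problem the commit message points out.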
orbiter
b69f22e9ca mistake in last commit: computation of loops in ReversingTwoConsecutiveLetters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6068 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 23:37:51 +00:00
orbiter
3130334932 - start first with threads that run more loops
- join first with threads that run fewer loops

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6067 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 23:34:16 +00:00
apfelmaennchen
6cde7ebf16 DidYouMean
- without I/O intensive sorting by count
- but with multiple threads

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6066 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 23:16:14 +00:00
orbiter
f348190566 tried to add a database dump import method to the phpBB3 import function. Reason: imports of large database dumps cannot be handled with phpMyAdmin, and this should be an easy way to load the database dumps into a MySQL database, from where they can be exported again with the phpBB3 content integration adapter. Completion or removal of this function stub will follow before the next main release.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6065 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 23:03:40 +00:00
orbiter
945777aa80 replaced the rwi term counting method by one that computes the maximum of the blobs that contribute to the RWI. Adding up the blob sizes is incorrect and does not reflect the real size. Truncating the size computation to the maximum of all blobs is also incorrect, but not as wrong as the sum of all blob sizes, which double-counts many rwi entries.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6064 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 22:59:54 +00:00
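The reasoning in this commit message can be checked with a toy example. It assumes, purely for illustration, that each blob is a set of term keys and that the same term may occur in several blobs. The class and method names are hypothetical and unrelated to the real RWI code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermCountEstimateDemo {

    /** Over-estimate: terms present in several blobs are counted once per blob. */
    static int sumOfSizes(List<Set<String>> blobs) {
        int sum = 0;
        for (Set<String> blob : blobs) sum += blob.size();
        return sum;
    }

    /** Under-estimate: never larger than the true number of distinct terms. */
    static int maxOfSizes(List<Set<String>> blobs) {
        int max = 0;
        for (Set<String> blob : blobs) max = Math.max(max, blob.size());
        return max;
    }

    /** Exact count; requires looking at every key, which is the expensive part. */
    static int distinctTerms(List<Set<String>> blobs) {
        Set<String> all = new HashSet<String>();
        for (Set<String> blob : blobs) all.addAll(blob);
        return all.size();
    }

    public static void main(String[] args) {
        List<Set<String>> blobs = new ArrayList<Set<String>>();
        blobs.add(new HashSet<String>(Arrays.asList("a", "b", "c")));
        blobs.add(new HashSet<String>(Arrays.asList("b", "c", "d")));
        System.out.println("sum=" + sumOfSizes(blobs)
                + " max=" + maxOfSizes(blobs)
                + " exact=" + distinctTerms(blobs));
    }
}
```

For any input the three values satisfy max <= exact <= sum, so both cheap estimates are wrong in general. Which one is closer depends on how much the blobs overlap.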
apfelmaennchen
303ccda69f small fix for "did you mean"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 11:11:30 +00:00
orbiter
7c4d1d471c hand-over of more specific object
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 10:22:25 +00:00
apfelmaennchen
9150bc0f7d - don't show empty "did you mean"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6061 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 07:02:50 +00:00
apfelmaennchen
6c116be536 - set default &meanCount=5
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 06:49:17 +00:00
apfelmaennchen
09acfa66d1 - improved "did you mean"
- added &meanCount= to query string
- &meanCount=0 ==> no suggestion, no performance loss
- sorting suggestions by sb.indexSegment.termIndex().count()

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 06:20:05 +00:00
apfelmaennchen
da6ce37f7b - fixed encoding problem
- added limit to 10 suggestions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-12 21:36:26 +00:00
apfelmaennchen
54a48b4184 - added "did you mean" to search page
- currently works for single word queries only!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6057 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-12 20:36:03 +00:00
apfelmaennchen
31360ba40c - Updated ConfigLiveSearch.html
- added documentation for load_js and load_css

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-12 05:57:08 +00:00
apfelmaennchen
ab09d8ebb3 - small noscript fix
- noscript is now functional but ugly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 22:10:02 +00:00
apfelmaennchen
55ef9ae12a small fix for last post
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 21:42:34 +00:00
apfelmaennchen
36dc9b09ac - partial update to jquery-1.3.2
- partial update to jquery-ui-1.7.2
- yacyportalsearch fixed sidebar for navigators


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 21:34:39 +00:00
orbiter
550312ac85 added a new command script to do an auto-update from the command line. This will make it easy to do mass auto-updates in private yacy clusters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6052 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 11:31:26 +00:00
orbiter
0fc1168554 - reduced time-out for socket-connection communication from 20 seconds to 5 seconds. This is a test to find out if the time-out was a cause for problems in metager environments
- turned a fine log entry in case of rejected connections on the server socket into a warning. (look for 'exceeding limit')


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 10:20:31 +00:00
orbiter
28b86385cd patch for bad behaving swf parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 09:54:48 +00:00
orbiter
d58b395993 fix for http://forum.yacy-websuche.de/viewtopic.php?p=15693#p15693
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6049 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 09:38:25 +00:00
orbiter
cffef67dc5 added a short info line about the latency monitor
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6048 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 23:03:29 +00:00
orbiter
733385cdd7 enhanced database access times by removal of unnecessary synchronization.
also added more hacks that resulted from high-volume query testing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 23:02:42 +00:00
apfelmaennchen
5a7dec880e - some improvements for: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1904#p15668
- portalsearch: introduced yconf.load_js and yconf.load_css
- yacysearch.html still having problems with focus after sidebar is loaded
- yacysearchtrailer.json seems not to be valid json for ?nav=all

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 22:11:31 +00:00
orbiter
5d7045387b added more word lists and a multi-access search test tool for high-performance query testing:
run searchtestmulti.sh; then 10 concurrent processes fire 1000 requests each to the local peer.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 22:01:48 +00:00
orbiter
398e210fef removed synchronization in logging that causes deadlocks in high-performance environments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 19:17:30 +00:00
orbiter
db3a06dd81 removed cookie handling in httpc:
- no need to do cookie handling in proxy, this was switched off so far
- no need for cookies in crawler, this was switched on (by mistake)
This fix was needed for a case where a web server flooded the crawler with cookies and caused a complete blocking of the httpc.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 16:11:09 +00:00
orbiter
1c54ae4a63 some small changes in HandleMap Testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 15:02:52 +00:00
orbiter
b21e9149f5 another fix for navigation results, the json result format and searches with yacyinteractive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 12:41:15 +00:00
orbiter
15c5406b9c fixed yacyinteractive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 07:24:45 +00:00
orbiter
2c5554c912 small enhancements in search result computation speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-09 15:22:23 +00:00
orbiter
e0b3984805 added navigation keys for site and author facets to remote search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6038 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-09 09:07:52 +00:00
orbiter
27fa6a66ad - completed the author navigation
- removed some unused variables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-08 23:30:12 +00:00
orbiter
a9a8b8d161 - added display of author navigation (usage of that navigator not yet implemented)
- added a synchronization in pdf parser which should help to avoid deadlocks that occur when displaying several search results pointing to pdf sources
- fixed smaller bugs in navigation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-08 22:01:26 +00:00
orbiter
c879783008 added steering of navigator computation:
- by default the navigator computation is off for servlet yacysearch.html, but:
- the servlet is called by default with an option to switch navigator results on
this prevents metasearch users from getting slow results caused by unnecessary computations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-07 22:51:15 +00:00
orbiter
c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes have been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing.
- added an analysis method that counts the bytes that could be saved if the new HandleMap is applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM!
- removed the old FlexTable database structure. It was not used any more.
- removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more)
- added a stub for steering the navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-07 21:48:01 +00:00
orbiter
bead0006da replaced tmp file extensions by prt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 18:09:58 +00:00
orbiter
3189f9cd39 fixed problem with DCEntry initialization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 18:00:50 +00:00
orbiter
a704d82280 patch for problem with digest
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 16:53:16 +00:00
orbiter
3029ef6eb3 fixed a recently introduced bug which caused no idx and gap files to be written.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 16:43:58 +00:00
orbiter
b6e274f211 omit most forced crawl delays by using a separate delay table which flushes delayed URLs at the correct time
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6029 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 16:20:27 +00:00
orbiter
d50be59088 - added an automatic re-construction of the domain stack after 10 minutes. This then adds urls to the domain stack that were left over because of stack size limitations when the domain stack was last created
- changed the busy sleep time for the crawl thread to 30 milliseconds. This is sufficient to crawl with 2000 PPM.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6028 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-06 09:34:44 +00:00