Commit Graph

1250 Commits

Author SHA1 Message Date
orbiter
e90afa9483 fixed search access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-04 09:04:47 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internat threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servelet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
f9e6cf6a3d more refactoring of search:
integrated first version of ssi-using search interface,
but the function is currently disabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-28 12:15:46 +00:00
orbiter
f81ef40cc4 no dht activity for small networks; this is not needed if the network is small
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 22:35:26 +00:00
orbiter
d9472b6a3a * fixed problem with watch crawler
* added new column to network table (remote crawl urls):
  the new value for provided URLs will be used for new remote crawl method


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4061 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 22:06:58 +00:00
orbiter
e332b844b2 - enhanced remote search: during waiting time for remote crawls
some urls are fetched so the url cache can be filled with these urls
- the url-prefetch is used to sort out some unresolved urls
- the snippet-fetcher is triggered with the search event id. This is used
  to remove missing snippets from the search cache so they will not be displayed again


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 18:18:35 +00:00
orbiter
a34d9b8609 * added a search history cache that maintains search results for 10 minutes
it is necessary for the new search process that will do automatic re-searches
a positive effect is, that when a re-search is done it can be monitored how many
results had been contributed from other peers. The message for this contribution
was moved from the end of the result page to the top.
* enhanced re-search time when a global search was done an the local index has
already a great number of results for this word
* re-organised presearch computation; must be further enhanced

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 23:12:59 +00:00
orbiter
ae86d010bb more refactoring of search processes; also some small speed enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 08:41:52 +00:00
orbiter
bb426565f0 added new yacy protocol for mass url-pull for better remote crawling distribution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-22 00:59:05 +00:00
orbiter
72752bb503 because of a new database structure handling, the memory need for accessing
collection objects has been reduced to 50%:
- set new memory calculation functions for indexing process
- adjusted guessed memory amount
-> Testing needed:
   try new recommended value (see performanceQueues) and see if OOMs occur.
-> report maximum recommended value, so we can set new default values.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-20 17:36:43 +00:00
orbiter
16c203f759 fixed remote search access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4048 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-16 11:44:18 +00:00
orbiter
344911bfaa shorter minimum delay values for intranet crawl targets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 23:18:12 +00:00
orbiter
f890cc86aa inserted forwarding patch from fuchs
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=233

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 22:25:48 +00:00
orbiter
b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 21:31:31 +00:00
orbiter
947fc46904 refactoring of search process:
- re-designed remote request result processing
- re-designed local result accumulation, will be further enhanced with snippet fetcher
- removed search process handling in switchboad
- made snippet class static (there is no need for multiple snippet objects)
- removed some redundant tasks in server-side search process, should be a little bit faster now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 11:36:59 +00:00
orbiter
5c1b444690 some redesign of min/max and normalization computation during search result ordering
this saves about 1 millisecond for each URL reference, which has some good effect
on the search result computation if a word is searched that appears very often
(speed-up of 1 second and more)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-06 12:50:11 +00:00
orbiter
1af0e3bd84 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-06 00:56:56 +00:00
orbiter
5605887571 refactoring of search processes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-05 23:57:25 +00:00
orbiter
62347b50f4 added security layer for ViewImage:
- images may be requested by localhost and authorized users only, if the request is done using a clear-text URL
- the image may be requested also using a code that can be a license to retrieve a URL for everyone
- some servelets produce URL licenses for ViewImage, like image search results


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4027 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-03 23:06:53 +00:00
orbiter
69d640b041 added missing synchronization in crawl balancer
to avoid that the synchronization is triggered during many-time-used size() operation
a notEmpty method was added that can avoid the synchronization many times

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4025 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-03 12:21:46 +00:00
orbiter
9628db6cdc enhanced memory allocation during database access:
- refactoring of kelondroRecords; this class is now divided into
  kelondroAbstractRecords, kelondroRecords, kelondroCachedRecords, kelondroHandle and kelondroNode
- better abstraction of kelondroNodes, such nodes may now be crated by different classes
- a new Node defining class kelondroEcoRecords defines Nodes that do not need so much allocation and System.arraycopy
- there is less memory transfer on the bus, especially for collection index
- now half of memory needed for web index access


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4024 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-03 11:44:58 +00:00
orbiter
57a5b6fa71 some generalization of remote proxy configuration and setting handling in httpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-02 00:42:37 +00:00
orbiter
367fc28928 corrected Brausse->Brausze
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-01 22:15:51 +00:00
orbiter
e76fe1c078 - replaced unicode characters in copyright holder name ('Brausse')
- more logging for bootstrap seedlist loading
- larger DHT chunks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-31 10:00:17 +00:00
orbiter
8d6aa7a66d replaced detailed search page by ranking definition page (this is what it essentially is)
the ranking definition there will influence the normal web search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-25 14:04:36 +00:00
orbiter
7ff4357184 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=206&hilit=&p=1130#p1130
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4004 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 18:11:30 +00:00
orbiter
9ca46a8c69 indexing of local (intranet) urls enabled
To do this, one must create a separate YaCy network that has a local URL domain
A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 00:46:17 +00:00
orbiter
40b0547611 - documentaton changes (removed old forum links)
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 15:32:10 +00:00
orbiter
dcb8687904 fix to update cycle
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-18 22:45:29 +00:00
orbiter
557f8d80e4 - better logging
- fixed bugs in auto-update

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-18 13:03:20 +00:00
orbiter
f323e1813d added commons.logging again (is used by mimeTypeParser)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3989 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-18 11:47:19 +00:00
michitux
a695c93662 Fix of the Status-page for IE:
- reverted revision 3982 and 3979
- added a real fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=163


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3988 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-17 22:06:06 +00:00
orbiter
b6d9cca67e - fixed problem with yacyVersion and own version generation
- within this context: generalized date format handling
- extended Update interface:
 * a version lookup can be triggered manually
 * a complete lookup + download + re-boot process can be triggered with one click

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 23:47:21 +00:00
orbiter
6071668c3b better error message in case that a mime type cannot be found.
see also http://forum.yacy-websuche.de/viewtopic.php?f=6&t=132&p=587#p587

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3984 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 14:40:19 +00:00
orbiter
03847bebc1 removed unused libs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3971 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 10:34:39 +00:00
orbiter
9da0e53fe8 repaired rss feed reader
- removed old rss parser
- removed unused rss parser libraries
- added new rss reader
- added previously removed FeedReader_p.java and adopted it to new rss parser
- adopted parser interface for rss indexing to new rss parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3970 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 10:07:48 +00:00
orbiter
26ddf797eb added bmp and ico image format to all parser/viewing methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 12:55:41 +00:00
orbiter
5444b07674 fixed bug with decompression of index abstracts
this fixes a problem that occurred when searching for several words

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 12:39:16 +00:00
orbiter
89e1848db6 fixed problem with favicons:
target servers had been able to see search words from the referrer of the favicon fetch.
This has been removed by using the getImage - servlet for favicon fetch.
Since java does not support loading of bmp and ico-Images, such parsers had been added.
The image parser had been coded from their original microsoft documentation.
This influences also the image-search functionality: there can now be a preview
of found bmp-images. Another benefit: favicons for search results are now cached with the HTCACHE.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 01:34:01 +00:00
orbiter
7c5c814a47 - simplified code (removed exception handling where not necessary)
- added confirmation dialog for shutdown and restart

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 14:54:01 +00:00
orbiter
a4e8ad95ab enhancements to news and switchboard queue processing
removed direct access and replaced by iteration

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 13:00:18 +00:00
orbiter
a45216b479 fix to prevent bad-formed news messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 09:41:55 +00:00
orbiter
bec4dbc753 added options and execution methods for automated updates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-12 16:23:33 +00:00
orbiter
208b5297f1 enhanced handling of news records:
result is a speedup of Surftips, Supporter, and Network page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-05 22:56:37 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
3421c64d26 implemented update function:
after downloading a release using the download button on the status page
the user can choose any of the downloaded versions for a update.
this enables also a downgrade to a older version.
when the update button is pushed, yacy terminates, installes the choosen version
and restarts

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-02 15:16:05 +00:00
orbiter
c1aad9e508 added parameter for network graphic background
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3942 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 23:06:13 +00:00
orbiter
1a45ecb356 - fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=14&p=137#p137
- fix for missing restart script in ant built target
- removed some more synchronization for size() operations
- removed blocking statement on search page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 22:06:33 +00:00
orbiter
f1ed91a8e4 added option to allow/disallow DHT transmission during indexing
see also http://forum.yacy.de/viewtopic.php?f=9&t=8

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3933 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 15:25:33 +00:00
orbiter
9bbd39b67c - removed unfinished auto-updater from roland and martin
- added new download-option for releases on the status page
still mising:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 14:52:26 +00:00