Commit Graph

233 Commits

Author SHA1 Message Date
orbiter
693fa2a157 - renamed Comparison to compare_yacy
- added more search engines
- some refactoring and added a list that is used to present the search engine list in a specific order
- added simpleheader and no-header options
- added the compare search to the simple header
- added default compare search page selection storage - after re-start you get the same default search engines as you selected before

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5157 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-15 09:17:05 +00:00
orbiter
42e2d195ac added hint from http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1294
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5130 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-08 22:37:58 +00:00
orbiter
4fbee21cea - added fetch-ahead again (had been removed in last commit)
- reverted default query mode to verify=false

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 23:50:13 +00:00
orbiter
536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-25 18:11:47 +00:00
orbiter
7989335ed6 Preparations to replace the HTCache with a new storage data structure:
- refactoring of the HTCache (separation of cache entry)
- added new storage class for BLOBs. (not used yet, this is half-way to a new structure)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-19 14:10:40 +00:00
orbiter
05c26d58d9 fixed missing remove operation in balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-11 12:03:18 +00:00
orbiter
080cda97ef added another peer selection rule:
- select also non-robinson (dht-) peers if their peer tags match with search words
- the peer tag '*' can now act as catch-all rule: shall be selected always

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 23:04:32 +00:00
orbiter
40e1b989ea new release cycle
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4891 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 21:30:22 +00:00
orbiter
c10eaf9bdb - fix for pop-up page upon first start
- added comments in opensearchdescription to explain fast mode
- release 0.59

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4890 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 20:46:31 +00:00
orbiter
b21598bdd0 - enhanced handling of own IP address inside seed
- prevention of false information of own IP address
- enabled searching before an own IP address is assigned (before first ping happened)
- removed warning about limited search function
- added better time-out settings for peer-ping process (10 seconds complete, 5 seconds for back-ping)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 11:01:20 +00:00
orbiter
9bef20b537 - added cleanup for unused server loggings: they are removed after the client has not been seen for one hour
- removed configBasic popup trigger when no password is set

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4875 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-02 21:49:59 +00:00
f1ori
8b1a7465c1 * targets for source-distribution and linux package generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4870 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-31 14:57:17 +00:00
orbiter
11e00a0849 - refactoring of seedURL handling
- additional check for seedURL pointing to localhost: deny such peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4851 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 18:35:38 +00:00
orbiter
01b3e9431a - fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1140&p=7626#p7626
- less dots for ppm bar in watchcrawler (one dot for each 10 ppm)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4846 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-24 11:30:16 +00:00
orbiter
c1d721dd2d fix for attacks on localhost-authorized peers from web pages with links to localhost addresses:
checking of referer in access

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4828 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-19 22:17:53 +00:00
orbiter
cafce41d8f temporarily set default login behavior to not-login from localhost without password
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4810 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-16 19:55:17 +00:00
orbiter
2f29ab8779 more target server access security
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4809 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-16 19:50:28 +00:00
orbiter
483e9a2066 - shifted tld recognition methods from yacyURL to serverDomains
- changed isLocal Property in such a way that it is possible to see if a domain is in the internet (and not intranet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-30 23:06:42 +00:00
orbiter
d7e89c2aca fixed near-deadlock situation when deleting crawl profiles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4721 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-20 22:10:26 +00:00
orbiter
6d1be66822 - longer refresh rate for reload of WatchCrawler page forwarding to indexing start (does not work in IE)
- better names for search pages
- Release 0.58

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4719 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-20 08:10:52 +00:00
orbiter
c270d02176 Reverting SVN 4716
Accidentally setting a 1.6 target means that, with automatic release generation and updates for users
who only have Java 5 installed (which is the case for all Mac users), large parts of the network can crash and have to be
brought back up manually.
New users who start with a dev release cannot start a web search at all with the intranet setting.
Please always check what you have checked in after a commit.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4717 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-19 09:58:36 +00:00
danielr
48ffd61e6a changed "patched wrong" to warning, so it goes to the logfile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-19 07:54:44 +00:00
orbiter
6155f0e634 last small changes until main release
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4699 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-14 07:26:33 +00:00
orbiter
117ae78001 speed enhancement for reading of eco-table indexes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-06 11:50:15 +00:00
orbiter
764a40e37d speed enhancements for crawler and url retrieval (affects also search speed)
- concurrency for LURL-fetching: this can be done using a concurrent lookup into the separated url databases. Concurrency is possible because there is no IO during lookup. The more LURL-Tables are present, the better the speedup. More CPUs will increase speed
- because a large number of LURL-lookups are made during crawling (for double-check), the LURL-lookup speed enhancement also enhances crawling speed
- search speed also profits from LURL-lookup enhancement
- changed some flushing parameters in word index caching which should make better use of large word index caches and should speed up indexing
- removed flush chunksize parameter, because this was only useful for IO path enhancement feature which was removed some weeks ago to prevent blocking and deadlocks during search requests

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-31 15:41:19 +00:00
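The concurrent LURL lookup described in 764a40e37d can be sketched roughly as follows. This is only an illustration in Python, not YaCy's Java code: the tables are plain in-memory dicts, and the names `lookup`, `tables` and `url_hash` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup(tables, url_hash):
    """Query every table concurrently; return the first entry found, else None.

    Each table stands in for one separated url database whose index is
    held in memory, so a lookup does not block on IO.
    """
    workers = max(1, len(tables))  # the executor requires at least one worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in table order once each lookup has finished
        for entry in pool.map(lambda table: table.get(url_hash), tables):
            if entry is not None:
                return entry
    return None


# Three hypothetical url tables, each holding a disjoint part of the data
tables = [
    {"hashA": "http://example.org/a"},
    {"hashB": "http://example.org/b"},
    {"hashC": "http://example.org/c"},
]
found = lookup(tables, "hashB")
missing = lookup(tables, "hashZ")
```

A crawler double-check would call such a lookup once per candidate URL, which is why a faster lookup also shortens crawling.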
orbiter
968c775025 - preparation of parsing/indexing queue for concurrent execution
- remote crawl receipts are now transmitted concurrently in separate threads (makes remote crawls much faster!)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 22:43:38 +00:00
orbiter
f3996e63b8 tried to fix more deadlocks:
- changed connection modes in ftpc
- replaced sort thread pool in row collections by new one using util.concurrent. the old pool had caused blocking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 11:23:43 +00:00
orbiter
b4ed937f1e - modified zone navigation (still does not work correctly)
- added dht switch in network definition
- 0.574

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4550 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-11 11:09:38 +00:00
orbiter
c48e25d784 - fixed selection box for topwords
- fixed parser detail in condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4509 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 19:00:11 +00:00
orbiter
f4c73d8c68 - fixed highslide usage
- some enhancement to index management, better types

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-19 14:13:35 +00:00
orbiter
ebe07b736c added stub for new search page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 23:32:19 +00:00
orbiter
efd5807a7c - some renaming of variables to support DC
- initial 120mb RAM for fresh peers
- release 0.57

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4445 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 22:58:40 +00:00
orbiter
3c7b94c119 - fix for online caution delay settings, see
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=738&p=4723#p4723
- removed remote search limitation for non-dht-peers according to discussion in
  http://forum.yacy-websuche.de/viewtopic.php?f=15&t=793&hilit=&p=5277#p5277

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 20:11:50 +00:00
orbiter
7404256997 - no more search time-out!
- fixed a bug with last commit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4430 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-02 23:53:39 +00:00
orbiter
acf771d5e1 - fixed bug with too much RAM in crawler queue
- fixed dir bug
- better calculation of TF for join
- better waiting-on-result logic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-31 23:40:47 +00:00
orbiter
4a80902081 - added ViewProfile as rdf in foaf syntax
- added link to rdf and vCard version on html page
- can be seen on http://localhost:8080/ViewProfile.html?hash=localhash
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 18:21:08 +00:00
orbiter
94f21d9403 activated new kelondroEcoTable file structure.
This data structure replaces almost all files in the PLASMA directory
also the collection.index and the LURL-db will be created as Eco-DB, if it does not exist before
existing Flex-databases will be used as they are (no data is lost)
If you want to force the creation of an Eco-collection.index, simply delete the old index.
The Eco file system will only be used if there is enough memory.
The collection.index RAM limit is 200MB; if you have less, a flex-Table is created.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 21:48:08 +00:00
orbiter
71bcf02d3a - removed pro-version (is the same as standard version, use the standard instead)
- changed yacy logo
- removed crawlOrder protocol (unused)
- removed file index in kelondroFlex (will not work, it takes too long to maintain)
- fixed remote crawl for clusters (now denies remote crawls from peers outside cluster)
- 0.562

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 23:05:52 +00:00
orbiter
4dc438f7e7 moved to Java 1.5:
- changed build script to use java 1.5 compiler
- first step to resolve missing generics definition (about 400 from over 4100 'missing'-warnings)
- added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet)

please set your development environment to use java 1.5!


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 17:56:59 +00:00
orbiter
d8d955a7ff start of next development cycle towards 0.57
happy xmas!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4291 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-22 03:29:00 +00:00
orbiter
db0d3d5e54 release 0.56 (and some last fixes)
- fixed bad peer hash computation in case no peer list is available upon first startup
- security minimum waiting time in search result preparation
- removed dead superseed link

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-22 02:58:38 +00:00
orbiter
52dd015218 new release strategy: the standard release is now built the same way as the pro release
a new release type was added: 'embedded' which is the same as the current standard release was
this will not have any effect on the next release 0.56, which will still be a pro-release on public download
the transition to the new release strategy must be done now to enable automatic update by the updater in future releases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4287 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 02:46:41 +00:00
orbiter
515e1bde6d - fixed bug with constraint default
- 0.556
- default RAM for pro releases now 120MB (because pro will become default)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4258 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-08 02:04:50 +00:00
orbiter
89b9b2b02a redesigned remote crawl process:
- instead of pushing urls to other peers, the urls are actively pulled
  by the peer that wants to do a remote crawl
- the remote crawl push process had been removed
- a process that adds urls from remote peers had been added
- the server-side interface for providing 'limit'-urls exists since 0.55 and works with this version
- the list-interface had been removed
- servlets using the list-interface had been removed (this implementation did not properly manage double-check)
- changes in configuration file to support new pull-process
- fixed a bug in crawl balancer (status was not saved/closed properly)
- the yacy/urls-protocol was extended to support different networks/clusters
- many interface adaptations to new stack counters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:07:37 +00:00
orbiter
2fcd18a972 - fixed bad behaviour of search event worker processes
- fixed export of url lists in xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 01:08:16 +00:00
orbiter
55da871211 preparations for better ranking: better debugging of index properties
to do this, the index administration interface was extended.
It is now possible to select parts of an index.
See properties shown in interface after a word search for details.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-15 03:03:18 +00:00
orbiter
6eaa5a0e64 enhanced local search speed. The ranking process is now 6 times faster than before.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-07 22:38:09 +00:00
orbiter
4ce25b3661 - documentation update
- start of new development cycle.
in case you don't know: commits until 0.553 will not automatically be used by the auto-update function


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4146 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 20:43:52 +00:00
orbiter
1a0f89d7e8 release 0.55
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4145 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 20:12:08 +00:00
orbiter
3e60ae93b9 modified remote search snippet fetch behavior: do not fetch snippets for more than 300 milliseconds, even if the snippets can be found locally without online fetch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4137 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 16:42:11 +00:00
orbiter
b183bf6f42 - fixed opensearch bugs
- added 'full domain' button to expert crawl start
- removed non-working 'only one domain' button, the regex allowed crawling of other domains

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:43:05 +00:00
orbiter
842308ea97 - redesigned crawl start menu, integrated monitoring pages
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database
- extended crawl profile edit servlet, shows now also terminated crawls
- option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues!
- fixed problems here and there with indexing queues
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown

attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:21:31 +00:00
orbiter
4275727d69 fix for peer ping problem (implemented a 3-time re-ping); cause for 'Connection reset' still unknown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-12 00:42:53 +00:00
orbiter
f4a5c287fe re-implemented post-ranking of search results
(should enhance search result quality)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4080 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-08 11:50:19 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internal threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servlet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
5c1b444690 some redesign of min/max and normalization computation during search result ordering
this saves about 1 millisecond for each URL reference, which has some good effect
on the search result computation if a word is searched that appears very often
(speed-up of 1 second and more)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-06 12:50:11 +00:00
orbiter
57a5b6fa71 some generalization of remote proxy configuration and setting handling in httpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-02 00:42:37 +00:00
orbiter
ea960c2b61 release 0.54
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4021 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-01 23:22:21 +00:00
orbiter
9ca46a8c69 indexing of local (intranet) urls enabled
To do this, one must create a separate YaCy network that has a local URL domain
A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 00:46:17 +00:00
orbiter
b6d9cca67e - fixed problem with yacyVersion and own version generation
- within this context: generalized date format handling
- extended Update interface:
 * a version lookup can be triggered manually
 * a complete lookup + download + re-boot process can be triggered with one click

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 23:47:21 +00:00
orbiter
527b3decde - re-structuring of configuration menus
- added new system update configuration page
- moved system update from status page to system update page
- moved shutdown and restart from status page to main menu
- added new configuration properties to yacy.init (not yet actively used)
- added some methods to handle new automatic update process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3958 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-10 23:56:25 +00:00
orbiter
3421c64d26 implemented update function:
after downloading a release using the download button on the status page
the user can choose any of the downloaded versions for an update.
this also enables a downgrade to an older version.
when the update button is pushed, yacy terminates, installs the chosen version
and restarts

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-02 15:16:05 +00:00
orbiter
1fa4feb8e6 added restart button. should work on linux and mac, but was only tested on mac
should of course work on windows as before

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3934 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 20:36:15 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
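The chunked transfer mentioned in 1782ef57e5 follows the standard HTTP/1.1 framing: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. A minimal Python sketch of that framing is shown here; the function name `encode_chunked` and the sample page fragments are hypothetical and not taken from YaCy.

```python
def encode_chunked(parts):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    body = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would end the body prematurely
        size_line = f"{len(part):X}".encode("ascii")  # chunk size in hex
        body += size_line + b"\r\n" + part + b"\r\n"
    body += b"0\r\n\r\n"  # terminating chunk, no trailers
    return bytes(body)


# Two fragments, e.g. a page head followed by an included SSI fragment
framed = encode_chunked([b"<html>", b"included fragment"])
```

Because every chunk carries its own length, a server can send each part as soon as it is ready, which is what lets a browser render a partly delivered page.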
orbiter
6b4cfbd2d6 new network bootstrapping method
- no more contact to yacy.net (no remote superseed any more)
- moved superseed file into new network unit definition
- fixed build; includes new network bootstrapping files now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3922 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-25 14:43:57 +00:00
rramthun
4c7644b713 *) Updater update to r4 with first changes toward new naming scheme. For the moment consider only pro-->pro upgrades as stable.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3915 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-18 19:56:07 +00:00
orbiter
5135fefcef - release 0.53
- documentation update
- set default search time back to 6 seconds

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3914 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-18 13:04:32 +00:00
orbiter
6518bb6c08 changed release strategy:
we will provide two different releases in the future, one standard release and one 'pro'-release.
the 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version by these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version is initialized, its configuration stays independent of an existing libx folder.
The ant targets had been changed. There are now 3 different targets to create standard and pro-releases, and one target to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases had been removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-16 14:11:52 +00:00
orbiter
71fd972ac0 - reduced default search time
- caught case when web structure cannot be painted because of too little data
- better logging when balance fails


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3892 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 15:21:01 +00:00
orbiter
684ded0e09 added new news types
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3876 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-12 15:15:24 +00:00
orbiter
71b0206935 - shifted control queue monitor pages to crawl monitor
- the crawl start menu is now cleaned up and ready for more options

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 12:25:52 +00:00
orbiter
33ad0c8246 added a web structure computation and logging:
- all web page parsing operations will now increase a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using a xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 08:13:48 +00:00
rramthun
23f6150a5b *) Updater now checks for updates even while waiting for user to approve update
*) Changed YaCy version back to dev scheme

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3718 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-13 12:51:03 +00:00
orbiter
578c2ef130 release 0.52
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3715 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-11 22:12:29 +00:00
orbiter
03032f7d62 more tips
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 12:02:11 +00:00
orbiter
111ba9e359 - fixed some width problems in new status page
- fixed deadlock in dns cache
- added termination security for DHT peer selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-05 23:18:00 +00:00
orbiter
f8de19fb2f robinson cluster: added client-side protocol implementation
- the network configuration page shows a new option: robinson clusters
- when a global search is made, all robinson peers are excluded, but:
- robinson peers/clusters that provide peer tags and where search words match
  such tags, they are included in global search. Therefore, robinson peers/clusters
  support the global yacy network with their indexes, without doing DHT-exchange


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-26 09:51:51 +00:00
orbiter
ba525ebf52 - re-enabled path optimization that was disabled during testing
- re-implemented index load/extend optimization that was removed from kelondroFlexTable,
  this is now part of kelondroIntBytesIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 14:55:19 +00:00
orbiter
a922d9444d fixed search page (there had been some unresolved patterns)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-11 14:46:54 +00:00
orbiter
356033aceb fixed bug with continuous reset of balancer file index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 12:36:24 +00:00
orbiter
2cb16824e3 removed support for old database structures.
The new collection index will be more generalized to support other indexes
i.e. YBR block-rank computation. A clean-up of the many conditions to support
the old database was necessary.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3506 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 15:35:35 +00:00
orbiter
3688ec33e5 release 0.51
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3501 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 14:00:17 +00:00
orbiter
4783a30910 - fixed a flush problem in balancer
- return to idle divisor in RWI RAM cache flush

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 15:16:26 +00:00
orbiter
1cba31de43 redesigned ram organization for database caches
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced old performance memory monitor by new one
- added supervision methods as static functions into the classes that provide cache functionality
- steering of ram allocation is done with two simple limits that are ram availability-relative


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 22:43:32 +00:00
orbiter
304412a049 first generation of collection index R/W head path optimization
- collections are now handed over as collection lists to collection index for merge operations
- collection index lists are separated into 'new' and 'extend' lists
- lists are written separately
- write operations are done into array sets and array indexes. These are now serialized
- write operations into index files are sorted by index;
  that means that a R/W head does not need to go forward
  and backward, only forward
More enhancements are possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 15:49:23 +00:00
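The index-sorted write order from 304412a049 can be illustrated with a small Python sketch. The helper names `plan_writes` and `head_travel` and the sample indexes are hypothetical; the sketch only models the distance a read/write head would cover, it does not touch a real file.

```python
def plan_writes(pending):
    """Order pending (record_index, payload) writes by ascending index."""
    return sorted(pending, key=lambda item: item[0])


def head_travel(order):
    """Sum the distance a head moves when visiting indexes in this order.

    The head is assumed to start at index 0; distance is counted in records.
    """
    position, travel = 0, 0
    for index, _payload in order:
        travel += abs(index - position)
        position = index
    return travel


# Writes as they might arrive from a merge, in arbitrary order
pending = [(40, b"d"), (3, b"a"), (25, b"c"), (7, b"b")]
ordered = plan_writes(pending)

unsorted_cost = head_travel(pending)  # moves forward and backward
sorted_cost = head_travel(ordered)    # moves forward only
```

With sorted writes the total travel equals the highest index written, since the head never has to move backward.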
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
enhanced database seek-time by avoiding write operations at distant
positions of a database file. until now, a USEDC counter was written
at the head-section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records that are contained in the database. Now, this
value is computed from the database file size. This is either done
only once at start-time, or continuously when run with asserts enabled.
The counter is then updated only in RAM, and written at close of the
file. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number it is strong
evidence that YaCy was not shut down properly.
To preserve consistency, the complete storage-routine had to be re-written.
Another change speeds up the reading of nodes in cases where the data-tail
can be read together with the data-head. This saves another IO lookup during
each DB node fetch.
Also includes many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP
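
A rough sketch of how a record count can be derived from the file size and compared with a stored counter. The real kelondroRecords layout is not shown here; `HEADER_SIZE`, `RECORD_SIZE` and both function names are hypothetical and assume a fixed-size head-section followed by fixed-size records.

```python
HEADER_SIZE = 16   # hypothetical size of the head-section in bytes
RECORD_SIZE = 32   # hypothetical size of one record in bytes


def record_count_from_size(file_size):
    """Compute the number of records from the total file size."""
    payload = file_size - HEADER_SIZE
    if payload < 0 or payload % RECORD_SIZE != 0:
        raise ValueError("file size does not match the record layout")
    return payload // RECORD_SIZE


def was_shut_down_cleanly(stored_count, file_size):
    """A stored counter that disagrees with the file size hints at an unclean shutdown."""
    return stored_count == record_count_from_size(file_size)
```

With this arrangement the counter never has to be written during normal operation; it is only persisted on close and can always be recomputed.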

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
orbiter
1f1f398bfa enhanced speed of RAM cache flush by factor 20 (twenty times faster)
- the speed was doubled by avoiding read access during the dump
- the speed was increased further by at least a factor of 10
   by flushing the structures to a temporary RAM file first,
   which is then dumped as a single byte chunk to the file system.
The speed enhancements also affect some other parts of the database.
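
The temporary RAM-file approach can be sketched as follows. This is an illustration only: the entry encoding (length-prefixed key and value) and the name `flush_via_ram_buffer` are assumptions, not the format YaCy uses.

```python
import io


def flush_via_ram_buffer(entries, out):
    """Serialize all entries into memory, then hand them to `out` in one write.

    `entries` is an iterable of (key, value) byte strings.
    Returns the number of bytes written.
    """
    buffer = io.BytesIO()                          # the temporary "RAM file"
    for key, value in entries:
        buffer.write(len(key).to_bytes(2, "big"))  # hypothetical 2-byte key length
        buffer.write(key)
        buffer.write(len(value).to_bytes(4, "big"))  # hypothetical 4-byte value length
        buffer.write(value)
    chunk = buffer.getvalue()
    out.write(chunk)                               # single write of the whole chunk
    return len(chunk)


# Usage with an in-memory sink standing in for the dump file.
out = io.BytesIO()
n = flush_via_ram_buffer([(b"ab", b"xyz"), (b"c", b"")], out)
```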

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 23:21:46 +00:00
orbiter
b2f4087400 redesign of last-seen field inside seed:
the field now contains a time in UTC (instead of a time relative to the local UTC offset)
this fixes a bug in peer selection, where an iteration over all seeds
ordered by lastseen did not work correctly.
Problems may occur because the new meaning of this field may mix with
the different meaning of that field in older peers
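
To illustrate why ordering by last-seen breaks when each peer reports local time, here is a small sketch. The peer names, timestamps and the `to_utc` helper are made up for the example and do not reflect the actual seed format.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time, utc_offset_hours):
    """Convert a naive local wall-clock time plus an offset into an aware UTC time."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=tz).astimezone(timezone.utc)


# (peer name, last-seen in the peer's local time, peer's offset in hours)
seeds = [
    ("peerA", datetime(2007, 2, 2, 12, 0), +9),  # 03:00 UTC
    ("peerB", datetime(2007, 2, 2, 10, 0), -5),  # 15:00 UTC
]

# Ordering by the raw local values puts peerB first, which is wrong.
by_local = [s[0] for s in sorted(seeds, key=lambda s: s[1])]

# Ordering by normalized UTC values gives the true chronological order.
by_utc = [s[0] for s in sorted(seeds, key=lambda s: to_utc(s[1], s[2]))]
```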

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 23:54:27 +00:00
orbiter
b123a404b0 added mime types
added peer name in search statistics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-27 23:24:18 +00:00
rramthun
cf49d5b0a7 Version switch to 0.501 by /me as Orbiter is at 23C3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-27 21:32:14 +00:00
orbiter
9b726ac366 release 0.50
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 04:26:05 +00:00
orbiter
0a050bc043 enhanced ranking
- redesign of data storage in plasmaSearchRankingProfile
- profiles are extended by new ranking parameters
- new RWI ranking parameters are considered during ranking
- appearance attributes (i.e. emphasised text) are now considered
- faster ranking
- some attributes that had been checked during post-ranking can now be
  checked during pre-ranking phase
- removed old ranking parameter on index.html page (will be replaced by profiles in the future)
- ranking can now consider appearances of media content
- snippet-loading for media types now works correctly (fetches only from the wanted media)
- ranking-profiles can be handed over to remote peers and are applied there as well
- re-search of same query with different domain now also re-triggers remote search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-20 15:44:29 +00:00
orbiter
c500178fd7 redesign of index creation interface
- the input remains in the IndexCreation menu point
- after pressing the submit button, the IndexingMonitor is called
- the code for creation of new indexing starts was moved to the indexingMonitor
- Existing crawl profiles can be monitored in the Indexing Monitor
- the code for creation of crawl profile data was shifted from indexing start to indexing monitor
- existing crawl profiles can be deleted on the crawl monitor page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 02:56:32 +00:00
orbiter
7ff86d6ba6 - image search now shows thumbnails (in bad order, but it works)
- repaired DHT selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3081 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-14 02:48:37 +00:00
orbiter
10d888e70c - added a media search for images, audio, video and applications
- new search options on search page
- new option in ViewInfo to display all links of a file
- enhanced collection data structure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-07 02:40:57 +00:00
orbiter
052f28312a removed assortments from indexing data structures
removed options to switch on assortments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-02 19:34:59 +00:00
orbiter
2372b4fe0c release 0.49
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-02 01:57:49 +00:00
orbiter
f8efb3c948 fixed a null pointer exception problem reported in the forum.
I can't find the forum entry any more because my girlfriend switched
off the power while the forum window was open.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-01 22:36:32 +00:00
orbiter
30888e7a2f implementation of search constraints
Such constraints may formulate specific restrictions to web searches
This is implemented by scraping information for constraints from a web
page during parsing, and storing flags to the pages within the web index.

In this first step, only information for index pages ("index of", directory listings)
is scraped and stored in flags
- added new flag class kelondroBitfield
- added scraper method in condenser
- added bitfield structure for all scrape types (see also condenser)
- added bitfield structure for appearance locations (see RWIEntry)
- added handover protocol for remote search and index distribution
- extended kelondroColumn class to hold bitfield types
- added another search attribute on search page (index.html)
- extended search-filter to enable filtering of non-matching constraints
- set all new database types to be default
- refactoring: moved word hash generation to condenser class
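
The flag-based filtering described above might look roughly like this. It is a sketch and not the kelondroBitfield class itself: the `Bitfield` API, the flag position `FLAG_INDEX_OF` and the `satisfies` check are all assumptions chosen for illustration.

```python
class Bitfield:
    """A small bit set backed by a single integer."""

    def __init__(self, bits=0):
        self.bits = bits

    def set(self, pos, value=True):
        """Set or clear the flag at bit position `pos`."""
        if value:
            self.bits |= 1 << pos
        else:
            self.bits &= ~(1 << pos)

    def get(self, pos):
        return bool(self.bits & (1 << pos))

    def satisfies(self, constraint):
        """True if every flag required by `constraint` is also set here."""
        return self.bits & constraint.bits == constraint.bits


FLAG_INDEX_OF = 0  # hypothetical position of the "index of" page flag

# A page flagged during parsing, and a search constraint asking for that flag.
page = Bitfield()
page.set(FLAG_INDEX_OF)
want = Bitfield()
want.set(FLAG_INDEX_OF)
```

A search filter would then drop every result whose stored flags do not satisfy the requested constraint.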

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 02:16:30 +00:00
orbiter
09bcc10344 bugfix for some problems of last change with assortments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-19 23:10:58 +00:00