Commit Graph

4265 Commits

Author SHA1 Message Date
low012
afa708d552 *) added <s>...</s> tag to WikiCode -> works just as the HTML equivalent
*) code changes (PMD) without functional changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7193 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 12:57:07 +00:00
orbiter
a83186ac7d fix for bug in cytrails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 10:32:40 +00:00
orbiter
48c0d508ac fixes for crawling of smb links (file length not always available)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-25 22:32:26 +00:00
orbiter
0bc6284e27 - added bugfix for access tracker in case of concurrency conflicts
- added missing entry for new icu4j path in Mac App

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7188 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-25 21:10:50 +00:00
orbiter
10a9cb1971 simplified snippet computation process and separated the algorithm into two classes
also enhances selection criteria for best snippet line computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-22 20:50:02 +00:00
lotus
4450c240b7 npe fix http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2982
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7181 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-22 20:24:07 +00:00
orbiter
84a023cbc8 fixed several search bugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-21 21:48:42 +00:00
orbiter
97ee278931 enhanced search speed:
- better control of number of running search threads
- no time-out waiting time when no ranking feeding takes place
- local search queries by a remote peer may be up to 300 milliseconds faster
- a local search may even be faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 13:17:25 +00:00
orbiter
ee3820c9cc more logging for strange "java.lang.NoClassDefFoundError: de/anomic/http/server/RequestHeader" error
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7175 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 11:01:44 +00:00
orbiter
377f001e0d sorting of crawl profile names in crawl profile editor, see
http://forum.yacy-websuche.de/viewtopic.php?p=20851#p20851

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 09:09:38 +00:00
orbiter
3552476fbe completed migration from apache httpclient-3.1 to 4.1:
- removed the library
- added two classes from the httpclient-3.1 library as source code to YaCy because these classes were used by the YaCy HTTP Server
- modified the added classes ChunkedInputStream and ContentLengthInputStream in such a way that:
 * there are no more dependencies to httpclient-3.1
 * these classes are simplified to serve only the purpose of the YaCy httpd

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7171 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 08:36:48 +00:00
orbiter
a2f9974745 some redesign in the access tracker to address sixcooler's question about a "smarter way for deleting the first Object":
- not so much abstraction for a collection, makes use of remove() (no operands) possible
- different way to delete elements in track (destructive, not constructive (less copies of elements in new queue))
- more abstraction for class api since no static class must be used any more

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7169 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 23:00:24 +00:00
sixcooler
03f0414025 some minor correction of my last commit
sorry for the noise

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7168 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 20:57:25 +00:00
sixcooler
42fa0eadb1 fix endless loop:
Collection does not support remove(int)
(isn't there a smarter way for deleting the first Object?)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7167 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 20:41:44 +00:00
low012
5a9ea0308f *) further simplification of wiki code parser (less redundancy in code, less magic numbers), still not done with it...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-18 11:40:37 +00:00
orbiter
37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification
- added link to yacystats peer when peer is more than one day old

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7164 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-17 10:20:04 +00:00
orbiter
29fe401f93 - some layout and text enhancement for site crawl start
- Quix0rs patch from http://forum.yacy-websuche.de/viewtopic.php?p=20839#p20839 (parts)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 23:00:07 +00:00
orbiter
461a2a6ec7 enhanced remote crawling:
- 300 ppm is default now (but this is switched off by default; if you switch it on you may want more traffic?)
- better timing for busy queue
- better amount of remote url retrieval
- better time-out values
- better tracking of availability of remote crawl urls
- more logging for result of receipt sending

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 09:34:17 +00:00
orbiter
670ba4d52b - removed the remote crawl option from the network configuration submenu and
- added a remote crawl menu item to the index create menu. This menu also shows a list of peers that provide remote crawl urls
- set remote crawl option by default to off. This option may be important but it also confuses first-time users


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 00:39:05 +00:00
orbiter
89c2d8b81e better initial hash computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7157 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 22:11:52 +00:00
orbiter
34e2f7f487 enhanced snippet fetch strategy: concurrent snippet fetch even for offline-snippet searches. This improves speed since it is now possible to fetch snippets offline and parsing of source files from the htcache can be enhanced using concurrency. This improves local and remote search.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7156 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 21:09:14 +00:00
orbiter
0cf006865e refactoring and enhanced concurrency
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 11:38:03 +00:00
orbiter
83ac07874f - corrected return value of put() methods (not used anywhere, so it did not harm before)
- added use of LookAheadIterator which should prevent mistakes when coding iterators with embedded iterators
- added a fail-safe reaction in case of database corruption using iterators over database elements (no interruption then)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 10:43:14 +00:00
orbiter
5702419194 fixed a bug in HTTPClient: keep-alive must be set to false, otherwise servers hold connections 2 seconds open until response.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 22:25:35 +00:00
orbiter
5870b13f3a - code cleanup / added debug line for further investigation in HTTPDemon.parseMultipart
- changed data structure for sorting in search which performs better in that specific case (too many updates)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7150 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 21:03:50 +00:00
orbiter
ac1c08924e more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 15:27:27 +00:00
orbiter
14c843d364 more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7148 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 15:00:34 +00:00
orbiter
39f409a7bb performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7147 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 14:32:24 +00:00
orbiter
7ebef56add - redesign of a part of the remote search client to make it possible to have a test environment for remote search performance tests
- added a remote search test main method in yacyClient

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7146 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 13:35:47 +00:00
orbiter
3c0e07ba72 removed all delays in shutdown process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7143 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 09:13:28 +00:00
orbiter
64860dc1bb enhanced search event logging (to be used for further improvements)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-13 09:33:04 +00:00
sixcooler
17eebd4ef8 counting crawler traffic again:
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2808

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7138 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-11 15:58:15 +00:00
orbiter
32f73d1aaa added copy for Info.plist for Mac application release updates (this file contains class paths and start parameters)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-10 09:48:09 +00:00
orbiter
4c21d8dc9d - changed default values for online caution (the pausing may not be necessary any more)
- fixed bug in WeakPriorityBlockingQueue
- show favicon faster using pre-loading (same technique as used for fast image search)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7130 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-09 23:25:19 +00:00
orbiter
570ca577c6 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-09 22:42:54 +00:00
orbiter
348dece62f redesign of the SortStack and SortStore classes:
created a WeakPriorityBlockingQueue as special implementation
of a PriorityBlockingQueue with a weak object binding.
- better abstraction of ordering technique
- fixed some bugs according to result numbering (distinguish different counters in Queue)
- fixed an ordering bug in post-ranking (ordering was decreased instead of increased)
- reversed ordering numbering using a reversed ordering. The higher the ranking number the better (now).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-09 15:30:25 +00:00
orbiter
114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
- removed old parser
- removed old importer framework (was only used by removed old parser)
- added a new sitemap parser in parser framework
- linked new parser with parser access in old sitemap processing routines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-08 14:13:15 +00:00
lotus
6a09f1f7e5 fix dedicated upnp testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-07 18:17:23 +00:00
orbiter
5fe828fa06 - replaced pdfbox and fontbox version 1.1.0 with 1.2.1
- added some clear statements that shall clear static cache size within the pdfbox library
- the pdfbox library contains a memory leak; it is unsafe to run a peer with pdf parser permanently on.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-07 17:13:47 +00:00
orbiter
c757a4aa9f - corrected lifetime computation for search events
- made search event cache cleanup concurrent because cleanup may cause index modifications

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 16:05:19 +00:00
orbiter
fb828f3767 - performance enhancements in search response time using faster query ID computation and an ID cache
- code cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 10:00:07 +00:00
orbiter
22047ffad5 enhanced computation speed of many replaceAll string operations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7107 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-05 13:19:42 +00:00
orbiter
e8228fba09 less locking in time format computation, caching and during secondary (remote) search evaluation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-05 11:13:12 +00:00
orbiter
9c0c94683c because of a bug in search result caching count search results had not been generated as fast as possible.
with this fix search results are (even) faster.
Also enhanced: image search. This is now sped up using an image search result look-ahead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-04 22:57:12 +00:00
orbiter
fa2eb9676e removed unused class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7104 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-04 21:45:33 +00:00
low012
5f391fcfa9 *) cleaned up in wikiCode parser (more to be done)
*) HTML fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7103 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-04 14:01:34 +00:00
orbiter
b3f0d06444 fixed a problem with restarts in YaCy mac applications: the DATA directory path was not submitted when doing a restart. This solves the problem by:
- storing the startup properties when yacy is started
- using the properties in the restart-script again. this transports also the DATA directory location as parameter of the -gui option that is used when the Mac version of YaCy is started

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-03 23:08:43 +00:00
orbiter
d4e4967e19 cleaned up code in yacyRelease (there will be work to do there)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-03 22:35:48 +00:00
orbiter
1da5241c2d do not block server session if maximum number of sessions is reached, just try to clean up once
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-03 12:05:37 +00:00
orbiter
5de70c3d7c changed way of storage for search requests:
- the search request cache can now get as large as 1000 entries
- if more entries arrive, unused are deleted
- the elements may stay in the cache up to 10 minutes and longer if they are used
- the elements are deleted earlier than 10 minutes if the memory gets low
This commit was mainly done for metager-feeding peers that have a query load of 50000 queries each day. Also added:
- a monitor for cache hit/cache miss in PerformanceMemory_p.html (see at bottom of page)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7093 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-02 21:52:45 +00:00
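
The storage policy described in commit 5de70c3d7c (at most 1000 entries, unused entries dropped first, a 10-minute lifetime that is extended by use, earlier deletion when memory is low) could be sketched roughly as follows. This is only an illustration and not YaCy's actual code: the class ExpiringLruCache, its method names, the injected clock, and the choice to halve the idle limit under low memory are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a size-bounded cache with idle expiry; not YaCy code. */
class ExpiringLruCache<K, V> {
    /** A cached value together with the time it was last used. */
    private static final class Slot<V> {
        final V value;
        long lastUsed;
        Slot(V value, long lastUsed) { this.value = value; this.lastUsed = lastUsed; }
    }

    private final int maxEntries;        // e.g. 1000
    private final long maxIdleMillis;    // e.g. 10 minutes
    private final LongSupplier clock;    // injected so that tests can control time
    private final LinkedHashMap<K, Slot<V>> map;

    ExpiringLruCache(int maxEntries, long maxIdleMillis, LongSupplier clock) {
        this.maxEntries = maxEntries;
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry drops unused entries before used ones.
        this.map = new LinkedHashMap<K, Slot<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Slot<V>> eldest) {
                return size() > ExpiringLruCache.this.maxEntries;
            }
        };
    }

    synchronized void put(K key, V value) {
        map.put(key, new Slot<>(value, clock.getAsLong()));
    }

    /** Returns the value and refreshes its lifetime, or null if absent or expired. */
    synchronized V get(K key) {
        Slot<V> slot = map.get(key);
        if (slot == null) return null;
        long now = clock.getAsLong();
        if (now - slot.lastUsed > maxIdleMillis) {
            map.remove(key);
            return null;
        }
        slot.lastUsed = now;
        return slot.value;
    }

    /** Removes idle entries; under low memory the idle limit is halved (an assumption). */
    synchronized void cleanup(boolean memoryLow) {
        long now = clock.getAsLong();
        long limit = memoryLow ? maxIdleMillis / 2 : maxIdleMillis;
        map.values().removeIf(slot -> now - slot.lastUsed > limit);
    }

    synchronized int size() { return map.size(); }
}
```

With these assumptions, a cache for the numbers in the commit message would be created as new ExpiringLruCache<String, Object>(1000, 10 * 60 * 1000L, System::currentTimeMillis), and cleanup(true) would be called by whatever component detects memory pressure.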