4015 Commits

Author SHA1 Message Date
fuchsi
1bd02762de Improve HTTP/ICAP header processing.
- workaround for illegal line endings (LF only), closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=595
- fixed bug where we didn't break the processing immediately on EOS (the loop was run until the buffer was completely filled with -1)
- further performance improvements (one simple loop, avoid double processing of every byte and unnecessary temporary buffers)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4270 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 06:37:18 +00:00
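
Note on the header processing described in 1bd02762de: the sketch below shows one way to read a header line in a single loop that accepts LF-only line endings and stops as soon as the stream ends. It is not the committed code; the class name `HeaderLineReader`, the method `readLine` and the `maxLength` parameter are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not YaCy's actual class. */
final class HeaderLineReader {

    /**
     * Reads one header line in a single pass over the bytes.
     * Accepts CRLF as well as a bare LF as line terminator and returns
     * as soon as read() reports end of stream (-1).
     * Returns null if the stream was already at its end.
     */
    static String readLine(InputStream in, int maxLength) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawByte = false;
        int b;
        while ((b = in.read()) != -1) {    // leave the loop immediately on EOS
            sawByte = true;
            if (b == '\n') break;           // LF ends the line, with or without a CR before it
            if (b == '\r') continue;        // simplification: every CR is dropped
            if (line.size() >= maxLength) throw new IOException("header line too long");
            line.write(b);
        }
        if (!sawByte) return null;
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

Each byte is inspected once and written to a single buffer, which is the property the commit message describes as avoiding double processing.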
orbiter
01554f4012 fixed bug with double-check in crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4269 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 01:32:25 +00:00
orbiter
b1e08d354c repaired indexing after search snippet loading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4268 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 00:33:26 +00:00
orbiter
48138952ff added memory measurement for index recreation to avoid OOM during index RAM space extension
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4267 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-11 15:07:03 +00:00
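
Note on 48138952ff: a memory check before growing an in-RAM index could look roughly like the sketch below. It only uses the standard `Runtime` methods; the class `MemoryGuard`, its method names and the `reserveBytes` safety margin are assumptions, and the index is simplified to a `long[]`.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not YaCy's actual class. */
final class MemoryGuard {

    /** Bytes the JVM could still allocate: free heap plus heap not yet claimed from the OS. */
    static long available() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    /** True if the request fits while leaving reserveBytes untouched. */
    static boolean canAllocate(long requestedBytes, long reserveBytes) {
        return requestedBytes <= available() - reserveBytes;
    }

    /**
     * Grows the array only if the measurement says there is room;
     * otherwise the old array is returned and the caller has to cope
     * (for example by flushing to disk) instead of running into an OOM.
     */
    static long[] growIfPossible(long[] index, int newLength, long reserveBytes) {
        long needed = 8L * newLength;   // 8 bytes per long, object header ignored
        if (newLength <= index.length || !canAllocate(needed, reserveBytes)) return index;
        return Arrays.copyOf(index, newLength);
    }
}
```

The measurement is only an estimate, since other threads allocate concurrently, which is why a reserve margin is part of the sketch.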
orbiter
5a80359b0e new default remote favicon for search results
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4266 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-11 12:40:48 +00:00
orbiter
9e23acf2d6 introduced new 'authority' ranking property
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-11 01:32:58 +00:00
orbiter
a1b80017e0 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=594&p=3630#p3630
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4264 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 15:00:01 +00:00
orbiter
a3bfd668aa array files are now opened at startup time, not when the web index is first accessed
this speeds up the first search after startup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 02:40:16 +00:00
orbiter
ca488e03f5 fixed authorization case
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4262 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 02:04:48 +00:00
orbiter
6a3a292015 - smoothed ymage font
- changed position of status banner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4261 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 01:47:04 +00:00
low012
7397152e04 *) quick hack for antialiasing, works only on borders now => less blurry image
*) code is not finished, needs refactoring, still thinking about how to do it best


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4260 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-09 19:17:24 +00:00
orbiter
4331e52d1c fixed too small peer number in banner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4259 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-08 02:48:19 +00:00
orbiter
515e1bde6d - fixed bug with constraint default
- 0.556
- default RAM for pro releases now 120MB (because pro will become default)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4258 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-08 02:04:50 +00:00
orbiter
2954f96fae - removed public peer info box on status page, this info can now be seen in the status banner
- added peer count to banner
- added some values to protected status box

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4257 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-08 01:39:59 +00:00
low012
4eb40c4f61 *) added 2 filters: blur and antialiasing (which in fact is nothing more than a mild blur) to ymageMatrix
*) antialiasing is used for logo in banner


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 22:51:13 +00:00
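
Note on 4eb40c4f61: the remark that antialiasing is "nothing more than a mild blur" can be illustrated with a 3x3 averaging filter whose centre weight is adjustable. The sketch below works on a plain grayscale `int[][]` rather than on ymageMatrix; the class `BlurFilter` and the chosen weights (1 for blur, 8 for antialiasing) are assumptions.

```java
/** Hypothetical illustration on a grayscale matrix; not the ymageMatrix code. */
final class BlurFilter {

    /**
     * 3x3 weighted average. src is indexed [y][x]. Each neighbour has
     * weight 1 and the pixel itself has centerWeight; pixels at the
     * border average only over the neighbours that exist.
     */
    static int[][] filter(int[][] src, int centerWeight) {
        int h = src.length, w = src[0].length;
        int[][] dst = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                long sum = 0;
                int weight = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy, nx = x + dx;
                        if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                        int wgt = (dx == 0 && dy == 0) ? centerWeight : 1;
                        sum += (long) wgt * src[ny][nx];
                        weight += wgt;
                    }
                }
                dst[y][x] = (int) (sum / weight);
            }
        }
        return dst;
    }

    /** Plain box blur: all nine pixels count equally. */
    static int[][] blur(int[][] src) { return filter(src, 1); }

    /** Mild blur: the original pixel dominates, so edges are only softened. */
    static int[][] antialias(int[][] src) { return filter(src, 8); }
}
```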
orbiter
aeb1cf83a6 - corrected banner link (relative now)
- changed color mode (replace) for banner
- changed default color (fits to default skin) of banner in status

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4255 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 21:25:36 +00:00
orbiter
5185acaf41 - reduced default search time
- this can be configured using the network definition file

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4254 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 03:25:57 +00:00
orbiter
e22014dc83 some memory enhancements when generating and displaying ymage objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 02:15:12 +00:00
orbiter
f243e338cf implemented online caution also for local and remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 21:53:17 +00:00
orbiter
6680634703 removed unnecessary functions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 21:29:47 +00:00
daburna
3ff09ad6b4 #updated French language file with the translation from the wiki, made by Marsupoil
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 18:20:46 +00:00
orbiter
c57eb76b13 removed CMY color model from ymage classes and re-introduced RGB color model
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 01:06:17 +00:00
orbiter
b46bcaa5d8 changed method of profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-04 20:19:13 +00:00
low012
76cd6ed6f6 *) New methods to insert bitmaps that feature transparencies.
*) Logo background is transparent now. (Using pixel at (0,0) to determine which color is transparent. Too dirty?)
*) Logo is loaded through filesystem instead of HTTP now.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4247 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-04 19:45:50 +00:00
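
Note on 76cd6ed6f6: using the pixel at (0,0) as the transparent colour is a form of colour keying. The sketch below shows the idea with the standard `java.awt.image.BufferedImage` class instead of the ymage classes; `BitmapInsert` and `insertTransparent` are names invented for this example.

```java
import java.awt.image.BufferedImage;

/** Hypothetical illustration of colour keying; not the ymage insertBitmap code. */
final class BitmapInsert {

    /**
     * Copies bitmap into target with its top-left corner at (x0, y0).
     * Every pixel that has the same colour as bitmap's pixel (0,0) is
     * treated as transparent and skipped. Pixels that would fall
     * outside the target are ignored.
     */
    static void insertTransparent(BufferedImage target, BufferedImage bitmap, int x0, int y0) {
        int key = bitmap.getRGB(0, 0);   // colour that marks "transparent"
        for (int y = 0; y < bitmap.getHeight(); y++) {
            for (int x = 0; x < bitmap.getWidth(); x++) {
                int rgb = bitmap.getRGB(x, y);
                if (rgb == key) continue;
                int tx = x0 + x, ty = y0 + y;
                if (tx < 0 || ty < 0 || tx >= target.getWidth() || ty >= target.getHeight()) continue;
                target.setRGB(tx, ty, rgb);
            }
        }
    }
}
```

The drawback hinted at in the commit message is visible here: any part of the logo that happens to share the corner colour also becomes transparent.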
orbiter
be214e594f - generalized ymage initialization options
- auto-adaptation of performance memory graph to needed dimension

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-03 02:35:28 +00:00
low012
ee8a177c26 *) Logo is in the middle of free space now.
*) Fixed bugs in insertBitmap()


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-02 21:20:11 +00:00
low012
72698fcd36 *) Banner features a logo now. It does not look nice, but at least it works. Banner is not finished yet.
Which path do I have to set for IMAGE (htroot/env/grafics/yacy.gi) if I want to load it through the file system and not via HTTP?


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-02 20:37:12 +00:00
fuchsi
39d0f10ca1 Fix parsing of dates in HTTP headers.
RFC 2616 requires a client to support RFC 1123 (default), RFC 1036 and ANSI C formatted date strings (we only supported 1123 before).

Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=525 (and probably others). There are servers which break the standards, please report those "DATE ERROR" messages if they contain a "sane" date string.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 20:47:27 +00:00
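
Note on 39d0f10ca1: the three date formats named in the message can be tried one after another with `java.text.SimpleDateFormat`. The sketch below is not the committed code; the class `HttpDateParser`, the choice to return null for unparseable values, and the two-digit-year window starting in 1970 are assumptions. The sample strings in the comments are the ones given in RFC 2616.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper for illustration; not YaCy's actual class. */
final class HttpDateParser {

    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss zzz",   // RFC 1123: Sun, 06 Nov 1994 08:49:37 GMT
        "EEEE, dd-MMM-yy HH:mm:ss zzz",    // RFC 1036: Sunday, 06-Nov-94 08:49:37 GMT
        "EEE MMM d HH:mm:ss yyyy"          // ANSI C asctime(): Sun Nov  6 08:49:37 1994
    };

    /** Returns the parsed date, or null if no format matches. */
    static Date parse(String value) {
        String trimmed = value.trim();
        for (String pattern : PATTERNS) {
            // SimpleDateFormat is not thread-safe, so a fresh instance is used per call.
            SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
            fmt.setTimeZone(TimeZone.getTimeZone("GMT"));   // asctime() carries no zone
            fmt.set2DigitYearStart(new Date(0L));           // assumption: "94" means 1994
            try {
                return fmt.parse(trimmed);
            } catch (ParseException e) {
                // fall through and try the next format
            }
        }
        return null;   // caller may log a "DATE ERROR" for this value
    }
}
```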
orbiter
3f848e282b PerformanceMemory_p now does an automatic update of memory graph every second
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 11:28:26 +00:00
orbiter
aefb3f7765 added memory graph picture to PerformanceMemory_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 03:22:42 +00:00
orbiter
ea81d97cfc fix for bad full domain crawl depth adaptation
(maximum depth is 8, because a higher depth causes remote crawls to stop working)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 00:37:58 +00:00
orbiter
f645408ae9 added url retrieve option to uls.xml interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4239 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 22:12:01 +00:00
orbiter
cc20870267 fix for constraint handover problem:
old yacy versions set a catchall-constraint if no constraint is given, but the
new versions expect a null-constraint.
see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=565&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 21:03:47 +00:00
daburna
5ba415570e #updated German language file
now the new IndexControlsites are completely translated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4237 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 17:14:04 +00:00
orbiter
9b0ae4b989 added referrer to remote crawl url list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 13:58:00 +00:00
fuchsi
18e516317d Fix problem with buggy HTTP-Servers which send illegal control characters in HTTP-Headers, they are ignored now.
Thx to celle for the patch and see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=560 for more information.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 06:02:45 +00:00
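
Note on 18e516317d: ignoring illegal control characters in a header line can be as simple as the filter sketched below. Which characters count as illegal is an assumption here (everything below 0x20 except horizontal tab, plus DEL); the patch itself may draw the line differently, and `HeaderSanitizer` is a made-up name.

```java
/** Hypothetical helper for illustration; not the patch referenced above. */
final class HeaderSanitizer {

    /**
     * Returns the header line without control characters.
     * Assumption: horizontal tab is kept because it is legal
     * whitespace inside header values; all other characters
     * below 0x20 and DEL (0x7F) are dropped.
     */
    static String stripControlChars(String headerLine) {
        StringBuilder sb = new StringBuilder(headerLine.length());
        for (int i = 0; i < headerLine.length(); i++) {
            char c = headerLine.charAt(i);
            boolean illegal = (c < 0x20 && c != '\t') || c == 0x7F;
            if (!illegal) sb.append(c);
        }
        return sb.toString();
    }
}
```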
orbiter
7d5544e9b1 added some security checks to new remote crawl pull method to prevent the indexer from being overloaded
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:54:59 +00:00
orbiter
d59c1a7936 removed test data
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:11:48 +00:00
orbiter
89b9b2b02a redesigned remote crawl process:
- instead of pushing urls to other peers, the urls are actively pulled
  by the peer that wants to do a remote crawl
- the remote crawl push process has been removed
- a process that adds urls from remote peers has been added
- the server-side interface for providing 'limit'-urls has existed since 0.55 and works with this version
- the list-interface has been removed
- servlets using the list-interface have been removed (this implementation did not properly manage double-check)
- changes in configuration file to support new pull-process
- fixed a bug in crawl balancer (status was not saved/closed properly)
- the yacy/urls-protocol was extended to support different networks/clusters
- many interface adaptations to new stack counters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:07:37 +00:00
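
Note on 89b9b2b02a: the change from push to pull means the crawling peer decides when and how many urls it takes from a remote peer. The sketch below models that idea together with a double-check and a size limit that protects the local stack. All names (`UrlProvider`, `limitUrls`, `RemoteCrawlPuller`, `maxStackSize`) are hypothetical and do not reflect the actual yacy/urls protocol.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a remote peer that offers 'limit'-urls. */
interface UrlProvider {
    List<String> limitUrls(int maxCount);
}

/** Hypothetical illustration of a pull-based remote crawl; not YaCy's code. */
final class RemoteCrawlPuller {
    private final Set<String> seen = new HashSet<String>();      // double-check
    private final Deque<String> stack = new ArrayDeque<String>();
    private final int maxStackSize;

    RemoteCrawlPuller(int maxStackSize) {
        this.maxStackSize = maxStackSize;
    }

    /** Pulls urls from the peer and returns how many were accepted. */
    int pullFrom(UrlProvider peer, int batchSize) {
        int room = maxStackSize - stack.size();
        if (room <= 0) return 0;                  // local stack is full, do not pull
        int added = 0;
        for (String url : peer.limitUrls(Math.min(batchSize, room))) {
            if (added >= room) break;             // do not trust the peer to respect maxCount
            if (seen.add(url)) {                  // skip urls that were pulled before
                stack.addLast(url);
                added++;
            }
        }
        return added;
    }

    /** Next url to crawl, or null if the stack is empty. */
    String next() {
        return stack.pollFirst();
    }
}
```

Because the puller controls the batch size, an overloaded peer simply stops asking, which a push design cannot guarantee.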
fuchsi
69521d92e5 Add another external dependency from PDFBox package ("Bouncy Castle"). This is necessary for parsing of some encrypted PDF files.
bcprov-jdk14-132.jar is the binary jar as it is provided in the PDFBox-0.7.3 package (same as our FontBox, PDFBox packages).

Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=453


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-27 23:13:26 +00:00
orbiter
90a02990d2 NPE fix, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=549&hilit=&p=3383#p3383
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 09:26:35 +00:00
orbiter
2fcd18a972 - fixed bad behaviour of search event worker processes
- fixed export of url lists in xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 01:08:16 +00:00
orbiter
445c0b5333 added domain list extraction and html export format
to URL administration menu http://localhost:8080/IndexControlURLs_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 20:47:06 +00:00
orbiter
d8d77fc4b2 fix for NPE, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=549&hilit=&p=3368#p3368
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 18:15:28 +00:00
orbiter
bf6952abe7 - added url export to http://localhost:8080/IndexControlURLs_p.html
- removed command-line option to export urls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 16:52:44 +00:00
orbiter
af10f729df fixed image search and favicon loading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 01:34:29 +00:00
orbiter
edba2b7bcc fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=543
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4224 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-21 23:26:51 +00:00
orbiter
c48b73cda2 redesign of ranking data structure
- the index administration now uses the same code base for url selection and collection
  as the search interface. The index administration is therefore a good test environment for
  ranking order control
- removed old postsorting-algorithms, will be replaced with new one
- fixed many bugs that occurred during ranking; especially the constraint filtering method
  removed too many links
- fixed media search flags; had been attached to too many urls. The effect should be a better
  pre-sorting before media load within snippet fetch

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-21 23:14:57 +00:00
orbiter
6f1308da2f - some enhancements to IndexControlURLs (shows more links, connects referrer to another query)
- some refactoring to search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-17 01:53:02 +00:00
orbiter
bf9a9e4e5e fix for NPE in IndexControlRWIs_p.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4221 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-16 16:37:45 +00:00