624 Commits

Author SHA1 Message Date
orbiter
aca973e2d9 catch more exceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 23:24:49 +00:00
orbiter
c12bb8a6d0 - refactoring of the http client
- added a protection against memory leaks for the access tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 16:24:46 +00:00
orbiter
6b450d09ca some fixes recommended by findbugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 23:31:54 +00:00
orbiter
f887fc159f try to reduce the large number of unclosed incoming connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 16:26:57 +00:00
orbiter
333489420b - fix for NPE when loading the cytag image
- some hacks for less memory usage:
-- less usage of buffer and cache memory in EcoFS
-- buffer allocation on-demand in BufferedIOChunks
-- removed largest ybr idx

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-11 10:52:56 +00:00
orbiter
e9a4182e6a using a concurrent hash map for the template cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-08 21:48:37 +00:00
orbiter
01b97ef3f8 added new cybertag-tracking feature that was inspired by itgrl
from the forum discussion in
http://forum.yacy-websuche.de/viewtopic.php?p=12612#p12612

The feature will provide two basic entities:
- you can integrate image links which point to your yacy installation anywhere in the web.
  the image can be loaded with
  <img src="http://<yourpeer>:<yourport>/cytag.png?icon=invisible&nick=<yournickname_or_community_id>&tag=<anything>">
  This will place an invisible 1-pixel image. If you change icon=invisible to icon=redpill, you will see a red pill.
  Use this to track your activity on the web.
- you can view your tracks at
  http://localhost:8080/Tracks.html
- There is a public api to your tracks at
  http://localhost:8080/api/tracks_p.json
  which needs authentication


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5581 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-06 15:06:19 +00:00
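
The image link described in commit 01b97ef3f8 above can be assembled programmatically. The following is a minimal sketch, not code from YaCy: the function names `cytag_url` and `cytag_img` are hypothetical, and only the path `cytag.png` and the query parameters `icon`, `nick` and `tag` come from the commit message.

```python
from html import escape
from urllib.parse import urlencode


def cytag_url(peer, port, nick, tag, icon="invisible"):
    """Build the URL of the cytag.png tracking image on a given peer."""
    # urlencode percent-escapes the values and joins the pairs with '&'
    query = urlencode({"icon": icon, "nick": nick, "tag": tag})
    return f"http://{peer}:{port}/cytag.png?{query}"


def cytag_img(peer, port, nick, tag, icon="invisible"):
    """Wrap the URL in an <img> element, escaping '&' for use in HTML."""
    return '<img src="' + escape(cytag_url(peer, port, nick, tag, icon)) + '">'
```

Passing `icon="redpill"` instead of the default would request the visible red pill image mentioned in the commit message.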
orbiter
b57c9da1f8 - fixes to doc, ppt, xls parser: better title
- fixes to httpd server response header generation
- fixes to a server date computation bug
- new Button in indexControl to view content of url in ViewFile


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-05 15:15:13 +00:00
orbiter
db510b5d52 more exception logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-01 22:05:09 +00:00
low012
f136ddcfd4 *) this change is supposed to prevent the creation of temporary files by Apache Commons Fileupload library in cases where it is not necessary (as proposed by thq in http://forum.yacy-websuche.de/viewtopic.php?f=8&t=1806)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5546 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 09:17:58 +00:00
orbiter
94110df85a moved logging partially to kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 01:06:56 +00:00
orbiter
024da2916b refactoring of logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 23:33:47 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- separated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00
orbiter
bf93767ec6 refactoring of kelondro database classes
(to be continued)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 15:33:00 +00:00
orbiter
fc27bf8c4c refactoring of kelondro classes:
kelondro shall become independent from other packages.
moved bytebuffer, date and memory to kelondro

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 14:48:11 +00:00
orbiter
fe77fc3d62 - added new property setting 'repositoryPath'
which can be used to map any path to http://localhost:8080/repository/
  This can be used for intranet indexing without setting up
  symbolic links, which do not work in a Windows environment.
  Now also Windows users can index their file system easily
  using the intranet use case.
- fixed some problems with the identification of the alternative
  path in DATA/HTDOCS in the httpd file server

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5538 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 13:30:36 +00:00
f1ori
aaafe05c02 * revert debug change
* contains instead of startsWith, because there might be localized strings
* decode punycode for every domain part separately (see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1749)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-24 00:33:38 +00:00
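
Decoding punycode one domain part at a time, as commit aaafe05c02 above describes, could look roughly like the sketch below. It is an illustration only, not the YaCy implementation: the function name `decode_domain` is hypothetical, and it relies on Python's built-in `punycode` codec and the standard `xn--` prefix for encoded labels.

```python
def decode_domain(domain):
    """Decode punycode label by label, leaving plain ASCII labels untouched."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            # Strip the 'xn--' prefix and decode only this label
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(label)
    return ".".join(labels)
```

Handling each label on its own means a domain that mixes encoded and plain parts is decoded correctly.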
orbiter
335d6ce8fc fix for class loading problem
see also http://forum.yacy-websuche.de/viewtopic.php?p=12153#p12153

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5505 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-20 22:47:58 +00:00
orbiter
b423d0a036 moved all servlets from htroot/xml to htroot/api
the file server contains a patch that temporarily maps all xml paths to api,
so all interfaces still work. Please adapt all your interfaces to the new path.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-15 23:52:58 +00:00
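
The temporary mapping of xml paths to api paths mentioned in commit b423d0a036 above might work along these lines. This is a sketch under assumptions: the commit does not show the patch, so the prefix comparison and the name `map_legacy_path` are hypothetical.

```python
LEGACY_PREFIX = "/xml/"   # old servlet location (htroot/xml)
CURRENT_PREFIX = "/api/"  # new servlet location (htroot/api)


def map_legacy_path(path):
    """Rewrite a request path under /xml/ to the same path under /api/."""
    if path.startswith(LEGACY_PREFIX):
        return CURRENT_PREFIX + path[len(LEGACY_PREFIX):]
    # Paths outside the legacy location are passed through unchanged
    return path
```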
orbiter
814a28775f removed thread dump writing in case of invocation target exception in httpd (looked bad, not serious)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 00:27:01 +00:00
low012
7608944081 *) bugfix for REMOTE_HOST environment variable in CGI code (shows hostname of client instead of hostname of YaCy peer now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 22:15:00 +00:00
low012
c1330f5743 *) added environment variable DOCUMENT_ROOT
*) caught exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-10 18:31:10 +00:00
orbiter
c6880ce28b removed the permanent cache flush and replaced it with a periodic cache flush
The cache is now flushed only for one second every ten seconds. During a crawl the cache
fills up completely, and is only flushed if space is needed for more documents.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 13:51:59 +00:00
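
The schedule from commit c6880ce28b above (flushing for one second out of every ten) can be expressed as a simple time-window check. The sketch below is an assumption about how such a schedule could be written, not the YaCy code; the names `in_flush_window`, `FLUSH_PERIOD_S` and `FLUSH_WINDOW_S` are hypothetical.

```python
FLUSH_PERIOD_S = 10.0  # length of one cycle in seconds
FLUSH_WINDOW_S = 1.0   # part of each cycle during which flushing is allowed


def in_flush_window(now_s, start_s=0.0):
    """Return True during the first second of every ten-second cycle."""
    return (now_s - start_s) % FLUSH_PERIOD_S < FLUSH_WINDOW_S
```

A worker loop could call `in_flush_window(time.monotonic())` before each flush step and skip the step when it returns False.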
low012
afe98bc11c *) added changes as proposed by Halborinda in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1674
*) changed indentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 08:24:08 +00:00
low012
bb5c2cd12e *) ISINDEX parameters will not be put on the command line anymore to prevent possible security hazards (better safe than sorry). Parameters will have to be read from QUERY_STRING in the ISINDEX case too, which does not seem to be uncommon behaviour for web servers: http://vms.pdv-systeme.de/users/martinv/cgi_basics/cgi_basics.html#Datenuebergabe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-02 11:18:26 +00:00
low012
db1cfae3e7 *) cleaning up after myself
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:45:15 +00:00
low012
f547f9a78c *) added CGI capabilities (run Perl scripts and other software via HTTP GET and POST)
*) set cgi.allow to true in yacy.conf to enable CGI (CGI is disabled by default)
*) edit cgi.suffixes in yacy.conf if necessary to use additional script types

ATTENTION: This is a rather experimental feature, not all environment variables are set yet. 

Only enable CGI if you know what you are doing. Poorly implemented CGI scripts can put a system's integrity at risk!

Implementation of more environment variables and documentation will follow in the next few days.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:40:06 +00:00
f1ori
bdc380cd84 * add lastModified to templateCache
-> no outdated files from cache anymore...


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 14:56:53 +00:00
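
Storing lastModified alongside each cached template, as in commit bdc380cd84 above, lets the cache detect outdated entries. The sketch below illustrates the idea under assumptions: the class `TemplateCache` and its `load` method are hypothetical and do not mirror the Java classes in YaCy.

```python
import os


class TemplateCache:
    """Cache file contents together with the modification time seen at load."""

    def __init__(self):
        self._entries = {}  # path -> (mtime, content)

    def load(self, path):
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            # File unchanged since it was cached: serve the cached copy
            return cached[1]
        # Missing or outdated entry: re-read the file and remember its mtime
        with open(path, "rb") as f:
            content = f.read()
        self._entries[path] = (mtime, content)
        return content
```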
orbiter
e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
- refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:22:13 +00:00
f1ori
2d2ce24011 * remove all encoding-stuff from proxy
encoding is handled by parsers or browser, proxy only passes through


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 19:14:54 +00:00
f1ori
73c8a0839c * abort download, when proxy connection is closed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5409 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 11:30:24 +00:00
f1ori
4907697cfa * allow file uploads larger than 65500 bytes through the proxy
* remove gzip-encoding for files from cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-22 23:04:00 +00:00
orbiter
db6b3bf5a3 speed enhancement for integrated http server:
- tuning hacks in template engine
- bypassing the template engine if no servlet present

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-11 20:10:37 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
orbiter
0b4808ba3d added new interactive search feature:
- while the user types a search query, the local database is searched
- results are presented interactively

This was implemented using a new JSON result format for search results in YaCy
- added JSON as file format for servlets
- refactoring of current search servlets (xml and html)
- added JSON output format for search results
- added AJAX-based search page that uses the yacysearch.json servlet to print results as a query is typed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 15:24:25 +00:00
orbiter
74a3d86114 fixed an error response that might present classified information
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5372 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-01 23:14:42 +00:00
danielr
2e63f03ca5 forgot copy&paste :/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-20 11:41:11 +00:00
danielr
cd8082b4e3 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111#p11166
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-20 11:18:19 +00:00
f1ori
d18c18971e * dirlisting in UTF-8 encoding
* fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550&hilit=#p11108


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-15 20:49:03 +00:00
f1ori
d49ffcd818 * files distributed by yacy are utf-8, files from repository use the system default charset
* fixes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1564#p11092
  and http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 20:49:16 +00:00
f1ori
90e78b2cf6 * improve encoding detection of http service
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 21:06:32 +00:00
f1ori
7e1fe05e3c * added utf8-encoding to many getBytes-calls
* utf8 should work now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-08 20:24:31 +00:00
f1ori
4b4ce75396 * http-server: submit charset from html metatags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-01 23:17:51 +00:00
f1ori
d0543a7c39 * fix the debug ant-target
* fix yacy-subdomain handling (http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1556)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5307 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-27 22:16:56 +00:00
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used an inefficient way to detect html encoding strings in texts.
All calling methods have been adapted to call the new class in an enhanced way with fewer parameters.

Many classes in interfaces used an XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be reviewed again since it points out possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
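
The difference between an XML-only encoding and a full conversion from unicode to html, which commit 0edec2b760 above mentions, can be illustrated as follows. This is a sketch built on the Python standard library, not the htmlTools code; the function names `xml_only_encode` and `html_encode` are hypothetical.

```python
from html.entities import codepoint2name
from xml.sax.saxutils import escape


def xml_only_encode(text):
    """Escape only '&', '<' and '>'; all other characters stay as they are."""
    return escape(text)


def html_encode(text):
    """Replace markup characters and non-ASCII characters with entities."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if name is not None:
            out.append(f"&{name};")      # named entity, e.g. &auml; or &quot;
        elif ord(ch) > 127:
            out.append(f"&#{ord(ch)};")  # numeric reference as a fallback
        else:
            out.append(ch)
    return "".join(out)
```

Note that the XML-only variant leaves quotation marks unescaped, so its output is not safe inside an attribute value without further handling.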
orbiter
6941bf42b1 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-20 14:07:09 +00:00
orbiter
1778fb420d - added some performance tweaks to the new BLOB buffer
- removed the now superfluous HT storage thread
- reduced the number of file decompressions by deferring compression to a later moment


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-19 18:10:42 +00:00
orbiter
826ca79735 refactoring and new architecture to store the files of the web cache:
- files are not stored any more as individual files
- a new database structure using BLOBHeap files stores many cache entries in common files
- all file-writing procedures have been migrated to generate byte[] objects which are written with the new database methods

this is only an intermediate step to the final architecture, where cached files are written together with their metadata in one single database structure.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-16 21:24:09 +00:00