Commit Graph

762 Commits

Author SHA1 Message Date
lotus
9cfe89c8fc * process content-length as soon as it is received
* corrected indentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-13 19:55:13 +00:00
f1ori
f814e0fa81 enable warnings and fix most of it
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-11 21:01:27 +00:00
f1ori
8931c8d6b4 improvments to debianpackage:
* autoupdate completely disabled, display hint
* restart-button in interface works!

* moved all build-Variables to yacyBuildProperties
* fixed some warnings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6195 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-11 17:03:22 +00:00
orbiter
57a88d435b redesign of parser mime type detection and parser steering
There is now a mime-blacklist instead of a mime-whitelist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-10 14:22:17 +00:00
orbiter
21b8704fb4 refactoring of the ParserDispatcher and ParserConfig: resulted into Idiom, Parser and Classification classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6188 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-09 22:25:31 +00:00
orbiter
8ca1f5d400 - some work to integrate the html parser the same way as the other parsers are integrated (not finished)
- added migration of code of settings pages (hmm.. does not work correctly yet, sorry)
- more refactoring
- removed more unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6187 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-09 20:56:30 +00:00
orbiter
dafffd0153 refactoring of parsers and document processing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-08 21:48:08 +00:00
orbiter
409538e17a code cleanup and code simplifcation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 22:20:55 +00:00
orbiter
1f1399e5c5 extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6156 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 13:25:46 +00:00
orbiter
154bbc3364 code cleanup: call of static methods directly to the class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 13:01:35 +00:00
orbiter
222850414e simplification of the code: removed unused classes, methods and variables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 09:27:46 +00:00
orbiter
93dfb51fd4 problems with code style
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6153 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-29 22:22:35 +00:00
orbiter
ce1adf9955 serialized all logging using concurrency:
high-performance search query situations as seen in yacy-metager integration showed deadlock situation caused by synchronization effects inside of sun.java code. It appears that the logger is not completely safe against deadlock situations in concurrent calls of the logger. One possible solution would be a outside-synchronization with 'synchronized' statements, but that would further apply blocking on all high-efficient methods that call the logger. It is much better to do a non-blocking hand-over of logging lines and work off log entries with a concurrent log writer. This also disconnects IO operations from logging, which can also cause IO operation when a log is written to a file. This commit not only moves the logger from kelondro to yacy.logging, it also inserts the concurrency methods to realize non-blocking logging.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 21:19:54 +00:00
orbiter
b8e738a7be a collection of
- small bug fixes
- better/more comments
- more asserts
- fixed synchronization
- test case enhancements
- code cleanup
- performance hacks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-14 22:09:08 +00:00
orbiter
db3a06dd81 removed cookie handling in httpc:
- no need to do cookie handling in proxy, this was switched off so far
- no need for cookies in crawler, this was switched on (by mistake)
This fix was needed for a case where a web server flooded the crawler with cookies and caused a complete blocking of the httpc.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 16:11:09 +00:00
orbiter
99bf0b8e41 refactoring of plasmaWordIndex:
divided that class into three parts:
- the peers object is now hosted by the plasmaSwitchboard
- the crawler elements are now in a new class, crawler.CrawlerSwitchboard
- the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment

The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-28 14:26:05 +00:00
orbiter
4b4bddca00 added new submenu to crawler menu: import of phpbb3 forum postings from mysql
- yacy can import phpbb3 posts without crawling
- all data is written as surrogate
- indexed surrogate files can be re-used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5985 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-27 14:53:23 +00:00
orbiter
709bfc2cd4 added a memory check in http post protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-12 20:23:55 +00:00
orbiter
c097531e3d added a catch Exception to all thread to check if any of them silently dies without any other notification
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5922 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-05 06:31:35 +00:00
orbiter
057ce14c8e more fixes (character encoding, parser exceptions, http client failure, blob writing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5914 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-02 07:43:03 +00:00
orbiter
d2ac0aa682 - fixed possible bugs in Stack (may affect Crawler reset) and RandomAccess handling
- increased default memory size to 180MB
- fixed possible bug in http client reset (there was a deadlock)
- bug in BOBHeap marked, but not solved, cause is still unknown.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5912 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-02 01:40:03 +00:00
orbiter
16baa7ad24 To translate a mediawiki dump into the YaCy surrogate format do the following:
- download a wikipedia dump, i.e. dewiki-20090311-pages-articles.xml.bz2
from http://download.wikimedia.org/dewiki/20090311/
- move dewiki-20090311-pages-articles.xml.bz2 to DATA/HTCACHE/
- start the conversion; open a command shell, move to the yacy home directory and execute
java -Xmx2000m -cp classes:lib/bzip2.jar de.anomic.tools.mediawikiIndex -convert DATA/HTCACHE/dewiki-20090311-pages-articles.xml.bz2 DATA/SURROGATES/in/ http://de.wikipedia.org/wiki/

this generates a series of files to DATA/SURROGATES/in

if YaCy is running (it may run concurrently), it fetches all new dumps in the surrogate-in directory. The export process is transaction-save, that means YaCy will not start reading a dump while the dump is not completely finished.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5851 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-21 22:12:19 +00:00
orbiter
8ffb9889e1 some fixes and performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-20 23:01:44 +00:00
orbiter
d7cbf4cdd4 more performance hacks: less overhead in word hash computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5825 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-17 13:47:06 +00:00
f1ori
dd6b5005ff * fix missing charset handling in getpageinfo_p
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5811 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-16 12:31:28 +00:00
orbiter
c0e8ed5461 fixed problem with not http client
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5801 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-13 21:21:47 +00:00
orbiter
57c00dd8c9 fix for bad filtering of common http error
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5788 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-09 13:20:09 +00:00
orbiter
c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index.
Moved to version 0.74

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 13:23:45 +00:00
orbiter
9bfb2641db - removed deprecated threads
- added automatic http client reset. this was necessary because excessive intranet crawling caused deadlocks. this hack solved the problem.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5768 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-01 20:13:57 +00:00
orbiter
b6c2167143 - patch for bad web structure dumps
- added automatic slow down of accessed to specific domains when access to a web page fails

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-01 13:21:47 +00:00
orbiter
9a90ea05e0 added a merge operation for IndexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-18 16:14:31 +00:00
orbiter
14a1c33823 refactoring of wordIndex class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 10:34:51 +00:00
borg-0300
0a2fabeef3 static TMPDIR
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5704 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-12 16:23:12 +00:00
lotus
f35dc11dc4 allow crawl start from pages with script tags
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1910

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5702 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-12 16:12:50 +00:00
orbiter
858f800a07 more logging in httpd to detect shutdown cause. See also:
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1914&hilit=&p=13246#p13246

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5683 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-09 09:56:26 +00:00
orbiter
b80db04667 - refactoring of IntegerHandleIndex and LongHandleIndex (better method names)
- fix for problem in httpdFileHandler: mising close of open Files if tempate cache was disabled
- more memory for DHT selection required
- stub for URL reference hash statistics in index collections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5682 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-08 21:37:17 +00:00
orbiter
efcd95dc37 simplification of (internal) query process / refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 15:53:20 +00:00
orbiter
aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 11:04:13 +00:00
orbiter
aca973e2d9 catch more exceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 23:24:49 +00:00
orbiter
c12bb8a6d0 - refactoring of the http client
- added a protection against memory leaks for the access tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 16:24:46 +00:00
orbiter
6b450d09ca some fixes recommended by findbugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 23:31:54 +00:00
orbiter
f887fc159f try to reduce the large number of unclosed incoming connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 16:26:57 +00:00
orbiter
333489420b - fix for NPE when loading the cytag image
- some hacks for less memory usage:
-- less usage of buffer and cache memory in EcoFS
-- buffer allocation on-demand in BufferedIOChunks
-- removed largest ybr idx

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-11 10:52:56 +00:00
orbiter
e9a4182e6a using a concurrent hash map for the template cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-08 21:48:37 +00:00
orbiter
01b97ef3f8 added new cybertag-tracking feature that was inspired by itgrl
from the forum discussion in
http://forum.yacy-websuche.de/viewtopic.php?p=12612#p12612

The feature will provide two basic entities:
- you can integrate image links which point to your yacy installation anywhere in the web.
  the image can be loaded with
  <img src="http://<yourpeer>:<yourport>/cytag.png?icon=invisible&nick=<yournickname_or_community_id>&tag=<anything>">
  This will place a invisible 1-pixel image. If you change the icon=invisible to icon=redpill, you will see a red pill
  Use this, to track your activity in the web.
- you can view your tracks at
  http://localhost:8080/Tracks.html
- There is a public api to your tracks at
  http://localhost:8080/api/tracks_p.json
  which needs authentication


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5581 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-06 15:06:19 +00:00
orbiter
b57c9da1f8 - fixes to doc, ppt, xls parser: better title
- fixes to httpd server response header generation
- fixes to a server date computation bug
- new Button in indexControl to view content of url in ViewFile


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-05 15:15:13 +00:00
orbiter
db510b5d52 more exception logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-01 22:05:09 +00:00
low012
f136ddcfd4 *) this change is supposed to prevent the creation of temporary files by Apache Commons Fileupload library in cases where it is not necessary (as proposed by thq in http://forum.yacy-websuche.de/viewtopic.php?f=8&t=1806)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5546 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 09:17:58 +00:00
orbiter
94110df85a moved logging partially to kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 01:06:56 +00:00
orbiter
024da2916b refactoring of logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 23:33:47 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- seperated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00
orbiter
bf93767ec6 refactoring of kelondro database classes
(to be continued)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 15:33:00 +00:00
orbiter
fc27bf8c4c refactoring of kelondro classes:
kelondro shall become independent from other packages.
moved bytebuffer, date and memory to kelondro

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 14:48:11 +00:00
orbiter
fe77fc3d62 - added new property setting 'repositoryPath'
which can be used to map any path to http://localhost:8080/repository/
  This can be used to do an intranet-indexing without the setting of
  symbolic links - which does not work in Windows environment.
  Now also Windows users can index their file system easily
  using the intranet use case.
- fixed some problems with the identification of the alternative
  path in DATA/HTDOCS in the httpd file server

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5538 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 13:30:36 +00:00
f1ori
aaafe05c02 * revert debug change
* contains instead of startsWith, because there might me localizied strings
* decode punycode for every domainpart seperately (see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1749)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-24 00:33:38 +00:00
orbiter
335d6ce8fc fix for class loading problem
see also http://forum.yacy-websuche.de/viewtopic.php?p=12153#p12153

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5505 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-20 22:47:58 +00:00
orbiter
b423d0a036 moved all servlets from htroot/xml to htroot/api
the file server contains a patch that temporary matches all xml paths to api,
that means all interfaces still work. Please adopt all your interfaces to the new path.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-15 23:52:58 +00:00
orbiter
814a28775f removed thread dump writing in case of invocation target exception in httpd (looked bad, not serious)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 00:27:01 +00:00
low012
7608944081 *) bugfix for REMOTE_HOST environment variable in CGI code (shows hostname of client instead of hostname of YaCy peer now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 22:15:00 +00:00
low012
c1330f5743 *) added environment variable DOCUMENT_ROOT
*) caught exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-10 18:31:10 +00:00
orbiter
c6880ce28b removed the permanent cache flush and replaced it with a periodic cache flush
The cache is now flushed only for one second every ten seconds. During a crawl the cache
fills up completely, and is only flushed if space is needed for more documents.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 13:51:59 +00:00
low012
afe98bc11c *) added changes as proposed by Halborinda in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1674
*) changed indention

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 08:24:08 +00:00
low012
bb5c2cd12e *) ISINDEX parameters will not be put on commandline anymore to prevent possible security hazards (better safe than sorry). Parmeters will have to be read from QUERY_STRING in ISINDEX case too which does not seem to be uncommon behaviour for web servers: http://vms.pdv-systeme.de/users/martinv/cgi_basics/cgi_basics.html#Datenuebergabe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-02 11:18:26 +00:00
low012
db1cfae3e7 *) cleaning up after myself
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:45:15 +00:00
low012
f547f9a78c *) added CGI capabilities (run Perl scripts and other software via HTTP GET and POST)
*) set cgi.allow to true in yacy.conf to enable CGI (CGI is disabled by default)
*) edit cgi.suffixes in yacy.conf if necessary to use additional script types

ATTENTION: This is a rather experimental feature, not all environment variables are set yet. 

Only enable CGI if you know what you are doing. Poorly implemented CGI scripts can put a system's integrity at risk!

Implementation of more environment variables and documentation due for the next days.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:40:06 +00:00
f1ori
bdc380cd84 * add lastModified to templateCache
-> no outdated files from cache anymore...


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 14:56:53 +00:00
orbiter
e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
- refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:22:13 +00:00
f1ori
2d2ce24011 * remove all encoding-stuff from proxy
encoding is handled by parsers or browser, proxy only passes through


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 19:14:54 +00:00
f1ori
73c8a0839c * abort download, when proxy connection is closed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5409 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 11:30:24 +00:00
f1ori
4907697cfa * make fileuploads through proxy bigger than 65500 bytes possible
* remove gzip-encoding for files from cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-22 23:04:00 +00:00
orbiter
db6b3bf5a3 speed enhancement for integrated http server:
- tuning hacks in template engine
- bypassing the template engine if no servlet present

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-11 20:10:37 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
orbiter
0b4808ba3d added new interactive search feature:
- during the user types search queries, the local database is searched
- results are presented interactively

This was implemented using a new JSON result format for search results in YaCy
- added JSON as file format for servlets
- refactoring of current search servlets (xml and html)
- added JSON output format for search results
- added AJAX-based search page, that uses the yacysearch.json selrvlet to print results as a query is typed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 15:24:25 +00:00
orbiter
74a3d86114 fixed a error response that might present classified information
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5372 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-01 23:14:42 +00:00
danielr
2e63f03ca5 copy&paste vergessen :/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-20 11:41:11 +00:00
danielr
cd8082b4e3 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111#p11166
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-20 11:18:19 +00:00
f1ori
d18c18971e * dirlisting in UTF-8 encoding
* fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550&hilit=#p11108


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-15 20:49:03 +00:00
f1ori
d49ffcd818 * files distributed by yacy are utf-8, files from repository use the system default charset
* fixes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1564#p11092
  and http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 20:49:16 +00:00
f1ori
90e78b2cf6 * improve encoding detection of http service
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 21:06:32 +00:00
f1ori
7e1fe05e3c * added utf8-encoding to many getBytes-calls
* utf8 should work now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-08 20:24:31 +00:00
f1ori
4b4ce75396 * http-server: submit charset from html metatags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-01 23:17:51 +00:00
f1ori
d0543a7c39 * fix the debug ant-target
* fix yacy-subdomain handling (http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1556)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5307 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-27 22:16:56 +00:00
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used a not really efficient way to detect html encoding strings in texts.
All calling methods had been adoped to call the new class in an enhanced way with less parameters.

Many classes in interfaces used a XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be controlled again since it points out possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
orbiter
6941bf42b1 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-20 14:07:09 +00:00
orbiter
1778fb420d - added some performance tweaks to the new BLOB buffer
- removed the now superfluous HT storage thread
- reduced number of file decompression by shifting the compression moment to the future


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-19 18:10:42 +00:00
orbiter
826ca79735 refactoring and new architecture to store the files of the web cache:
- files are not stored any more as individual files
- a new database structure using BLOBHeap files stores many cache entries in common files
- all file-writing procedures had been migrated to generate byte[] objects which are written with the new database methods

this is only an intermediate step to the final architecture, where cached files are written together with their metadata in one single database structure.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-16 21:24:09 +00:00
lotus
fe2792e9ce use accept-language header instead of user agent for language detection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 17:47:11 +00:00
orbiter
ce2a7ed116 integrated language detection classes into condenser environment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-18 13:12:33 +00:00
orbiter
0cd0fee546 fixed bug with wrong proxy result enqueueing. See:
http://forum.yacy-websuche.de/viewtopic.php?p=8130#p8130
- removed the online status property. This influenced the proxy behavior and created some complexity that was not needed because the online status was never used as it was ceated for (offline browsing)
- checked all proxy identification procedures during crawling and enhanced transparency and error checking
- fixed a proxy identification routine that caused the wrong selection of the proxy result queue

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-16 21:56:23 +00:00
f1ori
ba76995d2c * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1415
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-12 10:54:11 +00:00
danielr
d60b2b198d proxy fixed 'not modified' http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1419
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-10 11:06:22 +00:00
f1ori
bd0318ba81 * YaCy only supports gzip-encoding, so remove any other encoding from request
* fixes http://www.yacy-forum.org/viewtopic.php?f=2&t=163


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-09 14:04:52 +00:00
danielr
cf29ca19d4 possible fix for POST character encoding http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5121 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-07 13:10:46 +00:00
orbiter
05dbba4bab added logging conditions to all fine and finest log line calls
this will prevent an overhead for the generation of the log lines in case that they then are not printed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 00:30:21 +00:00
orbiter
d3d41e2ee4 - fixed problem with searching with quotes (still not complete, but not as bad as before)
- fixed parsing of crawl-delay statements when seconds were given with float numbers
- enhanced performance of profiling (not too many loggings; not more than one per second)
- removed some debug output
- fixed wrong return type in logging
- added a logging condition in httpd to prevent that logging statements are generated when they are not written (should be added everywhere!)
- fixed wrong word distance computation in RWI management


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-02 23:49:48 +00:00
danielr
219b93df6a - fixed internal error after receiving chunked POST
- removed debug output
- added info for "501 Unknown" messages



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5098 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-29 13:51:22 +00:00
danielr
cd19d0aee6 - added warnings for failed transferRWI (dht-in)
- fixed parseMultipart (uncompress gzipped body) (dht-in)
- fixed parseMultipart (using content-length only if uncompressed)
- better gzipped POST (chunked instead of content-length) (dht-out)



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5096 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-29 09:42:39 +00:00
danielr
d6d9b0f14a fixed transferRWI.html 'Read timed out'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-28 08:37:51 +00:00
danielr
e503158527 Proxy: fix for never ending loading after POST
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-27 20:46:34 +00:00
danielr
1a1d57e449 Proxy: added binary passthrough for POST
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-27 08:07:18 +00:00
danielr
9ff4fc11da partial fix (images,audio,video) for proxy and content-type problem http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5084 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-26 16:34:24 +00:00
orbiter
536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-25 18:11:47 +00:00
danielr
4d937f6b21 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1396
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-22 23:46:32 +00:00
danielr
753a1ae430 - changed default browser from netscape to firefox
- fixed "Inefficient use of keySet iterator instead of entrySet iterator" [WMI_WRONG_MAP_ITERATOR, FindBugs]
- fixed some possible null pointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-20 07:54:56 +00:00
orbiter
7989335ed6 Preparations to replace the HTCache with a new storage data structure:
- refactoring of the HTCache (separation of cache entry)
- added new storage class for BLOBs. (not used yet, this is half-way to a new structure)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-19 14:10:40 +00:00
danielr
be28af50f5 - fixed "yacy2yacy no proxy"-problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-17 10:16:32 +00:00
orbiter
bdae051d9a - extended new performance graph (better timing)
- added paths for new libraries in classpath for eclipse
- refactoring to remove compiler warnings (static access to finals variables)
- removed some unused import

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-13 10:37:53 +00:00
danielr
d9cea5ff23 removed annotations which broke the build with java 1.5
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-13 09:07:23 +00:00
danielr
7e7e6a099a undo 5044
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-10 10:54:13 +00:00
danielr
f2d0bd7790 fix for NPE in JakartaHttpClient.setProxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-10 09:37:32 +00:00
danielr
bb6a6fc233 fixed 'FileUploadException Stream ended unexpectedly'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-09 22:44:17 +00:00
danielr
8422ee5ec4 - fixed UnsupportedEncoding (in proxy) using defaultCharset if no characterEncoding can be determined
- serverFileUtils.copy* use now Charset instead of String
- added some warnings for ignored exceptions


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-09 12:00:31 +00:00
danielr
31d97f2b9f replaced httpd.parseMultipart() by a 'right' implementation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-08 01:40:28 +00:00
danielr
621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
- removed unnecessary code (unused variables, String.toString)
- corrected some calculations (cast int to double or long ;)
- improved little performance (using Integer.valueOf() instead of new Integer)
- log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions
- finalized some (more) fields
- finally close some streams
- made inner classes static if not using environment
- generalized some equals (from specificClass to Object)
- fixed some potential nullpointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-06 19:43:12 +00:00
danielr
17b7845eb5 * refactoring
- moved constants from plasmaSwitchboard to own class (all 232 ;)
- moved remoteProxy-Methods to httpRemoteProxyConfig, better names
- removed some unnecessary code (else-statements)
* formatting (correct indentation)
* minor bugfixes (due to findbugs.sf.net)
* hopefully fixed "missing quote" (announcing StringParts as UTF-8)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 13:57:00 +00:00
danielr
3bb870bfcd added final where possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 12:12:04 +00:00
f1ori
b0724e5ec0 * add config option to disable cookie monitoring (disabled by default)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5028 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-30 21:19:06 +00:00
orbiter
c3d461d191 - removed superfluous copyright statement
- updated my email address

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-20 17:14:51 +00:00
danielr
c049d80fbd fixed login problem with yacy as proxy (POST and Cookies)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-19 15:10:00 +00:00
danielr
7c110e07f0 removed debug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-16 12:39:38 +00:00
danielr
eadc204130 gzip POST wiederholbar gemacht (macht transferURL stabiler)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5004 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-16 09:46:25 +00:00
orbiter
e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
the most important fix was the addition of the yacybot user-agent for robots.txt loading,
because web masters look for that access to see if the crawler behaves correctly.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-04 11:03:03 +00:00
danielr
da917cf4b1 undo reduced menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 07:11:13 +00:00
danielr
0c1dc703e4 - set staticIP at startUp
- added setting for reduced menu (simpleMenu)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-29 18:35:15 +00:00
danielr
63eadfdf84 fixed unlimited FileSizeLimit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-24 19:11:27 +00:00
danielr
7feae906aa - organize imports
- removed potential null pointer accesses
- removed unnecessary casts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 16:01:27 +00:00
danielr
d3037c2950 Accept all SSL-certificates (not only valid and self-signed), but put a warning into log file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4888 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-05 15:21:43 +00:00
orbiter
6f1a3fce05 BF Bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4869 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-29 17:27:38 +00:00
lotus
b09af53643 proper links to directories in repository dirlisting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4867 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-28 19:23:44 +00:00
lotus
ba4091c5b2 proxy sends status code now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-27 13:11:06 +00:00
orbiter
0b52ef3e4b - update of grafics
- check for startsWith("127.") instead of equals("127.0.0.0")

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 20:30:37 +00:00
orbiter
11e00a0849 - refactoring of seedURL handling
- additional check for seedURL pointing to localhost: deny such peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4851 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-25 18:35:38 +00:00
orbiter
faed00d75d added use cases to basic configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-20 13:21:55 +00:00
orbiter
c1d721dd2d fix for attacks on localhost-authorized peers from web pages with links to localhost addresses:
checking of referer in access

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4828 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-19 22:17:53 +00:00
orbiter
1127d62b64 some enhancements to the access tracker (less synchronization)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-16 23:33:59 +00:00
orbiter
3bd1db776a implemented switch for admin authorization from localhost:
- access is granted for localhost users to administration pages by default
- the default setting can be changed in the BasicConfig.html page
- if the BasicConfig page was accessed with post and no password was submitted, a random password is generated
- a headless installation MUST give a password upon first call of the configuration page, otherwise they will not be able to access it again
- if no password is given within 10 minutes after start-up, a random password is generated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4804 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-15 11:26:43 +00:00
orbiter
cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
from the ConfigNetwork online interface
- to make this possible, a large refactoring and reorganisation of data structures was necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4803 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-14 21:36:02 +00:00
orbiter
239cc4428d - better domain graph, faster when more links exist, looks better
- new authorization rule: localhost is always authorized for administration. This solves many problems with ajax, and also fixed a problem in rssTerminal
- fix bug in RSSFeed which prevented that entries had been recognized as individual, new entries
- added reloading/updating of status image on status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4796 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-12 22:23:29 +00:00
lotus
9bc56a9edc xss protection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-07 16:37:13 +00:00
orbiter
d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back)
This change is inspired by the need to see a network connected to the index it creates in a indexing team.
It is not possible to divide the network and the index. Therefore all control files for the network was moved to the network within the INDEX/<network-name> subfolder.
The remaining YACYDB is superfluous and can be deleted.
The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy).
The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT).
No other functional change has been made. The next steps to enable network switcing are:
- shift of crawler tables from PLASMADB into the network (crawls are also network-specific)
- possibly shift of plasmaWordIndex code into yacy package (index management is network-specific)
- servlet to switch networks 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-05 23:13:47 +00:00
danielr
d70a472460 added file for previous commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4764 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-05 05:19:01 +00:00
danielr
8c5f062e0b corrected YaCy version in HTTP User-Agent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4762 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-04 12:18:00 +00:00
danielr
d7b21bc90c re-added gzip POST for transferRWI/URL (HTTP/1.1 compliant)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-04 10:53:04 +00:00
danielr
d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-03 09:06:00 +00:00
danielr
be2c9c07ff escape some unescaped characers in URLs (fixes problems with proxy)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4753 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-02 08:19:47 +00:00
danielr
ec84a52adb change for problem with NPE (seen as "PROXY Unknown Error while processing request")
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4745 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-29 16:06:54 +00:00
orbiter
ff755fb858 small corrections and enhancements after search timing profiling
search should be a little bit faster now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4734 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 13:31:55 +00:00
danielr
d1ee231866 HTTPC close more unused connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4702 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-15 16:37:51 +00:00