692 Commits

Author SHA1 Message Date
danielr
8b2efb6f8c fixed garbage in HTCACHE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 16:46:45 +00:00
danielr
fb541f9162 HTTPC: default timeout half-hour
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 09:48:49 +00:00
danielr
a94f6cdca4 HTTPC: allowed self-signed certs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4659 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 09:21:43 +00:00
danielr
ab330cfdca Network.html: removed ; from location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 08:13:38 +00:00
danielr
5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:17:16 +00:00
orbiter
daa04f5db9 added additional check in file handler to prevent that url attacks are hidden in url path encodings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4637 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-04 12:15:27 +00:00
orbiter
7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
- refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling
- removed unused code parts from condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 15:37:49 +00:00
orbiter
d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
this is another step to enable multiple, concurrent fulltext-indexes
- another try to make the yacy-httpc more stable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 14:13:05 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
4c584dff87 disabled soLinger to prevent that too many connections stay open (it's a TEST!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:46:55 +00:00
orbiter
9c989fe5f7 fixed deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 00:49:16 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
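The buffer limit in commit fa1090113d follows from the arithmetic 1500 - 52 = 1448. A minimal Java sketch of copying a stream in chunks no larger than that is shown here. The class and method names are hypothetical and not taken from the YaCy source, and the 52-byte overhead is simply the figure quoted in the commit message.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedCopy {
    static final int MTU = 1500;                 // MTU value quoted in the commit message
    static final int OVERHEAD = 52;              // header overhead assumed by the commit message
    static final int MAX_CHUNK = MTU - OVERHEAD; // 1448 bytes

    /** Copies the whole stream, never writing more than MAX_CHUNK bytes per call. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[MAX_CHUNK];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            out.flush(); // hand each chunk to the transport separately
            total += n;
        }
        return total;
    }
}
```

Whether a write of this size really maps to one network packet depends on the socket and operating system, so this only illustrates the size limit, not a guaranteed fix.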
orbiter
d87d295c68 one more try to fix the connection problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 13:13:11 +00:00
orbiter
9eddc1506b - one try to fix the httpd problem
- fix for handling of collection index that appears when removing elements
- added another navigation method (stub, not working yet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:58:22 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
borg-0300
3445b1e10b *better logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 13:41:54 +00:00
borg-0300
4b0339fec0 *fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=927
*remove some cast
*Properties added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4525 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 13:29:42 +00:00
orbiter
275a226cc5 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-04 22:45:45 +00:00
orbiter
4fdf695064 - fixed a bug in remote search that prevented that any results had been generated (!)
- added a great number of printStackTrace and new exceptions that shall be used to find the cause
  for a bug in yacy client-server communication which causes the interruption of data transfer
  which then causes the parser bug for the seed strings.
- tried to fix the communication bug on server-side (copy functions)
Be aware that the log may be full of errors and bugs - there should not be more bugs but there is more to see


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 23:12:43 +00:00
orbiter
3f321ece7d added a search history to the new search page
the history distinguishes between different users and identifies them by their ip
a history is only shown to the user who submitted the search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 21:26:49 +00:00
orbiter
87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
- enhanced text parser (condenser): found and eliminated bad code parts; increase of speed
- added handling of image preview using the image cache from HTCACHE
- some other minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 14:08:15 +00:00
orbiter
6c3cd2b4f2 - added new way to watch images from the image search:
they appear as separate, floating window above the search results,
  not in a new window
- added highslide javascript library for feature mentioned above
- removed dir servlet. This thing was not used as it was supposed to be (as an example applet)
  and was a major problem for intranet-indexing when files are hosted on the same peer.
- added yacy-httpd-internal directory listing. Because YaCy is a search engine,
  directory listings are similar to search result listings. Intranet indexing from the same peer
  will get nice index pages for document collections.
- removed unused test applet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-18 16:38:06 +00:00
orbiter
bd63999801 - faster search: using different data structures that avoid multiple calculations
- no more table copy for error-eco table
- optional table copy for lurl-entries
- more abstractions (less single constant strings)
- better logging (using host names instead of ips)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-07 22:16:36 +00:00
orbiter
acf771d5e1 - fixed bug with too much RAM in crawler queue
- fixed dir bug
- better calculation of TF for join
- better waiting-on-result logic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-31 23:40:47 +00:00
orbiter
0f5c4abaca more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 10:12:48 +00:00
orbiter
1a296af6ff more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 20:08:32 +00:00
orbiter
4a80902081 - added ViewProfile as rdf in foaf syntax
- added link to rdf and vCard version on html page
- can be seen on http://localhost:8080/ViewProfile.html?hash=localhash
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 18:21:08 +00:00
orbiter
15397298dc - refactoring of indexControlRWIs: moved statics to own class; better Dublin Core naming
- fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=759&hilit=&p=4866#p4866
- some bugfixes in EcoTable according remove method
- switched more tables to Eco: crawl Profiles, htcache, seeddb, newsdb

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 22:49:00 +00:00
orbiter
03e7782269 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 19:23:38 +00:00
low012
ae6d07bdb8 *) "Did you mean:" will only be displayed if the list of suggested URLs is not empty.
*) Removed <hr /> to make the "404 Unknown Host" error page look like the other 404 error pages.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-01 23:03:02 +00:00
orbiter
df2a7a8ac8 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 18:47:45 +00:00
fuchsi
d517e96714 last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-21 00:53:46 +00:00
hermens
4748d5c1ab Some enhancements to time management:
- remove unnecessary generation of Calendar and Date objects
- synchronized SimpleDateFormat objects in blog-, message- and wikiBoard
- correct use of TimeZones and SimpleDateFormats



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 17:11:35 +00:00
fuchsi
f41172f850 Merge httpDate into serverDate as suggested. Removed some unnecessary code and fixed a possible synchronization problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4283 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-18 22:35:02 +00:00
fuchsi
21f7e13fa1 fix stupid tiny bug introduced in rev 4276 that broke request URL parsing almost completely
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4277 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-16 00:33:32 +00:00
fuchsi
5d406d0094 - fixed url "file extension" parsing when there is no extension (like http://yacy.net/ would have extracted .net/)
- removed unnecessary code + minor cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-14 20:03:26 +00:00
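The extension bug described in commit 5d406d0094 can be avoided by looking for the dot only in the last path segment, never in the host name. The following is a sketch under that assumption. `UrlExtension` and `fileExtension` are hypothetical names, and this is not the code the commit actually changed.

```java
import java.net.URI;
import java.util.Locale;

public class UrlExtension {
    /** Returns the lower-case extension of the last path segment, or "" if there is none. */
    static String fileExtension(String url) {
        String path = URI.create(url).getPath(); // path only: no host, no query string
        if (path == null) return "";
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) return "";
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

With this approach `http://yacy.net/` yields an empty extension, because the `.net` belongs to the host and the path is just `/`.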
fuchsi
21b8d1b918 small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-14 19:17:54 +00:00
orbiter
ca488e03f5 fixed authorization case
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4262 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 02:04:48 +00:00
orbiter
e22014dc83 some memory enhancements when generating and displaying ymage objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 02:15:12 +00:00
fuchsi
39d0f10ca1 Fix parsing of dates in HTTP headers.
RFC 2616 requires a client to support RFC 1123 (default), RFC 1036 and ANSI C formatted date strings (we only supported 1123 before).

Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=525 (and probably others). There are servers which break the standards, please report those "DATE ERROR" messages if they contain a "sane" date string.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 20:47:27 +00:00
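The three date formats named in commit 39d0f10ca1 can be tried one after another. This is a minimal sketch using `java.text.SimpleDateFormat`. The class name `HttpDateParser` is hypothetical, and the code illustrates the idea rather than the implementation that was committed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class HttpDateParser {
    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss zzz", // RFC 1123, e.g. Sun, 06 Nov 1994 08:49:37 GMT
        "EEEE, dd-MMM-yy HH:mm:ss zzz",  // RFC 850 / RFC 1036, e.g. Sunday, 06-Nov-94 08:49:37 GMT
        "EEE MMM d HH:mm:ss yyyy"        // ANSI C asctime(), e.g. Sun Nov  6 08:49:37 1994
    };

    /** Returns the parsed date, or null if the value matches none of the three formats. */
    static Date parse(String value) {
        for (String pattern : PATTERNS) {
            // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            try {
                return format.parse(value.trim());
            } catch (ParseException e) {
                // fall through and try the next pattern
            }
        }
        return null;
    }
}
```

Creating a formatter per call trades a little speed for thread safety. A shared formatter would need synchronization.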
orbiter
9b0ae4b989 added referrer to remote crawl url list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 13:58:00 +00:00
orbiter
af10f729df fixed image search and favicon loading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 01:34:29 +00:00
orbiter
c527969185 - enhanced monitoring of ranking parameters
for details, please try http://localhost:8080/IndexControlRWIs_p.html
- fixed computation of ranking ordering in some cases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-16 14:48:09 +00:00
low012
383dc815d2 *) fix for commit 4212
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-14 19:14:53 +00:00
fuchsi
425e4ead66 Allow absolute paths in configuration settings.
- before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly every dynamically generated data with a configurable path to a location outside of yacys root dir without having to use symlinks (probably good for third party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the applications root path.

- exceptions (hardcoded): 
  DATA/LOG/yacy.logging
  DATA/SETTINGS/httpProxy.conf
  DATA/SETTINGS/user.db
TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example.

- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-04 10:36:25 +00:00
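The path rule described in commit 425e4ead66 (absolute paths are used as given, relative paths are resolved against the application root) can be sketched as follows. `ConfigPaths.resolve` is a hypothetical stand-in for illustration and does not reproduce the signature of `abstractServerSwitch.getConfigPath`.

```java
import java.io.File;

public class ConfigPaths {
    /**
     * Resolves a configured path: an absolute value is returned unchanged,
     * a relative value is placed below rootPath. Falls back to defaultValue
     * when nothing is configured.
     */
    static File resolve(File rootPath, String configured, String defaultValue) {
        String value = (configured == null || configured.isEmpty()) ? defaultValue : configured;
        File candidate = new File(value);
        return candidate.isAbsolute() ? candidate : new File(rootPath, value);
    }
}
```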
orbiter
a31b9097a4 preparations for mass remote crawls:
two main changes must be implemented to enable mass remote crawls:
- shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote
  crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused
  as crawl agent for unwanted file retrieval
- implement new index files that control double-check of remotely crawled urls

After removal of robots.txt checking from stacker threads, the multi-threading of this process is void.
Multithreading has been removed. Also the thread pools for the crawl threads had been removed, since
creation of these threads is not resource-consuming, for a detailed explanation see svn 4106

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-29 01:43:20 +00:00
fuchsi
0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
- put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation.
- putASIS(...) have been removed, now done with simple put(...) (see above).
- putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()).
- putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ".
In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value.
A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values.

* added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456
* removed duplicate code (mostly related to the big changes above).

TODO:
- make sure, number formats work as expected _everywhere_, report overseen stuff http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
- probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting.
- further improve the speed of page creation for the WatchCrawler.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-24 21:38:19 +00:00
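The division of labour between put, putNum and putHTML described in commit 0e1738899f might look roughly like this. It is a simplified, hypothetical sketch: `TemplateValues` is not the real `serverObjects` class, and the escaping covers only the four characters that the commit lists for its "forXML" mode.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class TemplateValues {
    private final Map<String, String> values = new HashMap<>();
    private final Locale locale;

    TemplateValues(Locale locale) { this.locale = locale; }

    /** Stores the value as it is. */
    void put(String key, String value) { values.put(key, value); }

    /** Stores a number as a plain, unformatted string. */
    void put(String key, long value) { values.put(key, Long.toString(value)); }

    /** Stores a number formatted for the configured locale. */
    void putNum(String key, long value) {
        values.put(key, NumberFormat.getIntegerInstance(locale).format(value));
    }

    /** Stores the value with the characters < > & " escaped. */
    void putHTML(String key, String value) { values.put(key, escape(value)); }

    String get(String key) { return values.get(key); }

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Keeping the escaping in a separate method means plain put(...) calls skip the per-character scan, which matches the performance argument in the commit message.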
fuchsi
f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers.
- some minor code cleanups (mostly unnecessary casts, null checks)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 04:13:46 +00:00
orbiter
711641f167 extended client connection clean-up:
there are now two time-outs, one for the complete connection time, and one for an idle time
connections that are idle for more than 2 minutes are closed, and connections that are alive since more than one hour are also closed
if the complete number of connections exceeds 64, all connections more than 64 and have most idle time are also closed

During normal operation of peers these forced closings should never appear,
but the existence of the idle connection check ensures the availability of the peer and the usability of the host.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:06:12 +00:00
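The three closing rules in commit 711641f167 (idle for more than 2 minutes, alive for more than one hour, and more than 64 connections in total) can be expressed as a selection function. The sketch below is an assumption about how such a check could be structured. `ConnectionCleanup` and `Conn` are hypothetical names and do not come from the YaCy source.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ConnectionCleanup {
    static final long MAX_IDLE_MS = 2 * 60 * 1000L;  // idle time-out: 2 minutes
    static final long MAX_AGE_MS = 60 * 60 * 1000L;  // total connection time-out: 1 hour
    static final int MAX_CONNECTIONS = 64;           // upper bound on open connections

    static final class Conn {
        final String name;
        final long startedAt;     // milliseconds
        final long lastActiveAt;  // milliseconds
        Conn(String name, long startedAt, long lastActiveAt) {
            this.name = name;
            this.startedAt = startedAt;
            this.lastActiveAt = lastActiveAt;
        }
    }

    /** Returns the connections that should be closed at time 'now'. */
    static List<Conn> selectForClosing(List<Conn> open, long now) {
        List<Conn> toClose = new ArrayList<>();
        List<Conn> survivors = new ArrayList<>();
        for (Conn c : open) {
            boolean idleTooLong = now - c.lastActiveAt > MAX_IDLE_MS;
            boolean tooOld = now - c.startedAt > MAX_AGE_MS;
            if (idleTooLong || tooOld) toClose.add(c); else survivors.add(c);
        }
        // If still above the limit, close the connections that have been idle longest.
        survivors.sort(Comparator.comparingLong((Conn c) -> c.lastActiveAt));
        int excess = survivors.size() - MAX_CONNECTIONS;
        for (int i = 0; i < excess; i++) toClose.add(survivors.get(i));
        return toClose;
    }
}
```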
orbiter
b19bb6e5b1 - reverted svn 4132; this did not solve the problem and removed the emergency method which caused production failure for sure within some hours
- removed and added some debugging lines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 14:34:05 +00:00
fuchsi
1eba408d2f Make sure that sockets which couldn't be opened aren't handled as active connections, in which case they wouldn't be closed.
Please test this and report any problems (connections that stay open for a very long time according to http://<your_yacy_peer>/Connections_p.html) to http://forum.yacy-websuche.de/viewtopic.php?f=5&t=386

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 12:18:26 +00:00
orbiter
d69d386f7d added additional forced client connection closing
if a specific number of simultaneous connections is reached
the limit is currently set to 64 connections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:21:53 +00:00
orbiter
dea7bee049 - increased minimum time before an active connection is interrupted from 1 minute to 10 minutes
- added sorting by connection time in client connection table of connectionTimeComparatorInstance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 23:56:04 +00:00
orbiter
01e0669264 re-designed some parts of DHT position calculation (effect is the same as before)
and replaced old first hash computation by new method that tries to find a gap in the current dht
to do this, it is necessary that the network bootstrapping is done before the own hash is computed
this made further redesigns in peer initialization order necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:30:23 +00:00
orbiter
842308ea97 - redesigned crawl start menu, integrated monitoring pages
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database
- extended crawl profile edit servlet, shows now also terminated crawls
- option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues!
- fixed here and there problems with indexing queues
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown

attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:21:31 +00:00
orbiter
2f1ff048ba some fixes to socket connection time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 23:45:05 +00:00
orbiter
3c74014004 automatic deletion of dead client connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 22:46:11 +00:00
orbiter
11b4f80bde - fixed non-closing client connections
- added client connection tracker in connections servlet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 21:36:08 +00:00
orbiter
d352853f2d fix for non-closing client sessions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4107 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-24 08:42:07 +00:00
orbiter
1488769e1f cleanup of unmaintained and outdated performance methods:
removed object pools in httpc. Object pooling is not recommended,
if the creation of the object is not time-intensive. Object pools are only useful,
if there is much computation necessary to create some basic data that is stored
in the object pool and can be re-used. This does not apply to object pools in YaCy.
Object pooling of client sessions would make sense if they would allow re-use of
living connections to other yacy clients. But every connection is closed after usage
of an object in the client pool, therefore the YaCy server client objects are not such
that hold hardware/network-allocated entities.
See:
http://www.javaperformancetuning.com/news/qotm033.shtml
http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling
http://docs.sun.com/source/816-7159-10/pt_chap5.html
http://www.microjava.com/articles/techtalk/recylcle2


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 20:49:52 +00:00
orbiter
3cb9cdc9be try to fix connection problem, possible cause for wrong junior status and non-passive passive peers:
the YaCy client treats disconnections during data transmissions as error and discards all data transmitted so far
this did not happen so far until I removed a delay time at the end of the daemon session which prevented this case.
To fix this problem, disconnections during transmissions are not treated as error now, which means that end-of-transmissions
with sudden disconnections are not a cause for peer disconnections any more. To be nice to non-updated peers, the sleep time
at the end of server sessions is also re-enabled.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 17:31:29 +00:00
fuchsi
e192f99134 fix small bug introduced in r4089 that appeared when we tried to remove "gzip" encoding from Accept-Encodings header
closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=336

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 21:46:40 +00:00
fuchsi
ae4b9308ef Fix problems with some web servers which couldn't handle the way yacy was sending requests. Thx to celle for the patch.
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=320

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 09:15:28 +00:00
orbiter
daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
search profiling showed that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. As a result, each urlhash computation needed 100-200 milliseconds, which delayed remote searches by at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Some parts that had already been slated for removal were removed during this change to avoid unnecessary maintenance of unused code.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 09:01:35 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internal threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servlet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
6d759ad0a7 - new bot address
- removed unused skins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4065 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-29 11:46:42 +00:00
orbiter
f9e6cf6a3d more refactoring of search:
integrated first version of ssi-using search interface,
but the function is currently disabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-28 12:15:46 +00:00
orbiter
bb426565f0 added new yacy protocol for mass url-pull for better remote crawling distribution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-22 00:59:05 +00:00
orbiter
b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 21:31:31 +00:00
orbiter
61f93cbf14 some code-cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-11 00:42:04 +00:00
orbiter
24e25e1141 enhanced SSI server-side support:
- SSIs may now refer to servlets, not only files
- calling a servlet, the servlet/SSI engine is called recursively
- SSIs now work also for non-chunked-encoding supporting clients
This will support the new search page functionality, to show search results
dynamically without using javascript. To test this method, a test page has been added
http://localhost:8080/ssitest.html
..calls dynamically 3 servlets, which produce some delays during their execution
please verify that you can see the result step-by-step on your browser
To implement this feature, some refactoring had been taken place, mostly code
had been made static and will execute faster.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-09 21:58:38 +00:00
orbiter
57a5b6fa71 some generalization of remote proxy configuration and setting handling in httpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-02 00:42:37 +00:00
orbiter
367fc28928 corrected Brausse->Brausze
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-01 22:15:51 +00:00
orbiter
e76fe1c078 - replaced unicode characters in copyright holder name ('Brausse')
- more logging for bootstrap seedlist loading
- larger DHT chunks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-31 10:00:17 +00:00
orbiter
75d1437340 fix for http://forum.yacy-websuche.de/viewtopic.php?p=1123#p1123
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4002 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 13:39:53 +00:00
orbiter
9ca46a8c69 indexing of local (intranet) urls enabled
To do this, one must create a separate YaCy network that has a local URL domain
A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 00:46:17 +00:00
orbiter
6758beae9c fix for http://forum.yacy-websuche.de/viewtopic.php?p=1092#p1092
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-23 09:54:12 +00:00
orbiter
40b0547611 - documentation changes (removed old forum links)
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 15:32:10 +00:00
orbiter
26ddf797eb added bmp and ico image format to all parser/viewing methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 12:55:41 +00:00
low012
1ea5fa2c04 *) Changed a comment to get rid of this message:
[javac] /home/low012/subversion/yacy/trunk/source/de/anomic/http/httpc.java:1117: warning: unmappable character for encoding UTF8
    [javac]             // if download == null, the get result is stored to a byte[]�and returned,
*) Changed broken link (see: http://forum.yacy-websuche.de/viewtopic.php?t=128)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3956 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-07 09:04:41 +00:00
orbiter
a9e73b6852 fixed great mess with localization paths. the problem was:
automatic re-translation after update did not work. hopefully now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-04 10:32:30 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
84be912e90 fix for null pointer exception that occurred when missing user-agent in request header
see also http://forum.yacy-websuche.de/viewtopic.php?f=6&t=78&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3943 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-30 18:08:10 +00:00
orbiter
e03fcf4627 SSI fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=29
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3936 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 10:45:13 +00:00
orbiter
9bbd39b67c - removed unfinished auto-updater from roland and martin
- added new download-option for releases on the status page
still missing:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 14:52:26 +00:00
orbiter
154ffd7c2c fix for wrong http connection version and SSIs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3928 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 15:06:23 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
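The include directive named in the commit above can be located with a small scanner. This is only a sketch: the class name `SsiIncludeScanner` and the regular expression are assumptions for illustration and are not taken from the YaCy source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of YaCy: finds SSI include directives in a page.
final class SsiIncludeScanner {

    // Matches <!--# include virtual="path" -->, allowing optional whitespace
    // after '#' and before the closing '-->'.
    private static final Pattern INCLUDE =
            Pattern.compile("<!--#\\s*include\\s+virtual=\"([^\"]*)\"\\s*-->");

    /** Returns the virtual paths of all include directives, in document order. */
    static List<String> findIncludes(String html) {
        List<String> paths = new ArrayList<>();
        Matcher m = INCLUDE.matcher(html);
        while (m.find()) {
            paths.add(m.group(1));
        }
        return paths;
    }
}
```

A server could stream the text between two matches as one chunk and then the included file as the next, which is one way partly delivered pages become visible before the transfer ends.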
orbiter
0e57a8062b added network definition for different YaCy networks
(needs much more work)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-22 14:29:14 +00:00
orbiter
6518bb6c08 changed release strategy:
we will provide two different releases in the future, one standard release and one 'pro'-release.
the 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version by these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version has been initialized, its configuration stays the same regardless of whether a libx folder exists.
The ant targets have been changed. There are now 3 different targets to create standard and pro-releases, and one target to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases has been removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-16 14:11:52 +00:00
allo
465145cb6f revert to insecure, but dau-proof defaults
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3898 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 12:56:52 +00:00
allo
7ad11ceaaa security fix for peers without password. allow access only from localhost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3897 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 00:03:44 +00:00
orbiter
e4aa8f2a08 disabled more sleep(200)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3889 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 11:02:28 +00:00
orbiter
cb38e57622 reduced httpd final waiting time
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3888 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 10:28:37 +00:00
orbiter
b4585ad67d In summer 2005 the first pings were exchanged between YaCy peers.
Strangely, that did not always work. To test the protocol I wrote a simple message function, which is still in YaCy today.
But the messages did not work properly either. Alex and I searched for a long time and never found the bug. It turned out that a timing detail could solve the problem; to this day we have not found the cause.
The solution consisted of a short sleep just before the httpd wrote data back to the client. That is of course a terribly bad solution.
Until today this was still in the httpd. With this commit I have commented out the sleep, and it is to be feared that something will stop working again.
If the network now collapses, pings no longer arrive or the like, then it was this sleep that prevented it.
Suggestions welcome.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3887 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 08:18:10 +00:00
karlchenofhell
669f840eab - added ViewProfile / Impressum (default on) to local peer's robots.txt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3874 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-12 13:48:27 +00:00
orbiter
66ec8b63c1 added a httpd access tracker:
- all requests to the peer's own httpd can now be listed in the access tracker menu
- the search statistics have been renamed to access tracker and extended by this tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3861 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 14:05:20 +00:00
orbiter
33ad0c8246 added a web structure computation and logging:
- all web page parsing operations will now increase a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using a xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 08:13:48 +00:00
karlchenofhell
601fc7d1c5 - added source to J7Zip-modifed.jar and its license (changelog is still to come)
- moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools
- prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 13:29:12 +00:00
orbiter
26f05d1fd0 avoid division by zero if search is done for no words
this case is relevant if the bluewords (yacy.blue) are used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3698 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 22:10:12 +00:00
orbiter
2fa8b50e54 reverting svn 3691+3692
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3696 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 19:31:40 +00:00
orbiter
22a0e9f117 more timeout-control
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 14:53:17 +00:00
orbiter
24db55a541 added timeout for httpd-sockets during read
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3691 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 14:30:01 +00:00
orbiter
111ba9e359 - fixed some width problems in new status page
- fixed deadlock in dns cache
- added termination security for DHT peer selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-05 23:18:00 +00:00
orbiter
29fe2beac7 possibly fixed a deadlock
cannot find forum link now for that

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-24 21:55:57 +00:00
theli
c2e6afbd69 *) bugfix: setting mimeType properly for dir listing with e.g. "?format=xml"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-23 05:37:19 +00:00
theli
f20b596dc0 *) adding servlet to display all deployed SOAP Services
- soap related servlets are located in htroot/soap
*) new serverContext class for soap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-22 08:30:57 +00:00
theli
81b4598487 *) peer profile can now be displayed as vcard
e.g. http://localhost:8080/ViewProfile.vcf?hash=localhash

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 15:08:18 +00:00
theli
91c2a042a7 *) bugfix for wrong proxy traffic accounting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 13:52:48 +00:00
orbiter
5b0a84ce09 fix for synchronization deadlock with flushMissNameCache.
see also: http://www.yacy-forum.de/viewtopic.php?p=32939#32939

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-12 09:06:57 +00:00
orbiter
a1fb8358b2 let's make a well-formed http link so that other crawlers have no problem following it :-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3463 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 12:35:54 +00:00
orbiter
4edb70f68b added yacybot info-page from Roland
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3462 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 12:26:31 +00:00
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the newly split index, a huge number of classes needed to be adapted to new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
karlchenofhell
88245e44d8 - improved version of robots.txt (delete your old htroot/robots.txt before updating):
- robots.txt is a servlet now
  - no need to rewrite the whole file each time a section is added or removed
  - user-defined disallows, added manually, won't be overwritten anymore
- new config-setting: httpd.robots.txt, holding names of the disallowed sections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-02 01:19:38 +00:00
karlchenofhell
a1d68fe092 - use .class rather than Class.forName for classes in class-path
- added Bost's patch for Diff.findDiagonale() from: http://www.yacy-forum.de//files/patch_685.txt
- fixed minor bugs in Blog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 22:52:22 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bugfix for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
orbiter
bf69a721cb more protection against mis-use of YaCyHop interface:
- target must not be at port 80
- target access not more than every 3 seconds
- requester may not access more than every 10 seconds

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3357 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-09 15:25:10 +00:00
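The three restrictions listed in the commit above can be pictured as one guard that is consulted before a hop request is forwarded. The sketch below is an assumption about how such a check could look; `HopAccessGuard`, its method and its fields are hypothetical names, and only the port and the two intervals come from the commit message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical guard, not YaCy code: applies the three YaCyHop limits.
final class HopAccessGuard {
    private static final long TARGET_MIN_INTERVAL_MS = 3_000;     // per target
    private static final long REQUESTER_MIN_INTERVAL_MS = 10_000; // per requester

    private final Map<String, Long> lastTargetAccess = new HashMap<>();
    private final Map<String, Long> lastRequesterAccess = new HashMap<>();

    /**
     * Returns true if the hop may proceed. Rejected attempts do not refresh
     * the stored timestamps, so a blocked caller is not penalized further.
     */
    synchronized boolean allow(String requester, String targetHost, int targetPort, long nowMs) {
        if (targetPort == 80) {
            return false; // target must not be at port 80
        }
        Long lastTarget = lastTargetAccess.get(targetHost);
        if (lastTarget != null && nowMs - lastTarget < TARGET_MIN_INTERVAL_MS) {
            return false; // target was accessed less than 3 seconds ago
        }
        Long lastRequester = lastRequesterAccess.get(requester);
        if (lastRequester != null && nowMs - lastRequester < REQUESTER_MIN_INTERVAL_MS) {
            return false; // requester was served less than 10 seconds ago
        }
        lastTargetAccess.put(targetHost, nowMs);
        lastRequesterAccess.put(requester, nowMs);
        return true;
    }
}
```

Passing the clock value in as `nowMs` keeps the sketch deterministic; a caller would normally supply `System.currentTimeMillis()`.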
orbiter
c464157a6e replaced some toString()
see http://www.yacy-forum.de/viewtopic.php?p=31151#31151

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:26:56 +00:00
orbiter
b4aa195c27 added user-agent check for yacy-hop proxy authentication
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 09:53:02 +00:00
orbiter
d25caa07bf redesigned some parts of http authentication
added another access check for peer hops

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-05 19:46:50 +00:00
karlchenofhell
2401e748a3 - fixed wrong replacement of POST-parameters in httpd ('<' and '>' are still replaced, don't know why): http://www.yacy-forum.de/viewtopic.php?t=3466
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3324 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-03 01:26:05 +00:00
karlchenofhell
e68cdeeeb3 - reverted parseArg(String) to use a byte-array to handle correct UTF-8 parsing
- arguments aren't passed html-escaped to the servlets anymore, bug-fix for http://www.yacy-forum.de/viewtopic.php?p=30573

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3321 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 21:20:53 +00:00
orbiter
47ab83a7c0 added flag for YaCyHop - proxy access for all paths that start with /yacy/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3304 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 00:09:51 +00:00
allo
25c7d4e25e fix for form (cookie) login
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3284 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-27 17:22:49 +00:00
karlchenofhell
7c40197e42 - fixed error pages and <label>s for index.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 04:20:19 +00:00
allo
b4457763e5 fix for putSafeXML and supertemplates.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 21:06:31 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
orbiter
5515571950 redesign of ymage classes
- less memory usage
- better usage of awt classes
- drawing abstractions: preparations for movable objects for animation class
- test applet for animations
- known bugs: wrong colours for network picture

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 23:31:50 +00:00
karlchenofhell
b873ad51ab - fix for http://www.yacy-forum.de/viewtopic.php?t=3369
- merged netBude's alternative for tables in yacysearch.html & search results valid
- added statistic info to index.html as proposed here: http://www.yacy-forum.de/viewtopic.php?p=29762#29762
- fixed error-log in httpTemplate

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3189 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 00:52:38 +00:00
karlchenofhell
340dc52a9d - ConfigProfile_p.html now transmits usable encoding for other than 7-bit ASCII charset, see TODO in httpd.parseArg(String)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3174 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-07 02:07:27 +00:00
karlchenofhell
00aa9472d6 - added decode of HTML-entities in request lines
- removed Bookmark symbol on search pages and surftips if not authenticated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-06 11:05:50 +00:00
orbiter
0a050bc043 enhanced ranking
- redesign of data storage in plasmaSearchRankingProfile
- profiles are extended by new ranking parameters
- new RWI ranking parameters are considered during ranking
- appearance attributes (e.g. emphasised text) are now considered
- faster ranking
- some attributes that had been checked during post-ranking can now be
  checked during pre-ranking phase
- removed old ranking parameter on index.html page (will be replaced by profiles in the future)
- ranking can now consider appearances of media content
- snippet-loading for media types now works correctly (fetches only from the wanted media)
- ranking-profiles can be handed over to remote peers and are applied there as well
- re-search of same query with different domain now also re-triggers remote search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-20 15:44:29 +00:00
orbiter
d0c32c6aeb better protection against fraudulent peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3104 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-20 01:07:49 +00:00
karlchenofhell
e17591acc3 - parse HTML arguments as UTF-8 strings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3085 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-16 21:40:59 +00:00
karlchenofhell
d30932c7d8 - fix for fix... sry
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3084 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-16 16:43:52 +00:00
karlchenofhell
6118fb73ec - added decode of UTF-16 escapes in url-arguments (%u0123), bugfix for http://www.yacy-forum.de/viewtopic.php?t=2762
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3083 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-16 16:40:40 +00:00
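The non-standard `%u0123` form mentioned in the commit above carries one UTF-16 code unit as four hex digits. A minimal decoding sketch follows, assuming that malformed sequences should be passed through unchanged; `PercentUDecoder` is a hypothetical name and this is not the code from `httpd.java`.

```java
// Hypothetical helper, not YaCy code: decodes %uXXXX escapes in URL arguments.
final class PercentUDecoder {

    /** Replaces every well-formed %uXXXX sequence by the UTF-16 code unit it names. */
    static String decode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            // A candidate needs '%', 'u' and four more characters.
            if (c == '%' && i + 6 <= s.length() && s.charAt(i + 1) == 'u') {
                int unit = parseHex4(s, i + 2);
                if (unit >= 0) {
                    out.append((char) unit);
                    i += 6;
                    continue;
                }
            }
            out.append(c); // ordinary character or malformed escape: copy as-is
            i++;
        }
        return out.toString();
    }

    /** Parses four hex digits starting at 'start'; returns -1 if any is not hex. */
    private static int parseHex4(String s, int start) {
        int value = 0;
        for (int k = start; k < start + 4; k++) {
            int digit = Character.digit(s.charAt(k), 16);
            if (digit < 0) {
                return -1;
            }
            value = value * 16 + digit;
        }
        return value;
    }
}
```

Regular `%XX` escapes are deliberately left untouched here; they would be handled by the normal URL decoding step.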
orbiter
fb7902aa68 fix for http://www.yacy-forum.de/viewtopic.php?p=26142#26142
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-01 11:01:56 +00:00
orbiter
984285bdd6 better organisation of dns hit/miss cache flush
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3016 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-27 15:53:42 +00:00
orbiter
73c63578ad - activated the dns miss cache
- added a cache-control for cache miss flush to the dns miss cache
- better naming of cache variables to distinguish hit- and miss- cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-27 15:27:43 +00:00
orbiter
e3d75f42bd final version of collection entry type definition
- the test phase of the new collection data structure is finished
- test data that had been generated is void. There will be no migration
- the new collection files are located in DATA/INDEX/PUBLIC/TEXT/RICOLLECTION
- the index dump is void. There will be no migration
- the new index dump is in DATA/INDEX/PUBLIC/TEXT/RICACHE

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2983 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-19 20:05:25 +00:00
orbiter
d34f10c63d some tests with reverse dns lookup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-12 00:28:10 +00:00
(no author)
a51417d86b Bugfix: language of ConfigLanguage_p.html was not changed properly when a different language was chosen here
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-09 22:18:16 +00:00
theli
f77d624b94 *) bugfix for persistent connection support on transfer-encoded requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2942 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-09 05:59:56 +00:00
orbiter
114a76a86e - added flag to urlhash that shows that domain is a local domain
- enhanced local domain detection
- bugfixing for memory assignment in kelondroFlexSplit
- automatic memory assignment to caches according to available RAM
- bugfixes for details during search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2924 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-06 02:05:39 +00:00
(no author)
e59ff8b657 Bugfix: language of ConfigBasic.html was not changed properly when a different language was chosen here. Note: there's a similar bug on ConfigLanguage_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2921 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-05 17:43:37 +00:00
theli
29a1f132ec *) some strings replaced by constants
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2910 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-04 16:33:02 +00:00
orbiter
215c4e65f1 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2887 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-31 22:10:25 +00:00
theli
532c23b5c7 *) soap handler
- better errorhandling 
   - adding support for outgoing transfer- and content-encoding
   - avoid holding outgoing messages into memory before sending them

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2872 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-28 12:31:48 +00:00
theli
777e39cea0 *) new template to display the dir-listing in xml format.
This can e.g. be done by using the url http://localhost:8080/share/?format=xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2856 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 12:13:46 +00:00
theli
88cfdecd38 *) Bugfix: calling close must not close the wrapped input stream, otherwise
keep-alive connections would terminate

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 06:09:38 +00:00
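The bugfix above rests on a common pattern: a wrapper stream whose `close()` ends the wrapper's own life but leaves the underlying socket stream open for the next request. A minimal sketch, with `KeepAliveInputStream` as a hypothetical name rather than the class that was actually patched:

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical wrapper, not YaCy code: close() does not propagate to the socket stream.
final class KeepAliveInputStream extends FilterInputStream {
    private boolean closed = false;

    KeepAliveInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally no in.close(): the connection must stay usable
        // for further requests on the same keep-alive connection.
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}
```

Whoever owns the connection remains responsible for closing the real stream once the connection is finally dropped.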
allo
8a5c2d0a19 fix for supertemplates, too.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2839 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 16:53:31 +00:00
allo
c35793fb46 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2838 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 16:41:22 +00:00
allo
a831c83025 created servletProperties with the servlet-specific functions from serverObjects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2835 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 15:01:53 +00:00
orbiter
8b56887676 removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2820 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 21:30:02 +00:00
theli
68204ff729 *) Suppressing for bad client requests.
See: http://www.yacy-forum.de/viewtopic.php?p=26918

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2814 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 11:10:56 +00:00
theli
df49724f28 *) better error handling for seed upload - test download - problems
See: http://www.yacy-forum.de/viewtopic.php?p=26814#26814

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 10:10:53 +00:00
theli
b357a13e9a *) adding synchronization block because SimpleDateFormat is not thread-safe
See: http://www.yacy-forum.de/viewtopic.php?p=26906#26906

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2809 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 07:48:13 +00:00
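`SimpleDateFormat` keeps mutable state while formatting, so a shared instance has to be guarded. The sketch below shows the kind of synchronization block the commit above describes; the class name `HttpDate` and the choice of the HTTP date pattern are assumptions for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not YaCy code: one shared formatter, guarded by a lock.
final class HttpDate {
    private static final SimpleDateFormat FORMAT =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    static {
        FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    /** Formats a date; only one thread at a time may use the shared formatter. */
    static String format(Date date) {
        synchronized (FORMAT) {
            return FORMAT.format(date);
        }
    }
}
```

Creating a new formatter per call would also be safe; the synchronized variant trades a little contention for fewer allocations.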
orbiter
688cbfb776 - bugfixing for flextable bug
- bugfixing for collection index bug
- several other bugfixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2785 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-16 00:27:25 +00:00
allo
a29b4d4fb5 extended Supertemplates for Headerincludes.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2780 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-15 13:43:46 +00:00
theli
a7e11ada50 *) suppressing stacktrace for "server has closed connection"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2779 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-15 09:18:51 +00:00
orbiter
c8f3a7d363 added snippet-url re-indexing
- snippets will generate an entry in responseHeader.db
- there is now another default profile for snippet loading
- pages from snippet-loading will be indexed, indexing depth = 0
- better organization of default profiles

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-09 23:07:10 +00:00
allo
226f2c5b2c first version of the Servlet Debugger
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2717 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-08 14:25:54 +00:00
theli
ce7ee74316 *) better errorhandling in filehandler (try catch block now starts before argument parsing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2704 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-03 14:21:46 +00:00
theli
f17ce28b6d *) plasmaHTCache:
- method loadResourceContent defined as deprecated. 
     Please do not use this function to avoid OutOfMemory Exceptions 
     when loading large files
   - new function getResourceContentStream to get an inputstream of a cache file
   - new function getResourceContentLength to get the size of a cached file
*) httpc.java:
   - Bugfix: resource content was loaded into memory even if this was not requested
*) Crawler:
   - new option to hold loaded resource content in memory
   - adding option to use the worker class without the worker pool 
     (needed by the snippet fetcher)
*) plasmaSnippetCache
   - snippet loader does not use a crawl-worker from pool but uses
     a newly created instance to avoid blocking by normal crawling
     activity.
   - now operates on streams instead of byte arrays to avoid OutOfMemory 
     Exceptions when operating on large files 
   - snippet loader now forces the crawl-worker to keep the loaded
     resource in memory to avoid IO 
*) plasmaCondenser: adding new function getWords that can directly operate on input streams
*) Parsers
   - keep resource in memory whenever possible (to avoid IO)
   - when parsing from stream the content length must be passed to the parser function now.
     this length value is needed by the parsers to decide if the parsed resource content is too large
     to hold it in memory and must be stored to file 
   - AbstractParser.java: new function to pass the contentLength of a resource to the parsers
   


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-03 11:05:48 +00:00
orbiter
5a40ea7866 refactoring of wget string list generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 09:59:20 +00:00
orbiter
310f1c41cd added option to see ranking scores in surftipps
and some cleanups

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2684 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-30 23:28:03 +00:00
theli
cd5f349666 *) Better handling of large files during parsing
Extracted text of files that are larger than 5MB is stored in a temp file instead of keeping it in memory
*) plasmaParserDocument.java: getText now returns an inputStream instead of a byte array
*) plasmaParserDocument.java: new function getTextBytes returns the parsed content as byte array
   Attention: the caller of this function has to ensure that enough memory is available to do this 
   to avoid OutOfMemory Exceptions
*) httpd.java: better error handling if the soap handler is not installed
*) pdfParser.java: 
   - better handling of documents with exotic charsets
   - better handling of large documents
   - better error logging of encrypted documents
*) rtfParser.java: Bugfix for UTF-8 support
*) tarParser.java: better handling of large documents
*) zipParser.java: better handling of large documents
*) plasmaCrawlEURL.java: new errorcode for encrypted documents
*) plasmaParserDocument.java: the extracted text can now be passed
   to this object as byte array or temp file   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2679 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-30 09:31:53 +00:00
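The idea of holding small extracted texts in memory and spilling large ones to a temp file can be sketched as follows. `ExtractedText` and its methods are hypothetical names, and the sketch trusts the declared content length; only the 5MB limit and the stream-returning `getText` come from the commit message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical holder, not YaCy code: memory for small texts, temp file for large ones.
final class ExtractedText {
    static final long DEFAULT_THRESHOLD = 5L * 1024 * 1024; // 5MB, as in the commit

    private final byte[] inMemory; // set when the text is small
    private final File tempFile;   // set when the text was spilled to disk

    private ExtractedText(byte[] inMemory, File tempFile) {
        this.inMemory = inMemory;
        this.tempFile = tempFile;
    }

    /** Chooses the storage from the declared content length. */
    static ExtractedText of(InputStream source, long contentLength, long threshold)
            throws IOException {
        if (contentLength <= threshold) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new ExtractedText(buffer.toByteArray(), null);
        }
        File file = File.createTempFile("extracted", ".txt");
        file.deleteOnExit();
        Files.copy(source, file.toPath(), StandardCopyOption.REPLACE_EXISTING);
        return new ExtractedText(null, file);
    }

    boolean isInMemory() {
        return inMemory != null;
    }

    /** Callers always get a stream, whichever storage was chosen. */
    InputStream getText() throws IOException {
        if (inMemory != null) {
            return new ByteArrayInputStream(inMemory);
        }
        return Files.newInputStream(tempFile.toPath());
    }
}
```

Because callers only see an `InputStream`, they do not need to know which storage was used, which is the point of changing `getText` to return a stream.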
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
theli
c665f6cddb *) handling of quotes in charset string
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-28 06:29:15 +00:00
theli
009a33170b *) Content-Location header added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 04:32:01 +00:00
theli
1aa07a52cd *) Bugfix for UnsupportedEncodingException if the media type contains multiple parameters
See: http://www.yacy-forum.de/viewtopic.php?p=25832#25826

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-24 15:50:51 +00:00
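A media type such as `text/html; charset="UTF-8"; format=flowed` has to be split into its parameters before the charset can be looked up, otherwise the whole tail ends up as the charset name. The following is a simplified sketch under the assumption that parameter values contain no semicolons; `ContentTypeCharset` is a hypothetical name, not the `httpHeader.java` implementation.

```java
import java.util.Locale;

// Hypothetical helper, not YaCy code: extracts the charset parameter of a media type.
final class ContentTypeCharset {

    /** Returns the charset parameter, or the fallback if none is present. */
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip one pair of surrounding quotes, e.g. charset="UTF-8".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            if (!value.isEmpty()) {
                return value;
            }
        }
        return fallback;
    }
}
```

The returned name should still be checked with `java.nio.charset.Charset.isSupported` before it is used to decode content.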
orbiter
ec031eb993 first version of surftipps
see http://localhost:8080/index.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 20:14:21 +00:00
theli
5afb0cbce8 *) setting default charset (for unknown documents) to iso-8859-1
*)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 11:39:06 +00:00
theli
97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
- serverFileUtils.java: 
   -- adding methods to copy from stream to writer and readers to writers
   -- moving httpc writeX methods into serverFileUtils class
   - serverCharBuffer.java: removing inheritance from Writer class
   - replacing htmlFilterOutputStream by htmlFilterWriter class which handles
     content as char stream
   - htmlFilterContentTransformer.java: deactivating getText mode 
    (still needs to be migrated to use char streams instead of byte streams)
   - changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
   - changes in Scraper and Transformer classes to operate on chars instead of bytes
   - httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 10:12:11 +00:00
theli
fc594e8eda *) adding httpContentLengthInputStream.java class to allow reading of http response bodies
until EOF even if a persistent connection is used
*) httpdByteCountInputStream.java: adding skip method
*) httpHeader.java: adding getCharacterEncoding function

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 10:00:28 +00:00
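On a persistent connection the socket does not signal EOF at the end of a response body, so a wrapper has to report EOF once the announced number of bytes has been read. The sketch below illustrates that idea; `BoundedBodyInputStream` is a hypothetical name and not the `httpContentLengthInputStream` class from the commit.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper, not YaCy code: reports EOF after contentLength bytes.
final class BoundedBodyInputStream extends FilterInputStream {
    private long remaining;

    BoundedBodyInputStream(InputStream in, long contentLength) {
        super(in);
        this.remaining = contentLength;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // body finished, even though the socket is still open
        }
        int b = in.read();
        if (b >= 0) {
            remaining--;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1;
        }
        // Never read past the end of this body into the next response.
        int n = in.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = in.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would invalidate the byte count
    }
}
```

Bytes that follow the body stay in the underlying stream, so the next response on the same connection can be parsed from there.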
theli
2a06ce5538 *) next bugfix for UTF-8
- Sending UTF-8 messages to other peers did not work
   - httpd.java: minor corrections for UTF-8

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 15:47:56 +00:00
theli
bdc51591ae *) UTF-8 Bug solved (hopefully)
See: http://www.yacy-forum.de/viewtopic.php?p=25522

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2569 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 14:48:58 +00:00
theli
ef751b9d33 *) removing all string operations from the template engine
- engine should fully operate on bytes now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 13:56:10 +00:00
theli
fded1f4a5d *) better handling of maximum file size limit in crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 08:26:39 +00:00
theli
63893003be *) Adding settings page for the crawler which allows specifying a file size limit and the timeout to use.
*) adding first version of maximum filesize check for the crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2534 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-09 15:06:49 +00:00
orbiter
9340dbb501 fixed all possible problems with nullpointer exception for LURLs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2513 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 18:24:39 +00:00
theli
a5ed86105b *) bugfix for handling of ResourceInfo object in proxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2512 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 15:50:45 +00:00
hydrox
59a5511dbb *) added missing static Strings as requested by theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2505 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 07:20:28 +00:00
theli
6578564c9a *) Ignore more hop by hop http headers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 05:38:35 +00:00
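Hop-by-hop headers apply to a single connection only, so a proxy drops them before forwarding a message. The sketch below filters the eight headers that RFC 2616 (section 13.5.1) names; which headers the commit above actually added is not stated, and `HopByHopFilter` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not YaCy code: removes hop-by-hop headers before forwarding.
final class HopByHopFilter {

    // Header names from RFC 2616, section 13.5.1, in lower case.
    private static final Set<String> HOP_BY_HOP = new HashSet<>(Arrays.asList(
            "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
            "te", "trailers", "transfer-encoding", "upgrade"));

    /** Returns a copy of the headers without hop-by-hop entries, order preserved. */
    static Map<String, String> endToEnd(Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            String name = e.getKey().toLowerCase(Locale.ROOT);
            if (!HOP_BY_HOP.contains(name)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

A complete proxy would also drop any header that is listed by name in the `Connection` header; that step is omitted here for brevity.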
theli
dae763d8e3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-06 14:31:17 +00:00
theli
ffbf416e76 *) direct access to requestheader of htCache.Entry removed to make it more http independent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2486 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:29:45 +00:00
theli
3870d615e3 *) setting htCache.Entry fields to private
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:06:58 +00:00
theli
393a7d10be *) setting htCache.Entry fields to private
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:03:54 +00:00
theli
1c8300fcec *) Bugfix for name resolution in proxy mode
See: http://www.yacy-forum.de/viewtopic.php?p=25241

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 11:23:57 +00:00
orbiter
d78b824e85 fixed problem with default path after first start-up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-22 13:35:51 +00:00
orbiter
6ad471ef96 * applied many compiler warning recommendations
* cleaned up code
* added unit test code
* migrated ranking RCI computation to kelondroFlex and kelondroCollectionIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 19:49:31 +00:00
allo
cf1186597b utf fix from theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 15:26:04 +00:00
theli
eee44be602 *) adding an interface for customized blacklist classes
- now it's possible to use a customized blacklist engine
     instead of the default one
   - this can be done by configuring the property BlackLists.class
   See: http://www.yacy-forum.de/viewtopic.php?t=2108

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 14:28:14 +00:00
theli
d2e8e76218 *) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
See: http://www.yacy-forum.de/viewtopic.php?t=2541
        http://www.yacy-forum.de/viewtopic.php?p=24516

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 02:42:10 +00:00
allo
a52f36787f better template debugging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2371 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-10 14:02:03 +00:00
allo
3480d36417 added some debug code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-09 16:57:36 +00:00
orbiter
d468d665c9 some changes that may help to prevent deadlocks that cause an OutOfMemoryError
as described in
http://www.yacy-forum.de/viewtopic.php?p=24359

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 00:19:01 +00:00
theli
6e676224d0 *) adding support for upnp
A new port forwarding method for upnp was added.
   If this method is enabled, yacy automatically determines an UPnP 
   capable internet gateway and configures the gateway port forwarding
   settings properly. 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 14:26:45 +00:00
orbiter
97fa6788a1 added gettext support:
automatic replacement of string appearances in html files by
gettext quotes.
see also: http://www.yacy-forum.de/viewtopic.php?p=23901#23901

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 22:35:36 +00:00
allo
67c486a023 some example Code, how supertemplates can be used.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2304 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 07:08:15 +00:00
allo
7b0e2521bb Support for a supertemplate, which can do everything a normal template can do.
It's a layer under the servlets: #[page]# will be replaced by the servlet code, and the rest can be set by you.
(TODO: if we use this for layout, we need to read "TITLE" from the servlet's tp, to set it outside of the servlet.)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2302 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 15:51:19 +00:00
allo
8795875800 dirlisting for all empty directories.
no problem to update dir.java anymore, because it's only needed in htroot/htdocsdefault.
migration to delete old dir.* files in the fileshare

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-17 15:49:42 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
theli
b594ee9a5a *) Adding possibility to configure if the http proxy should send the
X-forwarded-for header (requested by TeeSee)
   See: http://www.yacy-forum.de/viewtopic.php?t=2577

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2257 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-29 16:01:03 +00:00
allo
6866bc2758 be quiet!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-24 17:40:55 +00:00
theli
ed2cb040d1 *) Bugfix for http connection header validation
- Connection header was not handled correctly if it contains
     multiple values, e.g. Connection: TE, close 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-20 05:22:55 +00:00
allo
0621106ef3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 12:15:26 +00:00
orbiter
12af69dd86 cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 11:49:31 +00:00
allo
67a8c74be3 Fix for dynamic login with static password.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 08:04:51 +00:00
allo
6fe2fed87e cookieauth works with static Admin.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-16 08:04:02 +00:00
allo
b23703f260 using cookieAuth.
logout for httpauth seems to be broken :-(

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-12 16:16:13 +00:00
allo
7f51a43cba disabled ipAuth for _p Pages (and broken Form-Login :-() for security reasons
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2201 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-12 14:18:38 +00:00
allo
bd22634c44 HTML-login, logout fixed.
TODO: If you login with the form, then logout with the form, and then try to login with httpauth, the first try will fail.
(should logged_out be reset in ipAuth? but if there is ipAuth before proxyAuth, the logout would be broken. Maybe a combined method can help.)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-12 13:47:44 +00:00
hermens
3f1ebc097e Limit the size of the DNS cache to 5000 and the age of the entries to one day.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2199 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-12 12:14:11 +00:00
allo
d7a3fdb18b no white pages, when clicking cancel on the password-dialog
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2198 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-12 12:12:21 +00:00
rramthun
5625937d1c Language improvements
One very minor HTML fix

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2181 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-06 16:30:32 +00:00
orbiter
26b6cddf51 synchronized the DNS cache, because the non-synchronized version resulted in deadlocks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2168 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-02 19:09:48 +00:00
orbiter
90d569d70f refactoring of index management:
url storage is part of index management; moved plasmaURL to indexURL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:50:55 +00:00
theli
b4ab183518 *) Bugfix for NullpointerException if the seeds IP could not be resolved
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2099 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-15 10:50:10 +00:00
allo
9938c252dd better Errorhandling for proxyAccounts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2082 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-11 13:12:35 +00:00
orbiter
015d044c25 tried to fix some problems with latest changes to httpc
very experimental!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-10 16:01:14 +00:00
orbiter
55c5b41bd0 modified kelondroDyn to work better with new object caches
(removed own single object cache)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2077 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-10 13:57:31 +00:00
orbiter
fd7c17e624 added virtual host support:
all yacy-to-yacy communication now sends the <peer-hexhash>.yacyh
virtual domain inside the http 'Host' property field.
This shall enable running a yacy peer on a virtual host.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 13:11:00 +00:00
theli
727aac4768 *) Bugfix for Transparent-Proxy-Support <-> Port Forwarding problem
See: http://www.yacy-forum.de/viewtopic.php?p=20358

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-25 05:29:20 +00:00
theli
cd4aeffea2 *) Bugfix: httpdFileHandler.java did not handle filenames with encoded chars correctly
See: http://www.yacy-forum.de/viewtopic.php?t=2265

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-23 11:01:31 +00:00
theli
76ea16a6cb *) Removing Keep-Alive header (is also a hopByHop header)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-22 15:00:35 +00:00
orbiter
b0036249c1 added some attributes to network picture
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-21 21:21:35 +00:00
rramthun
0604203bce Updated and corrected German language file
Changed Italian language file for an Italian/English interface and not Italian/German

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2024 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-18 11:37:03 +00:00
orbiter
14d6e476c9 tried to solve some problems with new picture viewer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-10 22:34:47 +00:00
orbiter
d8d0ac29c3 added image-viewer servlet that can do:
- each image that is requested is stored in the cache
- the image is taken from the cache if it exists there
- the image can be scaled
The scaled image is created because of copyright problems
In a further step the retrieval of unscaled images is restricted
to either access from localhost or with given authentication
This servlet can be used for image-preview purpose after an image search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1989 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-02 22:59:53 +00:00
rramthun
42b0b10a95 -Adding Windows Media to types which are not sent compressed
-Renaming writeandzip to writeandgzip to avoid confusion about type of compression
-Adding new startup message to windows script
-The usual language "enhancements" ;-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1953 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-23 20:12:23 +00:00
borg-0300
77f3237de3 adapted for isListed()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1942 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-21 20:55:59 +00:00
borg-0300
399538b7de Bugfix: wrongly compared
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1898 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-16 20:57:06 +00:00
orbiter
3237fe1cc7 added IOException for httpc client error
see also http://www.yacy-forum.de/viewtopic.php?p=18615#18615

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-07 19:38:10 +00:00
theli
c7ececbfb2 *) httpd.mime: adding jar mimetype
*) httpd.java: charset is only appended to mimetype for text mimetypes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1839 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-07 15:58:50 +00:00
theli
759800f543 *) Bugfix for storeHTCache problem
- content was not indexed if storeHTCache was off
   See: http://www.yacy-forum.de/viewtopic.php?p=18269
   See: http://www.yacy-forum.de/viewtopic.php?t=1882
   See: http://www.yacy-forum.de/viewtopic.php?t=241

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1800 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-03 08:30:08 +00:00
orbiter
ce5274c194 yacybot user agent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1786 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-28 19:08:58 +00:00
orbiter
34341a868e code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-19 00:39:16 +00:00
theli
aa87df35e1 *) To avoid confusion location will now also be displayed for own peer
See: http://www.yacy-forum.de/viewtopic.php?p=17283#17283

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-18 10:27:56 +00:00
rramthun
15ed57f9b7 Updated German language, by VT100, NN, rramthun
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1690 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-17 21:23:45 +00:00
allo
3b4a99ff6a fix for java 1.4.x
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1685 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-17 17:55:13 +00:00
theli
9b941fb773 *) bugfix for usage of yacy with extended port binding (e.g. #eth0:8080, 192.168.0.1:8080, etc.)
- port was reported incorrectly to other peers


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-17 10:53:20 +00:00
allo
2d4e1325cf UTF-8 fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 21:33:41 +00:00
hermens
c8f5adea4d - don't send Message Body on HEAD requests, even in the case of an error
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1669 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 11:45:32 +00:00
theli
a7248fbb0a *) bugfix for http/0.9 responses
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 11:07:17 +00:00
theli
a354bc2ec1 *) Bugfix for content length check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 10:54:47 +00:00
hermens
e974d0cb99 Improve compliance with the RFC
*) There is no status line in HTTP/0.9
*) Answers to HEAD requests should return the same headers as a GET request



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 10:27:21 +00:00
theli
556d242be0 *) Limited support of content-range requests
- a simple continue download request should work now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 09:23:27 +00:00
theli
8fcb25f9f9 *) Setting via header according to rfc
- can be disabled via settings dialog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-16 09:20:57 +00:00
theli
040624e361 *) better support for http head requests of servlets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1648 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-15 12:51:24 +00:00