Commit Graph

178 Commits

Author SHA1 Message Date
f1ori
90e78b2cf6 * improve encoding detection of http service
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 21:06:32 +00:00
f1ori
7e1fe05e3c * added utf8-encoding to many getBytes-calls
* utf8 should work now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-08 20:24:31 +00:00
f1ori
4b4ce75396 * http-server: submit charset from html metatags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-01 23:17:51 +00:00
f1ori
ba76995d2c * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1415
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-12 10:54:11 +00:00
orbiter
05dbba4bab added logging conditions to all fine and finest log line calls
this will prevent an overhead for the generation of the log lines in case that they then are not printed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 00:30:21 +00:00
danielr
d6d9b0f14a fixed transferRWI.html 'Read timed out'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-28 08:37:51 +00:00
orbiter
536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-25 18:11:47 +00:00
danielr
621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
- removed unnecessary code (unused variables, String.toString)
- corrected some calculations (cast int to double or long ;)
- improved little performance (using Integer.valueOf() instead of new Integer)
- log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions
- finalized some (more) fields
- finally close some streams
- made inner classes static if not using environment
- generalized some equals (from specificClass to Object)
- fixed some potential nullpointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-06 19:43:12 +00:00
danielr
17b7845eb5 * refactoring
- moved constants from plasmaSwitchboard to own class (all 232 ;)
- moved remoteProxy-Methods to httpRemoteProxyConfig, better names
- removed some unnecessary code (else-statements)
* formatting (correct indentation)
* minor bugfixes (due to findbugs.sf.net)
* hopefully fixed "missing quote" (announcing StringParts as UTF-8)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 13:57:00 +00:00
danielr
3bb870bfcd added final where possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 12:12:04 +00:00
orbiter
c3d461d191 - removed superfluous copyright statement
- updated my email address

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-20 17:14:51 +00:00
danielr
da917cf4b1 undo reduced menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-30 07:11:13 +00:00
danielr
0c1dc703e4 - set staticIP at startUp
- added setting for reduced menu (simpleMenu)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-29 18:35:15 +00:00
danielr
7feae906aa - organize imports
- removed potential null pointer accesses
- removed unnecessary casts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 16:01:27 +00:00
orbiter
6f1a3fce05 BF Bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4869 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-29 17:27:38 +00:00
lotus
b09af53643 proper links to directories in repository dirlisting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4867 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-28 19:23:44 +00:00
orbiter
faed00d75d added use cases to basic configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-20 13:21:55 +00:00
orbiter
c1d721dd2d fix for attacks on localhost-authorized peers from web pages with links to localhost addresses:
checking of referer in access

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4828 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-19 22:17:53 +00:00
orbiter
3bd1db776a implemented switch for admin authorization from localhost:
- access is granted for localhost users to administration pages by default
- the default setting can be changed in the BasicConfig.html page
- if the BasicConfig page was accessed with post and no password was submitted, a random password is generated
- a headless installation MUST give a password upon first call of the configuration page, otherwise they will not be able to access it again
- if no password is given within 10 minutes after start-up, a random password is generated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4804 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-15 11:26:43 +00:00
orbiter
239cc4428d - better domain graph, faster when more links exist, looks better
- new authorization rule: localhost is always authorized for administration. This solves many problems with ajax, and also fixed a problem in rssTerminal
- fix bug in RSSFeed which prevented that entries had been recognized as individual, new entries
- added reloading/updating of status image on status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4796 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-12 22:23:29 +00:00
lotus
9bc56a9edc xss protection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-07 16:37:13 +00:00
orbiter
d2ba1fd2ab major step forward to network switching (target is easy switch to intranet or other networks .. and back)
This change is inspired by the need to see a network connected to the index it creates in a indexing team.
It is not possible to divide the network and the index. Therefore all control files for the network was moved to the network within the INDEX/<network-name> subfolder.
The remaining YACYDB is superfluous and can be deleted.
The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy).
The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT).
No other functional change has been made. The next steps to enable network switcing are:
- shift of crawler tables from PLASMADB into the network (crawls are also network-specific)
- possibly shift of plasmaWordIndex code into yacy package (index management is network-specific)
- servlet to switch networks 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-05 23:13:47 +00:00
danielr
d7b21bc90c re-added gzip POST for transferRWI/URL (HTTP/1.1 compliant)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-04 10:53:04 +00:00
danielr
d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-03 09:06:00 +00:00
danielr
5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:17:16 +00:00
orbiter
daa04f5db9 added additional check in file handler to prevent that url attacks are hidden in url path encodings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4637 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-04 12:15:27 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
orbiter
9eddc1506b - one try to fix the httpd problem
- fix for handling of collection index that appears when removing elements
- added another navigation method (stub, not working yet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:58:22 +00:00
orbiter
3f321ece7d added a search history to the new search page
the history distinguishes between different users and identifies them by their ip
a history is only shown to the user who submitted the search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 21:26:49 +00:00
orbiter
87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
- enhanced text parser (condenser): found and eliminated bad code parts; increase of speed
- added handling of image preview using the image cache from HTCACHE
- some other minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 14:08:15 +00:00
orbiter
6c3cd2b4f2 - added new way to watch images from the image search:
they appear as separate, floating window above the search results,
  not in a new window
- added highslide javascript library for feature mentioned above
- removed dir servlet. This thing was not used as it was supposed to be (as an example applet)
  and was a major problem for intranet-indexing when files are hosted on the same peer.
- added yacy-httpd-internal directory listing. Because YaCy is a search engine,
  directory listings are similar to search result listings. Intranet indexing from the same peer
  will get nice index pages for document collections.
- removed unused test applet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-18 16:38:06 +00:00
orbiter
bd63999801 - faster search: using different data structures that avoid multiplr calculations
- no more table copy for error-eco table
- optional table copy for lurl-entries
- more abstractions (less single constant strings)
- better logging (using host names instead of ips)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-07 22:16:36 +00:00
orbiter
0f5c4abaca more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 10:12:48 +00:00
orbiter
1a296af6ff more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 20:08:32 +00:00
orbiter
4a80902081 - added ViewProfile as rdf in foaf syntax
- added link to rdf and vCard version on html page
- can be seen on http://localhost:8080/ViewProfile.html?hash=localhash
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 18:21:08 +00:00
orbiter
03e7782269 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 19:23:38 +00:00
fuchsi
d517e96714 last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-21 00:53:46 +00:00
orbiter
e22014dc83 some memory enhancements when generating and displaying ymage objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 02:15:12 +00:00
fuchsi
425e4ead66 Allow absolute paths in configuration settings.
- before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly every dynamically generated data with a configurable path to a location outside of yacys root dir without having to use symlinks (probably good for third party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the applications root path.

- exceptions (hardcoded): 
  DATA/LOG/yacy.logging
  DATA/SETTINGS/httpProxy.conf
  DATA/SETTINGS/user.db
TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example.

- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-04 10:36:25 +00:00
orbiter
3cb9cdc9be try to fix connection problem, possible cause for wrong junior status and non-passive passive peers:
the YaCy client treats disconnections during data transmissions as error and discards all data transmitted so far
this did not happen so far until I removed a delay time at the end of the daemon session which prevented this case.
To fix this problem, disconnections during transmissions are not treated as error now, which means that end-of-transmissions
with sudden disconnections are not a cause for peer diconnections any more. To be nice to non-updated peers, the sleep time
at the end of server sessions is also re-enabled.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 17:31:29 +00:00
orbiter
daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more that necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 09:01:35 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internat threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servelet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
f9e6cf6a3d more refactoring of search:
integrated first version of ssi-using search interface,
but the function is currently disabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-28 12:15:46 +00:00
orbiter
bb426565f0 added new yacy protocol for mass url-pull for better remote crawling distribution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-22 00:59:05 +00:00
orbiter
61f93cbf14 some code-cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-11 00:42:04 +00:00
orbiter
24e25e1141 enhanced SSI server-side support:
- SSIs may now refer to servlets, not only files
- calling a servlet, the servlet/SSI engine is called recursively
- SSIs now work also for non-chunked-encoding supporting clients
This will support the new search page functionality, to show search results
dynamically without using javascript. To test this method, a test page has been added
http://localhost:8080/ssitest.html
..calls dynamicalls 3 servlets, which produce some delays during their execution
please verify that you can see the result step-by-step on your browser
To implement this feature, some refactoring had been taken place, mostly code
had been made static and will execute faster.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-09 21:58:38 +00:00
orbiter
75d1437340 fix for http://forum.yacy-websuche.de/viewtopic.php?p=1123#p1123
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4002 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 13:39:53 +00:00
orbiter
6758beae9c fix for http://forum.yacy-websuche.de/viewtopic.php?p=1092#p1092
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-23 09:54:12 +00:00
orbiter
a9e73b6852 fixed great mess with localization paths. the problem was:
automatic re-translation after update did not work. hopefully now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-04 10:32:30 +00:00