Commit Graph

5123 Commits

Author SHA1 Message Date
f1ori
5ce9d81955 * remove class-file from old location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 21:19:30 +00:00
low012
1af728ae09 *) regex for site operator changed as proposed by Lotus
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:30:34 +00:00
lotus
c8451614f3 fix for overflow
http://forum.yacy-websuche.de/viewtopic.php?p=11696#p11696

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:28:27 +00:00
low012
9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 14:58:32 +00:00
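
The site operator described in this entry could work roughly as sketched below. This is a minimal illustration, not YaCy's actual code: the function names `split_site_operator` and `host_matches_site` are hypothetical, and matching on a host suffix is an assumption based on the "test site:edu" example in the message.

```python
from urllib.parse import urlsplit


def split_site_operator(query):
    """Separate plain search words from an optional site: modifier."""
    words, site = [], None
    for token in query.split():
        # A bare "site:" with no value is treated as an ordinary word.
        if token.lower().startswith("site:") and len(token) > 5:
            site = token[5:].lower()
        else:
            words.append(token)
    return words, site


def host_matches_site(url, site):
    """True if no site filter is set, or the URL's host is `site` or ends with '.' + site."""
    if site is None:
        return True
    host = (urlsplit(url).hostname or "").lower()
    return host == site or host.endswith("." + site)


words, site = split_site_operator("test site:edu")
print(words, site)
print(host_matches_site("http://www.example.edu/page", site))
```

Comparing against the host rather than the whole URL keeps a path such as /edu from matching by accident.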
low012
19e7c56f7f *) apply filter to dir list to only show .black files as blacklists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 10:14:19 +00:00
orbiter
c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5437 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 02:27:29 +00:00
orbiter
6072831235 no cr transmission for robinson peers
see also: http://forum.yacy-websuche.de/viewtopic.php?p=10290#p10290

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 23:44:42 +00:00
low012
4bffe664ca *) moved entry field for new expressions to top of the list as requested in forum (http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1678)
*) added some JavaScript to disable the list selection at the bottom of the list when it is not needed (edit, delete) and to enable it only when needed (move); if JS is turned off, everything works as usual

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 10:18:48 +00:00
low012
afe98bc11c *) added changes as proposed by Halborinda in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1674
*) changed indentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 08:24:08 +00:00
orbiter
07fc115e90 removed active profiling in kelondroRowSet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-02 12:33:06 +00:00
orbiter
be4c458951 refactoring (implemented Iterable in kelondroRowCollection)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5432 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-02 11:38:20 +00:00
low012
bb5c2cd12e *) ISINDEX parameters are no longer put on the command line, to prevent possible security hazards (better safe than sorry). Parameters have to be read from QUERY_STRING in the ISINDEX case too, which does not seem to be uncommon behaviour for web servers: http://vms.pdv-systeme.de/users/martinv/cgi_basics/cgi_basics.html#Datenuebergabe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-02 11:18:26 +00:00
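
From the script's side, reading ISINDEX words from QUERY_STRING instead of the command line might look like the sketch below. It follows the CGI convention (RFC 3875) that a query string without an unencoded "=" is an indexed query whose words are separated by "+". The helper name `isindex_words` is hypothetical and not part of YaCy.

```python
import os
from urllib.parse import unquote


def isindex_words(environ=os.environ):
    """Return the ISINDEX search words found in QUERY_STRING, or [] if there are none."""
    query = environ.get("QUERY_STRING", "")
    # A query containing "=" is a form-style query, not an indexed one.
    if not query or "=" in query:
        return []
    # Split on "+" first, then percent-decode each word.
    return [unquote(part) for part in query.split("+")]


print(isindex_words({"QUERY_STRING": "foo+bar%21"}))
```

Passing the environment as a parameter keeps the function testable without touching the real process environment.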
orbiter
b6bba18c37 replaced the storing procedure for the index ram cache with a method that generates BLOBHeap-compatible dumps
this is a migration step to support a new method of storing the web index, which will also be based on the same data structure. also did a lot of refactoring to better structure the BLOBHeap class.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5430 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 22:31:16 +00:00
low012
db1cfae3e7 *) cleaning up after myself
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:45:15 +00:00
low012
f547f9a78c *) added CGI capabilities (run Perl scripts and other software via HTTP GET and POST)
*) set cgi.allow to true in yacy.conf to enable CGI (CGI is disabled by default)
*) edit cgi.suffixes in yacy.conf if necessary to use additional script types

ATTENTION: This is a rather experimental feature, not all environment variables are set yet. 

Only enable CGI if you know what you are doing. Poorly implemented CGI scripts can put a system's integrity at risk!

Implementation of more environment variables and documentation due for the next days.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 19:40:06 +00:00
f1ori
bdc380cd84 * add lastModified to templateCache
-> no outdated files from cache anymore...


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 14:56:53 +00:00
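
The idea in this entry, keeping a file's last-modified time next to the cached content so that stale entries are reloaded, can be sketched as follows. This is an illustration under assumptions: the class `TemplateCache` and its layout are hypothetical and do not reflect YaCy's actual templateCache.

```python
import os


class TemplateCache:
    """Cache file contents and reload them when the file's mtime changes."""

    def __init__(self):
        self._entries = {}  # path -> (mtime, content)

    def get(self, path):
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        # Serve from cache only if the file is unchanged since it was read.
        if cached is not None and cached[0] == mtime:
            return cached[1]
        with open(path, "r", encoding="utf-8") as handle:
            content = handle.read()
        self._entries[path] = (mtime, content)
        return content
```

The price is one stat call per lookup, which is usually far cheaper than re-reading and re-parsing the template.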
f1ori
6792c2a07d * change mime type of xml documents from application/xml to text/xml
-> for easier Javascript requests


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-01 12:30:51 +00:00
f1ori
cb1e887027 * move svnRevNr classes to libbuild
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5425 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-31 19:58:22 +00:00
f1ori
025094675f * remove empty directory
* add necessary dependency for pdfParser


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-31 19:39:02 +00:00
f1ori
c5691180cb * skip style-tags in HTML-files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-31 19:34:24 +00:00
low012
9d5d30f877 *) http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1672
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-31 16:50:10 +00:00
orbiter
5448aad328 removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-30 12:12:00 +00:00
orbiter
3567c58b18 added another field of information for BLOBHeap dumps: the gaps
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-30 10:49:43 +00:00
orbiter
abdd4aa414 added an index dump for blob heaps:
this will increase the shutdown time by a few seconds at most, but will speed up the start-up

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 21:36:27 +00:00
orbiter
28d2d28573 added support for filetype search
(just use filetype:<type> in the search query)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 17:57:04 +00:00
orbiter
8c3205b62e fix for OOB Exception
see http://forum.yacy-websuche.de/viewtopic.php?p=11598#p11598

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 17:36:53 +00:00
orbiter
78c568331e added test channel to /xml/feed.rss
can be obtained with
http://localhost:8080/xml/feed.rss?set=TEST
always returns a single feed entry with a fresh date

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:39:07 +00:00
orbiter
e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
- refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:22:13 +00:00
low012
eab72424df *) Fixed small bug: When adding new elements to blacklist via import, the blacklist which the elements were added to was supposed to be displayed, which did not work correctly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-28 09:58:02 +00:00
low012
0e56675596 *) cleaning up ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5413 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 20:09:36 +00:00
low012
cf69557ea2 *) blacklists can be exported as XML or plain text now
*) blacklist import via file upload works now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 15:38:20 +00:00
low012
1594a15be9 *) explicit mentioning of blacklist in blacklist cleaner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 13:06:05 +00:00
f1ori
2d2ce24011 * remove all encoding-stuff from proxy
encoding is handled by parsers or browser, proxy only passes through


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 19:14:54 +00:00
f1ori
73c8a0839c * abort download when proxy connection is closed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5409 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 11:30:24 +00:00
orbiter
bb935fdbb0 less organization overhead for DNS caching and prefetching
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5408 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-23 10:06:49 +00:00
f1ori
4907697cfa * make file uploads larger than 65500 bytes possible through the proxy
* remove gzip-encoding for files from cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-22 23:04:00 +00:00
orbiter
fc8189f3fb better self-healing of corrupted databases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5406 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-22 16:43:49 +00:00
f1ori
963da8c3f9 * updated tm-extractors to new version 1.0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-21 14:51:03 +00:00
f1ori
51f1a1927c * remove saaj.jar and axis.jar and references to it (was for soap-stuff?)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-21 13:06:04 +00:00
low012
5a89266598 *) new parameters for future use (better blacklist handling for im- and export)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5403 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 19:33:08 +00:00
orbiter
e34ac22fbd - added new monitoring servlet at
http://localhost:8080/PerformanceConcurrency_p.html
- used the new monitoring to do some fine-tuning of the indexing queue

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5402 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 15:26:01 +00:00
lotus
449e697436 fix for null-seed in seedfile
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1653

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 12:10:01 +00:00
orbiter
d376d81fc4 replaced busy thread control of crawl stacker by blocking threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 23:18:34 +00:00
orbiter
f29b48d9ff patch for IndexOutOfBoundsException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5399 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 22:05:26 +00:00
f1ori
0881190b19 * Robots.txt: don't interpret Crawl-Delays for other robots
fixes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1647


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 15:35:41 +00:00
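
The fix in this entry amounts to honouring a Crawl-delay only when it sits in a robots.txt group addressed to the crawler itself (or to "*"). A rough sketch follows. The function name `crawl_delay_for`, the default agent token "yacybot", and the substring match on the agent name are all assumptions for illustration. Crawl-delay is a non-standard robots.txt extension, so real files vary.

```python
def crawl_delay_for(robots_txt, agent="yacybot"):
    """Return the Crawl-delay in seconds that applies to `agent`, or None."""
    delays = {}          # lowercased user-agent token -> delay
    current_agents = []  # agents named by the group currently being read
    in_rules = False     # True once the group has seen a non-User-agent line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents, in_rules = [], False
            current_agents.append(value.lower())
        else:
            in_rules = True
            if field == "crawl-delay":
                try:
                    delay = float(value)
                except ValueError:
                    continue  # ignore malformed delays
                for name in current_agents:
                    delays[name] = delay
    agent = agent.lower()
    # Prefer a group that names this crawler; delays meant for other robots are skipped.
    for name, delay in delays.items():
        if name and name != "*" and name in agent:
            return delay
    return delays.get("*")
```

With this grouping, a delay declared for another robot never leaks into the value used for the local crawler.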
orbiter
243e73f53b removed unnecessary usage of kelondroBLOBTree
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 00:18:37 +00:00
orbiter
8cb7170b75 - set status of kelondroTree, kelondroBLOBTree and kelondroFlexTable to deprecated
- removed initialization and/or usage of kelondroFlexTable (should no longer be in use by now)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5396 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 00:08:17 +00:00
orbiter
7535fd7447 - refactoring of CrawlEntry and CrawlStacker
- introduced blocking queues in CrawlStacker to make it ready for concurrency
- added a second busy thread for the CrawlStacker
The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step.
The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5395 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-17 22:53:06 +00:00
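
The general pattern behind this entry, worker threads that block on a queue instead of polling it, can be sketched as below. This is a generic illustration using Python's standard library, not a port of CrawlStacker. The function name `run_stacker` and the `handle` callback are hypothetical.

```python
import queue
import threading


def run_stacker(urls, handle, workers=2):
    """Process urls with `workers` threads that block on a shared queue."""
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            url = jobs.get()   # blocks until an item is available
            if url is None:    # sentinel: shut this worker down
                break
            outcome = handle(url)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for url in urls:
        jobs.put(url)
    for _ in threads:          # one sentinel per worker
        jobs.put(None)
    for thread in threads:
        thread.join()
    return results


print(sorted(run_stacker(["a", "b", "c"], str.upper)))
```

With more than one worker, a slow item (for example a DNS lookup that blocks) holds up only the thread handling it while the others keep draining the queue.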
lotus
6569cbbec1 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646
(break to avoid bad side effects)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5394 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-16 20:53:31 +00:00
lotus
18513e2ee2 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5393 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-16 13:36:13 +00:00