Commit Graph

2778 Commits

Author SHA1 Message Date
orbiter
fe41a84330 some enhancements in web caching: avoid double loading of response metadata and/or content
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-19 10:17:26 +00:00
orbiter
4c6312d103 enhanced image search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-18 23:56:05 +00:00
orbiter
2d8f3ee301 some performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-18 16:03:28 +00:00
orbiter
36fbfdcb21 more performance for remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-18 15:13:06 +00:00
orbiter
5c7b32a4fa better performance for list api (blacklist transfer)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6486 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-18 15:11:52 +00:00
orbiter
013f337d3f - avoid unnecessary host name lookups for localhost
- avoid unnecessary reverse domain name lookups for remote access

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-16 23:00:54 +00:00
orbiter
29fe436e36 - fixed post-ranking including prefer mask
- enhanced a core database access method / less wasted ram

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-09 19:14:51 +00:00
orbiter
5399d1e2bc refactoring (reason: get more abstraction to use the blacklist class; for integration in other servlets)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-08 22:58:57 +00:00
orbiter
4c99d4683d possible fix for lost crawl profile handles: the clean-up job measured incorrectly whether a crawl is still running.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6465 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-06 23:15:20 +00:00
orbiter
18b21eaffe small fixes to search default values and server logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-06 19:13:35 +00:00
orbiter
4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6458 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-05 20:28:37 +00:00
orbiter
e3025ee691 - new icon for OAI-PMH loading action
- added many stack trace outputs for exceptions in crawl profile handler to find the 'missing profile handle' bug
- caught one more timeout exception in httpd file loader

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-05 16:40:15 +00:00
orbiter
f0b8db93f0 - more abstraction of serverCore thread access
- no more keep-alive when the number of connections exceeds 1/2 of the allowed number of connections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-05 14:54:43 +00:00
orbiter
11f7da06ed - fixes to csv parser
- automatic OAI-PMH import by just clicking on one link from the provided resource list

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6449 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-03 21:18:19 +00:00
orbiter
9b6762ec2e - added a csv "comma separated values" parser to parse OAI-PMH sources from
http://roar.eprints.org/index.php?action=csv
- integrated the csv parser into the crawlers parser list
- added an extension to the OAI-PMH import function to download and show the roar csv file using the csv parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6448 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-03 20:10:59 +00:00
orbiter
0f63de8236 - it is now possible to start several OAI imports concurrently
(still not possible to start them with one single request, that will be next)
- added a monitor for all running and finished OAI imports (with a little bit of animation)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6447 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-03 16:15:22 +00:00
orbiter
176e334aa4 fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-02 19:23:05 +00:00
orbiter
2fa6bf440b workflow update to OAI-PMH importer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6445 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-02 18:19:30 +00:00
orbiter
b0b7a4f9a5 - added function to OAI-PMH reader that can pull all records from a server using an evaluation of the resumption token to get URL to retrieve remaining records
- added monitoring for retrieved records

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-02 11:53:14 +00:00
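
The resumption-token step described in this commit can be sketched roughly as follows. This is a minimal illustration, not the YaCy implementation: the class name `ResumptionSketch`, the method `nextUrl`, and the regex-based extraction are assumptions made for the example. The only part taken from the OAI-PMH protocol itself is the shape of the follow-up request, which carries the verb and the `resumptionToken` argument and nothing else.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the YaCy sources.
public class ResumptionSketch {

    // Matches a non-self-closing <resumptionToken ...>value</resumptionToken> element.
    private static final Pattern TOKEN =
            Pattern.compile("<resumptionToken[^>]*>([^<]*)</resumptionToken>");

    /**
     * Returns the URL for the next batch of records, or null when the
     * response carries no token (or an empty one), meaning the list is complete.
     * XML entities inside the token are not decoded in this sketch.
     */
    public static String nextUrl(String baseUrl, String responseXml) {
        Matcher m = TOKEN.matcher(responseXml);
        if (!m.find()) return null;
        String token = m.group(1).trim();
        if (token.isEmpty()) return null;
        return baseUrl + "?verb=ListRecords&resumptionToken="
                + URLEncoder.encode(token, StandardCharsets.UTF_8);
    }
}
```

A caller would keep requesting `nextUrl(...)` of each response until it returns null, counting the retrieved records along the way for monitoring.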
orbiter
350d13e153 very first working version of the OAI-PMH importer: if given the right url, the importer can read and index listRecord xml files and calculate the right resumptionURL, which is then given as the next default start point for the importer url input.
no automatic harvesting yet, this will be done later

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-02 00:14:14 +00:00
orbiter
a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
- extended the OAI-PMH test applet and integrated it into the menu. It still does not import OAI-PMH records, but shows that it is able to read and parse this data
- some redesign in ZURL storage: refactoring of access methods, better concurrency, less synchronization
- added a limitation of the LURL metadata database table cache to 20 million entries: until now this cache had no size limit and was bounded only by the available RAM, which may have caused memory-leak-like behavior.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-31 11:58:06 +00:00
orbiter
30f108f97d added stub of oai-pmh importer (not working yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6437 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-30 15:58:04 +00:00
orbiter
52470d0de4 - fix for xls parser
- fix for image parser
- temporary integration of images as document types in the crawler and indexer for testing of the image parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-22 22:38:04 +00:00
orbiter
5e8038ac4d - refactoring of blacklists
- refactoring of event origin encoding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-21 20:14:30 +00:00
orbiter
26fafd85a5 - more refactoring
- fixed problem with parsers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-21 15:12:34 +00:00
orbiter
3528b970d6 - refactoring
- added new experimental (not-yet-working) image parser
- added new test image

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-19 22:34:44 +00:00
orbiter
b79f4f062f refactoring of yacy documents and parsers: they now depend only on the kelondro classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-18 00:53:43 +00:00
low012
519c3619ff *) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-17 00:32:07 +00:00
low012
f5656b2ae1 *) Made sure that only files with appropriate file endings are listed as skin or language files.
*) Introduced protection against directory traversal attacks in configuration servlets for skin and language configuration. Files can only be deleted if they are contained in a list of files which has been read by the servlet first.

Until now it was possible to delete any data on the system YaCy is running on that the user whose account was used to start YaCy is allowed to delete. Most of the time a user of YaCy is also the owner of the machine the peer is running on, but this might not always be the case, and not even the owner of the machine should be able to use YaCy as a replacement for "rm" or "del".

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-17 00:26:14 +00:00
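
The protection described in this commit amounts to an allow-list: a delete request is honored only if the requested name is one the servlet collected itself while listing the directory. The following is a minimal sketch of that idea under stated assumptions; the class `DeletableFiles` and its methods are hypothetical names for illustration and do not come from the YaCy servlets.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical allow-list, not taken from the YaCy sources.
public class DeletableFiles {

    // Names collected while listing the skin or language directory.
    private final Set<String> listed = new HashSet<>();
    // Expected file ending, e.g. ".css" for skins (assumed value).
    private final String ending;

    public DeletableFiles(String ending) {
        this.ending = ending;
    }

    /** Remembers a directory entry only if it has the expected file ending. */
    public void offer(String fileName) {
        if (fileName.endsWith(ending)) listed.add(fileName);
    }

    /** A request may delete a file only if its exact name was listed before. */
    public boolean mayDelete(String requestedName) {
        return listed.contains(requestedName);
    }
}
```

Because the check is an exact match against names read from the directory, a request parameter such as `../../etc/passwd` can never qualify, whatever the permissions of the account running the peer.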
low012
3434ca381f *) grrr
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-16 22:17:21 +00:00
low012
ae42c51cf7 *) Skin names and language names are displayed in alphabetical order in dropdown menu now.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-16 22:16:36 +00:00
suessthomas
56a5bd090d Small fixes to header.template for more XHTML compatibility.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-16 20:31:06 +00:00
orbiter
76bca8cffd show interactive search without menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-15 13:26:14 +00:00
orbiter
3d5eeb842a new default skin 'pdblue'
The old default skin named 'default' is renamed to 'classic-blue'.
All users will keep their current default skin named default, but YaCy will copy the classic-blue also to the skin folder.
For all new peers, the new skin pdblue is used.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-15 12:59:44 +00:00
orbiter
c864901087 - moved httpd.mime to defaults path
- some documentation fixes
- adapted a default setting for the search window: moved css setting to base.css
- some enhancements for the DocumentIndex class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-14 13:29:09 +00:00
orbiter
5841ee83d3 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-11 21:29:18 +00:00
orbiter
ce8dc575ca refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-11 00:12:19 +00:00
orbiter
bea3b99aff moved table and util classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-10 01:14:19 +00:00
orbiter
1e4f8b56ed accumulated classes from different packages into the new rwi package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6394 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-10 00:39:15 +00:00
orbiter
194da25a2f moved kelondro index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6393 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 23:32:08 +00:00
orbiter
4446acc8cd moved kelondro order
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 23:22:22 +00:00
orbiter
f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
- moved here the logging classes as part of the new net.yacy.kelondro package

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 23:13:30 +00:00
orbiter
735e2737e3 * added index segments
This is a major change in the organization of indexes.
Please consider a back-up of your data before you run this update.
All existing index files will be moved and renamed to a new position.
With this change, it will be possible to maintain different indexes for different purposes and it will be possible to have a distinction between DHT-in and DHT-out specific indexes. Tenants may also have their own index, and it may be possible to have histories and back-ups of indexes. This is just the beginning; many servlets must be adapted after this change, but all functions that had been there should still work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 14:44:20 +00:00
orbiter
04a548a1e3 - temporarily integrated the transferURL servlet as a static class instead of a class that is called using reflection, to investigate the OOM problems in that class
- fixes for numerous other problems
- removed dead code
- redesign of the strings-method, which now produces less memory overhead and may help to prevent OOMs
- another fix for the deadlock problem in SplitTable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-05 20:11:41 +00:00
orbiter
6aa474f529 - better logging for web cache access and fail reasons
- better Exception handling for web cache access
- distinction between access of web cache for proxy and crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-01 13:08:19 +00:00
orbiter
0c17b600c6 remote search by default off
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 15:06:29 +00:00
orbiter
a995b95367 tried a fix for the httpd access bug (too many unclosed sessions)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 13:18:02 +00:00
low012
a6a3090c3d *) blacklist cleaner supports usage of regular expressions now
*) refactored BlacklistCleaner_p.java for better readability
*) moved check of validity of patterns to the Blacklist implementation since patterns might be valid in one implementation, but not in another
*) added method to check validity to Blacklist interface
*) fixed some minor issues like typos or wrong whitespaces
*) set subversion properties for a whole bunch of files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 21:28:49 +00:00
low012
3c4064932c *) added width and height to prevent the page from "jumping" when the image is reloaded automatically in Opera 10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-26 22:32:52 +00:00
low012
5e4f267a36 *) added subversion properties and edited a few comments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-26 22:07:40 +00:00