Commit Graph

3397 Commits

Author SHA1 Message Date
orbiter
d7a493b4f5 added experimental timeline api
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 16:01:29 +00:00
orbiter
efcd95dc37 simplification of (internal) query process / refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 15:53:20 +00:00
orbiter
f1b712c29a small corrections to image loading methods in result presentation
especially loading of favicons in search results. This is a fix that
affects only searches in intranet/repository configurations.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5670 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 15:39:02 +00:00
orbiter
d4b56d5819 added more asserts to BLOBHeap.flushBuffer() to fix the problem described in
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1679&hilit=&p=13109#p13109

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-03 23:24:19 +00:00
f1ori
c545fcb9fa * add class to handle keys and signatures
* fix bug in serverCharBuffer
* add build-target to sign tar.gz (run ant dist sign)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5665 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 13:29:50 +00:00
orbiter
aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 11:04:13 +00:00
orbiter
6ffc6e3389 more refactoring of indexer and kelondro classes;
- integrating the indexer into kelondro as package 'text'
- renaming of classes in kelondro.index

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 10:00:32 +00:00
orbiter
404bc21da9 simplification of (internal) query process / refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 08:48:27 +00:00
orbiter
76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-01 23:58:14 +00:00
orbiter
2df57b1fd1 refactoring of index collection class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-01 23:07:45 +00:00
lotus
39a177649b * added upnp listener for devices that do not respond to discovery but advertise themselves
* moved package

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5659 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-28 14:36:23 +00:00
orbiter
d1d9fbae5c enabling the URLAnalysis to operate on multiple input files; just use a wildcard when calling the class from the command line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 23:47:41 +00:00
orbiter
c728879ab8 fixes to yacyURL - more exceptions in case that urls are strange
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5657 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 22:33:47 +00:00
orbiter
7542336ae5 performance enhancement to yacyURL: omit second processing of resolveBackpath. This method is already applied during initialization of the object and was called a second time when the url was exported.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5656 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 21:52:32 +00:00
orbiter
7ea53fe47b added another url list transformation option:
- check the list and kick out entries with lines that contain invalid urls
- normalize the urls
- remove doubles
- sort the list
- split the list in smaller chunks
This is all done in one process which can be called with a new -sort option

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5655 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 21:51:23 +00:00
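The url list steps named in commit 7ea53fe47b (check, normalize, remove doubles, sort, split into chunks) could be sketched roughly as below. This is an illustration only, not the URLAnalysis code: the class `UrlListSketch`, its method names and the chunk size are hypothetical, and `java.net.URI` is assumed here as a stand-in for yacyURL normalization.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; names and behaviour are assumptions, not YaCy code.
class UrlListSketch {

    // Returns a normalized url, or null if the line is not a usable absolute url.
    static String normalize(String line) {
        try {
            URI uri = new URI(line.trim()).normalize(); // resolves "." and ".." path segments
            if (uri.getScheme() == null || uri.getHost() == null) {
                return null;
            }
            return uri.toString();
        } catch (URISyntaxException e) {
            return null; // invalid entry: kicked out
        }
    }

    // Validates, normalizes, de-duplicates, sorts and splits the list in one pass.
    static List<List<String>> process(List<String> lines, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // A TreeSet removes doubles and keeps the entries sorted.
        TreeSet<String> unique = new TreeSet<>();
        for (String line : lines) {
            String url = normalize(line);
            if (url != null) {
                unique.add(url);
            }
        }
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String url : unique) {
            current.add(url);
            if (current.size() == chunkSize) {
                chunks.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "http://example.org/a/../b", "http://example.org/b", "not a url");
        System.out.println(process(input, 2));
    }
}
```

Holding the whole set in memory is a simplification; a list the size of a full url export would probably need an external sort instead.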
orbiter
e521e81148 bugfix in yacyURL (for latest performance hack)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-26 07:46:47 +00:00
orbiter
54625360f7 performance update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5653 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-25 23:27:21 +00:00
orbiter
d884c4718a added gzip support for URLAnalysis:
url lists can also be compressed with gzip.
If such a file is handed over to URLAnalysis, the output will also be written as a .gz file.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5652 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-25 13:40:51 +00:00
orbiter
46632f4385 performance update to yacyURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-25 00:17:34 +00:00
orbiter
cf9b74e6e3 added another method to process url lists: extract hosts only
This can be used like
java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -host DATA/EXPORT/20090224213823.txt

also changed the call method to generate statistics; please now use
java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -stat DATA/EXPORT/20090224213823.txt

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5650 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 22:51:07 +00:00
orbiter
89d8e824ed memory protection for URLAnalysis
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 22:05:09 +00:00
orbiter
0f6fa804ff performance update to URLAnalysis
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5648 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 21:35:33 +00:00
orbiter
8444357291 added new row iterator in kelondro table files that enumerates rows
without an order by the primary key. The result is a very fast enumeration of the Eco table data structure. Other table data types are not affected.
The new enumerator is used for the url export function that can be accessed from the online interface (Index Administration -> URL References -> Export). This export should now be much faster if all url database files are of type Eco.
The new enumeration is also used by other functions in YaCy, i.e. the initialization of the crawl balancer and the initialization of YaCy News.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 10:40:20 +00:00
orbiter
e8f5f2f612 added tool to analyse url strings
and to generate statistics about words occurring in urls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 10:00:35 +00:00
lotus
6117e083e5 option to customize tray label (tooltip) with tray.label
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-23 21:07:08 +00:00
orbiter
b8c3803bfc don't panic when canceling server sessions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-23 17:15:52 +00:00
orbiter
de714783b1 - added host, path, filename to search result
- modified yacyinteractive, now also shows the date
- added size attribute to export file in xml format

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5639 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-23 11:39:20 +00:00
lotus
9519d84372 changed "dooble" variable to "browserintegration" to be less specific
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-22 17:32:17 +00:00
lotus
8429083972 adjusted tray for dooble:
you can now set dooble=true in yacy.init to disable the menu and browser popups by default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-22 15:11:44 +00:00
orbiter
c852d2d70e - reject too old seeds
- do not store the complete seed in the reverse name cache, only the hash of the peer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-21 00:39:44 +00:00
orbiter
aca973e2d9 catch more exceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 23:24:49 +00:00
orbiter
9559bc23fd automatic clean-up of dead connections
(hope that works well..)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 22:27:02 +00:00
hermens
02dfd6183b Fix logging in serverCore
Prevent NPEs from keeping stopped Sessions in the pool and blocking slots

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5625 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 18:54:01 +00:00
hermens
d30456e2c8 Fix logging in serverCore
Prevent NPE:
I 2009/02/20 15:15:56 PLASMA check for Session_77.37.19.225:38812#0: 86515 ms alive, stopping thread
I 2009/02/20 15:15:56 PLASMA Closing main socket of thread 'Session_77.37.19.225:38812#0'
E 2009/02/20 15:15:56 SERVER receive interrupted - exception 2 = Socket closed
Exception in thread "Session_77.37.19.225:38812#0" java.lang.NullPointerException
        at de.anomic.server.serverCore$Session.run(serverCore.java:623)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5624 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-20 15:12:00 +00:00
orbiter
4f9dae2571 remove reference in crawl entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5623 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 22:58:00 +00:00
orbiter
1ba4301920 automated interruption of dead incoming connections, if they are there for more than one minute
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5622 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 22:27:24 +00:00
orbiter
c12bb8a6d0 - refactoring of the http client
- added a protection against memory leaks for the access tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-19 16:24:46 +00:00
orbiter
5d3983faae the soLinger parameter was wrong.
With soLinger=true the httpd loses connections.
The effect can be seen when crawling the internal repository:
lost connections filled the client process queue until it was full
and no more connections were possible.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-17 16:22:15 +00:00
orbiter
62505bb3cb more bugfixes as recommended by findbugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-17 09:12:47 +00:00
orbiter
6b450d09ca some fixes recommended by findbugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 23:31:54 +00:00
orbiter
4db80065ac select more
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 21:53:37 +00:00
orbiter
94c42691d8 - reject fewer transmissions as transmission receiver
- do not flag too many receivers when something goes wrong during transmission as sender

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5616 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 21:28:48 +00:00
orbiter
f887fc159f try to reduce the large number of unclosed incoming connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 16:26:57 +00:00
orbiter
e04a0e05c3 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 16:21:12 +00:00
orbiter
a9ad863686 second part of 'doubles' fix - better handling of doubles in RAMIndex. More logging.
still missing: deletion of double entries in collections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 16:13:48 +00:00
orbiter
59427064fb first part of 'doubles' fix (not fully ready yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-16 00:47:48 +00:00
orbiter
26978b2a25 - better memory protection in kelondro caches: computation of needed memory for cache growth
- removed excessive gc calls
- step to 16 vertical DHT partitions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-15 23:35:59 +00:00
lotus
e9e2fff47a better scaling on performance graph
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5610 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-15 17:36:13 +00:00
lotus
4aad461100 added UPnP support
YaCy can now automatically forward ports on home routers
off by default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-14 13:12:08 +00:00
orbiter
99b9788e54 fix for possible 100% CPU caused by concurrent access of HashMap
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-14 00:39:53 +00:00
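Commit 99b9788e54 does not say how the concurrent HashMap access was fixed. A plain `java.util.HashMap` is not safe for unsynchronized concurrent modification, and one common remedy is to switch to `java.util.concurrent.ConcurrentHashMap`. The sketch below shows that remedy under this assumption; the class `SharedCounterSketch` and its method are hypothetical and not taken from YaCy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the fix applied in the commit.
class SharedCounterSketch {

    // Several threads update one shared map; merge() is atomic per key on ConcurrentHashMap.
    static Map<String, Integer> countConcurrently(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("key" + (i % 8), 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every writer has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 8000));
    }
}
```

Wrapping every access to a plain HashMap in a `synchronized` block would be another option; which one fits depends on how the map is used.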