Commit Graph

1213 Commits

Author SHA1 Message Date
orbiter
5444b07674 fixed bug with decompression of index abstracts
this fixes a problem that occurred when searching for several words

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 12:39:16 +00:00
orbiter
89e1848db6 fixed problem with favicons:
target servers were able to see search words in the referrer of the favicon fetch.
This is now prevented by using the getImage servlet for the favicon fetch.
Since Java does not support loading BMP and ICO images, parsers for these formats have been added.
The image parsers were coded from the original Microsoft documentation.
This also affects the image-search functionality: there can now be a preview
of found BMP images. Another benefit: favicons for search results are now cached with the HTCACHE.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 01:34:01 +00:00
orbiter
7c5c814a47 - simplified code (removed exception handling where not necessary)
- added confirmation dialog for shutdown and restart

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 14:54:01 +00:00
orbiter
a4e8ad95ab enhancements to news and switchboard queue processing
removed direct access and replaced by iteration

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 13:00:18 +00:00
orbiter
a45216b479 fix to prevent malformed news messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 09:41:55 +00:00
orbiter
bec4dbc753 added options and execution methods for automated updates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-12 16:23:33 +00:00
orbiter
208b5297f1 enhanced handling of news records:
result is a speedup of Surftips, Supporter, and Network page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-05 22:56:37 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
3421c64d26 implemented update function:
after downloading a release using the download button on the status page
the user can choose any of the downloaded versions for an update.
this also enables a downgrade to an older version.
when the update button is pushed, yacy terminates, installs the chosen version
and restarts

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-02 15:16:05 +00:00
orbiter
c1aad9e508 added parameter for network graphic background
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3942 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 23:06:13 +00:00
orbiter
1a45ecb356 - fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=14&p=137#p137
- fix for missing restart script in ant built target
- removed some more synchronization for size() operations
- removed blocking statement on search page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 22:06:33 +00:00
orbiter
f1ed91a8e4 added option to allow/disallow DHT transmission during indexing
see also http://forum.yacy.de/viewtopic.php?f=9&t=8

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3933 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 15:25:33 +00:00
orbiter
9bbd39b67c - removed unfinished auto-updater from roland and martin
- added new download-option for releases on the status page
still missing:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 14:52:26 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
orbiter
0e57a8062b added network definition for different YaCy networks
(needs much more work)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-22 14:29:14 +00:00
auron_x
1d41ebf489 *) made age for deletion of too old seeds configurable
*) changed naming-scheme of seed-deletion-properties

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3918 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-22 10:42:29 +00:00
auron_x
52cb3208d0 *) old (lastseen > 7d) peers are now automatically removed from passive and potential seed-dbs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3917 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-22 09:16:25 +00:00
orbiter
815e3da62f fix for http://www.yacy-forum.de/viewtopic.php?p=37353#37353
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3913 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-18 11:34:30 +00:00
low012
c59a7ce5c2 *) hopefully fixed a stupid bug (my fault of course) that sometimes messed up the marking of search words in the snippets (see http://www.yacy-forum.de/viewtopic.php?p=37329#37329)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3908 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-17 16:29:04 +00:00
orbiter
6518bb6c08 changed release strategy:
we will provide two different releases in the future, one standard release and one 'pro'-release.
the 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version by these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version is initialized, its configuration stays the same regardless of whether a libx folder exists.
The ant targets have been changed. There are now 3 different targets: two to create standard and pro-releases, and one to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases has been removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-16 14:11:52 +00:00
orbiter
069562a14d fixed problem with re-crawl; replaced error file-db with ram-db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3900 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 23:47:08 +00:00
orbiter
c7a614830a several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3899 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 17:45:49 +00:00
allo
465145cb6f revert to insecure, but dau-proof defaults
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3898 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 12:56:52 +00:00
allo
7ad11ceaaa security fix for peers without password. allow access only from localhost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3897 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 00:03:44 +00:00
orbiter
71fd972ac0 - reduced default search time
- caught case when web structure cannot be painted because of too little data
- better logging when balance fails


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3892 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 15:21:01 +00:00
orbiter
684ded0e09 added new news types
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3876 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-12 15:15:24 +00:00
orbiter
d7de0938a6 fix for http://www.yacy-forum.de/viewtopic.php?p=36587#36587
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3870 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-12 08:47:34 +00:00
karlchenofhell
22ee85ca02 - specified exceptions thrown by ResourceInfoFactory and plasmaHTCache.loadResourceInfo()
- caught possible NPE in CacheAdmin_p and added more error-cases
- sped up deletion of entries in the local crawl queue by crawl profile (it has often been noted that this deletion is slow)
- added a bit of javadoc

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3868 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 23:33:24 +00:00
orbiter
dfd5e823c3 automatic limitation of web structure host count
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3867 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 22:08:10 +00:00
orbiter
8b0aea6910 fixed automatic deletion of too many referenced hosts in web structure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3866 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 21:51:56 +00:00
orbiter
9a8a87612d added new qph column to search tracker servlet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3854 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-10 22:02:17 +00:00
orbiter
e07458bad4 added time-out function to web analysis
the default time-out is 1 second

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-10 20:00:44 +00:00
hydrox
4a1bc4743a *) News-entries with blacklisted URLs are now ignored
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3849 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-10 08:05:18 +00:00
theli
339153d40e *) favicons that are specified in the document content via html link-tags
are now detected and displayed on the search page (requested by allo).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-09 15:22:37 +00:00
karlchenofhell
6265d321bd - more constants
- display why global search is not available on search page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3839 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 20:01:16 +00:00
rramthun
18a5380ee3 *) situation-dependent lock-buttons for search-page
*) removed one unused import and a double definition of "ogg" as media-type

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3817 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 15:26:41 +00:00
karlchenofhell
9d6605a83c - fixed NPE in Blacklist Cleaner during deletion of more than one duplicate entry
- don't display responseHeader1.db in CacheAdmin_p anymore

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3814 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 23:36:38 +00:00
orbiter
594ff95955 :-(
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3801 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 11:34:39 +00:00
orbiter
4ca797401e fix for ConcurrentModificationException
see http://www.yacy-forum.de/viewtopic.php?p=36566#36566

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3800 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 10:36:04 +00:00
orbiter
7b904e0077 integrated robots.txt crawlDelay into the crawl balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 07:53:56 +00:00
orbiter
52cb033f01 - slightly different painting of web structure picture:
hosts that have many connections of their own are painted farther away (this is not yet cato's idea, this will be implemented in another step)

- doc update

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3796 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-05 15:32:43 +00:00
allo
6c9df13552 more debugging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3791 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-04 20:30:40 +00:00
allo
d1e1580223 Surftips Blacklist
Blacklists List Hardcoded instead of only updated on firststart / migration.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3788 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-04 15:36:10 +00:00
(no author)
94cc9f05f5 *) Improvements for restart via update wrapper
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3785 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-02 15:25:13 +00:00
borg-0300
2ab020445a bugfix, i think - http://www.yacy-forum.de/viewtopic.php?t=4059
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-31 17:03:02 +00:00
(no author)
ef24bed406 Sorry...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-24 16:25:07 +00:00
(no author)
a29cb2e1af blupp
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-24 16:14:46 +00:00
orbiter
a585b4d41b added web structure image
see http://localhost:8080/WatchWebStructure_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3747 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 15:20:50 +00:00
orbiter
33ad0c8246 added a web structure computation and logging:
- all web page parsing operations will now increase a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using an xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 08:13:48 +00:00
karlchenofhell
7904175338 - sorry for typos
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3743 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 16:22:46 +00:00