Commit Graph

34 Commits

Author SHA1 Message Date
theli
40777556c5 *) Connection Tracking
   - adding automatic refresh
   - accepts new parameter nameLookup which can be used to deactivate 
     yacy-peer name lookup (because we have problems with this on large seed-dbs)

*) ViewFile
   New page that can be used to view 
   - original content 
   - plain text content 
   - parsed content
   - parsed sentences 
   of a webpage specified by its URL hash
   Mainly for debugging purposes at the moment

*) Robots.txt 
   Bugfix for if-modified-since usage
   TODO: synchronization of downloads to avoid loading the same robots-file 
   multiple times in parallel by different threads

*) Shutdown
   Better abortion of transferRWI and transferURL sessions on server shutdown

*) Status Page
   Adding icon to start/stop crawling via status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-18 07:45:27 +00:00
borg-0300
e642a5d8b7 more constants
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-17 15:46:12 +00:00
allo
7ca60f97bf localization Support for Includes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-12 12:44:05 +00:00
theli
c8a35a0130 *) Adding new connection tracking page (currently only for incoming connections)
*) Displaying statistic for incoming connections on status page
*) Bugfix for Loop-Access Bug when trying to access the yacy page while yacy is configured as proxy
   See: http://www.yacy-forum.de/viewtopic.php?p=6826
*) Bugfix for Referer Bug
   See: http://www.yacy-forum.de/viewtopic.php?p=11098#11098
*) Adding reverse Name lookup for yacy-domain names (used by the connection tracking page)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@916 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-12 08:17:43 +00:00
orbiter
c83594528c integrated crawl stacker into thread control
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@887 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-09 15:59:09 +00:00
theli
45f55a6fad *) Bugfix for wrong index-queue size displayed on status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-09 04:58:09 +00:00
theli
a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
   Various checks, like the blacklist check or the robots.txt disallow check, are now
   done by a separate thread to unburden the indexer thread(s)
   TODO: maybe we have to introduce a threadpool here if it turns out that this single
         thread is a bottleneck because of the time-consuming robots.txt downloads
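
A minimal sketch of the idea described above, assuming one worker thread fed by a queue. The class `CrawlStacker`, its method names, and the two `Predicate` checks are hypothetical and only illustrate the pattern; they are not the stackCrawl code from this commit.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;

/** Hypothetical sketch: URL checks run on one worker thread, not on the indexer thread. */
final class CrawlStacker {
    private static final String POISON = "\u0000stop";   // sentinel that ends the worker
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> accepted = new CopyOnWriteArrayList<>();
    private final Thread worker;

    CrawlStacker(Predicate<String> blacklisted, Predicate<String> robotsDisallowed) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String url = pending.take();          // blocks until a URL arrives
                    if (url.equals(POISON)) {
                        return;
                    }
                    // The potentially slow checks happen here, off the caller's thread.
                    if (!blacklisted.test(url) && !robotsDisallowed.test(url)) {
                        accepted.add(url);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();       // stop quietly when interrupted
            }
        }, "crawl-stacker");
        worker.start();
    }

    /** Called by the indexer: returns immediately, the checks run later. */
    void enqueue(String url) {
        pending.add(url);
    }

    /** Processes everything queued so far, stops the worker, returns the accepted URLs. */
    List<String> shutdownAndGet() {
        pending.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted;
    }
}
```

If the single worker became the bottleneck mentioned in the TODO, the same queue could be drained by several workers instead; that variant is not shown here.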

*) improved index transfer
   The index selection and transmission is done in parallel now to improve index 
   transfer performance.
   TODO: maybe we could speed up performance by using multiple transmission threads in 
         parallel instead of only a single one.

*) gzip encoded post requests
   it is now configurable whether a gzip encoded post request should be sent on
   index transfer/distribution
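
A minimal sketch of a switchable gzip encoding for a POST body, using only `java.util.zip`. The class `GzipPostBody` and the boolean `gzipEnabled` switch are hypothetical stand-ins for whatever configuration option this commit introduced; a sender using it would also set the `Content-Encoding: gzip` request header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch: compress a POST body only when the config switch is on. */
final class GzipPostBody {

    /** Returns the raw UTF-8 bytes, or their gzip-compressed form if gzipEnabled is true. */
    static byte[] encode(String body, boolean gzipEnabled) {
        byte[] raw = body.getBytes(StandardCharsets.UTF_8);
        if (!gzipEnabled) {
            return raw;
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer into buf.
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Receiver side: inflates a gzip-compressed body back into text. */
    static String decode(byte[] gzipped) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```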

*) storage Peer (very experimental and not optimized yet)
   Now it's possible to send the result of the yacy indexer thread to a remote peer 
   instead of storing the indexed words locally. 
   This can be done by setting the property "storagePeerHash" in the yacy config file
   - Please note that if the index transfer fails, the index is stored locally.
   - TODO: currently this index transfer is done by the indexer thread. 
     To speed up the indexer
     a) this transmission should be done in parallel and
     b) multiple chunks should be bundled and transferred together
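
A minimal sketch of the fallback behaviour described above: send to the remote peer when "storagePeerHash" is set, and store locally when it is unset or the transfer fails. Only the property name comes from the commit message; the class `StoragePeerSketch`, the `BiPredicate` standing in for the transfer call, and the in-memory list are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.BiPredicate;

/** Hypothetical sketch of "remote storage peer with local fallback". */
final class StoragePeerSketch {
    final List<String> localIndex = new ArrayList<>();           // stand-in for the local word index
    private final String storagePeerHash;
    private final BiPredicate<String, String> remoteTransfer;    // (peerHash, entry) -> transfer succeeded?

    StoragePeerSketch(Properties config, BiPredicate<String, String> remoteTransfer) {
        // An empty or missing property means: no storage peer configured.
        this.storagePeerHash = config.getProperty("storagePeerHash", "").trim();
        this.remoteTransfer = remoteTransfer;
    }

    /** Returns true if the entry went to the remote peer, false if it was stored locally. */
    boolean store(String indexEntry) {
        if (!storagePeerHash.isEmpty() && remoteTransfer.test(storagePeerHash, indexEntry)) {
            return true;
        }
        localIndex.add(indexEntry);                              // fallback: keep the entry locally
        return false;
    }
}
```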


*) general performance improvements  
   - better memory cleanup after http request processing has finished
   - replacing some string concatenations with stringBuffers
   - replacing BufferedInputStreams with serverByteBuffer
   - replacing vectors with arraylists wherever possible
   - replacing hashtables with hashmaps wherever possible
   This was done because function calls to vector or hashtable functions
   take 3 times longer than calls to functions of arraylists or hashmaps.
   TODO: we should take a look at the class serverObject which inherits from hashmap
         Do we really need synchronization for this class?
   TODO: replace arraylists with linkedLists if random access to the list elements is not needed

*) Robots Parser supports if-modified-since downloads now
   If the downloaded robots.txt file is older than 7 days the robots parser tries to
   download the robots.txt with the if-modified-since header to avoid unnecessary downloads
   if the file was not changed. Additionally the ETag header is used to detect changes.
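
A minimal sketch of the revalidation logic described above, assuming the cache stores the download time, the Last-Modified time, and an optional ETag. The class `RobotsRevalidation` and its method names are hypothetical. `If-Modified-Since`, `If-None-Match`, and the 304 status are standard HTTP; the 7-day limit comes from the commit message.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: decide when and how to revalidate a cached robots.txt. */
final class RobotsRevalidation {
    static final Duration MAX_AGE = Duration.ofDays(7);

    // HTTP date format with a fixed two-digit day, always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    /** True if the cached copy was downloaded more than seven days before now. */
    static boolean needsRevalidation(Instant downloadedAt, Instant now) {
        return Duration.between(downloadedAt, now).compareTo(MAX_AGE) > 0;
    }

    /** Builds the request headers for a conditional GET; etag may be null. */
    static Map<String, String> conditionalHeaders(Instant lastModified, String etag) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("If-Modified-Since", HTTP_DATE.format(lastModified));
        if (etag != null) {
            headers.put("If-None-Match", etag);
        }
        return headers;
    }

    /** A 304 Not Modified response means the cached robots.txt can be kept. */
    static boolean keepCachedCopy(int httpStatus) {
        return httpStatus == 304;
    }
}
```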

*) Crawler: better handling of unsupported mimeTypes + FileExtension

*) Bugfix: plasmaWordIndexEntity was not closed correctly in 
   - query.java
   - plasmaswitchboard.java

*) function minimizeUrlDB added to yacy.java 
   this function tests the current urlHashDB for unused urls
   ATTENTION: please don't use this function at the moment because
              it causes the wordIndexDB to flush all words into the
              word directory!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 10:45:33 +00:00
orbiter
dc474aa22f various bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@792 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-26 01:10:41 +00:00
borg-0300
150bd33591 finals;
cleaned;
Properties;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-21 10:48:46 +00:00
orbiter
e17df64b54 removed IS_ADMIN - feature. This was covered by plasmaSwitchboard.adminAuthenticated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-21 09:22:01 +00:00
theli
b2d48ebcef *) Splitting Status Page into Private and Public Information
*) Adding Queue overview to status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-23 08:27:37 +00:00
orbiter
dbba052331 removed internal address presentation in interface according to http://www.yacy-forum.de/viewtopic.php?p=6779#6779
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-15 18:45:18 +00:00
rramthun
ea780a7afc Additions to the language file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-14 20:16:52 +00:00
orbiter
bb3e897baf more minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-03 13:43:55 +00:00
theli
228b04b499 *) Bugfix for "wrong seed-upload timestamp" problem
http://www.yacy-forum.de/viewtopic.php?t=817

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-02 15:36:10 +00:00
orbiter
2181982ce5 disabled buttons on Status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-27 13:03:42 +00:00
orbiter
3470a72d48 fixed div by zero, set default delays, fixed release number format and display
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-26 11:47:50 +00:00
allo
12da0c2758 http://www.yacy-forum.de/viewtopic.php?p=5645#5645
the line with ??? does not work...


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-26 10:26:35 +00:00
orbiter
51962d55bf added 'PPM', page-per-minute statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-18 00:44:51 +00:00
allo
4851b432e1 nice Version display (without svn rev added)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-05 13:43:48 +00:00
orbiter
fbef7fed81 adopted latestVersion to float handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@363 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-03 12:40:36 +00:00
allo
129929b396 Preparations for automatic Languagefile upgrade on new YaCy Version.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-30 19:39:19 +00:00
theli
08e4334c1d *) Status.java: showing amount of time since last upload of seed-file
*) hello.java: adding additional output for principal-downgrade bug
*) httpd.java, httpdFileHandler.java, httpdProxyHandler.java: improved error handling
*) yacyCore.java: trying to fix principal-downgrade bug
*) yacySeed.java: adding some constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@329 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-28 11:27:31 +00:00
theli
83ceec3a63 *) Bugfix for wrong "Statusanzeige" (status display)
See: http://www.yacy-forum.de/viewtopic.php?p=4560

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 07:23:40 +00:00
theli
aea355c03c *) adding test for connection status of port forwarding feature
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 07:57:00 +00:00
orbiter
5d06ded005 enhanced html parser speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 01:26:51 +00:00
theli
9a98988c3c *) Bugfix for SSL/NIO Bug
See: http://www.yacy-forum.de/viewtopic.php?t=516
   - removing NIO from server/serverCore.java because of massive problems
     with socket close issues
*) Adding support for remote port forwarding via sch
   @Orbiter: Please take a look into
   - hello.java
   - server/serverCore.java.publicIP()
   - yacy/yacyClient.java.publishMySeed(...)
*) Making startup loading of additional content parsers more failsafe


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 07:28:07 +00:00
orbiter
f45dc29f35 maintenance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@279 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-15 14:59:53 +00:00
theli
f9a95b5cb8 *) Displaying more user friendly Memory Usage statistic
*) Displaying traffic consumed by yacy 
   - this is not finished yet
   - at the moment only outgoing proxy traffic is counted

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 09:33:15 +00:00
orbiter
ca3b4ccaf4 added snippet-routines (not yet finished)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 00:52:24 +00:00
theli
9af1bf4b38 *) displaying memory usage of yacy in Status.html
*) displaying more expressive uptime information on Status.html and Network.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-07 09:05:21 +00:00
rramthun
85c2f3be8a Fixed spelling mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-12 17:50:45 +00:00
theli
e7f7aa0bb9 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@83 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:36:42 +00:00
orbiter
248077d3f0 initial load with yacy 0.36
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-07 19:19:42 +00:00