Commit Graph

3070 Commits

Author SHA1 Message Date
orbiter
bca87f1e38 - refactoring of serverThreads: renaming to distinguish busy-threads and blocking-threads
- added blockingThreads which are threads that are not driven by pause times but by BlockingQueue lookup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-27 12:03:16 +00:00
orbiter
968c775025 - preparation of parsing/indexing queue for concurrent execution
- remote crawl receipts are now transmitted concurrently in separate threads (makes remote crawls much faster!)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 22:43:38 +00:00
orbiter
9b0e20fb06 next refactoring step in document indexing to prepare concurrency environment for document parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 19:51:05 +00:00
orbiter
7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
- refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling
- removed unused code parts from condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 15:37:49 +00:00
orbiter
d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
this is another step to enable multiple, concurrent fulltext-indexes
- another try to make the yacy-httpc more stable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 14:13:05 +00:00
orbiter
8d6a13bc07 refactoring of parsing-condensing-indexing process:
- separated parts
- removed storagePeer function
next step will be parallelization of processes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-24 22:51:26 +00:00
orbiter
d3b06913ec protection against seed-db failure during enumeration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 23:47:41 +00:00
orbiter
5aa96dbc36 fix for shutdown configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 13:14:57 +00:00
orbiter
93633abed8 - removed some debugging code from search process - should speed up now
- added some profiling code to search event - more time details in PerformanceSearch_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 00:55:04 +00:00
orbiter
fba46c51d7 fixed non-termination bug in qsort
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 23:15:28 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
fc94fbe224 another improvement to the collection sorting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 23:11:04 +00:00
orbiter
11270d450e better quicksort-pivot computation: 30% faster (measured with test program)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 22:01:12 +00:00
orbiter
3e44293f07 - fixed a problem with thread pools in row collection
- added a line-viewing feature in threaddump	

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 14:21:58 +00:00
danielr
e43051b125 - fixed Threaddump output (html-escaped, e.g. <init>)
- in EcoFS converted comments to javadoc


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 10:20:55 +00:00
orbiter
433ff855f7 - fixed another concurrency problem in collection sorting
- fixed a typing problem that was introduced in svn 4579 and caused the crawl monitor to fail

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 23:47:24 +00:00
orbiter
19286fa2d1 tried to fix seed2.old.db-problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 22:35:19 +00:00
orbiter
f3996e63b8 tried to fix more deadlocks:
- changed connection modes in ftpc
- replaced sort thread pool in row collections by a new one using util.concurrent. the old pool had caused blocking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 11:23:43 +00:00
danielr
7008a218b3 avoid ConcurrentModificationException in plasmaCrawlerQueues
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-17 13:51:56 +00:00
orbiter
7150b463ff changed handling of default values and database paths:
- the default files yacy.init and the network definition are now moved to the path defaults
- the httpProxy.conf is renamed to yacy.conf
- the DATA/INDEX/PUBLIC is renamed to the actual network nickname, which should be freeworld or sciencenet
more menu entries
- added apfelmaennchens alternative search page to the menu
- added the new thread dump page to the server log menu point as submenu
modifications
- modified the thread dump page: sorting by thread type

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-16 22:31:54 +00:00
orbiter
7fd094fcbe small bug in ftpc: did not compile in Java 1.5
Please set compiler to Java 1.5-compliance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-16 13:41:49 +00:00
danielr
f51bad8ae5 FTP:
- report connection status (to break if no connection possible)
- fixed isFolder()
- additional error output
- fixed paths with encoded symbols (e.g. a%20file.txt)
- refactoring


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 21:57:55 +00:00
danielr
820641938e ftpc: fixed date parsing, some refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:56:47 +00:00
orbiter
4c584dff87 disabled soLinger to prevent that too many connections stay open (it's a TEST!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:46:55 +00:00
orbiter
9c989fe5f7 fixed deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 00:49:16 +00:00
danielr
c565906050 FTP:
- added maxFileSize-check
- added timeout for download
- fixed dirlist (when all filenames have spaces, change to absolute links)
- enhanced isFolder()
- make sure data connection is closed, so a new can be opened
- refactoring


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 16:28:27 +00:00
danielr
1a7870df0d FTP: source cleanup (added finals, indentation for easier diffs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 12:35:53 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
orbiter
d87d295c68 one more try to fix the connection problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 13:13:11 +00:00
orbiter
a3dadcd89b prevent peers which return a false search result from being disconnected
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:56:18 +00:00
orbiter
ba622bb240 addendum to svn 4553
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4554 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:24:20 +00:00
orbiter
5530b8e1ca reverted changes to yacy protocol classes: they caused the sciencenet to lose connections
a comparison with the main release 0.57 had been made: this showed a stable network
This is an emergency operation to ensure availability of the sciencenet network.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:05:18 +00:00
orbiter
b664a53553 fix for NPE during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-11 15:30:26 +00:00
orbiter
b4ed937f1e - modified zone navigation (still does not work correctly)
- added dht switch in network definition
- 0.574

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4550 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-11 11:09:38 +00:00
orbiter
8d0470a5c6 new method to compute search history IDs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-10 23:40:56 +00:00
orbiter
65785da8f2 new method for best hash computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4548 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-10 23:28:05 +00:00
orbiter
9eddc1506b - one try to fix the httpd problem
- fix for handling of collection index that appears when removing elements
- added another navigation method (stub, not working yet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:58:22 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
danielr
6788f8f7c1 fixed error 'FTPC cannot change directory'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-06 11:59:23 +00:00
orbiter
7ce76c8ff8 added missing file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 22:57:53 +00:00
orbiter
bfed9c2da6 - some refactoring in search process
- separated sidebars in new search interface and placed them in their own files
  which can be put into the search page like plug-ins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 21:46:55 +00:00
borg-0300
3445b1e10b *better logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 13:41:54 +00:00
borg-0300
4b0339fec0 *fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=927
*remove some cast
*Properties added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4525 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 13:29:42 +00:00
orbiter
275a226cc5 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-04 22:45:45 +00:00
apfelmaennchen
bc3d3b4c97 fixed rebuildTags() to correctly rebuild folders...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4523 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-03 22:36:27 +00:00
danielr
fbe335db73 consistent use of de.anomic.server.serverMemory to get information about memory statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-02 15:42:50 +00:00
orbiter
8c06436c4a removing the error-db on each start-up.
This is necessary because the table uses a lot of RAM and the content is never re-used after Start-Up.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4520 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-01 09:44:33 +00:00
orbiter
4fdf695064 - fixed a bug in remote search that prevented that any results had been generated (!)
- added a great number of printStackTrace and new exceptions that shall be used to find the cause
  for a bug in yacy client-server communication which causes the interruption of data transfer
  which then causes the parser bug for the seed strings.
- tried to fix the communication bug on server-side (copy functions)
Be aware that the log may be full of errors and bugs - there should not be more bugs but there is more to see


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 23:12:43 +00:00
borg-0300
0ddbed9451 Less memory consumption at start
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4518 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 20:09:22 +00:00
orbiter
1dce2f1079 more multithreading support:
- replaced some synchronized classes by classes from util.concurrent
- used a util.concurrent.SynchronousQueue to implement a persistent sorting thread in
  the very basic kelondroRowCollection which supports sorting with a second thread
  in case that a double-core processing CPU is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 15:16:47 +00:00
orbiter
6779b455d7 another fix for the punycode parser/generator (should work now!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-26 23:00:20 +00:00
orbiter
1b127406d0 update to punycode encoding (still not working)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4515 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-26 22:37:23 +00:00
orbiter
83860507c9 - added punycode class from gnu idn library
- added parser for international domains in yacyURL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-26 22:18:40 +00:00
orbiter
253a453413 removed possible synchronization deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-26 14:05:43 +00:00
orbiter
3f321ece7d added a search history to the new search page
the history distinguishes between different users and identifies them by their ip
a history is only shown to the user who submitted the search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 21:26:49 +00:00
orbiter
c48e25d784 - fixed selection box for topwords
- fixed parser detail in condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4509 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 19:00:11 +00:00
orbiter
87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
- enhanced text parser (condenser): found and eliminated bad code parts; increase of speed
- added handling of image preview using the image cache from HTCACHE
- some other minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-25 14:08:15 +00:00
low012
652086159a *) Replaced System.err.println() by logging function. Left System.err.println()s as comments to be able to quickly revert changes since gzip is an application with its own main method and Orbiter maybe wants to keep it this way.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4505 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-24 19:56:19 +00:00
orbiter
677ee2ea04 added remove operation to collection index (re-activation)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4503 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-23 00:14:11 +00:00
orbiter
d477483373 stronger criteria to use RAM copy to use table copy
(should use less RAM)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4502 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-22 23:46:27 +00:00
orbiter
a7abee3578 - fixed some data types in new search stack
- added image domain presentation to image preview
- added new search page to menu
- added automatic re-search when an old search profile is requested and a crawl is ongoing,
  to fetch newly crawled entries

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4501 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-21 23:40:38 +00:00
orbiter
81687b6bd5 added missing hashCode computation for previous feature
this also solves the missing image double-check feature!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4500 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-21 15:37:46 +00:00
orbiter
bedd8dfbe2 - added image sorting by image size. This is the default now.
This is performed using a 3-stage sorting process:
  - sort by relevance, then do snippet-fetch
  - sort snippets by relevance then do image link extraction
  - sort image links by image size; unknown sizes are handled like small sizes
- only the exact amount of images as requested are shown

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-21 14:53:51 +00:00
orbiter
727feb4358 - fixed some bugs in ranking computation
- introduced generalized method to organize ranked results (2 new classes)
- added a post-ranking after snippet-fetch (before: only listed) using the new ranking data structures
- fixed some missing data fields in RWI ranking attributes and correct hand-over between data structures

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-21 10:06:57 +00:00
orbiter
f4c73d8c68 - fixed highslide usage
- some enhancement to index management, better types

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-19 14:13:35 +00:00
orbiter
2327451653 - changed order of database initialisation (index first)
- removed mainly unused init-time for databases (was only used for tree tables, which are not used any more)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4496 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-19 09:14:07 +00:00
orbiter
3441ec3928 - some small changes to highslide integration to get it working... (does not work yet)
- performance enhancement for url list parser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4495 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-18 23:49:03 +00:00
orbiter
6c3cd2b4f2 - added new way to watch images from the image search:
they appear as separate, floating window above the search results,
  not in a new window
- added highslide javascript library for feature mentioned above
- removed dir servlet. This thing was not used as it was supposed to be (as an example applet)
  and was a major problem for intranet-indexing when files are hosted on the same peer.
- added yacy-httpd-internal directory listing. Because YaCy is a search engine,
  directory listings are similar to search result listings. Intranet indexing from the same peer
  will get nice index pages for document collections.
- removed unused test applet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-18 16:38:06 +00:00
orbiter
61a81820e3 - refactoring of search tracker
- added link to search history to repeat the search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-17 23:35:48 +00:00
lulabad
9ecc17baef fixed double Blog entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4492 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-17 13:03:40 +00:00
orbiter
36b898ca7a - tested successfully z-presentation of yacy seed encoding
- added alternative switch that takes shortest representation as yacy seed string encoding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-17 12:36:43 +00:00
orbiter
066c88140f quickfix for OOM, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=875&hilit=&p=5686#p5686
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-16 12:16:53 +00:00
orbiter
4079c38ce0 - probably slightly better default ranking
- added experimental right column to new search page (no function, only container)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-16 12:13:38 +00:00
orbiter
8fd5e52f04 added basket icons and experimental gif animation class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-16 10:40:00 +00:00
lulabad
94e256e13b * removed single Blogview, now links direct to BlogComments.html
* some other small changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-16 09:32:29 +00:00
orbiter
ff5969901c modified dir servlet to cooperate with intranet indexing from the own HTDOCS repository:
- removed md5 file generation (spoils the own repository)
- removed comments in file share (was never used)
- moved dir list comparator to other place (maybe solves problem, let's see)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-15 13:12:25 +00:00
lulabad
00f5f917de - more refactoring to blog
- fixed moderate comment bug. see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=860

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-12 19:17:17 +00:00
orbiter
f890b039ee experiments with openstreetmaps
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-12 15:39:32 +00:00
orbiter
7f445f34a6 please switch on the Java 5-typical warnings!
(an unboxing error pointed to a program bug, and a type declaration was missing)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4476 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 22:50:09 +00:00
lulabad
c1b9a03304 * some refactoring to Blog
* changed default sort order to reverse (newest first)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 21:31:11 +00:00
lulabad
766a04bc06 fixed sort problem in Blog. see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4474 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 18:35:28 +00:00
borg-0300
bfe171e693 Small change (generics)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 16:13:13 +00:00
borg-0300
2589290ded better ping
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 15:57:52 +00:00
borg-0300
dae9053b21 BUGFIX
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-09 11:43:15 +00:00
borg-0300
77ba446332 seedDB helpers update/cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4461 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-08 14:06:34 +00:00
borg-0300
dd215e7f6b NPE fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-08 12:26:03 +00:00
orbiter
bd63999801 - faster search: using different data structures that avoid multiple calculations
- no more table copy for error-eco table
- optional table copy for lurl-entries
- more abstractions (less single constant strings)
- better logging (using host names instead of ips)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-07 22:16:36 +00:00
lulabad
8358652fa9 some small changes to blog
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-06 21:15:24 +00:00
orbiter
159aaf8889 re-introduced global search limitation when index receive is switched off
this was necessary because otherwise robinson peers also did global searches, which cannot be a wanted effect

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-06 20:29:22 +00:00
borg-0300
a9c4e9c309 Small change (ping)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-06 16:45:13 +00:00
borg-0300
9ab6ad8b73 more seedDB helpers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-06 12:32:41 +00:00
lulabad
6a85764e1a Second bugfix for numberbug in Blog.
This update automatically fixes existing blog entries.
A backup is not needed but always a good idea ;)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-05 21:52:14 +00:00
orbiter
efd5807a7c - some renaming of variables to support DC
- initial 120mb RAM for fresh peers
- release 0.57

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4445 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 22:58:40 +00:00
lulabad
40a0591942 Fixed numberbug in Blog, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639. This won't fix existing blog entries (comes later).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 21:46:18 +00:00
orbiter
141db7ba48 there is less RAM needed for eco table (it's just a security-plus for RAM check)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 15:51:51 +00:00
orbiter
249d61759a fix for false RAM table activation in EcoTables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-04 10:33:45 +00:00
orbiter
ff6b69b37e fix for NPE in access tracker
fix for NPE in word index


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 21:47:27 +00:00
orbiter
3c7b94c119 - fix for online caution delay settings, see
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=738&p=4723#p4723
- removed remote search limitation for non-dht-peers according to discussion in
  http://forum.yacy-websuche.de/viewtopic.php?f=15&t=793&hilit=&p=5277#p5277

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 20:11:50 +00:00
orbiter
f35a3794e0 auto-healing (deletion) of bad peer addresses during start-up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4437 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 18:42:25 +00:00
orbiter
42c1e11f2b added another link double-check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 12:40:40 +00:00
orbiter
a5d388bfff fix for HTCache organisation that may have caused unlimited growth of the cache
appeared only for tree-caches

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 11:21:50 +00:00
orbiter
96c5e6acc7 added a double-check for search results
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4432 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 02:55:21 +00:00
orbiter
a1e9e6e2e6 fix for search result page navigation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 02:23:04 +00:00
orbiter
7404256997 - no more search time-out!
- fixed a bug with last commit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4430 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-02 23:53:39 +00:00
orbiter
cd3e0d6f03 tried to fix another eco bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-02 21:36:19 +00:00
orbiter
08a12e9bb5 - removed dashed line from default skin (looks much better!)
- better timing when displaying results

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-02 11:30:47 +00:00
orbiter
89169d54fd fixed search result preparation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-02 00:16:00 +00:00
orbiter
acf771d5e1 - fixed bug with too much RAM in crawler queue
- fixed dir bug
- better calculation of TF for join
- better waiting-on-result logic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-31 23:40:47 +00:00
orbiter
a8a5df4a51 - more dublin core naming of page metadata
- better presentation of result counters in search results

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-30 21:58:30 +00:00
orbiter
fa3b8f0ae1 fixed bug in remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-30 00:15:43 +00:00
orbiter
7d875290b2 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 22:13:30 +00:00
orbiter
9d693ee635 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 16:41:09 +00:00
orbiter
0f5c4abaca more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 10:12:48 +00:00
orbiter
974fea7933 added term-frequency ranking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4413 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 23:41:39 +00:00
orbiter
1a296af6ff more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 20:08:32 +00:00
orbiter
4a80902081 - added ViewProfile as rdf in foaf syntax
- added link to rdf and vCard version on html page
- can be seen on http://localhost:8080/ViewProfile.html?hash=localhash
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 18:21:08 +00:00
orbiter
da8c850a25 disabled IO path optimization (seems to block other methods)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-26 00:21:37 +00:00
hermens
d177ceb3b3 Fix for growing responseHeader[12].db when using proxyCacheLayout = hash
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-25 21:56:25 +00:00
apfelmaennchen
b1fae9b5af fixed import Netscape Bookmarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-25 19:22:36 +00:00
orbiter
2485681002 added termination control for RotateIterator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4399 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-25 11:44:27 +00:00
orbiter
e2e7f065e9 minor fixes, some generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 23:58:18 +00:00
orbiter
15397298dc - refactoring of indexControlRWIs: moved statics to own class; better Dublin Core naming
- fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=759&hilit=&p=4866#p4866
- some bugfixes in EcoTable regarding the remove method
- switched more tables to Eco: crawl Profiles, htcache, seeddb, newsdb

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 22:49:00 +00:00
apfelmaennchen
f3a9e9c542 added getFolderList() to bookmarksDB
added cleanTagsString() to bookmarksDB
added getFoldersString() to Bookmark
modified getTagsString() to exclude folderTags

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:11:57 +00:00
orbiter
db25425893 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 23:08:32 +00:00
orbiter
9e7cd4fdbb more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 21:23:17 +00:00
orbiter
4e70dff8cf more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 21:09:56 +00:00
orbiter
6dc679785f - fixed bad sort behavior of kelondroRowSet, in this case: no sort at all!
see http://forum.yacy-websuche.de/viewtopic.php?p=4841#p4841
- some memory calculation enhancements in kelondroFlex and a little bit more logging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 20:18:36 +00:00
orbiter
0b4205eb5a - fix double-deletion in eco tables
- changed behaviour of sort moment (not during a get)
- added some asserts in snippet cache for debugging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 11:13:39 +00:00
orbiter
4ce6fab428 added special handling for doubles in eco tables after initialization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 21:40:25 +00:00
orbiter
002a109c4d patch for http://forum.yacy-websuche.de/viewtopic.php?p=4597#p4597
(urls that have no protocol but start with www will be treated as http://www...)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:49:26 +00:00
orbiter
634430c48a - more logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:44:12 +00:00
orbiter
d372a78aef some fixes to bring back lulabads peer..
see also: http://forum.yacy-websuche.de/viewtopic.php?p=4772#p4772

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:02:20 +00:00
low012
f4799c2334 *) removed since I decided to turn this into a project of its own using Perl to gather n-gram data which YaCy will be able to use
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:59:21 +00:00
orbiter
4ffbcd54a4 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=754
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:10:03 +00:00
apfelmaennchen
e81bced2bd reorganized the code and adjusted getTagIterator() to suit folders
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4357 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:08:32 +00:00
orbiter
85dc62c16f refactoring: more dublin core - compliant naming
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4354 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:03:47 +00:00
orbiter
efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
- refactoring of plasmaParserDocument to use Dublin Core - compatible property names
- redesign of url handling in parser and condenser (less String-to-yacyURL conversion)
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 11:51:43 +00:00
low012
cfd4fecd12 *) blanks in paths for restart and update script are replaced by backslash+blank now (see http://forum.yacy-websuche.de/viewtopic.php?t=745)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-21 18:04:08 +00:00
orbiter
f945ee21d2 some security additions, keep maximum byte[] size to 2^27
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 23:46:27 +00:00
orbiter
2f3b2f3481 - extended dbtest for comparison tests
- added initial space option for eco tables
- used initial space value in initialization of collectionIndex, this should avoid OOM failures
- added index consistency check (checks for double-occurrences of primary keys in file)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 21:42:35 +00:00
orbiter
9eb746863d interface enhancements for eco records memory statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 01:51:02 +00:00
orbiter
9abc927645 to fix inconsistencies in collection index, a double reference reporting mechanism has been implemented
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 01:22:46 +00:00
orbiter
58a1f518f8 fixed some problems with eco tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 12:23:56 +00:00
orbiter
d4d07802ac better RAM protection using eco tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 01:50:24 +00:00
orbiter
f4e9ff6ce9 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 00:40:19 +00:00
orbiter
cbefc651ac more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-18 18:43:56 +00:00
orbiter
45339c3db5 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-18 17:14:02 +00:00
orbiter
94f21d9403 activated new kelondroEcoTable file structure.
This data structure replaces almost all files in the PLASMA directory
the collection.index and the LURL-db will also be created as Eco-DB if they do not exist yet
existing Flex-databases will be used as they are (no data is lost)
If you want to force the creation of an Eco-collection.index, simply delete the old index.
The Eco file system will only be used if there is enough memory.
The collection.index RAM limit is 200MB; if you have less, a flex-Table is created.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 21:48:08 +00:00
orbiter
a0f7f2faad some more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4338 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 18:43:01 +00:00
orbiter
dc26d6262b - removed write buffer from kelondroCache (was never used because buggy; will now be replaced by new EcoBuffer)
- added new data structure 'eco' for an index file that should use only 50% of write-IO compared to kelondroFlex
The new eco index is not used yet, but already successfully tested with the collectionIndex
The main purpose is to replace the kelondroFlex at every point when enough RAM is available.
Otherwise, the kelondroFlex stays as an option in case of low memory (which then can even use a file-index)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 12:12:52 +00:00
orbiter
dbdec0f4d3 another fix for the "too many processes in loader queue, dismissed" - problem:
this was probably caused by http-forward cases; which are cases when urls from the loader queue change
and it was not possible to remove the old urls from the queue because they had been based on url hashes.
The queue is now again stored using the entry.hashCode, which does not change even if the url changes.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-13 23:10:09 +00:00
orbiter
b806a6af8b renamed kelondroEcoRecords to kelondroFullRecords (the "Eco"-name will be used for something else)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4331 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-13 00:41:22 +00:00
orbiter
065ba2d60f fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=719&hilit=
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4330 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-13 00:21:47 +00:00
orbiter
f3f02b08ec no distinction between standard and pro releases in auto-updater
this did not work in 0.56 main (but should have).
Therefore it will be necessary to provide a hand-made 'virtual pro' (just renaming) release 0.57

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4326 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-12 00:43:33 +00:00
borg-0300
3cab85158c update for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4325 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-12 00:41:45 +00:00
borg-0300
53367d941a more information (BASE64)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4324 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-12 00:24:24 +00:00
orbiter
b3636f5ba8 re-implemented file index in kelondroFlex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-11 14:43:28 +00:00
orbiter
a6ca3b51be more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-11 14:13:08 +00:00
orbiter
a5054c038d - added large number of generics
- redesign of ordering structures in kelondro (old did not work with strict generics)
- 50% IO reduction during read access on kelondroFlex (omitting the read on the index table)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-11 00:12:01 +00:00
orbiter
71bcf02d3a - removed pro-version (is the same as standard version, use the standard instead)
- changed yacy logo
- removed crawlOrder protocol (unused)
- removed file index in kelondroFlex (will not work, it takes too long to maintain)
- fixed remote crawl for clusters (now denies remote crawls from peers outside the cluster)
- 0.562

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 23:05:52 +00:00
orbiter
ce7257483d fix for bad fix with random access files (no performance enhancement)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 16:00:05 +00:00
apfelmaennchen
704de4dee8 Added new function - needed for restricting the tag cloud
public Iterator getTagIterator(String tagName, boolean priv)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4313 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 15:58:47 +00:00
orbiter
016fc594af more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 09:58:56 +00:00
orbiter
ecd7f8ba4e - added NEAR operator (must be written in UPPERCASE in search query)
- more generics
- removed unused commons classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4310 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-08 20:12:31 +00:00
orbiter
3e3d2e39a4 - some refactoring and redesign of kelondroBytesIntMap (created new class kelondroRAMIndex)
- more generics
- preparation to extend the balancer for flexible forced delay times
- set different random-access type, should now omit update of metadata in file and could be a bit faster (let's see)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-07 22:36:48 +00:00
orbiter
03e7782269 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 19:23:38 +00:00
orbiter
f7c5ccedc7 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 00:31:26 +00:00
low012
7af60fb24d *) fixed bug in update script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4299 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-03 20:22:21 +00:00
low012
ae6d07bdb8 *) "Did you mean:" will only be displayed if the list of suggested URLs is not empty.
*) Removed <hr /> to make the "404 Unknown Host" error page look like the other 404 error pages.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-01 23:03:02 +00:00
low012
408cb7a29b *) added check if archive for update is OK, install if OK, else just restart (http://forum.yacy-websuche.de/viewtopic.php?t=663)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-31 14:52:17 +00:00
orbiter
df2a7a8ac8 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 18:47:45 +00:00
orbiter
9d8b17188a more generics, bugfixes for wrong cast
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 03:39:36 +00:00
low012
b08f877e97 *) tried to get rid of warnings when compiling parsers (http://forum.yacy-websuche.de/viewtopic.php?t=660)
lots of warnings are gone, new one in htmlFilterContentScraper


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4293 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 22:37:02 +00:00
orbiter
4dc438f7e7 moved to Java 1.5:
- changed build script to use java 1.5 compiler
- first step to resolve missing generics definition (about 400 from over 4100 'missing'-warnings)
- added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet)

please set your development environment to use java 1.5!


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 17:56:59 +00:00
orbiter
db0d3d5e54 release 0.56 (and some last fixes)
- fixed bad peer hash computation in case no peer list is available upon first startup
- security minimum waiting time in search result preparation
- removed dead superseed link

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-22 02:58:38 +00:00
fuchsi
d517e96714 last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-21 00:53:46 +00:00
hermens
4748d5c1ab Some enhancements to time management:
- remove unnecessary generation of Calendar and Date objects
- synchronized SimpleDateFormat objects in blog-, message- and wikiBoard
- correct use of TimeZones and SimpleDateFormats



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 17:11:35 +00:00
orbiter
52dd015218 new release strategy: the standard release is now built the same way as the pro release
a new release type was added: 'embedded', which is the same as what the standard release was until now
this will not have any effect on the next release 0.56, which will still be a pro-release on public download
the transition to the new release strategy must be done now to enable automatic update by the updater in future releases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4287 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 02:46:41 +00:00
fuchsi
1cb6e431a6 Replace the ISO8601 aka W3C datetime parser by one that supports every representation allowed by this standard, see http://www.w3.org/TR/NOTE-datetime
- useful especially for sitemap parsing, where this date format is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 22:45:58 +00:00
fuchsi
33ee6745f6 more cleanup in serverDate
- remove direct accesses to SimpleDateFormat fields in serverDate and use the static parse... methods instead
- remove nowDate() as a Date doesn't store timezone information and a new Date() is always faster
- default formatter methods use a GMT timezone by default now, this is important for interchangeability as some date formats we use don't include a timezone offset.
- continued renaming and rearranging (formatter) methods. all should follow the general naming scheme formatWHAT(...)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 19:39:19 +00:00
fuchsi
3c30c2da75 more cleanup and API consistency changes, more to come...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4284 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 13:29:50 +00:00
fuchsi
f41172f850 Merge httpDate into serverDate as suggested. Removed some unnecessary code and fixed a possible synchronization problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4283 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-18 22:35:02 +00:00
fuchsi
a52681dd49 add buffering for the performance graph to avoid ConcurrentModificationException
closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=628

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-18 15:59:35 +00:00
orbiter
814aff60bd - (re-)activated ftp protocol. see discussion here: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=623&hilit=&p=3875#p3875
- set default-flushsize for pro to 500

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4280 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-18 00:14:44 +00:00
low012
6fbda9ef4f *) cleaned up code
*) added new filter 'FILTER_INVERT' and new method ymageMatrix.invert() to use it (does not work where characters have been written with ymageToolPrint.print(), haven't found the reason yet)
*) fixed a possible ArrayIndexOutOfBoundsException in filter() if the y-value of the area to filter was larger than the height of the image the filter is applied to


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4279 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-16 22:15:27 +00:00
fuchsi
21f7e13fa1 fix stupid tiny bug introduced in rev 4276 that broke request URL parsing almost completely
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4277 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-16 00:33:32 +00:00
fuchsi
5d406d0094 - fixed url "file extension" parsing when there is no extension (like http://yacy.net/ would have extracted .net/)
- removed unnecessary code + minor cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-14 20:03:26 +00:00
fuchsi
21b8d1b918 small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-14 19:17:54 +00:00
orbiter
270d016d89 fix for missing anonymization in search profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4274 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 18:57:43 +00:00
orbiter
e3e4f06be4 enhanced search result preparation in the case that no result is found (fast abandon of search)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4273 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 14:18:42 +00:00
fuchsi
1bd02762de Improve HTTP/ICAP header processing.
- workaround for illegal line endings (LF only), closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=595
- fixed bug where we didn't break the processing immediately on EOS (the loop was run until the buffer was completely filled with -1)
- further performance improvements (one simple loop, avoid double processing of every byte and unnecessary temporary buffers)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4270 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 06:37:18 +00:00
orbiter
01554f4012 fixed bug with double-check in crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4269 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 01:32:25 +00:00
orbiter
b1e08d354c repaired indexing after search snippet loading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4268 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 00:33:26 +00:00
orbiter
48138952ff added memory measurement for index recreation to avoid OOM during index RAM space extension
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4267 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-11 15:07:03 +00:00
orbiter
9e23acf2d6 introduced new 'authority' ranking property
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-11 01:32:58 +00:00
orbiter
a3bfd668aa opening of array files at startup time, not when the web index is first accessed
this speeds up the first search after startup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 02:40:16 +00:00
orbiter
ca488e03f5 fixed authorization case
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4262 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 02:04:48 +00:00
orbiter
6a3a292015 - smoothed ymage font
- changed position of status banner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4261 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-10 01:47:04 +00:00
low012
7397152e04 *) quick hack for antialiasing, works only on borders now => less blurry image
*) code is not finished, needs refactoring, still thinking about how to do it best


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4260 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-09 19:17:24 +00:00
orbiter
2954f96fae - removed public peer info box on status page, this info can now be seen in the status banner
- added peer count to banner
- added some values to protected status box

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4257 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-08 01:39:59 +00:00
low012
4eb40c4f61 *) added 2 filters: blur and antialiasing (which in fact is nothing more than a mild blur) to ymageMatrix
*) antialiasing is used for logo in banner


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 22:51:13 +00:00
orbiter
aeb1cf83a6 - corrected banner link (relative now)
- changed color mode (replace) for banner
- changed default color (fits to default skin) of banner in status

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4255 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 21:25:36 +00:00
orbiter
e22014dc83 some memory enhancements when generating and displaying ymage objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 02:15:12 +00:00
orbiter
f243e338cf implemented online caution also for local and remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 21:53:17 +00:00
orbiter
c57eb76b13 removed CMY color model from ymage classes and re-introduced RGB color model
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-06 01:06:17 +00:00
orbiter
b46bcaa5d8 changed method of profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-04 20:19:13 +00:00
low012
76cd6ed6f6 *) New methods to insert bitmaps that feature transparencies.
*) Logo background is transparent now. (Using pixel at (0,0) to determine which color is transparent. Too dirty?)
*) Logo is loaded through filesystem instead of HTTP now.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4247 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-04 19:45:50 +00:00
orbiter
be214e594f - generalized ymage initialization options
- auto-adaptation of performance memory graph to needed dimension

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-03 02:35:28 +00:00
low012
ee8a177c26 *) Logo is in the middle of free space now.
*) Fixed bugs in insertBitmap()


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-02 21:20:11 +00:00
low012
72698fcd36 *) Banner features a logo now. It does not look nice, but at least it works. Banner is not finished yet.
Which path do I have to set for IMAGE (htroot/env/grafics/yacy.gi) if I want to load it through the file system and not via HTTP?


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-02 20:37:12 +00:00
fuchsi
39d0f10ca1 Fix parsing of dates in HTTP headers.
RFC 2616 requires a client to support RFC 1123 (default), RFC 1036 and ANSI C formatted date strings (we only supported 1123 before).

Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=525 (and probably others). There are servers which break the standards, please report those "DATE ERROR" messages if they contain a "sane" date string.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 20:47:27 +00:00
orbiter
aefb3f7765 added memory graph picture to PerformanceMemory_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-30 03:22:42 +00:00
orbiter
9b0ae4b989 added referrer to remote crawl url list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 13:58:00 +00:00
fuchsi
18e516317d Fix problem with buggy HTTP-Servers which send illegal control characters in HTTP-Headers, they are ignored now.
Thx to celle for the patch and see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=560 for more information.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 06:02:45 +00:00
orbiter
7d5544e9b1 added some security checks to new remote crawl pull method to prevent the indexer from being overloaded
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:54:59 +00:00
orbiter
89b9b2b02a redesigned remote crawl process:
- instead of pushing urls to other peers, the urls are actively pulled
  by the peer that wants to do a remote crawl
- the remote crawl push process has been removed
- a process that adds urls from remote peers has been added
- the server-side interface for providing 'limit'-urls exists since 0.55 and works with this version
- the list-interface has been removed
- servlets using the list-interface have been removed (this implementation did not properly manage double-check)
- changes in configuration file to support new pull-process
- fixed a bug in crawl balancer (status was not saved/closed properly)
- the yacy/urls-protocol was extended to support different networks/clusters
- many interface adaptations to new stack counters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:07:37 +00:00
fuchsi
69521d92e5 Add another external dependency from PDFBox package ("Bouncy Castle"). This is necessary for parsing of some encrypted PDF files.
bcprov-jdk14-132.jar is the binary jar as it is provided in the PDFBox-0.7.3 package (same as our FontBox, PDFBox packages).

Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=453


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-27 23:13:26 +00:00
orbiter
90a02990d2 NPE fix, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=549&hilit=&p=3383#p3383
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 09:26:35 +00:00
orbiter
2fcd18a972 - fixed bad behaviour of search event worker processes
- fixed export of url lists in xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 01:08:16 +00:00
orbiter
445c0b5333 added domain list extraction and html export format
to URL administration menu http://localhost:8080/IndexControlURLs_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 20:47:06 +00:00
orbiter
d8d77fc4b2 fix for NPE, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=549&hilit=&p=3368#p3368
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 18:15:28 +00:00
orbiter
bf6952abe7 - added url export to http://localhost:8080/IndexControlURLs_p.html
- removed command-line option to export urls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 16:52:44 +00:00
orbiter
af10f729df fixed image search and favicon loading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-22 01:34:29 +00:00
orbiter
c48b73cda2 redesign of ranking data structure
- the index administration now uses the same code base for url selection and collection
  as the search interface. The index administration is therefore a good test environment for
  ranking order control
- removed old postsorting-algorithms, will be replaced with new one
- fixed many bugs that occurred during ranking; in particular, the constraint filtering method
  removed too many links
- fixed media search flags; had been attached to too many urls. The effect should be a better
  pre-sorting before media load within snippet fetch

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-21 23:14:57 +00:00
orbiter
6f1308da2f - some enhancements to IndexControlURLs (shows more links, connects referrer to another query)
- some refactoring to search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-17 01:53:02 +00:00
orbiter
c527969185 - enhanced monitoring of ranking parameters
for details, please try http://localhost:8080/IndexControlRWIs_p.html
- fixed computation of ranking ordering in some cases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-16 14:48:09 +00:00
orbiter
bd5673efbe added cleaning of search event before opening the index administration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-15 12:49:13 +00:00
orbiter
55da871211 preparations for better ranking: better debugging of index properties
to do this, the index administration interface was extended.
It is now possible to select parts of an index.
See properties shown in interface after a word search for details.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-15 03:03:18 +00:00
low012
383dc815d2 *) fix for commit 4212
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-14 19:14:53 +00:00
orbiter
3491531cea - fixed 'appears in url' flag in index generation
- extended index administration page, now shows some properties of the web links

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4216 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-14 01:15:28 +00:00
orbiter
ec7ba0d3d0 - fixed problem with too small sort fields (sortbound was not set)
- slightly changed handling of date in indexURLEntry

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-13 02:24:10 +00:00
orbiter
bc2368e907 fix for problem with remote crawl referrers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 16:32:50 +00:00
orbiter
875096552f fix for NPE in case that remote search results are empty
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4209 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 15:54:50 +00:00
orbiter
64b3b79e44 - fix for termination problem with uniq()
- addition to seed dna interpretation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 14:39:30 +00:00
orbiter
0abf33ed03 - tried to remove deadlock
- enhanced searchtime in kelondroRowSets
- enhanced uniq() - reverse enumeration takes less time in case of mass removal of duplicates

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 01:14:51 +00:00
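A minimal sketch of why reverse enumeration helps in uniq() (commit 0abf33ed03), assuming an array-backed, already sorted list. The class and method names are hypothetical and not taken from kelondroRowSet. Walking from the end means every removal only shifts the tail that has already been reduced to unique entries.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical illustration; not the kelondro implementation.
class Uniq {
    // Removes adjacent duplicates from a sorted list, walking from the end.
    static <T> void uniq(List<T> sorted) {
        for (int i = sorted.size() - 1; i > 0; i--) {
            if (Objects.equals(sorted.get(i), sorted.get(i - 1))) {
                // Only elements behind index i are shifted, and those are already unique.
                sorted.remove(i);
            }
        }
    }
}
```

With a forward walk, each removal would shift all remaining duplicates as well, which is what makes mass removal expensive.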
low012
a4010f7dc8 *) fixed bug where dots were added after numbers < 1000: "123" was transformed to "123." which is undesirable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-11 21:42:50 +00:00
orbiter
2421127612 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=513&hilit=
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4204 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-11 01:32:54 +00:00
orbiter
d0d2771883 disabled multiprocessing of rowCollection.sort for testing purposes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-11 00:28:22 +00:00
orbiter
edc4da5317 fix for division by zero in test routine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4201 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-10 08:57:00 +00:00
orbiter
df38aaf7bd update to RowCollection sort speed-enhancements:
- better handling of small collections (less overhead)
- usage of pre-sorted limits
- different re-sort limit
- more testing procedures

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-09 15:34:11 +00:00
orbiter
0eb60cfe6f better handling of seed properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4199 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-09 09:40:42 +00:00
orbiter
ecba35de72 enhanced computing speed of kelondro core function: sorting
the enhancement was made by using better organized data structures and
multi-threading during the sort. A sort can be divided into two separate
processes when the first partition of the quicksort algorithm was done.
Generating a separate thread and starting the thread takes only 10 milliseconds,
so using a separate thread only makes sense if the data amount is large.
statistics about the speed-up:
without enhancement: 250 milliseconds for 100000 entries
with data structure enhancement: 170 milliseconds for 100000 entries
with additional second thread (if second processor is present): 130 milliseconds.

For dual-processor systems, this means about 100% speed-up
a test can be made with the following command:
java -classpath classes de.anomic.kelondro.kelondroRowCollection


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4198 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-09 00:51:38 +00:00
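The approach described in commit ecba35de72 can be sketched as follows: after the first quicksort partition the two halves are independent, so one of them can be handed to a second thread when the data set is large and a second processor is present. This is only an illustration on an int array; the class name, the threshold value and the partition scheme are assumptions and not the kelondroRowCollection code.

```java
// Hypothetical illustration; not the kelondroRowCollection implementation.
class TwoThreadQuicksort {
    // Assumed size below which starting a thread costs more than it saves.
    static final int PARALLEL_THRESHOLD = 10000;

    static void sort(int[] a) {
        if (a.length < 2) return;
        int p = partition(a, 0, a.length - 1);
        boolean parallel = a.length >= PARALLEL_THRESHOLD
                && Runtime.getRuntime().availableProcessors() > 1;
        if (!parallel) {
            quicksort(a, 0, p - 1);
            quicksort(a, p + 1, a.length - 1);
            return;
        }
        // Both halves are independent after the first partition.
        Thread second = new Thread(() -> quicksort(a, p + 1, a.length - 1));
        second.start();
        quicksort(a, 0, p - 1);
        try {
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sort interrupted", e);
        }
    }

    static void quicksort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int p = partition(a, lo, hi);
            // Recurse into the smaller part to keep the stack shallow.
            if (p - lo < hi - p) {
                quicksort(a, lo, p - 1);
                lo = p + 1;
            } else {
                quicksort(a, p + 1, hi);
                hi = p - 1;
            }
        }
    }

    // Lomuto partition using the middle element as pivot.
    static int partition(int[] a, int lo, int hi) {
        swap(a, lo + (hi - lo) / 2, hi);
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);
        return i;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Calling TwoThreadQuicksort.sort(data) sorts the array in place; small arrays never start a second thread.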
orbiter
6eaa5a0e64 enhanced local search speed. The ranking process is now 6 times faster than before.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-07 22:38:09 +00:00
fuchsi
425e4ead66 Allow absolute paths in configuration settings.
- previously, absolute paths were expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly all dynamically generated data with a configurable path in a location outside of yacy's root dir without having to use symlinks (probably good for third party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the application's root path.

- exceptions (hardcoded): 
  DATA/LOG/yacy.logging
  DATA/SETTINGS/httpProxy.conf
  DATA/SETTINGS/user.db
TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example.

- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other cases of broken config reading caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-04 10:36:25 +00:00
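The path handling described in commit 425e4ead66 could look roughly like this. The helper below is a hypothetical stand-in with an assumed signature (explicit Properties and root directory arguments); it is not the actual abstractServerSwitch.getConfigPath code.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical stand-in; signature and class name are assumptions.
class ConfigPaths {
    static File getConfigPath(Properties config, File rootPath, String key, String defaultValue) {
        File f = new File(config.getProperty(key, defaultValue));
        // Absolute paths are used as given; relative ones are resolved against the root.
        return f.isAbsolute() ? f : new File(rootPath, f.getPath());
    }
}
```

A setting such as fooPath=/a/b/c then stays /a/b/c, while a relative value is still resolved below the application root.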
borg-0300
e8d32d9f62 other loglevel
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4195 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-02 16:06:54 +00:00
borg-0300
a5d28785b1 less OOM (works for me)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4194 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-02 14:55:46 +00:00
orbiter
ccbfb15b6b enhancement to crawl stacker enqueue order
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-01 00:57:32 +00:00
hermens
5c5344ae97 Beautify log
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-31 16:29:07 +00:00
hermens
35cf196204 transferRanking(): Do not flush more ranking files than requested by caller.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4189 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-31 15:55:52 +00:00
hermens
d0aa8cf25d Only update handshaked peer's last seed date if it has not been updated recently.
Until now the newer data was overwritten by old data from before the handshake.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4188 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-31 15:47:48 +00:00
hermens
8f9d65da67 Small corrections to dhtFlushControl()
- Test wCacheMaxChunk against maxURLinCache(), not getMaxWordCount(). This triggered a flush every time dhtFlushControl() was called.
- If triggered, flush at least 1 entry.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4187 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-31 14:21:58 +00:00
orbiter
55c87b3b12 changed behavior of crawl stacker
- final flush only when tabletype = RAM
- prestacker (dns prefetch) only if tabletype = RAM and busytime <= 100
- maximum number of entries in stacker is configurable in yacy.init (stacker.slots)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4186 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-31 11:32:40 +00:00
hermens
18144043e6 Correct UTC Offset at beginning/end of daylight savings time
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4185 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-30 19:20:02 +00:00
orbiter
4fefa53135 removed parser object pool, see also svn 4106
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4184 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-29 12:14:18 +00:00
orbiter
a31b9097a4 preparations for mass remote crawls:
two main changes must be implemented to enable mass remote crawls:
- shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote
  crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused
  as crawl agent for unwanted file retrieval
- implement new index files that control double-check of remotely crawled urls

After removal of robots.txt checking from stacker threads, the multi-threading of this process is void.
Multithreading has been removed. The thread pools for the crawl threads have also been removed, since
creation of these threads is not resource-consuming; for a detailed explanation see svn 4106

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-29 01:43:20 +00:00
fuchsi
a718858e8b seed.CCOUNT is interpreted as a double value not int
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-24 23:25:48 +00:00
fuchsi
0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
- put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation.
- putASIS(...) have been removed, now done with simple put(...) (see above).
- putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()).
- putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ".
In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value.
A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values.

* added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456
* removed duplicate code (mostly related to the big changes above).

TODO:
- make sure number formats work as expected _everywhere_, report anything overlooked at http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
- probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting.
- further improve the speed of page creation for the WatchCrawler.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-24 21:38:19 +00:00
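The split between put(...), putNum(...) and putHTML(...) from commit 0e1738899f can be illustrated with a small sketch. The class below is hypothetical: it only mirrors the three method names from the commit message, and the storage, the locale handling and the escaping table are assumptions rather than the serverObjects implementation.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; only the method names come from the commit message.
class TemplateValues {
    private final Map<String, String> values = new HashMap<>();
    private final Locale locale;

    TemplateValues(Locale locale) {
        this.locale = locale;
    }

    // Stored as is.
    void put(String key, String value) {
        values.put(key, value);
    }

    // Numbers are converted to a String, but not formatted.
    void put(String key, long value) {
        values.put(key, Long.toString(value));
    }

    // Numbers are formatted for the configured locale (grouping separators etc.).
    void putNum(String key, long value) {
        values.put(key, NumberFormat.getIntegerInstance(locale).format(value));
    }

    // Special characters are replaced by HTML entities.
    void putHTML(String key, String value) {
        String escaped = value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
        values.put(key, escaped);
    }

    String get(String key) {
        return values.get(key);
    }
}
```

In this sketch only putHTML(...) pays the cost of escaping, which matches the stated reason for moving it out of put(...).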
orbiter
f8318436a1 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4177 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-22 16:32:39 +00:00
orbiter
7d57b80598 distinct keepOrder strategy, more discrete implementation of enhancement introduced in SVN 4158
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-22 15:26:47 +00:00
orbiter
9a7b093eed tried to avoid endless loop, see also:
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=467&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4175 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-22 14:35:45 +00:00
orbiter
b856e377a9 some additions and a small bugfix to SVN 4158
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-21 23:26:22 +00:00
hermens
501a7aae90 Small correction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-20 12:02:31 +00:00
hermens
caff520988 Removed unnecessary and unused code.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4171 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-20 11:56:15 +00:00
hermens
d732840f8a Avoid ConcurrentModificationException when accessing the PerformanceQueues page while yacy is indexing.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4170 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 23:36:40 +00:00
fuchsi
35303f9504 add real size values (KBytes) of the DHT-In/Out-RAM-Caches to the PerformanceQueues page. A lot of users seem to tweak this value and it might help in finding the best size in relation to the peer's memory resources.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4169 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 21:47:07 +00:00
fuchsi
38bbd4a4b3 no code changes. just touched yacyClient.java to trigger a rebuild of the file in an uncleaned tree.
NOTE: run "ant clean" before building SVN 4166/4167 in a tree that includes class files from a previous build to make sure, that every class file is rebuilt!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4167 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 15:31:38 +00:00
fuchsi
f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers.
- some minor code cleanups (mostly unnecessary casts, null checks)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 04:13:46 +00:00
fuchsi
ca83f5a8d9 Add external lib FontBox which is part of the PDFBox (they extracted the font handling code into this package in 0.7.3).
Add the packages to the eclipse .classpath.
Closes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=453

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-18 19:53:52 +00:00
fuchsi
3352474dd8 Remove grouping separator in Network.xml (yacystats will work without it) and format a few more numbers.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-16 13:29:11 +00:00
fuchsi
06e6a1ff62 Add a generalized Formatter class yFormatter inspired by http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
At the current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html|xml) and Status.html have been changed to use it for now (TODO: should be integrated into other servlets as well to reduce duplicate formatting code).
NOTE: For now the output format for Network.xml simulates the old behaviour which is wrong (it uses '.' as decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-16 02:12:31 +00:00
fuchsi
e77aec8c9d fix handling of encrypted PDF-Documents (with default user password "")
- update PDFBox package to current version 0.7.3
- use new security model in PDFBox to "guess" whether we can decrypt a document or not
NOTE: When upgrading to this version make sure the old PDFBox-0.7.2.jar is removed from libx/

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-15 13:18:38 +00:00
fuchsi
b5f7df8d0a Speed up remove operations in rowCollections.
- Array element shifting during remove is only done when it is necessary to keep the order of a row collection.
- This will speed up the most expensive operation "common word shrinking" by a factor of 500-1000 (in the worst cases we shifted > 60 GB of data during this operation)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-11 17:17:08 +00:00
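A minimal sketch of the remove optimisation from commit b5f7df8d0a, assuming a simple int-array-backed collection. The class and the keepOrder parameter are hypothetical and not the kelondroRowCollection API: when order does not need to be kept, the last element is moved into the gap instead of shifting the whole tail.

```java
import java.util.Arrays;

// Hypothetical illustration; not the kelondroRowCollection implementation.
class IntBag {
    private final int[] data;
    private int size;

    IntBag(int[] initial) {
        data = initial.clone();
        size = initial.length;
    }

    void remove(int index, boolean keepOrder) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
        if (keepOrder) {
            // Shift the tail left by one position: O(n) per removal.
            System.arraycopy(data, index + 1, data, index, size - index - 1);
        } else {
            // Move the last element into the gap: O(1), but the order changes.
            data[index] = data[size - 1];
        }
        size--;
    }

    int[] toArray() {
        return Arrays.copyOf(data, size);
    }
}
```

Removing many entries from a large collection with keepOrder set to false avoids repeated shifting of the tail, which is the cost the commit message refers to.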
low012
fdb0b861f8 *) fixed wrong calculation of network words, network links, network PPM if peer is senior or principal peer
*) added network QPH
*) banner is cached for 1 second to avoid DOS
*) still no logo


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-09 21:47:37 +00:00
fuchsi
508de558f7 sbStackCrawlThread is null during first cleanProfiles() run at startup.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4152 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-08 15:56:40 +00:00
fuchsi
70614385ef Attempt to fix the "lost profile handle" bug.
It seems improbable, but it might happen, that during a crawl all queues (indexing, crawling, ...) except the crawl URL stacker ran empty. This commit adds an additional check for an empty crawl stacker queue before executing the profile cleaner.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-08 15:11:26 +00:00
low012
507ecd8afa *) added banner that can be displayed like this: http://localhost:8080/Banner.png
possible arguments: textcolor, bgcolor, bordercolor
   example: http://localhost:8080/Banner.png?textcolor=ffffff&bgcolor=121212&bordercolor=ffffff
   take care: YaCy uses CMY color model!
*) there are still some known bugs, but I can't continue coding right now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-07 21:59:36 +00:00
fuchsi
9b0948cb4c gnarf. mixed up the positions. finally fixed...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4143 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:58:01 +00:00
fuchsi
c0f5fc51ef bugfix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4142 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:47:48 +00:00
orbiter
33fb2f756d added emergency fail case in remote crawls
in extreme situations this will cause that no remote crawls are sent out any more
this is bad, but it protects the case where failing remote crawls fill up the local queue too much,
which is even worse

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4141 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:40:30 +00:00
fuchsi
c5a8585ac6 fix more encoding problems in yacysearch.rss.
- URL encoding for search terms where required
- removed "ugly" CDATA escaping
- UTF-8 encoding for the XML
- no HTML style escaping for XML/RSS element values
Note: some unicode characters might still be encoded in a wrong way.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 09:21:03 +00:00
fuchsi
6b00fe0c4e fix ArrayIndexOutOfBoundsException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 08:50:33 +00:00
orbiter
3e60ae93b9 modified remote search snippet fetch behavior: do not fetch snippets for more than 300 milliseconds, even if the snippets can be found locally without online fetch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4137 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 16:42:11 +00:00
orbiter
97f1ca52bd fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=390
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4136 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:45:12 +00:00
orbiter
143fa40d77 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=394&p=2382#p2382
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4135 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:34:16 +00:00
orbiter
711641f167 extended client connection clean-up:
there are now two time-outs, one for the complete connection time, and one for an idle time
connections that are idle for more than 2 minutes are closed, and connections that have been alive for more than one hour are also closed
if the total number of connections exceeds 64, the connections beyond 64 that have the most idle time are also closed

During normal operation of peers these forced closings should never appear,
but the existence of the idle connection check ensures the availability of the peer and the usability of the host.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:06:12 +00:00
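The clean-up rules from commit 711641f167 can be sketched as a selection function. The limits (2 minutes idle, one hour total, 64 connections) come from the commit message; the class, the Conn record and the method name are hypothetical and do not reflect the YaCy httpc code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the clean-up rules; not the YaCy httpc code.
class ConnectionCleaner {
    static final long MAX_IDLE_MS = 2 * 60 * 1000L;  // idle time-out
    static final long MAX_AGE_MS = 60 * 60 * 1000L;  // complete connection time-out
    static final int MAX_CONNECTIONS = 64;

    static class Conn {
        final long openedAt;
        final long lastActiveAt;

        Conn(long openedAt, long lastActiveAt) {
            this.openedAt = openedAt;
            this.lastActiveAt = lastActiveAt;
        }
    }

    // Returns the connections that should be closed at time 'now' (milliseconds).
    static List<Conn> selectForClosing(List<Conn> conns, long now) {
        List<Conn> close = new ArrayList<>();
        List<Conn> keep = new ArrayList<>();
        for (Conn c : conns) {
            boolean idleTooLong = now - c.lastActiveAt > MAX_IDLE_MS;
            boolean aliveTooLong = now - c.openedAt > MAX_AGE_MS;
            if (idleTooLong || aliveTooLong) {
                close.add(c);
            } else {
                keep.add(c);
            }
        }
        // Still too many: close those with the most idle time first.
        keep.sort(Comparator.comparingLong((Conn c) -> c.lastActiveAt));
        int excess = keep.size() - MAX_CONNECTIONS;
        for (int i = 0; i < excess; i++) {
            close.add(keep.get(i));
        }
        return close;
    }
}
```

A periodic job would call selectForClosing(...) and close whatever it returns; in normal operation the result is expected to be empty.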
orbiter
b19bb6e5b1 - reverted svn 4132; this did not solve the problem and removed the emergency method which caused production failure for sure within some hours
- removed and added some debugging lines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 14:34:05 +00:00
fuchsi
1eba408d2f Make sure that sockets which couldn't be opened aren't handled as active connections, in which case they wouldn't be closed.
Please test this and report any problems (connections that stay open for a very long time according to http://<your_yacy_peer>/Connections_p.html) to http://forum.yacy-websuche.de/viewtopic.php?f=5&t=386

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 12:18:26 +00:00
fuchsi
03c5b4ad68 more fixes to the yacysearch.rss, it's now 100% valid according to http://feedvalidator.org
- RFC-822 date time had to include the time instead of date only
- <opensearch:link> doesn't exist -> <atom:link>, see http://www.opensearch.org/Specifications/OpenSearch/1.1
- <link> elements are mandatory for <channel> and <item>

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 04:00:52 +00:00
orbiter
d69d386f7d added additional forced client connection closing
if a specific number of simultaneous connections is reached
the limit is currently set to 64 connections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:21:53 +00:00
orbiter
dea7bee049 - increased minimum time before an active connection is interrupted from 1 minute to 10 minutes
- added sorting by connection time in client connection table of connectionTimeComparatorInstance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 23:56:04 +00:00
orbiter
c1440d2241 fixed problem with redirection: redirected URLs had not been tested with the double-check
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=348

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 22:40:53 +00:00
fuchsi
7404f2c35c Fix some of the issues with the RSS search interface, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=392
Note: the new DateFormatter822 in the plasmaSwitchboard is just a copy of the DateFormatter that always uses the US locale to allow formatting of a locale independent date String.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4124 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:28:29 +00:00
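The reason for a fixed US locale in commit 7404f2c35c can be shown with a short sketch: RFC-822 dates use English day and month names, so the formatter must not follow the peer's language setting. The class name and the choice of GMT as time zone are assumptions; this is not the DateFormatter822 code itself.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not the DateFormatter822 from plasmaSwitchboard.
class Rfc822Dates {
    static String format(Date date) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat f = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("GMT"));
        return f.format(date);
    }
}
```

The pattern includes the time of day and the zone offset, which a feed validator expects in a pubDate element.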
orbiter
98abe0804d another enhancement to crawl starts with link files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4123 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 20:30:42 +00:00
orbiter
1b42152a76 fixed and enhanced some details in crawl start with file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 00:49:38 +00:00
orbiter
4465db7399 removed debug information from network graphic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4118 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:32:10 +00:00
orbiter
01e0669264 re-designed some parts of DHT position calculation (effect is the same as before)
and replaced old first hash computation by new method that tries to find a gap in the current dht
to do this, it is necessary that the network bootstrapping is done before the own hash is computed
this made further redesigns in peer initialization order necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:30:23 +00:00
hermens
d547c3b4bd Avoid NullPointerException in yacySeedDB.lookupByIP
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4116 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-29 11:18:09 +00:00
orbiter
5b1a937ed8 fix for crawl stack database format change, introduced in SVN 4113
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4115 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 08:17:08 +00:00
orbiter
af25c98306 enhanced local search performance in case of a remote search:
there is no waiting until the local search terminates to show the result page.
the local search results appear like all other results from remote peers, using a separate thread.
This has an especially strong effect if the local index for a specific word is large.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:36:22 +00:00
orbiter
842308ea97 - redesigned crawl start menu, integrated monitoring pages
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database
- extended crawl profile edit servlet, shows now also terminated crawls
- option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues!
- fixed here and there problems with indexing queues
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown

attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:21:31 +00:00
orbiter
341f7cb327 steps to enhance remote search performance:
- added a file size limitation, that disallows parsing of large documents during (offline-) remote search
- added profiling information to search result computation, visible at search access tracker. this info shows used time for URL fetch and snippet computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4112 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-26 10:11:50 +00:00
orbiter
2f1ff048ba some fixes to socket connection time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 23:45:05 +00:00
orbiter
3c74014004 automatic deletion of dead client connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 22:46:11 +00:00
orbiter
11b4f80bde - fixed non-closing client connections
- added client connection tracker in connections servlet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 21:36:08 +00:00
orbiter
d352853f2d fix for non-closing client sessions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4107 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-24 08:42:07 +00:00
orbiter
1488769e1f cleanup of unmaintained and outdated performance methods:
removed object pools in httpc. Object pooling is not recommended,
if the creation of the object is not time-intensive. Object pools are only useful,
if there is much computation necessary to create some basic data that is stored
in the object pool and can be re-used. This does not apply to object pools in YaCy.
Object pooling of client sessions would make sense if it allowed re-use of
living connections to other yacy clients. But every connection is closed after usage
of an object in the client pool, so the YaCy server client objects do not
hold hardware/network-allocated entities.
See:
http://www.javaperformancetuning.com/news/qotm033.shtml
http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling
http://docs.sun.com/source/816-7159-10/pt_chap5.html
http://www.microjava.com/articles/techtalk/recylcle2


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 20:49:52 +00:00
orbiter
3cb9cdc9be try to fix connection problem, possible cause for wrong junior status and non-passive passive peers:
the YaCy client treats disconnections during data transmissions as error and discards all data transmitted so far
this did not happen so far until I removed a delay time at the end of the daemon session which prevented this case.
To fix this problem, disconnections during transmissions are not treated as error now, which means that end-of-transmissions
with sudden disconnections are not a cause for peer disconnections any more. To be nice to non-updated peers, the sleep time
at the end of server sessions is also re-enabled.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 17:31:29 +00:00
borg-0300
ba59de773f again and again junior - test
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4097 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-13 17:05:53 +00:00
orbiter
4275727d69 fix for peer ping problem (implemented a 3-time re-ping); cause for 'Connection reset' still unknown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-12 00:42:53 +00:00
orbiter
07d1e98909 fixed round-robin method of peer-ping order (the successfully pinged peer was not updated to current last-seed date)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4093 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 16:07:35 +00:00
orbiter
76e4c2d69e fix for peer-ping in case that remote peer does not respond with valid values
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 15:27:01 +00:00
fuchsi
e192f99134 fix small bug introduced in r4089 that appeared when we tried to remove "gzip" encoding from Accept-Encoding header
closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=336

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 21:46:40 +00:00
fuchsi
ae4b9308ef Fix problems with some web servers which couldn't handle the way yacy was sending requests. Thx to celle for the patch.
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=320

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 09:15:28 +00:00
fuchsi
6601e37512 clear caches after changing blacklists, closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=241&p=1964#p1964
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 08:15:25 +00:00
fuchsi
5b0c1449e1 various fixes and cleanups for blacklist handling:
1. avoid adding duplicate file name entries in config properties for lists, 
2. correctly merge all path masks from all list files for the same host masks,
3. rewrite helper methods using standard java methods for Collection transformations,
4. merged various methods with identical functionality for different Collection implementations into one,
5. minor refactoring to improve code readability.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 06:20:27 +00:00
orbiter
841cf71022 fix for NPE in DHT transfer selection, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=327
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4085 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 19:08:13 +00:00
orbiter
dbd1eeead5 fix for missing object miss-cache flush value:
the value is always zero because there is no miss-cache flush
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=288

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4083 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 18:35:05 +00:00
orbiter
f2a3434407 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=238&p=1341#p1341
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4082 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 17:31:29 +00:00
orbiter
f4a5c287fe re-implemented post-ranking of search results
(should enhance search result quality)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4080 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-08 11:50:19 +00:00
orbiter
8ff5e2c283 - fixed/re-implemented media search
- fixed search tips (topwords, now appearing at the bottom of the page)
- added search consequences execution (deletion of bad references some time after the search happened)
- added some formatting at network table

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-07 11:45:38 +00:00
orbiter
6c819a6fd9 added cache to favicon display
added better synchronization for simultaneous search requests

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4076 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-06 01:28:35 +00:00
borg-0300
d69013f66a added patch from Fuchs - http://forum.yacy-websuche.de/viewtopic.php?f=6&t=241
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 11:51:02 +00:00
orbiter
daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
Search profiling showed that a major amount of time was wasted computing URL hashes. The computation does an intranet check, which needs a DNS lookup. As a result, each URL hash computation took 100-200 milliseconds, which delayed remote searches by at least 1 second more than necessary. The solution is to attach the URL hash to the URL data structure, so that the hash value can be filled in after the URL is retrieved from the database. The redesign of the url/urlhash management required a major redesign of many parts of the software. Some parts that had already been given up were removed during this change to avoid unnecessary maintenance of unused code.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 09:01:35 +00:00
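The idea behind commit daf0f74361, a URL object that carries its own hash so the expensive computation runs at most once or not at all, might look roughly like the sketch below. The class `HashedUrlSketch` and its members are hypothetical, and the truncated SHA-256 digest is only a stand-in for YaCy's real URL hash computation (which, per the commit message, involves an intranet check and a DNS lookup).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the actual YaCy URL class.
final class HashedUrlSketch {
    // Counts how often the expensive computation ran (for demonstration only).
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    private final String url;
    private String hash; // null until computed or supplied

    // Fresh URL: the hash is computed lazily on first use.
    HashedUrlSketch(String url) {
        this(url, null);
    }

    // URL loaded from the database together with its stored hash:
    // the expensive computation is skipped entirely.
    HashedUrlSketch(String url, String storedHash) {
        this.url = url;
        this.hash = storedHash;
    }

    synchronized String hash() {
        if (hash == null) hash = computeHash(url);
        return hash;
    }

    // Stand-in for the expensive hash computation (assumption: SHA-256, first 6 bytes as hex).
    private static String computeHash(String url) {
        COMPUTATIONS.incrementAndGet();
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 6; i++) sb.append(String.format("%02x", digest[i] & 0xff));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every Java platform
        }
    }

    public static void main(String[] args) {
        HashedUrlSketch fresh = new HashedUrlSketch("http://example.org/");
        fresh.hash();
        fresh.hash(); // second call reuses the attached value
        HashedUrlSketch loaded = new HashedUrlSketch("http://example.org/", "storedHash");
        loaded.hash(); // no computation at all
        System.out.println("computations: " + COMPUTATIONS.get());
    }
}
```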
orbiter
e90afa9483 fixed search access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-04 09:04:47 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are no longer fetched by the browser using ajax, they are now fetched internally
- YaCy-internal threads check for the existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servlet (which had also been a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
6d759ad0a7 - new bot address
- removed unused skins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4065 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-29 11:46:42 +00:00
orbiter
f9e6cf6a3d more refactoring of search:
integrated first version of ssi-using search interface,
but the function is currently disabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-28 12:15:46 +00:00
orbiter
f81ef40cc4 no dht activity for small networks; this is not needed if the network is small
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 22:35:26 +00:00
orbiter
d9472b6a3a * fixed problem with watch crawler
* added new column to network table (remote crawl urls):
  the new value for provided URLs will be used for new remote crawl method


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4061 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 22:06:58 +00:00
orbiter
e332b844b2 - enhanced remote search: during waiting time for remote crawls
some urls are fetched so the url cache can be filled with these urls
- the url-prefetch is used to sort out some unresolved urls
- the snippet-fetcher is triggered with the search event id. This is used
  to remove missing snippets from the search cache so they will not be displayed again


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 18:18:35 +00:00
orbiter
a34d9b8609 * added a search history cache that maintains search results for 10 minutes;
it is necessary for the new search process that will do automatic re-searches.
A positive effect is that when a re-search is done, it can be monitored how many
results were contributed by other peers. The message for this contribution
was moved from the end of the result page to the top.
* enhanced re-search time when a global search was done and the local index
already has a great number of results for this word
* re-organised presearch computation; must be further enhanced

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 23:12:59 +00:00
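A cache that keeps search results for 10 minutes, as described in commit a34d9b8609, could be sketched along these lines. The class `SearchHistoryCacheSketch`, its methods, and the injectable clock are assumptions made for illustration; the commit does not show how YaCy keys or evicts its entries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of a time-limited search result cache.
final class SearchHistoryCacheSketch<V> {

    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) {
            this.value = value;
            this.storedAt = storedAt;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxAgeMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without waiting

    SearchHistoryCacheSketch(long maxAgeMillis, LongSupplier clock) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    void put(String query, V result) {
        entries.put(query, new Entry<>(result, clock.getAsLong()));
    }

    // Returns the cached result, or null if it is missing or older than maxAgeMillis.
    V get(String query) {
        Entry<V> e = entries.get(query);
        if (e == null) return null;
        if (clock.getAsLong() - e.storedAt > maxAgeMillis) {
            entries.remove(query, e); // drop the expired entry
            return null;
        }
        return e.value;
    }

    // Removes all expired entries; could be called periodically.
    void cleanup() {
        long now = clock.getAsLong();
        entries.values().removeIf(e -> now - e.storedAt > maxAgeMillis);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SearchHistoryCacheSketch<String> cache =
                new SearchHistoryCacheSketch<>(10 * 60 * 1000L, System::currentTimeMillis);
        cache.put("yacy", "cached result list");
        System.out.println(cache.get("yacy"));
    }
}
```

A re-search within the 10-minute window would find the earlier result under the same key, which is what makes it possible to compare how many results other peers contributed in the meantime.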
orbiter
ae86d010bb more refactoring of search processes; also some small speed enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 08:41:52 +00:00
orbiter
bb426565f0 added new yacy protocol for mass url-pull for better remote crawling distribution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-22 00:59:05 +00:00
borg-0300
4f6d56330d Bugfix for truncated headings - http://forum.yacy-websuche.de/viewtopic.php?f=6&t=273
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-21 22:07:23 +00:00
low012
54004e929b *) Better Bourne-Shell (OpenSolaris) compatibility, update and restart really work now. As the Bourne-Shell is the grandfather of most modern shells, it should also work with Linux (tested with Mandriva, works) and OSX (Please test!).
*) Fixed a typo.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-20 21:52:52 +00:00
orbiter
72752bb503 because of a new database structure handling, the memory need for accessing
collection objects has been reduced to 50%:
- set new memory calculation functions for indexing process
- adjusted guessed memory amount
-> Testing needed:
   try new recommended value (see performanceQueues) and see if OOMs occur.
-> report maximum recommended value, so we can set new default values.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-20 17:36:43 +00:00
low012
694defb257 *) better compatibility with OpenSolaris 5/07, updates should work now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-17 15:26:34 +00:00
orbiter
16c203f759 fixed remote search access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4048 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-16 11:44:18 +00:00
orbiter
344911bfaa shorter minimum delay values for intranet crawl targets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 23:18:12 +00:00
orbiter
f890cc86aa inserted forwarding patch from fuchs
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=233

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 22:25:48 +00:00
orbiter
b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 21:31:31 +00:00
orbiter
947fc46904 refactoring of search process:
- re-designed remote request result processing
- re-designed local result accumulation, will be further enhanced with snippet fetcher
- removed search process handling in switchboard
- made snippet class static (there is no need for multiple snippet objects)
- removed some redundant tasks in server-side search process, should be a little bit faster now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 11:36:59 +00:00
orbiter
3ca8f71cbb refactoring of dbtest to create separated kelondro sql connector interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-11 22:40:24 +00:00
orbiter
61f93cbf14 some code-cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-11 00:42:04 +00:00
orbiter
e76e996737 fixed umlaut problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-10 14:10:57 +00:00
orbiter
4798044708 fixed compile problem with svn 4037
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4038 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-10 14:03:07 +00:00
orbiter
24e25e1141 enhanced SSI server-side support:
- SSIs may now refer to servlets, not only files
- calling a servlet, the servlet/SSI engine is called recursively
- SSIs now also work for clients that do not support chunked encoding
This will support the new search page functionality, to show search results
dynamically without using javascript. To test this method, a test page has been added
http://localhost:8080/ssitest.html
..which dynamically calls 3 servlets that produce some delays during their execution;
please verify that you can see the result step-by-step in your browser.
To implement this feature, some refactoring took place; mostly, code
was made static and will execute faster.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-09 21:58:38 +00:00
low012
c8e5a4a6b7 *) fixed bug described by Huppi in http://forum.yacy-websuche.de/viewtopic.php?t=239
*) added a preview function to message system
*) removed some old comments, I hope that's OK


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-08 18:23:45 +00:00
orbiter
5c1b444690 some redesign of min/max and normalization computation during search result ordering
this saves about 1 millisecond for each URL reference, which has some good effect
on the search result computation if a word is searched that appears very often
(speed-up of 1 second and more)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-06 12:50:11 +00:00
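Commit 5c1b444690 does not spell out the new min/max and normalization computation. One common approach, shown here purely as an assumed illustration, is to find the minimum and maximum in a single pass and then scale every value with integer arithmetic. The class `NormalizationSketch`, the method `normalize`, and the 0..255 target range are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch; not the actual YaCy ranking code.
final class NormalizationSketch {

    // Scales raw ranking attributes into the range 0..255.
    static int[] normalize(int[] raw) {
        if (raw.length == 0) return new int[0];

        // One pass over all references to find both bounds.
        int min = raw[0];
        int max = raw[0];
        for (int v : raw) {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        // long arithmetic avoids overflow for extreme int values.
        long range = (long) max - min;
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            // If all values are equal there is nothing to scale; use 0.
            out[i] = range == 0 ? 0 : (int) ((((long) raw[i] - min) * 255) / range);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(normalize(new int[] {10, 20, 30})));
    }
}
```

Computing the bounds once per result set, instead of once per reference, is the kind of change that saves a small constant per URL reference and adds up for very frequent words.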
orbiter
9678d1b282 fixed new EcoRecords-Nodes. Here I had previously omitted object content copying
to avoid massive System.arraycopy. That obviously did not protect the Node objects enough


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-06 10:10:33 +00:00
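The trade-off behind commit 9678d1b282 (skipping the copy saves `System.arraycopy` work but leaves the node exposed to later changes of the shared buffer) can be illustrated with a minimal sketch. The class `NodeCopySketch` is hypothetical and much simpler than the real EcoRecords node.

```java
// Hypothetical sketch of a node that protects its content with a defensive copy.
final class NodeCopySketch {
    private final byte[] content;

    // Copies the relevant slice instead of keeping a reference to the caller's buffer.
    NodeCopySketch(byte[] source, int offset, int length) {
        this.content = new byte[length];
        System.arraycopy(source, offset, this.content, 0, length);
    }

    // Returns a copy so callers cannot modify the node's internal state.
    byte[] content() {
        return content.clone();
    }

    public static void main(String[] args) {
        byte[] buffer = {1, 2, 3, 4};
        NodeCopySketch node = new NodeCopySketch(buffer, 1, 2);
        buffer[1] = 99; // reusing the buffer no longer affects the node
        System.out.println(node.content()[0]);
    }
}
```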