Commit Graph

2884 Commits

Author SHA1 Message Date
orbiter
2c34038912 addition/correction to last commit: usage of concurrent-classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 21:17:12 +00:00
orbiter
b2150057d2 removed unnecessary cleanup method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4625 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 20:32:08 +00:00
lulabad
c4c0d54b22 * added regex extended blacklistengine
* removed my own engines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 08:50:09 +00:00
orbiter
368593e449 enhanced the concurrency handling of indexing process (better queue size control, better data concept, better shutdown behavior)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 00:03:44 +00:00
orbiter
be58135b3e possible fix for deadlock in search execution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4612 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-29 07:50:37 +00:00
orbiter
0241d070bc added concurrency to indexing process:
- the methods {parsing, semantic analysis (condensing), structure analysis (web structure)} in the serialized indexing path have been made concurrent.
- four BlockingQueues handle concurrency and hand-over of the indexing objects; the last object in the queue is stored into a BlockingQueue of maximum size 1 to serialize the process for storage (which uses IO and therefore should not be parallelized here)
- a concurrency of (CPUs + 1) is the default. Single-CPU users will profit from the change because large files cannot block the indexing process any more.
- removed the secondary indexing thread, which is superfluous now. Concurrency is default for all users.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-28 11:56:28 +00:00
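The queue hand-over described in commit 0241d070bc can be pictured with a much reduced sketch. It is not the YaCy code: the class name, the two-stage layout (the commit mentions four queues) and the lower-casing stand-in for parsing are assumptions made for illustration. What it does show is the pattern named in the message: several workers feed a BlockingQueue of capacity 1, so the storage step stays serialized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not taken from the YaCy sources.
class IndexingPipelineSketch {
    // Unique sentinel object; compared by reference to signal "no more work".
    private static final String POISON = new String("<end>");

    // The commit names (CPUs + 1) as the default concurrency.
    static int defaultWorkers() {
        return Runtime.getRuntime().availableProcessors() + 1;
    }

    static List<String> run(List<String> docs, int workers) {
        final BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>();
        // Capacity 1: parsed results are handed to storage one at a time.
        final BlockingQueue<String> storeQueue = new ArrayBlockingQueue<>(1);
        final List<String> stored = new ArrayList<>();

        List<Thread> pool = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String doc = parseQueue.take();   // blocks until work arrives
                        if (doc == POISON) break;
                        storeQueue.put(doc.toLowerCase()); // stand-in for parsing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            pool.add(t);
        }

        // Single storage thread: the only writer to 'stored'.
        Thread storer = new Thread(() -> {
            try {
                while (true) {
                    String item = storeQueue.take();
                    if (item == POISON) break;
                    stored.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        storer.start();

        try {
            for (String d : docs) parseQueue.put(d);
            for (int i = 0; i < workers; i++) parseQueue.put(POISON); // one per worker
            for (Thread t : pool) t.join();
            storeQueue.put(POISON); // all workers are done, stop the storer
            storer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return stored;
    }
}
```

Calling `IndexingPipelineSketch.run(docs, IndexingPipelineSketch.defaultWorkers())` returns the processed documents in storage order, which may differ from input order because the workers run concurrently.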
lulabad
9fb5d661f2 added my Blacklistengines
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-28 08:18:21 +00:00
orbiter
bca87f1e38 - refactoring of serverThreads: renaming to distinguish busy-threads and blocking-threads
- added blockingThreads which are threads that are not driven by pause times but by BlockingQueue lookup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-27 12:03:16 +00:00
orbiter
968c775025 - preparation of parsing/indexing queue for concurrent execution
- remote crawl receipts are now transmitted concurrently in separate threads (makes remote crawls much faster!)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 22:43:38 +00:00
orbiter
9b0e20fb06 next refactoring step in document indexing to prepare concurrency environment for document parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 19:51:05 +00:00
orbiter
7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
- refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling
- removed unused code parts from condenser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 15:37:49 +00:00
orbiter
d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
this is another step to enable multiple, concurrent fulltext-indexes
- another try to make the yacy-httpc more stable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 14:13:05 +00:00
orbiter
8d6a13bc07 refactoring of parsing-condensing-indexing process:
- separated parts
- removed storagePeer function
next step will be parallelization of processes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-24 22:51:26 +00:00
orbiter
d3b06913ec protection against seed-db failure during enumeration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 23:47:41 +00:00
orbiter
5aa96dbc36 fix for shutdown configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 13:14:57 +00:00
orbiter
93633abed8 - removed some debugging code from search process - should speed up now
- added some profiling code to search event - more time details in PerformanceSearch_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-23 00:55:04 +00:00
orbiter
fba46c51d7 fixed non-termination bug in qsort
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 23:15:28 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
fc94fbe224 another improvement to the collection sorting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 23:11:04 +00:00
orbiter
11270d450e better quicksort-pivot computation: 30% faster (measured with test program)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 22:01:12 +00:00
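Commit 11270d450e does not say which pivot computation was introduced. As a generic illustration only, the sketch below uses median-of-three, a common way to choose a better quicksort pivot; the class and method names are hypothetical and nothing here is taken from the YaCy sources.

```java
// Hypothetical sketch: median-of-three pivot selection, assumed for illustration.
class PivotSketch {
    // Returns the index (lo, mid or hi) that holds the median of the three sampled values.
    static int medianOfThree(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        int x = a[lo], y = a[mid], z = a[hi];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return lo;
        return hi;
    }

    // In-place quicksort over a[lo..hi] (inclusive) with a Hoare-style partition.
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[medianOfThree(a, lo, hi)];
        int i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
                j--;
            }
        }
        quicksort(a, lo, j);
        quicksort(a, i, hi);
    }
}
```

Sampling three elements makes the worst case on already sorted input much less likely than always taking the first or last element as pivot.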
orbiter
3e44293f07 - fixed a problem with thread pools in row collection
- added a line-viewing feature in threaddump

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 14:21:58 +00:00
danielr
e43051b125 - fixed Threaddump output (HTML-escaped, e.g. <init>)
- in EcoFS converted comments to javadoc


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 10:20:55 +00:00
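The escaping issue behind commit e43051b125 is easy to reproduce: a stack frame such as `Foo.<init>` contains angle brackets, which a browser treats as a tag unless they are escaped. A minimal sketch, assuming a hand-written helper (the class and method names are hypothetical, not the ones used in YaCy):

```java
// Hypothetical sketch of HTML-escaping text before writing it into a page.
class HtmlEscapeSketch {
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this helper, `escape("Foo.<init>(Foo.java:12)")` yields `Foo.&lt;init&gt;(Foo.java:12)`, which renders as the original text.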
orbiter
433ff855f7 - fixed another concurrency problem in collection sorting
- fixed a typing problem that was introduced in svn 4579 and caused the crawl monitor to fail

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 23:47:24 +00:00
orbiter
19286fa2d1 tried to fix seed2.old.db-problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 22:35:19 +00:00
orbiter
f3996e63b8 tried to fix more deadlocks:
- changed connection modes in ftpc
- replaced the sort thread pool in row collections with a new one using util.concurrent. The old pool had caused blocking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 11:23:43 +00:00
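Commit f3996e63b8 mentions replacing a hand-made sort thread pool with one built on util.concurrent. The sketch below shows the general idea with `java.util.concurrent.ExecutorService`: sort two halves in pool threads, then merge. It is an assumption-laden illustration (hypothetical class name, `int[]` instead of row collections, lambda syntax that Java did not have in 2008), not the YaCy implementation.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: parallel sort of two halves using an ExecutorService.
class ParallelSortSketch {
    static int[] sort(int[] data) {
        int mid = data.length / 2;
        final int[] left = Arrays.copyOfRange(data, 0, mid);
        final int[] right = Arrays.copyOfRange(data, mid, data.length);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> l = pool.submit(() -> Arrays.sort(left));
            Future<?> r = pool.submit(() -> Arrays.sort(right));
            l.get(); // wait for both halves; get() also rethrows worker failures
            r.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // always release the pool threads
        }

        // Merge the two sorted halves into one sorted array.
        int[] out = new int[data.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = left[i] <= right[j] ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }
}
```

Letting the library manage worker threads and waiting on `Future.get()` avoids hand-written wait/notify logic, which is a typical source of the kind of blocking the commit message describes.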
danielr
7008a218b3 avoid ConcurrentModificationException in plasmaCrawlerQueues
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-17 13:51:56 +00:00
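Commit 7008a218b3 does not describe how the ConcurrentModificationException was avoided. One common cause is removing elements from a collection while iterating over it; the sketch below shows the usual remedy of removing through the iterator. The class name, method name and the "done:" prefix are hypothetical.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: remove entries during iteration without triggering
// a ConcurrentModificationException.
class SafeRemovalSketch {
    static void removeFinished(List<String> workers) {
        for (Iterator<String> it = workers.iterator(); it.hasNext();) {
            String w = it.next();
            if (w.startsWith("done:")) {
                it.remove(); // safe; workers.remove(w) here could throw
            }
        }
    }
}
```

This only covers modification from the iterating thread itself. If several threads touch the same collection, synchronization or a concurrent collection is needed as well.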
orbiter
7150b463ff changed handling of default values and database paths:
- the default files yacy.init and the network definition are now moved to the path defaults
- the httpProxy.conf is renamed to yacy.conf
- the DATA/INDEX/PUBLIC is renamed to the actual network nickname, which should be freeworld or sciencenet
more menu entries
- added apfelmaennchens alternative search page to the menu
- added the new thread dump page to the server log menu point as submenu
modifications
- modified the thread dump page: sorting by thread type

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4575 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-16 22:31:54 +00:00
lulabad
25f5035f23 typo
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-16 15:13:12 +00:00
orbiter
7fd094fcbe small bug in ftpc: did not compile in Java 1.5
Please set compiler to Java 1.5-compliance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-16 13:41:49 +00:00
danielr
f51bad8ae5 FTP:
- report connection status (to break if no connection possible)
- fixed isFolder()
- additional error output
- fixed paths with encoded symbols (e.g. a%20file.txt)
- refactoring


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 21:57:55 +00:00
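The "paths with encoded symbols" item in commit f51bad8ae5 refers to percent-encoded URL paths such as `a%20file.txt`, which must be decoded before the name is sent to an FTP server. A minimal sketch using the standard `java.net.URLDecoder`; the class and method names are hypothetical and the actual fix in ftpc may differ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical sketch: turn a percent-encoded URL path into a plain file path.
class FtpPathSketch {
    static String decodePath(String rawPath) {
        try {
            // URLDecoder follows form-encoding rules and would turn '+' into a
            // space, so literal '+' characters are protected before decoding.
            return URLDecoder.decode(rawPath.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `decodePath("/pub/a%20file.txt")` gives `/pub/a file.txt`.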
danielr
820641938e ftpc: fixed date parsing, some refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:56:47 +00:00
orbiter
4c584dff87 disabled soLinger to prevent too many connections from staying open (it's a TEST!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:46:55 +00:00
orbiter
9c989fe5f7 fixed deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 00:49:16 +00:00
danielr
c565906050 FTP:
- added maxFileSize-check
- added timeout for download
- fixed dirlist (when all filenames have spaces, change to absolute links)
- enhanced isFolder()
- make sure the data connection is closed, so a new one can be opened
- refactoring


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 16:28:27 +00:00
danielr
1a7870df0d FTP: source cleanup (added finals, indentation for easier diffs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 12:35:53 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
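The arithmetic in commit fa1090113d is 1500 - 52 = 1448 bytes per transfer. A sketch of a copy loop that never writes more than that in one call follows; the class, constant and method names are hypothetical, and the sketch only illustrates the size limit, not the YaCy transfer methods themselves.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch: copy a stream in chunks no larger than MTU minus overhead.
class ChunkedCopySketch {
    static final int MTU = 1500;            // value given in the commit message
    static final int HEADER_OVERHEAD = 52;  // value given in the commit message
    static final int MAX_CHUNK = MTU - HEADER_OVERHEAD; // 1448

    // Copies 'in' to 'out' and returns the number of write calls made.
    static int copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[MAX_CHUNK];
        int writes = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // at most MAX_CHUNK bytes per write
                writes++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writes;
    }
}
```

Note that the operating system's TCP stack ultimately decides how bytes are split into packets, so a buffer limit like this is a heuristic rather than a guarantee.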
orbiter
d87d295c68 one more try to fix the connection problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 13:13:11 +00:00
orbiter
a3dadcd89b prevent peers that return a false search result from being disconnected
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:56:18 +00:00
orbiter
ba622bb240 addendum to svn 4553
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4554 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:24:20 +00:00
orbiter
5530b8e1ca reverted changes to yacy protocol classes: they caused the sciencenet to lose connections
a comparison with the main release 0.57 was made, which showed a stable network
This is an emergency operation to ensure availability of the sciencenet network.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 00:05:18 +00:00
orbiter
b664a53553 fix for NPE during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-11 15:30:26 +00:00
orbiter
b4ed937f1e - modified zone navigation (still does not work correctly)
- added dht switch in network definition
- 0.574

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4550 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-11 11:09:38 +00:00
orbiter
8d0470a5c6 new method to compute search history IDs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-10 23:40:56 +00:00
orbiter
65785da8f2 new method for best hash computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4548 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-10 23:28:05 +00:00
orbiter
9eddc1506b - one try to fix the httpd problem
- fix for handling of collection index that appears when removing elements
- added another navigation method (stub, not working yet)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:58:22 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
danielr
6788f8f7c1 fixed error 'FTPC cannot change directory'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-06 11:59:23 +00:00
orbiter
7ce76c8ff8 added missing file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 22:57:53 +00:00
orbiter
bfed9c2da6 - some refactoring in search process
- separated sidebars in new search interface and placed them in their own files
  which can be put into the search page like plug-ins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 21:46:55 +00:00
borg-0300
3445b1e10b * better logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 13:41:54 +00:00