Commit Graph

1646 Commits

Author SHA1 Message Date
low012
d8f4b17e31 *) Hopefully fixed bug described in http://www.yacy-forum.de/viewtopic.php?t=2825.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2611 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-17 22:57:10 +00:00
theli
0e84a969d6 *) Bugfix for serverCharBuffer read from file operation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 13:11:32 +00:00
theli
90ef19d778 *) first version of a serverCharBuffer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 12:56:03 +00:00
orbiter
d374ef2bbe bugfix for tryRemoveURLs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 00:34:34 +00:00
orbiter
f644a1c3a7 better evaluation of index abstracts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 00:07:09 +00:00
orbiter
1b48473bc5 bugfix to utf8 recognition
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 23:55:06 +00:00
orbiter
90f7241b59 serverByteBuffer.trim() can now recognize utf-8 characters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 23:52:26 +00:00
allo
2fd610b556 http://www.yacy-forum.de/viewtopic.php?p=25611#25611
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2601 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 17:48:41 +00:00
theli
e34d9b3fec *) charset aware headlines (after the serverByteBuffer.trim problem is solved)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 15:07:35 +00:00
theli
8115ac47b5 *) charset aware metadata parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 15:01:25 +00:00
theli
3ac30bdf22 *) some todo markers added for additional charset support
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2597 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 14:49:43 +00:00
theli
06fa891152 *) htmlFilterContentScraper.java: using proper charset for document title
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 14:05:28 +00:00
theli
74c3e7cf29 *) storing document charset into plasmaParserDocument object (is needed later by the condenser)
*) htmlFilterContentScraper.java: using proper charset for document title
*) serverByteBuffer.java: adding new toString which allows specifying the charset for byte encoding


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 13:18:12 +00:00
theli
c5d3020941 *) better error handling for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 12:56:01 +00:00
theli
d0a5a53789 *) changes needed for multi-language support
- parsers may need to know the charset of the byte stream 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 12:52:46 +00:00
orbiter
d82875c72b removed removal of 'funny symbols' that may have caused utf-8 problems
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 09:08:15 +00:00
orbiter
26ab1fa885 fixed null pointer exception
See http://www.yacy-forum.de/viewtopic.php?p=25598#25598

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 08:50:16 +00:00
theli
b0e8ff6eda *) some TODO markers for UTF-8 problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 05:31:30 +00:00
orbiter
41e27b85b7 fix for crawler condition
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 00:38:45 +00:00
orbiter
0ee7e45413 bugfix for merge method (caused by bad refactoring)
see http://www.yacy-forum.de/viewtopic.php?p=25529#25529

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2581 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 10:30:25 +00:00
orbiter
5c2f30eaca adjustments to dhtInCache write
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 09:28:17 +00:00
theli
9ecf7f0da2 *) some TODO markers for UTF-8 problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2578 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 05:37:46 +00:00
theli
e2f8339827 *) some bugfixes for UTF-8 related problems
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2577 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 05:16:36 +00:00
orbiter
c89d8142bb replaced old 'kCache' by a full-controlled cache
there are now two full-controlled caches for incoming indexes:
- dhtIn
- dhtOut
during indexing, all indexes that shall not be transported to remote peers
because they belong to the own peer are stored to dhtIn. It is furthermore
ensured that received indexes are not again transmitted to other peers
directly. They may, however, be transmitted later if the network grows.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2574 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 00:51:02 +00:00
orbiter
6e2907135a bugfixes for remote search server part
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 22:19:34 +00:00
orbiter
cf9884e22b first attempt to implement a secondary search
this is a set of search processes that shall enrich search results
with specialized requests to realize a combination of search results
from different peers.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 17:13:28 +00:00
theli
2a06ce5538 *) next bugfix for UTF-8
- Sending UTF-8 messages to other peers did not work
   - httpd.java: minor corrections for UTF-8

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 15:47:56 +00:00
theli
bdc51591ae *) UTF-8 Bug solved (hopefully)
See: http://www.yacy-forum.de/viewtopic.php?p=25522

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2569 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 14:48:58 +00:00
theli
ef751b9d33 *) removing all string operations from the template engine
- engine should fully operate on bytes now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 13:56:10 +00:00
orbiter
7ef80c1026 more debugging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 13:52:46 +00:00
orbiter
b251076e64 avoid ConcurrentModificationException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2563 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 10:36:18 +00:00
orbiter
75b198bc02 - updated references to indexContainer
- more bugfixes and debugging for indexAbstract processing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 11:13:27 +00:00
orbiter
0bed3b9ac3 removed superfluous interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2554 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 11:09:51 +00:00
orbiter
b7e7808ea6 wordmigration now works also for new index database
if the new database is switched on, no 'too big' messages appear,
all the WORDS files can be completely migrated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 08:23:47 +00:00
theli
a0ddf2ec11 *) AbstractCrawlWorker.java: delete already downloaded data on crawling error
*) plasmaSwitchboard.java: log unexpected errors while parsing/indexing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2552 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 04:50:12 +00:00
orbiter
4f9e42d5ed more changes towards better join-search
- fixed problems with index-abstract generation
- added analysis output for index abstract receive

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 00:42:42 +00:00
orbiter
a7281a9b4d fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 11:12:42 +00:00
orbiter
82a6054275 - fixed bug with new indexAbstract generation
- added partial evaluation of indexAbstracts during remote searches

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 10:39:25 +00:00
theli
fded1f4a5d *) better handling of maximum file size limit in crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 08:26:39 +00:00
orbiter
416b4e5c6b oops
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 08:17:55 +00:00
orbiter
309accb983 memory control for ymage generation:
the ymageMatrix initializer throws a RuntimeException if there is not
enough memory available to generate a new ymage of wanted size

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 07:01:39 +00:00
orbiter
74d1dea30b changes towards better join-search
- added generation of a compressed index within remote peers during global search
- added selection of specific urls within remote peers during secondary global search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-10 22:36:47 +00:00
orbiter
ae4e8ce03e - cut for 'probably last html-interface version': version number update
- small enhancement to ranking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-10 19:44:11 +00:00
orbiter
64bed59ee8 enhancements to ranking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2535 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-09 23:44:54 +00:00
theli
63893003be *) Adding settings page for the crawler which allows specifying a file size limit and the timeout to use.
*) adding first version of maximum filesize check for the crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2534 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-09 15:06:49 +00:00
auron_x
06b1365066 *) fixed existing protection against divbyzero and removed the new one
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 23:43:30 +00:00
orbiter
94d7ced900 fix for last ranking commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 21:14:57 +00:00
orbiter
cc97a3e9c6 fixed possible bug with indexOutOfBoundsException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 20:27:45 +00:00
orbiter
03835c2ee8 enhanced search result computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2527 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 20:26:44 +00:00
orbiter
809960ddc6 avoid division by zero
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 20:00:19 +00:00