Commit Graph

2555 Commits

Author SHA1 Message Date
allo
b78d171b1e Windows installer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2675 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 21:13:56 +00:00
theli
c665f6cddb *) handling of quotes in charset string
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-28 06:29:15 +00:00
theli
b73efd5565 *) missing changes needed because of last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2673 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-28 05:48:28 +00:00
theli
65c1f13d11 *) migration to newer odt parser lib
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-28 04:47:39 +00:00
theli
140ddba93f *) adding soap functions to pause and resume the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-27 05:22:43 +00:00
theli
ed8227d222 *) Bugfix for NullPointerException in IndexCreateIndexingQueue_p.java
See: http://www.yacy-forum.de/viewtopic.php?p=25874

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2667 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-27 04:35:02 +00:00
theli
c0f7a4124c *) Bugfix for soap templates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-27 04:24:32 +00:00
orbiter
2463e5624a 'quick' release 0.47
- documentation update
- necessary bugfixes (missing css for new peers)
- reduced effect of search result redundancy filter
- removed some debug output, but not all

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2665 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 23:41:54 +00:00
theli
3433dfb5e2 *) Bugfix for soap search template: correction for resultCount tags, cdata for snippet tag
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 16:18:04 +00:00
theli
d42dcead1d *) Bugfix renaming snippet tag in soap search template
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 16:11:38 +00:00
theli
49fbb688df *) SOAP: old urlInfo renamed to urlInfoByHash, new urlInfo Function added.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 15:14:33 +00:00
theli
8f143d516b *) make snippet fetcher accessible via soap api
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 15:07:16 +00:00
theli
97615af406 *) Restructuring of YaCy SOAP services
   - general functions moved to abstract service class
   - service class split into SearchService, CrawlService, StatusService
*) Bugfix for SOAP search services
   - Attention: some XML tags were renamed
   See: http://www.yacy-forum.de/viewtopic.php?p=25877
*) New SOAP service function urlInfo to view the parsed content of an URL
   See: http://www.yacy-forum.de/viewtopic.php?p=25869

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 14:47:44 +00:00
theli
241b881560 *) Redesign of YaCy SOAP handler
   - should be more fail-safe now
   - better handling of compressed request bodies
   - better handling of persistent connections
   - better handling of AxisFaults

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2659 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 12:24:40 +00:00
theli
009a33170b *) Content-Location header added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-26 04:32:01 +00:00
rramthun
b0cab1e731 *)Adapted surftipps to use common 0/1 parameters
*)Added translation of WatchCrawler.html
*)Changed format of German translation. Formal description will probably follow.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2657 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-24 20:15:28 +00:00
rramthun
f17a91313e *)Updated phosphor.css for XHTML
*)New grey skin
Thanks to Philipp Redeker!

*)Renamed old skins until somebody updates them.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2655 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-24 17:55:21 +00:00
theli
1aa07a52cd *) Bugfix for UnsupportedEncodingException if the media type contains multiple parameters
See: http://www.yacy-forum.de/viewtopic.php?p=25832#25826

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-24 15:50:51 +00:00
allo
4922ab8920 try to fix a nullpointer on snippet generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2653 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 22:51:44 +00:00
hermens
d8fde14c3a Make maximum number of words in DHT-In cache configurable at runtime
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2652 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 12:44:58 +00:00
theli
625c2ce6b1 *) bugfix for snippet fetching problem if content but not http header is available in cache
See: http://www.yacy-forum.de/viewtopic.php?p=25748

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 11:55:28 +00:00
theli
813a8a8179 *) migration of mimeTypeParser to jmimemagic 0.1
   - better mimetype detection for rss feeds
   - better mimetype detection for odt documents (less memory consuming)
   - two new detector classes implementing MagicDetector interface of jmimemagic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2650 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 11:40:46 +00:00
hermens
3f5a4153a0 Make Peers more receptive to transferred indexes
- Set MaxWordCount for dhtInCache to indexDistribution.dhtReceiptLimit
  so that the inCache gets flushed when the limit is passed
- Modify flushCacheSome to flush enough words to get below MaxWordCount immediately



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 10:58:58 +00:00
hydrox
740696f6c3 *) a few fixes for XHTML validation (there is still much to do)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2648 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 08:04:24 +00:00
theli
57415b6889 *) Bugfix for surftipps UTF-8 problem
See: http://www.yacy-forum.de/viewtopic.php?t=2864

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 05:40:29 +00:00
theli
706572f18d *) Bugfix for ArithmeticException caused by setting max crawling thread count to 0
See: http://www.yacy-forum.de/viewtopic.php?t=2862

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 04:29:31 +00:00
orbiter
2d3b96eeba bugfixes for surftipps
- added missing authorization check for votes
- second vote on same entry was possible after complete publishing of current vote

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2645 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-21 21:37:40 +00:00
hydrox
85f3617835 *) moved HTML from class-file to template-file (please check if it is valid HTML)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2644 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-21 20:36:46 +00:00
hydrox
9434dba8f2 *) corrected title of IndexCleaner_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2643 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-21 07:26:41 +00:00
allo
b0a4fcce8c fix from theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 18:03:24 +00:00
theli
b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
*) better logging of parser failures
*) simplified usage of plasmaparser through switchboard
*) restructuring of crawler
   - crawler now returns an error message if it is used in sync mode (e.g. by snippet fetcher)
*) snippet-fetcher: more verbose error messages
*) serverByteBuffer.java: adding new function append(String,encoding)
*) serverFileUtils.java: adding functions to copy only a given number of bytes between streams


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 12:25:07 +00:00
orbiter
aa38721cf6 new features for surftipps
- new generation with less memory
- removal of duplicates
- positive votes can generate entries without original news (so they can live on)
- link deletions on search results now also count as negative votes for surftipps (but they may rarely hit any news)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 12:01:51 +00:00
theli
64b2ef5aae *) Trying to bugfix shutdown problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2639 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 10:13:23 +00:00
orbiter
e03427871e enhanced surftipps:
- added switch to show or hide surftipps
- more news contribute to surftipps
- added voting system for surftipps

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2638 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 07:17:41 +00:00
theli
e745b63c77 *) Bugfix for different behavior of indexDistributeWhileCrawling to other checkboxes on IndexControl_p.html
See: http://www.yacy-forum.de/viewtopic.php?t=2849

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2637 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 04:44:56 +00:00
theli
1dc12d6659 *) Bugfix for shutdown problem caused by cacheScan thread
See: http://www.yacy-forum.de/viewtopic.php?p=25729

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 04:36:25 +00:00
borg-0300
42173462f5 rename cutUrlText to shortenURLString;
other little things;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 20:47:45 +00:00
borg-0300
af1d89e381 check url == null added;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2634 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 20:12:26 +00:00
theli
cc667b0aa5 *) htmlFilterContentScraper.java: adding support for link tag
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 16:13:13 +00:00
borg-0300
16ba5d1b46 topwords: only [a-z] words, quality is better;
blanks removed;
properties added;


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2632 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 10:44:45 +00:00
theli
66a58502df *) configure logging filehandler to use UTF-8 for logging messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 05:39:50 +00:00
theli
26dfbb7499 *) Bugfix for UTF-8: url names are now stored properly in stackcrawl, crawler, indexing queue and should be displayed correctly in the GUI
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2630 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 05:19:41 +00:00
theli
cf6acff2c2 *) Bugfix. htmlFilterInputStream document analysis did not work properly for documents smaller than the
default InputStream Buffer size.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 04:58:34 +00:00
borg-0300
f18304ddd3 unused/unneeded imports removed;
properties added;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 22:21:18 +00:00
orbiter
ec031eb993 first version of surftipps
see http://localhost:8080/index.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 20:14:21 +00:00
borg-0300
b174fbd0ca "import ...*" removed;
properties added;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 18:31:27 +00:00
orbiter
807756150e patch for strange bug reported by email
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2625 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 16:50:31 +00:00
theli
5c6251bced *) some improvements for extended html document charset support
   - new class htmlFilterInputStream.java, which pre-analyzes the html header to extract
     the charset meta data. This is only enabled for the crawler at the moment. Integration into
     the proxy needs more testing.
   - adding event listener interfaces to the htmlscraper to allow other classes to get informed
     about detected tags (used by the htmlFilterInputStream.java)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2624 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 15:36:04 +00:00
theli
33f0f703c0 *) reinserting type cast again
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2623 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 13:21:12 +00:00
orbiter
8c11a543dc fixed line-ending encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2622 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 13:17:31 +00:00