orbiter
17530ca7b5
fix for bug http://bugs.yacy.net/view.php?id=10
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 12:20:20 +00:00
orbiter
b1a8d0c020
enhancements to web cache and less strict caching rules
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
orbiter
a35d513bd8
fix for not-deleted .gap and .idx files
...
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
orbiter
8f11d3a5bb
redesigned the ScoreMap classes:
...
- new concurrent score map using atomic operations from the Java concurrency classes
- redesigned the distinction between StaticScore and DynamicScore into ScoreMap and ReversibleScoreMap; many classes can now use simple ScoreMap objects, which work better in concurrent environments through the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
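
A minimal sketch of the idea behind a score map built on atomic operations from java.util.concurrent. The class name ScoreCounterSketch and its methods are hypothetical illustrations, not the actual ConcurrentScoreMap from this commit:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a score map that needs no synchronized blocks.
// This is NOT the YaCy ConcurrentScoreMap, only an illustration of the technique.
final class ScoreCounterSketch<K> {

    private final ConcurrentHashMap<K, AtomicLong> scores = new ConcurrentHashMap<K, AtomicLong>();

    // Adds delta to the score of key and returns the new score.
    long add(K key, long delta) {
        AtomicLong counter = scores.get(key);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong(0L);
            // putIfAbsent returns the counter another thread installed first, or null
            counter = scores.putIfAbsent(key, fresh);
            if (counter == null) counter = fresh;
        }
        return counter.addAndGet(delta);
    }

    // Returns the current score of key, or 0 if the key was never scored.
    long get(K key) {
        AtomicLong counter = scores.get(key);
        return counter == null ? 0L : counter.get();
    }
}
```

Concurrent callers may race to create a counter for the same key, but putIfAbsent ensures that only one counter is kept, and addAndGet makes each increment atomic.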
orbiter
30aed9824a
moved getBytes() to UTF8.getBytes() to use a default String encoding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
orbiter
e1b6916423
always try to guess the size of a StringBuilder to prevent too many memory re-allocations
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:29:05 +00:00
low012
3b40b98256
*) set SVN properties
...
*) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 01:51:51 +00:00
orbiter
cb1f49d0f2
replaced all 'new String' calls that used the default encoding (encoding argument missing) or the UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache lookup for the Charset object, which hashes the String 'UTF-8'.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:36:40 +00:00
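
A minimal sketch of the pattern described in this commit, assuming a helper class with a shared Charset constant. The name Utf8Sketch and its method names are hypothetical and may differ from the helper actually used in the code base:

```java
import java.nio.charset.Charset;

// Hypothetical sketch: resolve the UTF-8 Charset once and reuse the constant,
// instead of passing the name "UTF-8" (or no encoding at all) on every call.
final class Utf8Sketch {

    // Looked up once when the class is loaded.
    static final Charset UTF_8 = Charset.forName("UTF-8");

    // Decodes bytes as UTF-8 without a per-call charset name lookup.
    static String string(byte[] bytes) {
        return new String(bytes, UTF_8);
    }

    // Encodes a String as UTF-8, independent of the platform default encoding.
    static byte[] getBytes(String s) {
        return s.getBytes(UTF_8);
    }
}
```

Passing a Charset object also avoids the checked UnsupportedEncodingException that the String-name variants declare.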
orbiter
8d14916c74
more patches for a better out-of-memory management
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 01:45:11 +00:00
orbiter
993b9bc1a8
memory/performance hacks, less synchronization, better concurrency
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-03 11:30:04 +00:00
orbiter
42d90664f3
- fixed a memory leak in the httpc.post method (no finish)
...
- patched some more code relevant to memory saving
- some more minor bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-01 09:03:33 +00:00
orbiter
b1781d7aae
some more performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7533 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-26 01:24:49 +00:00
orbiter
b2f147d28e
performance hack: moved map encoding out of the synchronization block in many cases, especially during iteration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7532 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:29:55 +00:00
orbiter
5e186e0122
continuing the fight against deadlocks during time formatting: better caching.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:11:53 +00:00
orbiter
19b2a50578
- enhanced date formatter cache
...
- added more formatter instances to different classes to make them independent of locks that may apply while the date formatter object is synchronized (date formatting is not thread-safe and must therefore be synchronized)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 12:23:00 +00:00
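
A minimal sketch of the approach described in this commit: each class holds its own formatter instance and synchronizes only on that instance, so threads in unrelated classes do not block each other. The class name DateFormatterSketch is hypothetical; the fixed Locale and UTC time zone are assumptions made to keep the output deterministic:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch: a private SimpleDateFormat per owner object.
// SimpleDateFormat is not thread-safe, so access is synchronized,
// but the lock is local to this instance rather than shared globally.
final class DateFormatterSketch {

    private final SimpleDateFormat format;

    DateFormatterSketch(String pattern) {
        this.format = new SimpleDateFormat(pattern, Locale.US);
        this.format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: UTC output
    }

    String format(Date date) {
        synchronized (this.format) {
            return this.format.format(date);
        }
    }
}
```

A class that formats dates would create its own instance, for example new DateFormatterSketch("yyyy-MM-dd HH:mm:ss"), instead of sharing one global formatter.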
orbiter
48a61c39a3
speed hacks in BLOB ArrayStack:
...
- more concurrency where possible
- fewer threads where no concurrency is necessary
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7527 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 11:58:01 +00:00
orbiter
804ae2275b
- do not delete idx and gap files if the heap is not modified
...
This change may contain bugs that could damage your existing data. Please use with care.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 11:28:12 +00:00
orbiter
5e45ded8e2
- removed locks from WordReference
...
- refactoring of HeapReader/Writer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 00:32:16 +00:00
orbiter
d84b4a072e
healing for some OOM problems
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7502 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-21 00:38:49 +00:00
orbiter
6083f2f171
fix for (false) oom
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-15 14:26:25 +00:00
orbiter
fe93caac5a
added flags and administration options to show advanced search and to show search result attributes (for each search result)
...
Administration can be done at ConfigPortal.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-02 15:54:13 +00:00
orbiter
eb12e15738
moved all Double values to Float values because of
...
http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/
YaCy does not really need double-precision floating point computation anywhere, so this should not affect any feature
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-01 23:49:11 +00:00
orbiter
090c73e32e
catch an OOM in HeapReader iteration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-12 12:04:18 +00:00
orbiter
10ae8d961b
- the cora package now has no dependencies on other yacy packages and becomes a 'base' package (refactoring)
...
- cleaned up (removed special code and documentation for 27c3)
- added remote search functions to be used within cora
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-03 20:52:54 +00:00
orbiter
b2ed4cfaf8
more small bugfixes and light refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-12-28 01:57:05 +00:00
low012
9b3fae9496
*) cleaning up the code a little bit
...
*) program to interface, not implementation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-28 02:57:31 +00:00
sixcooler
b87bf88ac8
using less memory on merging and rewriting blobs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-12 16:02:20 +00:00
orbiter
4c50d3428e
smaller file size for array stacks to support smaller deletion sizes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-04 13:29:19 +00:00
orbiter
becc463d8a
enhanced did-you-mean
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7300 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-04 00:25:19 +00:00
orbiter
445619f3ec
added a submenu ConfigHTCache_p.html to set the size of the HTCache separately from the proxy configuration.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7291 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-02 23:57:11 +00:00
orbiter
ca738ac924
- added a tag cloud to search results (using the topics)
...
- some refactoring of score classes
- added default package for new classes add_ymark and delete_ymark
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 22:01:39 +00:00
orbiter
e4d561971e
added more score cluster options and made score cluster usage more transparent
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-14 11:40:02 +00:00
orbiter
7cd9d9d22a
- enhanced DidYouMean computation using a faster count on index entries, so results can be ranked better
...
- added limitations on DidYouMean result sets according to input and output string length
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 22:02:10 +00:00
orbiter
09c208a3ab
patch for corrupted database files (just carry on and drop the key)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7177 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 14:38:56 +00:00
orbiter
8da4eb5de6
addition to patch in SVN 7111
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7170 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 23:12:50 +00:00
orbiter
37baa8bae3
- fixes for concurrency exceptions and failed database integrity verification
...
- added link to yacystats peer when peer is more than one day old
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7164 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-17 10:20:04 +00:00
orbiter
83ac07874f
- corrected return value of put() methods (not used anywhere, so it did no harm before)
...
- added use of LookAheadIterator which should prevent mistakes when coding iterators with embedded iterators
- added a fail-safe reaction in case of database corruption using iterators over database elements (no interruption then)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 10:43:14 +00:00
orbiter
7dbc357593
patch to identify corrupted database files
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-13 07:20:53 +00:00
orbiter
5fe828fa06
- replaced pdfbox and fontbox version 1.1.0 with 1.2.1
...
- added some clear statements intended to reduce the static cache size within the pdfbox library
- the pdfbox library contains a memory leak; it is unsafe to run a peer with the pdf parser permanently on.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-07 17:13:47 +00:00
orbiter
24502fe3de
performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7116 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 12:59:33 +00:00
orbiter
d865ef77a8
removed re-read of index in case of a bad index. This may not solve the underlying problem, but it addresses a 100% CPU problem on the peer. I'm afraid bad index files must be abandoned and cannot be fixed this way.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 09:55:04 +00:00
orbiter
b2c9db48ea
Performance enhancement
...
- introduced byte[]-based ARC method for MapHeap which avoids a String generation each time the cache is accessed
- bugfixing in required class ComparableARC
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 09:53:33 +00:00
orbiter
65eaf30f77
redesign of crawl profiles data structure. target will be:
...
- permanent storage of auto-dom statistics in profile
- storage of profiles in WorkTable data structure
not finished yet. No functional change yet.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-31 15:47:47 +00:00
orbiter
4f22e2df41
bugfixes for
...
- next-execution-time in scheduler
- deletion of scheduled rss feed loading (now also deletes the scheduling entry)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-26 16:42:00 +00:00
orbiter
42414a6ae3
added two more tables in rss reader interface:
...
- freshly recorded rss feeds (not yet loaded or in scheduler)
- rss feeds in scheduler
The first list has a button that can be used to place rss feeds into the scheduler
The second list has a button to delete rss feeds from the scheduler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-26 16:01:45 +00:00
orbiter
0010cd9db1
Support for indexing of RSS feeds!
...
- added scanning for rss feeds in the html parser
- storage of rss feed addresses, can be viewed with http://localhost:8080/Tables_p.html?table=rss
- rss items retrieved by http://localhost:8080/Load_RSS_p.html (in Index Creation menu) can be selected and indexed
- an rss feed retrieved in http://localhost:8080/Load_RSS_p.html can now be fully indexed
- indexing of rss feeds can be placed in scheduler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-25 18:24:54 +00:00
orbiter
0f276dd63f
- MapHeap now implements Map<byte[], Map<String, String>>
...
- refactoring of method names to comply with Map method names
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-24 12:36:56 +00:00
orbiter
cf07b34c2d
implemented the Map interface in the ARC classes so it will be possible to instantiate ARCs as
...
Map<byte[], Map<String, byte[]>>
Because such Maps with byte[] keys cannot be stored in hash maps (bad hashing on byte[])
another ARC with comparable Maps has been added
This will make it possible to move the HTCache class 'Cache' into the cora package because that
class may be used either with RAM caches (ARCs) or with file-based caches (BEncodedHeaps)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-23 23:38:03 +00:00
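
A minimal sketch of why byte[] keys do not work in hash maps and how an ordered map with an explicit comparator avoids the problem. The class name ByteArrayKeySketch is hypothetical, and the TreeMap stands in for illustration only; it is not the comparable ARC added in this commit:

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical sketch: arrays inherit identity-based hashCode()/equals() from Object,
// so two byte[] with equal content are different keys in a HashMap.
// An ordered map compares key content through a Comparator instead.
final class ByteArrayKeySketch {

    // Lexicographic order on unsigned byte values; shorter arrays sort first on ties.
    static final Comparator<byte[]> ORDER = new Comparator<byte[]>() {
        @Override
        public int compare(byte[] a, byte[] b) {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        }
    };

    // Creates a map whose byte[] keys are matched by content, not by identity.
    static <V> TreeMap<byte[], V> newMap() {
        return new TreeMap<byte[], V>(ORDER);
    }
}
```

With this comparator, a lookup using a freshly allocated array with the same content finds the stored entry, whereas the same lookup in a HashMap returns null.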
orbiter
c60d0282fd
more abstraction for tables stored in heaps:
...
the BEncodedHeap now implements Map<byte[], Map<String, byte[]>>
This makes it possible to add other database storage types that implement the same Map<byte[], Map<String, byte[]>> interface.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7070 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-23 21:27:58 +00:00
orbiter
d1be64d491
removed wrong assert
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7069 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-23 21:02:28 +00:00