145 Commits

Author SHA1 Message Date
Michael Christen
1f4afb4dc0 performance hacks 2011-12-15 15:15:53 +01:00
Michael Christen
e9dc99fe15 added rules to set specific RWIs as private RWIs which are not
transmitted to remote peers. This will be used for private index copies
and phonetic indexes.
2011-12-14 22:15:51 +01:00
Michael Christen
078fcde0dd bad initialization 2011-12-07 01:02:23 +01:00
Michael Christen
044f83feed added some pauses to the search process, which should produce
better-ranked search results. Without these pauses the result page
contains only links from the peer that answers first, which is not a good
average picture of all the peers that provided results
2011-12-06 15:28:48 +01:00
Michael Christen
d35bdc2df6 removed npe 2011-12-05 23:37:49 +01:00
Michael Christen
9cd469e6d6 added pull request from als plus an NPE fix 2011-12-04 12:15:03 +01:00
orbiter
83335c3b09 fix for http://bugs.yacy.net/view.php?id=78
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8127 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-12-01 20:57:22 +00:00
orbiter
35a9e8f307 - fixed network graphic
- debugged evaluation tables
- changed cache settings in template engine
- some speed hacks
- changed int angles for peer positions in network graphic to double angles

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8124 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-30 20:50:41 +00:00
Al Sutton
8993cac4d8 Initial performance improvements 2011-11-30 11:15:54 +00:00
orbiter
5a55397f99 some last-minute performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 11:23:52 +00:00
orbiter
05f34a3fa7 added a full, complete, database insert, update and delete API for the tables.
Please see this example:

list all database tables:
http://localhost:8090/api/table_p.xml

now create a new table and insert some values into 'mytable'
http://localhost:8090/api/table_p.xml?table=mytable&pk=&commitrow=&col_termin=Release%20Machen&col_datum=24.11.2011&col_status=ongoing

list the table content:
http://localhost:8090/api/table_p.xml?table=mytable&pk=

update the table and change a single value inside. You must refer to the row using a primary key 'pk'
http://localhost:8090/api/table_p.xml?table=mytable&pk=000000000001&commitrow=&col_datum=29.11.2011

you can also select rows using a search operator
http://localhost:8090/api/table_p.xml?table=mytable&pk=&count=10&search=

now let's delete the row:
http://localhost:8090/api/table_p.xml?table=mytable&pk=&deleterows=pk_000000000001

and we can also delete the complete table:
http://localhost:8090/api/table_p.xml?table=mytable&deletetable=

You can use this to administrate the robots, bookmarks and API steering using an outside application!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-22 12:31:07 +00:00
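
The request URLs listed in commit 05f34a3fa7 can also be assembled programmatically. The following is a minimal Java sketch, not part of YaCy: the class `TableApiUrls` and its method names are hypothetical, and only the endpoint and the parameter names (`table`, `pk`, `commitrow`, `col_<name>`, `deleterows`, `deletetable`) are taken from the commit message. It only builds the URL strings and does not send any request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a YaCy class): builds URLs for the table_p.xml servlet.
final class TableApiUrls {
    // Endpoint as given in the commit message; adjust host and port to your peer.
    static final String ENDPOINT = "http://localhost:8090/api/table_p.xml";

    // Insert a new row: empty pk, commitrow flag, one col_<name> parameter per column.
    static String insertRow(String table, Map<String, String> columns) {
        return commitRow(table, "", columns);
    }

    // Update an existing row, addressed by its primary key.
    static String updateRow(String table, String pk, Map<String, String> columns) {
        return commitRow(table, pk, columns);
    }

    // Delete a single row; the key is passed as deleterows=pk_<key>.
    static String deleteRow(String table, String pk) {
        Map<String, String> params = base(table);
        params.put("pk", "");
        params.put("deleterows", "pk_" + pk);
        return build(params);
    }

    // Delete the complete table.
    static String deleteTable(String table) {
        Map<String, String> params = base(table);
        params.put("deletetable", "");
        return build(params);
    }

    private static String commitRow(String table, String pk, Map<String, String> columns) {
        Map<String, String> params = base(table);
        params.put("pk", pk);
        params.put("commitrow", "");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            params.put("col_" + column.getKey(), column.getValue());
        }
        return build(params);
    }

    private static Map<String, String> base(String table) {
        // LinkedHashMap keeps the parameter order stable and readable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("table", table);
        return params;
    }

    private static String build(Map<String, String> params) {
        StringBuilder url = new StringBuilder(ENDPOINT);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator).append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

Note that `URLEncoder` writes a space as `+` rather than `%20`, so an inserted value such as "Release Machen" would appear as `Release+Machen` in the query string.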
orbiter
3a15e58e28 - increased stability when opening the robots table
- increased stability when deleting tables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 15:33:35 +00:00
orbiter
57d5529a01 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-28 21:16:40 +00:00
orbiter
2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 22:15:01 +00:00
sixcooler
ecb4986b38 refactored stuff from last commit to ReferenceContainer
see: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3353&p=23163#p23163
limiting of references is disabled by default
to enable it, set index.maxReferences in yacy.conf to a value such as 100000

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 18:55:16 +00:00
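
The idea of keeping only the youngest references per term (commits f7c4abfdd7 and ecb4986b38) might look roughly like the Java sketch below. It is an illustration under stated assumptions, not the ReferenceContainer code: the `Ref` type, the `shrink` method, and the convention that a limit of 0 or less means "disabled" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch (not YaCy code): cap a reference list at the N youngest entries.
final class ReferenceLimiter {
    // Stand-in for an index reference: a URL hash plus a last-modified timestamp.
    static final class Ref {
        final String urlHash;
        final long lastModified;

        Ref(String urlHash, long lastModified) {
            this.urlHash = urlHash;
            this.lastModified = lastModified;
        }
    }

    // When the list exceeds maxReferences, keeps the youngest entries (youngest first).
    // Otherwise, or when maxReferences <= 0 (assumed to mean "disabled"), returns an unchanged copy.
    static List<Ref> shrink(List<Ref> refs, int maxReferences) {
        if (maxReferences <= 0 || refs.size() <= maxReferences) {
            return new ArrayList<>(refs);
        }
        List<Ref> sorted = new ArrayList<>(refs);
        sorted.sort(Comparator.comparingLong((Ref r) -> r.lastModified).reversed());
        return new ArrayList<>(sorted.subList(0, maxReferences));
    }
}
```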
sixcooler
f7c4abfdd7 limit references per blob & term to the 100,000 youngest
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7934 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 13:08:06 +00:00
orbiter
51cf697acd refactoring: moved all score-related classes to new ranking package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7889 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-22 22:37:53 +00:00
sixcooler
5cd07d7f84 free resources early when deleting an index reference after search verification fails (aka Switchboard.cleanupJob)
did the same in other methods of the touched files as well

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-02 15:52:33 +00:00
orbiter
1912d0cccc changed handling of RowSet element retrieval: until now, all elements were copied from the underlying byte[] arrays into a new Entry object, which in turn held a copy of a portion of that byte[] in its own byte[]. There was an option to just refer to the underlying byte[] with a pointer, but it was almost never used. This commit changes the interface of the Row class so that the caller must state whether a copy is required. Fortunately the copy is needed only in very rare cases, so this change should cause much less memory allocation, especially during searches.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7840 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-15 08:38:10 +00:00
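
The difference between copying a row and pointing into the shared array (commit 1912d0cccc) can be illustrated with a small Java sketch. The names `RowAccess`, `Entry` and the `forceCopy` flag are hypothetical and do not reproduce the actual Row/RowSet interface.

```java
import java.util.Arrays;

// Hypothetical sketch (not the YaCy Row API): copy-on-demand access to fixed-width rows.
final class RowAccess {
    // An entry is a window of 'length' bytes starting at 'offset' inside 'data'.
    static final class Entry {
        final byte[] data;
        final int offset;
        final int length;

        Entry(byte[] data, int offset, int length) {
            this.data = data;
            this.offset = offset;
            this.length = length;
        }

        byte byteAt(int i) {
            return data[offset + i];
        }
    }

    // forceCopy = true allocates a private byte[] for the row;
    // forceCopy = false allocates nothing but the small Entry object and points into 'table'.
    static Entry get(byte[] table, int rowLength, int index, boolean forceCopy) {
        int start = index * rowLength;
        if (forceCopy) {
            return new Entry(Arrays.copyOfRange(table, start, start + rowLength), 0, rowLength);
        }
        return new Entry(table, start, rowLength);
    }
}
```

The trade-off is that an uncopied entry sees later changes to the underlying array, so a copy is still needed wherever the entry must stay stable after the table is modified.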
orbiter
0c1b29f3c9 - applied many small performance hacks
- added a memory limitation in the zip parser and the pdf parser
- added search throttling: if too many search queries are still waiting to be computed, new requests are not accepted for some time. If after one second there is still no capacity for another search, the search terminates with no results. This should only happen in DoS-like situations or under heavy load on a peer, for example if it is integrated into metager.
- added a search cache deletion process that removes search requests when throttling happens

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7766 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-01 19:31:56 +00:00
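
The throttling behaviour described in commit 0c1b29f3c9 (wait briefly for capacity, then give up with an empty result) could be sketched with a counting semaphore. This is one possible technique chosen for illustration, not necessarily the one used in the commit; the class `SearchThrottle`, its parameters and the `List<String>` result type are assumptions.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch (not YaCy code): bound the number of concurrent searches.
final class SearchThrottle {
    private final Semaphore slots;
    private final long maxWaitMillis;

    SearchThrottle(int maxConcurrentSearches, long maxWaitMillis) {
        this.slots = new Semaphore(maxConcurrentSearches);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Runs the query if a slot becomes free within maxWaitMillis;
    // otherwise returns an empty result instead of queueing indefinitely.
    List<String> search(Supplier<List<String>> query) {
        boolean acquired;
        try {
            acquired = slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return Collections.emptyList();
        }
        if (!acquired) {
            return Collections.emptyList();
        }
        try {
            return query.get();
        } finally {
            slots.release();
        }
    }
}
```

With `new SearchThrottle(n, 1000)` a request waits at most one second for one of `n` slots, which matches the one-second limit mentioned in the commit message.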
orbiter
fe0c08455b more concurrency (enhancement) hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-30 08:53:58 +00:00
orbiter
4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
used an ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes).
The new ASCII String <-> byte[] conversion methods have less computational overhead than the UTF8 conversion.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 08:24:54 +00:00
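
A direct ASCII conversion of the kind mentioned in commit 4bea3f9714 can be as simple as a cast per character, which avoids the charset encoder machinery entirely. The sketch below is an assumption about what such methods might look like; the class `AsciiCodec` and its method names are hypothetical.

```java
// Hypothetical sketch (not YaCy code): charset-free conversion for pure-ASCII strings.
final class AsciiCodec {
    // Only correct when every char is below 128, as is the case for base64 hashes.
    // Other chars would be silently truncated to their low byte.
    static byte[] encode(String s) {
        byte[] bytes = new byte[s.length()];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) s.charAt(i);
        }
        return bytes;
    }

    // Maps each byte to the char with the same unsigned value (identical to ASCII for values below 128).
    static String decode(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return new String(chars);
    }
}
```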
orbiter
e28bd0d038 fix for some possible causes of memory leaks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 14:35:32 +00:00
orbiter
10e2f588f8 - enhanced ybr ranking computation
- many speed/performance hacks
- added solr sharding and new sharding web interface
- added option to switch off the yacy index when using solr
- added new fail-url categories which are used to make a distinction which fail-urls to be sent to solr
- refactoring/renaming of some method names to distinguish host/url hashes better
- a large number of bug/npe fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 10:57:02 +00:00
orbiter
3ed4a09368 small features, some bug fixes and performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-23 21:08:04 +00:00
orbiter
b45701d20f this is a re-implementation of the YaCy Block Rank feature
This time it works like this:
- each peer provides its ranking information using the yacy/idx.json servlet
- peers with more than 1 GB of RAM load this information from all other peers, combine it into one ranking table and store it locally. This happens concurrently during start-up of the peer. The newly generated file with the ranking information is at DATA/INDEX/<network>/QUEUES/hostIndex.blob
- this index is then processed to generate a fresh ranking table. Peers which can calculate their own ranking table will do so at every start-up to get the latest feature updates until the feature is stable
- I computed new ranking tables as part of the distribution and commit them here as well
- the YBR feature must be enabled manually by setting the YBR value in the ranking servlet to level 15. A default configuration for that is also in this commit, but it affects only fresh peers, not your current installation
- a recursive block rank refinement is implemented but disabled at this point; it needs more testing

Please play around with the ranking settings and see whether this improves the search results.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7729 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-18 14:26:28 +00:00
orbiter
dc54915df4 fix for very bad compare
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7708 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-08 08:45:58 +00:00
orbiter
b77b8cac0c - enhanced html parser: recognizes many more details in the content
- added more properties to solr index
- refactoring
- more constants in switchboard
- fix for some NPEs
- recognition of more images
- removed synchronization in HandleMap (obviously not necessary?)
- added a nolocal configuration to remove excessive DNS lookups (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as being local.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-21 13:58:49 +00:00
orbiter
17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 12:20:20 +00:00
orbiter
b1a8d0c020 enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
orbiter
a35d513bd8 fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
orbiter
8f11d3a5bb redesigned the ScoreMap classes:
- new concurrent score map using atomic operations from the java concurrency classes
- redesigned the difference between StaticScore and DynamicScore into ScoreMap and ReversibleScoreMap; this allows many classes to use simple ScoreMap objects, which work better in concurrent environments via the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-13 01:41:44 +00:00
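
A score map built on atomic operations, as described in commit 8f11d3a5bb, might look like the following Java sketch. It is not the ConcurrentScoreMap from the commit: the class `AtomicScoreMap` and its methods are hypothetical, and `computeIfAbsent` requires Java 8, which is newer than the code in this history.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch (not the YaCy ConcurrentScoreMap): lock-free score counting.
final class AtomicScoreMap<K> {
    private final ConcurrentHashMap<K, AtomicLong> scores = new ConcurrentHashMap<>();

    // Increase the score of a key by one.
    void inc(K key) {
        add(key, 1);
    }

    // Add a (possibly negative) delta; the counter is created atomically on first use.
    void add(K key, long delta) {
        scores.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
    }

    // Unknown keys have a score of zero.
    long get(K key) {
        AtomicLong value = scores.get(key);
        return value == null ? 0 : value.get();
    }
}
```

Because each counter is an `AtomicLong`, concurrent increments on the same key do not need a synchronized block around the whole map.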
orbiter
30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-10 12:35:32 +00:00
orbiter
e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-09 09:29:05 +00:00
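
The StringBuilder sizing advice in commit e1b6916423 can be shown with a short Java sketch. The class `PresizedJoin` is a hypothetical example, not code from the commit: it computes the exact result length first so that the builder's internal buffer never has to grow.

```java
import java.util.List;

// Hypothetical sketch (not YaCy code): size the StringBuilder before appending.
final class PresizedJoin {
    static String join(List<String> parts, char separator) {
        // One separator between each pair of parts, plus the length of every part.
        int capacity = Math.max(0, parts.size() - 1);
        for (String part : parts) {
            capacity += part.length();
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }
}
```

When the exact length is not known, a rough upper estimate still avoids most of the intermediate re-allocations.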
low012
3b40b98256 *) set SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 01:51:51 +00:00
orbiter
cb1f49d0f2 replaced all 'new String' calls that used the default encoding (no charset given) or a UTF-8 encoding name with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache lookup for the Charset object via string hashing of the name 'UTF-8'.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:36:40 +00:00
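
The pattern from commit cb1f49d0f2 (and the matching getBytes() change in 30aed9824a) can be sketched as below. The class `Utf8Strings` and its method names are hypothetical; the actual YaCy helper is only referred to as UTF8 in the commit messages.

```java
import java.nio.charset.Charset;

// Hypothetical sketch (not the YaCy UTF8 class): one shared Charset constant.
final class Utf8Strings {
    // Resolved once at class-load time instead of by name on every call.
    static final Charset UTF8_CHARSET = Charset.forName("UTF-8");

    // Replaces new String(bytes) and new String(bytes, "UTF-8").
    static String string(byte[] bytes) {
        return new String(bytes, UTF8_CHARSET);
    }

    // Replaces s.getBytes() and s.getBytes("UTF-8").
    static byte[] getBytes(String s) {
        return s.getBytes(UTF8_CHARSET);
    }
}
```

A side benefit of the Charset overloads is that they do not declare the checked UnsupportedEncodingException that the String-name overloads do.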
orbiter
8d14916c74 more patches for a better out-of-memory management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 01:45:11 +00:00
orbiter
993b9bc1a8 memory/performance hacks, less synchronization, better concurrency
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-03 11:30:04 +00:00
orbiter
42d90664f3 - fixed a memory leak in the httpc.post method (no finish)
- patched some more memory-saving relevant code
- some more minor bug fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-01 09:03:33 +00:00
orbiter
b1781d7aae some more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7533 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-26 01:24:49 +00:00
orbiter
b2f147d28e performance hack: excluded map encoding in many cases from synchronization block, especially when doing an iteration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7532 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:29:55 +00:00
orbiter
5e186e0122 continuing the fight against deadlocks during time formatting: better caching.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 21:11:53 +00:00
orbiter
19b2a50578 - enhanced date formatter cache
- added more instances of formatter objects to different classes to make them independent of locks that may apply during synchronization on the date formatter object (date formatting is not thread-safe and must therefore be synchronized)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 12:23:00 +00:00
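
The approach in commit 19b2a50578, where each class gets its own formatter so that classes do not contend for one shared lock, might look like this in Java. The class name `CrawlLogDates`, the pattern and the UTC time zone are assumptions made for the example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch (not YaCy code): a class-private, synchronized date formatter.
final class CrawlLogDates {
    // Private to this class, so threads formatting dates elsewhere never wait on this lock.
    private static final SimpleDateFormat FORMAT = newFormat();

    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // SimpleDateFormat is not thread-safe, so every use is guarded by its own monitor.
    static String format(Date date) {
        synchronized (FORMAT) {
            return FORMAT.format(date);
        }
    }
}
```

For example, `CrawlLogDates.format(new Date(0L))` yields `1970-01-01 00:00:00`.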
orbiter
48a61c39a3 speed hacks in BLOB ArrayStack:
- more concurrency if possible
- fewer threads if no concurrency necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7527 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-25 11:58:01 +00:00
orbiter
804ae2275b - do not delete idx and gap files if the heap is not modified
this change may have bugs in it which may cause damage to your existing data. please use with care.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 11:28:12 +00:00
orbiter
5e45ded8e2 - removed locks from WordReference
- refactoring of HeapReader/Writer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 00:32:16 +00:00
orbiter
d84b4a072e healing for some OOM problems
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7502 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-21 00:38:49 +00:00
orbiter
6083f2f171 fix for (false) oom
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-15 14:26:25 +00:00
orbiter
fe93caac5a added flags and administration options to show advanced search and to show search result attributes (for each search result)
Administration can be done at ConfigPortal.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-02 15:54:13 +00:00
orbiter
eb12e15738 moved all Double values to Float values because of
http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/
YaCy does not really need double-precision floating point computation anywhere, so this should not affect any feature

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-01 23:49:11 +00:00