Commit Graph

744 Commits

Author SHA1 Message Date
Michael Christen
9e5894c784 Removed handling of components objects for URIMetadataRows.
This is a preparation to replace this rows with nodes from the node
store.
2011-12-17 01:27:08 +01:00
Michael Christen
c04bfaa51b refactoring 2011-12-16 23:59:29 +01:00
Michael Christen
17f962fceb translator updates:
- config string for chinese
- do not copy the language file to DATA/LOCALE any more (and do not use
them there, this is really confusing for new translators)
2011-12-08 10:25:26 +01:00
admin
23afee58fe Merge branch 'master' of git://github.com/f1ori/yacy 2011-12-07 01:01:04 +01:00
apfelmaennchen
ff19fcdb28 bugfix for YMarks XBEL import and export; thanks to Dominic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8138 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-12-06 23:17:57 +00:00
Michael Christen
9cd469e6d6 added pull request from als plus an NPE fix 2011-12-04 12:15:03 +01:00
apfelmaennchen
70bcfc150a - small bug fix to ymarks html importer
- import of delicious.com exports has successfully been tested

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-12-02 23:24:40 +00:00
apfelmaennchen
b5d9f631e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8128 6c8d7289-2bf4-0310-a012-ef5d649a1542 2011-12-01 22:25:23 +00:00
Al Sutton
8993cac4d8 Initial performance improvements 2011-11-30 11:15:54 +00:00
apfelmaennchen
77a080ced9 smaller fixes for YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 17:33:03 +00:00
orbiter
5a55397f99 some last-minute performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 11:23:52 +00:00
apfelmaennchen
dd1482aaf5 further update to YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8100 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 00:32:18 +00:00
orbiter
c584db991f creating a bookmark from the search results now works again .. with new YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 14:57:09 +00:00
apfelmaennchen
564374d1fe - included YMarks in addition to old bookmarks in yacysearchitem.html; don't get confused by the old bookmark dialog, the ymark is automatically added silently beforehand.
- reworked bookmark creation on crawlstart
- many smaller adjustments to ymarks


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-22 23:50:49 +00:00
orbiter
c93f10417a add a bookmark automatically each time a new crawl is started
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-22 00:03:20 +00:00
apfelmaennchen
6287c2b4a9 YMarks:
- introduced tag manager - a quite powerful tool (still not 100% stable, so be careful)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-20 22:42:15 +00:00
cominch
2236e01137 Minor correction to prevent useless comma at beginning of string, created from list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-20 15:20:57 +00:00
apfelmaennchen
5581be12fb YMarks:
- added backend and api for tag management


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-19 20:59:21 +00:00
apfelmaennchen
a3eebfdcba YMarks:
- show active/running crawls
- execute crawls (works currently only if API entry is available)
- various smaller fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-17 23:11:27 +00:00
apfelmaennchen
4f95f72124 YMarks:
- working direct importer for YaCy Crawl Starts
- working direct import for old bookmarks.db

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8052 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 23:10:53 +00:00
apfelmaennchen
a8dfe787ed - updated to jquery flexigrid 1.1
- YMarks.html automatically  recognizes if a bookmark is a crawl start


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 21:45:17 +00:00
apfelmaennchen
abba31f02e - bugfix for correctly sorting ymarks
- some tuning for the autotagger (still not perfect)
- /api/ymarks/get_metadata.xml now provides info for crawlstarts
- removed unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 22:00:44 +00:00
apfelmaennchen
5f7dbe1c42 - some refactoring (ymarks)
- improvement for autotagger (is now able to create/detect  multi word tags e.g. 'open source')



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-13 23:19:47 +00:00
orbiter
0d858d48ec replaced String with StringBuilder in suggestion process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-09 14:42:55 +00:00
orbiter
d2ea250d99 refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00
f1ori
97045022fa * pass cookies to Server Side Includes
* User.html a bit more usable


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-20 14:54:14 +00:00
orbiter
610b01e1c3 - added a 'add every media object linked in a html document as a new document' to the html parser. This causes that all image, app, video or audio file that is linked in a html file is added as document. In fact that means that parsing a single html document may cause that a number of documents is inserted into the search index.
- some refactoring for mime type discovery

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-01 16:05:00 +00:00
orbiter
51cf697acd refactoring: moved all score-related classes to new ranking package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7889 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-22 22:37:53 +00:00
sixcooler
5cd07d7f84 early freeing resources on deleting index reference if search-verification fails (aka Switchboard.cleanupJob)
doing same thingy on other methods of touched files as well

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-02 15:52:33 +00:00
sixcooler
59b767eebd stop loading via http at defined maximum of bytes - even size is unknown before loading
using max-file-size of type int for parsing documents
(since content is used as byte-arrays, 'integer' should be maximum)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7855 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-01 23:28:23 +00:00
orbiter
7db208c992 performance hacks: more pre-allocated StringBuilder
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7790 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-21 23:10:50 +00:00
orbiter
115abc8917 - more attributes for search progress bar
- moved cache strategy to cora package

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7778 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-13 21:44:03 +00:00
orbiter
0c1b29f3c9 - applied many small performance hacks
- added a memory limitation in the zip parser and the pdf parser
- added a search throttling: if there are too many search queries are still to be computed, then new requests are not accepted for some time. if after a one second still no space is there to perform another search, the search terminates with no results. this case should only happen in case of DoS-like situations and in case of strong load on a peer like if it is integrated in metager.
- added a search cache deletion process that removes search requests in case that throttling happens

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7766 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-01 19:31:56 +00:00
orbiter
4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
used a ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes).
The new ASCII String <-> byte[] conversion method have less computation overhead than the UTF8 conversion.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 08:24:54 +00:00
orbiter
746e3c3b06 Replaced a widely-used Property Object in the httpd with HashMap<String, Object> which is not synchronized like Properties
A synchronization is not needed here and applies an overhead to the httpd process which is now removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7745 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 16:34:35 +00:00
orbiter
09ba6814c0 - non-blocking word hash computation with dynamic digest object generation (this was important!)
- (very) small performance enhancement in did-you-mean


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 12:58:11 +00:00
orbiter
10e2f588f8 - enhanced ybr ranking computation
- many speed/performance hacks
- added solr charding and new charding web interface
- added option to switch off the yacy index when using solr
- added new fail-url categories which are used to make a distinction which fail-urls to be sent to solr
- refactoring/renaming of some method names to distinguish host/url hashes better
- a large number of bug/npe fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-26 10:57:02 +00:00
orbiter
5b579e21a3 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 06:21:40 +00:00
apfelmaennchen
8b8db2aaba YMarks: some small changes/fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7695 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-03 21:21:06 +00:00
apfelmaennchen
441035f1f4 YMarks: some improvements to flexigrid quick search on YMarks.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-02 20:11:58 +00:00
orbiter
6fa439c82b - refactoring of robots
- added option to crawler to send error-URLs to solr
- changed solr scheme slightly (no multi-value fields where no multi values are)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-02 14:05:51 +00:00
apfelmaennchen
e7c2ea193b YMark:
- general improvements on importers, especially on auto tagging
- added get_tags (needed for tag clouds etc.)
- improved flexigrid support
- added YMarks.html (not fully working) that will eventually replace Bookmarks.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7691 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-01 21:42:48 +00:00
orbiter
b77b8cac0c - enhanced html parser: recognized much more details in the content
- added more properties to solr index
- refactoring
- more constants in switchboard
- fix for some NPEs
- recognition of more images
- removed synchronization in HandleMap (obviously not necessary?)
- added a nolocal configuration to remove excessive dns lookup (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as beeing local.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7672 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-21 13:58:49 +00:00
low012
bc84d2bc9d *) fixed typo in stop script
*) added <u> </u> tags for underlined text in Wiki Code
*) minor code changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-20 22:54:29 +00:00
apfelmaennchen
b2281f0b7d YMark: intermediate work towards flexigrid support
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7670 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-20 22:33:01 +00:00
low012
06d50fd801 *) fixed stupid bug (introduced in r7663 by myself) which caused wrong parsing of Wiki pages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7669 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-20 17:27:59 +00:00
apfelmaennchen
60412d2bb3 YMark:
- more refactoring >> YMarkEntry
- integration of SurrogateReader as bookmark importer
- various small bug fixes e.g. get_xbel.xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-18 21:42:14 +00:00
low012
2e9694c9e9 *) removed recursion which hopefully prevents exception
*) fixed bug in creation of table of content which caused double entries if a page was previewed more than once

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-17 21:02:18 +00:00
apfelmaennchen
a2e86daae9 YMark: more bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-16 22:09:50 +00:00
apfelmaennchen
62855f9567 YMark: code clean up and some small fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-16 21:19:42 +00:00