8558 Commits

Author SHA1 Message Date
cominch
b21048892b augmentedParser add features and integrate external html parser to
modify existing web pages

Conflicts:
	addon/YaCy.app/Contents/Info.plist
	build.xml
2012-06-10 10:23:35 +02:00
cominch
9cbfc1a1c0 augmentedProxy, which forwards every proxy request to a
rewrite engine to customize existing webpages. Originally implemented by
Florian Richter.

Conflicts:
	source/de/anomic/http/server/HTTPDProxyHandler.java
2012-06-10 10:15:34 +02:00
cominch
1626be7916 Add menu entries for urlproxy / augmented browsing 2012-06-10 09:59:30 +02:00
cominch
a32943b382 add json mimetype 2012-06-10 09:29:09 +02:00
Michael Peter Christen
3b992e6b00 using utf8 String compression in Webstructure database 2012-06-09 11:00:33 +02:00
Michael Peter Christen
26301a538d bugfix in Domains - dns-lookup 2012-06-09 10:59:45 +02:00
Michael Peter Christen
cde20911bb saved a bit more ram using UTF8 String compression for OpenGeoDB and
Geonames data files.
2012-06-09 10:07:11 +02:00
Michael Peter Christen
225ee42879 made GeoLocation an interface; the current integer
implementation serves as the implementation with an accuracy of 1.863cm
2012-06-09 09:46:27 +02:00
Michael Peter Christen
5b25272f40 added location search to main menu 2012-06-09 09:10:54 +02:00
Michael Peter Christen
2280a7b276 - changed initialization order to prefer allocation of memory for table
files first
- bugfixes in memory amount calculation
2012-06-09 09:05:47 +02:00
Michael Peter Christen
0746308bc2 only the metadata tables shall be able to use the tail cache 2012-06-08 18:36:11 +02:00
Michael Peter Christen
7ec9bef0c3 fix for OOM 2012-06-08 17:14:09 +02:00
Michael Peter Christen
41c02cb10e - less restrictions for usage of Table RAM copy
- new limit to use the table copy (instead of flag): 400MB available. If
less is available, then a copy is never used. If more is available, then
it can be used if there is a remaining space of at least 200MB
- flush caches more often: flush the Digest cache
2012-06-08 12:48:25 +02:00
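
The limit described in 41c02cb10e can be read as a two-step check. The following is a minimal sketch of that reading, not the actual YaCy code: the class name `TableCopyPolicy`, the method `canUseRamCopy`, and the interpretation of "remaining space" as the memory left over after the copy is loaded are all assumptions made for illustration.

```java
// Hypothetical sketch of the RAM-copy rule from commit 41c02cb10e.
// Names and the exact comparison operators are assumptions, not YaCy's API.
public class TableCopyPolicy {
    public static final long MB = 1024L * 1024L;
    // Below this amount of available memory a RAM copy is never used.
    public static final long MIN_AVAILABLE = 400L * MB;
    // After loading the copy, at least this much memory should remain (assumed reading).
    public static final long MIN_REMAINING = 200L * MB;

    public static boolean canUseRamCopy(long availableBytes, long tableBytes) {
        if (availableBytes < MIN_AVAILABLE) {
            return false; // less than 400MB available: never copy
        }
        // enough memory in principle: copy only if 200MB would still be free
        return availableBytes - tableBytes >= MIN_REMAINING;
    }

    public static void main(String[] args) {
        System.out.println(canUseRamCopy(300L * MB, 10L * MB));  // too little memory overall
        System.out.println(canUseRamCopy(500L * MB, 100L * MB)); // 400MB would remain
        System.out.println(canUseRamCopy(500L * MB, 350L * MB)); // only 150MB would remain
    }
}
```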
Michael Peter Christen
b8f56a9803 npe bugfix 2012-06-08 10:20:43 +02:00
Michael Peter Christen
ea0dceb55d bugfix: do not switch off standard memory strategy when performing a
forced GC
PLEASE CHECK if your peer has standard memory switched on!
2012-06-08 09:48:46 +02:00
Michael Peter Christen
dd14b19c26 lazy initialization of block rank table ... only normal web search uses
this. When interactive search or location search is used, the block rank
is switched off
2012-06-08 09:41:29 +02:00
Michael Peter Christen
ba10caf89a lazy initialization of database tables 2012-06-08 09:30:51 +02:00
Michael Peter Christen
701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	htroot/PerformanceMemory_p.java
2012-06-08 09:16:16 +02:00
Michael Peter Christen
ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass
result retrieval
2012-06-08 09:14:54 +02:00
Michael Peter Christen
10c9c17d51 fixed handlemap spread factor and null iterator handling 2012-06-08 09:13:41 +02:00
Michael Peter Christen
b0095c8d3c flush the compressor cache when a cleanup is done 2012-06-07 19:42:33 +02:00
Michael Peter Christen
a61f44f9e4 lazy initialization of block rank table.
As a result, the table is not initialized when no search is done.
The effect is strongest if YaCy is started headless, because then
no browser pop-up loads the search page and thereby triggers the
initialization of the table.
2012-06-07 13:16:38 +02:00
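
The lazy initialization described in a61f44f9e4 follows a common pattern: the table is loaded on first access rather than at startup. The sketch below shows that general pattern only. The class `LazyTable`, its methods, and the sample entry are hypothetical and do not reflect how YaCy's block rank table is actually structured.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of lazy initialization: nothing is loaded until get() is called.
public class LazyTable {
    private Map<String, Integer> table; // stays null until first use
    private int loadCount = 0;          // counts how often the load ran

    // Returns the table, loading it on the first call only.
    public synchronized Map<String, Integer> get() {
        if (table == null) {
            table = load();
        }
        return table;
    }

    public synchronized boolean isInitialized() {
        return table != null;
    }

    public synchronized int getLoadCount() {
        return loadCount;
    }

    // Stand-in for an expensive load from disk; the entry is made-up sample data.
    private Map<String, Integer> load() {
        loadCount++;
        Map<String, Integer> m = new HashMap<>();
        m.put("example.org", 3);
        return m;
    }

    public static void main(String[] args) {
        LazyTable t = new LazyTable();
        System.out.println(t.isInitialized()); // no search yet, so nothing is loaded
        t.get();                               // first use triggers the load
        t.get();                               // second use reuses the loaded table
        System.out.println(t.isInitialized() + " " + t.getLoadCount());
    }
}
```

A headless start that never calls `get()` would never pay the load cost in this sketch, which matches the effect the commit message describes.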
Michael Peter Christen
c8bbd180e4 enhanced hint for debian package automatic update 2012-06-07 12:36:26 +02:00
Michael Peter Christen
9ad84c5e9f fix for NPE in PerformanceMemory 2012-06-07 12:36:05 +02:00
Michael Peter Christen
96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java
2012-06-06 20:13:28 +02:00
Michael Peter Christen
00f2df1120 a variety of possible memory leak fixes 2012-06-06 18:23:18 +02:00
Michael Peter Christen
d7eb18cdf2 accept also file names beginning with "file://" for crawl start from
file.
2012-06-06 14:27:18 +02:00
Michael Peter Christen
8002fd2578 use less cache space since a large cache would cause more memory usage
in index files.
2012-06-06 14:17:42 +02:00
Michael Peter Christen
3dd8376825 added automatic cleaning of the cache if the metadata and file
database sizes are not equal. The two can differ because one of the
caches is cleaned after a while or when it is too big. The metadata
is then not cleaned, but it is now wiped after a checkup process at every
application start. This should cause a bit less memory usage.
2012-06-06 14:15:24 +02:00
Michael Peter Christen
d0ec8018f5 fixes for bad long computation 2012-06-06 14:13:31 +02:00
Michael Peter Christen
6bb07afcc3 accept also files with other file prefix; used to read 'foreign' cache
files
2012-06-06 13:36:10 +02:00
Michael Peter Christen
96c8119b50 added GeoLocation / GeoPoint classes which use less memory than
Location/Coordinates and have initializers with the correct order of lat,lon
coordinates
2012-06-06 12:57:42 +02:00
Michael Peter Christen
461a0ce052 removed warnings 2012-06-05 20:03:43 +02:00
Michael Peter Christen
62ae9bbfda allow more POIs, get more at once 2012-06-05 18:29:54 +02:00
Michael Peter Christen
407fdf6968 more bug fixes and performance hacks for search process 2012-06-05 15:04:23 +02:00
Michael Peter Christen
a1fe65b115 performance hacks 2012-06-05 12:06:26 +02:00
Michael Peter Christen
2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-04 23:44:38 +02:00
Michael Peter Christen
5aee19daa4 added show from cache in search results (not yet finished) 2012-06-04 23:44:26 +02:00
Michael Peter Christen
5e562dcdb7 adapted vocabulary usage within the annotation/navigation feature
of search to the new SimpleVocabulary class
2012-06-04 23:43:30 +02:00
Michael Peter Christen
514700291a moved Vocabulary to cora package (added in git
964406ad17)
2012-06-04 23:41:36 +02:00
Michael Peter Christen
0284a4d88f more fixes for double precision of coordinates 2012-06-04 23:37:41 +02:00
Michael Peter Christen
964406ad17 added concurrency enhancement to xml parser 2012-06-04 23:35:56 +02:00
Michael Peter Christen
240045cf7c fix for bad distance computation 2012-06-04 16:33:16 +02:00
Michael Peter Christen
e0d8643226 - performance hacks
- added log warnings in case that search processes run into time-out
situations
- better concurrency for Integer formatter (used a non-synchronized
formatter before)
- bugfix for search termination (a poison pill was missing)
- added timeout parameters for search (again) -> target is, that they
are never reached.
2012-06-04 15:37:39 +02:00
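
The "poison pill" mentioned in e0d8643226 is a standard way to terminate a queue consumer: the producer enqueues a sentinel value after the last real item, and the consumer stops when it takes that sentinel. If the sentinel is missing, the consumer blocks forever or runs into a timeout. The sketch below illustrates the general technique with `java.util.concurrent.BlockingQueue`. The class `PoisonPillDemo` and its methods are hypothetical and are not taken from the YaCy search code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of poison-pill termination for a producer/consumer queue.
public class PoisonPillDemo {
    // Sentinel object; compared by identity so no real item can equal it.
    public static final String POISON = new String("POISON");

    // Takes items until the sentinel arrives, then returns what was collected.
    public static List<String> consume(BlockingQueue<String> queue) {
        List<String> results = new ArrayList<>();
        try {
            while (true) {
                String item = queue.take(); // blocks until an item is available
                if (item == POISON) {       // identity check is intentional
                    break;
                }
                results.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return results;
    }

    // Runs a producer thread that ends its output with the sentinel.
    public static List<String> run(List<String> items) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (String s : items) {
                queue.add(s);
            }
            queue.add(POISON); // without this line the consumer would never finish
        });
        producer.start();
        List<String> out = consume(queue);
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("result-1", "result-2")));
    }
}
```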
Michael Peter Christen
7a329465b3 using pre-compiled pattern in blacklist; should enhance search speed 2012-06-04 15:34:53 +02:00
Michael Peter Christen
cf79b6cee3 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-01 08:32:26 +02:00
Michael Peter Christen
6e83b02b83 - bugfix for surrogate file reader
- bugfix for location search: suppress empty search
2012-06-01 00:08:31 +02:00
Michael Peter Christen
9b4c699526 enhanced location search:
- search requests are now made using a map boundary
- search results are only computed for the map boundary
- the number of results is adapted to the results in the visible range
- added double-buffering for the search result markers
- added a search query option for the search results:
/radius/<lat>/<lon>/<radius>
2012-05-31 22:39:53 +02:00
Michael Peter Christen
434af404c1 - added double-buffering for search layers
- added automatic zooming to search result
to location search
2012-05-31 14:05:36 +02:00
Michael Peter Christen
4d9b2dc487 automatically zoom to result layer bounds 2012-05-31 01:12:06 +02:00