Commit Graph

3821 Commits

Author SHA1 Message Date
cominch
09a34cfe1b prepare RDF dump routines 2012-06-10 12:58:40 +02:00
cominch
300b235ce8 Updated Demo Servlet
Conflicts:
	htroot/About.html
	htroot/DemoServlet.html
	htroot/DemoServlet.java
	htroot/interaction/interaction.js
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:58:29 +02:00
cominch
90512640bf Added config switches for custom parser
Conflicts:
	source/net/yacy/document/TextParser.java
2012-06-10 12:49:36 +02:00
cominch
a12cbcba36 Add a global value store 2012-06-10 12:45:01 +02:00
cominch
e14f2881ae interaction: add special table interaction
Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:41:16 +02:00
cominch
4e4e7a99f8 interaction: add global variable store
Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:34:36 +02:00
cominch
bde07ed7a8 Add tagging overlay element
Conflicts:
	htroot/env/templates/jqueryheader.template
	htroot/yacysearchitem.java
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:28:50 +02:00
cominch
bee3bee8f3 Small fix - return value of JSON should be empty 2012-06-10 12:20:13 +02:00
cominch
ff4ba3ee05 Small fix
Conflicts:
	htroot/yacysearchitem.java
2012-06-10 10:56:39 +02:00
cominch
f05e3968f7 Quick fix 2012-06-10 10:55:09 +02:00
cominch
e859481889 Add Triplestore settings functionality
Conflicts:
	htroot/env/templates/header.template
2012-06-10 10:55:00 +02:00
cominch
b0bc0b4572 Add new demonstration module for client-side key-value store (backend:
triplestore): /DemoServletInteraction.html

Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 10:53:30 +02:00
cominch
c9dc6cda02 Demonstration: include value from interaction in search results
Conflicts:
	htroot/interaction/OverlayInteraction.html
	htroot/yacysearchitem.java
2012-06-10 10:51:53 +02:00
cominch
ae8adb0e58 Small changes 2012-06-10 10:44:16 +02:00
cominch
bcbd8eee33 Add several parsers, for RDFa and rdf files.
Conflicts:
	source/net/yacy/document/TextParser.java
2012-06-10 10:42:33 +02:00
cominch
9ef5a80f4e add interaction for triples and selector for augmented browsing
Conflicts:
	htroot/interaction/interaction.js
	source/net/yacy/interaction/Interaction.java
2012-06-10 10:38:54 +02:00
cominch
5d20cd324a Add Triplestore and RDF query interface
Conflicts:
	build.xml
	defaults/yacy.init
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:35:59 +02:00
cominch
bc9a618e0a augmented browsing: ignore js and css, integrate more user interaction
Conflicts:
	htroot/interaction/Footer.html
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:29:15 +02:00
cominch
9cbfc1a1c0 augmentedProxy, which forwards every proxy request to a
rewrite engine to customize existing webpages. originally implemented by
Florian Richter.

Conflicts:
	source/de/anomic/http/server/HTTPDProxyHandler.java
2012-06-10 10:15:34 +02:00
cominch
1626be7916 Add menu entries for urlproxy / augmented browsing 2012-06-10 09:59:30 +02:00
Michael Peter Christen
5b25272f40 added location search to main menu 2012-06-09 09:10:54 +02:00
Michael Peter Christen
ea0dceb55d bugfix: do not switch off standard memory strategy when performing a
forced GC
PLEASE CHECK if your peer has standard memory switched on!
2012-06-08 09:48:46 +02:00
Michael Peter Christen
dd14b19c26 lazy initialization of block rank table ... only normal web search uses
this. When interactive search or location search is used, the block rank
is switched off
2012-06-08 09:41:29 +02:00
Michael Peter Christen
701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	htroot/PerformanceMemory_p.java
2012-06-08 09:16:16 +02:00
Michael Peter Christen
ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass
result retrieval
2012-06-08 09:14:54 +02:00
Michael Peter Christen
10c9c17d51 fixed handlemap spread factor and null iterator handling 2012-06-08 09:13:41 +02:00
Michael Peter Christen
a61f44f9e4 lazy initialization of block rank table.
as a result, the table is not initialized when no search is done.
the effect is strongest if YaCy is started headless, because then no
browser pop-up loads the search page and triggers the initialization
of the table.
2012-06-07 13:16:38 +02:00
Michael Peter Christen
c8bbd180e4 enhanced hint for debian package automatic update 2012-06-07 12:36:26 +02:00
Michael Peter Christen
9ad84c5e9f fix for NPE in PerformanceMemory 2012-06-07 12:36:05 +02:00
Michael Peter Christen
96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java
2012-06-06 20:13:28 +02:00
Michael Peter Christen
d7eb18cdf2 accept also file names beginning with "file://" for crawl start from
file.
2012-06-06 14:27:18 +02:00
Michael Peter Christen
3dd8376825 added automatic cleaning of the cache if the metadata and file database
sizes are not equal. These sizes may differ because one of the caches is
cleaned after a while or when it is too big. The metadata was then not
cleaned, but is now wiped after a checkup process at every application
start. This should cause a bit less memory usage.
2012-06-06 14:15:24 +02:00
Michael Peter Christen
d0ec8018f5 fixes for bad long computation 2012-06-06 14:13:31 +02:00
Michael Peter Christen
96c8119b50 added GeoLocation / GeoPoint classes which use less memory than
Location/Coordinates and have initializers with the correct order of
lat,lon coordinates
2012-06-06 12:57:42 +02:00
Michael Peter Christen
461a0ce052 removed warnings 2012-06-05 20:03:43 +02:00
Michael Peter Christen
62ae9bbfda allow more POIs, get more at once 2012-06-05 18:29:54 +02:00
Michael Peter Christen
a1fe65b115 performance hacks 2012-06-05 12:06:26 +02:00
Michael Peter Christen
2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-04 23:44:38 +02:00
Michael Peter Christen
5aee19daa4 added show from cache in search results (not yet finished) 2012-06-04 23:44:26 +02:00
Michael Peter Christen
e0d8643226 - performance hacks
- added log warnings in case that search processes run into time-out
situations
- better concurrency for Integer formatter (used a non-synchronized
formatter before)
- bugfix for search termination (a poison pill was missing)
- added timeout parameters for search (again) -> target is, that they
are never reached.
2012-06-04 15:37:39 +02:00
Michael Peter Christen
cf79b6cee3 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-01 08:32:26 +02:00
Michael Peter Christen
6e83b02b83 - bugfix for surrogate file reader
- bugfix for location search: suppress empty search
2012-06-01 00:08:31 +02:00
Michael Peter Christen
9b4c699526 enhanced location search:
- search requests are now made using a map boundary
- search results are only computed for the map boundary
- the number of results is adapted to the results in the visible range
- the number of results is adopted to the results in the visible range
- added a double-buffering for the search result markers
- added a search query option for the search results:
/radius/<lat>/<lon>/<radius>
2012-05-31 22:39:53 +02:00
Michael Peter Christen
434af404c1 - added double-buffering for search layers
- added automatic zooming to search result
to location search
2012-05-31 14:05:36 +02:00
Michael Peter Christen
4d9b2dc487 automatically zoom to result layer bounds 2012-05-31 01:12:06 +02:00
Michael Peter Christen
6b40803adf - show number of results in map search interface
- transfer view radius within query
2012-05-31 00:47:52 +02:00
Michael Peter Christen
a8778e9c47 npe fix 2012-05-30 15:28:45 +02:00
Michael Peter Christen
1a6fab60e0 added node state to xml 2012-05-30 09:32:25 +02:00
Michael Peter Christen
20e0cc0822 fix for bad location evaluation 2012-05-29 14:46:13 +02:00
Michael Peter Christen
1ab3de0885 fixes to location search 2012-05-29 12:43:14 +02:00
Michael Peter Christen
f167a1c69f removed osmarender from yacysearch_location because that caused a
javascript error
2012-05-29 02:22:02 +02:00
Michael Peter Christen
71c3163f3d - fixes to node identification
- added link to node in network list
- added marking of portal search node peers
2012-05-29 01:38:54 +02:00
Michael Peter Christen
d1e9fe3db5 enhanced RootState icon 2012-05-29 00:06:33 +02:00
Michael Peter Christen
ad222be7f8 added node state icon in network list 2012-05-25 17:29:54 +02:00
Michael Peter Christen
638390930d another patch to fix the Crawler_p layout 2012-05-25 15:56:21 +02:00
Michael Peter Christen
c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
2012-05-25 01:45:38 +02:00
Michael Peter Christen
8b974905ee changed log-in text for all servlets with authentication:
- added hint how to set the password using a shell script
- added a shell script to change the password
2012-05-24 13:24:31 +02:00
Michael Peter Christen
16b21f7a5b Added more steering in Crawler_p.html interface 2012-05-23 18:00:37 +02:00
Michael Peter Christen
c15fcde1c8 add-on to latest commit 2012-05-21 17:52:30 +02:00
Michael Peter Christen
cf47d94888 performance hack to parse numbers inside of substrings without actually
generating a substring. This avoids the allocation of a String object
each time a substring is parsed. Should affect CPU load during RWI
transmission.
2012-05-21 13:40:46 +02:00
Michael Peter Christen
7bf421b9dd - fixed image search page navigation
- removed some deadlocks and ConcurrentModificationExceptions during
DidYouMean collection
2012-05-21 01:58:29 +02:00
reger
6696cb1313 bugfix: lookup of peer names returned no result for an active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
SeedDB.lookupByName searches for lowercase peer names, while MapColumnIndex.getIndex uses the peer name as is in the keyset.
Changed the index init to insert lowercase peer names as key
2012-05-20 05:25:16 +02:00
Michael Peter Christen
4298f00d2d fixed bad usage of given words 2012-05-20 01:35:49 +02:00
Michael Peter Christen
0d32a766ed relax verify attribute for search widget to make it faster:
set to "cacheonly"
2012-05-20 00:50:54 +02:00
reger
ae335a4190 bugfix Tables_p for edit and delete selected row (correction to use "pk_" html prefix) 2012-05-19 23:15:29 +02:00
Michael Peter Christen
f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
tried to make a bit less 'noise' to dns server

also included: less processes in snippet fetch to reduce load during
search on small computers
2012-05-19 01:06:33 +02:00
Michael Peter Christen
1473e2258e fix for http://bugs.yacy.net/view.php?id=154 2012-05-18 23:56:40 +02:00
Michael Peter Christen
3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-17 13:58:09 +02:00
Roland 'Quix0r' Haeder
fbb946f913 Made a method static (Eclipse suggested it), removed unused import, pk=null check does now output a warning in logfile 2012-05-17 05:55:44 +02:00
Roland 'Quix0r' Haeder
5f983faef9 No &amp; in JavaScript-embedded URLs, added ability to stop focus in
ConfigPortal.html preview (is this not secured with _p????)

Conflicts:
	htroot/yacyinteractive.java
	htroot/yacysearch.java
2012-05-17 05:49:25 +02:00
Michael Peter Christen
5b3acc12cd Pattern.quote() replaces \\Q and \\E according to publication in
http://www.cs.washington.edu/homes/mernst/pubs/regex-types-ftfjp2012.pdf
2012-05-17 03:55:10 +02:00
Michael Peter Christen
89142d1e8d removed (not all) warnings 2012-05-16 13:42:32 +02:00
Michael Peter Christen
ffa4553229 typo 2012-05-16 10:14:44 +02:00
Michael Peter Christen
5deebd02ea added serialization 2012-05-15 23:10:47 +02:00
reger
b2175ea4ef Add possibility to set custom Solr field names for the YaCy default Solr attributes.
- Changing the format of YaCy's solr.key.list while maintaining backward compatibility
  Federated index config screens adjusted accordingly
- modified the Solr update request to use a 3 min Solr autocommit interval
2012-05-15 22:34:02 +02:00
Michael Peter Christen
0d58fea210 made multiple connector default 2012-05-12 10:39:01 +02:00
Michael Peter Christen
8864141872 more abstraction in solr connection classes 2012-05-09 17:00:56 +02:00
Michael Peter Christen
c00efc2717 made the solr connection more generic 2012-05-09 16:46:45 +02:00
Michael Peter Christen
f130ab39e8 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-08 22:55:52 +02:00
Marc Nause
a691023d04 *) better formatting for network QPM
*) refactoring
2012-05-08 20:07:34 +02:00
Michael Peter Christen
dcccbe0be8 removed superfluous column 2012-05-07 00:54:00 +02:00
Michael Peter Christen
77f8e9fb9b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-04 17:29:16 +02:00
Michael Peter Christen
ba6aaabc51 refactoring + parser bugfixes 2012-05-04 17:28:27 +02:00
Michael Peter Christen
a18b6dee04 Merge remote branch 'bbyacy-rc1/master' 2012-04-29 21:22:23 +02:00
reger
ea932f841c changed link to opensearchdescription document to an absolute URI (in yacysearch.html and yacysearch.rss)
see http://www.opensearch.org/Specifications/OpenSearch/1.1/Draft_5#The_.22Description.22_element
2012-04-29 05:50:56 +02:00
Michael Peter Christen
453010bd68 - solved problems with backpath normalization
- redesigned in/outbound link handover
- removed iframe links from inbound/outbound in solr scheme
2012-04-27 16:48:51 +02:00
Michael Peter Christen
5f5ed33ed8 patch for media search (audio, video apps) 2012-04-27 14:18:02 +02:00
Michael Peter Christen
0e13022147 - enhanced solr field documentation
- added xml api button to IndexFederated_p - the solr schema.xml file
can be generated by YaCy
2012-04-26 15:25:07 +02:00
Michael Peter Christen
08dcf3e5d1 hack to get all results if the actual number is between 10 and 64 2012-04-26 00:27:21 +02:00
Michael Peter Christen
19efbf1b0f - apply directDocByURL to NOLOAD Queue
- choose pushing to NOLOAD as default for site crawl
2012-04-26 00:23:18 +02:00
Michael Peter Christen
5c66880be2 fix for search result selection in case that contentdom is not set 2012-04-26 00:04:23 +02:00
Michael Peter Christen
3bea25c513 increased image preview size 2012-04-24 16:04:13 +02:00
Michael Peter Christen
a3badd3205 changed search process for images: no more media snippet load process,
show only links from index which had been on the text search page
before. This creates a superfast search process for images!
2012-04-24 12:55:58 +02:00
Michael Peter Christen
4aa0eedead one more scroogle... 2012-04-24 12:05:37 +02:00
Michael Peter Christen
347612ddd4 removed scroogle parser 2012-04-24 12:04:44 +02:00
Michael Peter Christen
f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
only links where the content can be parsed. All non-parseable links are
placed into the noload queue. The search process must therefore be able
to filter out non-text search results.
- This fixes the problem that image search results appeared in the text
search.
- The interactive search can retrieve now ALL types of links
- The p2p interface is now extended to retrieve only certain types of
links (text, image, video, apps)
- The search process has an extension to filter the right document type
according to the search query
2012-04-22 02:05:17 +02:00
Michael Peter Christen
14f67f217c refactoring of ContentDomain: now subclass of Classification 2012-04-22 00:04:36 +02:00
Michael Peter Christen
a5d7da68a0 refactoring: removed dependency from switchboard in Balancer/CrawlQueues 2012-04-21 13:47:48 +02:00
Michael Peter Christen
33d1062c79 refactoring: the cache belongs to the crawler 2012-04-21 13:34:07 +02:00
Michael Peter Christen
8429967ea7 no more SVN 2012-04-19 13:29:08 +02:00