Commit Graph

3985 Commits

Author SHA1 Message Date
cominch
f7160dae5c Merge remote-tracking branch 'original yacy/master' 2012-06-18 15:44:50 +02:00
cominch
e4555cbee3 Augmented browsing: Pass on additional action parameter 2012-06-18 15:44:01 +02:00
Michael Peter Christen
24bbe359ca also integrate geonames library files with fewer cities. These are more
useful for tagging since fewer normal words are falsely identified as
locations
2012-06-18 15:19:57 +02:00
Michael Peter Christen
5a41e739b4 better apilink description 2012-06-18 13:04:20 +02:00
Michael Peter Christen
e16e4bd2ba added ontology extraction in xml as api call for vocabularies 2012-06-18 13:02:12 +02:00
cominch
8cf47a8335 Merge remote-tracking branch 'original yacy/master' 2012-06-18 12:11:07 +02:00
cominch
b85f01a14e Augmented browsing: small UI fix 2012-06-18 12:01:03 +02:00
Michael Peter Christen
26cb1c65c2 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	source/net/yacy/document/importer/OAIPMHLoader.java
2012-06-17 23:50:44 +02:00
Michael Peter Christen
963f92ed9a - merged files
- changed behaviour of delete button in vocabulary edit
- fixed size number in vocabulary listing
2012-06-17 23:48:33 +02:00
cominch
d8815db877 Merge remote-tracking branch 'original yacy/master' 2012-06-17 23:07:00 +02:00
cominch
e4dab19045 Augmented Browsing: added template for document info bar 2012-06-17 23:05:53 +02:00
Michael Peter Christen
743b0ec89f - added size of vocabulary to vocabulary view
- fixed bad terms in vocabulary-from-titles autogeneration
2012-06-17 17:32:52 +02:00
Michael Peter Christen
22d5e33c5e added more methods to vocabulary generation: scrape document title and
document author to vocabulary
2012-06-17 14:53:16 +02:00
Michael Peter Christen
b2d1c25ebb removed warnings/unused entities 2012-06-17 11:22:08 +02:00
Michael Peter Christen
f1aa4c4390 - accept only location names with a minimum length
- remove comma from synonym terms
2012-06-17 10:15:26 +02:00
Michael Peter Christen
cc9ad7198a - use only names which consist of at least two parts
- remove word from derewo from locations
2012-06-17 01:12:31 +02:00
Michael Peter Christen
9264d8b4af removed old navigation practice using subject tags in favor of
triplestore-tags
2012-06-17 00:33:40 +02:00
Michael Peter Christen
eeb4fd8b8c refactoring (geolocalzation -> geolocation) 2012-06-16 22:09:32 +02:00
Michael Peter Christen
64c0268b2b show triplestore metadata in yacydoc and viewfile 2012-06-16 17:40:15 +02:00
Michael Peter Christen
c2f0d16d2c fixed vocabulary initialization 2012-06-16 13:12:02 +02:00
Michael Peter Christen
fbded1f466 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-16 12:42:43 +02:00
Michael Peter Christen
df3531f8d5 added the generation of virtual vocabularies using the pnd 2012-06-16 12:36:15 +02:00
Michael Peter Christen
e806106b10 jquery bugfix 2012-06-16 08:25:28 +02:00
Michael Peter Christen
a0f1decd82 - added loading of the dbpedia pnd triplestore in the dictionary loader
- renamed the dictionary loader to knowledge loader
- some refactoring in the library provider method names
2012-06-15 19:19:18 +02:00
Michael Peter Christen
6d17686258 made triplestore persistent by default
added a size display in triplestore servlet
2012-06-15 19:13:07 +02:00
Michael Peter Christen
8d6e77ad0c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-15 14:38:46 +02:00
cominch
2ac7a5c1f2 Augmented browsing: Add overlay bar which shows the vocabulary tags 2012-06-15 14:32:16 +02:00
Michael Peter Christen
777d22e145 renamed "augmented proxy" to "augmented browsing" 2012-06-15 11:54:33 +02:00
cominch
bddac2839e add missing files for tag display 2012-06-15 10:46:19 +02:00
cominch
441430f507 Merge remote-tracking branch 'original yacy/master' 2012-06-15 10:44:12 +02:00
cominch
3c255c025b Show tags in search results (if activated in ConfigPortal_p.html) 2012-06-15 10:43:05 +02:00
Michael Peter Christen
1f9120d189 create new vocabularies also without an objectspace. this creates an
empty vocabulary
2012-06-15 02:43:55 +02:00
Michael Peter Christen
a5cdfb91de - fixed Cache link (below snippet)
- added 'Augmented Proxy' link below snippet
- added configuration options for augmented proxy
2012-06-14 19:55:34 +02:00
Michael Peter Christen
492b3e09f2 added api icon to triplestore 2012-06-14 19:11:19 +02:00
Michael Peter Christen
16d8f33795 added objectlink generation to vocabulary generation and editor 2012-06-14 18:50:35 +02:00
Michael Peter Christen
f1f97b7c95 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-14 18:45:38 +02:00
Michael Peter Christen
b3eaaf5ebc check also delete triplestore by default 2012-06-14 18:14:45 +02:00
cominch
f2f07a11f1 hotfix for unresolved pattern 2012-06-14 18:05:10 +02:00
cominch
5fd1a15fcf hotfix until we have updated query routine for tags 2012-06-14 17:56:38 +02:00
cominch
f49d92d8da Cleanup of interaction class and helper routines 2012-06-14 17:41:45 +02:00
cominch
56b0115054 Triplestore: modify routines to access per user store 2012-06-14 15:44:27 +02:00
Michael Peter Christen
d45718251e refactoring (Localization -> Location) 2012-06-14 09:45:57 +02:00
Michael Peter Christen
b8b3c87ba7 - renamed localization to location (that was confusing)
- renamed 'Locale' navigator to 'Location'
- produce Location navigation only if geolocation libraries are loaded
2012-06-14 09:44:14 +02:00
sixcooler
f64e78497a fix for reload-feature in Crawler_p 2012-06-14 02:13:23 +02:00
Michael Peter Christen
e89747bb67 - added automated generation of vocabularies from url stubs
- added clear of all terms for vocabularies
- added deletion of vocabularies
2012-06-13 15:53:18 +02:00
Michael Peter Christen
79464189a4 The 'Locale' vocabulary, which is generated by geo data, has now the
objectspace "http://dbpedia.org/resource/"
2012-06-13 13:05:41 +02:00
Michael Peter Christen
eca38c53e7 added a vocabulary editor 2012-06-13 12:12:20 +02:00
Michael Peter Christen
80e8aaabc8 moved new servlets into one submenu "Content Semantic" 2012-06-12 02:12:01 +02:00
Michael Peter Christen
2bbb6c52cf added option to clean the triplestore when deleting the index 2012-06-12 01:54:36 +02:00
Michael Peter Christen
8b53771db2 changed behavior of navigation processing:
- vocabulary annotation is not done any more into the metadata of urldb
- vocabularies are written into the jena triplestore using a rdf
vocabulary
- vocabularies for rdf triples must be updated; refactoring done
- with the new navigation tags in the triplestore a faster
pre-urldb-lookup is possible: navigation is processed now within the RWI
during pre-ranking retrieval
- added also a Owl vocabulary stub to add the plain-text url to the
triplestore using the owl:sameas predicate
2012-06-11 23:49:30 +02:00
Michael Peter Christen
5fc6524ca8 - moved triple store to net.yacy.cora.lod (should be generalized there
later)
- added abstract add, delete, get methods in the triplestore
- added generation of triples after auto-annotation
- migrated all MultiProtocolURI objects to DigestURI in the parser since
the url hash is needed as subject value in the triples in the triple
store
2012-06-11 16:48:53 +02:00
cominch
c90f174799 preparation and generalization of augmented browsing methods 2012-06-11 09:23:44 +02:00
Roland 'Quix0r' Haeder
edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug
#143

Conflicts:
	htroot/Supporter.java
	htroot/yacy/crawlReceipt.java
	htroot/yacy/transferRWI.java
	htroot/yacy/transferURL.java
	source/de/anomic/crawler/CrawlStacker.java
	source/de/anomic/data/ListManager.java
	source/net/yacy/peers/Protocol.java
	source/net/yacy/repository/Blacklist.java
	source/net/yacy/repository/LoaderDispatcher.java
	source/net/yacy/search/Switchboard.java
	source/net/yacy/search/index/MetadataRepository.java
	source/net/yacy/search/index/Segment.java
	source/net/yacy/search/query/RWIProcess.java
	source/net/yacy/search/snippet/MediaSnippet.java
2012-06-11 00:17:30 +02:00
Roland 'Quix0r' Haeder
213f006bf1 One is okay ...
Conflicts:
	htroot/Trails.html
2012-06-10 23:40:07 +02:00
Roland 'Quix0r' Haeder
af5a597e47 Scroogle is not coming back, remove dead code
Conflicts:
	source/net/yacy/search/Switchboard.java
2012-06-10 23:38:41 +02:00
cominch
7a4dab6d1d - removed unused variables
- do not replace malformed or invalid URLs in urlproxy

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7835
6c8d7289-2bf4-0310-a012-ef5d649a1542

Conflicts:
	source/de/anomic/http/server/HTTPDFileHandler.java
2012-06-10 23:33:09 +02:00
Michael Peter Christen
90c6fc4b63 load all - but not the persistent local.rdf - triples from
DATA/TRIPLESTORE at startup time. The local.rdf is loaded only if the
persistent switch is on (as before).
2012-06-10 21:49:02 +02:00
Michael Peter Christen
a9eb40c160 fix for autocomplete in index.html 2012-06-10 14:44:37 +02:00
Michael Peter Christen
dd020a1a8a removed autocrawler and feedback servlet link since that was not
cherry-picked
2012-06-10 13:17:23 +02:00
cominch
aa0295917c augmentation
Conflicts:
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 13:10:21 +02:00
cominch
87a3fbb3c2 interaction javascript 2012-06-10 13:09:00 +02:00
cominch
ed2ea0f08e augmented browsing modification
Conflicts:
	htroot/interaction/OverlayInteraction.html
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 13:07:57 +02:00
cominch
d4802dc8d5 small change 2012-06-10 13:02:30 +02:00
cominch
a120ef660b RDF demo servlet 2012-06-10 13:02:11 +02:00
cominch
09a34cfe1b prepare RDF dump routines 2012-06-10 12:58:40 +02:00
cominch
300b235ce8 Updated Demo Servlet
Conflicts:
	htroot/About.html
	htroot/DemoServlet.html
	htroot/DemoServlet.java
	htroot/interaction/interaction.js
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:58:29 +02:00
cominch
90512640bf Added config switches for custom parser
Conflicts:
	source/net/yacy/document/TextParser.java
2012-06-10 12:49:36 +02:00
cominch
a12cbcba36 Add a global value store 2012-06-10 12:45:01 +02:00
cominch
e14f2881ae interaction: add special table interaction
Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:41:16 +02:00
cominch
4e4e7a99f8 interaction: add global variable store
Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:34:36 +02:00
cominch
bde07ed7a8 Add tagging overlay element
Conflicts:
	htroot/env/templates/jqueryheader.template
	htroot/yacysearchitem.java
	source/net/yacy/interaction/Interaction.java
2012-06-10 12:28:50 +02:00
cominch
bee3bee8f3 Small fix - return value of JSON should be empty 2012-06-10 12:20:13 +02:00
cominch
ff4ba3ee05 Small fix
Conflicts:
	htroot/yacysearchitem.java
2012-06-10 10:56:39 +02:00
cominch
f05e3968f7 Quick fix 2012-06-10 10:55:09 +02:00
cominch
e859481889 Add Triplestore settings functionality
Conflicts:
	htroot/env/templates/header.template
2012-06-10 10:55:00 +02:00
cominch
b0bc0b4572 Add new demonstration module for client-side key-value store (backend:
triplestore): /DemoServletInteraction.html

Conflicts:
	source/net/yacy/interaction/Interaction.java
2012-06-10 10:53:30 +02:00
cominch
c9dc6cda02 Demonstration: include value from interaction in search results
Conflicts:
	htroot/interaction/OverlayInteraction.html
	htroot/yacysearchitem.java
2012-06-10 10:51:53 +02:00
cominch
ae8adb0e58 Small changes 2012-06-10 10:44:16 +02:00
cominch
bcbd8eee33 Add several parsers, for RDFa and rdf files.
Conflicts:
	source/net/yacy/document/TextParser.java
2012-06-10 10:42:33 +02:00
cominch
9ef5a80f4e add interaction for triples and selector for augmented browsing
Conflicts:
	htroot/interaction/interaction.js
	source/net/yacy/interaction/Interaction.java
2012-06-10 10:38:54 +02:00
cominch
5d20cd324a Add Triplestore and RDF query interface
Conflicts:
	build.xml
	defaults/yacy.init
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:35:59 +02:00
cominch
bc9a618e0a augmented browsing: ignore js and css, integrate more user interaction
Conflicts:
	htroot/interaction/Footer.html
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:29:15 +02:00
cominch
9cbfc1a1c0 augmentedProxy, which forwards every proxy request to a
rewrite engine to customize existing webpages. originally implemented by
Florian Richter.

Conflicts:
	source/de/anomic/http/server/HTTPDProxyHandler.java
2012-06-10 10:15:34 +02:00
cominch
1626be7916 Add menu entries for urlproxy / augmented browsing 2012-06-10 09:59:30 +02:00
Michael Peter Christen
5b25272f40 added location search to main menu 2012-06-09 09:10:54 +02:00
Michael Peter Christen
ea0dceb55d bugfix: do not switch off standard memory strategy when performing a
forced GC
PLEASE CHECK if your peer has standard memory switched on!
2012-06-08 09:48:46 +02:00
Michael Peter Christen
dd14b19c26 lazy initialization of block rank table ... only normal web search uses
this. When interactive search or location search is used, the block rank
is switched off
2012-06-08 09:41:29 +02:00
Michael Peter Christen
701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	htroot/PerformanceMemory_p.java
2012-06-08 09:16:16 +02:00
Michael Peter Christen
ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass
result retrieval
2012-06-08 09:14:54 +02:00
Michael Peter Christen
10c9c17d51 fixed handlemap spread factor and null iterator handling 2012-06-08 09:13:41 +02:00
Michael Peter Christen
a61f44f9e4 lazy initialization of block rank table.
As a result, the table is not initialized when no search is done. The
effect is strongest if YaCy is started headless, which causes no browser
pop-up that would otherwise load the search page and therefore trigger
the initialization of the table.
2012-06-07 13:16:38 +02:00
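The lazy initialization described in a61f44f9e4 can be illustrated with a minimal sketch. This is not YaCy's block rank code: the class name `Lazy`, the `Supplier`-based loader, and the `isInitialized` helper are assumptions made for illustration only.

```java
import java.util.function.Supplier;

// Hypothetical illustration of lazy initialization; not taken from YaCy.
final class Lazy<T> {
    private final Supplier<T> loader; // builds the expensive value on demand
    private volatile T value;         // stays null until the first get()

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    // Returns the value, running the loader once on first use.
    // Assumes the loader never returns null.
    T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    v = loader.get();
                    value = v;
                }
            }
        }
        return v;
    }

    // True once the loader has run.
    boolean isInitialized() {
        return value != null;
    }

    public static void main(String[] args) {
        Lazy<String> table = new Lazy<>(() -> "expensive table");
        System.out.println(table.isInitialized()); // nothing loaded yet
        System.out.println(table.get());           // first use triggers the load
    }
}
```

With this shape, a process that never searches never pays for loading the table.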
Michael Peter Christen
c8bbd180e4 enhanced hint for debian package automatic update 2012-06-07 12:36:26 +02:00
Michael Peter Christen
9ad84c5e9f fix for NPE in PerformanceMemory 2012-06-07 12:36:05 +02:00
Michael Peter Christen
96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java
2012-06-06 20:13:28 +02:00
Michael Peter Christen
d7eb18cdf2 accept also file names beginning with "file://" for crawl start from
file.
2012-06-06 14:27:18 +02:00
Michael Peter Christen
3dd8376825 added automatic cleaning of cache if metadata and file database size are
not equal. These data may differ because one of the caches is cleaned
after a while or when it is too big. The metadata is not cleaned then,
but is now wiped after a checkup process at every application start.
This should cause a bit less memory usage.
2012-06-06 14:15:24 +02:00
Michael Peter Christen
d0ec8018f5 fixes for bad long computation 2012-06-06 14:13:31 +02:00
Michael Peter Christen
96c8119b50 added GeoLocation / GeoPoint classes which uses less memory than
Location/Coordinates and has initializers with correct order of lat,lon
coordinates
2012-06-06 12:57:42 +02:00
Michael Peter Christen
461a0ce052 removed warnings 2012-06-05 20:03:43 +02:00
Michael Peter Christen
62ae9bbfda allow more POIs, get more at once 2012-06-05 18:29:54 +02:00
Michael Peter Christen
a1fe65b115 performance hacks 2012-06-05 12:06:26 +02:00
Michael Peter Christen
2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-04 23:44:38 +02:00
Michael Peter Christen
5aee19daa4 added show from cache in search results (not yet finished) 2012-06-04 23:44:26 +02:00
Michael Peter Christen
e0d8643226 - performance hacks
- added log warnings in case that search processes run into time-out
situations
- better concurrency for Integer formatter (used a non-synchronized
formatter before)
- bugfix for search termination (a poison pill was missing)
- added timeout parameters for search (again) -> target is, that they
are never reached.
2012-06-04 15:37:39 +02:00
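The "poison pill" mentioned in e0d8643226 is a general queue termination technique. The sketch below shows the idea only; the class `PoisonPillDemo`, the `POISON` sentinel, and the `drain` method are hypothetical names and do not come from the YaCy search code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of poison-pill termination; not taken from YaCy.
final class PoisonPillDemo {
    // Sentinel object, compared by identity so it cannot collide with real data.
    static final String POISON = new String("POISON");

    // Takes items until the poison pill arrives, then returns what was collected.
    // Without the pill, take() would block forever once the producer is done.
    static List<String> drain(BlockingQueue<String> queue) {
        List<String> results = new ArrayList<>();
        try {
            while (true) {
                String item = queue.take();
                if (item == POISON) {
                    break;
                }
                results.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            queue.add("result-1");
            queue.add("result-2");
            queue.add(POISON); // signals that no more results will follow
        });
        producer.start();
        System.out.println(drain(queue));
        producer.join();
    }
}
```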
Michael Peter Christen
cf79b6cee3 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-01 08:32:26 +02:00
Michael Peter Christen
6e83b02b83 - bugfix for surrogate file reader
- bugfix for location search: suppress empty search
2012-06-01 00:08:31 +02:00
Michael Peter Christen
9b4c699526 enhanced location search:
- search request are now made using a map boundary
- search results are only computed for the map boundary
- the number of results is adapted to the results in the visible range
- added a double-buffering for the search result markers
- added a search query option for the search results:
/radius/<lat>/<lon>/<radius>
2012-05-31 22:39:53 +02:00
Michael Peter Christen
434af404c1 - added double-buffering for search layers
- added automatic zooming to search result
to location search
2012-05-31 14:05:36 +02:00
Michael Peter Christen
4d9b2dc487 automatically zoom to result layer bounds 2012-05-31 01:12:06 +02:00
Michael Peter Christen
6b40803adf - show number of results in map search interface
- transfer view radius within query
2012-05-31 00:47:52 +02:00
Michael Peter Christen
a8778e9c47 npe fix 2012-05-30 15:28:45 +02:00
Michael Peter Christen
1a6fab60e0 added node state to xml 2012-05-30 09:32:25 +02:00
Michael Peter Christen
20e0cc0822 fix for bad location evaluation 2012-05-29 14:46:13 +02:00
Michael Peter Christen
1ab3de0885 fixes to location search 2012-05-29 12:43:14 +02:00
Michael Peter Christen
f167a1c69f removed osmarender from yacysearch_location because that caused a
javascript error
2012-05-29 02:22:02 +02:00
Michael Peter Christen
71c3163f3d - fixes to node identification
- added link to node in network list
- added marking of portal search node peers
2012-05-29 01:38:54 +02:00
Michael Peter Christen
d1e9fe3db5 enhanced RootState icon 2012-05-29 00:06:33 +02:00
Michael Peter Christen
ad222be7f8 added node state icon in network list 2012-05-25 17:29:54 +02:00
Michael Peter Christen
638390930d another patch to fix the Crawler_p layout 2012-05-25 15:56:21 +02:00
Michael Peter Christen
c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
2012-05-25 01:45:38 +02:00
Michael Peter Christen
8b974905ee changed log-in text for all servlets with authentication:
- added hint how to set the password using a shell script
- added a shell script to change the password
2012-05-24 13:24:31 +02:00
Michael Peter Christen
16b21f7a5b Added more steering in Crawler_p.html interface 2012-05-23 18:00:37 +02:00
Michael Peter Christen
c15fcde1c8 add-on to latest commit 2012-05-21 17:52:30 +02:00
Michael Peter Christen
cf47d94888 performance hack to parse numbers inside of substrings without actually
generating a substring. This avoids the allocation of a String object
each time a substring is parsed. Should affect CPU load during RWI
transmission.
2012-05-21 13:40:46 +02:00
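The idea in cf47d94888, parsing a number inside a larger string without creating a substring, might look roughly like the following. This is a sketch under assumptions, not the code from the commit: the class `SubstringNumbers` and the signature `parseInt(CharSequence, int, int)` are hypothetical.

```java
// Hypothetical illustration of substring-free number parsing; not taken from YaCy.
final class SubstringNumbers {

    // Parses the decimal integer in s[start, end) without calling substring(),
    // so no temporary String is allocated for the slice.
    static int parseInt(CharSequence s, int start, int end) {
        if (start < 0 || end > s.length() || start >= end) {
            throw new NumberFormatException("empty or invalid range");
        }
        int i = start;
        boolean negative = s.charAt(i) == '-';
        if (negative) {
            i++;
            if (i == end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        long value = 0; // long accumulator makes the overflow check simple
        for (; i < end; i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            value = value * 10 + digit;
            if (value > 2147483648L) {
                throw new NumberFormatException("out of int range");
            }
        }
        if (negative) {
            value = -value;
        }
        if (value > Integer.MAX_VALUE) {
            throw new NumberFormatException("out of int range");
        }
        return (int) value;
    }

    public static void main(String[] args) {
        String line = "rwi:4711;x";
        // Same result as Integer.parseInt(line.substring(4, 8)), minus the allocation.
        System.out.println(parseInt(line, 4, 8));
    }
}
```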
Michael Peter Christen
7bf421b9dd - fixed image search page navigation
- removed some deadlocks and ConcurrentModificationExceptions during
DidYouMean collection
2012-05-21 01:58:29 +02:00
reger
6696cb1313 bugfix: lookup of peernames no result for active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
SeedDB.lookupByName searches for lowercase peerNames, while MapColumnIndex.getIndex uses peername as is in the keyset.
Changed the index init to insert lowercase peer names as key
2012-05-20 05:25:16 +02:00
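The mismatch fixed in 6696cb1313 (lowercase lookup against keys stored as entered) is avoided when both insert and lookup go through the same normalization. The sketch below is a simplified stand-in: `NameIndex` and its methods are hypothetical and do not reproduce SeedDB or MapColumnIndex.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of a case-insensitive name index; not taken from YaCy.
final class NameIndex {
    private final Map<String, String> hashByName = new HashMap<>();

    // Single normalization point used by both put and lookup.
    private static String key(String peerName) {
        return peerName.toLowerCase(Locale.ROOT);
    }

    void put(String peerName, String peerHash) {
        hashByName.put(key(peerName), peerHash);
    }

    // Returns the stored hash, or null if the name is unknown.
    String lookupByName(String peerName) {
        return hashByName.get(key(peerName));
    }

    public static void main(String[] args) {
        NameIndex index = new NameIndex();
        index.put("MyPeer", "hash-1");
        System.out.println(index.lookupByName("mypeer"));
    }
}
```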
Michael Peter Christen
4298f00d2d fixed bad usage of given words 2012-05-20 01:35:49 +02:00
Michael Peter Christen
0d32a766ed relax verify attribute for search widget to make it faster:
set to "cacheonly"
2012-05-20 00:50:54 +02:00
reger
ae335a4190 bugfix Tables_p for edit and delete selected row (correction to use "pk_" html prefix) 2012-05-19 23:15:29 +02:00
Michael Peter Christen
f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
tried to make a bit less 'noise' to dns server

also included: less processes in snippet fetch to reduce load during
search on small computers
2012-05-19 01:06:33 +02:00
Michael Peter Christen
1473e2258e fix for http://bugs.yacy.net/view.php?id=154 2012-05-18 23:56:40 +02:00
Michael Peter Christen
3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-17 13:58:09 +02:00
Roland 'Quix0r' Haeder
fbb946f913 Made a method static (Eclipse suggested it), removed unused import, pk=null check does now output a warning in logfile 2012-05-17 05:55:44 +02:00
Roland 'Quix0r' Haeder
5f983faef9 No &amp; in JavaScript-embedded URLs, added ability to stop focus in
ConfigPortal.html preview (is this not secured with _p????)

Conflicts:
	htroot/yacyinteractive.java
	htroot/yacysearch.java
2012-05-17 05:49:25 +02:00
Michael Peter Christen
5b3acc12cd Pattern.quote() replaces \\Q and \\E according to publication in
http://www.cs.washington.edu/homes/mernst/pubs/regex-types-ftfjp2012.pdf
2012-05-17 03:55:10 +02:00
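For context on 5b3acc12cd: `java.util.regex.Pattern.quote()` turns an arbitrary string into a regex that matches it literally, which is safer than wrapping the string in `\Q` and `\E` by hand when the string may itself contain `\E`. The helper `containsLiteral` and the class `QuoteDemo` below are hypothetical names used only for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of Pattern.quote(); not taken from YaCy.
final class QuoteDemo {

    // True if text contains literal exactly as written, with no
    // regex metacharacters in literal being interpreted.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        // Unquoted, the dot matches any character.
        System.out.println(Pattern.compile("1.50").matcher("1x50").find());
        // Quoted, only a real dot matches.
        System.out.println(containsLiteral("1x50", "1.50"));
        System.out.println(containsLiteral("price: 1.50 (net)", "1.50 (net)"));
    }
}
```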
Michael Peter Christen
89142d1e8d removed (not all) warnings 2012-05-16 13:42:32 +02:00
Michael Peter Christen
ffa4553229 typo 2012-05-16 10:14:44 +02:00
Michael Peter Christen
5deebd02ea added serialization 2012-05-15 23:10:47 +02:00
reger
b2175ea4ef Add possibility to set custom Solr field names for the YaCy default Solr attributes.
- Changing the format of YaCy's solr.key.list while maintaining backward compatibility
  Federated index config screens adjusted accordingly
- modified the Solr update request to use a 3 min Solr autocommit interval
2012-05-15 22:34:02 +02:00
Michael Peter Christen
0d58fea210 made multiple connector default 2012-05-12 10:39:01 +02:00
Michael Peter Christen
8864141872 more abstraction in solr connection classes 2012-05-09 17:00:56 +02:00
Michael Peter Christen
c00efc2717 made the solr connection more generic 2012-05-09 16:46:45 +02:00
Michael Peter Christen
f130ab39e8 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-08 22:55:52 +02:00
Marc Nause
a691023d04 *) better formatting for network QPM
*) refactoring
2012-05-08 20:07:34 +02:00
Michael Peter Christen
dcccbe0be8 removed superfluous column 2012-05-07 00:54:00 +02:00
Michael Peter Christen
77f8e9fb9b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-05-04 17:29:16 +02:00
Michael Peter Christen
ba6aaabc51 refactoring + parser bugfixes 2012-05-04 17:28:27 +02:00
Michael Peter Christen
a18b6dee04 Merge remote branch 'bbyacy-rc1/master' 2012-04-29 21:22:23 +02:00
reger
ea932f841c changed link to opensearchdescription document to an absolute URI (in yacysearch.html and yacysearch.rss)
see http://www.opensearch.org/Specifications/OpenSearch/1.1/Draft_5#The_.22Description.22_element
2012-04-29 05:50:56 +02:00
Michael Peter Christen
453010bd68 - solved problems with backpath normalization
- redesigned in/outbound link handover
- removed iframe links from inbound/outbound in solr scheme
2012-04-27 16:48:51 +02:00
Michael Peter Christen
5f5ed33ed8 patch for media search (audio, video apps) 2012-04-27 14:18:02 +02:00
Michael Peter Christen
0e13022147 - enhanced solr field documentation
- added xml api button to IndexFederated_p - the solr schema.xml file
can be generated by YaCy
2012-04-26 15:25:07 +02:00
Michael Peter Christen
08dcf3e5d1 hack to get all results if the actual number is between 10 and 64 2012-04-26 00:27:21 +02:00
Michael Peter Christen
19efbf1b0f - apply directDocByURL to NOLOAD Queue
- choose pushing to NOLOAD as default for site crawl
2012-04-26 00:23:18 +02:00
Michael Peter Christen
5c66880be2 fix for search result selection in case that contentdom is not set 2012-04-26 00:04:23 +02:00
Michael Peter Christen
3bea25c513 increased image preview size 2012-04-24 16:04:13 +02:00
Michael Peter Christen
a3badd3205 changed search process for images: no more media snippet load process,
show only links from index which had been on the text search page
before. This creates a superfast search process for images!
2012-04-24 12:55:58 +02:00
Michael Peter Christen
4aa0eedead one more scroogle... 2012-04-24 12:05:37 +02:00
Michael Peter Christen
347612ddd4 removed scroogle parser 2012-04-24 12:04:44 +02:00
Michael Peter Christen
f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
only links where the content can be parsed. All non-parseable links are
placed into the noload queue. The search process must therefore be able
to filter out non-text search results.
- This fixes the problem that image search results appeared in the text
search.
- The interactive search can retrieve now ALL types of links
- The p2p interface is now extended to retrieve only certain types of
links (text, image, video, apps)
- The search process has an extension to filter the right document type
according to the search query
2012-04-22 02:05:17 +02:00
Michael Peter Christen
14f67f217c refactoring of ContentDomain: now subclass of Classification 2012-04-22 00:04:36 +02:00
Michael Peter Christen
a5d7da68a0 refactoring: removed dependency from switchboard in Balancer/CrawlQueues 2012-04-21 13:47:48 +02:00
Michael Peter Christen
33d1062c79 refactoring: the cache belongs to the crawler 2012-04-21 13:34:07 +02:00
Michael Peter Christen
8429967ea7 no more SVN 2012-04-19 13:29:08 +02:00
Michael Peter Christen
0466bb0ddf no more SVN.. 2012-04-19 13:28:12 +02:00
Michael Peter Christen
4844e124b1 one more warning in case that crawling is paused because of low disk
space
2012-04-19 12:35:11 +02:00
Michael Peter Christen
0ec2713af8 'download' 2012-04-19 11:50:24 +02:00
Michael Peter Christen
f30c577fdb add hint to speed up search results 2012-04-19 11:11:14 +02:00
Michael Peter Christen
6b133de3e9 add hint for consulting support 2012-04-19 11:10:48 +02:00
Michael Peter Christen
eb2c8ffa62 display is not used any more 2012-04-17 12:30:14 +02:00
Michael Peter Christen
91a86f0b06 fixed to network graph testing 2012-04-17 11:46:14 +02:00
Michael Peter Christen
f31ad84d98 automatic generation of blacklist pattern, see
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2685&p=25305#p25305
2012-04-17 11:22:19 +02:00
Michael Peter Christen
7b5b9baee0 added citation rank to ranking profile 2012-04-16 23:43:50 +02:00
reger
06951ef751 remove heuristic scroogle from search option help text in index.html 2012-04-16 04:00:04 +02:00
Michael Peter Christen
e377092198 fix to xml output format 2012-04-13 09:02:18 +02:00
Michael Christen
41be98dc9d extended webstructure api to show together with incoming links also
outgoing links
2012-04-13 11:53:34 +02:00
Michael Christen
8f89c8ef07 added information about inbound, outbound and citation links into
yacydoc api servlet
2012-03-31 07:38:49 +02:00
Michael Christen
71649a1296 added an api to retrieve the new citation.index with the
webstructure.xml api. This api will respond with details about a single
URL if requested with 'webstructure.xml?about=[url|urlhash|host]'.
2012-03-29 17:22:31 +02:00
Lotus
3e61287326 some better feedback on properties change 2012-03-25 22:21:42 +02:00
Lotus
96ac95cff9 added hint how to change integration options 2012-03-23 17:02:50 +01:00
Thomas
4f61b8fd82 Fixes for compare-search 2012-03-21 21:43:47 +01:00
Thomas
e0680de7b3 Remove Scroogle from compare-search, Scroogle is dead 2012-03-20 23:00:06 +01:00
Lotus
78f0d8f046 no focus on preview frames for search integration
fixes bug http://bugs.yacy.net/view.php?id=161
2012-03-17 21:10:29 +01:00
Lotus
7792ac6406 fix links & bug #163 2012-03-10 10:59:56 +01:00
Michael Peter Christen
532c7cf827 added physics experiment to the graph plotter. not active by default 2012-02-28 13:18:46 +01:00
Michael Peter Christen
aba9b1bfa0 better names for elements of a linked graph 2012-02-27 21:27:17 +01:00
Michael Peter Christen
2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method.
See
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html

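and the following test program:

import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueLengthTimeTest {

    // Adds c elements to q, calling q.size() before every add, and
    // returns the elapsed time in milliseconds.
    public static long countTest(Queue<Integer> q, int c) {
        long t = System.currentTimeMillis();
        for (int i = 0; i < c; i++) {
            q.add(q.size());
        }
        return System.currentTimeMillis() - t;
    }

    public static void main(String[] args) {
        int c = 1;
        // c doubles every round; interrupt the run once the trend is
        // visible, since c overflows int long before 100 rounds.
        for (int i = 0; i < 100; i++) {
            Runtime.getRuntime().gc();
            long t1 = countTest(new ArrayBlockingQueue<Integer>(c), c);
            Runtime.getRuntime().gc();
            long t2 = countTest(new LinkedBlockingQueue<Integer>(), c);
            Runtime.getRuntime().gc();
            long t3 = countTest(new ConcurrentLinkedQueue<Integer>(), c);

            System.out.println("count = " + c + ": ArrayBlockingQueue = " + t1 + ", LinkedBlockingQueue = " + t2 + ", ConcurrentLinkedQueue = " + t3);
            c = c * 2;
        }
    }
}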
2012-02-27 00:42:32 +01:00
Michael Peter Christen
8aba045ba1 if a new pop-up page is set in config portal, then this page applies
also to the default page configuration for the httpd if no path is
given.
2012-02-26 20:53:32 +01:00
Michael Peter Christen
fa7b3481b3 better navigation in file search: less results by first try, but much
faster. after the first search is done, buttons appear to get more
results for the same search
2012-02-26 17:32:45 +01:00
Michael Peter Christen
8c06925984 animation of the web structure picture 2012-02-25 15:42:29 +01:00
Michael Peter Christen
99c74699de removed scroogle (scroogle is dead) 2012-02-25 12:57:59 +01:00
Michael Peter Christen
6e51a00a2f Revert "fix for page navigation: show only as much pages as are available for given navigation constraints, not as given by total results size"
This reverts commit 73f5a9e8b3.
2012-02-24 02:46:56 +01:00
Michael Peter Christen
73f5a9e8b3 fix for page navigation: show only as much pages as are available for
given navigation constraints, not as given by total results size
2012-02-24 02:31:03 +01:00
Michael Peter Christen
9c51dc0f13 fixed a bug with navigation: if a navigation was applied to file type or
protocol, then it was not possible to remove that again. This is the fix
for that.
2012-02-24 02:28:40 +01:00
Michael Peter Christen
8bfc987374 enhanced hint how to enter file:// urls 2012-02-24 02:14:54 +01:00
Michael Peter Christen
c6c61be3f0 fix for http://bugs.yacy.net/view.php?id=148 2012-02-24 00:38:57 +01:00
Michael Peter Christen
edaa8ac94c Merge commit 'e15e633a0128b8d31011283a65b4ef26a6dddcd8' 2012-02-23 10:07:13 +01:00
reger
e15e633a01 Bugfix for IE9 (doesn't accept html form within form)
changes of API schedule row data changed form input form to unique field names
using row pk.
Fix for issue 96 http://bugs.yacy.net/view.php?id=96

IE9-64bit doesn't interpret iframe with align parameter as desired
misaligns following content (in CrawlProfileEditor_p.html)
2012-02-23 02:40:07 +01:00
Michael Peter Christen
a9b4d49b75 removed debug output 2012-02-21 22:31:14 +01:00
Michael Peter Christen
8d63a5887c bugfixes 2012-02-02 23:38:23 +01:00