Michael Peter Christen
62ae9bbfda
allow more POIs, get more at once
2012-06-05 18:29:54 +02:00
Michael Peter Christen
407fdf6968
more bug fixes and performance hacks for search process
2012-06-05 15:04:23 +02:00
Michael Peter Christen
a1fe65b115
performance hacks
2012-06-05 12:06:26 +02:00
Michael Peter Christen
2fe207f813
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-06-04 23:44:38 +02:00
Michael Peter Christen
5aee19daa4
added show from cache in search results (not yet finished)
2012-06-04 23:44:26 +02:00
Michael Peter Christen
5e562dcdb7
adapted vocabulary usage within annotation/navigation feature of search
to new SimpleVocabulary class
2012-06-04 23:43:30 +02:00
Michael Peter Christen
514700291a
moved Vocabulary to cora package (added in git 964406ad17)
2012-06-04 23:41:36 +02:00
Michael Peter Christen
0284a4d88f
more fixes for double precision of coordinates
2012-06-04 23:37:41 +02:00
Michael Peter Christen
964406ad17
added concurrency enhancement to xml parser
2012-06-04 23:35:56 +02:00
Michael Peter Christen
240045cf7c
fix for bad distance computation
2012-06-04 16:33:16 +02:00
Michael Peter Christen
e0d8643226
- performance hacks
- added log warnings for when search processes run into time-outs
- better concurrency for Integer formatter (used a non-synchronized
formatter before)
- bugfix for search termination (a poison pill was missing)
- added timeout parameters for search (again); the goal is that they
are never reached.
2012-06-04 15:37:39 +02:00
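Note on the poison-pill bugfix in the commit above: a consumer thread that blocks on a queue only terminates if the producer enqueues a final sentinel entry. The sketch below illustrates the general technique only; the class PoisonPillDemo, the method drain and the POISON sentinel are hypothetical names and are not taken from the YaCy source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration of the poison-pill pattern; not YaCy code. */
public final class PoisonPillDemo {

    // Unique sentinel object; compared by identity, so a regular item
    // with the same text is never mistaken for the pill.
    private static final String POISON = new String("POISON");

    /** Feeds all items through a queue to a consumer thread and returns what it consumed. */
    public static List<String> drain(List<String> items) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final List<String> consumed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                String s;
                // take() blocks; without the pill this loop would never end
                while ((s = queue.take()) != POISON) {
                    consumed.add(s);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String item : items) {
                queue.put(item);
            }
            queue.put(POISON);   // the step that is easy to forget
            consumer.join();     // join() makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return consumed;
    }
}
```

If the queue.put(POISON) line is left out, consumer.join() blocks forever, which is the kind of hang a missing poison pill produces.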
Michael Peter Christen
7a329465b3
using pre-compiled pattern in blacklist; should enhance search speed
2012-06-04 15:34:53 +02:00
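Note on the pre-compiled pattern in the commit above: compiling a regular expression once and reusing the Pattern object avoids re-parsing the expression on every match. A minimal sketch, assuming a hypothetical BlacklistEntry class (the actual blacklist classes in YaCy are not shown in this log):

```java
import java.util.regex.Pattern;

/** Hypothetical blacklist entry that compiles its expression only once. */
public final class BlacklistEntry {

    private final Pattern pattern; // compiled in the constructor, reused for every check

    public BlacklistEntry(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /** Returns true if the whole path matches the entry's expression. */
    public boolean matches(String path) {
        // String.matches(regex) would recompile the expression on each call
        return pattern.matcher(path).matches();
    }
}
```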
Michael Peter Christen
6e83b02b83
- bugfix for surrogate file reader
- bugfix for location search: suppress empty search
2012-06-01 00:08:31 +02:00
Michael Peter Christen
9b4c699526
enhanced location search:
- search requests are now made using a map boundary
- search results are only computed for the map boundary
- the number of results is adapted to the results in the visible range
- added a double-buffering for the search result markers
- added a search query option for the search results:
/radius/<lat>/<lon>/<radius>
2012-05-31 22:39:53 +02:00
Michael Peter Christen
834dc6b263
store more data from interface access
2012-05-31 00:47:07 +02:00
Michael Peter Christen
1f48d1528b
performance hacks
2012-05-31 00:46:30 +02:00
Michael Peter Christen
c70aaccdc9
better location to generate a guid for rss messages
2012-05-30 17:14:25 +02:00
Michael Peter Christen
10da7335ea
performance hack: use a hash cache for all hashes that are computed from a
byte array. If this hash is used in a HashMap (which is very often the
case) then this hack eliminates a lot of re-computations of the same
hash.
2012-05-30 16:59:13 +02:00
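Note on the hash cache in the commit above: the idea can be sketched as a key object that computes the hash of its byte array on first use and stores it. CachedHashKey is a hypothetical name for illustration; how YaCy implements the cache is not shown in this log.

```java
import java.util.Arrays;

/** Hypothetical byte-array key that caches its hash code; not YaCy code. */
public final class CachedHashKey {

    private final byte[] bytes;
    private int hash; // 0 means "not computed yet" (same idiom as java.lang.String)

    public CachedHashKey(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy so the cached hash stays valid
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // computed once; a real hash of 0 is simply recomputed each time
            h = Arrays.hashCode(bytes);
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedHashKey
                && Arrays.equals(bytes, ((CachedHashKey) other).bytes);
    }
}
```

A HashMap calls hashCode() on every get and put, so a key that is looked up repeatedly pays for the array scan only once.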
Michael Peter Christen
f8a0cf6d7c
RSSMessages do not need a concurrent hash map -> removed overhead
2012-05-30 16:44:03 +02:00
Michael Peter Christen
07ca7e4dd1
enhanced RSS parsing by ensuring that it is parsed with a buffered input
stream
2012-05-30 16:40:37 +02:00
Michael Peter Christen
7c1feefb28
introduced a default 10 second time-out in rwi normalization time
during search process to prevent endless deadlocks after a very long
running search
2012-05-30 16:26:05 +02:00
Michael Peter Christen
8d997d55b6
better logging
2012-05-30 15:47:35 +02:00
Michael Peter Christen
65d37e6a20
only ASCII needed in seed bitflags
2012-05-30 15:42:28 +02:00
Michael Peter Christen
0f82fb3628
using double instead of float for a better release ordering
2012-05-30 15:28:20 +02:00
Michael Peter Christen
43c2c6e588
better logging
2012-05-30 15:27:45 +02:00
sixcooler
56087c1f23
bump to httpclient-, httpcore-, httpmime- 4.2
2012-05-30 14:46:21 +02:00
Michael Peter Christen
20e0cc0822
fix for bad location evaluation
2012-05-29 14:46:13 +02:00
Michael Peter Christen
71c3163f3d
- fixes to node identification
- added link to node in network list
- added marking of portal search node peers
2012-05-29 01:38:54 +02:00
Michael Peter Christen
4d3cc02168
replaced old bzip2 library with better documented commons-compress
package from http://commons.apache.org/compress/
2012-05-28 23:53:48 +02:00
Michael Peter Christen
ad222be7f8
added node state icon in network list
2012-05-25 17:29:54 +02:00
Michael Peter Christen
eff7667554
fix for http://bugs.yacy.net/view.php?id=188
2012-05-25 16:21:44 +02:00
Michael Peter Christen
3c2bec681f
added a root node flag: identifies peers with short ping time
2012-05-25 15:33:02 +02:00
Michael Peter Christen
c846e9ca14
redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
2012-05-25 01:45:38 +02:00
Michael Peter Christen
8b974905ee
changed log-in text for all servlets with authentication:
- added hint how to set the password using a shell script
- added a shell script to change the password
2012-05-24 13:24:31 +02:00
Michael Peter Christen
16b21f7a5b
Added more steering in Crawler_p.html interface
2012-05-23 18:00:37 +02:00
Michael Peter Christen
acc19e190d
hack against 100% cpu during crawl delete
2012-05-23 15:45:07 +02:00
Michael Peter Christen
c15fcde1c8
add-on to latest commit
2012-05-21 17:52:30 +02:00
Michael Peter Christen
cf47d94888
performance hack to parse numbers inside of substrings without actually
generating a substring. This avoids the allocation of a String object
each time a substring is parsed. Should affect CPU load during RWI
transmission.
2012-05-21 13:40:46 +02:00
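Note on the substring-free number parsing in the commit above: the digits can be read directly from the enclosing string by index range. The following is a minimal sketch under that assumption; SubstringNumbers and its parseInt method are hypothetical and do not reproduce the YaCy implementation.

```java
/** Hypothetical helper that parses an int from a character range; not YaCy code. */
public final class SubstringNumbers {

    /**
     * Parses the decimal number in s[start, end) without creating a substring.
     * Sketch only: overflow beyond the int range is not detected.
     */
    public static int parseInt(String s, int start, int end) {
        if (start >= end) {
            throw new NumberFormatException("empty range");
        }
        int i = start;
        boolean negative = s.charAt(i) == '-';
        if (negative) {
            i++;
            if (i >= end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        int result = 0;
        for (; i < end; i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            result = result * 10 + digit;
        }
        return negative ? -result : result;
    }
}
```

Usage: SubstringNumbers.parseInt("id=1234;", 3, 7) reads the digits at indexes 3 to 6 in place. Since Java 9 the standard library offers Integer.parseInt(CharSequence, int, int, int) for the same purpose; it did not exist when this commit was made.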
Michael Peter Christen
7e0ddbd275
added a "fromCache" flag in Response object to omit one cache.has()
check during snippet generation. This should cause less blocking
2012-05-21 03:03:47 +02:00
Michael Peter Christen
81737dcb18
removed stack trace from swf parser since we can't do anything there
2012-05-21 02:27:06 +02:00
Michael Peter Christen
7bf421b9dd
- fixed image search page navigation
- removed some deadlocks and ConcurrentModificationExceptions during
DidYouMean collection
2012-05-21 01:58:29 +02:00
Michael Peter Christen
125d47b3c1
added more interruptions in DidYouMean because that was the cause for
some blockings during search
2012-05-21 00:59:41 +02:00
Michael Peter Christen
c6a09eab0b
synchronization needed
2012-05-21 00:58:29 +02:00
Michael Peter Christen
fb94b47b1a
changed queue sizes to have less memory occupied during indexing
2012-05-21 00:19:03 +02:00
Michael Peter Christen
76157dc2c3
bugfix for http://bugs.yacy.net/view.php?id=173
2012-05-21 00:18:00 +02:00
reger
6696cb1313
bugfix: lookup of peer names returned no result for active peer in page IndexControlRWIs_p.html -> Transfer RWI to other Peer
SeedDB.lookupByName searches for lowercase peer names, while MapColumnIndex.getIndex uses the peer name as-is in the keyset.
Changed the index init to insert lowercase peer names as keys
2012-05-20 05:25:16 +02:00
Michael Peter Christen
c6558cba08
more classification bugs
2012-05-20 02:59:47 +02:00
Michael Peter Christen
082831b9d6
search contentdom was checked in wrong way - fixed
2012-05-20 01:23:02 +02:00
reger
ee553d971e
correct typo in scripts_txt comment
2012-05-19 02:09:16 +02:00
Michael Peter Christen
f294f2e295
bugfix to http://bugs.yacy.net/view.php?id=181
tried to make a bit less 'noise' to dns server
also included: less processes in snippet fetch to reduce load during
search on small computers
2012-05-19 01:06:33 +02:00
Michael Peter Christen
acf8d521a2
fix for http://bugs.yacy.net/view.php?id=126
2012-05-19 00:21:03 +02:00
Michael Peter Christen
bb88878b4d
the last commit was incomplete..
2012-05-18 22:33:16 +02:00
Michael Peter Christen
d320a31ae1
bugfix for http://bugs.yacy.net/view.php?id=186
2012-05-18 22:18:47 +02:00
Michael Peter Christen
fa735f4f04
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-05-17 23:40:08 +02:00
Michael Peter Christen
3e1bc9477f
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-05-17 13:58:09 +02:00
Michael Peter Christen
6f8a2fef1f
small speed enhancement using a column factory
2012-05-17 11:08:48 +02:00
Roland 'Quix0r' Haeder
d10627d591
More sync in close() methods
Conflicts:
source/net/yacy/kelondro/logging/GuiHandler.java
source/net/yacy/kelondro/workflow/InstantBusyThread.java
2012-05-17 06:03:18 +02:00
Roland 'Quix0r' Haeder
b3ae2aa41f
With or without 'final'? At least please try it in other methods
Conflicts:
source/de/anomic/tools/tarTools.java
2012-05-17 06:00:49 +02:00
Roland 'Quix0r' Haeder
fbb946f913
Made a method static (Eclipse suggested it), removed unused import, pk=null check does now output a warning in logfile
2012-05-17 05:55:44 +02:00
Michael Peter Christen
52d307c735
prevent the snippet fetch process from removing catchall entries
2012-05-17 05:18:52 +02:00
Michael Peter Christen
7eece0256f
moved yacy.logging to defaults according to request in
http://bugs.yacy.net/view.php?id=55
2012-05-17 04:26:03 +02:00
Michael Peter Christen
5b3acc12cd
Pattern.quote() replaces \\Q and \\E according to publication in
http://www.cs.washington.edu/homes/mernst/pubs/regex-types-ftfjp2012.pdf
2012-05-17 03:55:10 +02:00
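Note on Pattern.quote() in the commit above: wrapping a string manually in \Q and \E breaks as soon as the string itself contains \E, while java.util.regex.Pattern.quote() escapes that case. A minimal sketch; the class QuoteDemo and the method containsLiteral are hypothetical names for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical helper showing literal matching with Pattern.quote(); not YaCy code. */
public final class QuoteDemo {

    /** Returns true if text contains literal, treating literal as plain characters. */
    public static boolean containsLiteral(String text, String literal) {
        // Pattern.quote() also handles a literal that contains "\E",
        // which a hand-written "\\Q" + literal + "\\E" would not
        Pattern p = Pattern.compile(Pattern.quote(literal));
        return p.matcher(text).find();
    }
}
```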
Michael Peter Christen
89142d1e8d
removed (not all) warnings
2012-05-16 13:42:32 +02:00
Michael Peter Christen
5deebd02ea
added serialization
2012-05-15 23:10:47 +02:00
reger
b2175ea4ef
Add possibility to set custom Solr field names for the YaCy default Solr attributes.
- Changing the format of YaCy's solr.key.list while maintaining backward compatibility
Federated index config screens adjusted accordingly
- modified the Solr update request to use a 3 min Solr autocommit interval
2012-05-15 22:34:02 +02:00
Michael Peter Christen
15db703808
added missing serialization to remove all warnings
2012-05-15 13:13:07 +02:00
Michael Peter Christen
1795a7325b
made HandleSet serializable
2012-05-15 12:55:15 +02:00
Michael Peter Christen
e7e381d110
added configuration to switch off redirection following in crawler
2012-05-15 12:25:46 +02:00
Michael Peter Christen
2717c1b749
fixed bug in solr interface
2012-05-15 12:25:14 +02:00
Michael Peter Christen
70505107ca
enhanced crawler/balancer: better remaining waiting-time guessing
2012-05-15 12:24:54 +02:00
Michael Peter Christen
f150bc218b
fixed bug in solr error document
2012-05-14 14:56:21 +02:00
Michael Peter Christen
cb54c1737b
solrj connector bugfix
2012-05-14 11:56:04 +02:00
Roland 'Quix0r' Haeder
a093ccf5eb
Now used synchronization in all close() methods to make sure all objects
are 'closed' in an ordered way
Conflicts:
source/de/anomic/http/server/ChunkedInputStream.java
source/de/anomic/http/server/ChunkedOutputStream.java
source/de/anomic/http/server/ContentLengthInputStream.java
source/net/yacy/cora/protocol/Domains.java
source/net/yacy/cora/services/federated/solr/SolrShardingConnector.java
source/net/yacy/cora/services/federated/solr/SolrSingleConnector.java
source/net/yacy/document/content/dao/PhpBB3Dao.java
source/net/yacy/document/parser/html/AbstractTransformer.java
source/net/yacy/kelondro/blob/BEncodedHeap.java
source/net/yacy/kelondro/blob/HeapReader.java
source/net/yacy/kelondro/index/RAMIndexCluster.java
source/net/yacy/kelondro/io/ByteCountInputStream.java
source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java
source/net/yacy/kelondro/table/SQLTable.java
2012-05-14 07:41:55 +02:00
Michael Peter Christen
49cab2b85f
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-05-13 09:51:06 +02:00
Michael Peter Christen
0d58fea210
made multiple connector default
2012-05-12 10:39:01 +02:00
Michael Peter Christen
7740c02c56
- enhanced the solr connector
- added new multiple connector (to replace singleConnector)
2012-05-12 10:32:42 +02:00
Michael Peter Christen
0cf3d36eae
more tolerance in case of corrupted file
2012-05-11 20:46:50 +02:00
Michael Peter Christen
acc6db28ff
added missing classes for solr interface
2012-05-09 23:43:12 +02:00
Michael Peter Christen
adeb33bb36
better abstraction for solr objects
2012-05-09 17:21:19 +02:00
Michael Peter Christen
8864141872
more abstraction in solr connection classes
2012-05-09 17:00:56 +02:00
Michael Peter Christen
c00efc2717
made the solr connection more generic
2012-05-09 16:46:45 +02:00
Michael Peter Christen
ea2bd43b28
patch for broken configurations
2012-05-09 12:29:07 +02:00
Michael Peter Christen
e5ca7f22b1
enhancement in circle drawing
2012-05-09 12:28:43 +02:00
Michael Peter Christen
34f4225d7e
less 'wellformed' calls without asserts
2012-05-08 23:24:39 +02:00
Marc Nause
a691023d04
*) better formatting for network QPM
*) refactoring
2012-05-08 20:07:34 +02:00
Michael Peter Christen
77f8e9fb9b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-05-04 17:29:16 +02:00
Michael Peter Christen
ba6aaabc51
refactoring + parser bugfixes
2012-05-04 17:28:27 +02:00
Michael Peter Christen
2a0434efa4
Merge commit 'c1f6b4fb5226d3d2f8b2bec9e361f6b3476e03ff'
2012-04-29 21:21:49 +02:00
Michael Peter Christen
942896fe46
removed methods not supported by new solrj connector for httpclient 4
Error was:
java.lang.UnsupportedOperationException: Client was created outside of HttpSolrServer
at org.apache.solr.client.solrj.impl.HttpSolrServer.setDefaultMaxConnectionsPerHost(HttpSolrServer.java:614)
at net.yacy.cora.services.federated.solr.SolrSingleConnector.<init>(SolrSingleConnector.java:128)
at net.yacy.cora.services.federated.solr.SolrShardingConnector.<init>(SolrShardingConnector.java:55)
at net.yacy.search.Switchboard.<init>(Switchboard.java:657)
at net.yacy.yacy.startup(yacy.java:222)
at net.yacy.yacy.main(yacy.java:1018)
2012-04-27 18:26:36 +02:00
Michael Peter Christen
22e1f68c0b
solrj user authentication patch
2012-04-27 17:53:45 +02:00
Michael Peter Christen
09484955dc
added new entry class for embed tags
2012-04-27 17:48:51 +02:00
Michael Peter Christen
62f2554a01
- fixed build problems (deprecated methods using httpclient 3.1)
- removed httpclient 3.1 lib which was used by solrj (solrj now uses
httpclient 4)
2012-04-27 17:46:08 +02:00
Michael Peter Christen
a6d60fc21f
concurrency enhancement in ConfigurationSet
2012-04-27 17:20:18 +02:00
Michael Peter Christen
453010bd68
- solved problems with backpath normalization
- redesigned in/outbound link handover
- removed iframe links from inbound/outbound in solr scheme
2012-04-27 16:48:51 +02:00
Michael Peter Christen
5f5ed33ed8
patch for media search (audio, video apps)
2012-04-27 14:18:02 +02:00
Michael Peter Christen
7860c1df80
fix needed for new solrj library
2012-04-27 14:13:59 +02:00
Michael Peter Christen
0e13022147
- enhanced solr field documentation
- added xml api button to IndexFederated_p - the solr schema.xml file
can be generated by YaCy
2012-04-26 15:25:07 +02:00
Michael Peter Christen
19efbf1b0f
- apply directDocByURL to NOLOAD Queue
- choose pushing to NOLOAD as default for site crawl
2012-04-26 00:23:18 +02:00
Michael Peter Christen
659178942f
- Redesigned crawler and parser to accept embedded links from the NOLOAD
queue and not from virtual documents generated by the parser.
- The parser now generates nice description texts for NOLOAD entries
which shall make it possible to find media content using the search
index and not using the media prefetch algorithm during search (which
was costly)
- Removed the media-search prefetch process from image search
2012-04-24 16:07:03 +02:00
Michael Peter Christen
a3badd3205
changed search process for images: no more media snippet load process,
show only links from index which had been on the text search page
before. This creates a superfast search process for images!
2012-04-24 12:55:58 +02:00