Michael Peter Christen
ba6aaabc51
refactoring + parser bugfixes
2012-05-04 17:28:27 +02:00
Michael Peter Christen
453010bd68
- solved problems with backpath normalization
- redesigned in/outbound link handover
- removed iframe links from inbound/outbound in solr scheme
2012-04-27 16:48:51 +02:00
Michael Peter Christen
5f5ed33ed8
patch for media search (audio, video apps)
2012-04-27 14:18:02 +02:00
Michael Peter Christen
0e13022147
- enhanced solr field documentation
- added xml api button to IndexFederated_p - the solr schema.xml file
can be generated by YaCy
2012-04-26 15:25:07 +02:00
Michael Peter Christen
08dcf3e5d1
hack to get all results if the actual number is between 10 and 64
2012-04-26 00:27:21 +02:00
Michael Peter Christen
19efbf1b0f
- apply directDocByURL to NOLOAD Queue
- choose pushing to NOLOAD as default for site crawl
2012-04-26 00:23:18 +02:00
Michael Peter Christen
5c66880be2
fix for search result selection in case that contentdom is not set
2012-04-26 00:04:23 +02:00
Michael Peter Christen
3bea25c513
increased image preview size
2012-04-24 16:04:13 +02:00
Michael Peter Christen
a3badd3205
changed search process for images: no more media snippet load process,
show only links from index which had been on the text search page
before. This creates a superfast search process for images!
2012-04-24 12:55:58 +02:00
Michael Peter Christen
4aa0eedead
one more scroogle...
2012-04-24 12:05:37 +02:00
Michael Peter Christen
347612ddd4
removed scroogle parser
2012-04-24 12:04:44 +02:00
Michael Peter Christen
f8cd57c92f
new indexing strategy: ALL links that appear anywhere are indexed, not
only links where the content can be parsed. All non-parseable links are
placed into the noload queue. The search process must therefore be able
to filter out non-text search results.
- This fixes the problem that image search results appeared in the text
search.
- The interactive search can retrieve now ALL types of links
- The p2p interface is now extended to retrieve only certain types of
links (text, image, video, apps)
- The search process has an extension to filter the right document type
according to the search query
2012-04-22 02:05:17 +02:00
Michael Peter Christen
14f67f217c
refactoring of ContentDomain: now subclass of Classification
2012-04-22 00:04:36 +02:00
Michael Peter Christen
a5d7da68a0
refactoring: removed dependency from switchboard in Balancer/CrawlQueues
2012-04-21 13:47:48 +02:00
Michael Peter Christen
33d1062c79
refactoring: the cache belongs to the crawler
2012-04-21 13:34:07 +02:00
Michael Peter Christen
8429967ea7
no more SVN
2012-04-19 13:29:08 +02:00
Michael Peter Christen
0466bb0ddf
no more SVN..
2012-04-19 13:28:12 +02:00
Michael Peter Christen
4844e124b1
one more warning in case that crawling is paused because of low disk
space
2012-04-19 12:35:11 +02:00
Michael Peter Christen
0ec2713af8
'download'
2012-04-19 11:50:24 +02:00
Michael Peter Christen
f30c577fdb
add hint to speed up search results
2012-04-19 11:11:14 +02:00
Michael Peter Christen
6b133de3e9
add hint for consulting support
2012-04-19 11:10:48 +02:00
Michael Peter Christen
eb2c8ffa62
display is not used any more
2012-04-17 12:30:14 +02:00
Michael Peter Christen
91a86f0b06
fixed to network graph testing
2012-04-17 11:46:14 +02:00
Michael Peter Christen
f31ad84d98
automatic generation of blacklist pattern, see
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2685&p=25305#p25305
2012-04-17 11:22:19 +02:00
Michael Peter Christen
7b5b9baee0
added citation rank to ranking profile
2012-04-16 23:43:50 +02:00
reger
06951ef751
remove heuristic scroogle from search option help text in index.html
2012-04-16 04:00:04 +02:00
Michael Peter Christen
e377092198
fix to xml output format
2012-04-13 09:02:18 +02:00
Michael Christen
41be98dc9d
extended webstructure api to show together with incoming links also
outgoing links
2012-04-13 11:53:34 +02:00
Michael Christen
8f89c8ef07
added information about inbound, outbound and citation links into
yacydoc api servlet
2012-03-31 07:38:49 +02:00
Michael Christen
71649a1296
added an api to retrieve the new citation.index with the
webstructure.xml api. This api will respond with details about a single
URL if requested with 'webstructure.xml?about=[url|urlhash|host]'.
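A request of this form could be assembled as in the following minimal sketch. The class and method names are hypothetical, and the peer address and servlet path passed in are assumptions, not taken from this commit; only the `about` parameter comes from the message above. `URLEncoder.encode(String, Charset)` requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of YaCy: builds a webstructure.xml request URL.
final class WebStructureQuery {

    // servletUrl: full address of the webstructure.xml servlet on a peer (assumed by the caller)
    // about:      a URL, URL hash or host name, as described in the commit message
    static String aboutUrl(String servletUrl, String about) {
        // percent-encode the value so that a full URL can be passed as a query parameter
        return servletUrl + "?about=" + URLEncoder.encode(about, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // host, port and path are placeholders for illustration only
        System.out.println(aboutUrl("http://localhost:8090/api/webstructure.xml", "http://yacy.net/"));
    }
}
```

Encoding the parameter matters mostly for the `url` variant, where the value itself contains `:` and `/`.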
2012-03-29 17:22:31 +02:00
Lotus
3e61287326
some better feedback on properties change
2012-03-25 22:21:42 +02:00
Lotus
96ac95cff9
added hint how to change integration options
2012-03-23 17:02:50 +01:00
Thomas
4f61b8fd82
Fixes for compare-search
2012-03-21 21:43:47 +01:00
Thomas
e0680de7b3
Remove Scroogle from compare-search, Scroogle is dead
2012-03-20 23:00:06 +01:00
Lotus
78f0d8f046
no focus on preview frames for search integration
fixes bug http://bugs.yacy.net/view.php?id=161
2012-03-17 21:10:29 +01:00
Lotus
7792ac6406
fix links & bug #163
2012-03-10 10:59:56 +01:00
Michael Peter Christen
532c7cf827
added physics experiment to the graph plotter. not active by default
2012-02-28 13:18:46 +01:00
Michael Peter Christen
aba9b1bfa0
better names for elements of a linked graph
2012-02-27 21:27:17 +01:00
Michael Peter Christen
2fc8ecee36
ConcurrentLinkedQueue has a VERY long return time on the .size() method.
See
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html
and the following test program:
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueLengthTimeTest {
    // adds c elements, calling size() before each add; returns the elapsed milliseconds
    public static long countTest(Queue<Integer> q, int c) {
        long t = System.currentTimeMillis();
        for (int i = 0; i < c; i++) {
            q.add(q.size());
        }
        return System.currentTimeMillis() - t;
    }
    public static void main(String[] args) {
        int c = 1;
        // c doubles on every round, so each round is slower than the previous one
        for (int i = 0; i < 100; i++) {
            Runtime.getRuntime().gc();
            long t1 = countTest(new ArrayBlockingQueue<Integer>(c), c);
            Runtime.getRuntime().gc();
            long t2 = countTest(new LinkedBlockingQueue<Integer>(), c);
            Runtime.getRuntime().gc();
            long t3 = countTest(new ConcurrentLinkedQueue<Integer>(), c);
            System.out.println("count = " + c + ": ArrayBlockingQueue = " + t1
                + ", LinkedBlockingQueue = " + t2 + ", ConcurrentLinkedQueue = " + t3);
            c = c * 2;
        }
    }
}
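One possible way around the slow size() call, shown only as a sketch: keep a separate counter next to the queue. The class name CountedQueue is hypothetical and this is not necessarily the change made in this commit.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: a ConcurrentLinkedQueue with a constant-time size().
final class CountedQueue<E> {
    private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<E>();
    private final AtomicInteger count = new AtomicInteger(0);

    public void add(E e) {
        queue.add(e);
        count.incrementAndGet();
    }

    // returns null if the queue is empty; the counter is only decremented on success
    public E poll() {
        E e = queue.poll();
        if (e != null) count.decrementAndGet();
        return e;
    }

    // reads the counter instead of traversing the queue;
    // under concurrent access the value can briefly lag behind the queue content
    public int size() {
        return count.get();
    }

    public static void main(String[] args) {
        CountedQueue<Integer> q = new CountedQueue<Integer>();
        for (int i = 0; i < 1000; i++) q.add(q.size());
        System.out.println("size = " + q.size());
    }
}
```

The counter and the queue are updated in two separate steps, so size() is an approximation while other threads are adding or polling. That is the same guarantee ConcurrentLinkedQueue.size() itself gives, at a much lower cost.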
2012-02-27 00:42:32 +01:00
Michael Peter Christen
8aba045ba1
if a new pop-up page is set in config portal, then this page applies
also to the default page configuration for the httpd if no path is
given.
2012-02-26 20:53:32 +01:00
Michael Peter Christen
fa7b3481b3
better navigation in file search: less results by first try, but much
faster. after the first search is done, buttons appear to get more
results for the same search
2012-02-26 17:32:45 +01:00
Michael Peter Christen
8c06925984
animation of the web structure picture
2012-02-25 15:42:29 +01:00
Michael Peter Christen
99c74699de
removed scroogle (scroogle is dead)
2012-02-25 12:57:59 +01:00
Michael Peter Christen
6e51a00a2f
Revert "fix for page navigation: show only as much pages as are available for given navigation constraints, not as given by total results size"
This reverts commit 73f5a9e8b3.
2012-02-24 02:46:56 +01:00
Michael Peter Christen
73f5a9e8b3
fix for page navigation: show only as much pages as are available for
given navigation constraints, not as given by total results size
2012-02-24 02:31:03 +01:00
Michael Peter Christen
9c51dc0f13
fixed a bug with navigation: if a navigation was applied to file type or
protocol, then it was not possible to remove that again. This is the fix
for that.
2012-02-24 02:28:40 +01:00
Michael Peter Christen
8bfc987374
enhanced hint how to enter file:// urls
2012-02-24 02:14:54 +01:00
Michael Peter Christen
c6c61be3f0
fix for http://bugs.yacy.net/view.php?id=148
2012-02-24 00:38:57 +01:00
Michael Peter Christen
edaa8ac94c
Merge commit 'e15e633a0128b8d31011283a65b4ef26a6dddcd8'
2012-02-23 10:07:13 +01:00
reger
e15e633a01
Bugfix for IE9 (doesn't accept html form within form)
changes of API schedule row data changed form input form to unique field names
using row pk.
Fix for issue 96 http://bugs.yacy.net/view.php?id=96
IE9-64bit doesn't interpret iframe with align parameter as desired and
misaligns the following content (in CrawlProfileEditor_p.html)
2012-02-23 02:40:07 +01:00