Michael Peter Christen
2e7219f9fd
removed highlighting of search results within collections in GSA
interface
2012-11-09 16:25:24 +01:00
Michael Peter Christen
074dfd297b
added icons and a selection for hosts with urls pending for crawler or
with errors
2012-11-09 16:24:56 +01:00
Michael Peter Christen
f07e5fb553
release 1.2
2012-11-07 23:14:45 +01:00
Michael Peter Christen
4c4e0eece2
added new submenu 'Target Analysis' with three servlets which are useful
to analyse the target servers: robots.txt table, mass target analysis
and a regex tester
2012-11-07 21:26:01 +01:00
Michael Peter Christen
61995d508e
do the commit anyway before calling a search interface
2012-11-07 17:27:50 +01:00
Michael Peter Christen
842faf96a2
fixed media search
2012-11-07 17:27:13 +01:00
Michael Peter Christen
86ec199126
using a better file name
2012-11-07 16:39:49 +01:00
Michael Peter Christen
93001586a0
removed warnings, removed too-fast pausing of crawls
2012-11-07 15:37:14 +01:00
Michael Peter Christen
8041742e48
added matching of path to query pattern
2012-11-07 15:06:13 +01:00
Michael Peter Christen
8b1c9cba3d
fixed a problem with non-terminating crawls
2012-11-07 15:05:44 +01:00
Michael Peter Christen
61a1d32356
fix to ftp client
2012-11-07 14:58:28 +01:00
Michael Peter Christen
5105256927
update to search result logging (this was a remaining issue from the
solr 4.0.0 migration)
2012-11-07 14:15:27 +01:00
Michael Peter Christen
570e42c4e3
fix for filetype navigator
2012-11-07 13:53:29 +01:00
Michael Peter Christen
71ed8e5e07
bugfixes for crawler
2012-11-07 12:52:19 +01:00
Michael Peter Christen
29fbbb49dc
better colors for host browser and corrected document count
2012-11-07 12:23:21 +01:00
Michael Peter Christen
12c0db20e5
fixed npe for surrogate import
2012-11-07 02:46:51 +01:00
Michael Peter Christen
6244b084cd
fixed wrong order of result count values
2012-11-07 02:29:33 +01:00
Michael Peter Christen
631b08e7e2
update to HostBrowser
2012-11-07 02:17:24 +01:00
Michael Peter Christen
51f420e4f5
removed location search because it is only working in special cases
2012-11-07 02:04:41 +01:00
Michael Peter Christen
52df6ee369
more logging
2012-11-07 02:04:08 +01:00
Michael Peter Christen
158732af37
automatically delete entries from the crawl profile list if crawl is
terminated.
2012-11-07 02:03:44 +01:00
Michael Peter Christen
15d1460b40
added information about the reason of pausing of crawls
2012-11-06 15:21:56 +01:00
Michael Peter Christen
2371ef031c
added solr faceted search support to YaCy search results
added solr highlighting / YaCy snippets to YaCy search results
- facets are now much more complete
- facets are computed and searched much faster
- snippet computation is done by solr if solr knows the snippet
2012-11-06 14:32:08 +01:00
Michael Peter Christen
b30a7162fa
added more thread-renaming for search processes
2012-11-06 12:31:23 +01:00
Michael Peter Christen
900445d8e9
set the thread name during solr queries to the solr query to get better
debugging options
2012-11-06 11:48:04 +01:00
Michael Peter Christen
d481abd087
added the visualization of error-urls to host browser
- only visible for admins
- a faceted search generates a huge list for all hosts in the host list
- the faceted search algorithms had to be modified for that
- within the browsing of the directory path, the error cause is written
to the url which is presented as error-url
- the errors are also accumulated for directory sums
2012-11-06 00:29:37 +01:00
Michael Peter Christen
a15819fbec
fix for some interface problems
2012-11-05 22:14:52 +01:00
Michael Peter Christen
791e1dcfdf
when a new crawl is started, delete all entries about error-urls for
crawl-start domains
2012-11-05 22:14:27 +01:00
Michael Peter Christen
c6a6f4c4e6
added a hack which makes the HostBrowser faster when the given
host has a lot of urls. If the number of urls is > 1000 and the root
path is selected, then the list of documents is restricted to those
which have no subpath. However, this can cause a problem if no documents
exist on the root path but only on paths below that root path.
2012-11-05 18:57:21 +01:00
Michael Peter Christen
619bf7e875
fixed filetype modifier for media types in text search
2012-11-05 18:08:00 +01:00
Michael Peter Christen
97f82994a6
automatically pause the crawler if there is a problem with solr
2012-11-05 16:34:42 +01:00
Michael Peter Christen
64ac2b7b7d
new submenu template
2012-11-05 15:36:42 +01:00
Michael Peter Christen
5e77801aac
update to web interface structure
2012-11-05 15:23:03 +01:00
Michael Peter Christen
8fb370d9f8
renovated the way search results are counted. should be correct now.
2012-11-05 03:19:28 +01:00
Michael Peter Christen
7bec253bb0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-11-04 09:21:58 +01:00
Michael Peter Christen
d88eb657fd
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
2012-11-04 09:21:21 +01:00
orbiter
354ef8000d
- added 'deleteold' option to crawler which causes documents selected
by a crawl filter (host or subpath) to be deleted
- site crawl uses this option by default now
- added a concurrency option to deleteDomain()
2012-11-04 02:58:26 +01:00
Michael Peter Christen
19d1f474ce
host browser now shows also number of pending files per subdirectory +
bugfixes
2012-11-02 14:40:02 +01:00
Michael Peter Christen
75dd706e1b
update to HostBrowser:
- time-out after 3 seconds to speed up display (may be incomplete)
- showing also all links from the balancer queue in the host list (after
the '/') and in the result browser view with tag 'loading'
2012-11-02 13:57:43 +01:00
Michael Peter Christen
e2c4c3c7d3
migration to solr 4.0.0
2012-11-02 12:29:48 +01:00
Michael Peter Christen
b764de424a
code cleanup
2012-11-02 10:28:32 +01:00
Michael Peter Christen
69aa39d664
update to libraries required by solr 4.0.0
2012-11-02 10:27:44 +01:00
Michael Peter Christen
9330ad4838
- fixed the delete option in host browser
- added a delete method which can be used to delete a full subpath in
solr.
2012-11-02 01:22:31 +01:00
Michael Peter Christen
a63179f3f9
added the MIME attribute for the R tag in GSA search result writer
2012-11-02 00:14:29 +01:00
Michael Peter Christen
40df2fd193
added the host browser as link to search results. that means you can
select a browsing position after a search is done on the search results.
2012-11-01 21:38:05 +01:00
Michael Peter Christen
1168d09de8
more refactoring - integrated the code of SnippetProcess into
SearchEvent
2012-11-01 17:40:06 +01:00
Michael Peter Christen
6629e37685
tried to clean up the search process mess
2012-11-01 17:16:43 +01:00
Michael Peter Christen
c5f67a5d6d
fixed a problem with local search from solr results: now all results
from solr are shown (again)
2012-11-01 10:22:22 +01:00
sixcooler
02957d5982
missing license-files
(sorry I didn't commit these files by mistake)
2012-10-31 23:47:08 +01:00
Michael Peter Christen
16216c2344
added missing libraries
2012-10-31 23:29:47 +01:00