55 Commits

Author SHA1 Message Date
okybaca
4add1f6bc7 replaced all the links to legacy legacy wiki to legacy wiki 2023-10-29 13:12:24 +01:00
luccioman
bf4f320b16 Optionally render the response header when using the Solr html writer
With params rendered as html input fields for conveniently modifying
params values and refreshing results.
2018-07-23 18:36:57 +02:00
luccioman
d92b191942 Ensure no remote Solr is attached before "Shut Down and Re-Start Solr"
Otherwise, once this operation is applied, the remote Solr instance(s)
are disconnected and the embedded Solr is connected even if disabled by
setting "core.service.fulltext".

Also use constants for related default setting values.
2018-04-06 20:34:54 +02:00
luccioman
cde237b687 Enforced access controls on some administrative actions.
- ensure use of HTTP POST method : HTTP GET should only be used for
information retrieval and not to perform server side effect operations
(see HTTP standard https://tools.ietf.org/html/rfc7231#section-4.2.1)
- a transaction token is now required for these administrative form
submissions to ensure the request cannot be embedded in an external
site and performed silently/by mistake by the user's browser
2017-03-26 11:48:00 +02:00
luccioman
89017e17e4 Converted ajax URL to relative and added a check on the response status.
This makes YaCy easier to configure when running behind a reverse Proxy.

The check on status avoids trying to update the page with error text
content when the server returned a 404 or 500 error message for example.
2016-11-25 11:13:16 +01:00
reger
17fc09036e fix cutoff text in button and adjust formatting 2016-11-12 21:58:06 +01:00
Michael Peter Christen
ff11ac89f7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git 2015-06-04 23:04:04 +02:00
Michael Peter Christen
5e2d23b7a0 removed the new index export method from the IndexControlURLs_p.html
servlet and moved it to a new /IndexExport_p.html servlet. This servlet
is now more prominently linked in the main menu under Production -> Index
Export/Import
2015-06-04 23:03:46 +02:00
reger
49b79987c9 remove obsolete searchfl work table
was used to register URLs with incomplete words in the snippet but is never accessed
2015-06-04 22:44:01 +02:00
Michael Peter Christen
c7576d6028 added a full solr export to the IndexControlURLs_p.html servlet. The
export function is also now the default export option. The export file
format for a full solr export is very similar to a solr search result
xml, only the <lst name="responseHeader"> tag is missing.

The exported xml has a special line termination feature: all documents
will be exported into a single line without any CR in between. That
means that every document is completely inside a single line. While this
is not readable at all for humans, it is very useful for Linux line
processing scripts, like grep. Using grep it will be easy to select
single documents which match a given pattern.

Such dumps shall be importable with the DATA/SURROGATE/in import
function, but that import is not yet adapted to the new file format.
2015-05-29 15:05:52 +02:00
Michael Peter Christen
0a879c98e7 added new 'firstSeen' database table and necessary data structures which
hold a date for each URL to record when a url was first seen. This is
then used to overwrite the modification date for urls upon recrawl in
case that the first-seen date is before the latest document date. This
behaviour is necessary due to the common behaviour of content management
systems which always attach the current date to all documents. Using the
firstSeen database it is possible to approximate a real first document
creation date in case that the crawler starts frequently for the same
domain. As a result, search results ordered by date are of much better
quality, and YaCy works better as a search agent for the latest news.
2014-11-13 00:58:58 +01:00
orbiter
cbb5f06630 do not remove the index deletion option from the IndexControlURLs_p.html
servlet after a deletion happened, instead show but disable the option
when the index is empty.
2014-08-27 00:45:39 +02:00
orbiter
73c2e47de3 added a confirmation dialog to complete index deletion 2014-08-27 00:31:03 +02:00
reger
a88ea14e09 harmonize use of style for "delete" button
- apply the mostly used btn-danger class
2014-06-22 23:33:59 +02:00
Michael Peter Christen
656e2ce62a replacing direct html table cellspacing with css set-up for cellspacing 2014-03-31 01:15:35 +02:00
orbiter
f8f88d4e81 replaced pdblue-homebrew buttons with bootstrap standard buttons 2014-03-20 22:52:01 +01:00
Michael Peter Christen
92655c7fd9 - added bootstrap css framework
- adapted all YaCy administration pages to new framework
- created new search page layout (working, but still work in progress)
- old skin files are fully applicable! (and looking good)
- target is a new style based on bootstrap examples, see /test.html
- icons in YaCy may be replaced by glyphicons (to be done)
2014-03-18 13:42:31 +01:00
malykhin.dmitry
29a7598991 update Russian lang-file and small improvements to the web interface 2014-02-27 07:43:17 +04:00
reger
365f77ea8c make internal page links relative to ease any future development for context aware servlets
note also http://bugs.yacy.net/view.php?id=106
2014-02-10 21:40:42 +01:00
Michael Peter Christen
6e59ca4ebf removed jena library and all code that depended on jena. When jena was
introduced, it was also used for search facets. The generic search
facets are now deduced from generic solr fields which makes jena as tool
for facet semantics superfluous.
2014-02-07 01:20:06 +01:00
reger
e05320b776 upd: to open more external links in new browser-tab 2013-12-26 01:16:53 +01:00
reger
1437c45383 merge rc1/master 2013-11-07 21:30:17 +01:00
sixcooler
e5abccdfe4 added optimize-option 2013-06-28 14:51:37 +02:00
Michael Peter Christen
54024958ac added url_file_name_s in query for live-search of urls 2013-06-25 16:36:05 +02:00
orbiter
2b320313d9 replaced yacydoc servlet usage by a solr result output using an html
output writer. This made the creation of a html result writer necessary
which is included in this commit. The yacydoc servlet was used to
present all metadata to a document, but the solr interface can serve for
this purpose in a much better way. All usages (except one) of yacydoc
were replaced by a solr call. This affects also the 'metadata' link
attached to search results.
2013-06-09 12:12:34 +02:00
Michael Peter Christen
281959a2d7 added option to re-boot the embedded solr during run-time. Added also
API recording for this method so it can be repeated automatically. The
index dump generation is now also available for API recording. Added
some synchronization in backend which was necessary for this.
2013-05-29 13:09:34 +02:00
Michael Peter Christen
56d5946a59 - added flags in IndexFederated_p.html to switch on or off the webgraph
index (new solr core webgraph) .. this is now off by default
- completely redesigned this servlet
- added description how to attach a remote solr
- adjusted naming of servlet and menus
- moved 'lazy initialization' attribute from IndexSchema to
IndexFederated (this is a general option) back again.
2013-02-24 18:09:34 +01:00
Michael Peter Christen
0fe7b6fd3b migrated the index export methods from the old metadata to solr. Now
exports are done using solr queries. removed superfluous methods and
servlets.
2013-01-24 12:39:19 +01:00
Michael Peter Christen
38d3feae65 added separate delete commands for the local+remote solr index, the old
metadata and old rwi and for the citation index. The important
advancement is the separation of the citation index deletion because
that index is responsible for the linkdepth calculation. Now a search
index can be deleted without the citation index, which should mean
that fewer clickdepths must be post-processed.
2013-01-04 16:39:34 +01:00
Michael Peter Christen
941873fba4 moved the index deletion functions from IndexControlRWIs to
IndexControlURLs where it appears more naturally. Because the RWI
administration is less important in the presence of Solr, the
IndexControlURL is now the default servlet when the Index Administration
button on the main menu is selected.
2012-10-10 00:09:27 +02:00
Michael Peter Christen
ccd65ecf8d fixed url search in IndexControlURLs_p.html / using now the solr
interface
2012-09-27 00:31:59 +02:00
Michael Peter Christen
1b474139dd used the new zip writer/reader to add a solr dump process: the whole
solr index can be written to a zip dump and also restored during runtime
2012-09-24 17:05:28 +02:00
Michael Peter Christen
5df553c152 - added a json writer for solr (yes there was one using xslt but this
one writes the same way as yacysearch.json)
- using the new json solr result to change the ajax search in
IndexControlURLs to the new solr search
2012-09-10 14:30:44 +02:00
Michael Peter Christen
4b36a2c3b4 small style changes 2012-09-04 11:23:41 +02:00
Michael Peter Christen
8ca842b137 added new button design to more buttons 2012-09-03 16:04:57 +02:00
Michael Peter Christen
03280fb161 removed segments-concept and the Segments class:
the segments had been there to create a tenant-infrastructure but were
never used since that was all much too complex. There will be a
replacement using a solr navigation using a segment field in the search
index.
2012-06-28 14:27:29 +02:00
orbiter
abb35addb8 added
accept-charset="UTF-8"
to all forms
this applies patches from http://forum.yacy-websuche.de/viewtopic.php?p=20891#p20891

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-14 22:57:43 +00:00
orbiter
9a1e0158fa better servlet naming in index administration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7455 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-28 11:32:31 +00:00
mikeworks
61e87c0b14 IndexControlRWIs_p.html, IndexControlURLs_p.html, ViewFile.html/.java: changes to HTML output and &nbsp; in case of empty values for XHTML strict / transitional validation
de.lng: Added missing translation for Show Content and changed existing line 
--> Index Administration should now correctly validate XHTML 1.0 Strict / Trans

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7255 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 16:51:29 +00:00
suessthomas
5c5e6accdb Fixes for (X)HTML compatibility.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6854 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-05 21:12:58 +00:00
orbiter
3014e5f6f9 - integrated live search in the IndexControlURLs input window for URLs:
this searches for occurrences of the given word in URLs and presents them
  in a pop-up list below the input line
- some bugfixes for the new robots table viewer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 15:44:11 +00:00
suessthomas
063b29060c Minor Changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6656 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-08 10:13:14 +00:00
orbiter
735e2737e3 * added index segments
This is a major change in the organization of indexes.
Please consider a back-up of your data before you run this update.
All existing index files will be moved and renamed to a new position.
With this change, it will be possible to maintain different indexes for different purposes and it will be possible to have a distinction between DHT-in and DHT-out specific indexes. Tenants may also have their own index, and it may be possible to have histories and back-ups of indexes. This is just the beginning, many servlets must be adapted after this change, but all functions that had been there should still work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 14:44:20 +00:00
orbiter
89d8e824ed memory protection for URLAnalysis
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-24 22:05:09 +00:00
orbiter
b57c9da1f8 - fixes to doc, ppt, xls parser: better title
- fixes to httpd server response header generation
- fixes to a server date computation bug
- new Button in indexControl to view content of url in ViewFile


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-05 15:15:13 +00:00
orbiter
91af105373 last changes before release
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 23:49:08 +00:00
orbiter
4bd927d513 the Semantic Web moves in!
- added two new api files for document metadata:
- added an XHTML+RDFa html file that shows the document metadata in a format that presents the data for rendering and for metadata retrieval. This is a typical document format for a semantic web data structure. The RDF vocabulary used is Dublin Core
- added a xml file that shows the same data as pure DC metadata
- integrated the API into the existing IndexControlURLs interface

With about one billion metadata files (URL metadata) this extension makes the freeworld YaCy network
probably one of the largest metadata document providers for the semantic web!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 22:04:38 +00:00
orbiter
c97d0fcee7 modified the domain list export function:
- used the new superfast domain list generation from the domain statistics
- better interactive behavior

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5118 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-04 20:28:36 +00:00
orbiter
77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet
- added 'delete all' button to all results of such a domain statistic output which deletes all urls of this domain
- extended stack cleaner to clean also the statistics: they are not completely destroyed, only the smallest counting domains are removed


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-04 19:41:57 +00:00
lulabad
fc54d4519e some more XHTML strict errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 09:06:17 +00:00