Commit Graph

2471 Commits

Author SHA1 Message Date
apfelmaennchen
3905caf8a1 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5536 6c8d7289-2bf4-0310-a012-ef5d649a1542 2009-01-29 22:07:18 +00:00
apfelmaennchen
08ed14603e - fixed YaCy-UI sciencenet search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5535 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-29 22:06:06 +00:00
apfelmaennchen
55dd15e344 - clean up of YaCy-UI
- added /yacy/ui/yacyuisearch.html (stand alone version)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5534 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-29 19:57:18 +00:00
apfelmaennchen
025ebd7574 small fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 22:52:26 +00:00
apfelmaennchen
9bd9ccade2 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 22:47:03 +00:00
low012
b41a06228f *) cleaning up...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:48:52 +00:00
low012
ce81391095 *) using parameters like site: in the search field does not affect urlmask anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:37:24 +00:00
apfelmaennchen
96684df1a9 - security fix for addTag.java and editTag.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 06:43:13 +00:00
apfelmaennchen
6dd52422ea - added two dialogs to manage bookmark tags in YaCy-UI
- fixed renameTag() in bookmarksDB
- added /api/bookmarks/tags/addTag.xml
- added /api/bookmarks/tags/editTag.xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5525 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 00:15:43 +00:00
apfelmaennchen
12511bd236 - some additional icons for YaCy-UI
- added license (readme.txt) to existing icon sets

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-27 18:17:59 +00:00
apfelmaennchen
e76cbd9016 YaCy-UI: some small cosmetic changes....
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-26 19:12:03 +00:00
apfelmaennchen
af3147c3fc and one more....Eclipse is tricky...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5521 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-26 18:41:49 +00:00
apfelmaennchen
9317650272 forgot to post this one...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5520 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-26 18:40:56 +00:00
apfelmaennchen
92d77c3bef Major update to YaCy-UI...still not perfect...but I thought I'd share my progress :-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-26 18:38:58 +00:00
borg-0300
a2b336dfe7 small table fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5518 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-26 16:25:33 +00:00
low012
80e6356860 *) r 5512 introduced a bug which resulted in useless filters if site:, filetype:, or inurl: was used, since the filters included the word "null".
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-25 22:16:49 +00:00
lotus
4ef6b15eb8 limit -Xmx setting to 1999m on win32. bigger values would never work.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5513 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-23 15:13:09 +00:00
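The heap limit from commit 4ef6b15eb8 above can be pictured as a simple clamp. This is only a sketch: the function name and the way the platform is passed in are assumptions made for illustration, and only the 1999m ceiling on win32 comes from the commit message.

```python
WIN32_MAX_HEAP_MB = 1999  # ceiling named in the commit message


def clamp_heap_mb(requested_mb, platform_name):
    """Return the -Xmx value (in MB) to use; hypothetical helper.

    On win32 the request is capped at 1999 MB; other platforms are
    passed through unchanged.
    """
    if platform_name == "win32":
        return min(requested_mb, WIN32_MAX_HEAP_MB)
    return requested_mb
```

For example, `clamp_heap_mb(4096, "win32")` yields 1999, while the same request on another platform is left alone.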
lotus
5078e837ac better readability / no functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5512 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-22 13:41:11 +00:00
orbiter
dedfc7df7f removed distinction between DHT-in and DHT-out. This is necessary to make room for the new cell data structure, which cannot use this distinction in the first place, but will enable the same meaning with different mechanisms (segments, later)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-22 00:03:54 +00:00
f1ori
34da04c7dd * fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1754
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-21 21:30:20 +00:00
low012
9a4780c165 *) Javascript should work with Konqueror too now (http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1757)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-20 23:29:24 +00:00
low012
1927fd5992 *) hopefully a fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1757
*) small change which prevents weird situation when choosing empty list of entries to edit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5503 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-18 21:23:53 +00:00
apfelmaennchen
98ab7ae20a small fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5501 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-17 09:51:04 +00:00
apfelmaennchen
d7122722b2 hopefully a fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1736#p12061
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5500 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-16 23:22:42 +00:00
lotus
a2bc32e909 fix for IE6 api-icon display
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-16 07:33:36 +00:00
orbiter
6663365720 adapted many calls to new api path
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-16 00:02:55 +00:00
orbiter
b423d0a036 moved all servlets from htroot/xml to htroot/api
the file server contains a patch that temporarily maps all xml paths to api,
that means all interfaces still work. Please adapt all your interfaces to the new path.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-15 23:52:58 +00:00
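The temporary path mapping mentioned in commit b423d0a036 above could look roughly like the following. It is a sketch under assumptions: the function name and the exact prefix handling are hypothetical, and the real patch in the file server may work differently.

```python
def map_legacy_path(path):
    """Rewrite a request path under /xml to the same path under /api.

    Hypothetical helper: only whole path segments are matched, so a
    path such as /xmlfoo is left untouched.
    """
    if path == "/xml" or path.startswith("/xml/"):
        return "/api" + path[len("/xml"):]
    return path
```

With this mapping, an old client requesting `/xml/feed.rss` would be served from `/api/feed.rss` without any change on the client side.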
orbiter
91af105373 last changes before release
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 23:49:08 +00:00
orbiter
4bd927d513 the Semantic Web moves in!
- added two new api files for document metadata:
- added an XHTML+RDFa html file that shows the document metadata in a format that presents the data for rendering and for metadata retrieval. This is a typical document format for a semantic web data structure. The RDF vocabulary used is Dublin Core
- added an xml file that shows the same data as pure DC metadata
- integrated the API into the existing IndexControlURLs interface

With about one billion metadata files (URL metadata), this extension probably makes the freeworld YaCy network
one of the largest metadata document providers for the semantic web!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 22:04:38 +00:00
apfelmaennchen
613c49bc38 YaCy-UI: update to welcome text (change-log and bug tracker) for stable release
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 05:59:39 +00:00
orbiter
bed38a5f8c fix for uncaught exception in RSSReader
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-13 00:20:37 +00:00
lotus
c7c291bc6b allow simultaneous inurl: site: and filetype: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 14:59:27 +00:00
orbiter
9ef77d57f5 added an access control to the search interface using white/blacklists:
in the network configuration, you can configure a whitelist and a blacklist
- blacklisted clients cannot search
- whitelisted clients never get any search restrictions
- for all other clients: apply DoS search restrictions
Please see the example configuration in yacy.network.freeworld.unit
By default, all clients from localhost are whitelisted.
If you have your own YaCy network, please put all the IPs of your peers into the whitelist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 10:55:48 +00:00
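The three-way decision described in commit 9ef77d57f5 above can be sketched as follows. The names `SearchAccess` and `classify_client` are hypothetical, the localhost addresses are an assumed default, and checking the blacklist before the whitelist is an assumption: the commit message does not say which list wins if a client is on both.

```python
from enum import Enum


class SearchAccess(Enum):
    DENIED = "denied"              # blacklisted: may not search
    UNRESTRICTED = "unrestricted"  # whitelisted: no search restrictions
    DOS_LIMITED = "dos-limited"    # everyone else: DoS restrictions apply


# Assumed default: localhost addresses are whitelisted.
DEFAULT_WHITELIST = frozenset({"127.0.0.1", "::1"})


def classify_client(ip, whitelist=DEFAULT_WHITELIST, blacklist=frozenset()):
    """Decide how a search request from the given IP is treated."""
    if ip in blacklist:  # assumption: the blacklist takes precedence
        return SearchAccess.DENIED
    if ip in whitelist:
        return SearchAccess.UNRESTRICTED
    return SearchAccess.DOS_LIMITED
```

An operator of a private network would pass the IPs of all peers as the whitelist, which matches the advice given in the commit message.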
orbiter
ac89e8e84d removed unused search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5474 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 08:45:12 +00:00
apfelmaennchen
9b26dfec80 small fix to correct encoding of xml output
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-11 22:50:53 +00:00
orbiter
efe801173c better dht-in cache flush. see also:
http://forum.yacy-websuche.de/viewtopic.php?p=11936#p11936

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-11 22:39:49 +00:00
apfelmaennchen
3dc208fad0 bugfix: bookmarks can now handle folder names like /news and /newspaper without getting confused...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5470 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-11 19:39:51 +00:00
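Commit 3dc208fad0 above does not say what caused the confusion between /news and /newspaper. One plausible cause is a plain string-prefix test on folder names, and the sketch below shows a segment-aware check that avoids it. The function name is hypothetical and this is not taken from bookmarksDB.

```python
def is_in_folder(path, folder):
    """True if path is the folder itself or lies below it.

    A bare path.startswith(folder) would treat /newspaper as part of
    /news; requiring a "/" after the folder name prevents that.
    """
    folder = folder.rstrip("/")
    return path == folder or path.startswith(folder + "/")
```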
apfelmaennchen
cc207a979e - added new unified bookmark api to /xml/bookmarks/
the get_bookmarks api currently supports:
  .xml: posts, xbel, rss, flexigrid
  .json: posts, flexigrid
  .html: work in progress
- YaCy-UI: support for new bookmark api 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5467 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-11 12:29:19 +00:00
low012
f26b8fcb1b *) comment mode is 'moderated' instead of 'activated' by default now (to avoid spam being visible)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5465 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-10 12:58:35 +00:00
orbiter
b2a8c653ee small fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-10 09:21:44 +00:00
orbiter
8632eebf60 - added api icon to the web structure visualization
- removed fixed horizontal menu
- the api icon in the search results can only be seen when display=1

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5461 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-09 15:42:20 +00:00
orbiter
4f45605f04 small update for timing in search result processing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-09 15:28:45 +00:00
lotus
66818a2f2e smaller api banner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5459 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-09 10:27:04 +00:00
lotus
4641ecd6d9 inurl: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-08 18:59:29 +00:00
orbiter
299189f1a9 added the API icon to the bookmarks, the network page and the search page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5455 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-07 23:45:20 +00:00
orbiter
a1bf687b3b added first API tooltip!
- description of JSON search result in interactive search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5454 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-07 22:57:32 +00:00
lotus
0d1bd78674 * full site: syntax support, e.g. site:de.wikipedia.org,
would be possible if dots in the query already worked

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-07 21:05:07 +00:00
orbiter
9bed4de280 fix for the search bug introduced in SVN 5449
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 23:16:10 +00:00
lotus
d6a5c98080 api banner concept draft
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5450 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 20:51:22 +00:00
orbiter
b2b7edae18 fixed interactive search
- added dummy servlet class, because otherwise the template engine is not triggered.
That is because the YaCy httpd works much faster as a normal file server, without scanning
the served pages. Therefore each page with templates must now have a class file associated with it.
- fixed json output format of yacysearch

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5449 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 20:04:09 +00:00
orbiter
c6880ce28b removed the permanent cache flush and replaced it with a periodic cache flush
The cache is now flushed only for one second every ten seconds. During a crawl the cache
fills up completely, and is only flushed if space is needed for more documents.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 13:51:59 +00:00
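The "one second every ten seconds" rhythm from commit c6880ce28b above can be illustrated with a small time-window check. This is a sketch only: the function name, the use of a modulo on the clock, and the parameter names are assumptions, and the actual scheduling in YaCy is not described in the commit message.

```python
def in_flush_window(now_seconds, period=10.0, window=1.0):
    """True during the first `window` seconds of every `period` seconds.

    Hypothetical helper: a flush loop would only do work while this
    returns True and stay idle for the rest of the period.
    """
    return (now_seconds % period) < window
```

Under this scheme the cache is free to fill up for nine seconds out of ten, which fits the description that it fills completely during a crawl.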
lotus
ca80930892 accept leading dots on filetype: and site: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 10:04:24 +00:00
orbiter
6c7e83909b - refactoring of data access methods to be prepared for new cell data structure
- removed a memory overhead in collections, which prevents OOM exceptions in low-memory configurations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 09:38:08 +00:00
low012
1af728ae09 *) regex for site operator changed as proposed by Lotus
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:30:34 +00:00
low012
9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 14:58:32 +00:00
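The site operator from commit 9e58ae036d above, with the example "test site:edu", can be sketched like this. Both helper names are hypothetical and the parsing is deliberately simple; the regex actually used in YaCy (see commit 1af728ae09) is not shown in this log. Stripping leading dots follows the later commit ca80930892.

```python
def split_site_operator(query):
    """Separate plain search words from a site: restriction."""
    words, site = [], None
    for token in query.split():
        if token.lower().startswith("site:") and len(token) > len("site:"):
            # Leading dots are dropped so "site:.edu" behaves like "site:edu".
            site = token[len("site:"):].lstrip(".").lower()
        else:
            words.append(token)
    return words, site


def host_matches_site(host, site):
    """True if there is no restriction, or host equals or ends in .site."""
    host = host.lower()
    return site is None or host == site or host.endswith("." + site)
```

A result filter would then keep only documents whose host passes `host_matches_site`, so "test site:edu" keeps documents that contain "test" and come from an edu domain.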
low012
19e7c56f7f *) apply filter to dir list to only show .black files as blacklists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 10:14:19 +00:00
orbiter
c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5437 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 02:27:29 +00:00
low012
4bffe664ca *) moved entry field for new expressions to top of the list as requested in forum (http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1678)
*) added some Javascript to disable list selection on bottom of list in cases it is not needed (edit, delete) and only enable it if needed (move), if JS is turned off everything will work as usual

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-03 10:18:48 +00:00
low012
9d5d30f877 *) http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1672
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-31 16:50:10 +00:00
orbiter
5448aad328 removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-30 12:12:00 +00:00
orbiter
28d2d28573 added support for filetype search
(just use filetype:<type> in the search query)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 17:57:04 +00:00
orbiter
78c568331e added test channel to /xml/feed.rss
can be obtained with 
http://localhost:8080/xml/feed.rss?set=TEST
returns always a single feed entry with a fresh date

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:39:07 +00:00
orbiter
e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method
- refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 12:22:13 +00:00
low012
eab72424df *) Fixed small bug: When adding new elements to blacklist via import, the blacklist which the elements were added to was supposed to be displayed, which did not work correctly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-28 09:58:02 +00:00
low012
0e56675596 *) cleaning up ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5413 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 20:09:36 +00:00
low012
cf69557ea2 *) blacklists can be exported as XML or plain text now
*) blacklist import via file upload works now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 15:38:20 +00:00
low012
1594a15be9 *) explicit mentioning of blacklist in blacklist cleaner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-27 13:06:05 +00:00
low012
5a89266598 *) new parameters for future use (better blacklist handling for im- and export)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5403 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 19:33:08 +00:00
orbiter
e34ac22fbd - added new monitoring servlet at
http://localhost:8080/PerformanceConcurrency_p.html
- used the new monitoring to do some fine-tuning of the indexing queue

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5402 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-19 15:26:01 +00:00
orbiter
d376d81fc4 replaced busy thread control of crawl stacker by blocking threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-18 23:18:34 +00:00
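The move from busy threads to blocking threads in commit d376d81fc4 above can be illustrated with a blocking queue. This is a generic sketch, not the CrawlStacker code: the function name, the `None` sentinel used for shutdown, and the worker count are assumptions.

```python
import queue
import threading


def run_stacker(urls, worker_count=2):
    """Process URLs with workers that block on a queue instead of polling."""
    todo = queue.Queue()
    stacked = []
    lock = threading.Lock()

    def worker():
        while True:
            url = todo.get()  # blocks until an item arrives, no busy loop
            if url is None:   # sentinel: shut this worker down
                break
            with lock:
                stacked.append(url)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for url in urls:
        todo.put(url)
    for _ in threads:
        todo.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return stacked
```

Because each worker sleeps inside `get()` while the queue is empty, a slow step in one worker (such as a DNS lookup) does not keep the others spinning.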
orbiter
7535fd7447 - refactoring of CrawlEntry and CrawlStacker
- introduced blocking queues in CrawlStacker to make it ready for concurrency
- added a second busy thread for the CrawlStacker
The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step.
The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5395 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-17 22:53:06 +00:00
lotus
6569cbbec1 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646
(break to avoid bad side effects)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5394 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-16 20:53:31 +00:00
orbiter
2802138787 - refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck)
- fix of shallBeOwnWord target computation heuristic


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-15 00:02:58 +00:00
lotus
b1e211b258 no error-alert: http://forum.yacy-websuche.de/viewtopic.php?t=1639
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-13 12:04:08 +00:00
orbiter
13cb0916ee changes to statistics and content of thread dump servlet
(points now more directly to performance leaks without mentioning class calls inside of sun/java calls that cannot be changed anyway)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5390 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-11 20:13:14 +00:00
orbiter
e1acdb952c fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-08 00:17:45 +00:00
lotus
e918d64c23 show hand-cursor on labels
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-06 17:32:53 +00:00
orbiter
4a2dac659e more speed hacks:
- modified and activated write buffer
- increased cache flush factor
- fixed a problem with deadlocking of indexing process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-05 13:55:48 +00:00
lotus
1fb518a5b4 display <String> etc.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 20:21:53 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
bd1dc9cd5d thread dump with statistics, a little bit of profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 23:26:25 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
lotus
5280ad638d added basic performance page
other performance settings can be found on advanced settings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 14:10:01 +00:00
lotus
1a51d9fcfd display proper values
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 17:57:30 +00:00
orbiter
0b4808ba3d added new interactive search feature:
- while the user types search queries, the local database is searched
- results are presented interactively

This was implemented using a new JSON result format for search results in YaCy
- added JSON as file format for servlets
- refactoring of current search servlets (xml and html)
- added JSON output format for search results
- added AJAX-based search page that uses the yacysearch.json servlet to print results as a query is typed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 15:24:25 +00:00
lotus
fea82b54ef more contrast on search snippets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 19:57:13 +00:00
lotus
1951d30a62 addendum to last commit
handle words with length < 3 correctly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 19:43:40 +00:00
lotus
325ba7bfb8 only query words with length > 2
this is not complete, yet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 16:41:38 +00:00
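The rule from commit 325ba7bfb8 above, to query only words longer than two characters, amounts to a length filter. The sketch below uses a hypothetical function name and a plain whitespace split; how YaCy tokenizes the query is not stated here.

```python
def queryable_words(query, min_length=3):
    """Keep only the words that are long enough to be queried."""
    return [word for word in query.split() if len(word) >= min_length]
```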
lotus
489edb4473 improved pattern selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 10:06:38 +00:00
low012
e423fa9846 *) added method to only get file names in directory listing which match a filter
*) only files which end with .black will be listed as blacklists
*) added a little bit of Javadoc

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-25 20:26:06 +00:00
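The directory filter from commit e423fa9846 above, which lists only files ending in .black as blacklists, can be sketched as a suffix check. The function name is hypothetical and the sorting is added only to make the output predictable.

```python
def blacklist_files(names):
    """Return the file names that count as blacklists, in sorted order."""
    return sorted(name for name in names if name.endswith(".black"))
```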
lotus
577b53aee6 added more search engines
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-24 13:05:20 +00:00
lotus
7f4d411c0d npe-fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5364 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-24 13:04:57 +00:00
lotus
1545e5440a * index deletion: checkbox-confirmation
* watch crawler: less load on exhausted peers; wait for data before reloading again

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-23 12:02:58 +00:00
orbiter
10f5ec1040 reverted last commit (more testing needed)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-22 00:12:50 +00:00
daburna
ba5b274b8c #translation update:
-blacklist
-crawlstart
...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-21 16:45:45 +00:00
orbiter
0ca4bc7b79 - added reader and visualization for mediawiki-export files:
files exported from mediawiki using the xml schema according to
http://www.mediawiki.org/xml/export-0.3/
can be processed to be viewed in a YaCy servlet.
To access such a file, place it into
DATA/HTCACHE/mediawiki/
e.g. the export from German Wikipedia would be:
DATA/HTCACHE/mediawiki/wikipedia.de.xml
This file can then be accessed using the URL
http://localhost:8080/mediawiki_p.html?dump=wikipedia.de.xml&title=YaCy
if this is done the first time, an index file is created
(for this case: more than 4 million lines must be written, this takes about 15 minutes)
Then try the same url again.

- enhanced also the md5 computation speed


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-20 18:31:52 +00:00
lotus
4f996a7651 fix for logparser pattern
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-17 16:23:17 +00:00
orbiter
867d0f2f56 removed some unnecessary pause delays
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 23:36:33 +00:00
lotus
fd83e59f8e new remote search average
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 11:50:46 +00:00
orbiter
dba7ef5144 extended crawling constraints:
- removed never-used secondary crawl depth
- added a must-not-match filter that can be used to exclude urls from a crawl
- added stub for crawl tags which will be used to identify search results that had been produced from specific crawls
please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'.
Additionally, a new parameter named 'mustnotmatch' can be used, which should by default be the empty string (match-never)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 09:58:56 +00:00
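The must-match and must-not-match filters from commit dba7ef5144 above can be sketched with two regular expressions. The function name and the `.*` default for `mustmatch` are assumptions. Full-string matching is assumed as well; under that assumption the empty `mustnotmatch` pattern never matches a real URL, which agrees with the "match-never" remark in the commit message.

```python
import re


def url_allowed(url, mustmatch=".*", mustnotmatch=""):
    """True if the URL passes both crawl filters."""
    if re.fullmatch(mustmatch, url) is None:
        return False
    # An empty pattern only matches the empty string, so by default
    # nothing is excluded here.
    return re.fullmatch(mustnotmatch, url) is None
```

For example, `mustnotmatch=r".*\.pdf"` would exclude every URL ending in .pdf from a crawl while leaving all other URLs eligible.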
orbiter
0ae84f4f8e set some default values for a crawl start that should cause less confusion and mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5334 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 19:48:22 +00:00
lotus
4745e89451 auto-choose crawl type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5331 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 14:44:23 +00:00
low012
421d056550 *) changed layout of blacklist administration (less cluttered)
*) it is possible to move/edit/delete more than one entry at a time now
*) it is easier to choose a target for blacklist import now
*) fixed several bugs
*) to be continued...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5330 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 00:47:54 +00:00
orbiter
674ad2d55b different handling of error cases that occur during loading files with http or ftp:
methods throw exception instead of returning an error string

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-11 21:33:40 +00:00
f1ori
ae80f3e6a5 * extend opensearchdescription to support compare_yacy.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5324 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-09 00:23:19 +00:00
orbiter
1b18d4bcf3 enhancement to crawling and remote crawling:
- for redirector and  remote crawling place crawling url on notice queue instead of direct enqueueing in crawler queue
- when a request to a remote crawl provider fails, remove the peer from the network to prevent the url fetcher from getting stuck again

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-06 12:30:55 +00:00
orbiter
3f746be5d4 - consolidation and refactoring of many DHT target - computing methods
- implemented vertical DHT acceptance ("my own DHT") to accept new targets
- added new target computation for global search: addresses vertical targets also
- enhanced remote crawling: collection of remote crawl urls if queue has less than 100 entries (was: 0 entries)
- better performance value computations for PPM selection in network configuration

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-06 10:07:53 +00:00
orbiter
d014b2728a Design-check, Extension and Refactoring of DHT target position computation:
- two different (but mathematically equivalent) computations of the DHT distance have been consolidated
- moved from 0.0 .. 1.0 double-range position computation to 0 .. Long.Max range for DHT targets
- added fast Long - to - hash computation
- high-precision target computation of gaps for new peers
- added new target computation for horizontal and vertical DHT targets (not yet in use)
- old horizontal-only DHT targets will be upwards compatible to new horizontal and vertical DHT positions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-03 00:27:23 +00:00
low012
83967f8c77 *) servlet does not forget chosen blacklist anymore when editing, moving or deleting an entry
*) move or edit will only be performed if new value actually differs from old one

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-30 00:03:14 +00:00
low012
04e41a392f *) fixed bug where RegExes were not deleted and even added to the list a second time when the user tried to edit them
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-29 22:49:44 +00:00
low012
7bac4796d2 *) added servlet which returns all shared blacklists of a peer without information about which part of YaCy (crawler, proxy, ...) blacklist is activated for (to be used for better online import)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-27 17:33:43 +00:00
low012
baae3d91b1 *) fixed warning when compiling listManager
*) fixed display of values of information for which part of YaCy (crawler, proxy, ...) blacklist is activated for
*) replaced regular put() with putXML() in several cases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-27 16:56:19 +00:00
low012
444575e33d *) prevent XSS when importing blacklist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5304 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-27 11:06:38 +00:00
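Commit 444575e33d above does not describe how the XSS hole was closed. A common approach, shown here only as a sketch, is to escape every imported entry before it is written into an HTML page. The function name and the `<li>` markup are hypothetical.

```python
import html


def render_entry(entry):
    """Return an HTML list item with the entry text safely escaped."""
    return "<li>" + html.escape(entry, quote=True) + "</li>"
```

An imported entry such as `<script>alert(1)</script>` is then displayed as text instead of being executed by the browser.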
low012
a99a629ed4 *) quick fix to prevent comments for blog entries which don't exist (http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1554)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5302 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-25 12:04:10 +00:00
low012
00e27e5050 *) fixed bug which made it possible to write files outside of the DATA/LIST directory when creating a new blacklist
*) a blacklist will only be created if no blacklist with same name exists (some refactoring has been necessary for this)
*) further minor fixes
*) to be continued...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-25 00:11:03 +00:00
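The first fix in commit 00e27e5050 above concerns file names that escape the DATA/LIST directory. One way to guard against that, sketched here with a hypothetical function name, is to resolve the target path and require that it sits directly inside the list directory. The actual fix in YaCy may differ.

```python
from pathlib import Path


def safe_blacklist_path(list_dir, name):
    """Return the path for a new blacklist, or raise ValueError.

    Names containing "..", subdirectories or absolute paths resolve
    to a location whose parent is not list_dir and are rejected.
    """
    base = Path(list_dir).resolve()
    candidate = (base / name).resolve()
    if candidate.parent != base:
        raise ValueError("blacklist name leaves the list directory: %r" % name)
    return candidate
```

A caller would additionally check that no blacklist with the same name exists before creating the file, as the second item of the commit describes.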
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used a not really efficient way to detect html encoding strings in texts.
All calling methods have been adapted to call the new class in an enhanced way with fewer parameters.

Many classes in interfaces used XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be checked again since it points to possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
orbiter
47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-21 11:27:50 +00:00
orbiter
1778fb420d - added some performance tweaks to the new BLOB buffer
- removed the now superfluous HT storage thread
- reduced the number of file decompressions by deferring compression to a later point in time


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-19 18:10:42 +00:00
low012
77e41da7d2 *) further propagation of display value (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
*) removed another deprecated parameter "time" which led to ugly -UNRESOLVED_PATTERN- in URL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-18 19:39:46 +00:00
low012
ff46ce8520 *) fixed display=2 (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5283 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-17 19:57:39 +00:00
orbiter
826ca79735 refactoring and new architecture to store the files of the web cache:
- files are not stored any more as individual files
- a new database structure using BLOBHeap files stores many cache entries in common files
- all file-writing procedures have been migrated to generate byte[] objects which are written with the new database methods

this is only an intermediate step to the final architecture, where cached files are written together with their metadata in one single database structure.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-16 21:24:09 +00:00
low012
2b18a9b2c4 *) removed deprecated parameter "time" which led to ugly -UNRESOLVED_PATTERN- in URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5275 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-16 19:31:29 +00:00
orbiter
7860d5d632 fix for bug in seed list management (cause was bad class overloading, only visual effects!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-12 19:51:53 +00:00
lotus
603282bcf4 fix for out of bounds exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5264 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-11 07:47:34 +00:00
orbiter
ffed5fc415 fixed problem with lost peers in database
migrated seedDB from BLOBTree to BLOBHeap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-10 14:40:02 +00:00
lotus
736dd86193 - option enableSimpleConfig can disable hidden tables
- corrected some Xmx values
- friendlier welcome message format

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5259 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-09 12:48:43 +00:00
orbiter
ff68f394dd fix for problem with balancer and lost crawl profiles:
if the crawl profile is lost, no robots.txt is loaded any more

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5258 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-08 18:26:36 +00:00
apfelmaennchen
3717d2057a YaCy-UI: fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1483
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-05 18:50:43 +00:00
orbiter
d0bdcdd57c small changes to attributes of DoS attack protection parameters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 19:44:42 +00:00
orbiter
9ac16f565b - fixed several bugs in database management functions
- fixed a display bug for the performance graph
- fixed deadlock when initialization of awt happens simultaneously
- removed some debugging output

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 18:57:02 +00:00
lotus
7fdf65339d system status dropped into next line if seed server was enabled. display needs about 230px, set fixed width again.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 11:42:24 +00:00
lotus
7782a43060 fix if LANGUAGE: was not defined at the end of the query
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 11:36:17 +00:00
apfelmaennchen
2c23e6ad34 YaCy-UI: added some features for Admin Console
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 05:32:31 +00:00
apfelmaennchen
efcae14714 YaCy-UI:
- added 'Open' button to search result toolbar
- lets you open all selected search results in new window/tab
- added 'any language' filter as default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-02 17:38:03 +00:00
lotus
902a0d0f38 fieldset of system status was bigger than defined space. IE overlapped some text.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 20:33:32 +00:00
orbiter
820a03f9d6 - removed some warnings
- used fix in SVN 5233 for ysearch.java and search.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5237 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 20:20:39 +00:00
lotus
69925a7e91 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1441
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 20:16:50 +00:00
lotus
fe2792e9ce use accept-language header instead of user agent for language detection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 17:47:11 +00:00
lotus
e5904e6a21 removed color definition for input elements in default skin
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 13:07:24 +00:00
lotus
93ddf206e6 opensearch fix if user agent had no language
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 20:13:18 +00:00
lotus
b8538fae04 search form like on result page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 19:59:55 +00:00
lotus
3cce13d1b7 more compact search form
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 18:03:52 +00:00
lotus
95fddf056c - better support for narrow windows on searchpage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 17:50:42 +00:00
orbiter
c8bdd965ec - larger update time for status page
- balancer writes to the log file when robots.txt is the cause of a crawl delay
- removed log output for forced GC
- smaller RAM flush for RWI cache, should cause more usage of cache and faster crawling

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 11:09:46 +00:00
lotus
3a919bf24e better solution for search result layout
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 19:11:16 +00:00
lotus
f95ec8b813 fix for non-accessible 2nd-line tabs on admin console
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 18:33:38 +00:00
orbiter
c44e97d6dd more lines in log on status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 18:07:55 +00:00
orbiter
dc149df3b1 new status page layout:
- smaller kaskelix image to make room for more information
- added the memory graph, since this picture is widely used to monitor YaCy's activities
- added border to log line iframe (looks better together with memory graph)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5224 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 14:59:06 +00:00
daburna
298196e7a4 translation update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 13:24:03 +00:00
lotus
dda771db9d - search result layout
- tray only for windows

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-29 12:39:57 +00:00