Commit Graph

6589 Commits

Author SHA1 Message Date
orbiter
0018163c07 moved table row/column matching method from front-end to back-end
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-26 10:01:27 +00:00
orbiter
e12f1fd821 - added setting of access rights for executable scripts after auto-installation
The correct access right was missing, especially for bin/apicall.sh

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6769 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-25 09:51:01 +00:00
orbiter
21fcbcc35f added sorting function in network table, reverting SVN 6736 (not removing new sorttable)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6768 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-25 07:47:41 +00:00
orbiter
31e29a8831 - removed synchronization during index dump and index cleaning
- added semaphores to synchronize index dump and index cleaning for each process separately

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6767 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-25 07:09:53 +00:00
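
The commit above replaces one shared synchronization with a separate semaphore per process. A minimal sketch of that idea follows; it is not taken from the YaCy sources. The class name `IndexMaintenance`, its methods, and the choice of a non-blocking `tryAcquire` (skip the run if the same process is already active) are assumptions made for illustration only.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration: names and the skip-if-busy policy are assumptions,
// not the actual YaCy implementation.
final class IndexMaintenance {
    // One permit per process: a dump excludes other dumps, a cleaning run
    // excludes other cleaning runs, but dump and cleaning do not block each other.
    private final Semaphore dumpPermit = new Semaphore(1);
    private final Semaphore cleanPermit = new Semaphore(1);

    boolean dump(Runnable work) {
        return runExclusive(dumpPermit, work);
    }

    boolean clean(Runnable work) {
        return runExclusive(cleanPermit, work);
    }

    // Runs the work only if the permit is free; returns false if it was skipped.
    private static boolean runExclusive(Semaphore permit, Runnable work) {
        if (!permit.tryAcquire()) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            permit.release();
        }
    }
}
```

With this layout a second dump requested while a dump is running is skipped, while a cleaning run requested at the same moment still proceeds.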
orbiter
95f31da8da increase dump cache queue length from 1 to 2
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6766 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-24 20:36:35 +00:00
orbiter
fad3abb524 Tables_p.html servlet can now show tables with selected rows using a search field
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-24 10:52:37 +00:00
low012
4c6dc396d8 *) more beautiful (IMO) code, no functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6764 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 21:05:26 +00:00
orbiter
6c093d6aed - enhanced domain navigator computation
- fixed domain navigator content in case that a mustmatch constraint was given

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6763 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 13:41:41 +00:00
orbiter
bb63c5d075 using a Pattern object with precompiled regular expressions to apply must-match constraints to search results: should speed up pre-sorting of search results and should cause richer search result sets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6762 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 10:17:28 +00:00
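
The precompiled-pattern approach named in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the YaCy code: the class `MustMatchFilter` and its `filter` method are hypothetical names, and applying the constraint to URL strings is an assumption. Only `java.util.regex.Pattern` and `Matcher` are real library APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration: class and method names are not from the YaCy sources.
final class MustMatchFilter {
    // Compiled once in the constructor and reused for every candidate,
    // instead of calling String.matches(regex), which recompiles each time.
    private final Pattern mustMatch;

    MustMatchFilter(String regex) {
        this.mustMatch = Pattern.compile(regex);
    }

    // Keeps only the entries that match the whole pattern.
    List<String> filter(List<String> urls) {
        List<String> accepted = new ArrayList<>();
        for (String url : urls) {
            if (mustMatch.matcher(url).matches()) {
                accepted.add(url);
            }
        }
        return accepted;
    }
}
```

The saving comes from compiling the expression a single time per query rather than once per result.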
suessthomas
5233177a7f A small typo fixed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-23 08:44:34 +00:00
orbiter
e0da0a84b0 performance fix in http parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-22 09:12:52 +00:00
orbiter
90dd197ae7 - no latency for local crawls
- catch interrupted exception during 'fast' crawls in workflow processor

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-22 09:12:18 +00:00
lotus
ea69300857 fix bad floating navigators on little results
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6758 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 22:09:55 +00:00
orbiter
bfb518cd47 some refactoring to get the LoaderDispatcher a little bit more independent from the switchboard
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:28:03 +00:00
orbiter
36bd843ece fix for RFC5322 conformance as suggested by Quix0r in http://forum.yacy-websuche.de/viewtopic.php?p=19585#p19585
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6754 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:23:47 +00:00
orbiter
c855fc48c6 only load robots.txt for http and https protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6753 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 10:15:11 +00:00
orbiter
0465f28f7f applied 'null in rss2.js' fix from Quix0r, see
http://forum.yacy-websuche.de/viewtopic.php?p=19612#p19612

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-20 09:58:05 +00:00
orbiter
748abfcffa added patches to prevent yacy-protocol DoS settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 15:31:15 +00:00
orbiter
e820ed061a avoiding excessive DNS lookups to determine localhost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6750 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 14:28:25 +00:00
orbiter
11983bc936 redesigned some parts of the parser entry point:
- whenever the parser is entered, a whole set of possible parsers is computed from the given mime type and file extension;
that means that all parsers are considered whose registered mime type acceptance and extension acceptance match.
Several parsers may therefore be tried for the same file, which leads to success in cases where previously only the mime type was used to choose the parser and the host httpd reported the mime type wrongly.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6749 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 13:04:42 +00:00
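
The candidate-set computation described in the commit above might look roughly like the sketch below. It is not the YaCy parser code: `ParserSelector`, its two lookup maps, and the use of plain strings as parser identifiers are hypothetical, and treating the candidate set as the union of the mime type matches and the extension matches is an assumption drawn from the commit text.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: names and data layout are assumptions, not YaCy code.
final class ParserSelector {
    // Both maps are expected to use lower-case keys.
    private final Map<String, List<String>> parsersByMime;
    private final Map<String, List<String>> parsersByExtension;

    ParserSelector(Map<String, List<String>> parsersByMime,
                   Map<String, List<String>> parsersByExtension) {
        this.parsersByMime = parsersByMime;
        this.parsersByExtension = parsersByExtension;
    }

    // Returns every parser registered for the mime type or for the extension,
    // mime type matches first, without duplicates. A caller can then try them in
    // order, so a wrong mime type from the server no longer rules out the parser
    // that the file extension points to.
    Set<String> candidates(String mimeType, String extension) {
        Set<String> result = new LinkedHashSet<>();
        if (mimeType != null) {
            result.addAll(parsersByMime.getOrDefault(
                    mimeType.toLowerCase(Locale.ROOT), Collections.emptyList()));
        }
        if (extension != null) {
            result.addAll(parsersByExtension.getOrDefault(
                    extension.toLowerCase(Locale.ROOT), Collections.emptyList()));
        }
        return result;
    }
}
```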
orbiter
de88200e11 - added Byte Order Mark recognition to serverObjects
The BOM character FEFF may appear at the beginning of strings if some browsers prepend the characters %EF%BB%BF to input values.
see http://en.wikipedia.org/wiki/Byte_order_mark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6748 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-19 10:58:40 +00:00
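
The bytes EF BB BF are the UTF-8 encoding of U+FEFF, so a decoded input value that was sent with that prefix starts with the BOM character. A minimal sketch of stripping it is shown below; `BomStripper` and `stripLeadingBom` are hypothetical names, and how serverObjects actually handles the character is not shown in this log.

```java
// Hypothetical illustration: not the serverObjects implementation.
final class BomStripper {
    private static final char BOM = '\uFEFF';

    // Returns the value without a single leading byte order mark;
    // null, empty, and BOM-free strings are returned unchanged.
    static String stripLeadingBom(String value) {
        if (value != null && !value.isEmpty() && value.charAt(0) == BOM) {
            return value.substring(1);
        }
        return value;
    }
}
```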
orbiter
89b4fff1c2 adopted ant script for new exif library
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-12 12:36:38 +00:00
orbiter
24e5faee75 added exif parsing for jpg images
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6745 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-12 12:23:38 +00:00
orbiter
82f76e1296 removed log line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6744 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 20:31:38 +00:00
orbiter
0f8004f9da enhanced html parser to recognize a href tags inside header tags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6743 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 17:52:07 +00:00
orbiter
3300930fc5 - (almost) fixed FTP crawler
- integrated/fixed SMB crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6742 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 15:43:06 +00:00
orbiter
35d0057cb0 stopYACY.sh can now use curl
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 00:12:53 +00:00
orbiter
61493a9a9f added more information about metadata in ViewFile.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-11 00:11:14 +00:00
orbiter
1198b9989d bugfixes, more sorttable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 15:39:36 +00:00
orbiter
27b2998eb4 added searchtable function to more tables in interface
you can now sort by any column in most tables in YaCy just by clicking on the headline column of the table

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 10:05:41 +00:00
orbiter
9623d9e6d2 added a smb loader component for the YaCy crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6737 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 08:55:29 +00:00
orbiter
c77fbd0390 added sorttable (http://www.kryogenix.org/code/browser/sorttable/)
javascript library to make tables sortable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6736 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 23:40:16 +00:00
orbiter
3014e5f6f9 - integrated live search in the IndexControlURLs input window for URLs:
  this searches for occurrences of the given word in URLs and presents them
  in a pop-up list below the input line
- some bugfixes for the new robots table viewer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 15:44:11 +00:00
orbiter
ae2f3f000f better handling of table copy abandon .. prevent memory leak
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6734 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 13:32:15 +00:00
orbiter
0769517129 added a robots.txt monitor in the crawler monitor submenu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 11:31:15 +00:00
orbiter
48995e71c4 added soft-auth to general authentication scheme
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6732 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 00:07:17 +00:00
orbiter
72f00dee59 removed never-used server access account function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-08 22:30:45 +00:00
orbiter
474bb4de82 ups
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6730 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 23:32:18 +00:00
orbiter
8c88abf685 added follow-me link for twitter in status hints
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6729 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 23:29:29 +00:00
orbiter
58d75a6bde allow more results for a single query at the same time if the client is not authorized. This is necessary for the search widget, where the default number of results is now set to 20 instead of 10 so that a scroll bar is shown; the scroll bar is needed to trigger new searches for more results.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6728 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 22:49:20 +00:00
orbiter
57e1eae95e longer time-out for url fetching .. may help to show all the links that the statistics report for a search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 22:23:08 +00:00
orbiter
9e639603e3 after frequent occurrences of 100% CPU usage and permanent blocking, I am trying to disable a function in a method that may cause the problem when calling an external library (apache http client 3.x). The thread dump that shows the problem is attached here:
	at java.lang.StringCoding.encode(StringCoding.java:266)
	at java.lang.String.getBytes(String.java:946)
	at org.apache.commons.httpclient.util.EncodingUtil.getAsciiBytes(EncodingUtil.java:237)
	at org.apache.commons.httpclient.methods.multipart.Part.sendDispositionHeader(Part.java:220)
	at org.apache.commons.httpclient.methods.multipart.Part.send(Part.java:308)
	at org.apache.commons.httpclient.methods.multipart.Part.sendParts(Part.java:385)
	at org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity.writeRequest(MultipartRequestEntity.java:164)
	at de.anomic.http.client.Client.zipRequest(Client.java:364)
	at de.anomic.http.client.Client.POST(Client.java:339)
	at de.anomic.yacy.yacyClient.wput(yacyClient.java:285)
	at de.anomic.yacy.yacyClient.transferURL(yacyClient.java:1053)
	at de.anomic.yacy.yacyClient.transferIndex(yacyClient.java:942)
	at de.anomic.yacy.dht.Transmission$Chunk.transmit(Transmission.java:200)
	at de.anomic.yacy.dht.Dispatcher.storeDocumentIndex(Dispatcher.java:397)
	at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:103)
	at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:66)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:637)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 21:19:23 +00:00
orbiter
4144927d94 show fewer errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6725 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 21:02:08 +00:00
mikeworks
736df39c9c Updated German translation de.lng: mainly ViewFile.html additions and removed (De)Select All from Table_API_p.html section
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6724 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 16:31:49 +00:00
orbiter
b88f5fbb4b slightly changed crawling policy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6723 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 01:46:08 +00:00
orbiter
de01fe0e6d fix for bug in url parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6722 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 01:33:18 +00:00
orbiter
7684a575c4 fix for deletion of error database each time when YaCy starts up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6721 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 00:33:39 +00:00
orbiter
f561e340c6 show more results of single domains when not authorized fully (up to 100)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6720 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 00:12:58 +00:00
orbiter
c4bdb1e7f2 added one more option in ViewFile to show an iframe, as for the original web page content, but using the cache rather than the direct link to the content on the web. Upgraded the very old and no longer used CacheResource_p servlet to a new and working version.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6719 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-06 23:41:51 +00:00
orbiter
c09a995930 better logging of double occurrences of urls in the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6718 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-06 20:31:30 +00:00