Commit Graph

7656 Commits

Author SHA1 Message Date
orbiter
204e98db3a added a protection against rwi flooding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-10 01:10:49 +00:00
orbiter
7598a9e26b fix for thread dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-07 23:23:49 +00:00
orbiter
3f606407bc added new scripts to bin in build
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7991 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-07 22:57:20 +00:00
orbiter
8eef8722d1 update to ThreadDump analysis: freerunner and thread state recognition
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-07 22:53:14 +00:00
orbiter
1df43b137d another performance hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7989 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-06 23:35:14 +00:00
orbiter
7df0643f0e performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7988 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-06 23:31:04 +00:00
orbiter
a7df70221e refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-04 09:06:24 +00:00
orbiter
1b45e33f04 added robots tag parser to solr scheme
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-30 13:39:01 +00:00
orbiter
cf4fd525ee added directDocByURL attribute in crawl profile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7985 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-30 12:38:28 +00:00
orbiter
c61e4cfd78 - fix for incomplete clear() in balancer
- renamed Parser Errors to Rejected URLs

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7984 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-30 10:27:14 +00:00
orbiter
813f297a95 another performance hack: re-use of known host addresses for isLocal property; avoids look-up in local hash
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7983 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-30 08:26:31 +00:00
orbiter
035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill)
- this may also have (good) performance side effects on other parts of YaCy


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7982 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-30 07:57:50 +00:00
orbiter
9c131adeb6 show IP of crawled host and country in CrawlResults
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7981 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-29 15:30:15 +00:00
orbiter
b250e6466d implemented crawl restrictions for IP pattern and country lists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7980 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-29 15:17:39 +00:00
f1ori
e207c41c8e * fix urlproxy for urls containing dollar signs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7979 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-29 12:53:55 +00:00
orbiter
3ac6fb0baf added dump check script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7978 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-28 21:18:49 +00:00
orbiter
57d5529a01 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-28 21:16:40 +00:00
orbiter
5ad7f9612b added crawl settings for three new filters for each crawl:
must-match for IPs (IPs that are known after DNS resolving for each URL in the crawl queue)
must-not-match for IPs
must-match against a list of country codes (allows loading only from hosts that are hosted in the given countries)

note: the settings and the input environment are in place with this commit, but the values are not yet evaluated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7976 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-27 21:58:18 +00:00
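The three filters listed in commit 5ad7f9612b can be illustrated with a minimal sketch. It is not taken from the YaCy source: the class name `CrawlIpCountryFilter`, the method `accepts`, and the convention that an empty country set means "no restriction" are all assumptions made for illustration.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch; names and semantics are assumptions, not YaCy code.
final class CrawlIpCountryFilter {
    private final Pattern ipMustMatch;          // IP must match this pattern
    private final Pattern ipMustNotMatch;       // IP must not match this pattern
    private final Set<String> allowedCountries; // assumed: empty set = no country restriction

    CrawlIpCountryFilter(String ipMustMatch, String ipMustNotMatch, Set<String> allowedCountries) {
        this.ipMustMatch = Pattern.compile(ipMustMatch);
        this.ipMustNotMatch = Pattern.compile(ipMustNotMatch);
        this.allowedCountries = allowedCountries;
    }

    // ip: address known after DNS resolution of the URL's host
    // countryCode: country code of the host, or null if it could not be determined
    boolean accepts(String ip, String countryCode) {
        if (!ipMustMatch.matcher(ip).matches()) return false;
        if (ipMustNotMatch.matcher(ip).matches()) return false;
        if (allowedCountries.isEmpty()) return true;
        return countryCode != null && allowedCountries.contains(countryCode);
    }
}
```

With `new CrawlIpCountryFilter(".*", "192\\.168\\..*", Set.of("DE", "FR"))`, a host in a private 192.168 range or outside the two listed countries would be rejected under these assumptions.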
orbiter
47a8c69745 added a new feature to MultiProtocolURIs to get the locale for each url:
This is done using a new library, InetAddressLocator.jar, which is NOT added to YaCy by default: it is very old, and bundling it would prevent YaCy from ever getting a Debian package. However, some people want this functionality, so it can be enabled by taking the library from http://javainetlocator.sourceforge.net/ and placing it in the /lib directory, where it will be found using reflection.
The new feature will be used to extend the crawler steering.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7975 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-27 15:26:14 +00:00
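The "found using reflection" approach in commit 47a8c69745 can be sketched roughly as follows. This is an illustration of the general technique only, not the YaCy implementation: the helper class `OptionalLibrary` and its method `findStaticMethod` are hypothetical names, and the caller is assumed to supply the class and method names of the optional library.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical helper; not part of YaCy or of InetAddressLocator.
final class OptionalLibrary {
    // Looks up a public method on a class that may or may not be on the classpath.
    // Returns an empty Optional if the class or method is missing, so the
    // caller can simply skip the feature instead of failing at start-up.
    static Optional<Method> findStaticMethod(String className, String methodName,
                                             Class<?>... parameterTypes) {
        try {
            Class<?> clazz = Class.forName(className);
            return Optional.of(clazz.getMethod(methodName, parameterTypes));
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return Optional.empty(); // library not installed or not compatible
        }
    }
}
```

Because the optional class is referenced only by name, the code compiles and runs without the jar. A caller would invoke the returned `Method` with `invoke(null, args...)` only when the `Optional` is present.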
orbiter
2c3161b4ac refactoring:
RankingProcess -> RWIProcess
ResultFetcher -> SnippetProcess


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7974 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-26 21:42:28 +00:00
orbiter
d2ea250d99 refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00
low012
42b5f09f68 *) this should fix a bug in snippet creation (also cleaned up a little bit)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7972 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:07:22 +00:00
low012
277b454a62 *) added comments
*) minor refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7971 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 13:16:52 +00:00
orbiter
6b22865dbc - removed some warnings
- removed a dead update location

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7970 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-24 01:58:54 +00:00
orbiter
fabda9ad31 added script that can be used to delete a single url from the index
call:
bin/deleteurl.sh <url>


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-21 23:33:44 +00:00
orbiter
0c6d95e57b - more tolerance against failure of table opening
- more connections for solrj

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-21 15:08:05 +00:00
orbiter
30d340563e fix in result count display
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7967 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-21 11:01:01 +00:00
orbiter
4f31869c5a enhanced search result timing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7966 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-21 10:43:08 +00:00
orbiter
6b02b696b0 - add number of search results to end of rss and json output to reflect latest status of retrieval
- distinguish search access with different verify state in access of search cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-20 19:41:44 +00:00
f1ori
87e6abd168 * fix urls containing a port number in urlproxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7964 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-20 15:02:15 +00:00
f1ori
97045022fa * pass cookies to Server Side Includes
* User.html a bit more usable


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7963 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-20 14:54:14 +00:00
lotus
6fba6e7cee fix: follow link target setting on image search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-18 16:59:01 +00:00
orbiter
ce2a76d603 performance hack for search process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-16 10:00:51 +00:00
orbiter
a6bb0f9af4 fixed missing menu entries in access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-15 23:26:09 +00:00
orbiter
aaf7a0feaa yet another cache strategy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7959 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-15 22:40:01 +00:00
orbiter
8a428d3e77 ensure termination of pdf parser to avoid deadlocking of other processes during search result preparation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7958 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-15 11:17:38 +00:00
orbiter
2c4a672fe2 bugfixes and performance hacks for table index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7957 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-15 11:17:02 +00:00
orbiter
dad5b586a4 added a concurrent warm-up of Table data structures. That should speed up the start-up process but may also cause higher CPU load at that time.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7956 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-15 10:01:21 +00:00
orbiter
734059d33e performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7955 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 23:34:05 +00:00
orbiter
23e81b28b2 synchronization enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 21:19:02 +00:00
orbiter
dd4635e323 patches
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7953 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 20:11:27 +00:00
orbiter
65ab067491 migration to solrj 3.4.0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 20:08:59 +00:00
orbiter
ffd848c7a9 moved the log, memory, processes and the messages into a new computation monitor main menu item
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7951 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 09:59:30 +00:00
orbiter
ef72fdac79 added keyboard-based search result page navigation:
- page-up or tab switches to next search result page
- page-down switches to previous search result page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 09:15:09 +00:00
orbiter
e48ce5d80e - style change for search box: larger font, selected by default
- style change for search results: by default no parser, size, image info

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 09:05:06 +00:00
orbiter
5905392ca3 redesign and simplification of main menu; bundling of some sub-menus
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7948 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 01:24:56 +00:00
orbiter
e5a93a1742 fix for image name
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 00:33:08 +00:00
orbiter
5fd4f3fef8 fresh look for yacy icons
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7946 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 00:03:33 +00:00
orbiter
95790b82d9 replaced old-style favicon
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7945 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-13 23:34:36 +00:00
orbiter
bb0c045036 fix for problem with relocation of network
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7944 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-13 18:46:11 +00:00