Commit Graph

232 Commits

Author SHA1 Message Date
orbiter
1ed3e2daef added option to extract domains and/or urls from the eurl database
when extracting from eurl, the html output format is recommended, since
this format also adds the failure reason to the domain/url.
The complete syntax for domain extraction is now
java -Xmx<megabytes>m -classpath classes yacy -domlist [ -source { lurl | eurl } ] [ -format { text  | zip | gzip | html } ] [ <path to DATA folder> ]


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-24 08:08:33 +00:00
orbiter
58df8b7bbf a large collection of different changes
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused some elements to be iterated twice
* very strong speed enhancement for url/domain extraction

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-23 22:39:41 +00:00
orbiter
493b1cd2bf better logging for domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:43:56 +00:00
orbiter
685430a1b5 bugfix in new URL class, better logging for domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:33:01 +00:00
orbiter
c57b78722b added some more logging to domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2316 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:56:40 +00:00
orbiter
cc2be7fb43 fix for genurllist in case of bad urls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:00:21 +00:00
allo
ff3f174a2d case-insensitive command-line options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 11:20:22 +00:00
allo
ff39a7a0d1 Overlay for welcome.*
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2299 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 11:07:30 +00:00
allo
8795875800 dirlisting for all empty directories.
updating dir.java is no problem anymore, because it is only needed in htroot/htdocsdefault.
migration to delete old dir.* files in the fileshare

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-17 15:49:42 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
orbiter
92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
the start-up delay is used to pre-load the database node cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 23:57:33 +00:00
hydrox
53077f5835 *)fixed paths to yacy.logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-28 09:04:53 +00:00
orbiter
12af69dd86 cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 11:49:31 +00:00
theli
45b39ee1be *) solving unpacking problems with too long filenames by
   a) renaming the parent folder in the tgz file to yacy
      (can be configured via build properties file)
   b) reconfiguring build file to throw an error if a file
      name is too long
Please note that currently there is _no_ problem with too long
class names because of step a.

      

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-14 15:18:41 +00:00
theli
f01bd25489 *) Bugfix for OutOfMemory problem during minimizeUrlDB
See: http://www.yacy-forum.de/viewtopic.php?t=2498
*) out of date import functions removed (can be done via web gui)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2189 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-09 05:38:59 +00:00
orbiter
4a907a570f 1st step to migrate kelondroTree to usage of kelondroRow instead of byte[][]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-31 23:31:46 +00:00
orbiter
eaa6f012f0 refactoring: better naming for classic DB (files in WORDS)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:59:16 +00:00
orbiter
5041d330ce refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2150 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:44:50 +00:00
orbiter
7b3b12888c refactoring: integrated indexContainer abstraction layer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 01:09:31 +00:00
orbiter
cb295fbbdc refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2147 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 23:55:30 +00:00
rramthun
bc94a714b2 Better explanation for the auto-dom-filter.
Some javadoc.
Small change to DetailedSearch.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2146 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 12:18:12 +00:00
orbiter
196b8abb30 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2144 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-26 09:32:50 +00:00
orbiter
757ec28430 refactoring: better data capsulation for indexURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-23 08:59:45 +00:00
theli
8b7626f8d1 *) Automatic redirection of browser if user changes port settings in ConfigBasic
See: http://www.yacy-forum.de/viewtopic.php?t=2415
*) If ssl is available, the browser connects to yacy via https on yacy startup
   See: http://www.yacy-forum.de/viewtopic.php?p=21649#21649

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2127 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-20 14:05:49 +00:00
orbiter
90d569d70f refactoring of index management:
url storage is part of index management; moved plasmaURL to indexURL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:50:55 +00:00
orbiter
a930be4ba3 refactoring of index management:
generalized the index entry

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2121 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-19 23:19:20 +00:00
orbiter
a474669338 start with refactoring of index management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-16 16:11:55 +00:00
orbiter
015d044c25 tried to fix some problems with latest changes to httpc
very experimental!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-10 16:01:14 +00:00
orbiter
fd7c17e624 added virtual host support:
all yacy-to-yacy communication now sends the <peer-hexhash>.yacyh
virtual domain inside the http 'Host' property field.
This shall enable running a yacy peer on a virtual host.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 13:11:00 +00:00
hydrox
49f3b56526 *) URLCache in minimizeUrlDB can be changed now (default is 4 MB)
*) moved Exception Stackprints to loggingengine

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2028 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 08:20:12 +00:00
orbiter
0c9b61820e enhanced re-crawl settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1960 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-24 13:45:01 +00:00
rramthun
42b0b10a95 -Adding Windows Media to types which are not sent compressed
-Renaming writeandzip to writeandgzip to avoid confusion about type of compression
-Adding new startup message to windows script
-The usual language "enhancements" ;-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1953 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-23 20:12:23 +00:00
orbiter
128e4ab199 - in serverSystem: maxPathLength is now a variable, not a method
- upon startup the calculated maximum path length is shown

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1932 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 01:33:20 +00:00
rramthun
9c85820d35 added MIME-type for wmv and rm
removed double copyright at startup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1922 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-18 12:39:00 +00:00
orbiter
488a0ed580 replaced old keyIterator and rowIterator by buffered iterators
that are synchronized with database access
Main change is done in kelondroTree, other classes are only adaptations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1918 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 23:43:24 +00:00
allo
2b31f51896 bugfix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1915 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 21:09:07 +00:00
orbiter
3286b1f498 re-organisation of lurl-creation and -stacking
this was necessary to prevent useless writes to the database
in case the url appears on the blacklist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 10:16:07 +00:00
rramthun
9f979d4fa5 Domain-lists gzip-compressable and sendable via cr-send/receive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-13 20:12:31 +00:00
allo
f6452879d5 prevent nullpointer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-08 11:31:39 +00:00
allo
a8fa9990aa default skins support
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1825 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-05 10:07:52 +00:00
orbiter
3173b5c9b3 fixed port parsing during shutdown for extended port format
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-04 11:07:01 +00:00
orbiter
1b9b8922d9 * fixed problems with new basic 1-2-3 configuration (now authentication required)
* fixed graphics problem
* fixed some other problems with default values
* 1-2-3 config now appears automatically on start-up if no password is set
* added new config menu
* moved profile to new config menu


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1792 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-01 22:27:20 +00:00
orbiter
3703f76866 - fixed re-search bug: after a search with several words, a second search could not
find the same words as before. This was caused because indexContaines stored the url references
  with a hashtable. A tree was needed to work with the index conjunction-by-numeration
- added permanent ram cache flush (again)
- removed direct flush of ram cache after a large container is added.
  this happens especially during DHT transmission and therefore this fix should
  speed up DHT transmission on server side.
- removed unused and out-dated methods

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1765 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-25 08:42:45 +00:00
rramthun
013b24ea0d First version of Italian translation by Riccardo Lemmi
Updated German language with bugfix.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-21 14:10:00 +00:00
orbiter
34341a868e code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-19 00:39:16 +00:00
allo
bfd37e34aa using other XML Parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-18 12:00:13 +00:00
theli
9b941fb773 *) bugfix for usage of yacy with extended port binding (e.g. #eth0:8080, 192.168.0.1:8080, etc.)
- port was reported incorrectly to other peers


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-17 10:53:20 +00:00
orbiter
7eb10675b3 re-organization of index management
this was done to be prepared for new storage algorithms


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-14 00:12:07 +00:00
allo
40199cea1f migration with svn Numbers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1623 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-12 16:46:43 +00:00
orbiter
1e4578aab6 VERY EXPERIMENTAL removal of index ram cache flushing thread.
The cache will fill up and be flushed explicitly when it is full.
This shall remove double-access of assortments (indexing and flush)
during indexing process. Hopefully this should reduce IO.
The main idea is: the cache shall mainly be flushed by DHT transfer, and
only indexes that shall be hosted by the own peer are flushed to the
assortments. This needs further work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-11 23:19:01 +00:00
allo
f61161b90b fix for translations on startup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-07 18:06:00 +00:00
allo
7bd61ab0e5 Locales will now be in DATA/HTDOCS. So it works with readonly htroot.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1527 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-04 10:50:22 +00:00
allo
1f3eaf9f8e use DATA/HTDOCS for notifier.gif. Works even if htroot is readonly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-03 21:21:42 +00:00
theli
0fbe1a4515 *) Adding additional shutdown method which is needed to run yacy as a windows service
See: http://www.yacy-websuche.de/wiki/index.php/De:WinService

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-01 11:03:37 +00:00
orbiter
ec5d88664a tried to fix serverSwitch synchronization problems
see also: http://www.yacy-forum.de/viewtopic.php?p=16110#16110

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 23:07:20 +00:00
orbiter
3419b3bcdd fix for bug that caused the peer-counter problem.
See http://www.yacy-forum.de/viewtopic.php?p=16016#16016
The kelondroDyn now uses a generic fill character.
kelondroDyn-Tables containing peer/word/url-hashes must not use '_'
as fill character.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 22:16:55 +00:00
theli
48e302252e *) adding possibility to build a distribution containing an exe file for windows users
see: build file target "distWinExe"

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 16:16:37 +00:00
orbiter
fa90c3ca7a - removed some usage of indexEntity
- changed index collection process: indexes are not first flushed to indexEntity,
  but now collected directly from ram cache and assortments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 12:42:06 +00:00
theli
09dc7bbcd7 *) Adding function to scan seed.DBs for peers affected by the
"too short peer hash"-Bug.
   See: http://www.yacy-forum.de/viewtopic.php?p=16056

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 08:31:14 +00:00
theli
2a7c958877 *) Adding function to scan seed.DBs for peers affected by the
"too short peer hash"-Bug.
   See: http://www.yacy-forum.de/viewtopic.php?p=16056

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 08:28:22 +00:00
theli
c69f7a39a3 *) adding a startup-test to avoid running into the unzip bug
See:
   http://www.yacy-forum.de/viewtopic.php?t=1763
   http://www.yacy-forum.de/viewtopic.php?t=715
   http://www.yacy-forum.de/viewtopic.php?t=1674

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 08:33:52 +00:00
theli
b4e2efef10 *) first test of new iteration function
ATTENTION: please don't use it at the moment

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 17:20:30 +00:00
orbiter
f4ffa9aee5 - implemented more attributes to index entries
- implemented hand-over of new word index attributes during remote search
- implemented word-distance computation during search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-20 15:14:21 +00:00
allo
b453199c68 first step for a special migration class.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-17 21:33:19 +00:00
hydrox
695dfb7eab *) -rwihashlist can now write to a zip-file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-15 10:29:48 +00:00
allo
4f8127946e inc Files are now translatable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-14 22:11:45 +00:00
allo
fe2d983c3e recursive Translations!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-14 11:56:20 +00:00
hermens
971247b78f - rotate merged indexes after merging
see: http://www.yacy-forum.de/viewtopic.php?t=1717
- fix -rwihashlist to correctly shutdown



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1336 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-13 23:59:04 +00:00
orbiter
21fac0b6da small bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1310 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-11 00:43:00 +00:00
orbiter
2028403670 - consolidated different orderings to kelondroNaturalOrder
- added another iteration method to rwihash-enumeration


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-11 00:32:44 +00:00
orbiter
9544c47684 added some UTF-8 handling.
hope this will help somehow.. for sure not THE solution to our UTF-8 problem


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-10 16:48:59 +00:00
orbiter
537a819824 extended RWIHashList DHT control method:
it is now possible to select only assortments or only files in WORDS
selection of words only from the ram cache is not yet possible.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-10 01:04:22 +00:00
hydrox
8b6d31763d *)added function to create a list of all RWI hashes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1287 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-04 13:55:45 +00:00
orbiter
9086261476 refactoring of base64 encoding:
the kelondro database needs specific information about the order of
base64-encoded keys. Since no other package depends on base64
(only the httpd uses base64 for encryption, but does not need to encode these strings)
it is good to move base64 encoding to the new ordering classes in kelondro.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1284 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-04 00:39:00 +00:00
rramthun
d0c2c67f4c Update YaWoStat version.
See http://www.yacy-forum.de/viewtopic.php?p=14215#14215 for possible use.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-19 19:14:18 +00:00
hydrox
9b617bcb65 *)compression of -domlist now optional (-format zip)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-17 21:19:51 +00:00
hydrox
2bd4a66133 *)-domlist now creates a zipped txt-file.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-17 15:43:13 +00:00
orbiter
4500506735 fixed some bugs concerning url entry retrieval and indexControl interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-15 10:31:00 +00:00
orbiter
bb79fb5d91 - changed handling of error cases retrieving urls from database
(no more NULL values are returned, instead, an IOException is thrown)
- removed ugly damagedURLS implementation from plasmaCrawlLURL.java
  (this inserted a static value into the Object which is not really a good style)
- re-coded damagedURLS collection in yacy.java by catching an exception and evaluating the exception message
to do:
- the urldbcleanup feature must be re-tested


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-11 00:25:02 +00:00
theli
5a627a690f *) Extending hydrox urlDbCleanup function
- now the function tries to correct the URL first
   - if the url can not be corrected it will be deleted
   See: http://www.yacy-forum.de/viewtopic.php?p=13898

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-09 15:41:06 +00:00
hydrox
96930f0d2b *)added function to remove malformed URLs from urlHash.db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-07 11:10:08 +00:00
orbiter
d007d14905 re-insert of migrateSwitchConfigSettings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-07 10:31:48 +00:00
orbiter
0e88ba997e * added option to generate url-lists as plain text file or in html
* modified generation of dom-lists so that they can be also generated as html
these options can be called as:
java -classpath classes yacy -domlist -format html
java -classpath classes yacy -domlist -format html .
java -classpath classes yacy -domlist -format text .
java -classpath classes yacy -urllist -format html .
java -classpath classes yacy -urllist -format text .
the -format <type> option can be omitted; text is the default
a home path can be given or omitted at the end of the parameters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-07 01:40:52 +00:00
orbiter
37f88b4017 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 23:51:29 +00:00
orbiter
ec2b39c1ce code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1175 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 22:30:15 +00:00
orbiter
76618442e0 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 21:21:14 +00:00
orbiter
7920e1547d code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 09:13:13 +00:00
orbiter
1d6a6d1f85 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 00:17:12 +00:00
orbiter
bfe51c7228 added generation of domain-list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1112 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-21 01:30:30 +00:00
theli
8e308cf50e *) Possibility to change the server port on-the-fly.
- Now it's possible to change the server port without the need to restart the whole server.
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-15 15:03:15 +00:00
theli
3631cb1f6d *) deleting empty entities during index selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1086 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-15 12:23:46 +00:00
theli
ca26aab9b1 *) More debugging output for migrateWords
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1085 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-15 11:55:09 +00:00
theli
3c11d7b81c *) Bugfix for minimizeUrlDB
- function didn't work correctly because of new url hash structure
   See: http://www.yacy-forum.de/viewtopic.php?p=12753#12753

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1080 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-15 07:35:04 +00:00
orbiter
9913049009 fixed outOfMemory bug caused by loops in kelondroTree during enumeration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-15 01:20:05 +00:00
theli
fd58d5f8e6 *) Adding possibility to specify the interface / IP-Address where YaCy should bind to.
- e.g. Port = 192.168.0.1:8080
          Port = #eth0:8080
          Port = 8080

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-13 17:03:52 +00:00
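Note on fd58d5f8e6: the three Port forms listed in that commit message (IP:port, #interface:port, plain port) could be told apart roughly as in the sketch below. This is only an illustration under assumptions: the class PortSpec and its fields are hypothetical names, not YaCy's actual code, and the sketch only splits the string without resolving interface names or opening sockets.

```java
/** Hypothetical holder for a parsed "Port" setting; not taken from the YaCy sources. */
final class PortSpec {
    final String bindTo;        // IP address or interface name; null means "no specific binding"
    final boolean isInterface;  // true if the value used the "#<interface>" form
    final int port;

    private PortSpec(String bindTo, boolean isInterface, int port) {
        this.bindTo = bindTo;
        this.isInterface = isInterface;
        this.port = port;
    }

    /** Splits values such as "192.168.0.1:8080", "#eth0:8080" or "8080". */
    static PortSpec parse(String value) {
        String v = value.trim();
        int colon = v.lastIndexOf(':');
        if (colon < 0) {
            // plain port number, no interface or address given
            return new PortSpec(null, false, Integer.parseInt(v));
        }
        String host = v.substring(0, colon);
        int port = Integer.parseInt(v.substring(colon + 1));
        if (host.startsWith("#")) {
            // leading '#' marks an interface name rather than an IP address
            return new PortSpec(host.substring(1), true, port);
        }
        return new PortSpec(host, false, port);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"192.168.0.1:8080", "#eth0:8080", "8080"}) {
            PortSpec p = parse(s);
            System.out.println(s + " -> bindTo=" + p.bindTo
                    + ", interface=" + p.isInterface + ", port=" + p.port);
        }
    }
}
```

Splitting at the last colon keeps the sketch short; it does not attempt to handle IPv6 literals, which the commit message does not mention.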
allo
889de6686c Migration in yacyVersion
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1070 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-13 15:12:48 +00:00
orbiter
79818a320f introduced citation-rank transmission protocol and activate transport for anonymisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-10 23:48:20 +00:00
orbiter
02f8013013 auto-delete of corrupted word files during word-migration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 14:57:37 +00:00
hydrox
56b9f34411 *)removed unused imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 16:30:45 +00:00
orbiter
4d1e56e4d9 fixed intermission-bug (removed 'break for intermission' of httpd-thread)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 10:46:13 +00:00