Commit Graph

369 Commits

Author SHA1 Message Date
reger
1adb4b8741 merge rc1/master 2013-10-16 03:02:21 +02:00
sixcooler
dfb73c9519 bump to httpclient-4.3.1 - a bugfix release 2013-10-14 23:32:24 +02:00
reger
cf32a92629 - add size check to multipart form data handling of YaCyDefaultServlet (same as in HTTPDemon.parseMultipart)
- reduce Jetty logging 
- give build.run a bit more memory (set to YaCy.default 600m from 512m)
2013-10-13 20:56:03 +02:00
reger
a44eede8b8 merge rc1/master 2013-10-11 01:50:25 +02:00
reger
71d2655c02 downgrade to Jetty 8 to assure support of JRE 1.6
- introduce a YaCyHttp interface to modularize/separate http server
- adjust the Jetty version specific implementation part (in package net.yacy.http)
     - putting the version specific code in classes starting with Jetty8xxxx
     - moved existing Jetty9xxx implementation into a test class (to keep the code)
- adjust build to the changed jars
- make use of the introduced YaCyHttpServer interface in related htroot servlets

- adjust other test cases/classes
2013-10-09 00:40:48 +02:00
Michael Peter Christen
21aa6a0321 migration to Solr 4.5.0 2013-10-07 17:09:40 +02:00
reger
f46771bdf5 upd build script from rc1/master 2013-09-30 03:47:55 +02:00
sixcooler
15b1bb2513 bump to httpClient-4.3 2013-09-25 14:48:37 +02:00
reger
f7f86d8a5d update to Jetty 9 jars
- include javax.servlet 3.0
2013-09-14 20:49:05 +02:00
reger
aafef72a8a merged current rc1/master into jetty branch to allow further development with latest version
ServerSideIncludes and servlet return values need further work (for working jetty integration)
- TODO: added nasty quickfix to allow SSI -  needs further work
- TODO: YaCy servlet return values/parameters are not handled
2013-09-09 02:36:06 +02:00
sixcooler
5189620026 add branch to package name if not built from master 2013-08-06 03:48:29 +02:00
Michael Peter Christen
5b7c0d0745 update to pdfbox 1.8.2 2013-07-30 14:14:16 +02:00
Michael Peter Christen
f13df9dbb6 migration to solr 4.4.0 2013-07-30 14:01:16 +02:00
reger
3760e2616b bump up lib/metadata-extractor-2.6.2.jar (used for image parser) with needed code adjustments 2013-06-25 23:24:02 +02:00
Michael Peter Christen
5f92c68f1f removed block rank ranking and all YBR files in /ranking 2013-05-30 13:01:22 +02:00
Michael Peter Christen
9bd2aee180 migrated to solr 4.3.0 2013-05-09 02:17:53 +02:00
Michael Peter Christen
ad050ec88d - upgraded httpclient, httpcore and httpmime
- removed httpclient 3.1 which has been used by solrj < 4.x.x and is now
not used any more
- fixed some parts in YaCy which used methods from httpclient 3.1
2013-05-09 00:22:45 +02:00
orbiter
48e9a54e80 updated pdf parser 2013-05-08 15:17:06 +02:00
Michael Peter Christen
27907c9739 added missing library after solr upgrade 2013-04-07 10:36:05 +02:00
Michael Peter Christen
cf0acd2cb4 upgrade to solr 4.2.1 2013-04-06 16:11:24 +02:00
Michael Peter Christen
461d46101d - Removed log4j from libraries. This can be removed because the package
log4j-over-slf4j is there. From slf4j all loggings are routed to the jdk
logger. Now all loggings are consistently done to the jdk logger.
- added some lines to the logging properties to suppress many solr
logging statements. The number of the logging entries had already become
a performance issue, therefore removing these from the log should
increase performance.
2013-02-23 16:45:05 +01:00
orbiter
36f9b0fc16 updated wstx-asl to 3.2.9 2013-02-23 14:33:17 +01:00
reger
1951ba61ae remove CPGEN from Windows batch files
(classpath for all needed libraries is defined in manifest  of yacycore.jar)
2013-02-17 03:26:46 +01:00
Michael Peter Christen
09a2b09c48 guava update 2013-02-04 11:21:05 +01:00
Michael Peter Christen
80fe3d7860 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
Conflicts:
	source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
2013-02-04 10:57:54 +01:00
Michael Peter Christen
4323621a76 update to Solr 4.1.0 2013-02-04 10:55:49 +01:00
reger
160ce568b3 move testing SolrServlet.main to test, making include of jetty*.jar in distribution and classpath obsolete
- move jetty*.jar to test library 
- move SolrServlet.main as is to test, add also a junit test simulating main 
  - add build.xml cleanup for EmbeddedSolrConnectorTest created test/DATA
- adjust some test compile errors
2013-02-03 22:32:38 +01:00
reger
be5d3a1066 adding classpath to Manifest of yacycore.jar
- this allows starting without giving an explicit java -cp (just java -jar lib/yacycore.jar works)
- especially helpful when running YaCy as a Windows service,
  since the classpath config of the service wrapper no longer needs adjusting on upgrades of the lib/*.jar files
2013-01-29 03:01:57 +01:00
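The manifest change in be5d3a1066 can be illustrated with a minimal Java sketch. It builds a manifest in memory with a Class-Path attribute and reads the entries back. The class name ManifestClassPath, the main class value and the jar names are assumptions for illustration and are not taken from the YaCy build.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestClassPath {

    // Builds an in-memory manifest; mainClass and jars are hypothetical inputs.
    static Manifest build(String mainClass, List<String> jars) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise nothing is written out.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are separated by spaces and are resolved
        // relative to the location of the jar that carries the manifest.
        attrs.put(Attributes.Name.CLASS_PATH, String.join(" ", jars));
        return manifest;
    }

    // Returns the Class-Path entries of a manifest, or an empty list if none are set.
    static List<String> classPathEntries(Manifest manifest) {
        String cp = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (cp == null || cp.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(cp.trim().split("\\s+"));
    }
}
```

With such a manifest inside the jar, java -jar resolves the listed libraries itself, so no -cp argument is needed.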
sixcooler
f3e705c4fe bump to httpclient / httpcore 4.2.3 (bugfix-release) 2013-01-17 20:10:49 +01:00
Michael Peter Christen
9dfc9c95d8 updated slf4j and log4j 2012-12-27 04:37:21 +01:00
Michael Peter Christen
95712fdc8b update to pdf parser 2012-12-27 04:16:31 +01:00
Michael Peter Christen
e2c4c3c7d3 migration to solr 4.0.0 2012-11-02 12:29:48 +01:00
Michael Peter Christen
69aa39d664 update to libraries required by solr 4.0.0 2012-11-02 10:27:44 +01:00
sixcooler
9d062873d2 bump to httpclient-4.2.2 2012-10-31 19:09:48 +01:00
sof
5cb244b79b Merge remote branch 'origin/master' 2012-10-05 18:54:39 +02:00
apfelmaennchen
88b062210c Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based
on the jaudiotagger library. The parser is disabled by default as it
needs to store temporary files for non file:// protocols, which might be
disliked. For your local MP3 collection it nicely loads Artist,
Title, Album etc. from the audio files' metadata.
2012-10-05 18:54:26 +02:00
sixcooler
9aa21506be bump to httpcore-4.2.2 (maintenance release) 2012-10-03 02:15:02 +02:00
Michael Peter Christen
d0015df61c added lucene memory library which is now necessary as solr has to
process more complex queries
2012-09-28 13:48:51 +02:00
Michael Peter Christen
80edd8ecd7 some more after-refactoring fixes 2012-09-28 10:24:57 +02:00
Michael Peter Christen
bc865ab816 more cleaning (yacy-cora) 2012-09-25 12:19:24 +02:00
Michael Peter Christen
e65cecc419 - updated lucene libraries to 3.6.1
- added lucene-grouping which enables faceted search; try this:
http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s
2012-09-10 10:12:38 +02:00
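The faceted query from e65cecc419 can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper class FacetQueryUrl. The parameter names and values are the ones from the URL in the commit message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetQueryUrl {

    // URL-encodes one component as UTF-8; the checked exception cannot occur for UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a select URL that asks Solr for facet counts on one field.
    static String build(String base, String query, int start, int rows, String facetField) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", query);
        params.put("start", Integer.toString(start));
        params.put("rows", Integer.toString(rows));
        params.put("facet", "true");
        params.put("facet.field", facetField);

        StringBuilder url = new StringBuilder(base).append('?');
        boolean first = true;
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(enc(p.getKey())).append('=').append(enc(p.getValue()));
            first = false;
        }
        return url.toString();
    }
}
```

Calling build("http://localhost:8090/solr/select", "*:*", 0, 3, "host_s") yields the URL above, except that the colon in the query is percent-encoded as %3A.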
Michael Peter Christen
2ccf1dba71 upgrade to solr 3.6.1 2012-08-17 15:11:21 +02:00
cominch
e74d66e28c augmented browsing: remove htmlparser library 2012-08-14 10:09:46 +02:00
cominch
e2119f4e76 augmented browsing: replace htmlparser by jsoup, which is more stable
and reliable
2012-08-14 10:06:12 +02:00
sixcooler
a99ef68422 bump to httpclient-4.2.1 2012-07-09 18:58:33 +02:00
Michael Peter Christen
65f56b1fd4 Merge branch 'master' of ssh://gitorious.org/yacy/rc1 into jetty
Conflicts:
	.classpath
	build.xml
	htroot/Status.java
	source/de/anomic/http/server/HTTPDProxyHandler.java
	source/net/yacy/yacy.java
2012-06-29 21:16:20 +02:00
Michael Peter Christen
7b53be141f upgraded to pdfbox 1.7.0
changes in http://www.apache.org/dist/pdfbox/1.7.0/RELEASE-NOTES.txt
with many bugfixes, including performance related
2012-06-22 16:49:58 +02:00
Michael Peter Christen
fad3b14813 added jetty libraries, needed for future use as web server and as
application server for the solr search interface
2012-06-22 15:31:17 +02:00
Michael Peter Christen
b9d42fd9c8 using com.google.common.io.Files instead of homebrew methods 2012-06-22 11:39:17 +02:00
Michael Peter Christen
1be0025a9c - added test for EmbeddedSolrConnector
- added needed libraries for this test
this includes most (all) files needed for an embedded solr
2012-06-22 00:36:49 +02:00
Michael Peter Christen
dbdd697f4d moved RDFaParser.xsl configuration file to defaults 2012-06-21 16:09:12 +02:00
Michael Peter Christen
e12bb254b4 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2012-06-21 14:55:50 +02:00
Michael Peter Christen
3f55dc7c1e - added solr core and libraries that solr needs (lucene is missing, will
follow later)
- added embedded solr connector which can connect to solr
programmatically (without using a server in between)
2012-06-21 14:55:38 +02:00
Michael Peter Christen
786be7d175 better integration of RDFaParser 2012-06-20 16:39:04 +02:00
cominch
5d20cd324a Add Triplestore and RDF query interface
Conflicts:
	build.xml
	defaults/yacy.init
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:35:59 +02:00
cominch
b21048892b augmentedParser add features and integrate external html parser to
modify existing web pages

Conflicts:
	addon/YaCy.app/Contents/Info.plist
	build.xml
2012-06-10 10:23:35 +02:00
sixcooler
56087c1f23 bump to httpclient- httpcore-, httpmime- 4.2 2012-05-30 14:46:21 +02:00
Michael Peter Christen
7acd7e88b3 added all shell scripts in /bin to add also latest passwd.sh file 2012-05-29 12:00:32 +02:00
Michael Peter Christen
4d3cc02168 replaced old bzip2 library with the better documented commons-compress
package from http://commons.apache.org/compress/
2012-05-28 23:53:48 +02:00
Michael Peter Christen
ca7de1dbd0 moved files to defaults 2012-05-18 22:33:41 +02:00
Michael Peter Christen
6c4f8fdc44 removed superfluous files 2012-04-30 11:09:54 +02:00
Michael Peter Christen
62f2554a01 - fixed build problems (deprecated methods using httpclient 3.1)
- removed httpclient 3.1 lib which was used by solrj (solrj now uses
httpclient 4)
2012-04-27 17:46:08 +02:00
Michael Peter Christen
f838997126 updated commons io from 2.0.1 to 2.1 2012-02-24 01:35:01 +01:00
Michael Peter Christen
eeb57ae824 updated http client libraries 2012-02-24 01:06:30 +01:00
Michael Peter Christen
ffb72249ea added missing apicat.sh 2012-02-01 00:49:40 +01:00
Michael Peter Christen
a30b028cc0 updated libraries 2012-01-18 01:21:41 +01:00
Marek Otahal
a231d0eeb9 Run the whole YaCy app from Java
start for java webStart
allow for better integration with IDE

Conflicts:
	source/net/yacy/gui/framework/Browser.java
2012-01-09 01:49:37 +01:00
Michael Peter Christen
e1434635d4 changed required setting for package signing 2012-01-07 12:37:02 +01:00
admin
d171a2fa3e fixed ant build for deb target: no more svn numbers 2011-12-08 22:29:06 +01:00
sixcooler
d14ee8e464 Revision 9000+ hack
do not handle the revision in build.properties anymore
(9000 as fallback)
build-date from git-HEAD (instead of the time the build is fired)
(original build-date as fallback)
2011-12-07 04:20:49 +01:00
Michael Christen
7afcdcd573 release 1.01 - now with virtual svn number 9000 2011-12-07 01:03:08 +01:00
sixcooler
b79da58eac Ant-Task for getting version from git
tries to find svn-version or any tag, whichever comes first
be careful using this with non-numerical tags!
2011-11-24 23:44:54 +01:00
sixcooler
69570fda24 bring my master to stuff from remote 2011-11-24 19:21:58 +01:00
sixcooler
d9c56aa37a Ant-Task for getting version from git 2011-11-24 15:51:25 +01:00
sixcooler
9f8240b350 script for clean copy of URL-tables 2011-11-14 12:20:59 +01:00
apfelmaennchen
9067ab20b2 - included missing image for portalsearch.tar.gz in build.xml
- compressed (minify) yacy-portalsearch.js for better performance
- removed language selector, as it doesn't work really well (at least for me) 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8026 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-10 09:13:58 +00:00
apfelmaennchen
a425fbd8d6 - created new target 'portalsearch' in build.xml to generate yacy-portalsearch.tar.gz for static hosting
- some refactoring for search widget and jquery
- update for ConfigLiveSearch.html to reflect latest changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-09 21:01:38 +00:00
orbiter
3f606407bc added new scripts to bin in build
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7991 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-07 22:57:20 +00:00
orbiter
d2ea250d99 refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00
orbiter
65ab067491 migration to solrj 3.4.0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-14 20:08:59 +00:00
orbiter
dc25c48fc9 added more libraries that are needed by solrj
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7922 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-02 23:43:04 +00:00
sixcooler
52b477cf6f bump to httpclient-4.1.2, httpcore-4.1.3 - bugfixrelease
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7876 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-12 17:42:32 +00:00
sixcooler
48560a44a9 bump to httpcore-4.1.2: a bugfixrelease
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-25 00:48:29 +00:00
orbiter
d3c89b90ce temporarily adding the old httpclient-3.1 again because the solrj classes need it. should be removed as soon as solrj supports httpclient-4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-04 17:04:49 +00:00
orbiter
768c59740c - replaced solrj 3.1 with solrj 3.3
- updated also slf4j
- added authentication for solrj


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7829 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-04 16:35:30 +00:00
orbiter
528b59e078 replaced the xerces.jar library that was originally added in 2005 with SVN 126 to the libx directory and moved to lib in SVN 5781
the replacement is taken from http://xerces.apache.org, has the version 2.11.0, was inside the file Xerces-J-bin.2.11.0.tar.gz
and consists of two files named xercesImpl.jar and xml-apis.jar
The original purpose of that library was to support:
- content parsers
- optional seed uploader
- SOAP API (which will be committed later)
Since the SOAP API does not exist any more the purpose is to support content parser and an optional seed uploader

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7819 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-02 22:33:35 +00:00
orbiter
e7e1a0f328 replaced commons-io v1.4 with v2.0.1
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7818 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-02 21:10:13 +00:00
orbiter
5092a14bcb replaced fontbox, jempbox, pdfbox v 1.5 with v1.6
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7817 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-07-02 20:52:33 +00:00
suessthomas
ccad615f58 Inserted the Java Xms and Xmx values for the "run" target (run YaCy).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-11 21:22:08 +00:00
orbiter
77fe69395d added jempbox-1.5.0.jar which is required by pdfbox-1.5 as stated in http://pdfbox.apache.org/dependencies.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-05 20:04:41 +00:00
sixcooler
efcd21e0ed new httpclient, httpcore (bugfixrelease)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7769 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-06-02 21:34:50 +00:00
orbiter
761b1c71dc added latest pdfbox
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-30 14:56:36 +00:00
orbiter
3b578a28ef some patches to prevent empty or bad IP information from being broadcast
- on client-side: fix bad IP reports from remote Peers by replacing their reported IP with their server IP if the reported IP is bad, broken or disallowed
- on server-side: the same during a peer ping (here the ping'ed server acts also as client during the back-ping) and also when receiving a message or a search where the client sends also its seed. Here the IP is replaced by the client IP if the reported IP is broken or bad

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7687 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-29 10:58:12 +00:00
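The replacement rule described in 3b578a28ef can be sketched as follows. This is not the YaCy implementation: the class SeedIpSanitizer and its methods are hypothetical, and the definition of a "bad" IP used here (not a dotted IPv4 quad, or in the 0.x.x.x or 127.x.x.x ranges) is an assumption made only for illustration.

```java
public class SeedIpSanitizer {

    // Assumed notion of "usable": a well-formed IPv4 dotted quad that is
    // neither in 0.0.0.0/8 nor in the loopback range 127.0.0.0/8.
    static boolean isUsable(String ip) {
        if (ip == null) {
            return false;
        }
        // Limit -1 keeps trailing empty parts, so "1.2.3.4." is rejected.
        String[] parts = ip.trim().split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        int[] octets = new int[4];
        for (int i = 0; i < 4; i++) {
            if (!parts[i].matches("\\d{1,3}")) {
                return false;
            }
            octets[i] = Integer.parseInt(parts[i]);
            if (octets[i] > 255) {
                return false;
            }
        }
        return octets[0] != 0 && octets[0] != 127;
    }

    // Keeps the IP a peer reports about itself when it is usable; otherwise
    // falls back to the IP observed on the connection (server IP on the
    // client side, client IP on the server side).
    static String effectiveIp(String reportedIp, String observedIp) {
        return isUsable(reportedIp) ? reportedIp.trim() : observedIp;
    }
}
```

The same effectiveIp call would serve both directions named in the commit, since only the source of the observed address differs.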
orbiter
c493f101c0 added one more script file to release build script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7681 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-28 13:19:24 +00:00
orbiter
f6077b3cc0 added more attributes for html parser and enhanced data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7679 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-28 13:09:01 +00:00
apfelmaennchen
a0e4960a4d YMark:
- first attempt for a firefox json bookmark importer
- added JSON library json-simple-1.1.jar

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-15 20:58:58 +00:00
orbiter
19fd13d3bc Added federated index storage to solr.
YaCy now supports storage to remote solr indexes.
More federated storage (and search) methods may follow.

The remote index scheme is the same as produced by the SolrCell; see
http://wiki.apache.org/solr/ExtractingRequestHandler
Because this default scheme is used, the default example scheme can be used as solr configuration
This is also the same scheme that solr uses if documents are imported with apache tika.

federated solr storage is switched off by default.

To use this, do the following:
- set federated.service.solr.indexing.enabled = true
- download solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/
- extract the solr (3.1) package, 'cd example' and start solr with 'java -jar start.jar'
- start yacy and then start a crawler. The crawler will fill both the YaCy and the solr indexes.
- to check what's in solr after indexing, open http://localhost:8983/solr/admin/

So far it is not possible to search with YaCy in that solr index.
The storage functionality is nevertheless available now for two reasons:
1) to compare the functionality of Solr and YaCy and to compare the search speed
2) to use YaCy as a search appliance for people who need a crawler or other source harvesting methods
   that YaCy provides (like dublin core reading, wikimedia dump reading, rss feed reader etc) if people still
   want to use solr instead of YaCy.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7654 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-14 20:05:04 +00:00
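The switch named in 19fd13d3bc is an ordinary key/value setting. A minimal sketch of reading it with java.util.Properties follows. The property key is the one from the commit message, while the class FederatedSolrConfig and reading the configuration from a string are assumptions for illustration, not YaCy's own configuration code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class FederatedSolrConfig {

    // Key as given in the commit message.
    static final String KEY = "federated.service.solr.indexing.enabled";

    // Parses properties-style text and reports whether solr indexing is enabled.
    // A missing key counts as false, matching the "switched off by default" note.
    static boolean solrIndexingEnabled(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new IllegalStateException(e);
        }
        return Boolean.parseBoolean(props.getProperty(KEY, "false").trim());
    }
}
```

Properties.load ignores the whitespace around the equals sign, so the line can be written exactly as in the commit message.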
orbiter
4c013d9088 more UTF8 getBytes() performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-12 05:02:36 +00:00
f1ori
399d7d6878 * fix permissions of bin/-folder in debian package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-07 07:31:17 +00:00
f1ori
21fe5e6c6a * add bin-folder to debian package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7638 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-02 10:58:56 +00:00