Commit Graph

9 Commits

Author SHA1 Message Date
orbiter
0d8072aa99 removed warnings 2014-05-13 22:29:05 +02:00
Michael Peter Christen
8aeef73d49 fix for virtual root nodes 2014-04-11 15:12:34 +02:00
Michael Peter Christen
dd12dd392f introduction of a data structure for HyperlinkEdges which should use
less memory as it does no double-storage of source links for each edge
of the graph.
2014-04-11 12:09:33 +02:00
Michael Peter Christen
6ea8bb7348 using MultiProtocolURL for edge data which is faster (hash computation
is now much easier) and smaller in size
2014-04-11 10:58:37 +02:00
Michael Peter Christen
a37d067692 refactoring 2014-04-10 23:46:35 +02:00
orbiter
95780eed32 Merge branch 'master' of git@gitorious.org:yacy/rc1.git 2014-04-10 21:40:54 +02:00
Michael Peter Christen
67beef657f strong redesign of html parser: object recursion is now made using a
stack on html tag objects, not using a recursive parse-again method
which may cause bad performance and huge memory allocation. The new
method also produced better parsed image objects with exact anchor text
references.
2014-04-10 18:58:03 +02:00
orbiter
c250fac9f4 linkstructure refactoring to get more options for clickdepth analysis 2014-04-09 17:52:51 +02:00
Michael Peter Christen
bd886054cb new structure and enhancements for link graph computation:
- added order option to solr queries to be able to retrieve document
lists in specific order, here: link length
- added HyperlinkEdge class which manages the link structure
- integrated the HyperlinkEdge class into clickdepth computation
- extended the linkstructure.json servlet to show also the clickdepth
and other statistic information
2014-04-09 12:45:04 +02:00