Link popularity ranking. These algorithms are eigenvector methods for identifying "authoritative" or "influential" articles from hyperlink or citation information.
Authoritative Sources in a Hyperlinked Environment
HITS is a link-structure analysis algorithm that ranks pages as "authorities" (pages which have many incoming links and provide the best source of information on a given topic) and "hubs" (pages which have many outgoing links and provide useful lists of possibly relevant pages). Ranking is performed at query time. [PDF]
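The hub and authority idea described above can be sketched as a simple power iteration. This is a minimal illustration, not the paper's exact procedure: the function name `hits`, the `toy_links` graph, the fixed iteration count and the L2 normalization are all assumptions made for the example, and the query-time step of selecting a topic-specific subgraph is omitted.

```python
def hits(links, iterations=50):
    """Illustrative hub/authority iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores stay bounded (assumed L2 norm).
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Toy graph: "a" and "b" act as hubs, "c" and "d" as authorities.
toy_links = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": ["c"]}
hub, auth = hits(toy_links)
```

In this toy graph, "c" receives links from the most pages and ends up with the largest authority score, while "a" and "b" share the largest hub score.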
The PageRank Citation Ranking: Bringing Order to the Web
The first Stanford paper on PageRank. It is a static ranking that interprets a link from page A to page B as a vote by page A for page B. The Web is seen as a directed graph, and votes propagate recursively from node to node. Ranking is performed at indexing time. Used by Google.
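The recursive voting scheme can be sketched as an iteration in which each page splits its current rank evenly among the pages it links to. This is a hedged sketch under stated assumptions: the function name `pagerank`, the damping factor of 0.85, the uniform handling of pages without out-links and the example graph `toy_web` are choices made for illustration and are not taken from this listing.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Illustrative PageRank iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format). damping=0.85 is an assumed value.
    """
    pages = sorted(set(links) | {q for t in links.values() for q in t})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread uniformly (assumption).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page votes for its targets, splitting its rank evenly among them.
        for src in pages:
            targets = links.get(src, [])
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_web)
```

Here "c" collects votes from both "a" and "b" and therefore ranks highest, followed by "a", which receives the whole vote of "c".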
Adaptive On-Line Page Importance Computation
A good explanation of the convergence of various algorithms. This paper also describes an adaptive, on-line algorithm for computing page importance. It can be used for focused crawling as well as for search engine ranking.
The Clever Project
The CLEVER search engine incorporates several algorithms that make use of hyperlink structure for discovering information on the Web. It is an extension of the HITS method.
DiscoWeb: Discovering Web Communities Via Link Analysis
This paper describes a prototype system, later known as the Teoma search engine. It performs a link analysis, loosely based on the Kleinberg method, computed at query time.
Finding Authorities and Hubs From Link Structures on the World Wide Web
A survey of PageRank, HITS and SALSA. It also describes two Bayesian statistical algorithms for ranking hyperlinked documents and the concepts of monotonicity and locality, as well as various concepts of distance and similarity between ranking algorithms.
The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank
This method uses query dependent importance scores and a probabilistic approach to improve upon PageRank. It pre-computes importance scores offline for every possible text query. [PDF]
Larry Page Describes PageRank
PostScript-format slides by Larry Page, Google's co-founder, introducing citation importance ranking.
The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity
This paper describes a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. [PDF]
PageRank Calculation with Lossy Encoding
Lossy encoding for large scale PageRank calculation. [PDF]
PageRank U.S. Patent 6,285,999
Lawrence Page's PageRank Patent.
SALSA: The Stochastic Approach for Link-Structure Analysis
A focused search algorithm (SALSA) based on Markov chains. It starts with a query on a broad topic, discards useless links, and then weights the remaining terms. A stochastic crawl is used to discover the authorities on this topic. [PS format]
Survey on Google's PageRank
Information on the algorithm, how to increase PageRank, what diminishes it and how to distribute PageRank within a website.
Web-Trec 8 and PageRank
On the use of PageRank with the Web Track 8 "large" and "small" datasets. [PDF]
Web-Trec 9 and Link Popularity
On the use of link popularity with the Web Track 9 datasets. [PDF]
What is this Page Known for? Computing Web Page Reputations
A generalization of PageRank and of hubs and authorities based on the topic of Web pages. It defines a model in which a surfer can move forward (following an outgoing link) and backward (following an incoming link in the reverse direction). [PS format]
WWW2003 - Scaling Personalized Web Search
Presentation paper. Link popularity algorithms biased toward a user-specified set of interesting pages. [PDF] (May 01, 2003)
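One common way to bias a link popularity score toward a set of preferred pages is to restart the random surfer only at those pages instead of uniformly. The sketch below illustrates that idea only; it is not the scaling technique of the paper above, and the function name `personalized_pagerank`, the `preferred` parameter, the damping factor of 0.85 and the example graph are assumptions made for the example.

```python
def personalized_pagerank(links, preferred, damping=0.85, iterations=100):
    """Illustrative PageRank variant with a biased restart distribution.

    links: dict mapping each page to the list of pages it links to.
    preferred: non-empty set of pages the user marked as interesting
    (both are hypothetical inputs for this sketch).
    """
    pages = sorted(set(links) | {q for t in links.values() for q in t})
    # Restart probability is spread evenly over the preferred pages only.
    restart = {p: (1.0 / len(preferred) if p in preferred else 0.0) for p in pages}
    rank = dict(restart)
    for _ in range(iterations):
        # Rank held by pages without out-links also returns to the preferred set.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping + damping * dangling) * restart[p] for p in pages}
        for src in pages:
            targets = links.get(src, [])
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
biased = personalized_pagerank(toy_web, preferred={"b"})
uniform = personalized_pagerank(toy_web, preferred={"a", "b", "c"})
```

With all pages preferred, the result matches an unbiased ranking; restricting the restart set to "b" raises the score of "b" relative to that baseline.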
Last update:
March 9, 2016 at 13:24:12 UTC