Combating Web Spam with TrustRank
Zoltán Gyöngyi and Hector Garcia-Molina, Stanford University, and Jan Pedersen, Yahoo. Proceedings of the 30th VLDB Conference, 2004. The authors propose techniques for semi-automatically identifying reputable pages and then discovering more good pages based on the link structure of the web. ODP is mentioned because setting up ODP clones is a technique for influencing PageRank; to counter this spamming technique, the authors removed all sites not listed in the major directories from the data set used for the experiment. [PDF]
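The idea of discovering good pages from a small set of reputable ones can be illustrated with a minimal sketch of seed-biased trust propagation. This is not the authors' implementation: the function name `trust_rank`, the toy graph, the page names, and the default values of `alpha` and `iterations` are all assumptions chosen for illustration.

```python
def trust_rank(out_links, seeds, alpha=0.85, iterations=20):
    """Propagate trust from hand-picked seed pages along outgoing links.

    out_links maps each page to the list of pages it links to.
    seeds is the set of pages judged reputable (assumed given).
    """
    pages = set(out_links) | {q for targets in out_links.values() for q in targets}
    # Static distribution: uniform over the seeds, zero everywhere else.
    d = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(d)
    for _ in range(iterations):
        # Each page keeps a (1 - alpha) share of its static seed score...
        nxt = {p: (1.0 - alpha) * d[p] for p in pages}
        # ...and receives an equal split of the trust of pages linking to it.
        for p, targets in out_links.items():
            for q in targets:
                nxt[q] += alpha * trust[p] / len(targets)
        trust = nxt
    return trust


# Hypothetical toy graph: no good page links to the spam pages.
out_links = {
    "good1": ["good2"],
    "good2": ["good1", "good3"],
    "good3": [],
    "spam1": ["spam2"],
    "spam2": ["spam1", "good1"],
}
scores = trust_rank(out_links, seeds={"good1"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph the spam pages receive no trust because trust only flows outward from the seed, so links from spam pages to good pages do not help the spammer.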
Topical TrustRank: Using Topicality to Combat Web Spam
Baoning Wu, Vinay Goel and Brian D. Davison propose partitioning the seed set used in TrustRank by topic and calculating trust scores for each topic separately, making use of the Open Directory Project. Paper presented at the 15th International World Wide Web Conference, May 2006.
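Partitioning the seed set by topic can be sketched as follows. This is an illustrative assumption-laden example, not the paper's code: the helper names `propagate` and `topical_trust`, the topic labels (standing in for directory categories), and the toy graph are hypothetical, and how the per-topic scores are combined into one ranking is deliberately left out.

```python
from collections import defaultdict


def propagate(out_links, seeds, alpha=0.85, iterations=20):
    """Seed-biased trust propagation (same idea as a basic TrustRank pass)."""
    pages = set(out_links) | {q for targets in out_links.values() for q in targets}
    d = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(d)
    for _ in range(iterations):
        nxt = {p: (1.0 - alpha) * d[p] for p in pages}
        for p, targets in out_links.items():
            for q in targets:
                nxt[q] += alpha * trust[p] / len(targets)
        trust = nxt
    return trust


def topical_trust(out_links, seed_topics):
    """Group seeds by topic and compute one trust vector per topic."""
    by_topic = defaultdict(set)
    for page, topic in seed_topics.items():
        by_topic[topic].add(page)
    return {topic: propagate(out_links, seeds) for topic, seeds in by_topic.items()}


# Hypothetical graph and topic labels for the seed pages.
out_links = {
    "news_hub": ["news_a"],
    "sports_hub": ["sports_a"],
    "news_a": [],
    "sports_a": [],
}
seed_topics = {"news_hub": "News", "sports_hub": "Sports"}
per_topic = topical_trust(out_links, seed_topics)
print(per_topic["News"]["news_a"], per_topic["News"]["sports_a"])
```

Each topic's trust vector only rewards pages reachable from that topic's seeds, so a page's scores show which topical communities vouch for it.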
Web Spam Taxonomy
By Zoltán Gyöngyi and Hector Garcia-Molina, Stanford University. First International Workshop on Adversarial Information Retrieval on the Web, May 2005. Offers a definition of spam and an overview of current spamming techniques. The ODP guidelines are quoted as an example of existing definitions of spam. [PDF]
Last update: August 26, 2011 at 10:34:28 UTC