{{refimprove|date=February 2018}}


'''TrustRank''' is an [[algorithm]] that conducts [[link analysis]] to separate useful [[Web page|webpages]] from [[Spamming|spam]] and helps [[search engine]]s rank pages in [[Search engine results page|SERPs]] (search engine results pages). It is a semi-automated process, meaning that it needs some human assistance in order to function properly. Search engines use many different algorithms and ranking factors to measure the quality of webpages; TrustRank is one of them.


Because a manual review of the Internet is impractical and very expensive, TrustRank was introduced to help separate useful webpages from spam more quickly and cheaply. It was first described by researchers Zoltan Gyongyi and Hector Garcia-Molina of [[Stanford University]] and Jan Pedersen of [[Yahoo!]] in their 2004 paper "Combating Web Spam with TrustRank".<ref>{{cite conference |url=http://ilpubs.stanford.edu:8090/770/1/2004-52.pdf |title=Combating Web Spam with TrustRank |last1=Gyongyi |first1=Zoltan |last2=Garcia-Molina |first2=Hector |date=2004 |location=Toronto, Canada |conference=Proceedings of the 30th VLDB Conference |accessdate=26 May 2022}}</ref> Today, the algorithm is part of major web search engines such as Yahoo! and Google.<ref>{{Cite patent|number=7603350|title=United States Patent: 7603350 - Search result ranking based on trust|gdate=October 13, 2009|invent1=Guha|inventor1-first=Ramanathan|url=http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.htm&r=1&p=1&f=G&l=50&d=PTXT&S1=7,603,350.PN.&OS=pn/7,603,350&RS=PN/7,603,350}}</ref>

[[Backlink|Backlinks]] are among the most important factors that help a [[web search engine]] determine the quality of a web page when returning results. Search engines take the number and quality of backlinks into consideration when assigning a web page its place in SERPs. Many [[web spam]] pages are created only with the intention of misleading [[search engine]]s. These pages, chiefly created for commercial reasons, use various techniques to [[Search engine optimization|achieve higher-than-deserved rankings]] in the [[Search engine results page|search engines' result pages]]. While human experts can easily identify spam, search engines are still being improved to do so without human help.


One popular method for improving rankings is to increase the perceived importance of a document through complex linking schemes. [[Google]]'s [[PageRank]] and other search ranking algorithms have been subjected to such manipulation.


TrustRank seeks to combat spam by filtering the web based upon reliability. The method calls for selecting a small set of seed pages to be evaluated by an expert. Once the reputable seed pages are manually identified, a crawl extending outward from the seed set seeks out similarly reliable and trustworthy pages. TrustRank's reliability diminishes with increased distance between documents and the seed set.
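
A minimal sketch of this kind of seeded trust propagation, written as a biased PageRank-style iteration in Python, is shown below. The adjacency-list graph, damping factor, and iteration count are illustrative assumptions for the sketch, not values prescribed by this article.

<syntaxhighlight lang="python">
# Sketch of seeded trust propagation (biased PageRank-style iteration).
# The damping factor and iteration count are illustrative choices.
def trustrank(outlinks, seeds, damping=0.85, iterations=20):
    """outlinks: dict mapping each page to the pages it links to.
    seeds: manually vetted, reputable pages that start with all the trust."""
    pages = set(outlinks) | {p for targets in outlinks.values() for p in targets}
    seed_score = 1.0 / len(seeds)
    # Trust starts concentrated on the hand-picked seed pages.
    trust = {p: (seed_score if p in seeds else 0.0) for p in pages}

    for _ in range(iterations):
        # Each round, a fraction of the trust is reset onto the seeds ...
        new_trust = {p: (1 - damping) * (seed_score if p in seeds else 0.0)
                     for p in pages}
        # ... and the rest flows along outgoing links, so scores decay
        # with distance from the seed set.
        for page, targets in outlinks.items():
            if not targets:
                continue
            share = damping * trust[page] / len(targets)
            for target in targets:
                new_trust[target] += share
        trust = new_trust
    return trust

# Pages close to the trusted seed score higher than pages far from it.
graph = {"seed": ["a"], "a": ["b"], "b": ["spammy"], "spammy": []}
scores = trustrank(graph, seeds={"seed"})
print(sorted(scores, key=scores.get, reverse=True))  # ['seed', 'a', 'b', 'spammy']
</syntaxhighlight>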


The same logic also works in the opposite direction, in an approach called Anti-Trust Rank: the closer a site is to spam resources, the more likely it is to be spam itself.<ref>{{cite web|last1=Krishnan|first1=Vijay|last2=Raj|first2=Rashmi|title=Web Spam Detection with Anti-Trust Rank|url=http://i.stanford.edu/~kvijay/krishnan-raj-airweb06.pdf|publisher=Stanford University|accessdate=11 January 2015}}</ref>
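
The Anti-Trust Rank idea can be sketched by running the same propagation on the reversed link graph, seeded with known spam pages, so that distrust flows to pages that link, directly or indirectly, into spam. The sketch below reuses the trustrank function from the example above; the helper function and the example graph are illustrative assumptions.

<syntaxhighlight lang="python">
# Sketch of Anti-Trust Rank: propagate distrust from spam seeds over the
# reversed link graph (reuses trustrank() from the sketch above).
def reverse_graph(outlinks):
    """Return a graph in which every original link is turned around."""
    reversed_links = {p: [] for p in outlinks}
    for page, targets in outlinks.items():
        for target in targets:
            reversed_links.setdefault(target, []).append(page)
    return reversed_links

# "hub" links straight into spam, "good" only reaches it through "hub".
graph = {"good": ["hub"], "hub": ["spam1"], "spam1": ["spam2"], "spam2": ["spam1"]}
distrust = trustrank(reverse_graph(graph), seeds={"spam1", "spam2"})
# Pages nearer the spam seeds accumulate more distrust.
print(sorted(distrust, key=distrust.get, reverse=True))  # ['spam1', 'spam2', 'hub', 'good']
</syntaxhighlight>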

The researchers who proposed the TrustRank methodology have continued to refine their work by evaluating related topics, such as measuring spam mass.

== External links ==
* [http://www.vldb.org/conf/2004/RS15P3.PDF Z. Gyöngyi, H. Garcia-Molina, J. Pedersen: ''Combating Web Spam with TrustRank'']
* [http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220060095416%22.PGNR.&OS=DN/20060095416&RS=DN/20060095416 Link-based spam detection], a Yahoo!-assigned patent application using TrustRank


[[Category:Reputation management]]
[[Category:Link analysis]]

{{web-stub}}
