<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Guest Commentary &#8211; Vertical Web Search: One Concept, Lots of Approaches</title>
	<atom:link href="http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/feed/" rel="self" type="application/rss+xml" />
	<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Fri, 09 Dec 2011 18:27:04 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jon G.</title>
		<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/#comment-540</link>
		<dc:creator><![CDATA[Jon G.]]></dc:creator>
		<pubDate>Wed, 14 Jun 2006 18:19:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.comparisonengines.com/?p=445#comment-540</guid>
		<description><![CDATA[Randy,

       I left Eurekster out since their focus is on a relevancy algorithm.  When you have a full-on crawl (or use someone else’s), the billions of pages mean that you need a very good relevancy algorithm. Google has PageRank; at Become.com we use AIR technology to retrieve relevant results from our 3.2B indexed pages; Eurekster is looking at user feedback to get the best results.
       For years at AltaVista and Yahoo! we looked at user feedback (specifically using CTR on web results) to improve relevancy, but there were always concerns about click fraud.  The current social search guys have kind of a catch-22: if they become large and successful, they create a huge incentive for spammers to try to add ’bots to the community to skew the results.  One way around this may be to have different social networks around each of the sites, so no one site has enough traffic to be worth trying to distort.  This seems to be the promising approach that Eurekster is taking.]]></description>
		<content:encoded><![CDATA[<p>Randy,</p>
<p>       I left Eurekster out since their focus is on a relevancy algorithm.  When you have a full-on crawl (or use someone else’s), the billions of pages mean that you need a very good relevancy algorithm. Google has PageRank; at Become.com we use AIR technology to retrieve relevant results from our 3.2B indexed pages; Eurekster is looking at user feedback to get the best results.<br />
       For years at AltaVista and Yahoo! we looked at user feedback (specifically using CTR on web results) to improve relevancy, but there were always concerns about click fraud.  The current social search guys have kind of a catch-22: if they become large and successful, they create a huge incentive for spammers to try to add ’bots to the community to skew the results.  One way around this may be to have different social networks around each of the sites, so no one site has enough traffic to be worth trying to distort.  This seems to be the promising approach that Eurekster is taking.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon G.</title>
		<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/#comment-539</link>
		<dc:creator><![CDATA[Jon G.]]></dc:creator>
		<pubDate>Wed, 14 Jun 2006 17:44:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.comparisonengines.com/?p=445#comment-539</guid>
		<description><![CDATA[Siva,

       I chose to classify FatLens as a scraper based upon the way that the system stores and presents information.  FatLens is extracting select, structured information from web pages so that users can do apples-to-apples comparisons of different ticket offerings.  This is quite different from the full-page indexing that a general search engine like Google does.
  To be clear, FatLens does do a web crawl (http://fatlens.com/main/fatbot.php) to find the ticket pages it extracts data from (most scrapers do some form of web crawling).  If you could provide any additional insight into how FatBot operates (# of sites crawled, how the sites are selected, etc.), that would be very helpful.]]></description>
		<content:encoded><![CDATA[<p>Siva,</p>
<p>       I chose to classify FatLens as a scraper based upon the way that the system stores and presents information.  FatLens is extracting select, structured information from web pages so that users can do apples-to-apples comparisons of different ticket offerings.  This is quite different from the full-page indexing that a general search engine like Google does.<br />
  To be clear, FatLens does do a web crawl (<a href="http://fatlens.com/main/fatbot.php" rel="nofollow">http://fatlens.com/main/fatbot.php</a>) to find the ticket pages it extracts data from (most scrapers do some form of web crawling).  If you could provide any additional insight into how FatBot operates (# of sites crawled, how the sites are selected, etc.), that would be very helpful.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Randy McClure</title>
		<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/#comment-538</link>
		<dc:creator><![CDATA[Randy McClure]]></dc:creator>
		<pubDate>Wed, 14 Jun 2006 10:25:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.comparisonengines.com/?p=445#comment-538</guid>
		<description><![CDATA[Thank you for the posting on vertical web search. What about social search engines like Eurekster? I just started using this type of collaborative search technology where user input influences the search results. Seems like this type of search technology has a lot of potential.]]></description>
		<content:encoded><![CDATA[<p>Thank you for the posting on vertical web search. What about social search engines like Eurekster? I just started using this type of collaborative search technology where user input influences the search results. Seems like this type of search technology has a lot of potential.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Todd Wilson</title>
		<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/#comment-537</link>
		<dc:creator><![CDATA[Todd Wilson]]></dc:creator>
		<pubDate>Wed, 14 Jun 2006 05:04:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.comparisonengines.com/?p=445#comment-537</guid>
		<description><![CDATA[Great post.  The &quot;Advantages&quot; and &quot;Challenges&quot; sections are especially helpful.  In the end, a lot of it comes down to trade-offs.  When a company undertakes a project like this, certain questions will inform the approach:

- How accurate and comprehensive does the data need to be?
- What technical (human) resources are available?  If you have access to 60+ PhDs you&#039;re obviously in a different position than if you have only a B.S. in C.S. fresh out of college.
- How many sites will need to be crawled/scraped/extracted?
- What&#039;s the timeline?  Sophisticated crawlers can take significantly longer to build than straightforward template-based extraction engines.

I wrote a blog entry not too long ago titled &quot;Three common methods for data extraction&quot; that may be helpful to some: http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/.  It&#039;s a bit more general, and not necessarily targeted to comparison engines.  It does, however, also delineate the advantages and disadvantages of various approaches.]]></description>
		<content:encoded><![CDATA[<p>Great post.  The &#8220;Advantages&#8221; and &#8220;Challenges&#8221; sections are especially helpful.  In the end, a lot of it comes down to trade-offs.  When a company undertakes a project like this, certain questions will inform the approach:</p>
<p>- How accurate and comprehensive does the data need to be?<br />
- What technical (human) resources are available?  If you have access to 60+ PhDs you&#8217;re obviously in a different position than if you have only a B.S. in C.S. fresh out of college.<br />
- How many sites will need to be crawled/scraped/extracted?<br />
- What&#8217;s the timeline?  Sophisticated crawlers can take significantly longer to build than straightforward template-based extraction engines.</p>
<p>I wrote a blog entry not too long ago titled &#8220;Three common methods for data extraction&#8221; that may be helpful to some: <a href="http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/" rel="nofollow">http://blog.screen-scraper.com/2006/03/21/three-common-methods-for-data-extraction/</a>.  It&#8217;s a bit more general, and not necessarily targeted to comparison engines.  It does, however, also delineate the advantages and disadvantages of various approaches.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Siva Kumar</title>
		<link>http://comparisonengines.com/2006/06/13/guest-commentary-vertical-web-search-one-concept-lots-of-approaches/#comment-536</link>
		<dc:creator><![CDATA[Siva Kumar]]></dc:creator>
		<pubDate>Wed, 14 Jun 2006 00:14:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.comparisonengines.com/?p=445#comment-536</guid>
		<description><![CDATA[Hi Jon,

Thanks for taking the time to help educate the comparison engines audience on Vertical Web Searches. Thank you also for mentioning FatLens in your post. However, I&#039;d like to point out that your assumption about our technology is incorrect. FatLens is not an example of a scraper.

According to your classification scheme, we would best be described as a crawler aimed at creating a vertical search solution for shopping.

Siva Kumar

PS. The post was very illustrative of how one would view text or Web page search technologies. Shopping as a vertical search area could use a bit more discussion on alternative technology approaches.]]></description>
		<content:encoded><![CDATA[<p>Hi Jon,</p>
<p>Thanks for taking the time to help educate the comparison engines audience on Vertical Web Searches. Thank you also for mentioning FatLens in your post. However, I&#8217;d like to point out that your assumption about our technology is incorrect. FatLens is not an example of a scraper.</p>
<p>According to your classification scheme, we would best be described as a crawler aimed at creating a vertical search solution for shopping.</p>
<p>Siva Kumar</p>
<p>PS. The post was very illustrative of how one would view text or Web page search technologies. Shopping as a vertical search area could use a bit more discussion on alternative technology approaches.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

