Shopping.com’s Spider Discovered


This opens up a big can of worms, but I’m very proud of SDC for releasing this.

http://www0.shopping.com/bot.html

As part of our ongoing efforts to improve the buying experience for shoppers online, Shopping.com is experimenting with new ways to collect and aggregate data through web crawling. At this point we do not plan on integrating inventory from our web crawling index with inventory that we receive from our merchants.

More later…oh, and did you see Become’s new design?

Update: I was alerted to SDC’s spider by Vic Berggren, who runs a myriad of sites. He discovered the crawler on Interlink Communications Systems, but Vic is also involved with New Balance Tampa and the excellent ShoeStock blog. In his first month writing the blog, he had 3,600 unique visitors and 5,500 page views.

3 Responses to Shopping.com’s Spider Discovered

  1. paulobrien says:

    So what is this saying?
    “How often will sdcresearchlabs-testbot access my web pages?”
    “For most sites, sdcresearchlabs-testbot shouldn’t access your site more than once every few seconds on average.”
    They’ll crawl stores every few seconds… or not more than that… so they aren’t answering the question…
    If not daily, we’ll have problems.

    By the way, I just wrapped up my three-part series about optimizing CSEs and summarizing some new CSE research. Check it out:
    http://seobrien.blogspot.com

  2. Vic Berggren says:

    Thanks for the mention(s) Brian!

    I can analyze the crawl and see if a pattern emerges regarding frequency.

    What’s interesting is that the site they crawled does not have an SDC PPC agreement. So, they were able to find the site (somehow) and locate the products.

    Do you suspect that we may begin to see unpaid-for data at the bottom of a standard price comparison grid… sorta like ‘web results’?

  3. Vic Berggren says:

    OK, they definitely crawled again today… roughly 1,200 pages. Smart crawler too. It does not deviate too far from the product page at all, and when it does, it’s after the other category bits or a Data Sheet/Tech Spec for some of our Networking Products.

    It’s also multi-threaded, grabbing between 8 and 10 pages per group.
