We recently wrote an integration between Instagram and CafePress called Instapress. In the process, we learned a thing or two about the CafePress API and some of its idiosyncrasies. So, in the spirit of sharing, here’s an example of using the CafePress API to fetch products and their images.

In a nutshell, you’ll be:

  1. Searching for products by tag
  2. Fetching a performance-optimized version of the product image to display later

Our use case was taking images from Instagram, uploading them to CafePress as “designs” and then creating products out of them. We had our own database where we stored product ids and product image urls, so that the entire experience (minus actually putting a product in your shopping cart) was hosted outside CafePress, not inside. That’s where storing the product image urls comes in. I’m not going to show you the entire flow here, just one of the more interesting tricks we learned around products.

The CafePress API documentation isn’t easy to follow at first, but once you see the pattern it gets much easier. The detailed docs are here: http://open-api.cafepress.com/documentation.list.cp. We are particularly interested in product listings.

The entire code is listed at the bottom of this post. But here are a few of the more interesting bits, highlighted:

The Right URL

The API documentation is essentially an XML description of a set of resources. It’s not obvious what to do with it, so getting the URL right is the first challenge. Here’s the full URL we ultimately used:
“http://open-api.cafepress.com/product.search.cp?appKey=supersecret&v=3&query=puppies&pageNumber=1&resultsPerPage=100&maxProductsPerDesign=2&sort=by_date_desc”
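
If you would rather not hand-assemble that query string, here is a minimal sketch that builds the same URL from a hash using Ruby's standard uri library. The helper name product_search_url is ours, made up for illustration; the host, path, and parameter names are simply the ones from the URL above.

```ruby
require 'uri'

# Hypothetical helper (not part of any CafePress library): builds a
# product.search.cp URL from a hash of query parameters.
def product_search_url(params, host = 'open-api.cafepress.com')
  URI::HTTP.build(
    host: host,
    path: '/product.search.cp',
    query: URI.encode_www_form(params) # escapes keys and values for us
  ).to_s
end

url = product_search_url(
  appKey: 'supersecret', # placeholder key, as in the URL above
  v: 3,
  query: 'puppies',
  pageNumber: 1,
  resultsPerPage: 100,
  maxProductsPerDesign: 2,
  sort: 'by_date_desc'
)
puts url
```

Ruby hashes keep insertion order, so the parameters come out in the order you wrote them, and a multi-word search term like 'shiba puppy' is escaped for you.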

Understanding The Response

The response has many fields, only some of them interesting. If all you want is an id to hold on to, a URL where people can buy the product, and a thumbnail, then these are the fields you want: productNumber, caption, thumbnailUrl, marketplaceUrl.

We used the always-awesome Nokogiri gem to parse the XML and put each product in a simple hash like so:

doc.xpath('//products/product').each do |product|
  product_rv = {}
  product_rv[:productNumber] = product.attr('productNumber')
  product_rv[:caption] = product.attr('caption')
  product_rv[:thumbnailUrl] = optimized_product_uri(product.attr('thumbnailUrl'))
  product_rv[:marketplaceUrl] = product.attr('marketplaceUrl')
  rv << product_rv
end

Optimized Image URLs

The thumbnail URL that the API gives you is not actually the preferred CafePress URL. As a consequence, in our first iteration our page load times were horrendous because for every product image (in our case, sometimes hundreds on a page), the browser was getting redirected to a different URL to actually fetch the image. We got around that by figuring out what those redirects would be ahead of time and only storing the final, redirected URL in our database.

For example, if you ask the API for details about product id 727483942, the API will tell you this is the thumbnailUrl: "http://images8.cafepress.com/product/727483942_125x125.png". But try slapping that in an img tag on a web page and the browser will be redirected to a totally different URL (and presumably a faster CDN): "http://i1.cpcache.com/product/727483942/shiba_puppy_samsung_galaxy_s3_thinsheld.png?height=125&width=125"

This means wasting 100-300ms per image on requests that are just going to be redirects anyway. The redirect lookup is encapsulated in the optimized_product_uri() function in the full code below.
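
As a variation on the same idea, here is a rough sketch that splits the lookup into two pieces: a small function that decides whether a response is a redirect, and a loop that follows a few hops. It is not the code we shipped. The names redirect_target and resolve_image_url are made up for this example, and it assumes the image servers answer HEAD requests with the same redirect they give for GET, which we have not verified.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: returns the absolute redirect target for a response,
# or nil when the response is not a redirect carrying a Location header.
def redirect_target(response, base_url)
  return nil unless response.is_a?(Net::HTTPRedirection)
  location = response['location']
  return nil if location.nil? || location.empty?
  URI.join(base_url, location).to_s # also resolves relative Location values
end

# Hypothetical helper: follows up to max_hops redirects and returns the last
# URL reached. Assumes HEAD is answered like GET (an untested assumption).
def resolve_image_url(url, max_hops = 3)
  max_hops.times do
    uri = URI.parse(url)
    response = Net::HTTP.start(uri.host, uri.port,
                               use_ssl: uri.scheme == 'https',
                               open_timeout: 5, read_timeout: 5) do |http|
      http.head(uri.request_uri) # headers only, no image body
    end
    target = redirect_target(response, url)
    return url if target.nil?
    url = target
  end
  url
end
```

You would call resolve_image_url(thumbnail_url) once per product when you save it, and store the result. Keeping the redirect check separate from the network call makes it easy to test without hitting CafePress.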

The Full Code

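require 'net/http'
require 'uri'
require 'nokogiri'
require 'cgi'

# Ask for the thumbnail once and return the URL it redirects to,
# or the original URL when the server does not answer with a 301.
def optimized_product_uri(old_uri)
  parsed_uri = URI.parse(old_uri)
  # Request only the path and query; the host is set on the connection.
  request = Net::HTTP::Get.new(parsed_uri.request_uri)
  net = Net::HTTP.new(parsed_uri.host, parsed_uri.port)
  net.use_ssl = false
  net.read_timeout = 5
  net.open_timeout = 5
  response = net.start do |http|
    http.request(request)
  end
  if response.code == '301'
    response['location']
  else
    old_uri
  end
end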

key = 'supersecret'
host = 'open-api.cafepress.com'
params = {
  :appKey => key,
  :v => '3',
  :query => 'puppies',
  :pageNumber => '1',
  :resultsPerPage => '100',
  :maxProductsPerDesign => '2',
  :sort => 'by_date_desc'
}

query_string = params.collect { |k, v| "#{k}=#{CGI::escape(v.to_s)}" }.join('&')
url = "http://#{host}/product.search.cp?#{query_string}"
          
request = Net::HTTP::Get.new(URI.parse(url).request_uri) # path and query only
net = Net::HTTP.new(host, 80)
net.use_ssl = false # cafepress doesn't support SSL. Krikey!
net.set_debug_output STDOUT # useful for seeing the raw wire messages back and forth
net.read_timeout = 10
net.open_timeout = 10
response = net.start do |http|
  http.request(request)
end

rv = []
doc = Nokogiri::XML(response.body)
doc.xpath('//products/product').each do |product|
  product_rv = {}
  product_rv[:productNumber] = product.attr('productNumber')
  product_rv[:caption] = product.attr('caption')
  product_rv[:thumbnailUrl] = optimized_product_uri(product.attr('thumbnailUrl'))
  product_rv[:marketplaceUrl] = product.attr('marketplaceUrl')
  rv << product_rv
end
puts rv.inspect