Describes several search team experiments at Etsy, and the methodology we arrived at for doing redesigns in light of these experiences. Presented at Warmgun, 2012.
USD $83 million in transactions. 4.2 million items sold. http://www.etsy.com/blog/news/?s=weather+report Sunday, December 2, 12 We had about 1.5B page views in October, which makes us a reasonably large website.
Here’s a screenshot of an internal view of the various tests and config rampups running on just one of our pages. As you can see, there are a whole lot of them.
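To make "config rampup" concrete, here is a minimal sketch of how a visitor might be assigned to a test at a given ramp percentage. This is an illustration under assumptions, not Etsy's actual system: the names `bucket`, `variant`, and `BUCKETS`, and the choice of SHA-256 hashing, are all hypothetical.

```python
import hashlib

BUCKETS = 10_000  # hypothetical resolution: 0.01% steps


def bucket(user_id: str, test_name: str) -> int:
    """Map a (test, user) pair to a stable bucket in [0, BUCKETS)."""
    key = f"{test_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % BUCKETS


def variant(user_id: str, test_name: str, ramp_percent: float) -> str:
    """Return "treatment" for roughly ramp_percent of users, else "control".

    Hashing the test name together with the user id keeps assignments
    independent across tests, and a user who is in the treatment at 10%
    stays in the treatment when the ramp is raised to 50%.
    """
    threshold = ramp_percent / 100 * BUCKETS
    return "treatment" if bucket(user_id, test_name) < threshold else "control"
```

Because the assignment is a pure function of the user and the test name, the same visitor sees the same variant on every page view without any stored state.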
effort into tooling to support this work. This is a screenshot of our A/B analyzer, which automatically generates a dashboard with important business metrics for every configured test.
from some gnarly statistics. This wizard does the math for you and lets you know how long an experiment will need to run in order to have a significant result.
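For a sense of the math a wizard like this hides, here is a rough sketch using the standard sample-size approximation for comparing two proportions. It is not the wizard's actual implementation; the function names, the default significance level and power, and the even 50/50 split are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline: float, relative_lift: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Visitors needed in each arm to detect a relative lift in a rate.

    Normal-approximation formula for a two-sided test of two proportions.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_to_run(n_per_arm: int, daily_visitors: int,
                fraction_in_test: float = 1.0) -> int:
    """Days needed, assuming test traffic is split evenly between two arms."""
    per_arm_per_day = daily_visitors * fraction_in_test / 2
    return math.ceil(n_per_arm / per_arm_per_day)
```

The useful intuition is that the required sample grows with the inverse square of the effect size: halving the lift you want to detect roughly quadruples how long the experiment has to run.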
from breaking things. I’m going to call what we do “continuous experimentation,” for lack of a better term. We try to make small changes as much as possible, and we measure those changes so that we stay honest and don’t break the site.
So what do I mean by “breaking the site”? Well, behind every Etsy shop is a person who depends on it and counts on us not to push changes that hurt their business. So we would be remiss not to measure our changes.
The second reason we measure product releases is so that we stay honest. Many of Etsy’s sales are seller-driven, so our graphs currently tend to go up no matter what. Obviously that can’t continue forever. But it means a rising graph tells us nothing about a release, so we have to use A/B testing to tell if we’ve made things worse or better.
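As a sketch of what "telling if we've made things worse or better" can look like in practice, here is a two-proportion z-test comparing a conversion rate between control and treatment. This is one common approach, not necessarily what our A/B analyzer does; the function name and the numbers in the usage note are made up for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference between two conversion rates.

    Returns (z, p_value). A positive z means arm B converted better than A.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

With hypothetical counts, `two_proportion_z_test(100, 1000, 150, 1000)` gives a z-score above 3, which would count as a real difference at the usual 5% level. Two arms with identical rates give a z-score of zero, however much the overall graph went up that week.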
tabs. We should do that on Etsy.” - Typical know-it-all Etsy employee. Let me give you an example. A few years ago there was controversy internally at Etsy over whether items should open in new tabs. Some Etsy employees open items in new tabs themselves when they’re digging through search results, and they wished that it happened by default. They thought that the average user would be happier if this were the case.
When we tried that, 70% more people gave up and left the site after getting a new tab. Maybe some Etsy employees know how to use tabs in a browser, but my grandmother doesn’t. We’ve replicated this result more than once.
is that process has to change to accommodate data and experimentation. If you follow a waterfall process and try to bolt A/B testing onto it, you will fail.
at once. A/B test as a hurdle. Assumptions. Removing the Search Dropdown: Multi-stage release. Iterative. One thing at a time. A/B testing integral to process. Hypotheses. These were two projects done largely by the same team. Infinite scroll was poorly managed, and a release removing a dropdown in our site header was well managed.
experience. We did this because we thought it was obvious that more items, faster, was a better experience. There’s a lot of web lore out there to that effect, based mostly on findings Google has made in its own search.
some bugs. 3. A/B to measure obvious big improvement. 4. Rent warehouse. 5. Hold release party in warehouse. (Implied) So when we decided to do this we just went for it. We designed and built the feature, and then we figured we’d release it and it’d be great.
from search* They bought fewer items from search. Now, they didn’t buy fewer items overall; they just stopped using search to find those items. Which is kind of interesting. It was clear we’d made search worse.
thing that occurred to us is that there must have been bugs in the product that we missed. So we spent a month trying to figure out if that was the case. We sliced results by browser and geographic location. We sent a guy to a public library to try using an ancient computer. We did find some bugs, but none of them changed the overall results.
Eventually we came to terms with the fact that infinite scroll had made the product worse, and that we had changed too many things in the process to have any clue which was the culprit.
first place.” So we were in a situation where we weren’t sure whether we should continue working on this. Even if we had issues in IE or something, the behavior of people using Chrome wasn’t any better; it was also worse. How do we know if it’s a good idea to finish this or not? So we went back and tried to verify that the premises that made us do this were right.
people get to an item page as the result count increases. Absolutely no change in purchases. And the answer was yes, maybe a little bit, but only barely. There was a very slight improvement in the number of people who ever got to an item page, and purchases aren’t sensitive to it. There’s no increase in purchases when we increase the number of search results.
some bugs. 3. A/B to measure obvious big improvement. 4. Rent warehouse. 5. Hold release party in warehouse. (Implied) Lots of work. Didn’t happen. So if we go back to our “product plan,” we see a couple of major things wrong with it. We did a lot of work, and it was pointless.
more items is better (easy) 2. Validate premise: faster is better (easy) 3. Either: A. Abort! (easy) B. Build infinite scroll (hard). A better way to have done this would have been to validate those premises ahead of time and then make the call. But we didn’t do that.
work feels really horrible. Most of the time this is a really difficult choice to make, and without a lot of honesty and discipline, most teams aren’t going to do it. We are not very rational creatures in the face of sunk costs.
is not that infinite scroll is stupid. It may be great on your website. But we should have done a better job of understanding the people using our website.
Default to “all items.” 3. Rich autosuggest. 4. Suggest shops in item results. 5. Add favorites filter to search results. 6. Search bars on item and shop pages. 7. Kill the dropdown. So we wanted to remove this thing. Chastened by the infinite scroll release, we did our best to plan this out in smaller steps.
Kill the Dropdown: Project Plan. Short. Measurable. Isolated. Each of these steps is small and isolated.
Opportunity to change plans. Each step is an opportunity to get real feedback and change directions if we have to.
Ambitious design goal, never out of sight. And all of the individual releases were small, but the overall design goal was still ambitious.
we did this, sales of vintage items without the dropdown in place increased by almost 4%. So we increased the ability of buyers on Etsy to find vintage goods; we didn’t decrease it. Which is a great thing to be able to tell our community.
search dropdown was that it was context-sensitive. So if you were on a shop page it defaulted to searching within the shop. And in some other situations it would search for people.
So you more or less get the idea here. We had a big goal, which would have been unmanageable as a single release. We did it as ten or fifteen small releases.
Develop. Measure. Infinite Scroll. Dropdown Redesign. Contrasting the two release plans, infinite scroll was a big bet that didn’t work out. The dropdown redesign was a series of small bets: some worked and some didn’t, but we didn’t have to throw out everything when things didn’t work.