Tuesday, May 13, 2008

Do Searchers Search More Over Time?

A paper from 2004 recently fell into my hands. It's from the journal Management Science, so you'll have to go there to get the paper (you'll need an account). As is usual in journal papers nowadays, it has five authors: Johnson, Moe, Fader, Bellman, and Lohse.

They did several studies, partly focused on asking whether people use search more often as they get more experienced on the Web. They also looked at how much people searched for sites when they wanted to buy something.

The results might be surprising to some - even in the age of search, users don't like to check out a lot of e-commerce stores. The majority prefer to settle on a few stores and return to them rather than constantly trying new ones. And even as they get more experienced with search, they don't use it much more; overall, people simply don't search as much as you'd think.

None of it surprises me, although they didn't account for some factors, like age. Marketers have long known that past a certain age, the willingness to try new brands drops like a stone. The authors didn't break out sessions by individuals, but by households, so there's likely to be a lot of slop in the data.

The question that occurs to me is whether, if such brand loyalty online is real, it's due to actual loyalty or to reluctance to tangle with a new interface. E-commerce isn't so much like brick-and-mortar shopping as it is like operating software, and few users enjoy mastering new software. Is it the label, or the comfort level?

There's also the fact that many users dislike searching unless they know a strong keyword. Look for "toilet seat" and you're likely to find an online hardware store. Search for "pens" and you'll end up with specialty stores, stationers, and collector sites, all of which have to be sifted. Google is good, but it's not clairvoyant.

Monday, April 28, 2008

Third-Party Perils

A lot of client sites that I evaluate have tagging problems that aren't really of their own making. We have clients "tag" their sites for analytics purposes, sending data back to our mothership, which is then returned to the client as reports. As you undoubtedly know, it's become commonplace to "farm out" certain parts of a site to third-party suppliers. Many clients, for example, now outsource their employment pages, with just enough matching page elements to make the visitor think they're still somewhere on the same site. Same thing with newsletters and emails - other people handle them for you. The problem is that those outsourced pages usually aren't tagged, so you can't track them. No tracking, no evaluation. Again, small sites aren't deeply affected, but bigger ones are. If you can, work it out with your vendor to let you tag their pages, or have them tag the pages for you. It's not a new request for most of them. Don't ignore such vital functions as recruitment and marketing contacts.
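
If you do get permission to tag, it's worth verifying that the tags actually made it onto the outsourced pages. Here's a minimal sketch of that check in Python - the tag marker and page URLs are hypothetical placeholders, not any particular vendor's setup:

```python
# A minimal sketch (not any vendor's actual API): fetch a list of
# third-party-hosted pages and flag any that are missing your analytics tag.
import urllib.request

# Hypothetical marker that your tag leaves in the page source.
TAG_MARKER = "analytics.example.com/tag.js"

# Hypothetical outsourced pages you want to verify.
OUTSOURCED_PAGES = [
    "https://careers.example-vendor.com/acme/jobs",
    "https://newsletters.example-vendor.com/acme/signup",
]

def is_tagged(url: str) -> bool:
    """Fetch the page and check whether the tag marker appears in its HTML."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return TAG_MARKER in html

if __name__ == "__main__":
    for url in OUTSOURCED_PAGES:
        try:
            status = "tagged" if is_tagged(url) else "MISSING TAG"
        except OSError as err:
            status = f"unreachable ({err})"
        print(f"{url}: {status}")
```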

Saturday, April 19, 2008

Engage This!

I'm afraid that I have to take exception to yet another Web buzzword. This time it's "engagement". It's hot right now. Just ask Eric Peterson, who's making a little cottage industry out of his own "Engagement Index". Please. Make it stop. "Engagement" is no better defined than "intelligence", "happiness", or "it sucks".

I'm really a numbers kinda guy, with the heart of a researcher. That means I resist sloppy thinking. And "engagement" is just that: sloppy thinking - naming something and believing you've driven to the heart of it. Peterson's various components may have merit, but he's going about this all the wrong way. Ideally, you study a big group of observations and derive patterns from them using standard statistical techniques; you don't just wish the patterns into being, no matter how sure you are that they exist. Then you validate your model against a known situation and see if it holds up. If it wavers, fragments, or veers wide of the mark, your model is faulty.

So far as I can tell, Peterson has never subjected his model to rigorous validation. His various engagement components in the index aren't weighted, so as one rises another could fall, leaving you with the same EI, but with a different situation entirely. I think anybody who relies on a single-measure EI to make expensive business decisions is playing with a loaded gun with the barrel plugged.
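
To see the weighting problem concretely, here's a toy sketch with made-up component names and numbers (not Peterson's actual index): two visits with very different behavior land on exactly the same unweighted score.

```python
# Toy numbers, not Peterson's actual components: two visits to a site,
# scored on three "engagement" components that are simply summed.
def engagement_index(components: dict) -> float:
    """Unweighted engagement index: just add the component scores."""
    return sum(components.values())

visit_a = {"recency": 0.9, "duration": 0.2, "depth": 0.4}   # frequent but shallow
visit_b = {"recency": 0.2, "duration": 0.9, "depth": 0.4}   # rare but deep

print(engagement_index(visit_a))  # 1.5
print(engagement_index(visit_b))  # 1.5 - same EI, entirely different situation
```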

That's not to say that "engagement" could never be defined. It can. But it can be defined only as a series of KPIs that shouldn't be arbitrarily added together. A simple radar chart could show them all. So could time series charts. And it should be defined anew for each site. The quest for a standardized index will go on, but in the end I think it's futile. Adopt a Deming approach and keep working on your own special site. I don't think there are any shortcuts.

Tuesday, April 8, 2008

More Unintended Consequences

I love all aspects of how humans interact with technology, so I was particularly interested in seeing how well the new crimecams of San Francisco would work out. Turns out they're very effective in reducing crime - within range of the cameras. Were the designers of this system not parents? Even toddlers catch on that if you want to misbehave and not get caught, you move out of sight. Mayor Gavin Newsom voiced the paradigm of a generation when he said that the cameras at least made people feel safer. This would seem absurd if it weren't followed by the next quote. Paraphrased, it says that citizens felt safer because crime moved away a block or two, so that their neighbors would have to deal with it instead. The ultimate nimby. Newsom even says that he anticipated some kind of felon shuffle when he had them put up, but that voters generally liked them. The fact that the cameras might be able to zoom in on their bedtime activities doesn't seem to faze them.

Friday, March 28, 2008

Old Houses and Portals

I live in an older home (1920s) in a historic neighborhood. It's not a particularly wealthy neighborhood. Most historic ones aren't. Money flees its breeding ground. But the neighborhood is comfortable and reasonably vibrant. I always wondered why I loved older homes, and finally one of my gurus, Stewart Brand, might have explained it in his book How Buildings Learn: What Happens After They're Built. He says that older buildings exude what we call "charm" or "character" because they've been altered over time to suit both changing infrastructure needs (the arrival of central heating, air conditioning, indoor plumbing, electricity) and the changing life needs of their occupants (bigger kitchens, more light, more entertainment at home). They grow, morph, and gradually conform more closely to actual human life, like an old pair of jeans. New homes are raw despite their designers' efforts to "design for life". Brand points out that buildings can't be designed up-front for our lifestyles, because no designer can get it right the first time. That's why a home needs so much time to find its proper shape.

What's the lesson for Web designers? Alas, probably not much, despite my most earnest desires to bring the analogy across. The missing element is time. Websites don't give you time. Portals were supposed to let users modify their views quickly, compressing the decades of home conformity into minutes online. Never worked. The vast majority of visitors never knew about customization or took the time to mess with it. Personalization works to an extent, but not completely. Web users are now used to their comfortable sites changing regularly, and although they may not approve, they rarely boycott on that basis.

That said, for years I've been fascinated with the idea of a personalization engine that would track Web user behavior and subtly shift the interface to suit. I've never bothered to fully flesh out the concept, but in general it would work much like Microsoft's failed personalization functionality in Office, the one that gave you chevrons instead of full menus. It was a good idea, but possibly the wrong place to use it. Office users are almost all repeat visitors. Website visitors aren't. Amazon does a good job with personalization, but I'd extend it from "you might also like this stuff" to actually shifting controls and navigational paths. A pipe dream, certainly, but given a huge pile of cash something I'd be interested in researching.
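
For what it's worth, the core of the idea fits in a few lines. This is a toy sketch of the concept only - nobody's real implementation - in which the site counts which navigation links a returning visitor actually uses and surfaces the most-used ones first, much like Office's chevron menus did:

```python
# A toy sketch of the idea, not anyone's real implementation: track which
# navigation links a returning visitor actually uses, and surface the
# most-used ones first while tucking the rest behind a "more" expander.
from collections import Counter

NAV_ITEMS = ["Home", "Products", "Support", "Careers", "Blog", "Contact"]

class AdaptiveNav:
    def __init__(self, items, visible=3):
        self.items = list(items)
        self.visible = visible
        self.clicks = Counter()

    def record_click(self, item: str) -> None:
        """Call this whenever the visitor uses a navigation link."""
        self.clicks[item] += 1

    def render(self):
        """Return (shown, hidden): most-used links first, rest behind 'more'."""
        ranked = sorted(self.items, key=lambda i: -self.clicks[i])
        return ranked[:self.visible], ranked[self.visible:]

nav = AdaptiveNav(NAV_ITEMS)
for click in ["Support", "Support", "Blog", "Products", "Support"]:
    nav.record_click(click)
print(nav.render())
# (['Support', 'Products', 'Blog'], ['Home', 'Careers', 'Contact'])
```

The hard part, of course, isn't the counting - it's shifting the interface without disorienting the very visitors you're trying to help, which is exactly where Office stumbled.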

Saturday, March 15, 2008

Reading TeaLeaf

Sorry to have been away so long. Complications of various kinds. But now I'm back, and with tea.

Have you seen TeaLeaf? It's a snazzy app that sits athwart your Web traffic, sniffing and recording every user's session. A bit disconcerting, that. But its benefits are undeniable. It stores thirty days (or more, at your discretion) of user transactions, at the user level. It aggregates them too. I've long been a proponent of continual usability checking. Our profession seems to put all its emphasis on initial design and testing, while utterly neglecting Web analytics and other red-flag functionality that can signal usability leaks. Traditional Web analytics is good, but it isn't always granular, meaning that its results are en masse, not at the level of the individual user. It's great for marketing departments, but not as good for usability concerns. TeaLeaf shows the actual user transactions - where people go, what they click, what choices they make, and whether their conversions are successful.

For example, you can lose users at any turn in the road, but especially during checkout. Many visitors drop off when money becomes an issue, and understandably so, since they had no intention of paying anyway; they're just there for the experience, or the knowledge. But others experience technical problems or usability pitfalls. TeaLeaf generates a report on who converted and who didn't, and then you can track down why the failures happened, following every user's trail.
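
To make that concrete, here's a rough sketch of the kind of session-level funnel analysis I'm describing - illustrative Python with invented session data, not TeaLeaf's actual reports or API:

```python
# A sketch of session-level checkout analysis (illustrative only).
CHECKOUT_STEPS = ["cart", "shipping", "payment", "confirmation"]

# Hypothetical recorded sessions: the ordered list of checkout steps
# each visitor actually reached.
sessions = {
    "user_001": ["cart", "shipping", "payment", "confirmation"],  # converted
    "user_002": ["cart", "shipping"],                             # never reached payment
    "user_003": ["cart"],                                         # left right after the cart
    "user_004": ["cart", "shipping", "payment"],                  # never confirmed
}

def last_step_reached(events):
    """Return the deepest checkout step present in a session."""
    reached = [s for s in CHECKOUT_STEPS if s in events]
    return reached[-1] if reached else None

converted = {u for u, ev in sessions.items() if "confirmation" in ev}
for user, events in sessions.items():
    if user not in converted:
        print(f"{user} dropped off after '{last_step_reached(events)}'")
# Then you pull up those exact sessions and replay them to see why.
```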

Saturday, December 15, 2007

Conversion Rates and the Types of Visitors

Wendy Moe and Peter Fader published a paper in March 2003 titled "Dynamic Conversion Behavior at e-Commerce Sites". In it, they talk (among other things) about the types of visitors to e-commerce sites. This struck me because in analytics we tend to lump all visitors together, simply because we can't easily define the segments. Moe and Fader mention the obvious: conversion rates on e-commerce sites are spectacularly low compared with physical stores, below 5% in most cases. Any brick-and-mortar store would have closed up within a week at that rate. They analyze why rates are so low; the main reason is that so many visitors aren't really immediate buyers. Moe and Fader classify visitors into four groups, only one of which is the get-in-get-out immediate buyer type. Note that I've added notes from my own perspective, so Moe and Fader may take issue with how I'm using their categorization.

  • Direct buyers. They come, they choose, they slap down the plastic. They enter knowing what they're looking for. The site can be marginally usable and unattractive, and they'll still probably buy.
  • Indirect buyers. They know generally what they want, but they're browsing. Probably will buy, but will take a while and lots of pages. May be influenced somewhat by site characteristics, but not extensively.
  • Threshold buyers. These aren't ready to buy, but they're curious and skittish. They're window-shopping. For these visitors, site elements are everything. If the site isn't sticky, they'll leave. Likely influenced by usability and attractiveness. Store impression is as important as product.
  • Never-buy visitors. These are seeking knowledge, not product. Not likely to be influenced by site appearance or usability. May look at lots of pages or very few. No intention of buying.
Now, these categories aren't mutually exclusive. I switch modes myself. I may go to Amazon to order a book I've been wanting, or I may go there just to see what's inside a particular book that my campus will order.

The relative size of each group has a huge bearing on site owner strategy, but all four groups are typically crammed into the data and dashboards indiscriminately. In statistics, we call this "conflating populations". We see it often in multimodal distributions - ones with multiple "hills". And it makes analysis almost a blind operation. For example, if you're seeing a high per-session page view count, but the conversion rate refuses to rise from 2%, is it because you haven't satisfied the threshold buyers, or because you have too many never-buys, or because the population is mostly indirect buyers? One solution won't hit all of them, so choosing to optimize something on the site may not be the answer. For example, if you streamline the checkout, that may help to capture more of the threshold buyers, but if they're actually only a small percentage of the visitor count, you won't see much of a bump in conversion.
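
A quick back-of-the-envelope example, with made-up segment shares and conversion rates, shows why: lifting one small segment barely moves the blended rate.

```python
# Made-up segment shares and conversion rates, just to show the arithmetic.
segments = {
    #                   (share of traffic, conversion rate)
    "direct buyers":    (0.03, 0.50),
    "indirect buyers":  (0.10, 0.05),
    "threshold buyers": (0.07, 0.02),
    "never-buys":       (0.80, 0.00),
}

def blended_rate(segs):
    """Overall conversion rate across all segments combined."""
    return sum(share * rate for share, rate in segs.values())

before = blended_rate(segments)
# Suppose a streamlined checkout doubles the threshold-buyer rate.
segments["threshold buyers"] = (0.07, 0.04)
after = blended_rate(segments)
print(f"before: {before:.4f}, after: {after:.4f}")
# roughly "before: 0.0214, after: 0.0228" - barely a bump
```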

For very large sites, picking the right optimization strategy may make the difference between a huge loss and massive improvement, six figures or more. So how do we segment these populations? Surveys always beckon to us, but I'm skeptical. Surveys online are always self-selected, and self-selection seems to me to invalidate most survey data. There may be clues in the analytics data itself, but I have yet to find a formula. Moe and Fader propose a formula (in fact that's the major purpose of the paper). It would provide a good basis in the real world if only our figures for new and returning visitors were accurate, and they're decidedly not. Several studies have confirmed that cookie-based figures for new visitors can be off by a factor of 2 or more. Unfortunately, if we don't know who's coming to the site, we can't segment them, and without registration we just can't be sure.

Still, keeping these four categories in mind will help when doing site analysis and optimization. If we can make an informed guess about which category of visitor is dominating, we can advise the client accordingly. For example, if we get a lot of visitors to particular pages that have a lot of information, and the pages are obviously being read, and the product is unusual or truly new, then we may have a lot of never-buys. This intuitive approach isn't completely satisfying to me, but it may be all we have.

Tuesday, November 27, 2007

Scott Adams and the Demise of Common Sense

Scott Adams, the creator of Dilbert, has announced on his blog that he'll be blogging less often. It seems that his original common-sense expectations about how the blog would turn out haven't panned out at all.

His original expectations included:

1. Advertising dollars
2. Compiling the best posts into a book
3. Growing the audience for Dilbert
4. Artistic satisfaction

Of these, only number 4 has worked out. RSS lets visitors go around the ads, the book hasn't done all that well, and the audience for Dilbert hasn't grown along with the blog. As the blog has exploded, the benefits to him haven't. So he's talking about blogging less often. It's a great illustration of how common sense is a lousy predictor of future events. Viva testing and statistics.

Wednesday, November 21, 2007

Numbers Aren't Always Useful

A while back I worked with a client who used a popular service that provides percentage figures for visitation of others' websites. This service contracts with ISPs to get sanitized web visitation figures from its subscribers, some ten million of them last I heard. Then it reports mass figures on who went where. The problem is that it's impossible to find out just what those numbers are - all you get from the service is percentages, which are presumably percentages of the subset of ten million that went to that particular site on any given day. How many is that total? Nobody knows, and the service isn't telling. In my client's case, they were getting figures like .0016%, which is so low that it's hardly worth knowing, but they were very keen on it, watching the numbers shift daily from .0016% to .0021%, and cheering lustily at the uptick. Of course, .000016 x 10^7 is only 160 (visits? users? who knows?), and the surge amounted to only 50, a number that didn't make them so cheerful when I pointed it out. And that assumes that the denominator is indeed ten million, which it probably isn't - it's doubtful that all ten million subscribers are online on any given day. The actual improvement might have been as little as 10, or less.
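
For anyone who wants to see the arithmetic, here it is in a few lines of Python, with the denominator treated as the unknown it really is:

```python
# The arithmetic from the paragraph above: a tiny percentage only means
# something once you know the denominator - and here we don't.
def visits(pct: float, denominator: int) -> float:
    """Convert a reported percentage into an absolute count."""
    return pct / 100.0 * denominator

for denom in (10_000_000, 5_000_000, 2_000_000):  # who knows how many were online?
    before = visits(0.0016, denom)
    after = visits(0.0021, denom)
    print(f"denominator {denom:>10,}: {before:>5.0f} -> {after:>5.0f} (gain {after - before:.0f})")
# denominator 10,000,000:   160 ->   210 (gain 50)
# denominator  5,000,000:    80 ->   105 (gain 25)
# denominator  2,000,000:    32 ->    42 (gain 10)
```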

The problem is taking percentages as real numbers. They need to be scrutinized, and you need to know what the denominator is. We called the service to find out just what those percentage figures meant, but they either wouldn't, or couldn't, tell us. The people we talked to were frankly ignorant of simple statistics, calling the ten million subscribers a "sample". I had to tell them that it wasn't a sample, it was a sampling frame, and that without knowing the denominator for the percentage, the tiny percentages the client tracked were all but meaningless. The service had no answer, and the call ended unsatisfactorily. And the client continues using the service to this day, and happily reporting the results to higher-ups.

The service may be worthwhile for larger sites that get a sizable percentage of whatever data points the service is tracking, but any sites smaller than that are probably not getting their money's worth. I used to think that people with marketing degrees would be more conscious of the fallibility of numbers and the need to understand analytics, but I've been wrong more often than right.

Sunday, November 18, 2007

Common Sense Sucks

As HCI practitioners who believe in making technology slicker to use, our biggest opponents may be budgets, but coming up strong on the outside is stupidity based on common sense. Human psychology is a weird and wonderful thing, and I love studying it because there's always something unexpected waiting to mug you around the corner. Where others love bar fights, I love getting pasted by new knowledge. And research has shown for a long time that common sense is a lousy predictor of future events. You'd think that humans would be good at prediction by now, but we're not. For one thing, we think in linear terms - one cause, one effect. But the universe is gloriously nonlinear, and every event has many causes. Further, humans tend to predict based on what's going on now. When we ask "how would you like this?" or "how would you feel about this?", the respondent has to extrapolate based solely on how he feels at the moment. When we put this to the test, we find that the respondents don't really feel that way later on. This is why I put no faith in predictive surveys.

When those in power use common sense to make sweeping laws or spend huge sums of money, they too usually screw the pooch. The Freakonomics blog has a short piece on how abstinence-only sex education has actually resulted in more teen pregnancies, something any psychologist could have predicted. The urge to merge is far too strong to educate away, especially in adolescents who haven't yet developed much control over their impulses. The victims of abstinence-only education aren't given the tools to prevent disease or pregnancy, but they're driven to give in to the mating call anyway, resulting in more pregnancies. It's obvious from studies that abstinence-only ed doesn't work, and New York has appropriately dropped it, despite losing millions in federal funds. But the US Congress stubbornly sticks to the plan. Common sense is dooming teenagers and costing millions, all with no substantial foundation, but that doesn't matter.

The lesson for us is to mistrust common sense, both our own and our clients'. I've seen clients cling to old, unusable designs simply out of faith. Analytics, psychological principles, and user testing will eliminate most of the problems if they're used, but they won't be applied if common sense has anything to say about it.