
On why doing usability tests remotely is a sham.

Here is my take on why conducting usability tests online is not the most effective way to garner product feedback. I have been working from home since March 2020, and while I love it, it has given me a chance to observe the pros and cons (mostly cons) of conducting usability tests online.


While every research method, whether executed remotely or in person, has its own advantages and disadvantages, here I want to present a case for in-person testing and its higher return on investment over remote testing.

Hazards of prioritizing ease of operations over insight robustness

I understand that product teams are typically huge fans of fast, cheap, and effortless research setups. Online usability tests are one of those arrangements. You don't need a physical research lab or a third-party local recruiter, and session timings are flexible and easy to reschedule. Unlike in-person research, there are no travel costs, no need to spend two weeks to a month gathering feedback from eight people, and no need to keep the researcher, designer, and product manager fully occupied during a research trip. When testing remotely, everyone on the team can take care of multiple work items because they are not traveling across the country and spending their time figuring out logistics.


I will try my best to explain why it is important to spend that time, effort, and money in the product development process. The simplest analogy is to think about what fast food (generally cheaper, made quickly, and requiring no great culinary skill to prepare) does to a growing child versus what whole food prepared with skill and care can do.

Early development needs nutrition. Products or features in their early, defining phases need good-quality qualitative feedback so the team can respond quickly to user perceptions and pain points.


1. Biased recruitment and low-fidelity data

Most of the time, there is a recruitment bias on online recruitment and testing platforms like UserTesting, UserZoom, etc. People who have access to laptops, people who are generally tech-savvy, people who need to make an extra buck, and people who have online banking set up for themselves are the ones who sign up to participate in these tests.

Housewives, subject matter experts, and people who mostly operate on a mobile phone are some examples of profiles that are typically hard to find on these platforms.


The feedback gathered in a remote setup is as low-fidelity as talking to a friend over video chat. You can't tell whether they are bluffing, and network connection issues make it hard to build rapport. You don't get any body-language cues, and you can't tell whether you should ease the other person in or use the awkward-pause technique. (Yes, I use this technique when talking to my friends and family too.)


Often these interruptions in a conversation break the flow a participant needs to be in to get immersed in the product experience. Ensuring that participants get immersed in the product experience and have enough time and space to identify how they are feeling and articulate those feelings is the core purpose of these tests. The intention of creating a mock-up for the usability test is to expose participants to simulations that are as close to the real-life scenario as possible.


Without this immersion and space to express themselves, the feedback we gather is a superficial review of things. This is when participants begin to say things like "I think this will be useful for people that…" They talk in the third person; the experience is no longer theirs. THAT'S A PROBLEM.


2. Poor immersion for product teams

In the few cases where I have run in-person tests, there has never been a session without watercooler conversations with participants. Typically before or after the interview, while we walk them in or out of the lab or wait with them as they book their cab, they talk about the topic at hand, why it is important to them, or how they feel about the app/company/tech they use. These are important observations about the subjects. They add life to the insights.


Another side effect of these conversations is that they help designers and program managers build empathy with that user group. These brilliantly educated minds, who usually dismiss user insights, start to develop an empathy muscle by engaging in such conversations. After sitting in the interview room, they develop a skill for absorbing insights; synthesizing becomes easier, and so does reporting results. There is no resistance to accepting the insights that are presented because, having sat in that room, the designers and program managers already sense the insight; the report just puts words to what they already know. The last-mile connection from insight to product actionable is of higher quality because of the team's in-person immersion.


(This paragraph is for the philosophy nerds)

In his Critique of Pure Reason, Kant famously explains the distinction between a priori and a posteriori judgments. Steven Moctezuma, in his Kant's Synthetic A Priori Knowledge, writes:

A priori judgments are judgments that arise from reason alone. Such judgments are independent from any sort of experience or knowledge from the senses. These judgments apply with strict universality and necessity.
On the other hand, a posteriori judgments are judgments that arise from experience. Such judgments cannot arise from reason — they must be derived from sensory knowledge. Judgments like the sun is warm is a posteriori. A posteriori judgments have no application of universality or necessity because judgments of experience give particular instances of how things are, not that they must be a certain way in every possible case.

Qualitative research is broadly based on inductive reasoning, and inductive reasoning is a posteriori judgment. Hence, it is textbook practice to ensure that all stakeholders (designers, program managers, researchers, and participants) are immersed in the testing experience.


3. Online testing platforms are built for the customers, not for the researchers or participants

When interviewing in developing countries, tools like Validately fail to perform. When trying to speak with people who access the internet only through their phones, web-first platforms fail to provide an immersive experience, since the testing platforms tie in with standard web-based video-conferencing services like Zoom and Google Meet.

This is especially true when trying to speak to participants with low storage space on their phones. They dislike downloading new apps; onboarding them through a complicated login flow or even getting them to tap a link is a hassle; their internet speeds cause a lot of buffering; and their phones heat up or begin to hang. All this while the design mock-ups are barely visible (if the researcher screen-shares the mock-up).


[Image: how the participant views the mock-up on their phone. This is a screenshot from an actual remote usability test my team ran at one of my orgs; I have blurred all brand-identifiable parts to make sure I don't get sued.]



[Image: what the researcher sees on their desktop.]

When mobile-first users test mock-ups that the moderator presents from their own screen, participants are looking at phones with low-contrast displays and small screens, in an environment full of familiar distractions. What must we do to ensure the mock-ups are designed so that experience immersion is achieved?


Clearly, the interface of the testing platform does nothing to avoid these UI mishaps, nor do the designers make an effort to ensure the participant has an easy-to-onboard, immersive testing experience.


4. Conclusion

All in all, prioritizing the participants' testing experience is of qualitative value. It's hard for me to put a number on it and say, "In-person or immersive testing is 80% more efficient than remote testing." And since organizations prioritize quantitative value, it will always be hard for qualitative researchers to demand better funding.


Given these circumstances, it typically falls on the researcher to ensure that "best practices for making a mock-up" or "best practices for conducting a remote usability test" are propagated in their team's culture. Finding allies in the product team usually helps the cause. Until then, this article stays on the internet as a rant :)
