Thursday 26 August 2021

What to look for in a user test

When conducting a user test it is important to know what to look for:
  • Were users able to complete the tasks asked of them? And why or why not?
  • Were users able to complete their tasks promptly and simply? And why or why not?
  • Did users make mistakes? Did they realize they had made a mistake? Could they recover from it? Why did they make the mistake, and why did they (or didn't they) realize it?
  • How did users feel about the experience? And why did they feel that way?
The key question is why. When we run a user test we want to know the root cause of a problem or a success; we want to know why something works or doesn't work, so that we can avoid design pitfalls and move towards effective designs.

When conducting a user test we also want to capture information during the test itself:

Qualitative Data
Critical incidents: things that happened during the test that may explain the results
Verbal accounts: statements made by the user that indicate their thought process, attitude, and explanations

Quantitative Data
Time to complete a task
Success rate of task

Critical incidents are the bread and butter of user testing. They include any action taken by the user that explains why they were or were not successful at a task, for example:
  • Clicking the wrong button
  • Ignoring the instruction shown on the screen
  • Providing the wrong information
  • Following the wrong path
  • Misinterpreting a label
  • Expressing confusion or frustration
  • Asking for help
  • Staring at the screen for a long time
  • Giving up
Verbal accounts are the things users say while performing their tasks. They provide insight into what's going on inside the user's head during the test; users might say things like:
  • "I'm looking for…"
  • "I was expecting to see…"
  • "I wonder what this does"
  • "Well, that doesn't seem right"
  • "I think that was right"
  • Asking a question
These types of qualitative data are essential for establishing actionable intelligence, that is, information we can leverage to fix a problem. For example, quantitative data could be something like "40% of users failed Task A", while the qualitative data would be "2 out of 5 users couldn't figure out how to fill in their shipping information, resulting in a failure of Task A". An even better result is to link your qualitative data back to heuristic guidelines.

Data about users
Not only should you collect data about the test itself (how it went, what the users did or thought), but also information about the users themselves; things that provide better insight into the people being tested, such as:
  • Technical competence: familiarity with computing in general or with the particular platform (iOS vs Android vs Windows, or mobile vs desktop)
  • Domain expertise: whether they are familiar with the particular domain. If it's a social media app, do they use social media? If it's a shopping site, do they shop online?
  • How frequently they engage in this behavior
  • general demographics
    • Age
    • gender
    • Education
    • Ethnicity 
The goal of collecting data about our users is to better understand the whys: why a certain group of users failed a specific task. Most often the deciding factors will be technical competence and domain expertise.


Saturday 21 August 2021

Formative Tests

Formative tests are performed to identify problems that need to be fixed, and they are far more common than summative tests. Where summative tests focus on quantitative data, such as "90% of users were able to complete Task A in under 30 seconds", formative tests focus on qualitative analysis of problems, such as:

  • Users struggled to accomplish "Task A" because the required button failed to convey that it was a button.
  • Users failed, or took too long, to complete "Task B" because the required information wasn't readily available and they had to go searching for it.
Formative tests are the most common type of test used in UX research and design. They are performed during the design phase with the goal of identifying "bugs" to fix.

General procedure
  1. Have representative users perform tasks
  2. Observe them, what they do, what they say
  3. Capture where they struggle
  4. Identify parts of the design that cause problems

Summative tests           Formative tests
Users perform tasks       Users perform tasks
Representative users      Representative users
Prove a point             Find a problem
Quantitative              Qualitative
Many users                Few users
Rare                      Common

Monday 16 August 2021

Summative tests

Comparative summative tests: used to determine whether a new design is better than a legacy one, or to compare two designs (two of our own, or ours against a competitor's) to see which is better suited.

It is important to demonstrate, using metrics, that one design is superior to the other:
  • 30% more users completed tasks A, B and C using interface 1 vs interface 2.
  • Users completed tasks 45 seconds faster using interface 1 vs interface 2.
  • Errors were reduced by 25% using interface 1 vs interface 2.
Tests are conducted with the same types of users, but not the same users. Once complete, a comparison is made of the differences in performance between the interfaces; it goes without saying that the more users you run the test with, the more reliable your results are.

High level Procedure:
  1. Hypothesis: make an educated guess as to the results of your experiment
  2. Control group: compare one group to another
  3. Only the interfaces should vary; user types should be the same (not the same users, but the same profile of users). The tasks to complete and the data in the system should also be consistent across the interfaces being tested.
  4. Use statistical comparisons (t-test, chi-squared, ANOVA) to demonstrate results and put some numbers behind your claims (see the sketch below).
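As a rough illustration of step 4, here is a minimal Python sketch (the completion times are hypothetical and SciPy is assumed to be available) comparing task completion times between two interfaces with a t-test:

    from scipy import stats

    # Hypothetical completion times (seconds) for the same task on two interfaces
    interface_1 = [41, 38, 52, 45, 39, 47, 44, 50, 42, 48]
    interface_2 = [55, 61, 49, 58, 63, 52, 60, 57, 66, 54]

    # Welch's t-test: does not assume equal variances between the two groups
    t_stat, p_value = stats.ttest_ind(interface_1, interface_2, equal_var=False)

    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < 0.05:
        print("Difference in mean completion time is statistically significant.")
    else:
        print("No statistically significant difference detected.")

The choice of test depends on the data: a t-test suits continuous measures like task time, while a chi-squared test suits success/failure counts.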
Benchmark summative tests are used to answer the question of whether our interface meets a performance requirement; for example, that users can create an online profile in 30 seconds and start using the system, that 95% of users tested succeed at accomplishing "Task D", or that users make errors less than 5% of the time.

The benchmark summative test is most appropriate when:
  • There are hard task constraints (e.g. the task must be completed in a set amount of time); a perfect example would be an automated kiosk at an airport.
  • There are defined targets, e.g. operators must process a product in less than 30 seconds, with a 1% margin of error.

Benchmark summative tests are often appropriate for performance-critical domains, such as healthcare or the military.

High level Procedure:

  1. Test users performing tasks with the design
  2. Capture the performance of the tasks being completed: accuracy, speed, success rate
  3. Demonstrate that the captured metrics meet the defined criteria
  4. Use statistical methods to calculate a confidence interval (see the sketch below)
  5. Again, since you're using statistical analysis, you really need to test a large number of users to get a good level of confidence in your results
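As a rough illustration of steps 4 and 5, here is a minimal Python sketch (the numbers are hypothetical) that computes a 95% confidence interval for an observed success rate using the normal approximation to the binomial:

    import math

    successes = 46            # users who completed the task
    n = 50                    # users tested
    p_hat = successes / n     # observed success rate (0.92)

    z = 1.96                  # z-value for a 95% confidence level
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)

    low, high = p_hat - margin, p_hat + margin
    print(f"Success rate: {p_hat:.0%}, 95% CI: [{low:.0%}, {high:.0%}]")

If the benchmark is "90% of users succeed", a confidence interval that sits entirely above 90% supports the claim; one that straddles it means more users need to be tested, which is why benchmark tests require larger samples.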
Takeaways
Summative tests are used to determine that a design is better, or good enough; they summarize a characteristic of a design that we want to make a claim about, and that claim is supported by statistical data. Summative tests are rare in user research because of the background in statistics that's required, as well as the larger number of users needed.
Summative tests           Formative tests
Users perform tasks       Users perform tasks
Representative users      Representative users
Prove a point             Find a problem
Quantitative              Qualitative
Many users                Few users
Rare                      Common

Wednesday 11 August 2021

Heuristic Evaluation

Heuristic evaluation is the poor man's UX inspection method: it's cheaper and faster than usability testing, largely because it doesn't require users. As an inspection method, you go through the interface and apply Nielsen's 10 heuristics:
  1. Visibility of system status
  2. Match between system & real world
  3. User Control & Freedom
  4. Consistency & Standards 
  5. Error Prevention
  6. Recognition instead of Recall
  7. Flexibility & efficiency of use
  8. Aesthetic & minimalist design
  9. Help user recognize, Diagnose, and recover from errors
  10. Help & documentation
Heuristic Evaluation Technique 
  • Select system or set of screens to evaluate
  • Step through the user journey and apply heuristics to potential problems
    • Be sure to test error cases
    • Be sure to test help system
  • Write down all violations big or small 
    • Which heuristic they violate
    • Assess the severity of each violation
      1. Cosmetic problem: no real user experience impact
      2. Minor usability issue: Fix if there's time
      3. Major usability issue: Important to fix
      4. Usability Catastrophe: must be fixed
  • Create prioritized list of violations
    • Highlight top 5 to 10 violations
    • Rank in descending order of severity 
    • Use heuristics to explain importance
Describing a heuristic Violation
Description: Drop-down list is not identifiable as such by inspection
Severity 2/4: Minor issue
Heuristic Violated: Recognition instead of Recall
Summary: the user should be able to identify that the "All Tasks" filter is a drop-down without having to remember to click on it.

Screenshot

Recommendation: Add a downward arrow to the right of the selected value indicating that it's a drop down menu.
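A minimal Python sketch (the entries are hypothetical, modeled on the example above) of capturing violations and producing the prioritized list, ranked in descending order of severity:

    from dataclasses import dataclass

    @dataclass
    class Violation:
        description: str
        heuristic: str
        severity: int          # 1 = cosmetic ... 4 = usability catastrophe
        recommendation: str

    violations = [
        Violation("Drop-down filter is not identifiable as a drop-down",
                  "Recognition instead of Recall", 2,
                  "Add a downward arrow next to the selected value"),
        Violation("No confirmation before a destructive delete",
                  "Error Prevention", 4,
                  "Ask the user to confirm before deleting"),
    ]

    # Prioritized list: rank in descending order of severity
    for v in sorted(violations, key=lambda v: v.severity, reverse=True):
        print(f"[{v.severity}/4] {v.heuristic}: {v.description} -> {v.recommendation}")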

Having 3 to 5 experts evaluate a system individually and then pool their findings is one of the most effective ways to identify a large share of violations.

Heuristic Evaluation:
  • Cheap
  • Fast
  • Doesn't require users
User Testing:
  • More realistic
  • Finds more problems
  • Assesses other UX qualities beyond usability

It's normal to use multiple techniques in an iterative process to flush out all usability issues from your system

Summary
  • Heuristic evaluation is a quick and cheap way to identify significant flaws in a user interface
  • Leverages Nielsen's heuristics
  • Inspect each screen, potential errors, and help options
  • Capture and assess violations
  • Prioritize



Tuesday 10 August 2021

Jakob Nielsen's 10 Heuristics

  1. Visibility of system status: the system should always keep the user informed of what is going on, within a reasonable amount of time, using progress indicators or busy signals.
    • If an action takes less than 100 milliseconds it feels instantaneous
    • If an action takes about a second it's noticeable but acceptable
    • If an action takes less than 10 seconds it requires a notification that something is happening
    • If an action takes more than 10 seconds it should be done in the background, with indicators and estimates (see the sketch below)
  2. Match between system and the real world: the system should use real-world language that is familiar to the user rather than system jargon, and follow real-world paradigms, making information appear in a natural and logical order.
    • Takes advantage of the user's schema
    • Align real-world actions with digital ones
    • Processes should just make sense
  3. User control and freedom: users often choose system functions by mistake and will need a clearly marked "Emergency Exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
    • Support the 7 stages of action; perhaps the user wants to try an approach again, but with a deviation
    • Users employ a trial-and-error approach to figure out how to use a new system
  4. Consistency & Standards: users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions
    • Helps users transfer schema knowledge from one part of the system to another
    • A coherent conceptual model helps the user learn the system effectively
    • Use the same term in the same way throughout the system; don't use "search" in one place and "find" in another
    • By staying consistent you help users learn new systems more rapidly, because they follow paradigms they're already familiar with
  5. Error prevention: even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present the user with a confirmation option before they commit to the action.
    • In-process feedback: give the user feedback before they submit, e.g. flagging an invalid email address
    • Provide constraints: keep the user from making mistakes by asking for input very specifically
    • Confirm when the user is trying to do something dangerous, like deleting all files
    • Prevent users from taking actions that are likely to fail
  6. Recognition over recall: Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
    • Prefer recognition over recall
    • Is it reasonable for users to have to recall something?
    • If recall fails, are there cues to help?
  7. Flexibility and efficiency of use: accelerators, unseen by the novice user, may speed up the interaction for the expert user, so that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
    • Recall is very difficult for new or inexperienced users, whereas it is not a problem for veteran users; both types of user need to be catered to
    • Allow users to customize their experience, but don't force them to
    • Accelerators are things like keyboard shortcuts
    • Allow users to create bookmarks and shortcuts
    • Personalization tailors the experience based on past usage
  8. Aesthetic and minimalist design: dialogs should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialog competes with the relevant units of information and diminishes their relative visibility.
    • Visual clutter makes it difficult to focus on important actions
    • Good use of color, shape, animation, and Gestalt principles guides the eye
    • The more there is to look at, the less the user will see
  9. Help users recognize, diagnose and recover from errors: error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
    • Good error messages explain what the user did wrong and how to fix it
  10. Help and documentation: even though it's better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.
    • Sometimes the UI just isn't intuitive enough
    • It's best if help is not needed
    • If help is required, make sure that:
      • help is searchable 
      • task-focused
      • concrete 

These 10 rules of thumb are in place to help design a system that is a pleasure to use.
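As a rough illustration of the response-time guidance under heuristic 1, here is a minimal Python sketch (the thresholds come from the notes above; the function is a hypothetical placeholder, not a real API):

    def feedback_for(estimated_seconds: float) -> str:
        """Pick the kind of feedback to show for an operation of a given length."""
        if estimated_seconds < 0.1:
            return "none (feels instantaneous)"
        if estimated_seconds <= 1:
            return "none (noticeable but acceptable)"
        if estimated_seconds <= 10:
            return "busy indicator (show that something is happening)"
        return "run in background, with progress indicator and estimate"

    for t in (0.05, 0.5, 5, 45):
        print(f"{t:>5}s -> {feedback_for(t)}")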

Thursday 5 August 2021

Micro Usability testing

A usability test is a formalized test in which participants are drawn from the target audience. To take a pragmatic approach early on, we can leverage a "micro usability test", a scaled-back version of a user test:
  • Users can be "Close Enough" to the target audience (whoever is willing and able)
  • Fewer tasks (2 to 3 tasks) closer to 15 to 20 minutes instead of (5 to 10 tasks) 1hr to 1.5hrs
  • No screen recording of user actions
  • No video recording of user
  • No logging of user actions
  • No questionnaires regarding the user's persona
The goal is just to capture the 2 to 3 biggest takeaways from the micro usability test; there's no need to correlate the data and do an in-depth analysis of the whats, whys, and hows.

When conducting a micro usability test you want to come up with 2 to 3 tasks:
  • Each task should be presented to the participant separately from the other tasks
  • Order the tasks from easiest to hardest
  • The tasks should be clear and concise
  • Tasks should have a clear and defined solution
  • When completing the tasks, the user should use the "speak out loud" technique
  • When they think they're done, the user should notify the tester
Once all the tasks are complete, or the user gives up, it is time for the debrief. This is the tester's opportunity to engage with the participant and ask things like:
"At this point you seemed surprised / frustrated / confused; could you tell me why? What was wrong with the system?" It's your opportunity to really investigate and find out what the user was thinking or feeling.

After your ad-hoc investigation you should ask some predefined questions for general feedback:
  • Have you ever used a product like this? Why or why not?
  • Do you see yourself using something like this? Why or why not?
  • Some questions particular to what you're testing
Once the test is run and the test and post-test data are collected, it's time to compile it all into a micro usability test report. The report should consist of 3 sections:

1) Key observations
A few paragraphs about key observations throughout the test
  • Describe the participants and write a persona (who they are, what kind of experience they have with similar systems and with technology overall)
  • How the test went overall
  • Success rate of tasks
  • Partial or complete failures of tasks
2) Problems
Focus on the top 3 to 5 biggest problems observed and diagnose their causes. Things to focus on:

  • What worked well? What didn’t? 
  • What were the most confusing or frustrating aspects of the interface? 
  • What errors or misunderstandings occurred? 
  • What did users think about the interface? 
  • What would they like to see improved?
3) Recommendations 
List the main issues that were brought to light, back them up with evidence from the test, and propose recommendations as to how to rectify the problems.

Sunday 1 August 2021

User Testing

User testing, also known as "usability testing", is one of the main methods used in UX research. At its core it's basically giving a user a task to accomplish within the system and observing them try to accomplish that goal. By observing users work with a system you learn:

  • What works and what doesn't 
  • Why things work and why some don't
  • User needs you missed or misunderstood

The basic flow for running a user test is:

  1. Find potential users
    • When picking users, make sure to select ones from the target audience
    • Pick participants who are not current users of the system
  2. Give them tasks to complete within the system
    • Selecting tasks for users to try is much more difficult than it seems
    • Start with the most common tasks
    • Move on to less frequent tasks: focus on the most common tasks and move in descending order of frequency
    • Closed-ended tasks:
      • Have a clear and defined point of completion
      • Have a verifiable outcome
      • Follow a predictable path
    • Open-ended tasks:
      • Are more natural
      • Difficult to assess success because of ambiguity
      • Explore paths that may not have been identified
    • Use both open- and closed-ended tasks
  3. Observe them complete their tasks
  4. Debrief them after they've successfully or unsuccessfully completed their tasks
  5. Document what you've learned

We do this as part of our assessment iteration so that we can redesign our system to work better.

When selecting our task sets, some things to keep in mind are:

  • Order from easiest to hardest
  • Focus on critical tasks: the things the system must accomplish
  • Include both open- and closed-ended tasks
  • Avoid ordering effects: giving a user the answer to a subsequent task in the current one
  • Don't lead the user: avoid language that divulges how to accomplish the tasks
  • Avoid ambiguous instructions: when defining your tasks, be specific enough that the user will clearly understand what you want them to do, but without leading them
  • Tell the user to indicate when they feel they've completed the task
  • Pilot the tasks
    • Check the tasks yourself and have some colleagues try them out to ensure they meet the above criteria
Think out loud
Participants verbalize out loud what they are thinking as they're accomplishing their tasks:
  • Looking for something
  • Reading text
  • Hypothesizing how the system might work
  • Interpreting system options
  • Interpreting system feedback
  • Explaining their decisions
  • Feelings: frustrated, happy, etc.

It's not natural for users to think out loud, so don't hesitate to remind them that you're interested in how they feel or what they're thinking; use positive reinforcement to coax their thoughts and opinions out.

Advantages of this approach are:

  • Hear how the user thinks about the task
  • Learn what the user actually sees and notices
  • Hear how the user interprets options and feedback

Disadvantages of this approach are:

  • Timing: since users are vocalizing what they're doing, they won't zip through the system as quickly as they might otherwise
  • Attention to detail: since users are vocalizing and paying more attention to what they're doing, they may notice things that would otherwise be overlooked
  • Users will naturally ask questions, but as the observer you are not supposed to answer them
Post user test (debriefing)
Once the user test is complete you can:
  • Review the user's problems to try to get more information out of them
  • Ask the user if they find the product useful and whether it's something they see themselves using
  • Ask if it was usable, credible, and aesthetically pleasing
  • Compare it to existing alternatives
What have you learned?
After you've run your tests and completed your debriefing, it's time to summarize your findings. What you should focus on are the critical incidents:
  • Errors: where users didn't follow the expected path or didn't do what was expected
  • Expressions of frustration: users got stuck or seemed confused about how to proceed
  • Breakdowns: where simple tasks took a long time, or users detoured from the defined journey but still got to where they were supposed to
  • Pleasant surprises: things the user enjoyed, or things that were easier than expected
Assess whether the user failed or succeeded, and to what degree. Capture the user's demeanor: were they happy with the system, or do they think it's a load of bollocks? Most importantly, capture as much objective and subjective data as possible, ideally during the test, the debrief, and directly after. Write down all the critical incidents you can:
  • Mental model mismatches
  • Misinterpretations
  • Invalid assumptions made by the system
  • Missing user needs
  • Too little flexibility 
  • Too little guidance 
While summarizing your results you really want to capture overall reactions to specific aspects of the system and link those to the users' successes and failures.