Hotjar · Behavioural Analytics · SaaS

Helping customers find & act on relevant moments.

A case study demonstrating the approach taken to build a deeper understanding of what makes a screen recording worth watching, in order to power a recommendation engine (AI and machine learning, before ChatGPT).

Key Results (KPIs)

  • Sharing, favoriting or commenting (proxy success metric) increased by +5.81%

  • Rating as worth watching increased by +9.88%, significant (p=0.039).

  • UMUX Score of 79.3.

Summary

An in-depth analysis of the Screen Recordings product, using surveys, analysis of manual tag and comment usage, and customer interviews to clarify and validate what makes recordings worth watching, and to determine how customers act once they find an insight.


In our research, we found that product teams watch recordings to identify issues and prioritize them. In particular, they watch recordings after launching a new release to determine whether it's working or not. It takes a lot of manual effort for insights to be acted upon. That is, if they actually come across recordings that are relevant to them.


Sticker labels were introduced to make it easy for users to flag relevant moments in recordings, as well as to power the recommendations engine to recommend relevant recordings.

My role & responsibilities

Research, surveys, analysis, run workshop, sketch, design, prototype & test.

Launched · Duration

June 2021 · ∼3 months (research/design – implementation)

Stakeholders

Product Lead, BI & Analytics team, Group Product Lead, Tribe Design Lead, squad engineers & other affected squads.

Tools

Figma, Miro, Google Meet, EnjoyHQ, Google Sheets, Confluence, Calendly

Context

Hotjar equips product teams with Product Experience Insights, showing them how users behave and what they feel strongly about, so those teams can deliver real value: discover product opportunities, consolidate qualitative and quantitative data, and communicate user needs.


Recordings allows you to watch real user sessions to see exactly what your users see and to find and fix hidden friction and conversion blockers.

The Business Problem

While Hotjar captures a lot of recordings, customers view only a few of them, so there are likely many insightful recordings they never watch. Those they do watch may not be the most insightful either: the current percentage of accounts watching recordings that favorite, comment on, or share them – our proxy success metric for getting insights out of recordings – is still low. In other words, almost 9 out of 10 watched recordings are potentially irrelevant.


Hence, we needed a way to recommend the right recordings, so that the few recordings customers do watch are more likely to help them find insights.

The Process

Understand

Define

Design

Prototype & Test

Understand

Empathise with customers to (in)validate our assumptions, understand their needs and uncover any pain points.

Research Goals

  • Understand the goals and context of product managers and UI/UX designers when watching recordings

  • Validate and crystallize what exactly makes recordings worth watching, and NOT worth watching

  • Validate our proxy success metric of sharing, favoriting or commenting

  • Learn how customers extract & action the insights they find in recordings

  • Find pain points & areas of opportunity to better inform and validate our roadmap.

Partner with BI & Analytics - Key Stakeholders

Together with the product lead, we shared our plans for a potential recommendation engine with the BI and Analytics team. With support through async Slack channels and multiple calls, the analytics team analysed the correlation between session metadata and favouriting, sharing, commenting, or re-watching (our proxy for insightful moments).
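The shape of that analysis can be sketched in a few lines of Python. This is an illustration only, not the BI team's actual pipeline; the session fields (duration, pages visited) and the binary "proxy action" flag are hypothetical stand-ins for the real metadata.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation; with a binary y this is the point-biserial correlation."""
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Hypothetical sessions: (duration in seconds, pages visited, proxy action taken?)
sessions = [
    (12, 1, 0), (340, 9, 1), (25, 2, 0), (610, 14, 1),
    (45, 3, 0), (280, 7, 1), (30, 2, 0), (520, 11, 1),
]
durations = [s[0] for s in sessions]
pages = [s[1] for s in sessions]
acted = [s[2] for s in sessions]

print(f"duration vs proxy action: r = {pearson(durations, acted):.2f}")
print(f"pages    vs proxy action: r = {pearson(pages, acted):.2f}")
```

A strong positive correlation between a metadata field and the proxy actions makes that field a candidate input for the recommendation engine.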



Direct feedback from our customers

Thumbs up/down rating – This was the first step to getting direct feedback from our customers (quick feedback to get more engagement). The idea was that they would mark recordings as worth watching or not by a quick interaction found in the top right of the player.


We also used this as the first step towards building training data for future recommendations, since supervised learning needs a labelled target: each thumbs rating gives us a boolean label per recording.
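As a minimal sketch of that idea, thumbs ratings can be joined with session metadata to produce (features, label) pairs for a future model. The identifiers and field names here are hypothetical, not Hotjar's actual data model.

```python
# Hypothetical shapes: thumbs ratings keyed by recording id, plus session metadata.
ratings = {"rec_1": "up", "rec_2": "down", "rec_3": "up"}
metadata = {
    "rec_1": {"duration_s": 340, "pages": 9},
    "rec_2": {"duration_s": 12, "pages": 1},
    "rec_3": {"duration_s": 610, "pages": 14},
    "rec_4": {"duration_s": 45, "pages": 2},  # unrated: excluded from training
}

def build_training_set(ratings, metadata):
    """Turn thumbs up/down into a boolean target alongside session features."""
    rows = []
    for rec_id, rating in ratings.items():
        features = metadata[rec_id]
        rows.append((features, rating == "up"))  # boolean label for the model
    return rows

training = build_training_set(ratings, metadata)
```

Only rated recordings end up in the training set; unrated ones would be the pool the model later scores.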

Dig deeper into the feedback

Follow-up survey – After users marked their first recording as worth watching or not, we triggered a popup survey asking them to elaborate on what they saw that made it relevant to them. Upon completing the survey, we invited customers to participate in an optional customer interview via Calendly (software for scheduling).

Additionally, we set up an external survey aimed at power customers, asking when they last saw something really interesting in a recording that they took note of or shared with others. At the end of the survey, we added an option to leave an email address for those who wanted to be contacted for a customer interview.

Customer interviews

I conducted 15 one-hour customer interviews for this project. Almost all were power ICPs (ideal customer profiles) who had answered our survey about the last time they saw something interesting in a recording that they acted on.


During these interviews, we explored their motivations for using recordings, specific moments they thought were truly insightful, and how they acted on them.

Define

Analyse and synthesise the research findings to determine the core problem.

Analysis

All survey responses, manual tags, and comments that customers had left in recordings over the previous 2-3 months were exported. The product lead and I agreed on a set of tags that would group insights across all sources and split the tagging between us, using keywords to match responses to the right tag group, or creating new tags where necessary. Once all the raw data was tagged, I analyzed it to find patterns.
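The keyword-to-tag matching step can be sketched as below. The tag groups and keyword lists are hypothetical examples, not the actual taxonomy we agreed on.

```python
# Hypothetical keyword → tag-group mapping, mirroring the agreed set of tags.
TAG_KEYWORDS = {
    "bug": ["bug", "broken", "error", "crash"],
    "confusion": ["confused", "lost", "stuck", "rage"],
    "conversion": ["purchase", "checkout", "signup", "converted"],
}

def tag_response(text):
    """Assign every matching tag group to a raw survey response or comment."""
    lowered = text.lower()
    tags = {tag for tag, words in TAG_KEYWORDS.items()
            if any(w in lowered for w in words)}
    return tags or {"untagged"}  # flagged for manual review or a new tag

print(tag_response("User got stuck and the checkout button was broken"))
```

Responses that match no keyword fall out as "untagged", which is where the manual pass and new tag groups come in.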

Category breakdown

Based on customer feedback, what each tag means.

Tag ranking

Tags that have been mentioned the most.

Keyword analysis

Analyzing the terminology customers use.

Problem framing

The insights were categorised into 4 main categories.

Why & when do Product teams watch recordings?

Product Teams watch recordings to find issues – bugs, UI/UX issues – to better learn and prioritize what to fix. They watch recordings especially right after a release to validate whether it’s working or not. Recordings allow them to “go beyond the numbers” and “really understand what’s going on”.

What makes recordings (not) worth watching?

Customers prefer watching recordings of engaged users, particularly when they go through a product flow from start to finish and they can see specific, relevant “A-HA” moments:

  • Bugs and usability issues, which often lead to user confusion or frustration

  • Moments of successful use of the product, particularly conversions


Conversely, they do not like wasting time watching recordings that…

  • …are too short or have little activity

  • …have content playback fidelity issues

  • …are of themselves or someone on their team (and not actually of a user).

Extracting & actioning on moments they find insightful

When customers watch one of the “A-HA” moments listed above, they currently have no easy way to mark those moments for later. Favoriting and manual tags do not allow pinpointing exactly when the moment happened, and comments take time and are not easily searchable or categorizable. To make matters worse, performance and content fidelity are unreliable.


As a result, Product Teams go through a lot of manual work when using tags & comments, and they often manually screen grab relevant moments as clips. They add the recording URL to a spreadsheet, manually take note of the timestamp and / or use a screen capture tool to manually generate a clip.

Is our proxy success metric an indicator of finding a recording worth watching?

The short answer is yes, but there are nuances to it. We can say there is overlap, but not a strict correlation.


Customers are unlikely to share, favorite or comment recordings they rate as NOT worth watching, but they don’t necessarily share, favorite or comment recordings they rate as worth watching.


The two behaviors serve different purposes, so the commonality of the properties users find useful overall is a good indicator (in both cases, recordings tend to be more activity-packed).


*proxy success metric = sharing, favouriting & commenting

HMW:

"How might we streamline the process for Product Teams to identify and prioritize bugs and UI/UX issues in recordings, reducing the amount of manual work done outside of Hotjar?"


"How might we recommend relevant recordings that increase the likelihood that our customers will find an insight from the recordings they watch?"

Design

Challenging the status quo. Exploring ideas and solutions to the problem.

Reference HJ Product Principles

At Hotjar we have product principles that are intended to serve as a shared compass for every colleague whose decisions shape the products we bring to life. These principles should drive consistency, alignment, and focus when making decisions and setting priorities.


I highlighted these two as they were the main principles referenced during the ideation session. Since Hotjar is a product-led growth company, the product is fully self-serve, hence the need for an intuitive product that users can navigate without hand-holding from sales or account managers. Hotjar itself has always been known for its ease of use. Therefore, any solution we explore must be intuitive.


Thinking big and starting small is another core principle that follows our core values of being bold and moving fast: we look towards the big picture, but then prioritise to bring value to customers sooner through smaller iterations that build towards the future.

Generate ideas & get buy-in

Ideation workshop – We conducted a workshop with the product lead, group product lead, tribe design lead, another designer in the tribe, and engineers. It was mainly intended to give stakeholders early involvement in the project; getting stakeholder buy-in early results in more engagement in the long run. It also establishes a common problem to solve and generates many ideas about how to solve it.


In line with our product principles, we first "thought big" in a Reach for the Stars section, encouraging participants not to be constrained by present limitations. We then "thought small" and more feasibly: what is the smallest thing we can do that will bring value to our customers as quickly as possible?

Workshop Outcome

Two main opportunities emerged from the workshop.

Relevance Score

Build a new score, based on what customers find insightful (such as engaged sessions with issues), that ranks recordings by relevance. Users can then sort recordings by relevance, increasing the probability that they will watch a recording and find an insight.
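As an illustration of the idea (not the actual engine), such a score could start as a weighted sum over the session signals our research surfaced; the field names and weights here are hypothetical.

```python
# Hypothetical weighted-sum relevance score over session signals from research:
# engagement time, error/frustration clicks, and whether a flow was completed.
WEIGHTS = {"active_seconds": 0.01, "error_clicks": 2.0, "completed_flow": 3.0}

def relevance_score(session):
    """Higher score = more likely to contain an insight worth watching."""
    return sum(WEIGHTS[k] * session.get(k, 0) for k in WEIGHTS)

sessions = [
    {"id": "rec_1", "active_seconds": 30, "error_clicks": 0, "completed_flow": 0},
    {"id": "rec_2", "active_seconds": 240, "error_clicks": 3, "completed_flow": 1},
    {"id": "rec_3", "active_seconds": 120, "error_clicks": 0, "completed_flow": 1},
]
ranked = sorted(sessions, key=relevance_score, reverse=True)
print([s["id"] for s in ranked])
```

In practice the weights would be learned from the boolean labels rather than hand-tuned, but a transparent heuristic like this is a reasonable "start small" first iteration.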

Allow users to find, sort, and/or action Hotjar insights directly in the context of their workflow

Introduce a new tagging system that makes it easier for product teams to mark relevant moments in recordings and collaborate more easily.


Our long-term vision is to use these tags to feed our Relevance ML engine to learn when those moments occurred - a bug, confusion, a conversion, etc. - so we can provide much more granular recommendations.

Risk / Challenges

  • Before I joined, tagging had been iterated on several times, leaving a mashup of systems and a number of workarounds that users have created over time.

  • Even if we provide a better solution, removing the current tagging system could annoy our users if they already have a tagging system and taxonomy in place.

  • Tags are also used across other parts of the product, so whatever is done here needs to be consistent. If squads don't think it works for their needs, they can block it, and we would end up either not launching or shipping inconsistent experiences across the product.

Contingency plan

  • We need to find a way to make the transition to the new tagging system as easy as possible for our customers.

  • To do so, we need to engage with our customers and create a sunsetting plan where we can duplicate their system into the new one, or allow them a period of time where they can switch over.

  • Cross-squad alignment is imperative, ensuring that the solution scales across all products, not just recordings.

Competitors & Inspiration

Our inspiration came from competitors and market leaders. There was a trend of using emojis to react to messages and videos in a more interactive and personal way. Additionally, it allows users to react to specific moments in a video, rather than adding a comment to the entire video.

Where are users clicking the most?

For a better understanding of where users engage with the player, we examined engagement metrics in Mixpanel and heatmaps, since we would need to revisit the player's hierarchy to incorporate the new tagging system.


In the player, the footer was by far the most used area: users scrubbed it to find specific moments, such as a visit to a particular page. The play and pause buttons were also heavily used to stop recordings and take notes when users found something insightful.

Exploration – Sketching

Next, I sketched out some options for enabling users to mark specific relevant moments in recordings using emojis.

Design explorations

  • The first thing we looked at was the visual hierarchy of the player footer, i.e. the placement and grouping of the actions. We examined options for placing either the play/pause controls or the reactions at the center of the footer. Play and pause were frequently used to pause and take notes, so we decided to experiment with making the reaction labels central, to make it easier for users to mark specific moments in recordings. Play/pause actions are also left-aligned in other popular players such as YouTube, so it made sense to follow convention. Those players also rely on keyboard shortcuts, which we have too but need to educate our users about.

  • We also took inspiration from YouTube's success when they introduced video chapters. We adapted the concept to highlight the different pages visited throughout a session, since that is the most requested feature. On hover, we reveal the page the user was on at any point in the timeline so customers know where to jump to.

  • Using icons instead of emojis was also considered.

  • Several other considerations were made, including the timeline's color, revisiting markers, grouping or hiding actions, etc.

Markers on the timeline

In addition, we explored what the markers would look like on the timeline and how users could interact with them after reacting to a specific moment.

Design iterations

  • Following a few iterations based on internal feedback, we settled on a design to test with customers.

  • We decided to use emojis, but changed from the standard emojis to Twemojis that we edited a bit to fit our needs. Our main reason for doing this was to differentiate the reactions from the emojis used in the commenting modal.

Prototype & Test

Validation Testing

Usability testing – By creating a prototype on Figma and testing it with our customers, we were able to identify minor pain points and issues with the flow that we iterated on before releasing it to customers.

Beta testing & Dogfooding

Our usability testing was followed by refinement sessions with the engineers, and implementation started a few weeks later (design working ahead of engineering).


After implementation began, we built the functionality and added it behind a feature flag so that the team could test it internally for bugs.


As we were planning to remove and replace some functionality that some customers may have been using, we wanted to invite a small group of power users to participate in Beta testing, so they could get early access to the feature and provide direct feedback.

UMUX lite survey

As part of our follow-up with the beta testers, we created a UMUX-Lite survey. UMUX-Lite is a short standardized questionnaire designed to measure the general usability of a system (developed as a shorter alternative to the 10-item SUS questionnaire).


UMUX-Lite contains two statements: whether the new feature meets the user's requirements, and whether it is easy to use. We followed these with an open question allowing respondents to elaborate, and an optional question inviting customers to participate in customer interviews.


A usability score out of 100 is calculated from the responses to show how usable the new feature is. Using this as a benchmark, we could then run the survey again next quarter to see whether any progress has been made.
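The standard UMUX-Lite scoring for two 7-point items (1 = strongly disagree, 7 = strongly agree) can be sketched as follows; the responses shown are hypothetical examples, not our actual beta-tester data.

```python
def umux_lite(requirements, ease_of_use):
    """UMUX-Lite score (0-100) from two 7-point items: each item is shifted
    to a 0-6 range, summed, and rescaled to a percentage."""
    return ((requirements - 1) + (ease_of_use - 1)) / 12 * 100

# Hypothetical responses: (meets requirements, easy to use)
responses = [(6, 7), (5, 6), (7, 6)]
scores = [umux_lite(r, e) for r, e in responses]
average = sum(scores) / len(scores)
```

Averaging the per-respondent scores gives the single benchmark number reported below.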


Our usability score was 79.3, which is considered good; anything above 68 is considered above average.

Learnings

Rapid iterations – Move fast. Learn quicker.

Don't underestimate the importance of continuous iterative improvements. It is easy to get caught in the trap of trying to craft the perfect user interface. Our experience has shown that moving fast and keeping our core values in mind helped us deliver faster and learn quicker, keeping our customers at the heart of our decisions through various rounds of user feedback.

Data-informed decisions

Often, you assume you know your own product better than your customers. However, if you talk to your customers, you will be surprised by what workarounds they use, which you might not have considered. Getting our users involved helped us build the right solution for them because we gained a deeper understanding of their workflows. Even if it is just a matter of label names, why guess when you can take the time to learn your customer's lingo and make things easier for them?

Collaboration

Collaboration was key with multiple stakeholders such as working with the BI team on the new relevance score. We aligned based on user needs from research to ensure the relevance score added value and not noise to an already busy UI.


The product also used tags in other areas, so cross-team collaboration was crucial to align on the approach and ensure the experience stayed consistent. We did this by involving other squads early in the project, sharing the insights we found, facilitating workshops to gather their ideas, and sharing learnings.

Shout-out

Shout out to the Product Manager, who was an outstanding sparring partner in this project.

Luke Causon 2025

Austria · Remote