Sumo Logic

Explorer

Transforming dashboards to ease discovery
01 THE SITUATION

Brief

If you notice it's taking too long to stream your favorite movie or you're seeing a 404 error, you might be in the middle of an outage. On the other side, there are Site Reliability Engineers (SREs) who get alerted right away and need to restore the service for you. During these critical times, companies like Netflix rely on Sumo Logic for real-time analytics.

For SREs, finding the root cause of an outage requires navigating through many dashboards, often without much context. In 2019, we began to see customers migrating some of their data to other tools that provided more flexibility and a more concise workflow. Our team set out to better support SREs through these stressful times by shifting the dashboard experience from browsing to filtering content.

My role

UX
IxD

Collaborators

Soraya Nunkunkit・Lead Product Designer
Aona Yang・UX Researcher
Abhi Khanna・Product Manager
Jeremy Asuncion・UI Engineer

Sumo Logic Dashboards 2018

Goals

Meet Andre, the on-call engineer. His mission is to find the needle in the haystack within minutes while sifting through multiple dashboards. And sometimes his job is on the line if our product doesn't deliver on its promise.
Why can't we just automate all this if we have fancy algorithms that can predict our movie preferences? The complexity of the infrastructure and systems running the internet still requires human analysis. And behind the scenes, there's Melinda. She's the system administrator who creates all of the content in dashboards.
When defining success for this dashboard redesign, we focused on the following:
1. Mean time to resolution - Are we making an impact on Melinda's organization?
2. Team engagement - Are we making Andre's job easier or harder?
02 THE APPROACH

Defining direction

The first critical step for Andre is to narrow down the scope of his search. I helped the team understand the pros and cons of each alternative, which ranged from the quickest to execute to a full rethinking of the workflow. Option B was the winner since it helped Andre navigate a complex hierarchy quickly without losing the dashboard view.

Option B
Option A
Option C

Navigation explorations

03 THE OUTCOME

Introducing Explorer

We launched in beta by placing the new experience under Explorer and keeping classic dashboards available for existing customers. This gave customers the option to transition to the new experience at their own pace.

Isolating data

Navigation alone won't get Andre to the right data. Once he has found a relevant dashboard, he needs to slice and dice individual charts. We discovered from observing customers that the most common path is to open another tab (Logs/Metrics), but this doesn't scale well when they have to scan through 10+ visualizations. It also leaves room for errors when copying and pasting names into a query.
Melinda, the content creator, can easily define the filters relating to the dashboard so Andre, the SRE on call, doesn't have to leave the dashboard for further analysis.

Charts 2.0

Throughout this project I advocated for bringing more consistency across all of our charts so SREs can surface patterns more quickly and get immediate insights. This effort required building guidelines around layout, colors and interactions so all visualizations felt like they were part of one product, not just Lego pieces from external libraries. Making our charts responsive also laid the groundwork for our team to tackle mobile next.
I simplified the time range editor component to lighten up a stressful moment. Changing the time range is the most common action customers take, but we had multiple variations of the same component.

This is just the beginning

So... has our solution made an impact on mean time to resolution and team engagement for our customers? The feedback from customers who are adopting Explorer in their troubleshooting process shows we're making a difference. Still, we're working to make this even better as we learn from how people engage with the new dashboards.

This view right here is worth the money. People cannot f****** find the data they need when they need it.

Alex Morreale
Senior Site Reliability Engineer at ezCater

UX is very neat...It's more intuitive than Grafana, definitely the ability to see namespaces is very nice.

Lior Mechlovich
Platform Site Reliability Engineer at Informatica

I couldn't have made the transition to production on Kubernetes as quickly as I did without the visibility [Explorer] provided.

Jeremy Proffitt
Lead Site Reliability Engineer at Lending Tree