Moz: Consolidating Monitoring for a Complex SaaS Application at Scale

About Moz

Moz is a Software-as-a-Service company headquartered in Seattle, USA, serving over 28,000 customers. The company provides online applications that help inbound marketers with their marketing efforts and SEO.

Moz hosts its SaaS applications in a large and complex environment. It runs over 1,500 virtual machines on more than 400 physical servers across two private data centers in the U.S. The environment consists of custom applications written in over a dozen languages and large database clusters running MySQL, Redis, Riak, HBase and MongoDB, storing 65TB of data in total. Moz is also a large OpenStack user, with a 1.5PB Swift object store.

Moz’s Monitoring Challenges

As Moz’s infrastructure grew, with multiple teams responsible for different areas of the service, the number of open-source monitoring solutions run at Moz quickly exploded. The tools were not set up consistently, which resulted in different levels of coverage across the infrastructure and made it painful for Moz to set up monitoring for new services added to Production.

In addition, there was no clear visibility of how different areas of the service were performing, because the tools didn’t provide Dashboards that were easy to view and share, not just with Operations but also with Developers and Business Stakeholders.

Moz was also moving further into the world of DevOps, which required them to provide monitoring as a self-service solution to their different product Development teams.

How Dataloop.IO Helped

Moz is currently rolling out Dataloop.IO across all of its infrastructure to consolidate and replace all of its existing monitoring solutions.

Dataloop.IO has provided a flexible but consistent framework that all their teams can collaborate around to easily set up monitoring for new services in Production. It has also allowed them to quickly create highly visual, easily sharable dashboards for the services they care about, improving visibility and coverage across their entire infrastructure.

As you can imagine, monitoring 1,500 VMs in real-time also created a scaling challenge for Moz’s legacy monitoring solutions, and Dataloop.IO has removed this issue entirely. Moz sends as many metrics as they need to Dataloop.IO and no longer has to manage and scale large amounts of monitoring infrastructure, freeing up their team to focus on higher value tasks.

In the future, Moz plans to roll out Dataloop.IO as a self-service monitoring solution to all its development teams, so that they can write their own Nagios checks and collect the metrics they care about to display on Dashboards and alert on.
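
As an illustration of what a team-written Nagios check can look like, here is a minimal sketch in Python. It is not taken from Moz or Dataloop.IO; the metric (1-minute load average), the thresholds and the function names are assumptions chosen for the example. It follows only the general Nagios plugin convention: one line of status text, optional performance data after a "|", and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_load(load, warn, crit):
    """Return (exit_code, status_line) for a 1-minute load average.

    The thresholds are hypothetical and would be tuned per service.
    """
    if load >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif load >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after the '|' is performance data: label=value;warn;crit
    line = f"{label} - load1 is {load:.2f} | load1={load:.2f};{warn};{crit}"
    return code, line


def main():
    """Read the load average, print the status line, return the exit code."""
    try:
        load1 = os.getloadavg()[0]
    except (OSError, AttributeError):
        # getloadavg() is unavailable on some platforms.
        print("UNKNOWN - load average not available")
        return UNKNOWN
    code, line = evaluate_load(load1, warn=4.0, crit=8.0)
    print(line)
    return code


# In a deployed check script, finish with:
#     raise SystemExit(main())
```

The status text is what a person reads in an alert, while the performance data after the "|" is the part a monitoring system would typically graph on a dashboard.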

Key Results

  • Consolidated all their infrastructure and service monitoring into Dataloop.IO, increasing coverage and visibility across their service.
  • Achieved greater visibility across the organisation and other teams on how Moz’s services were performing, with easily sharable dashboards.
  • Saved considerable time across the team, who no longer have to manage and scale on-premise monitoring solutions, and can let other teams set up their own monitoring as a self-service solution without the team's involvement.

“One of the benefits that we recognise from Dataloop.IO, is that we’ve been able to pull old monitoring tools out of place, put Dataloop.IO in place and we’re starting to converge everything into Dataloop.IO so that we have a single place to go to for our Dashboards to look at everything.”

Mark Schliemann, VP Technical Operations, Moz

Go Beyond Cloud Monitoring

Scale your infrastructure monitoring for Cloud, SaaS, Microservices and IoT-ready deployments with Dataloop.IO. We’ve designed a solution for the agile organization. Start now and see your operational metrics in minutes.