Observability Engineer
About ProbablyMonsters Inc.

ProbablyMonsters™ is a builder of sustainable game studios that creates and launches original AAA games through a people-first culture. Our mission is to unite, guide, and empower talented teams to create exceptional interactive experiences. In an industry where new AAA studios and developer-focused cultures are equally rare, ProbablyMonsters stands out by fostering both with excellence. We empower our family of studios to concentrate on game creation while our platform team focuses on providing early-stage infrastructure and long-term support. This unique development model gives each of our teams the confidence, security, and stability to create and ship their games free of distractions. Our family of studios includes narrative-driven Cauldron Studios™, multiplayer-focused Firewalk Studios™, and a third, not-yet-named studio already working on a next-gen co-op RPG. Learn more about our people, culture, and commitment to exceptional creativity at probablymonsters.com.

We are looking for an Observability Engineer to join our Data Services team. Data Services is a central engineering team within ProbablyMonsters whose goal is to unlock the potential of data in game development. In this role, you will empower our studios to measure and understand how their systems behave. This position requires the ability to use data from various sources to improve engineering efficiency and help all teams within ProbablyMonsters to be data-driven.

Who You Are:
  • You take pride in being able to turn data into meaningful insights. 
  • You are always thinking in terms of logs, metrics, and traces.  
  • You question how systems work and come up with solutions to visualize and track their behavior over time. 
  • You take pride in being able to dramatically improve engineering efficiency by making key data easily accessible and meaningful to all.  
  • You can easily boil down a large set of metrics into those that matter most.  
  • You’re an advocate for instrumenting systems to eliminate all guesswork about what they are doing and how they are being used.
  • You know how and when to leverage existing services to build solutions faster and cheaper. 
  • You have a strong desire to automate and test everything.  
  • You enjoy engaging with teams to identify their needs and provide them with insight. 
  • You are self-sufficient, self-motivating, and unafraid to ask for help. 
What You Will Do: 
  • Define observability for all of the ProbablyMonsters studios and engineering teams. 
  • Evaluate existing tools, and implement what's missing, to create a self-service observability platform that anyone can use to ingest and visualize their data. 
  • Empower engineering teams to collect logs, metrics, and traces from any source. 
  • Evangelize best practices with respect to code instrumentation and system performance analysis.  
  • Create automation to make your tools and systems deployable, highly available, and self-healing.
  • Collaborate closely with other engineers across the entire ProbablyMonsters family of studios. 
  • Promote a culture of quality, reliability, and customer focus.
Minimum Qualifications: 
  • Five years of experience working with and developing observability solutions.
  • Two years of experience working with public cloud solutions.  
  • Experience in a live-ops role through at least one major game launch or equivalent services release. 
  • Demonstrated experience debugging and resolving problems in complex distributed systems. 
  • Experience working with managed observability platforms such as Datadog, Grafana Cloud, New Relic, Honeycomb.io, etc.
  • Proficient at developing and using deployment pipelines. 
  • Extensive experience in developing dashboards using visualization tools such as Grafana, Tableau, Looker, etc.   
  • Ability to write code in Python, Java, Go, or equivalent language(s).