Sr. Engineer - Monitoring & Observability
We are a rapidly growing company that's revolutionizing the way the restaurant industry does business by pairing technology with an unrivaled commitment to customer success. We help restaurants streamline operations, increase revenue, and deliver amazing guest experiences through our platform that combines restaurant point of sale, guest-facing technology, and award-winning customer support. As a Toaster, you will be challenged to take on meaningful projects that will help shape the future of the company. Join us as we empower the restaurant community to delight guests, do what they love, and thrive.
The Performance team is looking for a self-motivated individual who loves monitoring distributed systems. Toast engineering teams are pushing the boundaries of Android performance and building a highly reliable and scalable AWS-hosted platform that supports our fast-growing customer base. The team's mission is to transform the architecture through observation, measurement, and validation. One day you may be putting yourself in the customer's shoes to understand the performance of their restaurant, and the next digging through the inner workings of our infrastructure to find bottlenecks at the lowest levels of our stack. We build performance testing and observability frameworks that empower our engineering teams to quickly get performance and scalability feedback about their proposed code and infrastructure changes. Join the Performance Engineering team to champion performance, deliver fast applications, and drive our platform to architectural excellence.
Recent projects include:
- Building out an observability framework that monitors the health and performance of our fleet of tens of thousands of devices in production.
- Building a simulation of a high-volume customer with Espresso, JMeter, and the ELK stack, which we use to run various experiments.
- Deploying a synthetic monitoring solution in production that tracks, trends, and alerts on the performance of our critical transactions.
As a senior engineer you will:
- Take ownership of the existing monitoring tools and observability practices
- Design and build a next-gen platform to provide real-time operational insight to Toast engineering teams
- Utilize and build on top of best-in-class SaaS tools (DataDog, New Relic, Splunk) when it makes sense, and build it yourself when it doesn't
- Help define the direction of monitoring systems across our Android devices and cloud-hosted infrastructure
- Partner with product and engineering teams to promote best practices and provide advice on how to implement features that are instrumented and observable
- Configure and instrument devices, applications, and servers, such as Android devices, Java-based services, and AWS resources, to report metrics into monitoring tools
- Generate, manage, and report on the application performance data captured by the monitoring tools, and proactively work with engineering teams to resolve performance issues
- Create operational dashboards for both real-time and historical trended views
- Mentor teams on how to use our monitoring tools and evangelize their adoption
Do you have the right ingredients?
- Experience with designing and implementing monitoring infrastructure at a high-scale SaaS company
- Solid understanding of systems monitoring, alerting, and analytics (Splunk, Sumo Logic, New Relic, Dynatrace, DataDog, Librato, Graphite, ELK stack)
- Proficient in production monitoring concepts including synthetic, real user, application performance, system, log, distributed tracing, and dashboards
- Some programming experience (we use Python and Java, but the language doesn't matter)