At Wasabi, we’re a proven collection of pioneers, visionaries and disruptive doers. We see things differently than our competitors, and we make our mark in the industry by challenging the norm and delivering the unexpected and improbable. We’re a fast-growing company taking the Cloud Storage industry by storm and recognized as one of the best places to work in Boston.
Wasabi hot cloud storage is a new class and category of cloud storage, breaking all traditional barriers and boundaries of storage with a disruptive value proposition: 1/5th the cost of AWS S3, faster than the competition, no fees for egress or API requests, and delivered as a single-tier solution. Cloud storage has never been so simple, so fast and so inexpensive. It’s all part of our vision to make cloud storage the next great global utility, just like electricity.
Role Description: Senior Cloud Operations Engineer
Wasabi, the hot cloud storage company, is looking to hire a Senior Network Operations Engineer (DevOps) to be a member of the team supporting the 24x7x365 service operations. To be successful in this role, you will need a strong networking background combined with excellent communication and interpersonal skills. This job requires you to design networks in a data center environment to support a SaaS application. You will resolve production issues and restore service quickly to ensure that uptime for the service is within acceptable business targets.
You have excellent communication skills and can work with a cross-functional team to coordinate deployments, monitor systems and networks, and analyze logs and alerts to identify, diagnose and resolve production issues. You need to have experience deploying released software and monitoring and operating services. You’ll write and review Methods of Procedure (MOPs) and have a passion for automation. If you thrive on working in a fast-paced startup environment to service a global customer base, we would like to hear from you.
The role reports to the VP of Operations.
Responsibilities:
- Monitor, troubleshoot and rectify issues in networks and services in a 24x7 production environment
- Own the release, deploy, operate and monitor phases of the DevOps cycle
- Work with Engineering teams to ensure that the right CI/CD model is in place to support the deployment network architecture for customer-facing services and internal operations efficiency
- Configure, verify, and create network designs to deploy services in a datacenter environment
- Create tools that help with rapid diagnosis, fault localization, and service restoration to keep service availability high
- Evaluate tools to automate or enhance the infrastructure to support scale
- Find and resolve issues on site as well as remotely in data center production environments
- Handle customer escalation calls
- Analyze and resolve deployment problems, and find resolutions for field issues by working collaboratively with development and test teams
- Create Jira tickets to ensure that software and configuration issues are resolved
- Write up root cause analyses for issues found in the production environment and present them to a broad audience
- May require on-call, off-hours duty
- Some travel may be required
Requirements:
- 7+ years’ experience in large scale data center production environments for SaaS solutions
- Expertise in one or more of system, network, or security operations
- BS in Electrical Engineering or equivalent degree
- Strong understanding of DC architecture, Linux, network configuration, TCP/IP, DNS, DHCP, data path and other IP control plane protocols
- Strong understanding of security, incident management, and supporting security audits
- Debugging ability using networking tools, logs, and alerts to identify, localize, analyze and resolve failures
- Deep understanding and experience with fault localization and service restoration in a 24x7x365 production environment
- Excellent communication and presentation skills and the ability to describe technical issues to a diverse audience
- Ability and experience working in a fast-paced startup environment
- Experience in developing MOPs, SOPs and working with other teams to qualify the MOPs to reduce errors/issues in production deployments
- Knowledge of and experience with Linux and Python are required
- Attention to detail, creative problem solving, and a focus on outcomes
- Positive, solution-focused attitude toward helping customers, both internal and external
- Ability to remain calm under pressure and excellent problem-solving skills are required