About the Role
We are growing our Core Engineering (DevOps) team to meet the challenges of a rapidly growing and changing SaaS company. We are looking to expand our capability to maintain, optimize and evolve our AWS-based cloud operations for our SaaS products.
Our mission is to maintain rock-solid stability and reliability for our customers while making development and management of the environment as easy as possible for our colleagues at Validity. We aspire to follow the SRE/DevOps methodologies and best practices of distinguished tech industry leaders while making the appropriate choices for our organization's scale.
We are looking for an experienced engineer who is confident with cloud infrastructure and code pipelines. In addition to becoming deeply familiar with core tools and technology, they will need to build broader familiarity with a wider set of technologies in the environment. We expect them to own projects and help guide the evolution of our platforms.
We support a complex environment that has grown rapidly through mergers and acquisitions. We host applications written in-house, purchased commercially, and sourced from the open-source community. No two days are the same here. This gives our team the opportunity to develop great hands-on experience with many exciting technologies such as Linux, AWS, Docker, and Kubernetes.
Position Duties and Responsibilities
- Maintain and optimize our AWS cloud footprint, Kubernetes clusters, SaaS accounts, CI/CD systems, and other platforms. Focus on resiliency, scalability, maintainability, security, and cost.
- Empower our colleagues on software development teams to use the above services effectively by providing support and documentation.
- Participate in a 24/7 on-call rotation. Provide critical support for the organization's unplanned work. Triage incidents effectively.
- Research/develop/deploy creative (but practical) technical solutions to our problems. Use the tools available to us to save the company time and money.
- Take ownership of longer-term projects. Share your experience and perspective with the Core Engineering team to guide major architectural decisions.
- Provide subject matter expertise in the DevOps niche we fill in the organization. Collaborate with business application development, DBA, QA, SecOps, corporate IT, and customer-facing services teams as needed. Steer the organization towards best practices.
Required Experience, Skills, and Education
- A passionate engineer with 4+ years of experience in open-source system administration/operations and software development lifecycle in a public cloud-hosted environment.
- Post-secondary degree in a technical field such as Computer Science, Management of Information Systems, etc., OR demonstrated equivalent practical experience in the industry.
- First-hand experience designing, building, and operating platforms for containerized microservices applications. Understands key challenges in this space. (Most of our production software runs in Docker on Kubernetes.)
- Expertise with infrastructure as code and configuration management systems (Terraform/CloudFormation/Ansible, etc.).
- Strong understanding of CI/CD pipelines, modern SDLC (GitHub, Jenkins, CircleCI, etc.)
- Expert Linux system administration skills and an understanding of TCP/IP networking and security
- Proficient in scripting languages (Bash/Python) and data formats (YAML, JSON, XML, etc.) in the context of templating and automation.
- Experience implementing and using monitoring/tracing/alerting/observability tools. (Prometheus, Grafana, Datadog, New Relic, PagerDuty, etc.)
- Excellent troubleshooting skills with a detective mindset. Does not give up on a difficult technical problem and exhausts all effort and resources to resolve it. Comfortable operating in a very complex technical environment.
- Excellent communication and collaboration skills. Validity and the Core Engineering team are distributed across the world. Remote work skills will be critical for this position for the foreseeable future.
Preferred Experience, Skills, and Education
- Relevant certifications in DevOps/SRE, AWS, Kubernetes, Docker, Security, Networking, etc.
- Advanced programming skill in languages like Go, Python, Ruby, Java
- Experience working in an Agile/Scrum/Kanban project management environment
- Database administration/analyst skills, transactional and/or NoSQL (Postgres, MySQL, Kafka, etcd, ZooKeeper, ClickHouse, Snowflake)
- Information security/SecOps skills or experience. (PKI, SSL/TLS, secrets management experience)
- Experience with auditing/compliance/regulated environments (SOC2, PCI, HIPAA, etc.)
- Understanding of email deliverability concepts (SPF, DKIM, DMARC, public IP management)
Benefits
- Paid Holidays
- Unlimited PTO
- Parental Leave
Pay Range: $120,000 - $150,000 base, plus up to 10% bonus opportunity, and stock options.
Final salary may vary depending on skills, location, and/or experience.
This position can be in-office or remote. We are hiring in the following states only:
AL, AR, AZ, CA, CO, CT, FL, GA, HI, ID, IL, IN, KS, KY, MA, MD, ME, MI, MO, NC, NE, NH, NJ, NV, NY, OH, OK, PA, RI, SC, TN, TX, UT, VA, VT, WA