Intralinks Careers Website
Manager, Site Reliability Engineering (Intralinks, Inc., an SS&C company; Waltham, Massachusetts):
The Manager, Site Reliability Engineering manages a highly agile and talented team responsible for ensuring the reliability and efficient operation of the Intralinks platform and services. The Intralinks platform and services include complex software components deployed across multiple data centers that provide a highly available and secure SaaS environment for the Intralinks portfolio of applications.
Specifically, the Manager, Site Reliability Engineering is responsible for the following:
- Performing in-depth analysis of incident root causes; proposing and implementing efficient monitoring and incident prevention; managing and finding ways to improve the team's incident-handling velocity;
- Managing incident priorities and expectations with the support teams;
- Working with R&D and architecture teams on defects and runtime inefficiencies identified in the production environment;
- Constantly working with the team to improve the mean time to detect the root cause of incidents and the mean time to resolve them;
- Guiding and working with the team to improve and automate the issue diagnostics process; reviewing system requirements from business systems analysts and product management indicating the business needs and functionalities of the proposed application;
- Engaging product management and other stakeholders when an inefficiency or an issue in the production environment is identified;
- Reviewing architecture and design documents that describe the implementation approach and providing feedback to ensure the deliverables are production ready;
- Partnering with the rest of the team to develop a detailed design towards runtime efficiency, efficient monitoring, and incident management;
- Periodically evaluating the production incidents and making recommendations to R&D and other stakeholders; interacting with performance and capacity planning teams to ensure that the product has met all the necessary performance and scalability requirements prior to production deployment;
- Mentoring site reliability engineering team members on the product, process, and its operational readiness towards team success;
- Embracing iterative development and agile process principles in site reliability engineering tasks;
- Managing the schedule and operations of a distributed team that provides round-the-clock support to production. The position manages four site reliability engineers and requires travel to Intralinks offices in India less than 1% of the time.
Minimum requirements: Bachelor's degree or equivalent in Computer Engineering, Computer Science, Information Systems, or a related field, plus 5 years of software development experience.
- 5 years of coding and debugging experience using programming languages and platforms such as C, C#, SQL, .NET, and Java;
- 5 years of experience integrating, managing, and developing CRM (SharePoint) solutions;
- 5 years of experience managing and being accountable for project deliverables;
- 4 years of experience packaging and deploying applications in production environments;
- 4 years of experience troubleshooting system level issues on Windows and Linux systems;
- 3 years of experience troubleshooting network connectivity issues between processes and applications.
All experience may be gained concurrently.