Data Science and Engineering is responsible for full data processing flows: processing raw data that streams in live from myriad public APIs and arrives through batch ingestion, applying statistical methodologies, and making our data available in a self-service fashion to drive and enable decisions across the entire platform.
We are seeking a Principal Engineer, Data Sciences Engineering to help build the next-generation data platform for Numerated. As a key member of the data platform engineering team, you will help build a platform in AWS that supports both streaming and batch workloads and can bring vital information to our banking clients. Our platform will evolve to adapt to constantly changing business requirements while enforcing strict data quality standards and preserving data integrity.
Essential Responsibilities /
- Contribute as a leader in the development of a modern data platform in a complex and fast-moving business and technical environment.
- Collaborate with the data platform product owner and engineering leadership to drive advanced analytics capability into the platform, both into Numerated's core product and into the visualization / consumption layer.
- Progressively incorporate data science, machine learning, and other analytical techniques into the data platform, to offer insights and capability both internally and externally.
- As a thought leader and evangelist, stay on top of technical advancements and new approaches in the ever-changing data technology landscape.
- Champion and help lead the migration of our data platform to the cloud.
- Develop workflow and model management framework in support of engineering and data science applications.
- Serve as a Scrum team member delivering shippable, quality code in fast-paced sprint cycles, exemplifying the principles behind Scrum and the Agile Manifesto.
- Understand the importance of CI/CD, participate in code reviews and contribute to automated tests.
- As a senior member of the team, contribute to its continuous improvement, providing development leadership and mentoring across the data science and engineering team, introducing new technologies, engineering best practices and techniques.
- Consistently challenge best practices for design, coding standards, performance, security, delivery, and maintainability.
- Contribute as a full-stack engineer in developing data platform capabilities, including user interfaces, database engineering, APIs, and metadata layer.
- Oversee the quality of projects by ensuring that key technical procedures, standards, quality control mechanisms, and tools are properly utilized, including performing root cause analyses for technical problems and engaging in work product quality review.
- Set best practices and ensure they are followed for all aspects of the data platform, including security and data standards compliance.
Education Requirements /
- Bachelor's degree in Data Science, Computer Science or an equivalent technical field required
- Master's degree in Data Science, Computer Science or an equivalent technical field preferred
Work Experience Requirements /
- 5+ years of experience working in data science engineering, involving building a data platform that serves a range of data science, data warehousing, visualization, and ad hoc analytic needs
- 7+ years of experience in professional software development
- 5+ years of experience in progressive data architecture, incorporating cloud technologies, data lakes, elastic data warehouses, object stores and serverless architectures
- 5+ years of experience with building efficient data processing pipelines for analytical systems, progressively incorporating streaming architectures, serverless event-driven ingestion pipelines, and good knowledge of ELT best practices
- Demonstrated front-end development in frameworks such as React, Vue, or Angular.
- Deep understanding of relational databases and NoSQL data storage technologies, including elastic cloud data warehousing solutions such as Snowflake (ideal), Redshift, EMR and BigQuery. Prior experience with Elasticsearch a plus.
- Strong knowledge of distributed frameworks such as Spark, HBase, Presto and Flink
- Significant experience with AWS desired, particularly with data-related services such as Glue, EMR, Lambda, and Kinesis.
- Comfortable working with data from a variety of sources (databases, JSON, text-based, semi/unstructured data).
- Solid understanding of and work experience with the Python data science ecosystem (e.g. NumPy, Pandas, Jupyter, Scikit-learn), R, and Spark in cloud environments.
- Experience with data science deployment environments such as TensorFlow and AWS SageMaker.
- Significant past experience with SQL, dimensional data modeling, relational and columnar databases.
- General understanding of or experience with microservices and containerization.
- Proficiency with Python; facility with other languages (Java, Node.js, Scala, OS scripting) desirable.
- A professional attitude with strong interpersonal and communication skills, including frequent video communication with remote teams and team members.
- Demonstrated ability to work effectively and with self-direction in a fast-paced, team-oriented work environment. Ability to maintain a positive attitude in high-pressure situations and manage distributed teams with competing priorities and tight deadlines.
- A passion for keeping up with the rapidly changing data technical landscape.
- Consistent track record of coaching/mentoring individual contributors into more senior roles.
- Outstanding written and verbal communication skills, with strong influencing and presentation skills.