Job Post

Data Architect

30 Corporate Drive
Burlington, MA 01803

At HealthEdge, we are passionate about providing healthcare payors with technology that enables them to innovate, reduce costs, and address the business imperatives of the evolving healthcare economy. As we grow, we are always looking for ways to improve the performance, scalability, and reliability of our products.


Responsibilities

  • Analyze and design the enterprise data model, including domain data models, MDM, data lakes, and modeling for various operational requirements.
  • Architect data flows into and out of enterprise boundaries and model intermediate data flows within the enterprise applications.
  • Prepare data for ML and AI application use.
  • Design for event-based applications, telemetry, monitoring, and other operational use cases.
  • Address performance and scalability needs for data storage, and optimize the analysis/persistence layer for all applications.
  • Modernize our current architecture and formulate a state-of-the-art persistence layer for a company-wide data lake that serves analytics and operational needs.
  • Collaborate across multiple technical functions and build consensus on enterprise data architecture.


Requirements

  • 3+ years of experience with Oracle RDBMS, with a focus on data modeling (skilled in data normalization), data warehousing (experience building data marts and data lakes), data science (experience building inference engines on big data platforms for predictive and machine learning systems), and data analysis (experience with compliance reporting systems, audit, and data governance).
  • 5+ years of experience leading technical teams with a focus on query optimization and on identifying and fixing performance issues with Oracle RDBMS, including DBA tasks such as data partitioning and storage layer optimization; solid knowledge of and experience with NAS, RAID, and SAN storage optimization.
  • 5+ years of experience with Hadoop/DFS, including: building high-performance NoSQL datastores for high-traffic, high-velocity data; storage formats such as Parquet or ORC; schema stamping and schema evolution; data governance and access control for NoSQL stores (Ranger / Atlas); query engines such as Hive, Presto, or Impala; and big data ETL frameworks such as Sqoop / Gobblin / Marmaray.
  • 3+ years of solid experience with Hive, exposure to writing MapReduce jobs, and query optimization with Pig Native, Presto (preferred), or Impala.
  • 3+ years of experience with event-driven frameworks like Kafka, and with data access, ingress/egress, security, and encryption on cloud platforms such as AWS, Azure, GCP, and hybrid cloud built on VMware.
  • 10+ years of overall experience with data/persistence-related issues, with an exclusive focus on architecture and performance of high-volume, high-traffic systems with ACID and time-series characteristics.

Nice to have

  • 3+ years of healthcare domain experience with exposure to HL7 and FHIR data models.
  • 3+ years of experience in data security/governance for audit and compliance (CMS, HEDIS, HIPAA, PHI, PIA, etc.).

Category: Data Science / Machine Learning

HealthEdge is an innovative software company that provides the only integrated financial, administrative and clinical software platform for healthcare payors.
