Data Platform Engineer: LIMS & Cross‑Team Data Lead
About the Role
We generate data from multiple sources: our internal wet lab, partner hospitals, and sequencing providers. As we scale, our challenge is no longer data volume; it’s data clarity. Our datasets are rich but fragmented, and teams work at different levels of granularity (patient, sample, organoid, etc.). We are looking for a Senior Data Engineer to own the data layer that ties them together.
You will design and maintain the data model that sits between our raw scientific data and the teams that need to act on it: clinical operations, wet lab scientists, and computational researchers. This is a highly operational, high-ownership role. You will be the central point of contact for data infrastructure, and your work will have immediate, visible impact on how we run trials, validate therapies, and ship new insights.
What You’ll Do
Own the cross-functional data model: Define the table structure, unique identifiers, and join logic that make our data coherent across teams. Establish and enforce data conventions that work for both lab and operations teams.
Lead our LIMS integration: Drive the technical integration of our laboratory information management system (LIMS) into the production environment, from schema design to data flow and migration from current systems.
Build the final layer of data curation: Create user-facing views that merge clinical, experimental, and molecular data into clean, analysis-ready datasets accessible to non-technical users.
Bridge teams and priorities: Act as the data liaison between clinical operations, wet lab scientists, omics specialists, and computational researchers. Translate diverse requirements into coherent data solutions and ensure roadmap alignment across teams.
Who You Are
Solution-oriented: When faced with a non-technical problem, you can gather cross-functional needs and find the right engineering solution.
Deeply curious: You proactively seek to understand people’s bottlenecks and connect the dots between teams.
A strong communicator: You translate technical concepts for non-technical audiences and are comfortable gathering requirements from scientists, clinicians, and engineers alike.
Autonomous: You ship without waiting for perfect specifications. You own your scope end-to-end.
Cross-functional by nature: You thrive in environments that require coordination across multiple teams with different priorities and vocabularies.
Minimum Qualifications
MSc or Engineering degree in Computer Science, Data Science, or Computational Biology. An engineering background is a must.
4+ years of data engineering experience in an environment where data arrives messy and unstructured.
Strong SQL skills and hands‑on experience designing relational data models.
Proficiency in Python for data processing and pipeline development.
Experience with workflow orchestration tools (Airflow, Dagster, or similar).
Comfortable with Git/GitHub in a collaborative engineering environment.
Preferred Qualifications
Experience in environments where scientists are primary data consumers.
Familiarity with biological, bioinformatics, or laboratory data systems (such as a LIMS).
Understanding of experimental workflows and research or clinical operations.