Location: Remote
Department: Technology
Reports To: Director of Engineering
Position Overview
We’re seeking a Staff AI Engineer to define and own the reference architecture for our agent platform. You will build reliable, safe, and scalable systems that power clinical workflows. You’ll design and implement the core agent runtime, evaluation frameworks, and observability layers that enable speed, trust, and continuous improvement. This role combines deep AI/ML expertise with systems engineering to create production-grade agents that are explainable and cost-efficient.
Key Responsibilities
Model Development & Lifecycle Management
• Design, train, fine-tune, and maintain machine learning and AI models across supervised, unsupervised, semi-supervised, and reinforcement learning paradigms.
• Own the AI agent runtime and full model lifecycle, including:
• Orchestration primitives, safety policy enforcement, and prompt governance
• Data preparation and feature engineering
• Model training and hyperparameter optimization
• Validation, testing, observability, and benchmarking
• Deployment, monitoring, retraining, and drift detection
• Establish reproducible training pipelines and experiment tracking (e.g., model versioning, metadata capture, lineage).
Agent Orchestration & Reliability
• Design robust orchestration for multi-agent workflows that ensures consistency and resilience.
• Implement state machine or DAG-based execution with retries, idempotency, and parent/child task decomposition.
• Define tool-call contracts, enforce timeouts, and establish predictable failure modes.
Prompt & Tool Shipping Discipline
• Treat prompts and tool schemas as versioned code artifacts.
• Build processes for staged rollouts, canary deployments, rollback strategies, and tenant-specific overrides.
Evaluation Frameworks
• Develop golden datasets, regression suites, and CI gates to validate changes before release.
• Combine offline evaluations with online monitoring tied to real workflows and clinical acceptance criteria.
Observability & Debugging
• Implement full trace and replay capabilities for rapid incident resolution.
• Capture every decision point: inputs, prompts, model responses, tool calls, outputs, and confidence signals.
Runtime Safety
• Enforce deterministic policy checks and prohibited actions at runtime.
• Integrate required confirmations and audit trails; treat safety as code, not documentation.
Cost Instrumentation
• Build unit economics tracking for token and tool costs per run, workflow, and tenant.
• Create feedback loops to maintain predictable costs as usage scales.
Required Qualifications
• Experience: You’ve shipped and owned production systems.
• Technical Skills: Strong proficiency in Python, C#, or both, with the ability to integrate the two seamlessly.
• Evaluation Expertise: Proven track record of building evaluation pipelines that cover both offline and online validation.
• Systems Thinking: Deep understanding of orchestration patterns, distributed systems, and runtime safety.
• Observability & Debugging: Hands-on experience implementing logging, replay, and monitoring for complex workflows.
• Cost Awareness: Familiarity with instrumentation for performance and cost optimization at scale.
• Eligible to work in the U.S. without relying solely on a student visa or a visa sponsored by a third-party employer.
Preferred Qualifications
• Experience with knowledge graphs, ontologies, or symbolic AI.
• Practical experience with vector databases, embedding pipelines, and RAG architecture.
• Background in regulated or high-stakes domains (e.g., healthcare, finance, security).
• Published research or patents in AI/ML or related fields.
• Familiarity with MLOps platforms, experiment tracking, and monitoring tools.
Success Metrics
• Deployed models demonstrate measurable business or operational impact.
• Robust validation and monitoring frameworks adopted across AI initiatives.
• Clear, defensible AI technique selection aligned with problem requirements.
• Scalable data and knowledge representations that enable future AI capabilities.
• Strong cross-functional trust in the rigor, safety, and effectiveness of AI systems.
Physical Demands:
The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. While performing the duties of this job, the employee is regularly required to speak, hear, read, and type. This is largely a sedentary role; however, some shipping may be required. This position requires the ability to occasionally lift office products and supplies weighing up to 40 pounds.
Work Environment:
To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed above are representative of the knowledge, skill and/or ability required.
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time, with or without notice.