About the Role:
Set up and manage the ML and LLM deployment infrastructure for a globally distributed healthcare AI platform.
Responsibilities:
- Build pipelines for training, validation, deployment, and monitoring of ML models and LLMs.
- Automate fine-tuning and retraining with secure patient data (in compliance with privacy standards).
- Implement A/B testing, feedback loops, and live performance tracking.
- Monitor model deployments and auto-scale them based on usage and region.
Requirements:
- 4+ years of experience in DevOps, MLOps, or infrastructure engineering.
- Experience with MLflow, Weights & Biases, Ray, Kubernetes, and scalable GPU deployment.
- Familiarity with secure healthcare deployments (encryption, data locality, compliance).