- Company Name
- McKesson
- Job Title
- Sr ML Ops Specialist
- Job Description
**Job Title:** Sr ML Ops Specialist
**Role Summary:**
Architect, deploy, and sustain end‑to‑end machine learning pipelines that transition models from research to production. Drive continuous integration/continuous deployment (CI/CD), monitoring, and infrastructure scalability across cloud platforms.
**Expectations:**
Deliver fully functional, monitored, and optimized ML systems that meet business objectives while ensuring reliability, security, and compliance.
**Key Responsibilities**
- Design, implement, and maintain MLOps pipelines for model training, deployment, and monitoring.
- Collaborate with data science teams to translate model requirements into production‑ready artifacts.
- Build automated CI/CD workflows (Jenkins, GitLab CI, Azure DevOps) for rapid model iteration.
- Containerize ML workloads (Docker) and orchestrate deployments with Kubernetes.
- Set up monitoring, logging, and alerting (Prometheus, Grafana, ELK) to detect performance degradation, data drift, and model decay.
- Provision and manage scalable cloud infrastructure (Azure, AWS, GCP) for training and inference workloads.
- Ensure security, reliability, and regulatory compliance of ML systems.
- Evaluate and integrate emerging MLOps tools and best practices.
- Mentor junior engineers and champion MLOps standards across teams.
**Required Skills**
- 5+ years of experience in software engineering, DevOps, or MLOps with a strong ML focus.
- Proficient in Python; experience with Java/Scala is a plus.
- Hands‑on experience with MLOps platforms (MLflow, Kubeflow, SageMaker, Azure ML, Google AI Platform).
- Expertise in containerization (Docker) and orchestration (Kubernetes).
- Solid CI/CD knowledge (Jenkins, GitLab CI, Azure DevOps).
- Experience with data pipelines (Spark, Kafka, Airflow).
- Familiarity with monitoring/logging stacks (Prometheus, Grafana, ELK).
- Cloud platform proficiency (Azure, AWS, or GCP).
- Strong problem‑solving, communication, and teamwork abilities.
**Required Education & Certifications**
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a quantitative field.
- Equivalent experience (7+ years) may substitute for a formal degree.
- Certifications such as AWS Certified Machine Learning – Specialty, Google Professional Machine Learning Engineer, or Azure AI Engineer Associate are highly desirable.