**Company:** Elsevier
**Job title:** Machine Learning Ops Engineer
**Role Summary:**
Design, build, and operate end‑to‑end machine‑learning and retrieval pipelines that bring experimental NLP/IR/GenAI models to production for health‑care content platforms. Work across cloud services, MLOps tools, and search/graph technologies to deliver scalable, secure, and high‑performance AI features such as RAG, semantic search, and knowledge‑graph retrieval.
**Expectations:**
- Ship production‑ready GenAI and search solutions within a multidisciplinary environment.
- Maintain reproducibility, governance, and ethical standards for medical data and content.
- Continuously monitor, test, and optimize ML infrastructure and workloads for cost efficiency.
**Key Responsibilities:**
- Automate ML workflows using AWS, Azure, Databricks, and foundation‑model APIs (OpenAI, Bedrock).
- Maintain model registries, artifact stores, and version control.
- Develop CI/CD pipelines for data validation, model testing, and automated deployment.
- Scale custom SageMaker pipelines and implement MLOps solutions with SageMaker, MLflow, or Azure ML.
- Engineer GAR+RAG components: query interpretation, chunking, embeddings, hybrid retrieval, prompt libraries, guardrails, and structured outputs.
- Build ML pipelines leveraging Elasticsearch/OpenSearch/Solr, vector databases, and graph databases.
- Create evaluation systems: IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), A/B testing.
- Optimize infrastructure costs through monitoring, scaling, and resource efficiency.
- Stay current with GenAI, NLP, and RAG research and integrate state‑of‑the‑art techniques.
- Collaborate with product managers, data scientists, responsible AI experts, and ops engineers to translate business problems into ML solutions.
**Required Skills:**
- 3+ years in ML engineering or MLOps, shipping production AI/IR systems.
- Proficiency in Python; Java/Scala experience preferred.
- Hands‑on with AWS, Azure, or Google Cloud (SageMaker, Databricks, MLflow, Azure ML).
- Experience with search/vector/graph technologies: Elasticsearch, OpenSearch, Solr, Neo4j.
- Knowledge of LLM evaluation, prompt engineering, and guardrail implementation.
- Familiarity with PyTorch, TensorFlow, and large‑scale data processing with Spark/PySpark.
- Strong understanding of the data‑science lifecycle, feature engineering, training, and evaluation metrics.
- Background in healthcare or medical content workflows is a plus.
**Required Education & Certifications:**
- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.
- Relevant certifications (e.g., AWS Certified Machine Learning – Specialty, Azure AI Engineer Associate, or similar) preferred but not mandatory.
Philadelphia, United States
Hybrid
11-02-2026