- Company Name
- AURATEM
- Job Title
- Consultant Technical Lead MLOps
- Job Description
**Job title:** Consultant Technical Lead MLOps
**Role Summary**
Lead the industrialization and production scaling of AI and RAG (Retrieval‑Augmented Generation) solutions. Transform experimental data‑science prototypes into robust, deployable systems with end‑to‑end pipelines, continuous monitoring, and automated retraining. Drive architectural design, cloud deployment, CI/CD, and best coding practices while mentoring data‑science teams.
**Expectations**
- 6–12 month contract (renewable) as a freelance or permanent contractor.
- Full ownership of the migration from prototype to production for large‑scale RAG/LLM workloads.
- Deliver solutions that are scalable, monitored, and maintainable across cloud environments.
**Key Responsibilities**
1. **Industrialization of RAG/LLM Systems**
- Automate processing of large documents (e.g., 250+ page PDFs).
- Build chunking, indexing, and ingestion pipelines into vector databases.
- Scale knowledge‑base storage and retrieval.
- Implement monitoring metrics to detect model drift and provide alerting.
2. **Architecture & Scalability**
- Design complex, multi‑step workflows for long‑running tasks.
- Deploy services on Azure cloud (or AWS/GCP).
- Containerize with Docker, orchestrate with Kubernetes, and manage infrastructure using Terraform.
- Establish specialized CI/CD pipelines for AI workloads.
3. **Development & Best Practices**
- Refactor data‑science code to meet industrial standards (OOP, dependency injection).
- Introduce automated tests, code reviews, and modular design.
- Manage model versioning and dependency control.
4. **MLOps & Monitoring**
- Oversee production model performance, including drift detection and automated retraining.
- Produce dashboards and business‑relevant metrics using Prometheus, Grafana, and MLflow.
5. **Leadership & Collaboration**
- Guide and coach a team of data scientists, disseminating engineering best practices.
- Translate technical decisions into actionable roadmap items.
**Required Skills**
- **Programming:** Python, expert level (5–10 years of experience).
- **MLOps:** Machine‑learning pipelines, model monitoring, drift detection.
- **Cloud:** Azure, AWS, or GCP – deployment at scale.
- **DevOps:** Docker, Kubernetes, CI/CD.
- **Software Architecture:** Design patterns, dependency injection, OOP.
- **Frameworks:** LangChain, Haystack, ChromaDB, Pinecone (preferred).
- **Databases:** Vector databases, relational databases (via SQLAlchemy).
- **Streaming/Orchestration:** Kafka/Pulsar, Airflow/Prefect (desired).
- **Monitoring Tools:** Prometheus, Grafana, MLflow.
- **Methodologies:** Agile, TDD, code review.
**Soft Skills**
- Technical leadership and mentoring.
- Autonomy in complex projects.
- Industrialization mindset: turning POCs into production‑ready systems.
**Required Education & Certifications**
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, Data Science, or a related field.
- Certifications in cloud platforms (Azure/AWS/GCP) and MLOps/Kubernetes are advantageous.