- Company Name
- ITMC Systems, Inc
- Job Title
- AI DevOps Engineer
- Job Description
Role Summary:
Architect, deploy, and scale enterprise AI infrastructure for agent-based platforms and Retrieval-Augmented Generation (RAG) pipelines across multi‑cloud environments. Focus on automation, cost control, observability, and secure operations of LLM services.
Expectations:
- Deliver highly available, scalable AI solutions with a strong emphasis on reliability and security.
- Optimize resource usage and cost for large‑language‑model workloads.
- Design and maintain comprehensive monitoring, observability, and CI/CD pipelines.
Key Responsibilities:
- Develop, deploy, and maintain the NOVA agentic AI platform and LiteLLM gateway.
- Build and tune RAG pipelines (ingestion, chunking, embeddings, vector stores) on GCP and Azure.
- Deploy AI services on Kubernetes (AKS, GKE) and implement Helm/K8s automation.
- Create and manage CI/CD workflows using Jenkins, GitHub Actions, and Opsera.
- Automate infrastructure with Terraform, Helm, and GitOps; enforce security and compliance.
- Build automation tooling, MCP servers, and SDKs/APIs for multi‑agent orchestration.
Required Skills:
- 5+ years of platform engineering/DevOps experience.
- 2+ years of AI/ML or LLM platform development experience.
- Expertise in Kubernetes, CI/CD, and GCP/Azure cloud architecture.
- Proficiency in Python and/or TypeScript, plus Bash scripting.
- Experience with Prometheus, Grafana, OpenTelemetry, and Dynatrace.
- Knowledge of containerization (Docker), Helm, Terraform, and GitOps workflows.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Software Engineering, or related field (or equivalent experience).
- Relevant certifications (e.g., Google Professional Cloud Architect, Microsoft Certified: Azure Solutions Architect Expert, Certified Kubernetes Administrator) are a plus.