Found People Inc.

DevOps Manager (MLOps)

Hybrid

Toronto, Canada

Mid level

Full Time

12-02-2026

Skills

Python, Bash, SQL, GitHub, CI/CD, DevOps, Docker, Kubernetes, Monitoring, Scripting and Automation, Networking, Architecture, Azure, AWS, GCP, CI/CD Pipelines, API Management, Terraform, Infrastructure as Code, Microservices, GitHub Actions

Job Specifications

Note: 4 days a week in office, downtown Toronto

This role will support a set of cloud- and ML-heavy initiatives, with a strong focus on production ML platforms, cloud networking, and API-driven architectures. We are looking for someone who is deeply hands-on, comfortable operating at the infrastructure and platform layer, and experienced in supporting ML teams running models in production.

You’ll work closely with software engineers, ML engineers, and data teams to ensure that cloud infrastructure, deployment pipelines, and ML services are secure, observable, and scalable.

Key Responsibilities:

Cloud Infrastructure & Networking

Design, build, and operate cloud infrastructure across GCP (preferred), AWS, or Azure
Own cloud networking architecture, including VPCs, load balancers, firewall rules, security policies, and IAM strategies
Ensure reliability, performance, and cost efficiency of cloud environments

DevOps & Platform Engineering

Build and operate microservices deployed into serverless environments such as Cloud Run or equivalent platforms
Implement and maintain CI/CD pipelines and automation using Terraform, GitHub Actions, and related tooling
Partner closely with application teams to enable safe, fast, and repeatable deployments

MLOps & ML Platform Support

Support ML and AI services in production, including deploying, operating, and monitoring models and pipelines
Work hands-on with Google Vertex AI and ML platform operations
Deploy and operate MCP servers and other AI/ML-driven services in live environments
Work closely with ML teams to productionize models and ensure operational excellence

API Management & Observability

Design and manage API gateways and API management platforms (Apigee or native cloud API gateways) at scale
Implement strong observability practices including logging, monitoring, alerting, and notification systems
Troubleshoot performance, reliability, and data flow issues across distributed systems

Qualifications & Experience

Required

5+ years of professional experience in DevOps, Cloud Engineering, Platform Engineering, or MLOps roles
Strong hands-on experience with cloud networking (VPCs, load balancers, firewall rules, IAM)
Advanced API management experience using Apigee or native cloud API gateways
Direct MLOps experience, including deploying and operating ML/AI services in production
Deep, recent experience with GCP, particularly Vertex AI
Hands-on experience with Docker, Kubernetes, and serverless platforms (e.g., Cloud Run)
Strong scripting and automation skills using Python and Bash (SQL familiarity is a plus)
Proven experience with Infrastructure as Code (Terraform) and CI/CD automation
Strong understanding of cloud observability and operational best practices

About the Company

In today's dynamic and fiercely competitive business environment, with a war for talent under way, finding the right people is crucial to an organization's success and growth. It's essential to ensure the right fit from the start. That's where Found People Inc. steps in. Found People Inc. is Canada's premier boutique contingency recruitment firm. Its expertise is in recruiting top talent and building high-performance teams for startups, SaaS, and Fortune 500 companies, focusing on mid to senior-level roles in techno...