- Company Name
- Colossus Technologies Group
- Job Title
- Sr. Platform SRE
- Job Description
**Job Title:** Sr. Platform SRE
**Role Summary:**
Senior Platform/SRE/DevOps Engineer responsible for designing, building, and maintaining secure, highly available, cloud‑native infrastructure on AWS. Drives automation, observability, and continuous improvement of the deployment pipelines behind AI automation platforms.
**Expectations:**
- 5+ years of hands‑on Platform Engineering, SRE, or DevOps experience.
- Deep expertise in AWS, Kubernetes, and containerization.
- Proven ability to deliver resilient, scalable infrastructure managed as code.
- Strong awareness of security and compliance (SOC 2).
- Ability to collaborate with development teams and lead incident response.
**Key Responsibilities:**
- Architect and implement cloud‑native infrastructure using Kubernetes and AWS.
- Develop and maintain IaC using Terraform or CloudFormation.
- Design, operate, and optimize CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions).
- Ensure high availability, performance, and reliability of production systems.
- Manage cloud security and compliance, and maintain monitoring and observability tooling (Prometheus, Grafana, CloudWatch, ELK).
- Define and execute incident response and on‑call processes.
- Continuously improve infrastructure, automation, and deployment practices.
**Required Skills:**
- AWS cloud services (EC2, S3, RDS, etc.)
- Containerization and orchestration: Docker, Kubernetes
- Infrastructure as Code: Terraform, CloudFormation
- Scripting/Programming: Python, Bash
- CI/CD tools: Jenkins, GitLab CI, GitHub Actions
- Monitoring & logging: Prometheus, Grafana, CloudWatch, ELK Stack
- Strong understanding of distributed systems, microservices, and event‑driven architectures
- Security & compliance fundamentals (SOC 2, IAM, network security)
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent practical experience).
- Preferred: AWS Certified Solutions Architect or DevOps Engineer certification.