- Company Name
- Themesoft Inc.
- Job Title
- IT Site Reliability Engineer / Architect
- Job Description
**Role Summary**
Design, build, and maintain secure, highly available cloud infrastructure for medical device applications (Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, Tanzu). Lead Kubernetes orchestration, DevOps practices, CI/CD pipelines, infrastructure‑as‑code, monitoring, and disaster recovery across multi‑cloud environments.
**Expectations**
- Own and evolve the infrastructure architecture, ensuring availability, scalability, and security.
- Collaborate closely with cross‑functional teams (dev, ops, security) to define and enforce best practices.
- Deliver reliable, automated deployments and rapid issue resolution.
**Key Responsibilities**
1. Architect and implement cloud‑native infrastructure for the applications named in the Role Summary.
2. Build and maintain Kubernetes clusters, Helm charts, and Kustomize manifests; enforce GitOps workflows.
3. Develop CI/CD pipelines for application and infrastructure deployments (for example with Git and Jenkins or ArgoCD), including workloads running on Tanzu and Confluent Kafka.
4. Maintain infrastructure‑as‑code repositories with version control and documentation.
5. Monitor system performance, detect configuration drift, and automate remediation using Prometheus/Grafana or equivalent tools.
6. Design and test disaster‑recovery and business‑continuity plans.
7. Administer Linux production systems (Ubuntu, CentOS, RHEL, Debian), including networking, filesystems, and performance tuning.
**Required Skills**
- Cloud: AWS/Azure/GCP (provisioning, networking, IAM).
- Container: Kubernetes, Docker, Helm, Kustomize, GitOps.
- Application stack: PostgreSQL (database), HiveMQ and Confluent Kafka (messaging), Qlik (analytics), Ignition (SCADA), Tanzu (application platform).
- CI/CD & IaC: Git, GitHub/GitLab, Jenkins/ArgoCD, Terraform, Ansible.
- Linux system administration, kernel internals, process & network tuning.
- Monitoring & observability: Prometheus, Grafana, ELK stack.
- Disaster‑recovery design, high‑availability planning.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience).
- Relevant certifications preferred:
  * Certified Kubernetes Administrator (CKA) / Certified Kubernetes Security Specialist (CKS)
  * Red Hat Certified Engineer (RHCE) or similar Linux certification
  * Cloud platform certifications (AWS SAA, GCP‑CSP, Azure AZ‑303/304 or its successor AZ‑305)
---