**Company Name:** Nscale
**Job Title:** Senior Software Engineer
**Role Summary:**
Design, implement, and operate scalable control‑plane and data‑plane services that power a GenAI cloud platform. Own APIs, SDKs, distributed state management, workload scheduling, and observability for high‑throughput AI/ML workloads across multiple regions. Drive technical strategy, mentor peers, and ensure production reliability and performance.
**Expectations:**
- Deliver production‑grade distributed systems on major cloud providers (AWS, GCP).
- Use AI development tools (e.g., Claude, Cursor) as a core part of your workflow.
- Exhibit strong command of Go, Rust, and Python.
- Manage day‑2 operations: monitoring, alerting, incident response, performance tuning.
- Thrive in ambiguous, fast‑paced environments with high ownership.
- Advocate engineering best practices and continuous improvement.
**Key Responsibilities:**
- Build and own control‑plane and data‑plane services for the cloud platform.
- Develop APIs and SDKs consumed by platform and client services.
- Implement reliable distributed state, storage, and scheduling engines across regions.
- Engineer infrastructure for high‑throughput AI/ML training and inference workloads.
- Drive technical decisions and champion best practices across engineering teams.
- Maintain operational health: observability, incident response, performance optimization, reliability improvement.
- Innovate new platform services using cloud‑native, AI‑driven approaches.
- Mentor and guide junior engineers on architecture, coding, and ops.
**Required Skills:**
- Extensive experience designing, building, and operating scalable production systems on AWS or GCP.
- Proficiency in Go; experience with Rust and Python, and comfort working across multiple language ecosystems.
- Strong background in distributed workflows, backend services, and RESTful API design.
- Hands‑on experience with day‑2 operations: monitoring, alerting, incident response, performance tuning.
- Familiarity with AI‑assisted development tooling.
- Pragmatic problem‑solver with high agency and an ownership mindset.
**Nice to Have (Non‑mandatory):**
- Experience with Kubernetes, Terraform or Pulumi, and event‑driven architecture (NATS, Kafka, RabbitMQ).
- GPU orchestration and AI workload scheduling.
- Open‑source contributions; exposure to front‑end stacks.
**Required Education & Certifications:**
- Bachelor’s (or higher) degree in Computer Science, Software Engineering, or a related field.
- Relevant cloud certifications (e.g., AWS Solutions Architect, GCP Professional Cloud Architect) are a plus but not mandatory.