Baseten

www.baseten.co

4 Jobs

114 Employees

About the Company

Inference is everything. Baseten is an AI infrastructure platform giving you the tooling, expertise, and hardware needed to bring great AI products to market - fast. Our proprietary Inference Stack combines cutting-edge performance research with highly performant, reliable infrastructure to give you out-of-the-box global availability with 99.99% uptime.

Listed Jobs

Company Name
Baseten
Job Title
Software Engineer - Model APIs
Job Description
Job title: Software Engineer - Model APIs
Role Summary: Architect, build, and maintain high‑performance model serving APIs that expose large‑language‑model endpoints. Focus on inference performance, observability, and developer experience while ensuring low latency, reliability, and scalability across distributed GPU resources.
Expectations:
- 3+ years designing and operating large‑scale, low‑latency distributed systems or APIs.
- Proven ownership of backend services with rate‑limiting, authentication, quotas, and metering.
- Strong infra instincts: profiling, tracing, capacity planning, and SLO management.
- Comfortable debugging complex runtimes, GPU execution traces, and custom CUDA operators.
- Excellent written communication; able to produce design docs and collaborate cross‑functionally.
Key Responsibilities:
- Design, develop, and operate the Model API surface for structured outputs, function calling, and multi‑modal serving.
- Profile and optimize TensorRT‑LLM kernels, CUDA performance, and multi‑GPU communication patterns.
- Implement performance improvements (speculative decoding, quantization, batching, KV‑cache reuse) across runtimes.
- Build comprehensive benchmarking frameworks for real‑world workloads (model types, batch sizes, sequence lengths, hardware).
- Instrument observability with metrics, traces, and logs; create repeatable benchmarks for speed, reliability, and quality.
- Implement platform fundamentals: API versioning, validation, usage metering, quotas, and authentication.
- Collaborate with product, infra, and dev‑experience teams to deliver robust, developer‑friendly serving experiences.
Required Skills:
- Distributed systems, large‑scale API design, and low‑latency backend engineering.
- Experience with rate‑limiting, auth, quotas, and metering.
- Profound knowledge of profiling, tracing, and performance tuning, including GPU and CUDA.
- Familiarity with TensorRT‑LLM, vLLM, or similar inference engines.
- Ability to debug runtime internals, GPU trace logs, and custom CUDA operators.
- Strong documentation skills and cross‑team collaboration.
- Optional: Kubernetes, service meshes, API gateways, and open‑source API experience strengthens candidacy.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Electrical Engineering, or a related technical field (or equivalent experience).
New York, United States
Hybrid
Junior
24-11-2025
Company Name
Baseten
Job Title
Senior Frontend Engineer
Job Description
Job title: Senior Frontend Engineer
Role Summary: Lead design, implementation, and maintenance of high‑performance, accessible web interfaces for an AI platform, collaborating with product, design, and backend teams to deliver scalable, user‑centric features.
Expectations:
• Own critical frontend components and user experience across the platform.
• Mentor peers, establish best practices, and drive major frontend initiatives.
• Communicate effectively with cross‑functional teams to translate product requirements into elegant UI solutions.
• Deliver reusable component libraries and infrastructure that accelerate product development.
Key Responsibilities:
• Design, implement, and maintain responsive, accessible UIs using React and TypeScript.
• Collaborate closely with product designers to transform complex ideas into intuitive interfaces.
• Optimize application performance, focusing on rendering speed and responsiveness.
• Partner with backend teams to define APIs, test, and refine end‑to‑end flows.
• Establish best practices and mentor other engineers.
• Build reusable component libraries and frontend infrastructure.
Required Skills:
• 5+ years building production‑grade web applications.
• Deep expertise in React, TypeScript, and modern web tooling.
• Proven track record of building performant, scalable UIs.
• Strong product sense and user‑experience empathy.
• Experience within cross‑functional product teams in a fast‑paced environment.
• Excellent communication and collaboration skills.
Nice to have: Tailwind CSS, Next.js, WebSockets, real‑time data, custom visualization tooling, open‑source contributions.
Required Education & Certifications:
• Bachelor’s degree in Computer Science, Software Engineering, or related field is preferred.
• Relevant technical certifications (e.g., React Developer, TypeScript) are a plus.
New York, United States
Hybrid
Senior
24-11-2025
Company Name
Baseten
Job Title
AI Solutions Engineer
Job Description
Job title: AI Solutions Engineer
Role Summary: Deliver end‑to‑end AI solutions on the Baseten platform by partnering with customers from discovery through production. Merge software engineering, product design, and technical customer success to create scalable, observable AI services that meet performance, latency, and cost targets.
Expectations:
- Own the full customer lifecycle: problem framing, prototype creation, evaluation, deployment, and monitoring.
- Translate ambiguous business goals into clear technical specifications and PoCs, and ship high‑quality services on time.
- Drive continuous improvement of the technical stack, collaborating with product and engineering teams.
Key Responsibilities:
- Design, implement, and maintain production‑grade software systems primarily in Python.
- Build and deploy ML model pipelines; manage Docker containerization and integration with Baseten’s infrastructure.
- Deliver rapid Proof‑of‑Concepts, detailed specs, and fully tested services.
- Optimize AI/ML workflows for latency, throughput, and cost efficiency.
- Act as a project manager and product owner: define scope, milestones, and stakeholder communication.
- Work closely with customers’ engineering teams throughout sales, implementation, and expansion phases.
- Provide technical pre‑sales support and post‑deployment customer success.
- Work cross‑functionally with product, performance engineering, and software teams to enhance platform capabilities.
Required Skills:
- Proficient in Python; familiarity with other general‑purpose languages is a plus.
- Hands‑on experience building and deploying ML pipelines, model serving, and inference infrastructure.
- Solid understanding of ML lifecycle: data ingestion, training, validation, deployment, and monitoring.
- Comfortable with container orchestration (Docker, Kubernetes) and cloud infrastructure concepts.
- Strong written and verbal communication, especially on complex technical topics.
- Ability to navigate ambiguity, make judgment on trade‑offs, and avoid unnecessary complexity.
- Basic project management competencies; experience owning end‑to‑end initiatives.
Required Education & Certifications:
- Bachelor’s, Master’s, or Ph.D. in Computer Science, Engineering, Mathematics, or a related technical field.
- Minimum 1 year of professional software engineering experience in a fast‑paced, high‑growth environment.
New York, United States
Hybrid
Fresher
24-11-2025
Company Name
Baseten
Job Title
Software Engineer - Infrastructure
Job Description
Job title: Software Engineer - Infrastructure
Role Summary: Develop and maintain core infrastructure components for a machine‑learning inference platform, enabling efficient model deployment, scaling, and monitoring across multi‑cloud environments.
Expectations:
* Build Python- and Go-based services that support model serving
* Design and manage Kubernetes deployments and orchestration layers
* Implement monitoring, logging, and resource‑management solutions for ML workloads
* Automate deployment workflows and improve reliability and performance
* Collaborate closely with peer teams to evaluate and adopt infrastructure best practices
Key Responsibilities:
* Develop and extend platform components for inference on varied GPU resources (e.g., B200, fractional H100)
* Create and maintain Kubernetes manifests, Helm charts, and deployment pipelines for model serving workloads
* Design service‑level monitoring dashboards and alerting for inference performance metrics
* Engineer efficient resource provisioning and autoscaling strategies for distributed inference workloads
* Enhance automation scripts and CI/CD pipelines to streamline model deployment processes
* Participate in architectural reviews and technical discussions to shape long‑term infrastructure strategy
* Mentor junior engineers on coding standards, DevOps practices, and ML infrastructure concepts
Required Skills:
* 1–3 years of software engineering or infrastructure experience
* Proficiency in Python; Go knowledge preferred
* Hands‑on experience with Kubernetes and containerization (Docker, containerd)
* Familiarity with distributed system concepts and resource scheduling
* Experience with monitoring/logging tools (Prometheus, Grafana, ELK)
* Basic understanding of ML model serving and inference workflows
* Strong written and verbal communication; collaborative mindset
Required Education & Certifications:
* Bachelor’s degree or higher in Computer Science, Engineering, or related field (or equivalent experience)
New York, United States
Hybrid
Fresher
11-02-2026