Speechmatics

www.speechmatics.com

1 Job

107 Employees

About the Company

Speechmatics is the world's leading expert in speech technology, combining the latest breakthroughs in AI and ML to unlock the business value in human speech. Businesses worldwide use Speechmatics to accurately understand and transcribe human speech into text, regardless of demographic, age, gender, accent, dialect or location, both in real time and on recorded media. Combining these transcripts with the latest AI-driven speech capabilities, businesses build products that use summaries, topics, sentiment, chapters, translation and more. Speechmatics processes over 300 years of transcription worldwide every month in 50 languages. Having pioneered machine learning in speech recognition, its neural networks consider acoustics, languages, dialects, multiple speakers, punctuation, capitalization, context and implicit meanings. Speechmatics is headquartered in Cambridge, UK, with an additional office in New York. Speechmatics is a registered trademark.

Listed Jobs

Company Name
Speechmatics
Job Title
Software Engineer - Data & AI
Job Description
**Job Title:** Software Engineer – Data & AI

**Role Summary:** Design, develop, and maintain scalable data pipelines and web‑scraping solutions that generate high‑quality datasets for large‑scale speech AI models. Oversee data ingestion, transformation, storage, and quality assurance while optimizing performance and ensuring compliance with privacy regulations.

**Expectations:**
- Build and operate robust web‑scraping systems to collect data from diverse sources.
- Develop end‑to‑end ETL pipelines for ingesting, cleansing, normalizing, and storing large volumes of data.
- Optimize pipeline efficiency for speed, reliability, and scalability.
- Monitor and troubleshoot data workflows, ensuring high reliability and throughput.
- Identify opportunities to source or generate new relevant datasets.
- Collaborate with ML engineers and leadership to define data strategy and roadmap.
- Maintain compliance with data privacy and security standards.
- Work with cloud and on‑prem infrastructure to support data operations.

**Key Responsibilities:**
- Architect and implement scalable web‑scraping tools and data pipelines using Python.
- Manage data storage infrastructure (databases, cloud storage, on‑prem systems).
- Enforce data quality through validation, cleaning, and normalisation processes.
- Monitor pipeline performance and troubleshoot bottlenecks or failures.
- Coordinate with ML teams to align data capabilities with model requirements.
- Stay current on industry best practices in web scraping, data compliance, and cloud services.
- Contribute to the continuous improvement of data infrastructure and tooling.

**Required Skills:**
- Proven experience in software/data engineering.
- Strong proficiency in Python and SQL.
- Hands‑on experience with web‑scraping, crawlers, and related frameworks.
- Expertise in data pipeline design and ETL.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP).
- Knowledge of data compliance, privacy‑preserving techniques, and security best practices.
- Experience with on‑prem infrastructure and database management.
- Ability to optimize for performance, scalability, and cost efficiency.

**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent professional experience).
- Relevant certifications in cloud technologies or data engineering are a plus.
Cambridge, United Kingdom
Hybrid
04-12-2025