- Company Name
- We+
- Job Title
- Data Engineer / Scientist (LLM & OCR) - Lyon
- Job Description
Job Title: Data Engineer / Data Scientist (LLM & OCR)
Role Summary:
Senior data engineer/scientist responsible for evaluating, integrating, and industrializing large language model (LLM) solutions within OCR, RAD (automatic document recognition), and LAD (automatic document reading) document processing pipelines. Works independently to expand document recognition scope, modernize existing technologies, and benchmark AI performance across cloud providers (AWS, GCP, Azure).
Expectations:
- 4‑7 years of professional experience in data engineering or data science.
- Proven expertise in applying LLMs to document analytics.
- Significant hands‑on experience with OCR, RAD, and LAD technologies.
- Strong analytical, synthesis, and communication skills with a collaborative mindset.
Key Responsibilities:
- Test and compare LLMs from multiple providers (AWS, GCP, Azure, or equivalents).
- Prepare, structure, and manage datasets for testing and evaluation phases.
- Define success/failure criteria and implement AI benchmark tools.
- Analyze, interpret, and report on results to drive decision‑making.
- Develop and enhance existing products using Java, Python, and Spring AI.
- Promote best practices and upskill teams on LLM, OCR, RAD, and LAD topics.
- Operate autonomously while maintaining close alignment with cross‑functional engineering teams.
Required Skills:
- Advanced proficiency in Python, Java, and the Spring AI framework.
- Deep understanding of LLM architectures, fine‑tuning, and deployment.
- Expertise in OCR, RAD, and LAD technologies and workflow integration.
- Experience with AI benchmarking tools and performance metrics.
- Familiarity with cloud AI services (AWS Bedrock, GCP Vertex AI, Azure AI).
- Strong analytical, problem‑solving, and documentation skills.
- Excellent communication and collaboration abilities.
Required Education & Certifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Artificial Intelligence, or a related field.
- Relevant certifications in AI/ML or cloud platforms are advantageous but not mandatory.