- Company Name: Environment Agency
- Job Title: Junior Data Engineer - 32482
- Job Description
**Job Title**
Junior Data Engineer
**Role Summary**
Support the Water Quality Digital Services team in building and deploying data products and analytics tools on Microsoft Azure. Manage and transform large, multi‑source datasets, design star-schema data models, and maintain reliable ETL pipelines to enable efficient reporting and analytics for the organization’s water quality stakeholders.
**Expectations**
- Manage and analyze large datasets from diverse sources.
- Design and implement star‑schema models for relational data warehouses.
- Build, test, and maintain production‑ready ETL pipelines.
- Write clean, reusable Python code and apply object‑oriented practices.
- Write SQL for data extraction, transformation, and loading.
- Collaborate cross‑functionally and promote inclusion and teamwork.
- Continuously improve data quality, resilience, and documentation.
**Key Responsibilities**
- Extract data from various structured and semi‑structured sources (databases, APIs, files).
- Transform data using Python and SQL, ensuring accuracy and performance.
- Load transformed data into Azure Synapse Analytics or an equivalent data warehouse.
- Design and maintain star‑schema data models that support efficient query and reporting.
- Develop scheduled ETL workflows, monitor their execution, and troubleshoot failures.
- Create data documentation and metadata for use by analysts and Power BI developers.
- Work with business stakeholders to understand reporting needs and deliver actionable data solutions.
- Participate in code reviews and adhere to version control (Git) and CI/CD practices.
- Maintain data security, privacy compliance, and traceable data lineage.
- Provide technical support for Power BI dashboards and analytics tools.
**Required Skills**
- Proficiency in Python (pandas, pyodbc, SQLAlchemy, etc.).
- Strong SQL Server experience (T‑SQL, OLAP, indexing).
- Experience with Azure services: Synapse Analytics, Data Factory, Data Lake, Power BI, Fabric.
- Knowledge of ETL design, data pipeline orchestration, data quality, and testing.
- Ability to design and implement star‑schema data models.
- Familiarity with version control (Git), CI/CD, and automated testing.
- Excellent problem‑solving, analytical, and communication skills.
- Commitment to inclusive teamwork and collaboration.
**Required Education & Certifications**
- Bachelor’s degree or equivalent in Computer Science, Information Systems, Data Engineering, or a related field.
- Relevant certifications (e.g., Microsoft Certified: Azure Data Engineer Associate, SQL Server certifications) are advantageous but not mandatory.