Hays

Monitoring Manager

Hybrid

Waterside, United Kingdom

Senior

Freelance

18-03-2026

Skills

Leadership, ServiceNow, Monitoring, Ansible, Quality Assurance, Training, Architecture, Machine Learning, Azure, AWS, Process Improvement

Job Specifications

We have an excellent contract opportunity for an Observability/Monitoring Service Owner – Cloud with our leading client.

Role overview

Own the technical execution of the observability solutions and the integration of monitoring tools, leveraging the AI capabilities of the NOW platform to manage events for the client’s Transform products and technical platforms.

Contract – 6 months (high potential to extend further)

Location – Waterside (UB7 0GB) (2-3 days per week onsite)

Pay – attractive daily rate (inside IR35)

In this role, you will…

Leadership and Governance:

Lead and own the IT observability, automation and autohealing services for IAG Transform.
Foster a culture of innovation, collaboration, and continuous improvement in the organisation.
Develop and implement policies, processes and procedures for the observability service.
Define standards for logs, event alerts and quality assurance.
Establish governance frameworks to ensure consistent and compliant usage of observability tools.
Set up a technical review governance board for monitoring solutions to define, validate and endorse monitoring strategies, solutions, demands, etc.
Conduct regular audits to ensure compliance with established policies and standards.
Provide an observability centre of excellence; own and deliver observability solutions to product and platform teams.

Innovation and Strategy:

Develop strategies to leverage new observability tools and technologies to enhance IT service operations and overall business operations.
Lead proof-of-concept initiatives to automate the resolution of events and incidents.
Introduce and implement new machine learning models and AIOps features.

Process Improvement:

Identify service optimisation initiatives to mature the overall service.
Continuously improve IT and business service availability through effective use of observability and automation tooling.
Identify opportunities to automate processes and reduce manual efforts.
Optimise metric intelligence.

Vendor Management:

Manage vendors and partners to provide a best-in-class service that meets IAG requirements.
Manage vendor relationships, service-level agreements (SLAs), escalations and CSI plans.
Evaluate and select new vendors and tools as needed.

Observability Tooling Architecture:

Design and oversee the implementation of a comprehensive enterprise observability tooling architecture and strategy that supports ITSM, monitoring, observability, automation, and delivery management.
Engage in the AIOps project to ensure that key monitoring tools such as Datadog, AWS, Azure Monitor and Dynatrace feed the right logs and metrics into the Event Management module in ServiceNow.
Optimise observability tooling infrastructure to improve efficiency, reliability, and performance.
Ensure that all tools integrate seamlessly with each other and with other enterprise systems.
Develop and maintain a roadmap for enterprise tool enhancements and upgrades.
Set up business service monitoring dashboards for the critical business services.

Automation & Autohealing:

Own the automation and autohealing service, platforms and tools.
Define the automation and autohealing policy, process and procedure.
Identify potential use cases for automation and autohealing and take them through the right governance to implement automation playbooks using Ansible or any AWS/Azure native services that fit the use case.
Responsible for reducing manual effort in service operations and increasing automation.

Tool Integration and Optimisation:

Work collaboratively with cross-functional teams to ensure integration of tools across the enterprise to reduce manual effort and maximise quality and productivity.
Define the technical specifications, standards, and policy for technical integration of monitoring tools into ServiceNow/Ansible.
Validate the technical architecture of the integration to ensure it is fit for use and fit for purpose, and scalable and flexible enough to meet the demands of measuring business services.
Implement best practices, industry standards and frameworks for configuration and usage of observability and automation technology tools.

ITSM Tooling:

Identify opportunities to increase the proactive prediction, detection and restoration of events and incidents using machine learning models.
Leverage AIOps and ServiceNow to increase the automation of resolution.
Design and oversee the implementation of ITSM tooling solutions that support ITIL-aligned processes.
Work collaboratively with cross-functional teams to ensure integration of ITSM tools with other essential enterprise tools (e.g., monitoring, CMDB, service desk, automation tools).

Training and Support:

Provide training and support to technology staff on the effective use of observability and automation services.
Serve as a subject matter expert for enterprise tools and related technologies.

Skills

Minimum Requirements:

Extensive experience (typically 15+ years) in observabilit

About the Company

We are leaders in specialist recruitment and workforce solutions, offering advisory services such as learning and skill development, career transitions and employer brand positioning. As the Leadership Partner to our customers, we invest in lifelong partnerships that empower people and businesses to succeed. We help you achieve your career goals and deliver your business needs by combining meaningful innovation with our global scale and insights. Last year we helped over 280,000 people find their next career. Join the mill...