Infrastructure & Cloud

Medior Site Reliability Engineer

Would you like to work on the backbone we rely on?

We are looking for an experienced Site Reliability Engineer (SRE) to join the Engineering Chapter team and help ensure the reliability, scalability, and performance of critical on-premises services within the ERA product organization.

In this role, you'll focus on building and maintaining a modern observability platform, implementing monitoring best practices, and automating operational processes. Working closely with cross-functional engineering teams, you'll help improve system resilience, reduce incident response times, and ensure the availability of business-critical services.

If you're passionate about observability, automation, and operational excellence, this opportunity is for you.

Role

Observability & Monitoring

Design, implement, and maintain enterprise monitoring solutions.
Build intuitive Grafana dashboards and visualizations.
Configure meaningful alerts to proactively detect issues.
Implement distributed tracing and centralized log aggregation.
Define and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
Continuously improve monitoring coverage and platform visibility.

Infrastructure & Reliability

Manage and optimize on-premises monitoring infrastructure.
Ensure platform reliability, scalability, and high availability.
Support Linux-based environments and troubleshoot infrastructure issues.
Participate in 24/7 on-call rotations for incident response.
Contribute to reducing Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR).

Automation & DevOps

Automate deployment, configuration, and operational tasks.
Develop automation scripts using Python, Bash, or Go.
Improve infrastructure management through automation and standardization.
Support Infrastructure as Code and operational best practices.

Collaboration

Work closely with development teams to improve application instrumentation.
Promote observability best practices across engineering teams.
Balance technical improvements with business priorities.
Contribute to continuous improvement initiatives within the Engineering Chapter.

Security & Compliance

Ensure monitoring solutions comply with enterprise security standards.
Maintain secure on-premises monitoring environments.
Support compliance and governance requirements.

Profile

Core Technical Skills

Advanced experience with Grafana
Strong expertise in Prometheus and PromQL
Hands-on experience with OpenTelemetry
Experience with Elasticsearch
Strong Linux system administration skills
Good understanding of networking fundamentals
Experience securing on-premises infrastructure

Programming & Automation

Experience with one or more of:

Python
Bash
Go

Experience

3+ years of experience in monitoring, observability, or Site Reliability Engineering.
At least 2 years of hands-on experience with Grafana and Prometheus in production environments.
Strong experience supporting Linux-based production systems.
Proven experience managing enterprise on-premises infrastructure.
Experience participating in 24/7 operational support or on-call rotations.

Security

Understanding of enterprise security practices.
Experience working within compliance-driven environments.

Who You Are

Passionate about reliability, automation, and operational excellence.
Analytical with strong troubleshooting skills.
Comfortable working in production-critical environments.
Able to prioritize effectively and balance technical improvements with business needs.
Collaborative and proactive in working with cross-functional teams.
Committed to continuous improvement and knowledge sharing.

Offer

Freelance long-term contract

What You'll Help Deliver

As a Site Reliability Engineer, you'll contribute directly to:

Improved platform reliability and system availability.
Reduced MTTD (Mean Time to Detect) and MTTR (Mean Time to Recover).
Comprehensive observability across critical services.
Automated deployment, monitoring, and operational processes.
Secure and compliant monitoring infrastructure supporting business-critical applications.

Benefits

3 days of remote work

At Sander, we treat every application strictly confidentially!

Apply now

Submit your CV today and let us connect you with top employers in your field.