Infrastructure & Cloud
Monitoring DevOps Engineer
Join our team as a DevOps Specialist and take the lead in building a smarter, more predictive IT monitoring and incident detection ecosystem.
We’re looking for an experienced and proactive DevOps Specialist with a strong background in IT monitoring, incident detection, and automation. This role plays a vital part in ensuring systems and applications remain stable, efficient, and reliable by identifying issues before they escalate.
Our goal is to move toward a predictive operating model in which data, trends, and automation help us anticipate incidents and maintain smooth operations.
Role
Monitoring & Observability
- Set up, manage, and optimize monitoring solutions using tools like ELK Stack, Splunk, Dynatrace, Datadog, New Relic, Grafana, or similar.
- Continuously fine-tune dashboards, alerts, and metrics to improve visibility and reduce noise.
- Build reports that highlight system health, performance, and anomalies.
- Support trend analysis to identify early signals that could indicate future incidents.
Cloud Integration
- Ensure robust monitoring across cloud platforms such as GCP, AWS, or Azure.
- Maintain observability in cloud-native environments so that potential issues are detected early.
Incident Detection & Response
- Analyze system data to uncover patterns, define new metrics, and improve detection capabilities.
- Collaborate on root cause analysis, post-incident reviews, and continuous improvement efforts.
- Develop alerting rules and integrate them with our incident management platform (Freshservice).
- Help bridge the gap between monitoring, alerting, and response workflows.
Collaboration & Communication
- Work closely with developers, application teams, and IT support to align monitoring with business and technical needs.
- Ensure clear and timely communication during incidents and improvement projects.
- Guide teams in adopting and optimizing monitoring tools.
Automation & Scripting
- Use Terraform, Python, Bash, and CI/CD pipelines to automate monitoring, alerting, and incident workflows.
- Build automation scripts to reduce manual tasks and improve system responsiveness.
Documentation & Standards
- Maintain clear documentation of monitoring setups, alert thresholds, and escalation procedures.
- Define and document alert priorities and response protocols.
Autonomous Working
- Take ownership of your work and propose improvements to tools, workflows, or processes.
- Bring curiosity, initiative, and a solution-oriented mindset to your day-to-day work.
Profile
- Degree in Computer Science, IT, or equivalent practical experience.
- Strong hands-on experience with modern monitoring tools (e.g. ELK, Datadog, Splunk).
- Experience working with at least one major cloud provider (GCP, AWS, Azure).
- Proficiency in automation/scripting (Terraform, Python, Bash).
- Familiarity with CI/CD pipelines and modern DevOps practices.
- Excellent communication skills and ability to work with multiple teams.
Offer
- Continuous learning and professional development support
- Easily reachable headquarters with hassle-free parking
- Adaptable work schedule and remote work options to support a healthy work-life balance
Benefits
2 days of remote work
At Sander, we treat every application in strict confidence!