We are seeking a Monitoring Engineer to collaborate with our SRE and Development teams.
The position entails optimizing our monitoring and tooling to enhance our productivity and increase system availability by proactively identifying trends.
Responsibilities:
- Take charge of monitoring our applications and systems
- Work with other teams to ensure that the monitoring tooling and processes are appropriate
- Train team members on how to effectively monitor services, systems, and even business metrics
- Foster communication and effective collaboration
- Assist with standardizing and automating monitoring across multiple environments
- Regularly review costs to ensure that we are not overspending
- Help establish a productive environment where team members take ownership of their product end-to-end
Requirements:
- Comprehensive knowledge of monitoring tools, with Datadog being our primary tool
- Proficiency in working with both cloud environments and physical datacentres
- Experience in setting up and managing on-call rotations using tools like OpsGenie
- Knowledge of integrating different agents and monitoring libraries for improved visibility
- Familiarity with metrics and logs, and with how they correlate
- Excellent communication and leadership skills
- Strong problem-solving and conflict-resolution ability
- Outstanding organizational skills
- Experience in an agile environment
The ideal candidate:
- Experience with Datadog is essential for this position.
- A strong background in monitoring tools and proficiency in working with cloud technologies are expected.
- Experience with tools for setting up and managing on-call rotations, such as OpsGenie, is desirable.
- Knowledge of how to integrate various agents and monitoring libraries for better visibility, as well as an understanding of metrics and logs, is necessary.
- Strong communication, leadership, and problem-solving skills, as well as excellent organizational skills, are critical.
- Familiarity with working in an agile environment is preferred.
Our Partner's Approach:
Our partner is a medium-sized company with a start-up culture and a can-do ethos that prioritizes getting things done. They place less emphasis on management hierarchy or process, and they prioritize working remotely while maintaining regular communication. Their open-door policy extends across all levels and departments, and their staff come from diverse backgrounds all over the world.
The role offers a fully remote and flexible work schedule, on-the-job training, and opportunities for advancement within the company. A hardware allowance is included to ensure that you have all the necessary tools to complete your work. You will also receive 25 days of holiday, including bank holidays in your respective country.