DevOps / Incident Manager

Location Montenegro
Job Type Permanent
Salary Competitive
Reference 29147

The DevOps engineer has overall accountability for enabling faster release and deployment cycles by taking advantage of agile development methodologies, improved collaboration between business stakeholders, application development and operations teams, and automation tools.

The Incident Manager is responsible for root cause analysis of Major Incidents. The role ensures that normal service operation for the affected microservice is restored as quickly as possible to minimize adverse impact on the business. It also manages the lifecycle of a problem, helping to eliminate recurring incidents and to minimize the impact of incidents that cannot be prevented.


  • 4-5 years of experience as a System Administrator/DevOps engineer

  • Experience as an Incident Manager / Site Reliability Engineer is a significant plus

  • Linux systems and web servers (e.g. Nginx)

  • Understanding of networking (TCP/IP)

  • Proficiency in writing scripts in at least one language (Python, sh, Bash, or others)

  • Docker orchestration systems: ECS, Kubernetes

  • Message queues: ActiveMQ, RabbitMQ, Apache Kafka or any other message broker

  • Familiarity with Logstash, Kibana, Elasticsearch technologies

  • Familiarity with AWS: EC2, EMR, Kinesis, Redshift

  • Conversational English



  1. Design, implement and maintain a dynamic infrastructure to support software development: Java/Python/Scala services (batch and stream processing) and Elasticsearch + Logstash communication

  2. Build new systems; upgrade and patch existing ones

  3. Leverage scripting to build required automation and tools

  4. Configuration management

  5. Continuous Integration

  6. Continuous Delivery

  7. Perform technical research as a member of a team

  8. Investigate how AWS components can be combined

  9. Evaluate third-party components and services

  10. Learn on the job and explore new technologies with little supervision

  11. Optimize service costs

  12. Improve service stability and availability

  13. Manage day-to-day functions to ensure the efficient handling of small to medium-level problems and Major Incidents, so that normal service operation is restored as quickly as possible and the adverse impact on the business is minimized

  14. Oversee communication and technical issues and coordinate work during Major Incidents (Very High & Critical)

  15. Create and maintain incident reports to ensure accuracy

  16. Monitor outstanding problems and their status to ensure corrective measures are being taken to permanently fix the problems

  17. Perform proactive trend analysis and root cause analysis to identify potential areas of concern with a focus on prevention and elimination of recurrences

  18. Provide guidance and recommendations to business and technical partners to help prevent future incidents and improve system stability and client satisfaction

  19. Create agreed action plans with named actions and deadlines for incident reports, and be accountable for delivering those plans

  20. Support testing and production environments

  21. Efficient delivery of services (quality, low cost)

  22. Efficient and effective solution delivery process

  23. Promote continuous improvement

  24. Ability to choose automation tools to deploy based on service requirements

  25. Provide cost-effective solutions to deploy and maintain infrastructure

  26. Be strong in CI/CD to help the team with process automation



  • Work in a dynamic and fast-paced international company

  • Beautiful emerging European destination (Podgorica, Montenegro)

  • Low cost of living

  • Generous relocation package

  • Work/residence permit for employees and their families

  • Competitive salary

  • Comprehensive medical insurance for all family members

  • Annual reimbursement of a flight home for the employee and family

  • Public holidays + 21 working days of annual leave

  • Fun and friendly professional environment

  • Using leading technologies and modern practices

  • Excellent training and development opportunities

  • Rapidly expanding global footprint
