Cloud Operations Engineer

Posted 3 weeks ago


You will:

Run and monitor our client’s Cloud infrastructure by:

– Ensuring that the infrastructure is highly available and can scale without any downtime for the customer;
– Ensuring that the performance of our client’s Cloud Platform and application does not degrade because of infrastructure problems;
– Ensuring that the infrastructure costs are optimised by reducing them when possible without affecting the overall performance;
– Ensuring that our client’s Cloud Platform and application are protected and secure, with no breach in the infrastructure or the application that could affect the integrity or security of customer data.
Handle incidents, changes and support requests by:
– Handling production and non-production incidents and alerts;
– Handling change requests such as creating new customer environments, platforms and application releases;
– Handling support requests from internal and external customers;
– Ensuring that all Customer SLAs are met;
– Ensuring that all Internal OLAs are met.
Improve and optimise our client’s Cloud Platform by:
– Helping and supporting the engineering teams by giving them operational measurements, feedback and reports, so that they can increase performance of the application and increase engineering productivity;
– Helping and supporting the sales and delivery teams by giving them operational measurements, feedback and reports, so that they can stay ahead of customer needs.
Develop and improve operational tools and processes by:
– Improving the operational processes so that we provide a high-class IT service to customers and meet all SLAs;
– Integrating and managing Monitoring and Alerting Systems;
– Integrating and managing data collection, processing, analysis and reporting tools.
Demonstrate a strong DevOps mindset and culture by:
– Sharing knowledge and best practices, such as delivery and deployment practices, with the team and organisation so that we can enable the DevOps approach;
– Helping the engineering teams to increase the deployment frequency and reduce the lead time for changes by building high-performing CI/CD pipelines.

You need to have:

    • Incident/Change Management knowledge and experience;
    • Basic knowledge of web servers (e.g. Apache, NGINX, Tomcat);
    • Solid knowledge of Linux systems (e.g. Red Hat Enterprise Linux/CentOS);
    • Solid scripting and programming knowledge and experience (e.g. Shell Scripting, Python);
    • Good knowledge of monitoring and alerting solutions (e.g. AWS CloudWatch, Site24x7, ELK/EFK, Prometheus);
    • Good knowledge of database administration;
    • Good knowledge of networking (load balancing, DNS) and security concepts;
    • Good knowledge of Cloud Computing (IaaS, PaaS) and/or virtualization;
    • Good communication, strong organizational and problem-solving skills;
    • Experience with planning, collaboration and reporting tools (e.g. Jira, Trello, HipChat, Confluence, Excel).

We hope you have:

    • Basic understanding of scalable web application and microservice architecture;
    • Application security knowledge (secure software development practices);
    • Experience breaking down customer requirements into infrastructure changes, and pre-sales consulting experience;
    • Basic experience with configuration management tools (e.g. Ansible, Chef, Puppet);
    • Exposure to the Software Development Life Cycle, Continuous Integration and Delivery processes;
    • Experience with BI tools (e.g. Kibana, Grafana);
    • Experience with container orchestration technologies (e.g. Kubernetes, Docker Swarm);
    • Experience with infrastructure-as-code technologies (e.g. Terraform, Heat).