Job description
As a Site Reliability Engineer, you will be responsible for the operation and support of IT solutions designed for the telecommunications sector (eSIM cards, IoT, M2M) and integrated with the networks of telecom operators worldwide. You will join a public cloud project run on a DevOps model, with a technology focus on cyber security on the telecom side.
Your responsibilities
- To perform day-to-day activities in GCP Cloud, following the SRE approach in a cross-functional team
- To develop & maintain the IaC code & automation tools
- To take responsibility for support & operations in GCP Cloud, helping to shape the product roadmap and establish strong operational readiness across teams
- To extend and acknowledge handover milestones to Tier I/II in compliance with contractual SLAs
- To maintain and improve proactive monitoring of the solution
- To ensure the integrity and reliability of the solution functional baseline & architecture
- To provide technical guidance for new or evolving services and provide consolidated technical analysis
- To deploy existing and/or new products in GCP Cloud
- To perform onboarding tests and infrastructure and application sanity checks, communicate technical risk concerns and help prepare mitigation plans
- To handle incidents raised internally or by customers to restore services or to provide solutions to customers within the SLA
- To ensure SRE milestones such as SLOs and error budget policies are met
- To respond to requests & queries coming from different stakeholders and customers with regards to platforms & products under the scope
- To solve complex problems arising from issues on the platform, leading to bug fixes or system updates, with escalation to Level 3 and close follow-up until resolution
- To design, plan & implement all changes as a result of problem management and/or release management. Provide technical guidance to CAB for assessment of change requests
- To participate in the preparation and review of technical product & customer-specific documentation and to maintain it throughout the lifecycle of the project
- To participate in the team's on-call standby rotation outside office hours, including weekends
Our requirements
- Configuration management of the software and its components
- Change management of the software solution
- Knowledge of a cloud service provider (GCP), monitoring tools, networking, infrastructure and Linux
- Hands-on experience in Continuous Integration and Delivery tools like GitLab, Terraform & Helm
- Strong knowledge of system integration, operation and maintenance, and proven experience with automation tools including GitLab
- Hands-on experience with Kubernetes deployments and with GCP administration and support in production-grade environments
- Proficient in Linux, TCP/IP and HTTP(S) protocols
- Programming experience in at least one of the following languages: C, C++, Java, Python, or Go
- Experience with algorithms and data structures
- Knowledge of Agile methodology and Service Delivery best practices
- English at an advanced level