Overview
Job Description
Responsible for managing day-to-day network operations and maintenance in the Unified Collaborations domain. This includes customer change requests, platform uptime, incident management, problem management, CFT/OEM interlock, and escalation support for operations. This is an operational role: it delivers results that directly affect day-to-day operations, and it requires the ability to instruct professional or technical staff and review the quality of their work.
Location: Bangalore & Pune
Key Responsibilities:
Manage cloud applications using common DevOps and Agile practices to maintain uptime and delivery.
Spend about 50% of your time automating the site systems to self-manage and self-heal, and make sure that each traffic-impacting event is captured by the alarm monitoring system.
Spend the other 50% of your time on incident resolution, on-call support, and the onboarding and implementation of new services.
Should have a good understanding of call flows, architecture and analysis across channels (SMS, WhatsApp, Plugins, RCS, Chatbots, Email and AI).
Knowledge/Experience:
Develop the processes needed to maintain service availability. Onboard and implement tools and use them optimally to maintain services; this typically involves data collection and extensive monitoring.
Capture and analyse major metrics, such as availability, mean time between failures and mean time to repair, and develop new metrics and KPIs as necessary. Add these metrics to monitoring dashboards and reporting systems.
Use detailed monitoring to improve the availability and performance of applications, services, systems and infrastructure. Create new alerts to find anomalies and understand the root cause of system failures.
Create and deploy automation, alerting, self-healing architectures and other technologies to make the environment more maintainable.
Monitor, manage and troubleshoot routine processes to improve workflows.
Create and maintain documentation for processes, automation, infrastructure, resources and services.
Act as a subject matter expert and coach team members in troubleshooting and debugging issues.
Skills:
7+ years of experience in Linux operating systems and their administration, along with networking protocols.
Expertise in Docker/Kubernetes orchestration.
Deep understanding of monitoring and alerting tools such as Nagios, New Relic, CloudWatch, PagerDuty, etc.
Working knowledge of scripting languages such as Python, PHP, Bash, etc.
Experience in public clouds (preferably AWS).
Should have knowledge of log analytics techniques and tools.
Passion for learning and implementing new technology.
Should be flexible to provide support during off hours if the need arises.
Expert in at least 2 channels (SMS, WhatsApp, Plugins, RCS, Chatbots, Email and AI): call flows, architecture, troubleshooting.
Qualifications:
(What academic/professional qualifications/registrations does the individual need - if relevant)
Engineering background: BE, BTech
Competencies:
(What competency skill sets does the individual need, e.g. organizational skills, planning, resilience, motivation, teamwork, verbal communication skills etc.)
Strong verbal and written communication skills. Ability to clearly communicate project status/risks/issues as well as technical details.
Highly organized, detail-oriented, focused.
Strong analytical skills.
Certifications:
1. AWS DevOps / System Administrator certifications.
2. Certified Kubernetes Administrator (CKA)