Overview
· Site Reliability Engineering:
· Serve as the primary contact responsible for ensuring application scalability, performance, and resilience.
· Practice sustainable incident response and blameless post-mortems while taking a holistic approach to problem solving and optimizing time to recover.
· Automate data-driven alerts to proactively escalate issues. Work with development teams to establish SLOs and improve reliability.
· DevOps/Automation:
· Tackle complex development, automation, and business process problems. Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation, and refinement.
· Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices.
· Increase automation and tooling to reduce toil and manual intervention.
· ITSM Practices:
· Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
Role Qualifications
The ideal candidate will have experience in many of these areas:
· BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience.
· Coding or scripting exposure.
· Appetite for change and pushing the boundaries of what can be done with automation. Be curious about new technology, infrastructure, and practices to scale our architecture and prepare for future growth.
· Experience with algorithms, data structures, scripting, pipeline management, and software design.
· Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
· Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
· Willingness and ability to learn, take on challenging opportunities, and work as a member of a matrix-based, diverse, and geographically distributed project team.
· Ability to balance doing things right with fixing things quickly. Flexible and pragmatic, while working towards improving the long-term health of the system.
· Comfortable collaborating with cross-functional teams to ensure that expected system behavior is understood and that monitoring exists to detect anomalies.
SRE, Hadoop, Cloud, UNIX scripting
3-5 years minimum experience
Job Type: Full-time
Pay: Up to ₹3,000,000.00 per year
Schedule:
- Day shift
Work Location: In person
Speak with the employer
+91 9594829089