Information Technology
Full-Time
DATAECONOMY
Overview
About Us
About DATAECONOMY: We are a fast-growing data & analytics company headquartered in Dublin with offices in Dublin, OH, Providence, RI, and an advanced technology center in Hyderabad, India. We are clearly differentiated in the data & analytics space via our suite of solutions, accelerators, frameworks, and thought leadership.
Job Description
Job Summary:
We are seeking an experienced Observability Engineer with a strong DevOps background to design, implement, and manage observability solutions across cloud and on-prem environments. The ideal candidate will have expertise in monitoring, logging, tracing, and alerting to ensure high system availability, performance, and reliability.
Key Responsibilities
- Design & Implement Observability Solutions: Develop and maintain monitoring, logging, and tracing solutions using industry-leading tools (Prometheus, Grafana, Datadog, New Relic, Splunk, etc.).
- Performance Monitoring & Optimization: Ensure proactive identification and resolution of performance bottlenecks in distributed systems.
- Logging & Tracing: Set up and manage centralized logging solutions (ELK/EFK stack, Fluentd, OpenTelemetry).
- Alerting & Incident Management: Configure alerting mechanisms using tools like PagerDuty, Opsgenie, or VictorOps for proactive issue detection.
- SRE Practices: Implement Site Reliability Engineering (SRE) principles to enhance system reliability and reduce MTTR (Mean Time to Resolution).
- Automation & Infrastructure as Code (IaC): Automate observability setup and configurations using Terraform, Ansible, or similar tools.
- Cloud & Kubernetes Monitoring: Implement observability best practices for cloud platforms (AWS, Azure, GCP) and containerized environments (Kubernetes, Docker).
- Collaboration: Work closely with development, SRE, and operations teams to ensure end-to-end observability of applications and services.
- Compliance & Security: Ensure logging and monitoring solutions adhere to security and compliance requirements.
Required Skills & Qualifications:
- 6-10 years of experience in DevOps, SRE, or Observability engineering.
- Strong hands-on experience with observability tools like Prometheus, Grafana, New Relic, Datadog, Splunk, ELK/EFK, OpenTelemetry, AppDynamics, etc.
- Experience in setting up distributed tracing solutions (Jaeger, Zipkin, OpenTelemetry).
- Expertise in Kubernetes monitoring using Prometheus, Thanos, Loki, or similar tools.
- Strong proficiency in scripting (Python, Bash, Shell) for automation.
- Hands-on experience with Terraform, Ansible, Helm, or CloudFormation for infrastructure automation.
- Proficiency in CI/CD pipelines and GitOps methodologies using Jenkins, GitLab CI, Argo CD, or Flux.
- Experience in public cloud environments (AWS, Azure, GCP) and monitoring cloud-native services.
- Strong troubleshooting and root cause analysis (RCA) skills.
- Understanding of SLIs, SLOs, and error budgets as part of SRE best practices.
- Familiarity with log management, anomaly detection, and AI-based observability solutions is a plus.
As per company standards.