WEX
DevOps
Site Reliability Engineer-3
Brazil
Posted today
WEX is hiring a Senior Site Reliability Engineer (SRE) to lead initiatives that improve the reliability, scalability, and efficiency of its systems. The role involves designing automation, enhancing observability, and guiding engineering teams on reliability engineering best practices.
Location: Brazil
Responsibilities
- Lead efforts in designing and implementing scalable and reliable systems.
- Develop advanced automation strategies to reduce manual work.
- Conduct detailed postmortems and implement permanent fixes.
- Mentor junior engineers and promote best practices across teams.
- Improve incident response processes and drive MTTR (Mean Time to Recovery) reductions.
- Optimize cloud infrastructure costs and resource utilization.
- Influence SRE culture and process improvements.
Requirements
- 5+ years of experience in SRE, DevOps, or software engineering roles.
- Strong programming skills in Python, Go, or Java.
- Experience with scalable and distributed systems.
- Experience with monitoring and logging (Grafana, ELK stack, Splunk, etc.).
- Knowledge of containerization and orchestration (Docker, Kubernetes).
- Advanced cloud automation experience (AWS, Azure, GCP).
- Understanding of CI/CD pipelines and version control systems.
- Knowledge of networking, databases, and storage architectures.
- Knowledge of incident management platforms (e.g., xMatters, PagerDuty, Opsgenie).
- Experience managing production reliability for real-time systems, score computation services, or policy engines.