LTG, Engineering
Cloud Infrastructure Engineer (Open LMS) Colombia, Remote
Remote (Colombia)
Posted 7 days ago
Senior Cloud Infrastructure Engineer needed to build, scale, and evolve a multi-tenant SaaS hosting platform on AWS, supporting hundreds of Moodle LMS instances with custom orchestration, distributed service discovery, and infrastructure as code.
Location: Remote (Colombia)
Responsibilities
- Designing, building, and maintaining AWS infrastructure using Terraform (EC2, RDS, S3, SQS, Lambda, ALB, ElastiCache, Route 53, VPC networking)
- Writing and maintaining Puppet modules to configure and manage fleets of EC2 instances across multiple auto-scaling groups
- Maintaining and extending Python-based automation and tooling that supports platform operations
- Operating and improving distributed service discovery and configuration management (etcd)
- Managing and tuning a multi-tier caching strategy (Varnish, Redis/Valkey, PHP OPcache)
- Running and scaling our observability stack (Prometheus, Grafana, Loki, Fluentd, PagerDuty) and participating in on-call rotations
- Evaluating and implementing distributed storage solutions as the platform evolves
- Improving deployment workflows and release processes
- Collaborating with internal teams on API contracts, integration patterns, and operational tooling
- Participating in incident response, root cause analysis, and platform reliability improvements
Requirements
- Strong experience with AWS services in production — particularly EC2, RDS, S3, SQS, Lambda, ALB, ElastiCache, Route 53, IAM, and VPC networking
- Proficiency in authoring and maintaining Terraform modules for production infrastructure
- Proficiency in authoring and maintaining Puppet modules (or equivalent agent-based configuration management) for fleet management
- Solid Python skills — you'll be writing and maintaining production daemons, not just scripts
- Deep Linux systems knowledge (Ubuntu) — comfortable with Apache/Nginx, PHP-FPM, Varnish, systemd, filesystem mounts, and networking fundamentals
- Understanding of distributed systems concepts: consensus, leader election, distributed locking, eventual consistency, and the tradeoffs involved
- Proficiency in building and maintaining observability pipelines (Prometheus, Grafana, Loki, or equivalent) in production
- Comfortable working in a GitLab-based CI/CD workflow
- Clear communicator who can document architectural decisions and explain technical tradeoffs to both technical and non-technical stakeholders