AWS Cloud Engineer

Global Remote

Company Overview

BotCity is building the future of automation with the Governance Platform for Python automations and AI Agents. We empower enterprises to innovate at scale, bringing governance, control, and observability to every automation project. Our philosophy is simple: automation is software, and software deserves the same high-code standards that drive innovation in AI and machine learning.

We recently raised a $12M Series A, led by Four Rivers with participation from Y Combinator, SoftBank, and top industry leaders such as Lew Cirne (New Relic), Rod Johnson (SpringSource), and Walter Kortschak (Summit Partners | Firestreak Ventures). With 1,000+ customers in 70+ countries, including Bayer and LG, and recognition by G2 (2024) as one of the world’s top 25 emerging platforms, BotCity is scaling fast.

We’re a global remote company with teams across the US and LATAM, united by a shared vision to redefine how enterprises build and manage automation. If you’re looking for an environment that values impact, autonomy, and excellence, we’d love for you to join us on this journey.

 
Role Overview

The AWS Cloud Engineer will own BotCity’s cloud infrastructure in AWS, ensuring performance, security, cost-efficiency, and scalability for all our services. This is a foundational hire who will design, automate, and evolve the infrastructure, build guardrails and best practices, and act as the internal AWS authority. The ideal candidate is both hands-on and strategic, with strong operational judgment and a passion for infrastructure excellence. This role will report to the VP of Engineering.

 
Responsibilities

  • Architect, provision, and maintain AWS infrastructure (compute, storage, networking, databases, VPCs).
  • Design, build, and manage Infrastructure as Code (IaC) modules (e.g., with Terraform or CloudFormation).
  • Automate deployment pipelines, blue/green or canary strategies, infrastructure provisioning, and configuration management.
  • Monitor, log, and alert on infrastructure health (CloudWatch, Prometheus, etc.), and maintain dashboards and SLIs/SLOs.
  • Optimize resource usage and cost (right-sizing, reserved instances, scaling strategies, spot instances).
  • Lead fault-tolerance, disaster recovery, backup, and high-availability design and testing.
  • Participate in root-cause analysis of incidents, own postmortems, and drive improvements.
  • Define and enforce infrastructure standards, security controls, tagging, policies, and guardrails.
  • Collaborate with developers, security, and operations teams to ensure feature architecture aligns with infrastructure best practices.
  • Assist in migrations/refactoring of existing systems or services to better infrastructure patterns.
  • Keep up to date with AWS service updates, evaluate new offerings, and propose adoption when beneficial.
  • Document architecture, runbooks, operational playbooks, and best practices.

Requirements

Required Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent.
  • Experience (5+ years) with AWS infrastructure in production systems.
  • AWS certifications (Solutions Architect - Professional or Specialty).
  • Proven track record in operating highly available, scalable systems in AWS (elastic computing, storage, networking).
  • Experience with Infrastructure as Code (Terraform, CloudFormation, CDK or equivalent).
  • Experience with AWS security suite and tools (Inspector, Security Hub, GuardDuty).
  • Good knowledge of networking (VPCs, routing, security groups, load balancers) and storage (EBS, S3, caching).
  • Experience in cost optimization strategies, capacity planning, and scaling.
  • Experience with scripting (Bash, Python).
  • Familiarity with monitoring, logging, metrics, and observability tools.
  • Solid troubleshooting skills and experience with incident management and postmortem processes.
  • Strong communication skills, including the ability to explain infrastructure decisions to developers, leadership, and cross-functional teams.
  • Experience working with MS Office/Excel, Google Suite, Notion, Slack.
  • Ability to travel as needed to support events and meet the team.
  • Portuguese - Fluent.
  • English - Fluent.

 
Preferred Qualifications

  • Experience with hybrid or multi-cloud environments.
  • Experience with advanced observability (distributed tracing, OpenTelemetry).
  • Experience with disaster recovery and chaos engineering practices.
  • Prior experience in an early-stage, high-growth, fast-paced startup or technology company.
