Senior Site Reliability Engineer

📍 São Paulo - SP. Address: Nações Unidas 12901, CEP 04578-900. Published 21/04/2026. Information Technology
Remote / Home office
About the job

First Help Financial (FHF) is a fast-growing and culturally diverse company in the U.S.

We provide auto loans to the underserved and care for our customers and partners with exceptional service.

Through flexible financing options and trilingual support, we offer consumers an easier way to finance their first car.

We lend to and support our portfolio, which has consistently grown by more than 30% each year over the last nine years.

Here you will find hard-working colleagues who come from over 20 countries.

We hold ourselves to the highest standards of professionalism but also enjoy our work.

Our culture and benefits are geared towards making you successful in life and comfortable at work.

Location

São Paulo - SP

Hybrid. Brazil HQ address: Nações Unidas 12901, São Paulo - SP (this role requires one day per week in the office on a rotating schedule, plus one additional day in the office each month).

Responsibilities

  • Ensure the availability, performance, and reliability of the Loan Origination System (LOS) through proactive monitoring and incident response.
  • Partner with product and engineering teams to define and maintain SLOs/SLAs, introduce error budgets, and drive accountability.
  • Collaborate on architectural improvements aimed at increasing resilience, scalability, and observability.
  • Lead incident analysis and postmortems, and implement preventive actions.
  • Design, build, and operate infrastructure as code (IaC) using Terraform.
  • Improve observability tooling and practices using Datadog, enhancing alerting, tracing, and system dashboards.
  • Participate in on-call rotations and respond to production incidents.
  • Automate operational processes and promote a DevOps culture across squads.

Requirements

  • 5+ years of experience in Site Reliability Engineering or DevOps roles.
  • Proven experience managing and improving production systems in a cloud-native environment (preferably AWS).
  • Strong experience with observability tools and practices.
  • Experience defining and driving adoption of SLIs, SLOs, and SLAs.
  • Experience in operating event-driven systems and distributed architectures.
  • Solid understanding of Terraform and infrastructure as code best practices.
  • Strong debugging and troubleshooting skills across the stack.
  • Comfortable writing and reviewing production-grade code (preferably in Java).
  • Excellent written and verbal communication in English.
  • A pragmatic and collaborative mindset, with a passion for system reliability and operational excellence.
  • Bachelor's degree in computer science or similar fields preferred.

Benefits

  • Generous salaries
  • Monthly lunches
  • Employee recognition and talent development program
  • Healthy work-life balance
  • Career growth opportunities
  • Commitment to diversity and inclusion

Work arrangement

Hybrid

Contract type

CLT

Tech Stack

  • Languages: Java (Spring Boot).
  • Cloud: AWS (Lambda, Kinesis, S3, EC2).
  • IaC & CI/CD: Terraform, AWS CodePipeline.
  • Databases: MongoDB Atlas.
  • Observability: Datadog.
  • Event-driven architecture: Kinesis Streams, Lambdas.
  • Version control & Collaboration: GitHub, Slack, Confluence, Jira.
