We are seeking a talented and proactive Site Reliability Engineer to join our Cloud Security and Infrastructure (CSI) team.
This role involves ensuring the reliability, scalability, and performance of applications and infrastructure, with opportunities to work with cutting-edge containerization and cloud technologies.
Join our team and play a critical role in building and maintaining secure, observable, and scalable systems!
Responsibilities
- Create and manage applications, containerize them using tools like Docker or Podman, and troubleshoot logs to trace events
- Develop and deploy Kubernetes resource manifests into clusters such as Kind, GKE, or AKS
- Set up and configure Prometheus agents for monitoring infrastructure and application behavior, while defining alerts based on metrics
- Collaborate with teams to maintain robust CI/CD pipelines using Azure DevOps or GitOps frameworks like Helm and ArgoCD
- Ensure the reliability and scalability of distributed systems by monitoring, debugging, and optimizing system performance
- Utilize infrastructure-as-code tools, such as Terraform, to manage cloud environments
Requirements
- 2+ years of hands-on programming experience paired with proficiency in at least one scripting language
- Proficiency in Microsoft Azure
- Competency in Kubernetes and Linux
- Knowledge of observability principles and familiarity with tools such as Prometheus for monitoring applications
- Experience with Azure DevOps CI/CD pipelines and/or GitOps workflows using Helm or ArgoCD
- Background in using Terraform to design and manage cloud infrastructure
- English level B1+ for effective communication
Nice to have
- Showcase of Azure DevOps experience
- Experience with Google Cloud Platform
- Familiarity with Prometheus and related observability tools
- Background in service mesh tools, particularly Istio
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn
Seniority level
Associate
Employment type
Full-time
Job function
Engineering, Information Technology, and Business Development
Industries
Software Development, IT Services and IT Consulting, and Nanotechnology Research