Site Reliability Engineer Job at Brooksource, San Antonio, TX

  • Brooksource
  • San Antonio, TX

Job Description

Site Reliability Engineer (SRE)

Job Summary

Seeking a highly skilled Site Reliability Engineer to work closely with engineering teams to ensure applications are highly available, meet performance standards, and meet the reliability expectations of business stakeholders. As a Site Reliability Engineer, you will work to identify and deliver automation solutions designed to ensure high availability and resiliency using your expertise in software development, complexity analysis, and scalable system design.

Duties and Responsibilities

  • Monitor system performance, identify areas for improvement, and implement solutions to enhance reliability and availability.
  • Guide architecture and development teams on how to make applications highly available, reliable, and performant at a global scale
  • Collaborate with product owners to implement and monitor key metrics to meet SLOs and SLAs
  • Collaborate with development team members to troubleshoot and resolve problems
  • Drive root cause analysis of production issues and other failures within the supported application software stack
  • Design, build, and champion automated solutions and tasks to optimize application/service/platform uptime with minimal human intervention
  • Develop tools and processes to monitor cloud resources and applications
  • Use Kubernetes to deploy platform services
  • Create and implement standards and best practices, driving adoption across development teams and external vendors as applicable

Requirements and Qualifications

Expertise and/or relevant experience in the following areas is mandatory:

  • Bachelor's degree or higher in Computer Science or a related technical discipline
  • 4+ years of experience in the deployment, administration, and troubleshooting of large-scale distributed systems
  • 4+ years of experience in automation programming in one or more of the following languages: Python, Go, Rust, or JavaScript (Python and Go are preferred but not required). Bash is not considered a programming language for this requirement.
  • 4+ years of experience working with Linux terminal tools and writing shell scripts within a Linux environment
  • Strong understanding of SLAs, SLOs, and SLIs
  • Strong understanding of public cloud service concepts
  • Strong understanding of Unix/Linux operating systems internals and administration (Debian is preferred but not required)
  • Strong understanding of networking (e.g. TCP/IP, routing, network topologies, and hardware)
  • Strong experience in debugging and optimizing code and automating routine tasks
  • Strong skills in problem-solving and communication
  • SRE experience including:
      • Monitoring
      • Alert creation and tuning
  • Willing to work and support West Coast hours (9 AM – 6 PM PST)
  • Willing to join an on-call rotation and participate in troubleshooting and communication efforts outside of normal business hours

Expertise and/or relevant experience in the following areas is preferred:

  • Experience with the following or equivalent technologies: Kubernetes, Docker, OpenStack, Relational Databases, NoSQL Databases
  • Strong communication and presentation skills
  • Exhibits a determination or willingness to take action and achieve results
  • Excellent command of the English language (written and spoken)
  • Excellent organizational skills in planning and prioritizing own workload and initiatives
