Site Reliability Engineer Job at Brooksource, San Antonio, TX

  • Brooksource
  • San Antonio, TX

Job Description

Site Reliability Engineer (SRE)

Job Summary

Seeking a highly skilled Site Reliability Engineer to work closely with engineering teams to ensure applications are highly available, meet performance standards, and meet the reliability expectations of business stakeholders. As a Site Reliability Engineer, you will work to identify and deliver automation solutions designed to ensure high availability and resiliency using your expertise in software development, complexity analysis, and scalable system design.

Duties and Responsibilities

  • Monitor system performance, identify areas for improvement, and implement solutions to enhance reliability and availability.
  • Guide architecture and development teams on how to make applications highly available, reliable, and performant at a global scale
  • Collaborate with product owners to implement and monitor key metrics to meet SLOs and SLAs
  • Collaborate with development team members to troubleshoot and resolve problems
  • Drive the Root Cause Analysis of production issues and other failures within the supported application software stack
  • Design, build, and champion automated solutions and tasks to optimize application/service/platform uptime with minimal human intervention
  • Develop tools and processes to monitor cloud resources and applications
  • Use Kubernetes to deploy platform services
  • Create and implement standards and best practices, driving adoption across development teams and external vendors as applicable

Requirements and Qualifications

Expertise and/or relevant experience in the following areas is mandatory:

  • Bachelor's degree or higher in Computer Science or a related technical discipline
  • 4+ years of experience in the deployment, administration, and troubleshooting of large-scale distributed systems
  • 4+ years of experience in automation programming in one or more of the following languages: Python, Go, Rust, or JavaScript (priority is given to Python and Go, but they are not required). Bash does not count as a programming language for this requirement.
  • 4+ years of experience working with Linux terminal tools and writing shell scripts within a Linux environment
  • Strong understanding of SLAs, SLOs, and SLIs
  • Strong understanding of public cloud service concepts
  • Strong understanding of Unix/Linux operating systems internals and administration (Debian is preferred but not required)
  • Strong understanding of networking (e.g. TCP/IP, routing, network topologies, and hardware)
  • Strong experience in debugging and optimizing code and automating routine tasks
  • Strong skills in problem-solving and communication
  • SRE experience including:
      • Monitoring
      • Alert creation and tuning
  • Willing to work and provide support during West Coast hours (9 AM – 6 PM PST)
  • Willing to join an on-call rotation and participate in troubleshooting and communication efforts outside of normal business hours

Expertise and/or relevant experience in the following areas is preferred:

  • Experience with the following or equivalent technologies: Kubernetes, Docker, OpenStack, Relational Databases, NoSQL Databases
  • Strong communication and presentation skills
  • Determination and willingness to take action and achieve results
  • Excellent command of the English language (written and spoken)
  • Excellent organizational skills in planning and prioritizing own workload and initiatives
