VetJobs - The Leading Military Job Board

Job Information

Meta SiteOps Global Systems Engineer in Dublin, Ireland

Summary:

Meta is seeking a forward-thinking, experienced Data Center Systems Engineer to join the Data Center Site Operations team. Our data centers, and the tens of thousands of servers installed in them, are the foundation upon which our rapidly scaling infrastructure efficiently operates and upon which our innovative services are delivered. Meta is at the leading edge of the global data center industry in both how data centers are designed and how they are operated. This person should enjoy working in a fast-paced environment where adaptability and flexibility will be key to their success. The candidate we seek is an IT professional with deep experience using diverse software tools to identify automation solutions for complex operational issues. This role is deeply cross-functional and considers the technical needs of frontline users to identify and automate diagnostic tooling that enables high-quality, efficient delivery of production servers. The candidate should be able to perform deep data analysis to drive decisions on the top priorities for automating server repairs in a hyperscale environment. The successful candidate will be an engineer who can drive solutions with code, through deep collaboration and crisp, timely communication with globally diverse teams. Extensive knowledge of managing servers, programming/scripting languages, and Linux systems is required.

Required Skills:

SiteOps Global Systems Engineer Responsibilities:

  1. Identify and root-cause systemic issues in the fleet and drive resolution. Help deliver maximum server fleet uptime and utilization rates by leveraging data to understand hardware failure conditions and their root causes.

  2. Write and review code, develop documentation, and debug the hardest problems, live, on some of the largest and most complex systems in the world.

  3. Own and develop the diagnostic tooling requirements needed to run the fleet.

  4. Own and drive the escalation process for Site Operations to identify, root-cause, and solve complex tooling and hardware issues affecting the fleet.

  5. Execute operational validation and verification activities for new product integration.

  6. Through consistent collaboration with cross-functional tooling teams, help determine root causes and provide input into their development process, with an operations-centric view of how open issues are affecting the fleet.

  7. Build cross-functional relationships and influence policies and procedures to improve global data center operations.

  8. Mentor team members to evaluate and identify better ways to resolve issues and define updates to tools and processes.

  9. Ability to travel up to 25% required.

Minimum Qualifications:

  1. Engineering degree or commensurate experience

  2. 7+ years of experience in systems infrastructure operations or related field

  3. Experience coding in higher-level languages (e.g., Python, PHP, C++, or Java)

  4. Experience in configuration and maintenance of applications such as web servers, load balancers, relational databases, storage systems and messaging systems

  5. Experience learning software, frameworks and APIs

Industry: Internet
