DE Jobs


Job Information

Google Product Hardware Reliability Engineer, Global Hardware Reliability Engineering in Sunnyvale, California

Minimum qualifications:

  • Bachelor's degree in Electrical/Industrial/Mechanical Engineering or equivalent practical experience.

  • 10 years of experience in reliability engineering of cloud infrastructure hardware and technology, dealing with failure analysis and fault isolation techniques and applying them to isolate root causes.

  • 7 years of experience with system level reliability tools (RBDs, MCFs, HPPs, NHPPs).

Preferred qualifications:

  • Master's degree or PhD in Electrical/Industrial/Mechanical Engineering or equivalent practical experience.

  • Experience leading cross-functional problem-solving teams using practical approaches.

  • Knowledge of Industry Test Standards (e.g., JEDEC, ASTM, IEEE).

  • Understanding of Physics of Failure and Reliability Physics.

  • Ability to lead teams to meet corporate and customer reliability expectations and to communicate effectively with the project team, with excellent people management skills.

Google has one of the largest and most powerful computing infrastructures in the world. Your team is responsible for providing the manufacturing capability to deliver this state-of-the-art physical infrastructure. As a Manufacturing Engineer, you evaluate the product designs and create the processes, tools and procedures behind Google's powerful search technology. When vendors build parts for our infrastructure, you're right there alongside ensuring manufacturing processes are repeatable and controlled. You collaborate with Commodity Managers and Design Engineers to determine Google's infrastructure needs and product specifications. Your work ensures the various pieces of Google's infrastructure fit together perfectly and keep our systems humming along smoothly for a seamless user experience.

Google Cloud is responsible for providing the hardware design and the manufacturing capability to deliver state-of-the-art physical infrastructure that powers Machine Learning applications, among others. Quality and Reliability are the foundational cornerstones for the success of this complex offering.

As a Reliability Engineer, you will lead new product introduction (NPI) reliability-related activities between our Engineering teams, Contract Manufacturers (CM), and suppliers for Machine Learning hardware, identifying and managing risks, and clearly communicating project deliverables and status to stakeholders. You will evaluate complex product designs and provide insights to the design teams on potential improvements and tradeoffs. You will create procedures and tools to drive product development and manufacturing in a fast-paced environment while focusing on the root causes of failure. You will also evaluate the reliability status of the fleet and support product improvement initiatives. Finally, you will work with external partners to ensure their products meet our customer expectations.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.

The US base salary range for this full-time position is $134,000-$198,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google (https://careers.google.com/benefits/).

Responsibilities:

  • Lead system design analysis to enable evaluations and product de-risk at an early stage of development.

  • Enable the implementation of the reliability plan and lead efforts to assess and mitigate risk of failure early during NPI.

  • Lead system reliability monitoring efforts and flag unwanted system behavior. Extract field reliability data and drive failure analysis efforts, identification of root causes of failure, and creation of actionable insights.

  • Maintain relationships with outside partners, testing labs, cross-functional internal groups, and Contract Manufacturer (CM) partners, while developing in-house test and qualification capabilities where needed.

  • Lead system reliability efforts by working with other organizations to define reliability goals and plans, securing the resources needed to execute the plan. Drive reliability test plans and collect, analyze, and synthesize the data to enable verification of the design reliability goals.

Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also https://careers.google.com/eeo/ and https://careers.google.com/jobs/dist/legal/OFCCPEEOPost.pdf. If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form: https://goo.gl/forms/aBt6Pu71i1kzpLHe2.
