August 10
🏢 In-office - Bay Area
• Manage a team of 7-10 employees that oversees all aspects of reliability and maintains the highest levels of uptime to support millions of users, including: monitoring site health and reliability; operating automated observability and alerting to maintain uptime; driving service restoration during critical incidents; and reviewing post-incident action items for value and completion
• If needed, participate in the on-call rotation to better understand current SiteOps daily activities and run-the-business work, and to look for opportunities to improve both internally and for your customers
• Partner across the organization to drive Operational Intelligence and best practices in operational development and reliability
• Create tooling that increases velocity for the business and developers, exposes site health and reliability, and automates and streamlines processes in the organization
• Report actionable improvements, with supporting data, to scale Credit Karma systems and applications, both web and native
• Lead with empathy and grow the careers of a diverse team of talented engineers across multiple locations
• Bachelor’s degree in Computer Science or a related field, or equivalent experience
• 7+ years of engineering management experience
• Experience leading the operations team (DevOps or NOC) of a customer-facing, mission-critical application
• Maintains focus on the internal customer and on data-driven partnership in SDLC and QE to improve production operations
• Enjoys working on a cross-functional team
• Strong technical leadership skills, with the ability to remain focused and calm under pressure
• Experience running critical incidents in a global or company-wide context, engaging with executives and senior leadership, and leading root cause analysis sessions
• Experience running and monitoring applications at scale, using metrics and tracing tools like New Relic, Datadog, Stackdriver, Zipkin, Prometheus, etc.
• Ability to create and drive SLOs/SLAs at an enterprise level
• Experience developing production-quality tooling
• Familiarity with SRE methodologies; passionate about solving operational challenges by using automation and software
• Ability to communicate effectively, both vertically and horizontally within the organization, with demonstrated written and verbal communication skills
• Medical and Dental Coverage
• Retirement Plan
• Commuter Benefits
• Wellness Perks
• Paid Time Off (Vacation, Sick, Baby Bonding, Cultural Observance, & More)
• Education Perks
• Paid Gift Week in December