Working to improve the lives of the 350+ million people suffering from rare and complex conditions
2 days ago
🔄 Hybrid – Bay Area
• Design and implement scalable, efficient, and reliable data infrastructure to support our growing clinical, business, and operational data needs.
• Lead the development of real-time and batch data pipelines, ETL processes, and data warehousing solutions optimized for clinical data analytics and business intelligence.
• Optimize data systems for high-performance querying and analysis of clinical, operational, and business datasets.
• Develop data pipelines and infrastructure to support AI solutions, validation, and deployment in clinical and operational settings.
• Stay current with emerging data technologies and methodologies, applying them strategically to enhance our data systems.
• 7+ years of experience as a data engineer with a proven track record of designing, implementing, and managing large-scale data systems.
• Deep expertise in data modeling, data architecture, and modern data warehousing solutions (e.g., ClickHouse, Snowflake, Redshift, or BigQuery).
• Extensive experience with data pipeline tools and frameworks such as Apache Airflow, Spark, and Kafka, and with ETL technologies.
• Strong skills in data visualization tools and BI platforms (e.g., Superset, Tableau, Power BI).
• Strong proficiency in SQL and expertise in programming languages such as Python or R.
• Proven ability to optimize data systems for performance and cost-efficiency, particularly for large-scale clinical and business datasets.
• Hands-on experience with cloud platforms (e.g., AWS, Google Cloud, Azure) and their associated data services.
• Experience with CI/CD practices for data pipelines and infrastructure-as-code.