Data Engineer

South San Francisco, CA 94080

Posted: 08/14/2018
Employment Type: Contract
Industry: Information Technology
Job Number: 4637

Our client is a leading biotechnology company that discovers, develops, manufactures and commercializes medicines to treat patients with serious or life-threatening medical conditions. They are among the world's leading biotech companies, with multiple products on the market and a promising development pipeline.


  • The Early Clinical Development Operations (ECD Ops) department is seeking a motivated Data Engineer with experience in data architecture to help further the development of ECD Operations data services. This individual will work in the ECD Ops Information Management Office (IMO) and will be accountable for providing engineering expertise in the delivery and optimization of gCORE, the organization's data lake and data warehouse.
  • The role will require cross-functional interactions with Data Management Leads, Clinical Study Teams, Predictive Analytics, Artificial Intelligence and Information Technology teams to drive data acquisition and data operations projects as well as data platform technology needs. A great candidate is eager to solve complex problems with data, is skilled in managing databases and developing data pipelines, and has a passion for learning new skills to meet organization-wide data needs.


  • Architect solutions that will transform data into an analyzable format for data scientists, data operations processes and analytical tools / dashboards
  • Work with external suppliers including clinical sites and CROs to define and design data integrations
  • Develop and optimize big data pipelines for data scientists
  • Develop ETL workflows using data warehouse ETL tools for production processes, such as data quality monitoring and cleansing in coordination with IT
  • Perform hands-on infrastructure design of ECD's data lake and data warehouse environment (gCORE) including continuous exploration and recommendation of new technologies and best practices
  • Communicate synthesized data quality findings to business and technical team members, senior leaders and external stakeholders
  • Research and recommend new innovative methods and systems to manage data for business improvement
  • Contribute to internal governance teams to drive the data quality business cycle and roadmap




  • Bachelor's or Master's degree in computer science or software engineering
  • 5+ years of programming experience in one or more languages such as Java, Python, C++ or Scala
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra
  • Experience building and optimizing big data pipelines using Spark or other similar technologies
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift
  • Solid understanding of how to design robust data workflows including optimization and user experience
  • Strong analytical and problem-solving skills
  • Excellent oral and written communication skills
  • Able to work in teams and collaborate with others to clarify requirements
  • Strong coordination and project management skills to handle complex projects
  • Experience developing and working with XML, JSON, and external web services

Preferred Qualifications

  • Clinical drug development domain knowledge
  • Experience with clinical data and systems such as Medidata RAVE, Siebel CTMS, IxRS
  • Experience with scientific data such as genomics and imaging data
  • Experience with data quality software such as Informatica, Paxata, Alteryx, Data Monarch or a similar class of tools
  • Competencies in applied statistics to solve business needs
  • Knowledge of industry data standards used in drug development, particularly in clinical development


