Company Background

Our client is a pioneer in making bioinformatics techniques available to scientific researchers who are not comfortable with programming.

Since 2015, it has brought together a group of scientists and engineers with extensive experience in using statistical and machine learning methods to mine heterogeneous data sources.

They have helped more than 20 organizations build trustworthy data analytics pipelines on top of omics data.

Project Description

The goal of the project is to build a solution that addresses the challenges of the existing one:

  • Standardize data architecture approach and data (pre)processing pipelines;
  • Improve data platform scalability;
  • Streamline ingestion of heterogeneous data sources and make integration of the new ones faster;
  • Define a common approach to metadata management and data quality;
  • Enable smooth integration with both ML notebooks and analytical reports.

Technologies

  • Azure: Blob Storage, IAM and Security, Databricks, Data Lake Storage, Data Factory, SQL Database, and Data Warehouse;
  • Python/SQL;
  • Neo4j/Cypher.

Job Requirements

  • 3+ years of experience in developing, implementing, and maintaining data integration solutions;
  • 2+ years of experience with the Azure stack;
  • Understanding of data engineering best practices and methodologies;
  • Good verbal and written communication skills in English;
  • Experience in building solution architecture;
  • Experience in creating technical documentation;
  • Desire and ability to understand the business domain in the detail required to create fully functional code;
  • Experience with cloud services is preferred;
  • Experience with NoSQL databases is preferred.
Write to us.
We will be sure to reply!
Apply via: linkedin.com hh.ru
