Big Data Engineer - InsurTech

Oliver James Associates Limited
United Kingdom
$80k - $110k pa
06 Oct 2017
07 Nov 2017
Contract Type: Full Time
A rare opportunity to join a rapidly growing InsurTech firm financially backed by an industry leader. This position offers the chance to be part of a global success story that continues to evolve. You will have a deep understanding of relational database technologies and 3+ years' experience building large-scale data pipelines with Hadoop/Spark. Strong computer science fundamentals are advantageous, and you will have a driven and ambitious mindset.

Job Responsibilities
- Work with architects, business partners and business analysts to understand requirements, then design and build effective solutions.
- Apply data engineering skills within and outside of the developing information ecosystem for discovery, analytics and data management.
- Use data-wrangling techniques to convert data from one "raw" form into another, including data visualization, data aggregation, training statistical models etc.
- Create different levels of abstraction of the data depending on analytics needs.
- Carry out hands-on data preparation using the Hadoop technology stack.
- Implement discovery solutions for high-speed data ingestion.
- Work closely with the data leadership team to perform complex analytics and data preparation tasks.
- Work with various relational and non-relational data sources, with the target being Hadoop-based repositories.
- Source data from multiple applications; profile, cleanse and conform it to create master data sets for analytics use.
- Design solutions for managing highly complex business rules within the Hadoop ecosystem.
- Performance-tune data loads.
- Leverage visual analytics tools to communicate the results of data analysis.

Skills Required
- 3-5 years of solid experience in Big Data technologies (a must).
- A computer science or related educational background.
- Knowledge of the Hadoop 2.0 ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Mahout, Spark etc. (a must).
- Significant programming experience with the above technologies as well as Java, R and Python on Linux (a must).
- Knowledge of a commercial distribution such as Hortonworks, Cloudera or MapR (a must).
- Excellent working knowledge of relational databases, HBase etc.
- Data visualization tool experience (a plus).
- Natural Language Processing (NLP) skills with experience in Apache Solr and Python (a plus).
- Knowledge of high-speed data ingestion, real-time data collection and streaming (a plus).