CloudFactory is scaling up and growing fast, with $78 million in venture capital investment and offices on four continents. We're looking for talented people to join us on our mission as we earn, learn, and serve our way to becoming leaders worth following.
As we continue to scale, we are seeking a Principal Data Engineer to be responsible for ensuring that all data captured, generated, stored, retained and distributed by the CloudFactory platform is modelled and managed appropriately. It is also their responsibility to create a culture of data awareness in CloudFactory's engineering team.

Key Responsibilities
- Overall ownership of the platform data model and associated storage, access and backup technologies.
- Design and ownership of internal data export processes and technologies.
- Consultant to engineers and architects when designing new techniques for ingesting and storing data.
- Liaison with security and data roles throughout the organization to promote standard processes and solutions.
- Supporting Client Technology Solutions and Sales.
- Promoting data-related skills in Platform Engineering and establishing data-related career paths and best practices.
- Interpreting regulatory and compliance obligations to establish how they affect the platform design and how the flow of data out of the platform should be managed.
- Leading projects and teams of engineers to create data solutions for various departments within the organization.
Assess and report on the state of CloudFactory's existing software platform and development processes. During the first six weeks, work with all members of the Platform Engineering team to understand and report on how data is handled in the current CloudFactory platform. This report will include a 'first impressions' analysis of any areas of concern with respect to data protection, security and safety.

Understand CloudFactory's regulatory and compliance obligations. CloudFactory is a complex entity with employees, workers and clients from many different countries, and it voluntarily submits to various compliance and certification programmes. Within two months of starting, the Principal Data Engineer will need to understand the implications of these obligations on data handling and protection within the CloudFactory platform.

Develop/validate standard models and architectures. The CloudFactory platform generates a significant amount of data, but these generation and storage use cases tend to fall into a small number of patterns. Some work has been done in defining standard technology, modelling and backup/recovery approaches for these patterns. Within three months of starting, validate and document these patterns, or propose alternatives where they do not exist or are insufficient.

Establish a working integration with CloudFactory's Enterprise Data Warehouse. There is a desire for data stored in the CloudFactory platform to be made available in the existing Enterprise Data Warehouse (Snowflake). Within the first six months, design and implement a pattern whereby data is made available to the Enterprise Data Warehouse by default in a safe, sustainable way.

Establish and implement an appropriate schema/data dictionary approach for the entire platform. CloudFactory runs an increasingly distributed functional and data architecture, and there are few central resources for understanding what data is stored where, which has implications when CloudFactory needs to adopt a new compliance regime. Determine an appropriate and sustainable model for cataloguing this data and make significant progress on implementing it within the first four months.

Define data roles and a career path. Within the first six months, work with the Director of Engineering to define the data-related roles required by CloudFactory engineering and how CloudFactory should recruit into those roles or develop existing staff into them.

Essential Requirements
- Experience with the design and implementation of web-scale relational data models
- Experience with the design and implementation of web-scale document/NoSQL data models
- Experience with the design and implementation of data warehouse/aggregation solutions
- Experience with the design and implementation of modern or traditional ETL processes
- Hands-on experience with AWS
- Translating regulatory and compliance requirements into technical guidance
- Bachelor's / undergraduate degree in Computer Science, IT, or a similar field; a Master's is a plus
- Data engineering certification is a plus
- Experience with the design and implementation of search engines (especially ElasticSearch/Lucene)
- Understanding of data-streaming technologies (e.g. Kinesis/Kafka)
- Experience with the core AWS data stack and associated products: S3, Glue, Redshift, Athena, QuickSight
- Experience with a BI tool (report and visualization design)
- Background in data development
- Understanding of the role and needs of data science
CloudFactory is a global leader in combining people and technology to provide workforce solutions for machine learning and business process optimization. Our professionally managed and trained teams work with high accuracy using virtually any tool. We process millions of tasks a day for innovators including Microsoft, GoSpotCheck, Hummingbird Technologies, Ibotta and Luminar. We exist to create meaningful work for one million talented people in developing nations, so we can earn, learn, and serve our way to become leaders worth following.
Join us, and change the world for the better. If you are skilled and humble, with a commitment to lifelong learning, and you're curious about the world and its people, you could be a good fit at CloudFactory. We welcome the unique contributions you can bring to help us build a diverse, inclusive workplace because we connect, learn, and grow stronger from our differences. We want you to bring your whole, authentic self to work. We look forward to hearing from you!
Still unsure? Read '5 Reasons You Should Work at CloudFactory'.