Our Ideal Candidate

Tookitaki is looking for a Data Engineer who is familiar with the Hadoop platform and can design, implement, and maintain optimal data/machine learning (ML) pipelines on it.

Responsibilities

  • Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
  • Driving optimization, testing, and tooling to improve quality.
  • Reviewing and approving high-level and detailed designs to ensure that solutions deliver on business needs and align with data & analytics architecture principles and the roadmap.
  • Understanding business requirements and solution designs in order to develop and implement solutions that adhere to big data architectural guidelines.
  • Following a proper SDLC (code reviews, sprint process).
  • Identifying, designing, and implementing internal process improvements: automating
    manual processes, optimizing data delivery, etc.
  • Building robust and scalable data infrastructure (both batch processing and real-time) to support the needs of internal and external users.
  • Understanding data security standards and using appropriate security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
  • Supporting and contributing to development guidelines and standards for data ingestion.
  • Working with data scientists and the business analytics team to assist with data ingestion and data-related technical issues.
  • Designing and documenting the development & deployment flow.

Requirements

  • Experience in developing REST API services using one of the Scala frameworks.
  • Ability to troubleshoot and optimize complex queries on the Spark platform.
  • Expertise in building and optimizing big data and ML pipelines, architectures, and data sets.
  • Knowledge of modeling unstructured data into structured data designs.
  • Experience in big data access and storage techniques.
  • Experience in cost estimation based on design and development effort.
  • Excellent debugging skills across the technical stack mentioned above, including analysis of server and application logs.
  • Highly organized, self-motivated, and proactive, with the ability to propose the best design solutions.
  • Good time management and multitasking skills, with the ability to meet deadlines both independently and as part of a team.
  • Ability to analyze and understand complex problems.
  • Ability to explain technical information in business terms.
  • Ability to communicate clearly and effectively, both verbally and in writing.
  • Strong in user requirements gathering, maintenance, and support.
  • Excellent understanding of Agile methodology.
  • Good experience in Data Architecture, Data Modeling, and Data Security.

Experience (Must have)

  • Scala: Minimum 2 years of experience
  • Spark: Minimum 2 years of experience
  • Hadoop: Minimum 2 years of experience (security, Spark on YARN, architectural knowledge)
  • HBase: Minimum 2 years of experience
  • Hive: Minimum 2 years of experience
  • RDBMS (MySQL / Postgres / MariaDB): Minimum 2 years of experience
  • CI/CD: Minimum 1 year of experience

Experience (Good to have)

  • Kafka
  • Spark Streaming
  • Apache Phoenix
  • Caching layer (Memcached / Redis)
  • Spark ML
  • Functional programming (Cats / Scalaz)

Qualifications

Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics, or equivalent, with at least 2 years of experience in big data systems such as Hadoop, as well as cloud-based solutions.

Job Perks

  • Attractive variable compensation package
  • Flexible working hours - everything is results-oriented
  • Opportunity to work with an award-winning organization in the hottest space in tech: artificial intelligence and advanced machine learning

Get in touch with us

Fill out the form to tell us about yourself and your enquiry, and one of our team will be in touch with you shortly.