In Brief:

Tookitaki is looking for a Data Engineer who is familiar with the Spark platform and is able to design, optimize, and maintain data/machine learning (ML) pipelines on it. The following are the main responsibilities of the role:


  • Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
  • Driving optimization, testing, and tools to improve quality.
  • Reviewing and approving high-level and detailed designs to ensure that the solution meets business needs and aligns with the data & analytics architecture principles and roadmap.
  • Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
  • Following proper SDLC practices (code reviews, sprint process).
  • Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
  • Building robust and scalable data infrastructure (both batch processing and real-time) to support the needs of internal and external users.
  • Understanding data security standards and using appropriate security tools to apply and enforce the required data access controls for users on the Hadoop platform.
  • Supporting and contributing to the development of guidelines and standards for data ingestion.
  • Working with the data science and business analytics teams to assist with data ingestion and data-related technical issues.
  • Designing and documenting the development & deployment flow.

Requirements:

  • Experience in developing REST API services using one of the Scala frameworks.
  • Ability to troubleshoot and optimize complex queries on the Spark platform.
  • Expertise in optimizing big data and ML pipelines, architectures, and data sets.
  • Experience in Big Data access and storage techniques.
  • Experience in doing cost estimation based on design and development.
  • Excellent debugging skills across the technical stack mentioned above, including analyzing server and application logs.
  • Highly organized, self-motivated, proactive, and able to propose the best design solutions.
  • Good time management and multitasking skills; able to meet deadlines working both independently and as part of a team.
  • Ability to analyze and understand complex problems.
  • Ability to explain technical information in business terms.
  • Ability to communicate clearly and effectively, both verbally and in writing.
  • Strong in user requirements gathering, maintenance, and support.
  • Excellent understanding of Agile Methodology.
  • Good experience in Data Architecture, Data Modelling, and Data Security.

Must Have Skills:

  1. Spark: 3 to 4 years of experience
  2. Scala: minimum 2 years of experience
  3. Hadoop, Hive, HBase: minimum 1 year of experience
  4. Expertise in either Kafka or ElasticSearch
  5. Professional-level English communication skills
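As an illustration only (not part of the role description), the sketch below shows the kind of Spark/Scala batch transformation the role involves: loading records into a DataFrame and aggregating them. All names (`PipelineSketch`, the columns `account` and `amount`) are hypothetical, and the snippet assumes Apache Spark is available on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; a production pipeline would run on a cluster.
    val spark = SparkSession.builder()
      .appName("txn-pipeline-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical transaction records (account, amount).
    val txns = Seq(("a1", 100.0), ("a1", 250.0), ("b2", 75.0))
      .toDF("account", "amount")

    // Aggregate total amount per account.
    val totals = txns.groupBy("account").agg(sum("amount").as("total"))
    totals.show()

    spark.stop()
  }
}
```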

Get in touch with us

Fill out the form to tell us about yourself and your enquiry, and a member of our team will be in contact with you shortly.