DATA ENGINEER

Durbhasi gurukulam private limited
  • Full Time
  • 08-Feb-2026
  • Pan India
  • Start date
    Immediately
  • Duration
    6 Months
  • Stipend
    ₹0 /month
  • No of Credits
    35
  • Apply by
    08-Apr-2026

About the program

We are looking for a passionate Data Engineer Intern to work with structured and unstructured data and assist in preparing datasets for Large Language Model (LLM) fine-tuning. The intern will gain hands-on experience in data preprocessing, transformation, and pipeline development used in modern data and AI systems. This internship focuses on real-world exposure to data engineering workflows and AI-ready data preparation.

Roles and Responsibilities:
  • Work with structured data (SQL databases, CSV, Excel)
  • Handle unstructured data such as text, logs, JSON, documents, and web data
  • Clean, preprocess, and normalize datasets for analytics and ML use cases
  • Assist in creating datasets for LLM fine-tuning and evaluation
  • Perform data labeling, formatting, and validation
  • Build and maintain basic ETL/data ingestion pipelines
  • Integrate data from APIs, files, and databases
  • Ensure data quality and consistency
  • Document data pipelines and preprocessing steps

Skills Required:
  • Basic knowledge of Python
  • Understanding of SQL and databases
  • Familiarity with structured and unstructured data formats (CSV, JSON, text)
  • Basic data preprocessing concepts
  • Interest in AI, Machine Learning, and Large Language Models

Preferred Skills (Optional):
  • Experience with Pandas/NumPy
  • Exposure to NoSQL databases (MongoDB, Elasticsearch)
  • Understanding of LLM concepts and dataset preparation
  • Familiarity with text preprocessing and tokenization
  • Basic knowledge of ML workflows

Eligibility Criteria:
  • Students pursuing B.Tech, BCA, MCA, B.Sc, M.Sc, or Diploma
  • Branches: Computer Science, IT, Data Science, AI/ML, or related fields
  • Freshers and beginners are welcome

Learning Outcomes:
  • Hands-on experience with real-world structured and unstructured datasets
  • Understanding of data pipelines used in AI systems
  • Exposure to LLM dataset preparation and fine-tuning workflows
  • Practical knowledge of data engineering best practices

Perks

  • Opportunities for career growth and advancement
  • Employee wellness programs
  • Flexible work hours
  • Remote work options
  • Access to cutting-edge technology and tools

Who can apply?

Only those candidates can apply who:

  1. are from Any
  2. and specialisation from Any
  3. are available for duration of 6 Months
  4. have relevant skills and interests

Terms of Engagement

The terms of engagement for employees at our company include expectations regarding work hours, performance metrics, benefits eligibility, and adherence to company policies and procedures. These terms are outlined in the employment contract and are agreed upon by both parties upon hiring.

Number of openings

5