
Smart Data Solutions Hiring Drive – Intern AI Engineer
Smart Data Solutions Hiring Drive: An opportunity for aspiring AI professionals who want to work at the intersection of machine learning, natural language processing, and health technology. This is an Intern AI Engineer role in the India office, located in Perungudi, Chennai. It focuses on research-driven development of intelligent systems designed to change the way pharmaceutical industry documents are evaluated, processed and used. The role offers in-depth experience of building robust AI pipelines and testing modern technologies within the large language model (LLM) ecosystem.

About the Role
The AI Engineer Intern is responsible for analysing and designing deep learning and machine learning models specialised for health and pharmaceutical data. The work involves transforming complex clinical and regulatory documents into useful, actionable insights. The intern is expected to assist with OCR (Optical Character Recognition), OMR (Optical Mark Recognition), and NLP (Natural Language Processing) tasks, which are essential elements of current healthcare AI solutions.
A thorough understanding of text analytics, language modelling, and image-to-text transformation methods plays a vital role in achieving reliable and scalable model performance. The job requires experience with LLM frameworks. The intern will work on software built with LangChain, Qwen, NuExtract and other open-source foundation models.
Key Responsibilities
- Develop, refine and test deep learning and machine learning models for pharmaceutical and medical document processing.
- Develop and maintain AI pipelines that support OCR and OMR functions, converting unstructured documents such as prescriptions, clinical forms and regulatory filings into well-organised digital formats.
- Explore instruction-based LLMs to improve document understanding and help implement retrieval-augmented generation (RAG) modules.
- Write flexible, efficient and reusable code in Python and Java, and contribute to back-end integration tasks and testing frameworks.
- Create prompt templates and optimise language model performance across a variety of health text data.
- Take part in data cleaning, annotation and labelling tasks to improve the quality of biomedical NLP datasets.
- Collaborate with cross-functional teams of biomedical researchers, data scientists and software engineers to enhance AI models and pipelines.
- Run experiments related to information retrieval and search enhancement using context-aware AI systems.
The internship emphasises learning through implementation. It offers the chance to work with real-world unstructured data and to contribute to advanced document-processing solutions for the healthcare industry.
Required Skills and Qualifications
- Proficiency in Python programming and a working-level understanding of Java for integration development.
- Solid foundation in machine learning, deep learning, and the principles of natural language processing.
- Demonstrated interest in or exposure to LLMs such as Qwen, NuExtract, or other open-source instruction-tuned models.
- Experience with frameworks and libraries such as Hugging Face Transformers, OpenCV, Tesseract, spaCy, PyTorch and TensorFlow.
- Experience with OCR/OMR tools and document extraction tools for image and text-based data.
- Ability to manage components of AI research projects independently and to contribute in collaborative settings.
- Prior academic or project-based experience in biomedical AI, pharmaceutical informatics, or healthcare NLP applications.
- Understanding of document annotation tools, dataset organisation, and model evaluation metrics.
- Interest in information retrieval, prompt engineering, and the application of LLM workflows for deployment.
Smart Data Solutions Hiring Drive – Apply Link