Available for AI/ML Roles & Research Internships

Abishek Mishra AI / ML Engineer

Building intelligent systems that extract signal from noise, from RAG pipelines over 1,200+ document chunks to ML models with measurable performance gains. Research-backed, production-oriented.

3+ Research Roles
87% RAG Retrieval Accuracy
52K+ Records Processed
9% Model Improvement

Research-Backed.
Production-Minded.

I'm an AI/ML Engineer with hands-on research and industry experience spanning data preprocessing, model evaluation, and end-to-end pipeline development. My work sits at the intersection of machine learning engineering and applied research.

From reducing preprocessing time by 38% at DeepQore to building a RAG system achieving 87% top-5 retrieval accuracy, I consistently translate research insights into measurable engineering outcomes.

I've worked across AI + neuroscience research at GBSCIDP & NeuroAI Research Foundation, giving me a broad perspective on how intelligent systems can push disciplinary boundaries.

End-to-end ML pipeline development & optimization
Research paper synthesis & experimental design support
Data engineering: cleaning, EDA, validation at scale
LLM integration, prompt engineering, RAG architecture
38%
Preprocessing time reduction @ DeepQore
11% → 3%
Missing data reduction @ Elevate Labs
0.71 → 0.80
Model accuracy improvement via pipeline tuning
50+ req/min
FastAPI backend throughput in RAG system

Technical Toolkit

A full-stack AI engineering toolkit spanning languages, frameworks, data infrastructure, and research methods.

🐍
Languages & Core
Python SQL JavaScript Bash Markdown
🤖
ML / AI Frameworks
scikit-learn TensorFlow PyTorch Hugging Face LangChain OpenAI API
📊
Data & Analytics
Pandas NumPy Matplotlib Seaborn Plotly EDA Feature Engineering
⚙️
Infrastructure & APIs
FastAPI Flask Streamlit PostgreSQL Redis Docker REST APIs
🔍
NLP & Retrieval
TF-IDF Vector DBs Embeddings Semantic Search RAG Prompt Engineering
🛠️
Dev & Research Tools
Git / GitHub Jupyter VS Code Weights & Biases Linux LaTeX

Where I've Built

Three research and industry roles spanning data engineering, model optimization, and AI/neuroscience research.

AI/ML Research Intern Remote
DeepQore
  • Processed and engineered features across 52,000+ records, building robust preprocessing pipelines
  • Reduced preprocessing time by 38% through pipeline optimization and vectorized operations
  • Conducted systematic ML model comparisons using Accuracy, F1-score, and RMSE as evaluation metrics
  • Improved model performance by up to 9% through hyperparameter tuning and feature selection
  • Synthesized and summarized AI research papers to inform experimental design and methodology
AI/ML Intern Remote
Elevate Labs
  • Performed comprehensive data cleaning and preprocessing across multiple real-world datasets
  • Reduced missing data from 11% to 3% using targeted imputation and validation strategies
  • Conducted exploratory data analysis on 15,000+ records, uncovering actionable patterns
  • Built data validation pipelines ensuring downstream model integrity and reproducibility
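
As a rough illustration of the missing-data work listed above, the sketch below measures a missing-value rate before and after imputing one column. It is not the code used at Elevate Labs: the records, the column names, and the choice of median imputation are all assumptions made for the example, since the source does not say which imputation strategy was applied.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
records = [
    {"age": 21, "score": 78.0},
    {"age": None, "score": 85.0},
    {"age": 23, "score": None},
    {"age": 22, "score": 90.0},
]


def missing_rate(rows):
    """Fraction of all cells that are None."""
    cells = [value for row in rows for value in row.values()]
    return sum(value is None for value in cells) / len(cells)


def impute_median(rows, column):
    """Return new rows with None in `column` replaced by that column's median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [{**row, column: fill if row[column] is None else row[column]} for row in rows]


print(f"missing before: {missing_rate(records):.1%}")
cleaned = impute_median(records, "age")  # impute only the column chosen for it
print(f"missing after:  {missing_rate(cleaned):.1%}")
```

Imputing column by column, rather than filling everything at once, leaves values missing wherever no defensible fill exists, which is one way a residual missing rate can remain after cleaning.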
Junior AI Researcher Research
GBSCIDP & NeuroAI Research Foundation
  • Contributed to interdisciplinary research at the intersection of AI and neuroscience
  • Supported data-driven experimentation design and quantitative analysis workflows
  • Assisted with model evaluation, documentation, and research reporting
  • Engaged with cutting-edge neuroscience literature to inform AI modeling approaches

What I've Built

Four end-to-end AI/ML systems demonstrating research depth and production-level engineering capability.

🔍
Flagship
Retrieval-Augmented Generation (RAG) System

End-to-end RAG pipeline built over 1,200+ document chunks. Implements PDF ingestion, semantic chunking, embedding generation, and hybrid search combining vector similarity with metadata filtering. A Redis caching layer reduces response latency by 25%.

87% Top-5 Accuracy 50+ req/min 25% Latency Reduction 1,200+ Doc Chunks
Python FastAPI PostgreSQL Vector DB NLP Redis LangChain
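
The hybrid search step described in this project can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the `Chunk` type, the `hybrid_search` function, the toy 2-dimensional embeddings, and the `source` metadata key are hypothetical names chosen for illustration. The idea shown is to filter candidates by metadata first, then rank the survivors by cosine similarity and keep the top k.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical document chunk: text, embedding vector, and metadata."""
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, chunks, filters, k=5):
    """Keep chunks whose metadata matches every filter, then rank by similarity."""
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:k]


# Toy embeddings; a real system would use a learned embedding model and a vector DB.
chunks = [
    Chunk("Refund policy overview", [1.0, 0.0], {"source": "policy.pdf"}),
    Chunk("Shipping times", [0.0, 1.0], {"source": "policy.pdf"}),
    Chunk("Refund form", [0.9, 0.1], {"source": "forms.pdf"}),
]
results = hybrid_search([1.0, 0.1], chunks, {"source": "policy.pdf"}, k=2)
print([c.text for c in results])
```

Filtering before ranking keeps the similarity computation limited to chunks that can actually be returned; a production system would usually push both steps into the vector store's query rather than doing them in Python.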
🎬
Content-Based Recommendation System

Movie recommendation engine built on a 5,327-title dataset. Uses TF-IDF vectorization on metadata and cosine similarity for real-time matching. Sub-100ms response times make it suitable for production integration.

<0.08s Response 5,327 Movies
Python TF-IDF Cosine Similarity scikit-learn Pandas
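
A small sketch of the TF-IDF plus cosine similarity approach named above. To stay dependency-free it hand-rolls a basic TF-IDF weighting (term frequency times log(N / document frequency)); the project's stack lists scikit-learn, whose vectorizer would normally do this work and uses a slightly different weighting. The three-movie catalogue and the `recommend` helper are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: title -> metadata text (genres, keywords).
movies = {
    "Space Quest": "sci-fi space adventure aliens",
    "Galaxy Wars": "sci-fi space battle aliens",
    "Love in Paris": "romance drama paris",
}


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [text.split() for text in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (count / len(tokens)) * math.log(n_docs / df[t]) for t, count in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def recommend(title, k=2):
    """Rank every other title by cosine similarity to the given one."""
    titles = list(movies)
    vectors = dict(zip(titles, tfidf_vectors(movies.values())))
    others = [t for t in titles if t != title]
    return sorted(others, key=lambda t: cosine(vectors[title], vectors[t]), reverse=True)[:k]


print(recommend("Space Quest"))
```

For fast lookups, the vectors (or the full similarity matrix) would typically be computed once at startup rather than on every request as this sketch does.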
✍️
AI Story Generator (LLM System)

Controlled generative AI application leveraging large language models with structured prompt engineering. Optimized prompt templates guide output coherence, style, and length. Deployed as an interactive Streamlit app.

Prompt Engineering Streamlit Deployed
Python LLM APIs Prompt Engineering Streamlit
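
One way a structured prompt template can constrain style and length is sketched below using only the standard library. The template wording and its fields (`genre`, `tone`, `max_words`, `premise`) are illustrative assumptions, not the templates used in the deployed app, and the call to an LLM API is deliberately left out.

```python
from string import Template

# Hypothetical template; field names and wording are illustrative only.
STORY_PROMPT = Template(
    "Write a $genre story in a $tone tone.\n"
    "Keep it under $max_words words.\n"
    "Premise: $premise"
)


def build_prompt(genre: str, tone: str, max_words: int, premise: str) -> str:
    """Fill every template field; substitute() raises KeyError if one is missing."""
    return STORY_PROMPT.substitute(
        genre=genre, tone=tone, max_words=max_words, premise=premise
    )


print(build_prompt("mystery", "suspenseful", 300, "A lighthouse keeper finds a sealed letter."))
```

Keeping the template separate from the user-supplied values makes each constraint explicit and easy to adjust without touching the rest of the prompt.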
📈
Student Performance Prediction System

Full ML pipeline for academic outcome prediction on a 1,001-record dataset. Systematic feature engineering, preprocessing, and model selection lifted accuracy from 0.71 to 0.80, a 12.7% relative improvement.

0.71 → 0.80 Accuracy 1,001 Records
Python scikit-learn Feature Engineering EDA Pandas
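
The 12.7% figure is a relative gain, which is easy to confuse with the absolute gain in accuracy. A short check of the arithmetic, using a hypothetical helper name:

```python
def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


baseline, tuned = 0.71, 0.80
print(f"absolute gain: {tuned - baseline:.2f}")  # 0.09, i.e. 9 percentage points
print(f"relative gain: {relative_improvement(baseline, tuned):.1f}%")  # 12.7%
```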

Credentials

Industry-recognized certifications validating core ML, data science, and AI competencies.

🧠
Machine Learning Specialization
Andrew Ng / Coursera
🐍
Python for Data Science & AI
IBM / Coursera
📊
Data Analysis with Python
IBM / Coursera
🤗
NLP with Hugging Face
Hugging Face / Community
⚡
Deep Learning Fundamentals
fast.ai
🔬
AI Research Internship
NeuroAI Research Foundation

Download My Resume

Full PDF resume with detailed experience, education, and project documentation, ready for recruiter review.

Request Resume PDF

Let's Build Something

Open to AI/ML engineering roles, research internships, and collaborative projects. Let's connect.