Available for AI/ML Roles & Research Internships

Abishek Mishra, AI / ML Engineer

Building intelligent systems that extract signal from noise — from RAG pipelines over 1,200+ document chunks to ML models with measurable performance gains. Research-backed, production-oriented.


3+ Research Roles

87% RAG Retrieval Accuracy

52K+ Records Processed

9% Model Improvement

// about

Research-Backed.
Production-Minded.

I'm an AI/ML Engineer with hands-on research and industry experience spanning data preprocessing, model evaluation, and end-to-end pipeline development. My work sits at the intersection of machine learning engineering and applied research.

From reducing preprocessing time by 38% at DeepQore to building a RAG system achieving 87% top-5 retrieval accuracy, I consistently translate research insights into measurable engineering outcomes.

I've worked across AI + neuroscience research at GBSCIDP & NeuroAI Research Foundation, giving me a broad perspective on how intelligent systems can push disciplinary boundaries.

End-to-end ML pipeline development & optimization

Research paper synthesis & experimental design support

Data engineering: cleaning, EDA, validation at scale

LLM integration, prompt engineering, RAG architecture

38%: Preprocessing time reduction @ DeepQore

11% → 3%: Missing data reduction @ Elevate Labs

0.71 → 0.80: Model accuracy improvement via pipeline tuning

50+ req/min: FastAPI backend throughput in RAG system

// skills

Technical Toolkit

A full-stack AI engineering toolkit spanning languages, frameworks, data infrastructure, and research methods.

🐍

Languages & Core

Python, SQL, JavaScript, Bash, Markdown

🤖

ML / AI Frameworks

scikit-learn, TensorFlow, PyTorch, Hugging Face, LangChain, OpenAI API

📊

Data & Analytics

Pandas, NumPy, Matplotlib, Seaborn, Plotly, EDA, Feature Engineering

⚙️

Infrastructure & APIs

FastAPI, Flask, Streamlit, PostgreSQL, Redis, Docker, REST APIs

🔍

NLP & Retrieval

TF-IDF, Vector DBs, Embeddings, Semantic Search, RAG, Prompt Engineering

🛠️

Dev & Research Tools

Git / GitHub, Jupyter, VS Code, Weights & Biases, Linux, LaTeX

// experience

Where I've Built

Three research and industry roles spanning data engineering, model optimization, and AI/neuroscience research.

AI/ML Research Intern (Remote)

DeepQore

Processed and engineered features across 52,000+ records, building robust preprocessing pipelines
Reduced preprocessing time by 38% through pipeline optimization and vectorized operations
Conducted systematic ML model comparisons using Accuracy, F1-score, and RMSE as evaluation metrics
Improved model performance by up to 9% through hyperparameter tuning and feature selection
Synthesized and summarized AI research papers to inform experimental design and methodology
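
The three evaluation metrics named above can be sketched in plain Python. This is an illustrative sketch only: the function names and toy labels are assumptions, and the actual comparisons may well have used scikit-learn's built-in metrics instead.

```python
import math

def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def rmse(y_true, y_pred):
    """Root mean squared error for regression targets."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy data (hypothetical, not from the 52,000+ record dataset).
labels = [1, 0, 1, 1, 0]
preds = [1, 0, 0, 1, 1]
acc = accuracy(labels, preds)
f1 = f1_score(labels, preds)
err = rmse([3.0, 5.0], [2.0, 7.0])
```

Accuracy and F1 apply to classification models, while RMSE applies to regression models, so a comparison would normally report whichever fits the task.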

AI/ML Intern (Remote)

Elevate Labs

Performed comprehensive data cleaning and preprocessing across multiple real-world datasets
Reduced missing data from 11% to 3% using targeted imputation and validation strategies
Conducted exploratory data analysis on 15,000+ records, uncovering actionable patterns
Built data validation pipelines ensuring downstream model integrity and reproducibility
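
A minimal sketch of how a missing-data rate can be measured before and after imputation. The source does not say which imputation strategy was used, so median filling is an assumption here, as are the function names, column names, and toy rows.

```python
from statistics import median

def missing_rate(rows, columns):
    """Fraction of cells that are None across the given columns."""
    total = len(rows) * len(columns)
    missing = sum(1 for row in rows for col in columns if row.get(col) is None)
    return missing / total if total else 0.0

def impute_median(rows, column):
    """Return new rows with None in `column` replaced by the column median."""
    observed = [row[column] for row in rows if row.get(column) is not None]
    if not observed:
        # Nothing to learn a fill value from, so leave the column untouched.
        return [dict(row) for row in rows]
    fill = median(observed)
    return [
        {**row, column: fill if row.get(column) is None else row[column]}
        for row in rows
    ]

# Toy rows (hypothetical columns, not the real datasets).
rows = [
    {"age": 20, "score": 70.0},
    {"age": None, "score": 80.0},
    {"age": 30, "score": None},
    {"age": 40, "score": 90.0},
]
before = missing_rate(rows, ["age", "score"])
cleaned = impute_median(impute_median(rows, "age"), "score")
after = missing_rate(cleaned, ["age", "score"])
```

Tracking the rate before and after each step is one way to turn a cleaning pass into a validation check that can be rerun on new data.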

Junior AI Researcher (Research)

GBSCIDP & NeuroAI Research Foundation

Contributed to interdisciplinary research at the intersection of AI and neuroscience
Supported data-driven experimentation design and quantitative analysis workflows
Assisted with model evaluation, documentation, and research reporting
Engaged with cutting-edge neuroscience literature to inform AI modeling approaches

// projects

What I've Built

Four end-to-end AI/ML systems demonstrating research depth and production-level engineering capability.

🔍

Flagship

Retrieval-Augmented Generation (RAG) System

End-to-end RAG pipeline built over 1,200+ document chunks. Implements PDF ingestion, semantic chunking, embedding generation, and hybrid search combining vector similarity with metadata filtering. A Redis caching layer reduces response latency by 25%.

87% Top-5 Accuracy | 50+ req/min | 25% Latency Reduction | 1,200+ Doc Chunks

Python, FastAPI, PostgreSQL, Vector DB, NLP, Redis, LangChain
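
The hybrid search idea described above, metadata filtering combined with vector similarity, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the project's implementation: the chunk layout, the `source` metadata field, and the 3-dimensional vectors are hypothetical, and a real system would query a vector database rather than sort in memory.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query_vec, chunks, filters, k=5):
    """Keep chunks whose metadata matches every filter, then rank by similarity."""
    candidates = [
        c for c in chunks
        if all(c["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

# Toy data: hypothetical embeddings and a hypothetical "source" metadata field.
chunks = [
    {"id": "a", "vec": [1.0, 0.0, 0.0], "meta": {"source": "manual.pdf"}},
    {"id": "b", "vec": [0.9, 0.1, 0.0], "meta": {"source": "faq.pdf"}},
    {"id": "c", "vec": [0.0, 1.0, 0.0], "meta": {"source": "manual.pdf"}},
]
results = hybrid_search([1.0, 0.0, 0.0], chunks, {"source": "manual.pdf"}, k=2)
result_ids = [c["id"] for c in results]
```

Filtering first keeps the similarity ranking restricted to chunks that are eligible for the query, which is why chunk "b" is excluded here even though its vector is close to the query.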

🎬

Content-Based Recommendation System

Movie recommendation engine built on a 5,327-title dataset. Uses TF-IDF vectorization on metadata and cosine similarity for real-time matching. Sub-100ms response times make it suitable for production integration.

<0.08s Response | 5,327 Movies

Python, TF-IDF, Cosine Similarity, scikit-learn, Pandas
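
The TF-IDF plus cosine similarity approach can be illustrated with a small standard-library sketch. The project itself lists scikit-learn, so this hand-rolled version is a stand-in for clarity rather than the actual code; the titles, metadata strings, and function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def recommend(title, titles, vectors, k=2):
    """Return the k titles whose metadata is most similar to the given title."""
    idx = titles.index(title)
    scores = [(cosine(vectors[idx], v), t) for t, v in zip(titles, vectors) if t != title]
    return [t for _, t in sorted(scores, reverse=True)[:k]]

# Toy catalogue (hypothetical titles and metadata).
titles = ["Space Quest", "Galaxy Run", "Love Letters"]
metadata = [
    "sci-fi space adventure aliens",
    "sci-fi space race pilots",
    "romance drama letters paris",
]
vectors = tfidf_vectors(metadata)
top = recommend("Space Quest", titles, vectors, k=1)
```

Because the vectors can be computed once up front, each recommendation reduces to a set of dot products, which is what makes low response times plausible for a catalogue of a few thousand titles.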

✍️

AI Story Generator (LLM System)

Controlled generative AI application leveraging large language models with structured prompt engineering. Optimized prompt templates guide output coherence, style, and length. Deployed as an interactive Streamlit app.

Prompt Engineering Streamlit Deployed

Python, LLM APIs, Prompt Engineering, Streamlit

📈

Student Performance Prediction System

Full ML pipeline for academic outcome prediction on a 1,001-record dataset. Systematic feature engineering, preprocessing, and model selection lifted accuracy from 0.71 to 0.80 — a 12.7% relative improvement.

0.71 → 0.80 Accuracy | 1,001 Records

Python, scikit-learn, Feature Engineering, EDA, Pandas
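
The relative improvement quoted above follows directly from the two accuracy figures. A short check, with illustrative variable names:

```python
baseline, tuned = 0.71, 0.80

# Absolute gain in accuracy, then the gain relative to the baseline.
absolute_gain = tuned - baseline
relative_gain = absolute_gain / baseline

absolute_points = round(absolute_gain * 100, 1)  # percentage points
relative_pct = round(relative_gain * 100, 1)     # percent of the baseline
```

The absolute gain is 9 percentage points, while the relative gain is about 12.7% of the 0.71 baseline, so the two figures describe the same result on different scales.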

// certifications

Credentials

Industry-recognized certifications validating core ML, data science, and AI competencies.

🧠

Machine Learning Specialization

Andrew Ng / Coursera

🐍

Python for Data Science & AI

IBM / Coursera

📊

Data Analysis with Python

IBM / Coursera

🤗

NLP with Hugging Face

Hugging Face / Community

⚡

Deep Learning Fundamentals

fast.ai

🔬

AI Research Internship

NeuroAI Research Foundation


Download My Resume

Let's Build Something
