Projects

A mix of shipped work, ongoing experiments, and slots reserved for future LLM and agentic AI projects.

LLMs • Embeddings • Research

Embeddings Researcher — Comparing Representation Learning Methods

A research-oriented Streamlit app to compare TF-IDF, Skip-gram, CBOW, and GloVe embeddings — exploring when each representation method works best and how they affect downstream tasks.

  • Implemented core algorithms from scratch in PyTorch / NumPy.
  • Interactive UI for neighborhoods, similarities, and analogies.
  • Designed to help users build intuition for “how embeddings think”.
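The app builds these representations from scratch; as a flavor of that, here is a minimal, dependency-free sketch of the TF-IDF piece with cosine similarity (function names and the add-one smoothing choice are illustrative, not the project's actual code):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute smoothed TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    # document frequency: how many docs each term appears in
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    # idf with add-one smoothing so unseen terms never divide by zero
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

The neural methods (Skip-gram, CBOW, GloVe) replace the counting step with learned dense vectors, but plug into the same neighborhood and similarity UI.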

Agentic AI • RAG • Healthcare

Clinical QA Agent with Retrieval-Augmented Generation

Multi-agent workflow where one agent retrieves and ranks medical evidence, another synthesizes answers, and a critic agent checks for hallucinations before responses are surfaced.

  • Built with LangGraph, LangChain, and a vector database.
  • Distinct agent roles (retriever, generator, verifier).
  • Focus on traceability and “show your work” chains of thought.
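The retriever → generator → verifier split could be sketched, minus LangGraph and the vector database, as plain Python hand-offs over a shared state (all names, the keyword-overlap retrieval, and the word-level grounding check are placeholder simplifications):

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Shared state passed between agent roles (hypothetical schema)."""
    question: str
    evidence: list = field(default_factory=list)
    draft: str = ""
    verified: bool = False

def retrieve(state, corpus):
    # rank documents by naive keyword overlap (stand-in for vector search)
    q = set(state.question.split())
    state.evidence = sorted(corpus, key=lambda d: -len(set(d.split()) & q))[:2]
    return state

def generate(state):
    # synthesize a draft answer from the retrieved evidence only
    state.draft = " ".join(state.evidence)
    return state

def verify(state):
    # critic step: pass only if every draft token traces back to evidence
    context = {w for doc in state.evidence for w in doc.split()}
    state.verified = all(w in context for w in state.draft.split())
    return state
```

In the real workflow each role is a graph node with its own prompt and tools; the point here is only the separation of concerns and the explicit, inspectable state.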

Coming soon

These are intentional placeholders — as I push more projects to GitHub / Hugging Face, I’ll link them here.

Agentic AI

AgentBench: Evaluating Multi-Agent LLM Workflows

A small benchmark suite for comparing agentic LLM setups: single-agent vs multi-agent, with different planning and tool-calling strategies.

  • Task templates for retrieval, planning, tool use.
  • Evaluation metrics beyond accuracy (latency, cost, stability).
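A metrics aggregator for such a suite might look like the following sketch (the `RunRecord` schema and field names are hypothetical, not the benchmark's actual API):

```python
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class RunRecord:
    """One benchmark run of an agent setup on a task (hypothetical schema)."""
    correct: bool
    latency_s: float
    cost_usd: float

def summarize(runs):
    """Aggregate accuracy, latency, cost, and stability over repeated runs."""
    latencies = [r.latency_s for r in runs]
    return {
        "accuracy": mean(r.correct for r in runs),
        "mean_latency_s": mean(latencies),
        "total_cost_usd": sum(r.cost_usd for r in runs),
        # stability proxy: low run-to-run latency spread = predictable setup
        "latency_stdev_s": pstdev(latencies),
    }
```

Comparing setups then means comparing these dicts, not just a single accuracy number.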

RAG • Evaluation

RAGEval: Debugger for Retrieval-Augmented Generation

A toolkit to inspect retrieval quality, context construction, and answer grounding — with side-by-side comparisons of different retrievers and rerankers.

  • Visualize query → top-k documents → final answer.
  • Hooks for LLM-as-a-judge and human evaluation.
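The query → top-k → answer trace could be captured with a small wrapper like this sketch (`retriever_fn`, `answer_fn`, and the token-overlap grounding score are illustrative placeholders for whatever retriever and generator are under inspection):

```python
def trace_rag(query, retriever_fn, answer_fn, k=3):
    """Record every stage of one RAG call so stages can be compared side by side."""
    docs = retriever_fn(query)[:k]
    answer = answer_fn(query, docs)
    # crude grounding score: fraction of answer tokens found in the context
    context_tokens = {t for d in docs for t in d.lower().split()}
    answer_tokens = answer.lower().split()
    grounded = sum(t in context_tokens for t in answer_tokens)
    return {
        "query": query,
        "top_k": docs,
        "answer": answer,
        "grounding": grounded / max(len(answer_tokens), 1),
    }
```

Running two retrievers through the same wrapper yields two directly comparable traces, which is where an LLM-as-a-judge or a human reviewer would plug in.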

Vision • Deep Learning

Vision Transformer from Scratch

A PyTorch implementation of a compact Vision Transformer, trained on a small image dataset, focusing on clarity of code and explanatory visuals.

  • Step-by-step, well-documented implementation.
  • Side-by-side with a ResNet baseline.
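The project itself is in PyTorch; as a dependency-free illustration of the first ViT step, here is a sketch of splitting an image into flattened non-overlapping patches (`patchify` is a hypothetical helper, not the repo's code — in the real model this is typically a strided convolution followed by a linear projection):

```python
def patchify(image, patch_size):
    """Split an H x W image (nested lists) into flattened, non-overlapping
    patches in row-major order — the input tokens of a Vision Transformer."""
    h, w = len(image), len(image[0])
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            # flatten one patch_size x patch_size block into a single token
            patches.append([image[top + i][left + j]
                            for i in range(patch_size)
                            for j in range(patch_size)])
    return patches
```

Each flattened patch is then linearly projected to the model dimension and given a positional embedding before entering the transformer encoder.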