Data Scientist with 7+ years of experience applying Python, SQL, statistical modelling, and machine learning to solve large-scale business problems in data-intensive environments. Experienced in building end-to-end explainable models, translating business problems into analytics solutions, and deploying scalable data products. A strong communicator with a proven track record of driving data-driven decision-making in Agile teams.
📄 View My Resume
A lightning-fast Streamlit application powered by Groq (Llama-3.3-70B), LangChain, and local FAISS vector storage. It lets users query complex PDFs at zero API cost.
Moving beyond basic LangChain tutorials. A deep dive into semantic chunking, hybrid search (BM25 + Vector), and cross-encoder re-ranking.
Built and evaluated a custom transformer model for time series prediction. Benchmarked it against baselines such as ARIMA and XGBoost and against deep learning models such as LSTM, Bi-LSTM, GRU, and TSMixer to validate its efficiency.
A deep dive into model evaluation. Moving beyond basic accuracy to explore Precision, Recall, Cross-Entropy Loss, and modern LLM/RAG metrics.
Transformers look scary, but they are mostly matrix multiplication! A breakdown of the linear algebra and non-linear functions powering LLMs.
Beyond the Jupyter notebook. A practical guide to wrapping models in FastAPI, containerizing with Docker, and establishing MLOps monitoring.
Bridging the gap between engineering and the boardroom. A guide to converting Precision, Recall, and Model Thresholds into actual dollar amounts.