Research

I Asked AI to Measure a Building. It Looked Up the Answer Instead

AI Can Look At a Video & Describe a Car Crash. It Can't Tell You How Fast the Car Was Going. Over the last few months, just like you, I've been using VLMs (Vision-Language Models) more extensively than ever: uploading screenshots, analysing dashcam clips, feeding in product images, asking models to "look at this video and tell me what's happening." And I'll be honest - the results felt magical. "There's a yellow sedan moving left to right across the frame...

Mar 2 · 9 min read

What's Actually Behind the AI Evaluation Hype?

What's behind the AI evaluation obsession? Over the last few months, I noticed something strange. Every AI paper I opened, every conference talk I watched, every product launch I followed - they all kept circling back to the same word: AI evaluations. Not training. Not fine-tuning. Not even prompting. Evaluations! At first, I dismissed it as more benchmark fatigue - another leaderboard war. But this wasn't about 'measuring' models better. It was about 'teaching' them dif...

Feb 5 · 7 min read

Stop Chunking Your CSVs: How I Built an AI That Reads Spreadsheets Like a Human

Last week, I walked into a meeting that started with what seemed like a simple question. "We've got PDF ingestion working great. How do we solve for CSVs and spreadsheets?" The room went quiet. You know that silence - the one where everyone's mentally calculating how much work just landed on the table? That was it. See, here's the thing. Chunking PDFs and text files into a RAG pipeline? That's pretty neat at this point. It's almost a solved problem. You split the document, ge...

Dec 12, 2025 · 7 min read

Why Your RAG System Is Lying to You?

The company's AI assistant, powered by their "state-of-the-art RAG system," had confidently told a potential client: "According to our...

Jul 9, 2025 · 7 min read

What Do the World's Smartest AI Models Read Before They Graduate? (Spoiler: It's Not What You Think)

Picture this: The entire internet contains roughly 100 trillion words of text. Now guess how much of that makes it into training the...

Jul 3, 2025 · 6 min read

Artificial Empathy: How AI Learns to Pretend It Cares

"ChatGPT understands me better than my therapist ever did." When I first heard this from an old friend last month, I nearly choked on my...

Jun 30, 2025 · 5 min read

The Hidden Science Behind LLM Token Limits (And How Million-Token Models Actually Work)

Introduction "Why can't I just paste my entire company's documentation into ChatGPT?" If I had a dollar for every time I've heard this...

Jun 27, 2025 · 7 min read