

I Asked AI to Measure a Building. It Looked Up the Answer Instead
AI Can Look at a Video and Describe a Car Crash. It Can't Tell You How Fast the Car Was Going. Over the last few months, just like you, I've been using vision-language models (VLMs) extensively, more aggressively than ever: uploading screenshots, analysing dashcam clips, feeding in product images, asking models to "look at this video and tell me what's happening." And I'll be honest: the results felt magical. "There's a yellow sedan moving left to right across the frame
Ashish Arora
9 min read


What's Actually Behind the AI Evaluation Hype?
What's behind the AI evaluation obsession? Over the last few months, I noticed something strange. Every AI paper I opened, every conference talk I watched, every product launch I followed: they all kept circling back to the same term, AI evaluations. Not training. Not fine-tuning. Not even prompting. Evaluations! At first, I dismissed it as more benchmark fatigue, another leaderboard war. But this wasn't about 'measuring' models better. It was about 'teaching' them dif
Ashish Arora
7 min read
