Dear Sentinels
What a week it has been! As you may recall, I started at the University of Southampton earlier this month, and my first deliverable is looming just four days away (not that I’m counting). Of course, as I mentioned when I broke the news about the new job, I wrote this last week, so if anything goes wrong, blame my past self, argh! I’ve also thrown my hat in the ring for five years of funding, fingers crossed, and I’ll keep you posted if the universe is feeling generous. This week, we’re dipping our toes into the world of Large Language Models for robotics. Exciting stuff, and no robots were harmed in the making of this edition.
Vision–language–action (VLA) models are causing quite the stir in the world of robotics. Gone are the days when robots needed separate systems for seeing, thinking, and moving; VLA models bundle it all together in one neat package. The secret sauce? They train on camera feeds, natural language instructions, and the resulting actions, all at the same time. This means our robot friends can finally connect what they see, what you say, and what they’re supposed to do, with no more endless lines of task-specific code (and no more late-night debugging sessions, thank goodness). The real magic happens when you feed these models mountains of data from the internet, giving them a surprisingly broad grasp of the world. Add a dash of imitation or reinforcement learning, and suddenly they’re not just moving, but actually figuring things out in the real world. The upshot: robots that aren’t doomed to a life of repetitive tasks, but can adapt on the fly to new places, objects, and jobs, even if they’ve never seen them before.
In the investigative article up next, we’ll take a closer look at how language and motion are coming together in robotics, no interpretive dance required. After that, the academic article will dig into what really matters when building vision–language–action models for generalist robots. But before we get too serious, let’s see what oddities the web has thrown up for us this week.
News from around the web!
The Convergence of Language and Motion
Robotics is having a bit of a moment. Gone are the days of robots with separate brains for seeing, thinking, and moving. Enter Vision-Language-Action (VLA) models, which bundle all that into one clever package. Think of it as the Swiss Army knife of robot intelligence. We started with Large Language Models (LLMs) that could juggle words, then moved to Vision-Language Models (VLMs) that could also make sense of pictures. Now, with VLAs, robots can take in text, images, and even their own joint positions, and then figure out exactly how to move. The big idea is to ditch the old, rigid way of programming robots for one job at a time. Instead, we get a flexible system that can handle all sorts of tasks and environments, all with one 'brain' in charge. It’s a bit like giving your Roomba a PhD and a sense of adventure.
Why bother with a unified architecture? Well, it lets robots treat all their different senses as one big pool of information, with a transformer model acting as the brains of the operation. The robot’s camera feeds get squished down into neat little bundles (thanks to a visual backbone), which can then be mixed with language and the robot’s own sense of its limbs. All this gets processed together, so the robot can use its 'world knowledge', gleaned from trawling the internet, not unlike a student before an exam... 🤫 For example, if it’s seen a million pictures of cups, it’ll know what to do with one in a new kitchen, even before it’s moved a muscle.
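To make the “one big pool of information” idea concrete, here is a toy numpy sketch. Nothing in it comes from a real model: the encoders are random projections and every size is invented. The point is the assembly step: camera patches, instruction words, and joint angles are all projected into the same token space, so a single self-attention layer can let pixels, words, and limbs talk to each other directly.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # shared embedding width (an arbitrary choice for this sketch)

def project(x, out_dim):
    """Random linear projection standing in for a learned encoder."""
    W = rng.standard_normal((x.shape[-1], out_dim)) / np.sqrt(x.shape[-1])
    return x @ W

def self_attention(tokens):
    """Single-head self-attention: every token mixes with every other token."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

# Stand-in inputs: 16 camera patches, 5 instruction words, 1 proprioception reading.
image_patches = rng.standard_normal((16, 192))   # flattened camera patches
lang_tokens   = rng.standard_normal((5, 128))    # embedded instruction words
proprio       = rng.standard_normal((1, 7))      # e.g. 7 joint angles

# The key move: every modality becomes tokens in the SAME space...
tokens = np.concatenate([
    project(image_patches, D),
    project(lang_tokens, D),
    project(proprio, D),
])

# ...so one attention layer can relate what it sees, hears, and feels.
fused = self_attention(tokens)
print(fused.shape)   # (22, 64): 16 + 5 + 1 tokens, all mutually attended
```

In a real VLA the projections are learned encoders and there are many stacked layers, but concatenating all modalities into one token sequence is the heart of the unified architecture.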
So, how does a robot go from thinking about a cup to actually picking it up? There are two main tricks. The first is to sneak action commands into the robot’s vocabulary, so it treats 'move arm left' a bit like it would treat a rare word in a novel. The second, fancier method uses a special bit of the model (an 'action head') to turn noisy guesses into smooth, precise movements: think of it as the robot’s equivalent of practising until it gets it right. The clever bit is that all this is guided by the robot’s high-level reasoning, so it doesn’t just flail about. Instead, it moves with purpose, like a caffeinated chess grandmaster.
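The first trick, folding actions into the vocabulary, boils down to discretising each continuous command into a bin id, in the spirit of the binning used by models like RT-2. The bin count and range below are assumptions of this sketch, not anyone’s published settings:

```python
import numpy as np

N_BINS = 256            # vocabulary slots reserved for actions (an assumption)
LOW, HIGH = -1.0, 1.0   # normalised action range (also an assumption)

def action_to_tokens(action):
    """Discretise each action dimension into one of N_BINS 'words'."""
    a = np.clip(np.asarray(action, dtype=float), LOW, HIGH)
    return np.round((a - LOW) / (HIGH - LOW) * (N_BINS - 1)).astype(int)

def tokens_to_action(tokens):
    """Invert the binning: map token ids back to continuous commands."""
    return LOW + np.asarray(tokens) / (N_BINS - 1) * (HIGH - LOW)

# A 7-DoF arm command ends up as a handful of token ids the model can 'say'.
cmd = np.array([0.12, -0.5, 0.9, 0.0, -1.0, 0.33, 1.0])
tokens = action_to_tokens(cmd)
decoded = tokens_to_action(tokens)

# Round-tripping loses at most half a bin width of precision.
print(np.abs(decoded - cmd).max() <= (HIGH - LOW) / (N_BINS - 1) / 2 + 1e-9)  # True
```

The second trick replaces this vocabulary hack with a dedicated action head that regresses (or iteratively denoises) continuous values directly, trading the simplicity above for smoother, higher-precision motion.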
If you like a good taxonomy (and who doesn’t?), the way robots learn to act has gone through five main stages. First, there were the old-school systems, where seeing, planning, and moving were all separate, and everything had to be hand-tuned: think of it as the robotics equivalent of doing your own taxes. Then came the end-to-end neural networks, where robots learned directly from data, but only for very specific tasks. It was a bit like teaching a dog to fetch only one particular stick and being surprised when it ignores all the others.
Right now, we’re in the middle of what you might call the fine-tuning revolution. Think of it as the BERT (Bidirectional Encoder Representations from Transformers) moment for robots. By tweaking big, pre-trained models with just a bit of robot-specific data, you can get impressive results without needing a supercomputer or a team of PhDs. Open-source projects like Pi Zero, Gr00t, and SmolVLA are making this possible for the rest of us, not just the big labs. Next up is the dream of universal robot control: one model to rule them all, whether it’s a warehouse bot or a kitchen helper. And finally, the holy grail: plug-and-play robots that can handle new shapes and jobs straight out of the box, just by being told what to do in plain English. It’s the ChatGPT moment for robotics, and yes, it’s as exciting as it sounds. Although we’re not quite there... yet.
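As a caricature of what that fine-tuning looks like, here is a numpy sketch: a frozen stand-in “backbone” provides features, and only a small action head is trained on a handful of stand-in demos. All shapes, data, and the training loop are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen stand-in for a pre-trained VLM backbone: W_frozen is never updated.
W_frozen = rng.standard_normal((32, 64)) / np.sqrt(32)
def backbone(obs):
    return np.tanh(obs @ W_frozen)

# Tiny trainable head mapping features to a 7-DoF action (sizes are assumptions).
W_head = np.zeros((64, 7))

# A 'small robot-specific dataset': stand-in teleop observations and actions.
obs     = rng.standard_normal((128, 32))
actions = rng.standard_normal((128, 7))

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

loss_before = mse(backbone(obs) @ W_head, actions)
for _ in range(300):                 # plain gradient descent on the head only
    feats = backbone(obs)
    grad = feats.T @ (feats @ W_head - actions) / len(obs)
    W_head -= 0.1 * grad
loss_after = mse(backbone(obs) @ W_head, actions)

print(loss_after < loss_before)   # the head adapts; the backbone is untouched
```

Because only the small head moves, this runs on modest hardware while the backbone’s internet-scale 'world knowledge' stays intact, which is exactly the appeal of the fine-tuning recipe.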
Of course, none of these patterns work without good data: a VLA’s intelligence is a direct reflection of its training history. It turns out robots learn more from their mistakes than from getting everything right the first time, just like the rest of us. So, when collecting data by remote control, it’s actually helpful to let the robot mess up and then show it how to recover. This way, it learns to cope with the real world, which, as we know, is rarely perfect. So, don’t worry about collecting flawless demos; a bit of chaos is good for the curriculum. It’s this kind of accessible data collection that lets practitioners graduate from beginner experiments to VLAs that can execute complex, multi-stage tasks.
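A minimal sketch of the idea, with entirely made-up labels: instead of curating fumbles out of the dataset, keep them, together with the operator’s recovery, as training signal.

```python
# A teleop episode as (observation, action, label) steps; labels are invented.
episode = [
    ("approach cup",  "reach",   "nominal"),
    ("missed grasp",  "regrasp", "recovery"),   # the operator fumbles...
    ("cup grasped",   "lift",    "nominal"),    # ...then shows how to fix it
]

def build_training_set(episodes, keep_recoveries=True):
    """Keep imperfect segments instead of curating them away."""
    data = []
    for ep in episodes:
        for obs, act, label in ep:
            if label == "recovery" and not keep_recoveries:
                continue
            data.append((obs, act))
    return data

with_chaos    = build_training_set([episode])
without_chaos = build_training_set([episode], keep_recoveries=False)
print(len(with_chaos), len(without_chaos))   # 3 2
```

The policy trained on `with_chaos` has seen what a failed grasp looks like and what to do next; the one trained on `without_chaos` has only ever seen a world where nothing goes wrong.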
It’s tempting to compare the rise of VLAs to the story of language models, but robotics has its own set of hurdles, like gravity, dodgy sensors, and the occasional hardware tantrum. Things are moving fast and nobody quite knows where we’ll end up, but the promise is huge. If these models keep improving, we might soon have robots that can understand and act in the real world as smoothly as they read a book. Now, if only they could make a decent cup of tea.
Summary
This paper presents RoboVLMs, a unified framework that efficiently transforms pre-trained Vision-Language Models into high-performing Vision-Language-Action models for generalist robot manipulation, guided by systematic architectural and data analyses. Through more than 600 experiments conducted across simulation and real-world benchmarks, the authors identify critical factors, including backbone selection and history integration, that enable state-of-the-art robotic performance and generalisation.
“In this work, we disclose the key factors that significantly influence the performance of VLA on robot manipulation problems…”
Background
Developing generalisable robot policies that can perceive and interact with physical environments remains a significant challenge in robotics. Vision-Language-Action Models (VLAs) have recently emerged as a promising branch of model-free learning, leveraging robust representations from large-scale pre-trained Vision-Language Models. Trained on web-scale multi-modal data, these models facilitate adaptation to diverse open-world scenes, even with limited robot-specific data. However, transferring these pre-trained backbones into high-performing robot policies requires careful architectural decisions. Persistent challenges include selecting suitable vision-language backbones and formulations that optimally leverage multi-modal representations for robotic control. Understanding how these large-scale models support generalist policies is crucial for advancing autonomous manipulation.

The study addresses three essential design choices: selecting the optimal backbone, formulating effective VLA architectures, and determining the appropriate timing for integrating cross-embodiment data. Modern Vision-Language Models differ substantially in visual encoder structures, fusion mechanisms, and data scales, yet their impact on manipulation performance has not been comprehensively studied. While existing robot learning strategies include model-free, model-based, and world-model-based approaches, VLAs provide unique semantic generality. This research aims to serve as a detailed guide for future VLA design through extensive experiments involving eight backbones and four policy architectures. The RoboVLMs framework enables straightforward integration of new models and flexible combinations of design choices. Ultimately, the work aims to establish VLAs as robust generalist policies by identifying the key factors that drive their performance.
“To utilize Foundation Vision Language Models (VLMs) for robotic tasks and motion planning, the community has proposed different methods for injecting action components into VLMs…”
Use-case
The RoboVLMs framework is evaluated on a diverse set of robotic manipulation benchmarks, including simulation environments such as CALVIN and SimplerEnv, as well as real-world robot platforms. In simulation, the model is tested on multi-task tabletop manipulation, executing consecutive tasks based on natural-language instructions. Simulated tasks include rotating blocks, moving sliders, opening drawers, and placing objects in containers, all of which require precise motor control and semantic understanding. The framework demonstrates strong generalisation when transferring to novel scenes not encountered during training, significantly outperforming previous state-of-the-art policies. Additionally, it is used to benchmark robot policy success rates in private real-world settings via real-to-sim environments. These applications highlight the framework’s effectiveness in evaluating policy robustness and data efficiency across various task horizons.

In real-world experiments, a 7-DoF Kinova Gen3 robot arm equipped with side and wrist cameras performs over 100 distinct manipulation tasks. The models are evaluated on complex skills such as picking and placing objects, pressing buttons, and opening or closing ovens or drawers. A key capability is the model’s robustness to unseen distractors, novel target objects, and varying backgrounds in physical environments. For instance, the system can “pick up a cucumber from a vegetable basket” or “press a toaster switch” even when these scenarios were not present in the training data. The VLA also demonstrates emergent self-correction, enabling the robot to adjust its trajectory if an initial grasp attempt fails. These features make the RoboVLMs framework highly valuable for deploying robots in dynamic, open-world settings where interpreting and acting upon human instructions is essential.
“RoboVLM outperforms the existing VLAs over all settings, especially for metrics in unseen scenarios, demonstrating the effectiveness and robustness of our model.”
Future Work
The paper concludes that VLAs based on pre-trained VLMs are highly effective for generalist robot policies, with backbones such as KosMos and PaliGemma demonstrating superior performance due to extensive pre-training. The most successful architecture integrates multi-step historical observations and continuous actions via a policy head, thereby enhancing both generalisation and data efficiency. Future research will focus on developing generalist policies capable of handling long-horizon, complex instructions, such as “make breakfast,” through step-by-step reasoning. The authors also plan to investigate advanced action tokenisation techniques and more efficient deployment of large models for real-time control. Open-sourcing the RoboVLMs framework and real-world datasets is expected to accelerate progress in foundational robot models within the research community.
“For future work, we envision several potential directions for advancing generalist robot policies… reasoning through executable actions step by step…”
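The winning recipe, multi-step observation history feeding a continuous-action policy head, can be caricatured in a few lines. This is a generic sketch, not the RoboVLMs implementation: the window size, shapes, and the fixed-weight “head” are placeholders.

```python
from collections import deque
import numpy as np

rng = np.random.default_rng(0)
HISTORY = 4   # number of past observations the policy sees (an assumption)

history = deque(maxlen=HISTORY)   # rolling window of recent observations

def policy_head(stacked_obs):
    """Stand-in continuous head: maps stacked history to a 7-DoF action."""
    W = np.ones((stacked_obs.size, 7)) / stacked_obs.size
    return stacked_obs.reshape(-1) @ W

for t in range(6):                     # a short control loop
    obs = rng.standard_normal(16)      # stand-in per-step observation features
    history.append(obs)
    # Pad with the oldest observation until the window fills up.
    window = list(history)
    while len(window) < HISTORY:
        window.insert(0, window[0])
    action = policy_head(np.stack(window))
print(action.shape)   # (7,)
```

The design choice the paper highlights is visible even here: the head consumes several timesteps at once and emits continuous values directly, rather than predicting one action token from a single frame.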
The report can be found here.


