Tips and tricks to reduce MCP token bloat

Bill Doerrfeld | February 5, 2026

My latest for The New Stack shares techniques on optimizing MCP usage.

MCP servers can quickly drain context windows without the right guardrails. Thankfully, there are ways around this...

Today, my feature with The New Stack breaks down a number of practical techniques for reducing MCP token bloat as teams begin using multiple MCPs in real, scaled workflows.

Techniques include more intentional tool design, minimizing upfront context, progressive disclosure, better tool discovery, subagents, code mode, semantic caching, stronger prompting practices, and more.
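To make a couple of these concrete, here is a minimal Python sketch of minimizing upfront context through progressive disclosure. It is my own illustration, not code from the feature or from any MCP SDK: the tool catalog, the function names (`list_tool_summaries`, `describe_tool`), and the four-characters-per-token estimate are all assumptions. The idea is that the agent first sees only tool names and one-line summaries, then pulls a full schema when it actually decides to use a tool.

```python
import json

# Hypothetical tool catalog; names and schemas are invented for illustration.
TOOLS = {
    "create_issue": {
        "summary": "Create an issue in the tracker.",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Short issue title."},
                "body": {"type": "string", "description": "Markdown description."},
                "labels": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["title"],
        },
    },
    "search_issues": {
        "summary": "Search issues by free-text query.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms."},
                "limit": {"type": "integer", "description": "Maximum results."},
            },
            "required": ["query"],
        },
    },
}


def list_tool_summaries():
    """Return only names and one-line summaries: the small upfront context."""
    return [{"name": name, "summary": tool["summary"]} for name, tool in TOOLS.items()]


def describe_tool(name):
    """Return the full definition for a single tool, fetched only on demand."""
    return {"name": name, **TOOLS[name]}


def approx_tokens(obj):
    """Very rough size estimate, assuming about four characters per token."""
    return len(json.dumps(obj)) // 4


upfront = list_tool_summaries()
everything = [describe_tool(name) for name in TOOLS]
print("summaries only:", approx_tokens(upfront), "approx. tokens")
print("all full schemas:", approx_tokens(everything), "approx. tokens")
```

With only two tools the gap is small, but it grows with every tool and every server added, which is when this kind of deferral starts to matter.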

The big takeaway: as MCP gains real enterprise traction, it'll take smart approaches to optimize its use in software development.

Huge thank you to the experts who shared their knowledge with me for this piece! This one features, in order of appearance:

- Gil Feig, CTO and co-founder, Merge
- Christian Posta, VP and global field CTO, solo.io
- Alex Salazar, co-founder and CEO, Arcade.dev
- Marcin Klimek, senior technical product manager, SmartBear
- Kevin Swiber, API strategist, Layered System
- Neeraj Abhyankar, VP of data and AI, R Systems
- Ori Yitzhaki, chief product officer, Sonar
- Tom Moor, head of engineering, Linear
- Matt Martin, co-founder and CEO, Clockwise
- Ankit Jain, CEO, Aviator
- Melissa R., Director of AI, AppOmni


This is a space I expect will continue to evolve, and I hope to keep covering the emerging techniques for getting the most out of MCP in practice.

Read: 10 strategies to reduce MCP token bloat