Model Context Protocol (MCP) and MCP Servers in LLM Agent Systems

Published: June 20, 2025

Model Context Protocol (MCP) is an open standard (introduced by Anthropic in late 2024)[1] that defines a universal method for connecting large language models (LLMs) to external tools, data sources, and services. MCP is often analogized as a "USB-C port" for AI applications – it provides a standardized interface through which any AI agent can plug into various context providers (APIs, databases, file systems, etc.) without bespoke integration. This section provides a comprehensive overview of MCP and MCP servers, covering its core concepts, real-world implementations, best practices, current debates, production readiness, and comparisons to other tool integration methods.

Conceptual Overview of MCP and MCP Servers

MCP addresses a fundamental limitation: advanced LLMs are powerful but isolated from live data and tools unless explicitly wired to them. Traditionally, integrating an AI agent with N different systems required N custom API connectors; MCP instead offers a single protocol so that models and tools can interoperate in a plug-and-play fashion. In an MCP-based architecture, an AI host application (e.g. a chat assistant or IDE with AI features) can connect to one or more external MCP servers that provide access to specific resources or actions.

MCP follows a client–server model. A host running an AI assistant (with an MCP client library) maintains one-to-one JSON-RPC connections to multiple MCP servers, each server exposing a particular set of tools or data access. The host/LLM can invoke tools on these servers via standardized requests, without needing tool-specific integration code. For example, a host such as Claude Desktop or an IDE might connect to several MCP servers at once: some serving local data sources and others wrapping remote services.

Key elements of MCP architecture: The host application instantiates an MCP client for each server and coordinates interactions, while each MCP server is a lightweight service exposing certain capabilities via the protocol. Critically, this decouples the model provider (which runs the LLM) from the tool provider (which offers external functionalities) – they communicate through a well-defined JSON-RPC interface rather than custom code. The MCP specification standardizes message formats for tool discovery, invocation, and result handling between clients and servers.[2] During initialization, an MCP server advertises what it can do (its available tools, resources, and/or prompts) and the client and server negotiate supported features (capabilities) for the session. Tools are described with machine-readable schemas (using JSON Schema for inputs/outputs) so that the LLM can understand how to call them.

By standardizing how an agent learns about available tools and how it calls them, MCP enables dynamic tool integration. An AI agent can query any MCP server for its tool list (e.g. functions like search_files or send_email with input schemas) and then invoke those tools via a JSON-RPC request over the active connection. The host's MCP client handles the low-level protocol (stdin/stdout streams for local servers, or HTTP + Server-Sent Events for remote servers) so that from the LLM's perspective, using a new tool is as simple as calling a function in a standardized format. This allows agents to gain new capabilities at runtime without retraining or hard-coding, which is especially useful when you want to add tools to an AI agent whose code you cannot modify directly. For example, a closed-source chat assistant might only support a few built-in tools – MCP provides a pathway to extend it with third-party tools by running MCP servers that the assistant's host can connect to.
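
To make this concrete, the sketch below shows roughly what the client side of this flow looks like using the official MCP Python SDK (the mcp package): launch a local server over stdio, negotiate capabilities, list the advertised tools, and invoke one. The server script (file_server.py) and the search_files tool are illustrative assumptions rather than references to a specific published server, and exact SDK signatures may differ slightly between versions.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Illustrative: launch a hypothetical local MCP server as a subprocess over stdio.
server_params = StdioServerParameters(command="python", args=["file_server.py"])

async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Capability negotiation happens during initialization.
            await session.initialize()

            # Discover the tools this server advertises (names, descriptions, JSON Schemas).
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Invoke a tool by name with JSON arguments; the server performs the actual work.
            result = await session.call_tool("search_files", {"query": "quarterly report"})
            print(result.content)

asyncio.run(main())
```

The same discover-then-invoke pattern applies to remote servers over HTTP with Server-Sent Events; only the connection setup differs.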

In summary, MCP establishes a common protocol for discovery and invocation of tools by AI systems. The model (or agent) remains focused on language reasoning, while external MCP servers (maintained by tool providers or the user) handle the actual operations (searching a database, retrieving documents, executing code, etc.) through a secure, uniform interface. This clear separation of concerns brings flexibility: one can swap out LLM backends or tool implementations independently, as long as both sides speak MCP. It also promotes interoperability across the AI ecosystem – akin to how the Language Server Protocol standardized language tooling for all IDEs, MCP aspires to let any AI assistant work with any tool that implements the protocol.

Real-World Implementations and Use Cases

Since its introduction, MCP has rapidly gained traction with developers and enterprises as a promising interoperability layer for AI agents. A growing open-source ecosystem of MCP servers has emerged, providing integrations for everything from local file systems and databases to SaaS applications. For example, official or community-supported MCP servers exist for Google Drive, Slack, GitHub, Notion, databases like Postgres, web browsers, and many more. At the time of writing, directories report thousands of MCP servers published in open source repositories (though with varying levels of maturity). This includes both "resource" servers (which expose data, for retrieval or search) and "tool" servers (which perform actions like creating tickets, sending emails, running code, etc.), and some hybrid servers that do both. Organizations are also creating private MCP connectors to internal systems, using the protocol to let AI agents safely query proprietary data sources.

Notable implementations of MCP have been integrated into various AI frameworks and platforms:

  • Anthropic Claude and Claude Desktop: As the originator, Anthropic built native support for MCP into the Claude family of assistants. Claude Desktop (the local PC app) allows end-users to install or connect to MCP servers to extend Claude's abilities. Anthropic provided a suite of pre-built servers (for popular tools like Slack, GitHub, etc.) and demonstrated how Claude 3.5 could even help generate MCP server code for new integrations. In June 2025, Anthropic's Claude Code (its agentic coding tool) added support for remote MCP servers with OAuth, meaning developers can point Claude to a cloud-hosted MCP server (e.g. a Sentry or Linear integration) and authorize it, without running anything locally. This update made it easier to use MCP in enterprise settings: vendors can host MCP endpoints for their APIs, and Claude users simply add the URL in order to enable new tools with the assurance of built-in authentication and consent flows.

  • LangChain and LangGraph: The popular LangChain framework (and the related LangGraph project) historically offered its own tool integration system, but it has embraced MCP for interoperability. LangChain's team released an MCP Adapters library that makes Anthropic MCP servers usable as LangChain tools. This means a developer using LangChain can connect an agent to any MCP-compatible server (including the growing open-source list) with minimal effort. Conversely, it allows tool builders to write an MCP server once and have it be usable by LangChain agents, LangGraph workflows, or any other MCP-enabled orchestrator. Harrison Chase (LangChain's CEO) has highlighted MCP's value for scenarios where the agent logic is not under your control – in such cases, MCP can inject external capabilities without needing to alter the agent's code. LangChain's blog even staged a debate on whether MCP will be a lasting standard or a passing fad (more on that in the "Debates" section).[3]

  • Enterprise Agent Platforms: Major cloud and enterprise AI platforms have begun integrating MCP as a tool plugin mechanism. Notably, AWS's Bedrock Agents added support in 2025 for MCP as a way to extend agent capabilities. Amazon's official blog shows how a Bedrock agent can be configured with one or more MCP clients (called Action Groups) to grant it access to corporate data or third-party APIs.[4] In an example, an agent was connected via MCP to AWS Cost Explorer and CloudWatch data, as well as to an internet answer service (Perplexity AI), enabling the agent to dynamically retrieve cloud spend information and answer questions about it. Microsoft has also experimented with MCP in developer tools – Visual Studio 2022's "Copilot Chat" agent mode (in preview) allows developers to attach MCP servers so the AI assistant can perform tasks like repository querying or issue tracking within the IDE.[5] The VS integration supports both local and remote servers, and uses a simple JSON config file (.mcp.json) to specify which servers to launch or connect to (for example, running the official GitHub MCP server via a Docker command). The agent (powered by GitHub Copilot / OpenAI) will list the server's tools and require the user's permission before invoking them – illustrating the emphasis on user consent.

  • Early Enterprise Adoption: Several companies have publicly embraced MCP as part of their AI stack. Anthropic reported that Block (Square) and Apollo were early adopters piloting MCP for internal agentic systems. Developer-focused products like Zed (code editor), Replit, Codeium, and Sourcegraph have collaborated with Anthropic to embed MCP capabilities, aiming to let their AI coding assistants retrieve context (like relevant code from repositories or documentation) via standardized connectors. Enterprise software vendors have also started providing official MCP servers for their APIs – for instance, MongoDB created an MCP server to let agents query MongoDB databases with fine-grained control, and Cloudflare, PayPal, Wix, Glean (Cisco) and others have announced MCP integrations as well. Many see laying this interoperability groundwork as preparing for a future where AI assistants are commonplace; by offering an MCP server now, a company ensures that any compliant agent (present or future) can interface with its system in a safe, structured way.

  • OpenAI and Multi-Model Support: Initially, MCP was driven by Anthropic's ecosystem, but it is not limited to Anthropic models. By mid-2025, OpenAI introduced MCP support in its own APIs and SDKs. OpenAI's Agents SDK added classes like MCPServerStdio and MCPServerSse so that developers can connect ChatGPT-based agents to MCP servers as easily as to native functions.[6] The OpenAI "Responses API" (which underpins function calling and tool use) similarly announced support for calling remote MCP servers with only a few lines of configuration. This development is significant: it indicates that MCP is being recognized across the industry, and that a major model provider is enabling cross-platform tool use rather than only proprietary plugin frameworks. Even open-source LLM frameworks and independent projects have joined in – e.g. libraries like Langchain-MCP adapters, the Portkey/lastmile AI MCP agent, and others on GitHub demonstrate how to integrate local LLMs or other agent systems with MCP servers.[7] In sum, an MCP server written today can potentially be used by Anthropic's Claude, OpenAI's GPT-4, local LLaMA-based agents, or any future AI that adopts the standard, giving it a unique position as a tool interchange language for AI. (A brief sketch of connecting an Agents SDK agent to an MCP server follows this list.)
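
As a rough illustration of the Agents SDK integration described in the last bullet, the sketch below connects an agent to the reference filesystem MCP server over stdio and lets the model call its tools. It assumes the openai-agents Python package, an OPENAI_API_KEY in the environment, and the @modelcontextprotocol/server-filesystem npm package; the parameter names shown (params, cache_tools_list, mcp_servers) follow the SDK's documented MCP support but may vary by version, so treat this as a sketch rather than production code.

```python
import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio

async def main() -> None:
    # Launch the reference filesystem MCP server as a subprocess (via npx) and connect over stdio.
    async with MCPServerStdio(
        params={"command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]},
        cache_tools_list=True,  # avoid re-fetching the tool list on every agent run
    ) as fs_server:
        agent = Agent(
            name="Assistant",
            instructions="Use the filesystem tools to answer questions about local files.",
            mcp_servers=[fs_server],  # the agent discovers and calls this server's tools
        )
        result = await Runner.run(agent, "List the files in the current directory.")
        print(result.final_output)

asyncio.run(main())
```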

Best Practices for Building and Securing MCP Servers

Building an MCP server involves more than just writing some API wrappers – to be effective (and safe) in an AI agent context, developers should follow certain best practices. Key recommendations include:

  • Clear Schema Definition and Validation: Design your tools with well-defined JSON schemas and enforce them. Each Tool (action function) or Resource (data query) exposed by your server should have a precise input/output schema (using JSON Schema) describing the expected parameters and format. This not only helps the LLM understand how to call the tool, but also lets you validate incoming requests and reject malformed or unexpected inputs at runtime. Avoid overly complex or deeply nested schemas – keep inputs concise and oriented to how a model would use them (e.g. use simple types and short field names to minimize token overhead). Mark optional vs required fields clearly, and consider providing example usage in your documentation. On startup, an MCP server typically advertises its tools and schemas to the client; by locking down this advertised schema (and perhaps caching it on the client side for review), you reduce the risk of a malicious or buggy server sneaking in an unexpected tool call. In practice, developers often rely on MCP server frameworks (like FastMCP in Python or the official SDKs) that use function signatures and docstrings to auto-generate these schemas, which can reduce errors. Schema validation is a must on both sides – MCP clients should verify that a server's declared schema is valid per the MCP spec and doesn't contain disallowed structures, and servers should validate each request against the schema before execution.

  • Authentication and Authorization: Secure any external access your server provides. Many MCP servers act as proxies to APIs or databases, so they need to handle credentials and permissions carefully. A best practice is to manage API auth internally within the MCP server – for example, perform the OAuth flows or API key injection inside the server rather than expecting the LLM or user to provide raw tokens each time. Store secrets in environment variables or secure vaults (never hardcode them in code or config), and use least-privilege principles (e.g. if your server only needs read access to a database, use a read-only credential). If building a remote MCP server intended for multiple users, implement proper auth for incoming MCP client connections as well. The MCP spec itself doesn't mandate an auth scheme for connecting to servers, but a common approach is to use OAuth 2.1 with client credentials or token-based auth for remote servers. For example, Anthropic's Claude Code will prompt the user to authenticate when adding a remote server, then handle token storage securely. Never trust the model with sensitive credentials in plain form – design servers such that the model triggers an action (e.g. "send_email") and the server internally inserts API keys or tokens to execute it, without exposing those secrets in the prompt or response. Also, implement authorization checks if needed: verify the agent/user identity if your server exposes protected data, and enforce any usage policies (the MCP protocol allows the server to inquire about the client's identity or organization in some cases to make decisions). Essentially, treat your MCP server like a public API endpoint – use HTTPS for transport, require auth for non-local use, rate-limit if necessary, and audit its access.

  • Tool Design and Documentation: Make tools that are intuitive for LLMs to use and document them for human developers. Choose clear, descriptive names for tools and resources – e.g. create_issue instead of addRec, or get_user_profile for a read action. The name and the description you provide are what the LLM will rely on to decide when to use the tool, so they should accurately convey the function's purpose. Provide concise but specific descriptions (in the tool's metadata) about what the tool does and any important caveats. For instance, a tool that deletes data should mention that explicitly so the model uses it cautiously. If a tool has side effects or costs, include that context ("Sends an email via SMTP."). Include examples or usage guidance in the tool's docstring or in your README – while the LLM might not see those, developers configuring the system will, and even the act of writing it can help ensure clarity. On the documentation side, maintain a README or docs for your MCP server that covers: how to install/run it, what tools it provides (with input/output schema for each), and any setup needed (auth tokens, config variables). This greatly aids developer onboarding, since others might use your server in their agents. Good documentation also helps non-developers (analysts, product managers) understand what an agent can do when your server is connected, which is important for trust and adoption.

  • Developer Onboarding and Maintenance: Make it easy to build, test, and update MCP servers. One best practice is to leverage the official MCP SDKs (available in Python, TypeScript/JavaScript, Java, C#, Swift, etc.) to implement servers. These SDKs encapsulate the JSON-RPC protocol details and provide base classes for defining tools, emitting events, handling subscriptions, etc., according to the spec. By using an SDK, you ensure your server remains compliant with the MCP schema and can benefit from upstream improvements (security patches, performance tuning). Regularly update to the latest spec version – the MCP spec is evolving (e.g., capabilities negotiation and new features have been added in 2025), so keeping your server library up-to-date will maintain compatibility. During development, take advantage of provided tooling such as the MCP Inspector and other debugging tools, which let you simulate a client and verify your server's responses. Testing is crucial: implement unit tests for each tool (including edge cases and error conditions) and do integration tests by running the server and calling it from a dummy MCP client or agent. This catches issues in how the tool results are formatted or whether errors are handled gracefully. Set up continuous integration to run these tests, especially if you accept community contributions (in open-source servers). Finally, maintain your server: monitor its logs in production (MCP provides a logging mechanism and you can also add custom logging around external API calls), and track usage metrics (which tools are invoked how often, latency, etc.). This will help in optimizing performance and ensuring reliability for enterprise use.

  • Security and Safety Measures: Incorporate safety guardrails to mitigate risks when enabling tool use by AI. The MCP specification emphasizes user consent and control – implementors should ensure that the user is always in the loop when dangerous actions are taken. If you are writing an MCP host or client (rather than a server), you must prompt the user for confirmation before the AI can, say, execute a shell command or send an email. From the server side, you should assume the AI could attempt anything the tool allows, so put safeguards: validate and sanitize inputs (e.g. if your tool runs a database query, prevent SQL injection or disallow certain destructive queries unless in safe mode). If your server executes code or accesses files, consider running it in a sandboxed environment (Docker container with read-only file system, limited network access, non-root user, etc.). This protects the host system in case the AI finds a way to exploit the tool. Implement timeouts for long-running operations and graceful error handling – never let the server hang indefinitely and stall the agent. For remote servers, use appropriate authentication (as discussed) and restrict network exposure (e.g. bind to localhost or require VPN if it's an internal service). Tool-specific safety: if a tool performs an irreversible action (deleting data, moving funds), you might build in an extra confirmation step or a "dry-run" mode that the AI must call first (giving the user a chance to review). Some enterprise MCP clients support scopes or permissions (for example, Visual Studio resets tool permissions whenever a tool list changes, to prevent a server from swapping in a new dangerous tool after user approval). As an MCP server developer, be transparent about what your tools do and adhere to any safety prompts or instructions the host might send. Ultimately, running arbitrary tools via LLM poses trust and safety challenges, so defense-in-depth is recommended: validate everything, log actions, and give users oversight at multiple stages.

By following these best practices – defining strict schemas, securing credentials, writing clear docs, testing thoroughly, and building in safety checks – you can create MCP servers that are reliable and enterprise-ready. The goal is to make it as straightforward and safe as possible for an AI agent to interface with your tool. A well-designed MCP server will minimize misunderstandings by the LLM and prevent unintended side effects, thus instilling confidence for both developers and users when integrating it into AI workflows.[8][9][10] A minimal server sketch illustrating several of these practices follows below.
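
The sketch below pulls several of these recommendations together using the FastMCP helper from the official Python SDK: typed parameters and a docstring (from which the tool's schema and description are generated), credentials read from environment variables rather than from the model's context, explicit input validation, and a bounded timeout on the outbound call. The issue-tracker API, its endpoint, and the environment variable names are hypothetical.

```python
import os

import httpx
from mcp.server.fastmcp import FastMCP

# A hypothetical "issue tracker" MCP server; the tracker API and endpoint are illustrative.
mcp = FastMCP("issue-tracker")

# Credentials stay inside the server process and never appear in the model's prompt or output.
API_TOKEN = os.environ["TRACKER_API_TOKEN"]
API_BASE = os.environ.get("TRACKER_API_BASE", "https://tracker.example.com/api")

@mcp.tool()
def create_issue(title: str, body: str = "", priority: int = 3) -> str:
    """Create a new issue in the tracker. Note: the issue is immediately visible to the whole team."""
    # Validate inputs beyond what the auto-generated JSON Schema enforces.
    if not title.strip():
        raise ValueError("title must be non-empty")
    if priority not in (1, 2, 3, 4, 5):
        raise ValueError("priority must be an integer between 1 and 5")

    # Server-side credential injection and a bounded timeout so the agent never stalls.
    response = httpx.post(
        f"{API_BASE}/issues",
        json={"title": title.strip(), "body": body, "priority": priority},
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10.0,
    )
    response.raise_for_status()
    return f"Created issue {response.json()['id']}"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```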

Current Limitations and Debates around MCP

Despite the excitement around MCP, there are active debates in the AI community about its complexity, practicality, and long-term role. Critiques of MCP generally focus on a few areas: the protocol's overhead and scope, potential duplication of existing capabilities, security/trust issues, and actual effectiveness in practice. Here we outline some of the key limitations and contrasting viewpoints:

  • Protocol Complexity and Scope Creep: One critique is that MCP, in trying to be comprehensive, has become more complicated than necessary for a "tool calling" protocol. For example, the MCP spec doesn't just cover tools – it also includes provisions for sharing prompts (predefined prompt templates) and even letting servers request their own LLM completions or user queries (the sampling feature). Some argue this goes beyond the minimal requirements to call an external function, potentially making implementations heavier. Nuno Campos (lead of LangGraph) quipped: "Why does a tool protocol need to also serve prompts and LLM completions?" – implying that MCP might be over-engineered. He also questioned the decision to use a stateful, two-way JSON-RPC connection for tools, which necessitates persistent sessions and event handling, rather than a simpler stateless request/response API. The MCP designers chose JSON-RPC and streaming for flexibility (bi-directional notifications, incremental outputs, etc.), but skeptics note that this makes running MCP servers in scalable cloud environments trickier (maintaining state per session). In essence, critics would prefer a leaner protocol focused purely on tool invocation, leaving prompts and other extras out; they worry that the added complexity raises the barrier to implementation and may not be worth the payoff if many agents don't use those advanced features.

  • Tool Quality and "Footgun" Concerns: Another challenge raised is whether plugging in arbitrary third-party tools can truly yield robust agent behavior. Empirical evidence shows current LLMs often struggle to choose the correct tool and format the call correctly without significant prompt tuning. Nuno pointed out that in carefully crafted agents, models still mis-call tools about half the time, so letting users throw random new tools at an agent (via MCP) might result in a poor success rate. The argument is that effective agent-tool integration usually requires tailoring the system prompt and instructions to those specific tools – something one doesn't get when tools are added dynamically. If an agent doesn't understand the nuances of a new tool, it might call it incorrectly or at suboptimal times. Harrison Chase acknowledged that "these agents might not be 99% reliable" but hopes they can be "good enough to be useful" for many personal or internal use cases. Still, the risk is that if tool calls fail frequently or produce errors, end-users will get frustrated. This leads to a debate: Is it better to have a long tail of easily added tools that work occasionally, or a short list of hand-integrated tools that work consistently? Proponents believe LLMs are rapidly improving at tool use and that a huge ecosystem of tools (as enabled by MCP) will unlock countless niche workflows that bespoke agents would never cover. Skeptics counter that for serious production agents, one would rather custom-build the integration to ensure reliability, instead of relying on a one-size-fits-all tool caller. In short, there's a quality trade-off inherent in MCP's generality.

  • Security and Trust Considerations: By design, MCP can grant an AI broad powers – from reading files to executing code – which raises trust and safety issues. One point of debate is how to trust third-party MCP servers. Since anyone can write an MCP server, an agent or user connecting to one is effectively running unknown code (especially for local servers). If the server is malicious or compromised, it could abuse the host system or return deceptive information. The MCP spec leaves authentication and trust up to the implementer (hosts are expected to prompt users for approval of each tool and not expose sensitive data without consent), but it doesn't provide a built-in authentication handshake for server identity. This is gradually being addressed by platform-specific measures (for instance, Anthropic's directory of "recommended" servers or vendors signing their servers, and hosts like VS and Claude Code requiring OAuth or user tokens for certain remote servers). Still, the lack of a standardized trust mechanism means adoption in enterprise settings requires careful vetting of each server. Companies may choose to run only internally developed MCP servers or those from trusted vendors, rather than arbitrary open-source ones. Another aspect is that some critics feel MCP should have offered a more direct secure HTTP API model for tool calls. Petrus Theron, an early MCP contributor, argued that "LLMs already know how to talk to any API documented with OpenAPI; the missing piece is authorization. Why not just let the AI make HTTP requests with proper auth, instead of wrapping everything in MCP?" From that perspective, MCP's JSON-RPC layer and custom schema could be seen as unnecessary indirection – a simpler approach could be an authorization gateway that lets an AI call whitelisted REST APIs safely. Theron also noted the current MCP spec lacks streaming responses for tools (each tool call is a single request/response, meaning long-running operations can't send intermediate data unless you break them into chunks). He suggested gRPC or another streaming-friendly RPC would have been better for those cases. These critiques highlight that MCP is not without design trade-offs: it chose a uniform RPC for compatibility and session context, at the cost of some statelessness and simplicity.

  • Adoption and Fragmentation: A major question is whether MCP will become a widely adopted standard (a fixture in AI infrastructure) or just a transient experiment (a fad). Early on, one limitation was that not all major AI providers supported MCP – it originated with Anthropic's Claude, while OpenAI and others had their own plugin/function paradigms. By mid-2025, this gap started closing (OpenAI's support as noted), but we still see parallel efforts like Google's A2A (Agent-to-Agent protocol) and Cisco's AGNTCY, each addressing integration in different ways. The existence of multiple "standards" has led to talk of an impending "AI agent protocol war". Google explicitly positioned A2A as complementary to MCP, not a competitor, focusing on inter-agent communication rather than tool use. However, it remains to be seen if the industry will converge on a single method. Some skeptics, like Nuno, have pointed out that early MCP enthusiasm might be inflated by hype on social media rather than real usage – he referenced an online directory counting many servers but implied real adoption lagged the buzz. The LangChain debate ended by listing what MCP would need to avoid being a historical footnote: simpler implementation, a stateless/scale-friendly mode for server use, better auth solutions, and improved reliability when mixing in new tools. These are non-trivial challenges, but they are being worked on (for example, Anthropic's June 2025 update introducing OAuth for remote servers directly tackles the auth issue, and discussions are ongoing about an HTTP-only stateless variant or best practices for scaling MCP servers).

On the other hand, supporters of MCP argue that many of these limitations are surmountable or already improving, and that MCP's benefits outweigh its downsides for certain classes of problems. They note that MCP's ecosystem is growing fast – in its first 7–8 months it became, at least by count of integrations, the de facto choice for connecting AI to tools. Enthusiasts believe that even if MCP isn't perfect, it has spurred a community to rally around open tool integrations rather than closed proprietary ones, which in the long run is better for the AI industry. Jeff Wang of Exa stated that the reason tool interoperability protocols are "having a moment" now is that models have finally become capable enough to use tools effectively, making it urgent to establish common standards for doing so. From the enterprise perspective, champions like MongoDB's product director tout MCP's fine-grained control – the ability for organizations to expose exactly what an agent can do or see, in contrast to giving an AI unfettered API access. This resonates with companies that need to safeguard data: with MCP servers, they decide which functions to expose and can even require identity/context from the agent side before fulfilling a request, adding a layer of governance. So while there is healthy skepticism, there is also a cohort that sees MCP (or something like it) as a necessary piece of the AI stack going forward, albeit one that will evolve. As one VentureBeat article put it, many enterprises are starting to invest in MCP now on the belief that it could become "the universal language" for AI-tool interoperability – and that laying the groundwork early is preferable to being left behind.[11]

In summary, MCP's current limitations include its relative complexity, the need for careful security handling, questions about reliability with arbitrary tools, and the fact that it's an emerging standard with some competing approaches. The community is actively debating whether these are simply growing pains of a future fixture or signs that a different approach might win out. The coming year will likely see refinement of the protocol (e.g. addressing streaming and stateless use cases) and clearer evidence of how well MCP-powered agents perform in real-world use. These debates are valuable, as they are prompting improvements and informing best practices – whether MCP in its exact current form endures or not, it has undoubtedly accelerated progress toward interoperable, extensible AI agents.

Production Readiness and Future Outlook

Is MCP ready for production use in enterprise systems? The answer is nuanced. On one hand, MCP has reached a level of maturity and support in mid-2025 that allows for serious deployments; on the other hand, it is still rapidly evolving and not yet a "plug and play" panacea for all AI integration needs.

Evidence of readiness: Several enterprises have already implemented MCP in production-like environments. As noted, companies like Block have used it to build internal agent systems, and Amazon's Bedrock platform (targeted at enterprise AI workloads) has built-in MCP integration, which signals confidence in its stability. The availability of multi-language SDKs and reference servers makes it easier for enterprise dev teams to build custom MCP connectors (there are official templates and quickstart guides for languages like Python and Java, which speed up development while following the spec). Importantly, the move by OpenAI to support MCP means that an enterprise not on Claude can still leverage MCP servers with models like GPT-4, potentially using OpenAI's managed tooling infrastructure. This cross-compatibility is a big step toward production viability – enterprises can invest in writing an MCP server (for, say, their internal knowledge base) and use it with multiple AI vendors' models, reducing vendor lock-in. Moreover, critical features for enterprise use are being addressed: the Claude Code remote server support update introduced native OAuth flows, which is crucial for integrating with business SaaS APIs securely. We also see efforts around deployment: containerized MCP servers, serverless templates, and helm charts are starting to appear in the community, meaning teams are packaging MCP servers for cloud deployment with observability and scaling in mind (early adopters have shared tools to run MCP servers on Kubernetes, etc., though this is outside the core spec).

What's missing or in progress: One gap is a lack of official endorsement by a standards body or industry consortium. MCP today is an open source project led by Anthropic and contributors; it isn't (yet) an IEEE or W3C standard, for example. This could give some enterprises pause, worried that changes could occur or that the spec might fork. The MCP team has published a versioning policy to manage changes and maintain backward compatibility as much as possible, and the spec has seen minor revisions (e.g. 2025-03 and 2025-06 versions) to incorporate feedback. Another missing piece is major model-provider baked-in support beyond what's already mentioned – e.g., OpenAI's support is via their SDK, but OpenAI's own ChatGPT user interface does not let end-users connect MCP servers (OpenAI's approach to tools for end-users was their Plugins system). Similarly, while Anthropic supports it through Claude's app, other big players (Google, Microsoft in their public AI products) have not exposed MCP to end-users yet. An important milestone to watch will be if Microsoft Azure OpenAI or other managed services allow MCP connectivity – for instance, Microsoft could integrate MCP into its Azure AI Studio or the Power Platform for AI, which would instantly expose MCP to many enterprise developers. There are hints of movement: Microsoft's Visual Studio experiment shows internal interest, and Google's stance of complementarity suggests they might allow an A2A agent to use an MCP tool as a sub-task.

From an operational perspective, tooling around MCP deployment needs to mature for full production readiness. Right now, if a company wants to host a dozen MCP servers for various internal tools, they must handle running each service (perhaps as separate processes or containers), monitoring them, updating them, etc. This could become complex – akin to running many microservices. We might expect "MCP server hubs" or management platforms to emerge that simplify this (for example, a unified gateway that hosts multiple tool backends with a single control plane). Already, marketplaces like Glama.ai list thousands of servers, but curating and managing versions and trust for enterprise use is an ongoing challenge.[7] Ecosystem consolidation is likely to happen: popular services will have officially maintained MCP servers (many companies have already published or are working on theirs), and redundant community implementations will either converge or be pruned over time. Enterprises will likely prefer a smaller set of well-supported servers (with SLAs, security reviews, etc.) rather than the wild west of GitHub. The announcement that Anthropic will provide "developer toolkits for deploying remote production MCP servers that can serve your entire Claude for Work organization" is a sign that first-party solutions for enterprise deployment are on the roadmap. This might include easy ways to host MCP servers on-prem or in a VPC, integration with enterprise identity systems for auth, and auditing capabilities.

In terms of performance and scalability, MCP introduces some latency overhead (especially for remote servers) and statefulness. For production, careful design is needed to avoid slowing down the agent: e.g., caching the list_tools result as OpenAI's SDK allows, running servers close to the agents to reduce network hops, and possibly redesigning long workflows so that multiple tool calls can be batched if needed. Nuno's critique about statelessness highlights that current MCP servers maintain a session per client, which doesn't auto-scale behind a load balancer easily. In practice, many tool calls are short-lived and the number of concurrent agent sessions might be limited in an enterprise use case (for example, 100 analysts each running their own agent), so this may not be a blocker. But for high-scale scenarios (think thousands of agents calling a common web search MCP server), solutions like sharding or stateless modes may be required. We anticipate improvements or patterns to emerge (perhaps an MCP proxy that distributes calls to worker processes).

Key milestones to watch: One is formal adoption by additional AI providers – if, say, Anthropic, OpenAI, and Google all reliably support MCP in their agent offerings, it will likely become a fixture. OpenAI's move was big; if Google's Cloud Vertex AI (which already hosts Claude and other models) adds a tool plugin interface, will they choose MCP or something else? Google might lean on A2A for multi-agent, but they could still implement MCP for tool use given its traction. Another milestone would be standardization efforts: it's conceivable that the tech industry might form a working group to merge ideas from MCP, A2A, and other protocols into a unified standard (much like how HTML had competing specs before consolidation). For now, Anthropic's MCP and Google's A2A seem to occupy different niches, but Cisco's AGNTCY and others indicate many parties want a say. Watch for conferences or forums where these are discussed collectively.

Lastly, production readiness will be affirmed by success stories: if in the next 6–12 months we see case studies of Fortune 500 companies deploying AI copilots that use MCP to interface with internal systems (and delivering real ROI), that will solidify MCP's credibility. Early signs are positive – for example, Block's CTO's comment on MCP highlighted the value of open protocols in building agentic systems that "remove the burden of the mechanical" for employees. If those pilots demonstrate value and safety at scale, MCP (or its evolved descendant) will likely move from the "early adopter" phase to a more mainstream phase in enterprise AI architecture.

In conclusion, MCP is on the cusp of production readiness. It's not a turnkey solution yet – organizations must currently put in engineering effort to use it responsibly – but the momentum and support around it suggest it's more than a transient fad. Companies adopting it now are gaining experience in what might become a foundational layer for AI integration. Caution is still warranted (as with any new technology), but the direction is clear: enabling AI agents to securely and flexibly use tools is a crucial capability, and MCP offers a leading approach to achieve that. Enterprises should keep an eye on MCP developments, invest in small-scale trials, and contribute feedback to ensure the protocol meets their needs. With continued iteration on security, scalability, and standardization, MCP is poised to move from experimental to enterprise-grade in the near future.

Comparisons to Other Tool Provisioning Methods

MCP is one approach to giving LLMs extended capabilities – it's useful to compare it with alternative methods and frameworks for provisioning tools. The main alternatives include vendor-specific function calling interfaces (like OpenAI's Functions), plugin systems with manifests (e.g. OpenAI Plugins), orchestration frameworks like LangChain's toolkits, and other emerging protocols like Google's A2A. Each has its trade-offs in terms of flexibility, security, ease of use, and ecosystem.

  • MCP vs. Native Function Calling (OpenAI Functions and Anthropic's Tools API): Both OpenAI and Anthropic allow developers to supply function definitions alongside a prompt, so that the model can choose to call those functions. In OpenAI's case, the developer provides a name, description, and JSON schema for each function via the API, and the model can return a structured function call which the client code then executes. Anthropic's Claude has a similar "tools" parameter for function definitions. The goal of this is the same as MCP – enabling tool use – but the approach is different. Function calling is compiled into the model's API call itself, whereas MCP externalizes the tool interface to an independent server. With native function calling, the set of available tools is determined at runtime by the client's API call; if you want to add a new tool, you typically must modify your code/prompt to include it. In contrast, with MCP, an agent platform could dynamically discover new tools by connecting to new servers without changing the prompt (the agent just refreshes the tool list via MCP). One advantage of native functions is simplicity and performance – there's no additional network round-trip or process management; the model outputs a function call directly and your code executes it. It's "hard-coded and accessed locally" while MCP is "externally served and accessed". This means function calls can be faster and less failure-prone (fewer moving parts). However, the coupling is tighter: the agent developer must anticipate and include all needed functions, and the logic for those functions lives in the same application environment. MCP's loose coupling allows the tool implementations to live elsewhere (even maintained by other teams or third parties), and encourages reusability. Another distinction is multi-model and multi-platform support – OpenAI's function calling only works with OpenAI models, whereas MCP is model-agnostic. For example, if you built an internal tool via OpenAI functions and later want to use it with a different LLM, you'd have to port that logic, whereas an MCP server could serve both with no change. Security is another differentiator: with function calling, the burden is on the developer to ensure the function is safe (since it runs in-process). MCP, by being out-of-process, provides an opportunity to sandbox or isolate the tool execution. That said, function calling and MCP can coexist – indeed, many MCP servers internally use the function calling mechanism to let the LLM know about the tool. (Anthropic's Claude, for instance, sees MCP tools as if they were functions with schemas – the Claude client essentially injects the MCP server's tool list into the model's context or uses the Tools API under the hood). In summary, function calling is great for developers who control the whole stack and need direct, efficient tool use for a known set of functions, while MCP shines in more open-ended, extensible scenarios or where tool providers are separate from the model provider. One can view MCP as a higher-level protocol that could use function calling as a mechanism: e.g., OpenAI's Agent SDK uses function calls to implement the MCP server interface, bridging the two worlds.

  • MCP vs. OpenAI Plugins (Manifest-Based Plugins): OpenAI's initial solution for extending ChatGPT was a plugin system where each plugin is essentially a self-hosted web service with a standardized manifest file and OpenAPI spec. The manifest (ai-plugin.json) advertises the plugin's endpoints and authentication method, and ChatGPT could fetch this and then call the plugin's REST API (with the model generating HTTP requests). This approach also aimed at standardizing tool integration, but it differs from MCP in key ways. First, the communication is natural language + HTTP calls rather than JSON-RPC sessions. ChatGPT interprets the OpenAPI spec and includes API call examples in the prompt, and the model decides when to formulate an HTTP request. This means the model is formulating the tool invocation at the HTTP level, which proved challenging (the model might format a request incorrectly or misunderstand a parameter). MCP, by contrast, formalizes the call structure – the model just triggers a function with JSON args, and the MCP client/server handle the rest, which is arguably more robust. Second, the plugin system is proprietary to OpenAI's ChatGPT (and now possibly being phased into the function calling/"Agents" approach), whereas MCP is open and multi-platform. OpenAI plugins required OpenAI's specific orchestration (and only worked in ChatGPT or via their API if you implemented the plugin protocol yourself). MCP is more flexible in that any application can implement it and any model with an agent loop can use it. Security & consent: OpenAI plugins had an approval UI and verified domains, but once approved, the model could call the plugin freely. MCP typically expects the host to ask user permission for each tool use as well, and because MCP servers can be local, there's potentially more fine-grained control (e.g., the host can decide not to pass certain data to the server). In terms of discoverability, plugin manifests offered a way to automatically discover capabilities (via the .well-known URL), similar to how MCP servers provide a way to list tools. Interestingly, Google's A2A "agent cards" are conceptually similar to plugin manifests (a JSON describing an agent's API). The plugin model suffered from limited adoption – by late 2024 it was clear that ChatGPT plugins didn't see massive use, perhaps due to UX friction or reliability issues. Harrison Chase noted that the MCP ecosystem became larger "already" than the ChatGPT plugin ecosystem ever was. He attributes this to openness and model improvements making tool use easier now. In practice, one can see MCP as addressing some shortcomings of the plugin approach: by keeping the tool invocation in a controlled protocol (JSON-RPC) rather than raw natural language->HTTP, and by not tying it to one provider. However, plugins had the advantage of using existing REST APIs directly. With MCP, many servers are essentially thin wrappers around REST APIs (e.g., a GitHub MCP server that calls GitHub's API). Critics might ask: why not let the model call the REST API directly with an OpenAPI-spec guide (which is what plugins did)? The answer comes back to control and ease: wrapping it in an MCP server means you can enforce auth, simplify the interface, and not rely on the model to generate low-level HTTP – the server can do that and just return the result. 
So MCP and plugin manifests both attempt to standardize tool usage, but MCP offers a more structured and tool-provider-centric model, whereas the OpenAI plugin approach was more model-centric and ad-hoc. It's telling that OpenAI themselves are moving toward a paradigm closer to MCP (with their "Tools" in the API and the new Agents SDK using constructs like MCPServer classes).

  • MCP vs. LangChain-Style Toolkits: LangChain (and similar frameworks like Microsoft's Semantic Kernel) provided a way to integrate tools into agent chains by writing Python classes or functions (often called tools or toolkits) and then manually informing the LLM about them (usually via prompt templates that list the tool names and descriptions). For example, LangChain has a library of tools (search, math, file I/O, etc.) and developers could add custom ones following a simple interface (a run() method and a description). This is a very code-centric approach: the tools run in-process, and the agent's logic often explicitly calls a tool based on parsed LLM output. Compared to MCP, this is tightly integrated – essentially the opposite of a standardized external protocol. The advantage of the LangChain approach is straightforward integration when you are building a bespoke agent: you have full control, and you can tightly couple the tool with the agent's prompt (including adding examples of usage, etc.). Indeed, many high-reliability agents today are built this way, with carefully chosen tools and custom prompts. The downside is that those tools are not portable outside the codebase – a LangChain tool can't be "discovered" or used by a different agent framework without writing adapter code (hence why LangChain created an MCP adapter, to bridge this gap). LangChain's own library of 500+ tools, while extensive, saw limited production use according to its team. Part of the reason might be that dropping a large set of generic tools into an agent often made the agent unreliable or unpredictable, as Nuno observed. MCP doesn't inherently solve that reliability issue, but it does enable a more modular ecosystem: tool developers can package logic as a server that any agent builder could try, without copying code. Discoverability is a key difference – an MCP server can be registered in a directory and anyone can run it to grant their agent new powers, whereas LangChain's tools were mostly something a developer had to know about and include at development time. Security is another difference: LangChain tools execute in the same process as the agent, so a malicious tool could do damage (similar to function calling issues). With MCP, you could run an untrusted server in a sandbox process, as mentioned. On the other hand, running multiple MCP servers introduces operational overhead that a self-contained LangChain agent does not have. Dynamic vs. Static: MCP allows truly dynamic addition of capabilities (even at runtime, a user could spin up a new server and connect it), whereas with LangChain tools, whatever was coded in is it – changing or adding requires code changes and redeployment. Many see MCP as the evolution of the concept that LangChain first popularized (tool-using agents), generalizing it beyond Python and making it more end-user accessible. Harrison Chase himself said he wouldn't use MCP if he, as a developer, was writing a purpose-built agent (he'd integrate tools directly), but he acknowledges that MCP opens the door for non-developers to extend agents they couldn't otherwise modify. In enterprise, both patterns might coexist: core AI applications might use internal tool integration for critical tasks (for maximum reliability), while MCP is used to allow easier extension and integration with a wider array of services.

  • MCP vs. Anthropic/Google's A2A (Agent-to-Agent) Protocol: MCP and A2A are often mentioned together as part of the emerging "AI agent protocol" landscape, but they address different layers of the problem. MCP is primarily about an agent interacting with tools/services, whereas A2A is about agents interacting with each other. Google's Agent-to-Agent (A2A) protocol, announced in 2025 with broad industry backing, defines how autonomous agents can communicate, coordinate tasks, and delegate among themselves in a standardized way. Core to A2A is an Agent Card (a bit like a service description) that an agent exposes at a well-known endpoint, listing its capabilities and how to invoke them (often via high-level tasks). The interactions in A2A are usually task-oriented messages passed between agents (often over HTTP), and it supports multi-step workflows and asynchronous operations by design. In contrast, MCP is a tighter loop between one agent and one tool provider, usually synchronous per tool call. Google explicitly stated that A2A complements MCP: an agent might use MCP to execute a low-level tool, and use A2A to collaborate with another agent on a larger task. For example, imagine a complex business process where a Finance Agent and an HR Agent need to work together: A2A could facilitate their interaction (sharing subtasks and results), and each of those agents might use MCP to interface with internal company systems (databases, APIs) as they do their part. So it's not an either/or choice; they operate at different scopes. However, there is some overlap: one could accomplish certain things either by spawning a specialized agent or by calling a tool. For instance, to do language translation, one could call a "translate API" tool (via MCP) or call a "Translator Agent" via A2A. One potential trade-off is granularity vs. autonomy: MCP tools are typically single-function units invoked by the main agent, whereas an A2A agent might be more autonomous (it could perform internal reasoning or multiple steps once instructed). Security-wise, A2A raises similar trust issues (you're letting another agent handle part of the job) and will need robust authentication between agents, possibly more complex than MCP's client-server auth because each agent is both a client and server in a sense. The industry view is forming that MCP, A2A, and similar protocols (like Cisco's AGNTCY) will form a stack – MCP for tool use, A2A for agent orchestration, and even higher-level ones (some mention an ACP – Agent Communication Protocol – that might standardize interactions at a narrative level). As of now, MCP is further along in implementation than A2A (which is still early), but A2A has heavyweight backing (50+ companies with Google). If A2A gains traction, we might see agents that speak both: using A2A to find other agents and MCP to execute concrete operations. For a developer deciding between them: if your goal is simply "allow my single agent to use my APIs," MCP is the straightforward choice. If your goal is "build a system of multiple specialized AI workers that talk to each other," then A2A is aimed at that – though you'd likely still use MCP internally within each worker for its tools. In short, MCP vs A2A is not a direct competition – they are complementary standards at different layers of the AI stack. 
Effective AI solutions at scale may use both in tandem, and indeed the emergence of both underscores a trend toward standardized interfaces (for tools and for agent collaboration) replacing siloed, monolithic AI applications.

  • Other Methods (Manual APIs, Retrieval-Augmentation, etc.): Before MCP, many agents were built using ad hoc integrations – e.g., the agent's code calls an API when needed based on some regex or prompting. Some solutions focus purely on retrieval (RAG – Retrieval Augmented Generation) where an index or database is queried for context, often via a standard like LangChain's retriever interface or API endpoints. RAG can be seen as a subset of tool use (a resource in MCP terms). In those cases, simpler methods like direct API calls or using libraries might suffice. The trade-off comes when you want those capabilities to be discoverable and interchangeable. If a company has already invested in, say, a GraphQL API for all data, do they need MCP? Possibly not for internal devs, but if they want any AI agent (including third-party ones) to easily access that data, an MCP server could act as a bridge that speaks GraphQL behind the scenes but offers a standard interface to agents. Manifest-based registries (in a general sense, not just OpenAI plugins) might emerge where companies publish JSON specs of their AI-facing APIs. Those could even be indexed by an agent. There's research on automated agent metadata that would let agents dynamically find what tools or APIs are available in an environment. MCP's approach to discovery is more controlled (the host needs to know an address of a server or have it configured), whereas one can imagine a future where an agent queries a directory service to find "a tool that can do X." That directory could be populated by MCP server metadata or by other manifests.

In weighing these approaches, trade-offs can be summarized:

  • Discoverability and Extensibility: MCP and plugin manifests focus on making tools discoverable to agents; LangChain tools and native functions require pre-planning by developers. MCP and manifests win on dynamic extensibility (you can add new capabilities without redeploying the agent).

  • Security and Trust: Native function calls and in-process tools have a larger blast radius if something goes wrong (since they run with full app permissions). MCP introduces a trust boundary (especially if servers run with restricted permissions). However, MCP also introduces new trust questions (is the server code safe? who wrote it?). Manifest-based plugins had to deal with external HTTP calls (with all associated web security issues). Here, MCP's design of explicit user approvals and limited server visibility into the agent's state is meant to mitigate some risks.

  • Standardization and Ecosystem: MCP is emerging as a community-driven standard with multi-vendor support, whereas OpenAI's function tools were proprietary until they opened the Agents SDK, and LangChain's ecosystem, while large, was tied to its framework. If one's goal is maximum interoperability, MCP currently has the momentum in that direction. A2A targets a different aspect (multi-agent networks), and we may see convergence or layering of these standards rather than one eliminating the other.

  • Complexity: For simpler use cases, running a separate MCP server might be overkill. For instance, if an application only needs to let the model call one internal function (say, a database query), it might be easier to just use function calling with a well-crafted prompt (a sketch of this simpler path follows this list). MCP is most advantageous when you have a diverse or growing set of tools, possibly maintained by different teams, or when the agent environment is closed (like a third-party app) so you can't just inject new code. It shines in multi-team or multi-tenant scenarios (tool provider and tool consumer are separate entities), much like how web APIs shine when integrating different systems, whereas internal function calls are fine within a single codebase.
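
To illustrate that simpler path, here is a hedged sketch of supplying one function definition directly through the OpenAI Python SDK, with no separate server process; the model name and the query_orders function are illustrative only.

```python
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

# One tool, defined inline and executed in-process; no MCP server involved.
tools = [{
    "type": "function",
    "function": {
        "name": "query_orders",
        "description": "Look up recent orders for a customer by email address.",
        "parameters": {
            "type": "object",
            "properties": {"email": {"type": "string", "description": "Customer email"}},
            "required": ["email"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did alice@example.com order last week?"}],
    tools=tools,
)

# If the model decided to call the function, the structured call comes back here
# and the application executes it directly.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```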

In conclusion, MCP distinguishes itself by providing a unifying, model-agnostic layer for tool integration, whereas alternatives either lock you into a particular AI provider or lack a formal protocol. Many organizations will use a hybrid: e.g., using native function calls for simple built-in tools, MCP for more complex or third-party integrations, and possibly A2A for scaling out to multiple agents. The landscape is evolving quickly, but MCP has positioned itself at the interoperability sweet spot: low-level enough to handle concrete actions and data (like function calls do), but standardized and decoupled enough to promote a vibrant ecosystem of tool providers and AI consumers. Whether MCP in its current form remains the dominant method or influences newer standards, the end goal across all these approaches is similar – making LLMs more useful by safely connecting them to the wider world of software. Each method has its strengths, and understanding these trade-offs helps in designing an agent architecture that best fits one's use case.

Footnotes

  1. Anthropic (2024). Introducing the Model Context Protocol (MCP)

  2. Anthropic (2025). Model Context Protocol – Introduction & Specification

  3. LangChain Blog (2025). MCP: Flash in the Pan or Future Standard? (H. Chase & N. Campos debate)

  4. AWS Machine Learning Blog (2025). Harness the power of MCP servers with Amazon Bedrock Agents

  5. Visual Studio Docs (2025). Using MCP servers with Copilot (Preview)

  6. OpenAI (2025). OpenAI Agents SDK – MCP Integration Documentation

  7. LangChain MCP Adapters Changelog (2025) and Glama.ai MCP Server Directory.

  8. LinkedIn (2025). Best Practices for MCP Servers (G. Desai)

  9. SSOJet (2025). What are the best practices for MCP security?

  10. Medium (2025). A Quick Introduction to MCP in Python (A. Evensen)

  11. VentureBeat (2025). How MCP is becoming enterprise AI's universal language