Internet-Draft | Applicability of MCP for the Network Man | June 2025 |
Yang, et al. | Expires 25 December 2025 | [Page] |
The application of MCP in the network management field aims to enable rich AI-driven network applications and to realize intent-based network management automation in multi-vendor heterogeneous network environments. This document discusses the applicability of MCP to network management in IP networks that utilize IETF technologies. It explores operational aspects, key components, a generic workflow, and deployment scenarios. The impact of integrating MCP into the network management system is also discussed.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 25 December 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Model Context Protocol (MCP) provides a standardized way for LLMs to access and use information from different sources and to interact with tools, making it easier to build AI applications that interact with external LLMs and network management systems.¶
MCP has seen rapid adoption in the consumer field. The application of MCP in the network management field aims to enable rich AI-driven network applications and to realize intent-based network management automation in multi-vendor heterogeneous network environments. By establishing standard interfaces for tool encapsulation, intent translation, and closed-loop execution within the network management system, MCP provides AI agents with:¶
Unified operation abstraction through normalized MCP tool definitions¶
Seamless LLM integration via the structured protocol¶
Automated execution capability¶
This document discusses the applicability of MCP to the network management plane in IP networks that utilize IETF technologies. It explores operational aspects, key components, a generic workflow, and deployment scenarios. The impact of integrating MCP into the network management system is also discussed.¶
The following terms are used throughout this document:¶
MCP Protocol: MCP is an open standard designed to facilitate communication between LLMs and external data sources or tools.¶
MCP Host: The entity initiating the LLM request.¶
MCP Client: A built-in module within a host, specifically designed for interaction with the MCP server.¶
MCP Server: A dedicated server that interacts with MCP clients and provides tools.¶
CLI: Command Line Interface¶
In a large-scale network management environment, a large number of devices from different vendors must be managed uniformly, which leads to the following challenges:¶
Different vendors implement different YANG models (standard or proprietary), leading to:¶
Lack of uniform data structures for configuration/state retrieval.¶
Requirement for vendor-specific adaptations in automation scripts.¶
In addition, IETF standard device models have seen slow adoption. Similar device models are defined by OpenConfig and other SDOs, so the current YANG device model ecosystem is fragmented.¶
Some vendors only partially support standard network management protocols, and proprietary extensions may break interoperability. Other vendors might choose non-standard network management or telemetry protocols such as gNMI [I-D.openconfig-rtgwg-gnmi-spec] or gRPC [I-D.kumar-rtgwg-grpc-protocol]. A significant number of network operators continue to rely on legacy network management mechanisms such as SNMP.¶
Today, network APIs are widely adopted on the northbound interface of OSS/BSS systems and network orchestrators, while YANG data models are widely adopted on the northbound interface of the network controller and on the interface between the network controller and the network devices. However, the network API ecosystem and the YANG model ecosystem are built as silos and lack integration or mapping between them.¶
An LLM with MCP support can comprehend diverse, complex requirements and deliver the corresponding functionality, which makes it well suited to large-scale multi-vendor network management environments and to the challenges described in Section 3. This document therefore introduces MCP into network management environments as a basis for building an intelligent network management and control platform.¶
Objective: Allow AI models (such as Claude) to understand natural language commands and trigger operations.¶
Workflow:¶
Intent Recognition: The LLM first analyzes the user's natural language query to identify:¶
Tool Discovery and Toolchain Generation: The LLM accesses the tool descriptions provided by MCP servers and matches the identified intent with the available tools.¶
Parameter Extraction and Mapping: The LLM maps natural language references to structured parameter names and extracts relevant information from the user query.¶
Structured Invocation Generation: The LLM generates properly formatted tool calls following MCP's protocol.¶
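The workflow steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of MCP itself: the keyword-based `select_tool` merely stands in for the LLM's intent matching, the sample descriptor (including its `required` list) is invented for the example, and both function names are hypothetical.

```python
"""Minimal sketch: from tool descriptors to a structured tool call.

Assumption: the keyword match below is a stand-in for the LLM's intent
recognition; a real deployment delegates that step to the LLM.
"""

# Illustrative descriptor; the "required" list is an assumption.
TOOLS = [
    {
        "name": "check_device_status",
        "description": "Check the status of network devices",
        "parameters": {
            "type": "object",
            "properties": {
                "device_ip": {"type": "string"},
                "metrics": {"type": "array"},
            },
            "required": ["device_ip"],
        },
    },
]


def select_tool(query, tools):
    """Tool discovery: first tool whose description shares a word with the query."""
    words = set(query.lower().split())
    for tool in tools:
        if words & set(tool["description"].lower().split()):
            return tool
    return None


def build_invocation(tool, extracted):
    """Parameter mapping and structured invocation generation."""
    schema = tool["parameters"]
    missing = [p for p in schema.get("required", []) if p not in extracted]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    # Keep only the parameters that the tool actually declares.
    params = {k: v for k, v in extracted.items() if k in schema["properties"]}
    return {"method": tool["name"], "params": params}


tool = select_tool("check cpu status of 192.0.2.1", TOOLS)
call = build_invocation(tool, {"device_ip": "192.0.2.1", "metrics": ["cpu"]})
```

Checking required parameters before emitting the call lets the client reject an incomplete invocation early instead of forwarding it to the server.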
Benefits:¶
Bridges natural language to tool invocation requests in a fixed format and returns each request to the client, so that the client can parse it reliably.¶
## Closed Loop Management
Objective: Realize the closed loop of "voice/text commands → automatic execution".¶
A general workflow is as follows:¶
User Input Submission: An operator submits a natural language request to the MCP client, and the MCP client forwards this request to the LLM.¶
LLM Intent Processing: The LLM parses the input, identifies the operational intent, and returns a structured request to the MCP client, which queries the MCP server to retrieve the available tools. The retrieved information includes each tool's functional description and required parameters.¶
LLM Toolchain Decision:¶
Tool Execution: The MCP Server executes the translated commands on target devices and returns results to the client.¶
Result Aggregation & Feedback: The MCP Client collates tool outputs (success/failure logs) and forwards them to the LLM for summarization.¶
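The general workflow above can be walked through with stand-in components. In the sketch below, `StubLLM` and `StubServer` are hypothetical placeholders that return canned values; they only show the order of the exchanges, not a real LLM or MCP server API.

```python
"""Closed-loop sketch with stand-in components (all names hypothetical)."""


class StubServer:
    """Plays the MCP server: lists tools and executes them."""

    def list_tools(self):
        return [{"name": "check_device_status",
                 "description": "Check the status of network devices"}]

    def call_tool(self, name, params):
        if name != "check_device_status":
            return {"ok": False, "error": f"unknown tool {name}"}
        # Canned value; a real server would query the device.
        return {"ok": True, "device": params["device_ip"], "cpu": "12%"}


class StubLLM:
    """Plays the LLM: picks a tool, then summarizes the results."""

    def decide(self, request, tools):
        # A real LLM would reason over the request and the tool descriptions.
        return {"method": tools[0]["name"], "params": {"device_ip": "192.0.2.1"}}

    def summarize(self, results):
        ok = sum(1 for r in results if r["ok"])
        return f"{ok} of {len(results)} tool calls succeeded"


def closed_loop(request, llm, server):
    """Client-side loop: discover, decide, execute, aggregate."""
    tools = server.list_tools()                 # tool discovery
    call = llm.decide(request, tools)           # toolchain decision
    results = [server.call_tool(call["method"], call["params"])]  # execution
    return llm.summarize(results)               # aggregation and feedback


summary = closed_loop("Check CPU on 192.0.2.1", StubLLM(), StubServer())
```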
Taking multi-vendor network management as an example, the MCP server is deployed locally on the network controller, and the tools are integrated into the MCP server. The server provides the following registered tool descriptor information:¶
Tool description: describes the name, purpose, and parameters of each tool.¶
Tool implementation: describes how each tool is invoked.¶
An example of tool descriptor information follows:¶
# Tool Descriptor
[
  {
    "name": "batch_configure_devices",
    "description": "Batch Configure Network Devices",
    "parameters": {
      "type": "object",
      "properties": {
        "device_ips": {"type": "array", "items": {"type": "string"}, "description": "Device IP List"},
        "commands": {"type": "array", "items": {"type": "string"}, "description": "CLI Sequence"},
        "credential_id": {"type": "string", "description": "Credential ID"}
      },
      "required": ["device_ips", "commands"]
    }
  },
  {
    "name": "check_device_status",
    "description": "Check the Status of Network Devices",
    "parameters": {
      "type": "object",
      "properties": {
        "device_ip": {"type": "string"},
        "metrics": {"type": "array", "items": {"enum": ["cpu", "memory", "interface"]}}
      }
    }
  }
]

# Tool Implementation
from fastapi import FastAPI
from netmiko import ConnectHandler
from mcp_server import McpServer

app = FastAPI()
server = McpServer(app)

# Connection Pool Management
devices = {
    "192.168.1.1": {"device_type": "VendorA-XYZ", "credential": "admin:XYZ@password"},
    "192.168.1.2": {"device_type": "VendorB-ABC", "credential": "admin:ABC@password"},
    # ....
}

@server.tool("batch_configure_devices")
async def batch_config(device_ips: list, commands: list, credential_id: str = None):
    # credential_id is optional, matching the "required" list in the descriptor
    results = {}
    for ip in device_ips:
        # Credentials are stored as "username:password"
        username, password = devices[ip]["credential"].split(":", 1)
        conn = ConnectHandler(
            ip=ip,
            username=username,
            password=password,
            device_type=devices[ip]["device_type"],
        )
        output = conn.send_config_set(commands)
        conn.disconnect()
        results[ip] = output
    return {"success": True, "details": results}

@server.tool("check_device_status")
async def check_status(device_ip: str, metrics: list):
    # get_cpu_usage and get_memory_usage are helper functions not shown here
    status = {}
    if "cpu" in metrics:
        status["cpu"] = get_cpu_usage(device_ip)
    if "memory" in metrics:
        status["memory"] = get_memory_usage(device_ip)
    return status
Suppose a user submits a request (via the client) such as "Configure OSPF Area 0 with process ID 100 for all core switches in the Beijing data center". The MCP client retrieves the necessary tool descriptor information from the MCP server and forwards it along with the request to the LLM. The LLM determines the appropriate tools and responds in JSON format as follows:¶
{
  "method": "batch_configure_devices",
  "params": {
    "device_ips": ["192.168.10.1", ...., "192.168.10.10"],
    "commands": [
      "router ospf 100",
      "network 192.168.0.0 0.0.255.255 area 0"
    ]
  }
}
The MCP server executes the network management operation described by the JSON request and returns the results to the MCP client, which forwards them to the LLM. The LLM parses the response, generates a natural-language summary, and sends it back to the client for final presentation to the user. An example of the natural-language summary follows:¶
{
  "192.168.10.1": "Configure Successfully, take 2.3 seconds",
  "192.168.10.2": "Error: no response from the device"
}
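Per-device results of this kind can be condensed before they are shown to the operator. The helper below is a hypothetical sketch; it assumes that a result string beginning with "Error" marks a failure, which matches the example above but is not a rule defined by MCP.

```python
def summarize_results(results):
    """Split per-device results into successes and failures.

    Assumption: a result string starting with "Error" marks a failure.
    """
    failed = sorted(ip for ip, msg in results.items() if msg.startswith("Error"))
    succeeded = sorted(ip for ip in results if ip not in failed)
    text = f"{len(succeeded)} device(s) configured, {len(failed)} failed"
    if failed:
        text += ": " + ", ".join(failed)
    return {"succeeded": succeeded, "failed": failed, "text": text}


report = summarize_results({
    "192.168.10.1": "Configure Successfully, take 2.3 seconds",
    "192.168.10.2": "Error: no response from the device",
})
```

Keeping the list of failed devices separate from the summary text gives the client something concrete to retry or escalate.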
This section describes MCP deployment requirements for network management environments, followed by implementation scenarios. Key architectural requirements include:¶
Function-Specific MCP Servers: To maintain proper architecture and performance with growing tool volumes, servers should be categorized by network management function. Typical categories include network log analysis, device configuration management, energy consumption management, and security operations.¶
Secure and Scalable Architecture: The architecture must:¶
Automated Workflows: MCP implementations should support LLM-coordinated automation of:¶
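One way to read the function-specific server requirement is as a routing table from tool names to servers. The sketch below is illustrative only: the category keys follow the categories named above, while the URLs and most tool names are invented for the example.

```python
"""Sketch: route tool calls to function-specific MCP servers.

Assumption: server URLs and tool names are invented for illustration.
"""

SERVERS = {
    "log_analysis": {
        "url": "https://mcp-logs.example.net",
        "tools": {"search_logs"},
    },
    "device_configuration": {
        "url": "https://mcp-config.example.net",
        "tools": {"batch_configure_devices", "check_device_status"},
    },
    "security_operations": {
        "url": "https://mcp-secops.example.net",
        "tools": {"list_acl_violations"},
    },
}


def route(tool_name):
    """Return (category, url) of the server that registers `tool_name`."""
    for category, server in SERVERS.items():
        if tool_name in server["tools"]:
            return category, server["url"]
    raise KeyError(f"no MCP server registers tool {tool_name!r}")


category, url = route("batch_configure_devices")
```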
While these core requirements apply universally, operational characteristics vary based on deployment location (on-premises vs. remote). The following subsections detail these deployment scenarios.¶
                  +--------------+
                  |     User     |
                  +-------+------+
                          |
              Natural Language Command
         .................|............................
         .                |                           .
         .        +-------+------+       +-----------+.
         .        |  MCP Client  +-------+    LLM    |.
         .        +-------+------+       +-----------+.
         .                |                           .
         .          Tools Request                     .
         .                |                           .
         .        +-------+------+                    .
         .        |  MCP Server  |      Network       .
         .        +-------+------+      Controller    .
         .                |                           .
         .................|............................
                          |
                  NETCONF/Telemetry
      +-------------------+------------------+
      |                   |                  |
      |                   |                  |
+-----+--------+  +-------+------+    +------+-------+
|   Network    |  |   Network    |    |   Network    |
|   Device     |  |   Device     |    |   Device     |
+--------------+  +--------------+    +--------------+
                   +--------------+
                   |     User     |
                   +-------+------+
                           |
               Natural Language Command
.........................................................
.                          |                            .
.                  +-------+------+       +-----------+ .
. Network          |  MCP Client  +-------+    LLM    | .
. Controller       +-------+------+       +-----------+ .
.                          |                            .
.                    Tools Request                      .
.........................................................
.                  +-------+------+                     .
.                  |  MCP Server  |                     .
.                  +-------+------+                     .
.Network                  CLI                           .
.Device                    |                            .
.     +--------------------+--------------------+       .
.     |                    |                    |       .
.+----+-------+    +-------+------+     +-------+------+.
.|  Network   |    |   Network    |     |   Network    |.
.|  Device    |    |   Device     |     |   Device     |.
.+------------+    +--------------+     +--------------+.
.                                                       .
.........................................................
+------------+-----------------------------+-----------------------+
|            | MCP Hosted Within           | MCP Server Hosted     |
|            | the Network Controller      | Within Network Device |
+------------+-----------------------------+-----------------------+
|            |                             | 1. Protocol for       |
|            |                             |    context management |
|            |                             | 2. Including approval |
| Management | No impact; reuse            |    mechanisms where   |
| Protocol   | existing NM protocols       |    human input is     |
|            |                             |    required.          |
|            |                             | 3. Coexist with NM    |
|            |                             |    protocols in case  |
|            |                             |    not all devices    |
|            |                             |    support MCP        |
+------------+-----------------------------+-----------------------+
| Management | Use internal tools and      | Need to ensure right  |
| Tools      | LLMs within the controller  | tools and background  |
|            | for managing context and    | info in the network   |
|            | decision making             | device                |
+------------+-----------------------------+-----------------------+
| Task       | Works with pre-structured   |                       |
| Management | goal-driven tasks.          | Same rule applies     |
|            | Tasks are usually designed  |                       |
|            | and pre-defined by client   |                       |
+------------+-----------------------------+-----------------------+
|            | Yes.                        | Yes                   |
| Stateful   | Agents can retain context   |                       |
| Management | from previous interactions, | Same rule applies     |
|            | enabling continuity in      |                       |
|            | long-term tasks or          |                       |
|            | conversations               |                       |
+------------+-----------------------------+-----------------------+
Pro¶
Con¶
Single Point of Failure: Network controller failure creates a catastrophic impact where the entire AI-driven network management capability is lost across all devices, leaving operators without intelligent automation during critical situations. While backup and failover mechanisms can be implemented, they introduce additional architectural complexity and may not guarantee seamless transitions, often resulting in management gaps during switchover periods.¶
Potential Bottleneck: High request volumes could overwhelm the centralized server during peak operations or network events, where concurrent multi-device operations may queue up and cause delays in critical network changes. Resource contention between different network management tasks affects overall system responsiveness, while limited ability to parallelize device-specific operations when controller CPU/memory becomes constrained further exacerbates performance degradation during high-demand scenarios.¶
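The queuing concern can be made concrete with a bounded-concurrency sketch. This is one possible mitigation, not something the MCP specification prescribes: `configure_device` is a hypothetical placeholder for a per-device operation, and the limit of 2 is arbitrary.

```python
import asyncio


async def configure_device(ip):
    """Placeholder for a per-device operation (hypothetical)."""
    await asyncio.sleep(0.01)   # stands in for the device round trip
    return f"{ip}: ok"


async def run_bounded(ips, limit):
    """Run per-device operations concurrently, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def worker(ip):
        async with sem:          # waits here while `limit` workers are busy
            return await configure_device(ip)

    # gather() returns results in the same order as the input list.
    return await asyncio.gather(*(worker(ip) for ip in ips))


results = asyncio.run(run_bounded([f"192.0.2.{i}" for i in range(1, 6)], limit=2))
```

A cap of this kind keeps controller CPU and memory use predictable, at the cost of longer completion times for large batches.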
Pro¶
Protocol architecture simplification: Deploying the MCP server directly on the network device makes it possible to skip the NETCONF protocol layer and manage the device directly through MCP. This reduces the complexity of protocol conversion and simplifies the overall architecture.¶
High Availability: Device failures are isolated; other devices remain manageable.¶
Reduced Controller Load: Distributes processing load across the network.¶
Con¶
Management Complexity: Operating hundreds or thousands of distributed MCP servers introduces significant operational overhead requiring sophisticated orchestration systems for maintenance, updates, and monitoring.¶
Resource Overhead: Each network device must allocate additional compute and memory resources to host the MCP server, potentially impacting primary networking functions and increasing per-device infrastructure costs.¶
This document has no IANA actions.¶
MCP deployments need to consider scenarios where either the client or the server encounters issues, such as crashes. If one or both parties go offline during communication, the other side may remain stuck waiting for messages indefinitely. Furthermore, certain tool operations may be interrupted partway through, which can leave irreversible network management operations in an incomplete state.¶
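A common guard against waiting forever is to bound every tool call with a timeout. The following sketch uses Python's standard `asyncio.wait_for`; the wrapper name `call_with_timeout` and the simulated `slow_tool` are hypothetical.

```python
import asyncio


async def call_with_timeout(tool_coro, timeout_s):
    """Await a tool call, but give up after `timeout_s` seconds."""
    try:
        result = await asyncio.wait_for(tool_coro, timeout_s)
        return {"status": "ok", "result": result}
    except asyncio.TimeoutError:
        # The outcome on the device is unknown at this point; the caller
        # should verify device state before retrying an irreversible change.
        return {"status": "timeout", "result": None}


async def slow_tool():
    """Simulates a peer that does not answer in time."""
    await asyncio.sleep(1.0)
    return "done"


outcome = asyncio.run(call_with_timeout(slow_tool(), timeout_s=0.05))
```

A timeout only tells the caller that no answer arrived; it does not say whether the operation took effect, so it should be paired with a state check.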
Due to network latency, some operations might not return in time, so from the user's perspective they may appear either unexecuted or failed. If the user then initiates another tool request to the server, problems may occur, for example when the repeated request overlaps with the operation that is still in progress.¶
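One way to make a repeated request harmless is to attach a request identifier and have the server replay the stored result for duplicates. The sketch below assumes such an identifier exists; the `IdempotentExecutor` class is hypothetical and only covers requests that have already completed, not ones still in flight.

```python
import uuid


class IdempotentExecutor:
    """Remember results by request ID so a retried request is not re-executed."""

    def __init__(self, execute):
        self._execute = execute
        self._done = {}

    def submit(self, request_id, params):
        if request_id in self._done:
            return self._done[request_id]   # duplicate: replay stored result
        result = self._execute(params)
        self._done[request_id] = result
        return result


calls = []   # records every real execution


def apply_config(params):
    calls.append(params)
    return f"applied {params['cmd']}"


executor = IdempotentExecutor(apply_config)
rid = str(uuid.uuid4())
first = executor.submit(rid, {"cmd": "router ospf 100"})
retry = executor.submit(rid, {"cmd": "router ospf 100"})
```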
For complex network management workflows, the LLM's tool invocation process may function correctly in general while still getting details wrong. Users should verify each LLM operation to prevent unintended hazardous actions.¶
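Verification can be partly automated by gating risky commands behind an explicit approval step. The sketch below is hypothetical throughout: the prefix list is illustrative rather than a complete risk policy, and `approve` stands in for whatever confirmation mechanism the operator uses.

```python
# Illustrative only; a real policy would be vendor- and site-specific.
RISKY_PREFIXES = ("no ", "reload", "shutdown", "erase", "delete")


def needs_approval(commands):
    """Return the commands that a human operator should confirm first."""
    return [c for c in commands if c.strip().lower().startswith(RISKY_PREFIXES)]


def guarded_execute(commands, execute, approve):
    """Run `execute` only if every risky command is approved by `approve`."""
    risky = needs_approval(commands)
    if risky and not approve(risky):
        return {"executed": False, "blocked": risky}
    return {"executed": True, "output": execute(commands)}


result = guarded_execute(
    ["router ospf 100", "no router ospf 1"],
    execute=lambda cmds: f"{len(cmds)} commands sent",
    approve=lambda risky: False,   # the operator declines
)
```

Blocking the whole batch when one command is declined avoids applying a partial configuration.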