Model Context Protocol (MCP)
In today’s fast-moving AI landscape, one of the biggest hurdles developers face is seamlessly connecting large language models (LLMs) to the data sources and tools they need to work effectively. That’s where the Model Context Protocol (MCP) comes in—a powerful open standard that’s redefining how applications deliver rich, structured context to LLMs.
📑 Table of Contents
- What is Model Context Protocol?
- The Journey to Model Context Protocol
- Why MCP Matters
- The Architecture of MCP
- Key Concepts in MCP
- MCP in Action
- Code Flow Explanation
- Results
- Conclusion
What is Model Context Protocol?
The Model Context Protocol (MCP) is an open protocol designed to standardize how applications provide context to large language models. Think of MCP as the "USB-C port for AI applications" - just as USB-C provides a standardized way to connect devices to various peripherals, MCP offers a standardized method for connecting AI models to different data sources and tools.
Developed by Anthropic, MCP addresses a critical need in the AI ecosystem: enabling LLMs to securely access the specific information and capabilities they need while maintaining data privacy and security.
The Journey to Model Context Protocol
1. Foundational LLMs: The Early Struggles
When large language models first arrived, they were groundbreaking in their ability to generate fluent, human-like text. But despite their impressive language skills, they often struggled with instruction following. These early models operated purely on statistical pattern matching, not true comprehension or task execution.
For instance, if you asked for a response in a specific structure, such as a bulleted list, the model might return a free-form paragraph instead, missing the structure entirely. Or, if prompted to transform a piece of text, the model might simply echo the input back without actually performing the task.
These examples reveal a deeper issue: early LLMs lacked context awareness and goal alignment. They were reactive text generators, not proactive assistants. This gap laid the groundwork for more structured, integrated approaches—leading us to innovations like the Model Context Protocol (MCP).
2. The Rise of Instruction-Tuned Models: SFT and RLHF
To bridge the gap between raw language generation and task-oriented behavior, researchers introduced Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
With SFT, models were trained on curated datasets where human-written responses were paired with specific prompts, teaching them how to follow instructions more precisely. RLHF took it a step further by incorporating human preferences—models were fine-tuned based on which outputs people found more helpful or appropriate.
This combination led to a major leap in usability: models could now follow instructions far more reliably, producing responses in the requested format and tone.
However, as models improved, expectations rose. Users began asking factual or dynamic queries like:
- “What’s the square root of 14.52321?”
- “What is India’s current GDP?”
These types of questions exposed the limitations of LLMs—they weren’t built to perform precise calculations or access real-time information. Despite better alignment with human intent, the models still lacked live data access, persistent memory, and grounded reasoning—paving the way for more robust systems like the Model Context Protocol.
3. The Introduction of Function Calling: Bridging Knowledge and Action
To overcome the static nature of language models, developers introduced function calling—a powerful mechanism that lets models invoke external tools to perform real-time tasks or fetch live data.
With function calling, models no longer had to rely solely on pre-trained knowledge. Instead, they could trigger external functions or APIs: for example, calling a calculator function for a precise square root, or querying a web service for a country's current GDP. A minimal sketch of the idea is shown below.
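Here is a minimal, provider-agnostic sketch of that pattern; the tool names, the JSON format of the call, and the dispatch logic are illustrative assumptions rather than any specific vendor's API:

import json
import math

def sqrt_tool(x: float) -> float:
    """Exact calculation the model cannot do reliably on its own."""
    return math.sqrt(x)

def gdp_tool(country: str) -> str:
    """Stand-in for a live data lookup (e.g., a call to an economics API)."""
    return f"<latest GDP figure for {country}, fetched from a live data source>"

TOOLS = {"sqrt": sqrt_tool, "current_gdp": gdp_tool}

def handle_model_output(model_output: str) -> str:
    """Dispatches a structured tool call emitted by the model."""
    call = json.loads(model_output)  # e.g. '{"name": "sqrt", "arguments": {"x": 14.52321}}'
    func = TOOLS[call["name"]]
    return str(func(**call["arguments"]))

# Instead of guessing, the model emits a tool call for the user's question:
print(handle_model_output('{"name": "sqrt", "arguments": {"x": 14.52321}}'))  # ≈ 3.81093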
This marked a huge step forward—models could now deliver accurate, timely, and computationally precise responses. But as user expectations grew, so did the complexity of their requests. People didn’t just want answers; they wanted actions.
Commands like:
- “Send this email to my manager.”
- “Post this message in our Telegram group.”
- “Book me a flight to New York tomorrow.”
…required models not only to understand the task, but also to securely interact with real-world services. This evolution set the stage for a new paradigm—agentic AI, where language models can coordinate tools, maintain context, and act autonomously. Enter: the Model Context Protocol (MCP).
4. The Rise of Agentic AI: From Answers to Real-World Actions
As users began expecting AI not just to inform but to act, a new paradigm emerged—Agentic AI. These systems could interface with external applications and APIs to perform tasks on a user’s behalf, effectively transforming language models into intelligent agents.
Now, instead of simply drafting a message, AI could:
- Schedule meetings by accessing calendars
- Send real-time updates via platforms like Telegram
- Execute transactions through payment gateways
Initially, these capabilities were built one integration at a time—custom connections for Gmail, Slack, Stripe, and other popular platforms. But this piecemeal approach soon revealed its limits:
❌ Fragmented Technology Stacks: Every application is built differently—some in Python, others in Go or JavaScript—requiring bespoke integration logic for each.
❌ Non-Standardized APIs: There’s no universal API language. Each service comes with its own structure, request format, and error handling, making interoperability a challenge.
❌ Authentication Overhead: Managing multiple security protocols like OAuth, API keys, or session tokens added significant complexity to integration efforts.
❌ Poor Scalability: Manually wiring thousands of services wasn’t sustainable. The lack of a unified framework led to maintenance headaches and slowed down innovation.
These challenges made it clear: Agentic AI needed a more standardized, extensible foundation to operate at scale—one that could handle context, memory, and secure tool usage seamlessly. This is exactly where the Model Context Protocol (MCP) steps in.
5. The Birth of Model Context Protocol (MCP)
To address the integration challenges of Agentic AI, the Model Context Protocol (MCP) was created. MCP offers a universal, standardized framework that enables AI models to seamlessly interact with any application, regardless of its underlying technology or architecture. It eliminates the need for time-consuming, custom integrations by providing a single protocol that connects AI to the entire digital ecosystem.
With MCP, developers no longer have to manually code interactions for each new service. Instead, they can define a standard method for models to communicate with various external systems, making it possible for AI to:
- Connect to databases, web services, and software applications using one protocol
- Dynamically invoke tools based on real-time context without relying on hardcoded connections
- Extend AI capabilities effortlessly across a wide range of platforms and ecosystems
Example Use Case with MCP:
Imagine a user asks:
User: "Schedule a Zoom meeting with my team tomorrow at 8 AM."
With MCP, the AI model would:
✅ Recognize the request and understand the context.
✅ Call a standardized scheduling function through an MCP-compliant API.
✅ Send the request to Zoom’s API for scheduling.
✅ Retrieve the meeting link and respond: "Your Zoom meeting is scheduled for tomorrow at 8 AM. Here is the link: [meeting_link]."
This workflow exemplifies how MCP transforms AI into not just a conversational agent, but an intelligent, autonomous entity capable of executing tasks across multiple domains, freeing businesses from the need to create bespoke integrations for every tool.
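Under the hood, MCP traffic is JSON-RPC 2.0, so the scheduling step above would travel to a Zoom-capable MCP server as a tools/call request. The sketch below shows roughly what that payload looks like, written as a Python dict; the tool name and arguments are hypothetical, since no real Zoom MCP server is defined in this article:

# Approximate shape of an MCP tools/call request (JSON-RPC 2.0).
# The tool name and arguments below are hypothetical examples.
tools_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "schedule_zoom_meeting",
        "arguments": {
            "topic": "Team sync",
            "start_time": "tomorrow 08:00",
            "attendees": ["team@example.com"],
        },
    },
}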
Why MCP Matters
MCP solves several key challenges in AI application development:
1️⃣ Integration Standardization: It provides a growing list of pre-built integrations that LLMs can directly plug into
2️⃣ Vendor Flexibility: It offers the freedom to switch between LLM providers without significant code changes
3️⃣ Security Focus: It implements best practices for securing data within your infrastructure
4️⃣ Reduced Development Time: It eliminates the need to build custom integrations for each new data source or tool
The Architecture of MCP
At its core, MCP follows a client-server architecture consisting of:
- MCP Hosts: Programs like Claude Desktop, IDEs, or AI tools that want to access data through MCP
- MCP Clients: Protocol clients that maintain 1:1 connections with servers
- MCP Servers: Lightweight programs that expose specific capabilities through the standardized protocol
- Local Data Sources: Computer files, databases, and services that MCP servers can securely access
- Remote Services: External systems available over the internet that MCP servers can connect to
Key Concepts in MCP
Resources
MCP servers expose data and content through resources, allowing LLMs to access specific information when needed. These resources can include files, databases, APIs, and more.
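For illustration only (this server and these URIs are not part of the article's PowerPoint example), a FastMCP server might expose resources roughly like this:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Docs Server")

# A fixed resource: the client can read this file's contents as context
@mcp.resource("file:///project/README.md")
def readme() -> str:
    """Returns the project README so the model can use it as context."""
    with open("/project/README.md", "r", encoding="utf-8") as f:
        return f.read()

# A templated resource: the URI parameter is passed to the function
@mcp.resource("employees://{employee_id}/profile")
def employee_profile(employee_id: str) -> str:
    """Returns a (hypothetical) profile record for the given employee id."""
    return f"Profile data for employee {employee_id}"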
Tools
One of MCP's most powerful features is the ability to define tools that LLMs can use to perform actions through your server. This enables AI assistants to not just access information but also manipulate it or create new content.
For example, in the code samples examined, we can see tools defined for working with PowerPoint:
@mcp.tool()
async def launch_powerpoint() -> str:
    """Launches PowerPoint and creates a new blank presentation"""
    global prs
    # Implementation details...

@mcp.tool()
async def set_slide_background_color(color: str) -> str:
    """Sets the background color of the current slide"""
    global prs
    # Implementation details...
Prompts
MCP allows the creation of reusable prompt templates and workflows, making it easier to build consistent AI experiences across different applications.
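As a rough sketch (the prompt name and wording below are made up for illustration), FastMCP lets a server register such a template with a decorator:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Prompt Server")

# A reusable prompt template the client can list and fill in
@mcp.prompt()
def summarize_document(document: str) -> str:
    """Prompt template for summarizing a document into bullet points."""
    return (
        "Summarize the following document as five concise bullet points:\n\n"
        f"{document}"
    )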
Sampling
MCP servers can request completions from LLMs, enabling bidirectional communication between the model and the server.
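A minimal sketch of the idea, assuming the Python SDK's sampling support; the tool name is hypothetical and the exact SDK surface may differ slightly from what is shown here:

from mcp.server.fastmcp import FastMCP, Context
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("Sampling Demo")

@mcp.tool()
async def draft_summary(text: str, ctx: Context) -> str:
    """Asks the connected client's LLM to summarize text on the server's behalf."""
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(type="text", text=f"Summarize briefly: {text}"),
            )
        ],
        max_tokens=200,
    )
    # The client runs the completion and returns it to the server
    return result.content.text if result.content.type == "text" else ""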
MCP in Action
Automating PowerPoint Presentation with MCP
Let’s look at a practical example of how MCP can be used to automate tasks—specifically, creating a PowerPoint presentation by connecting a language model to PowerPoint’s functionality through MCP.
Using Python and an MCP-compliant workflow, the AI agent can perform the following steps:
- Launch the PowerPoint application
- Create a new presentation with a slide
- Add a title to the slide
- Customize slide appearance, such as setting background color
- Save the final presentation file
This demonstrates how, with MCP, a language model is no longer limited to generating text—it can orchestrate tools, manipulate files, and automate end-to-end workflows in real applications.
Step 1: Building an MCP Server
First, we need to create an MCP server that exposes PowerPoint capabilities as tools. Here's how it's implemented:
import os

from mcp.server.fastmcp import FastMCP
from pptx import Presentation
from pptx.util import Pt
from pptx.dml.color import RGBColor
from pptx.enum.text import PP_ALIGN

# Create a FastMCP server instance
mcp = FastMCP("PowerPoint Controller")

# Global presentation object
prs = None


@mcp.tool()
async def launch_powerpoint() -> str:
    """Launches PowerPoint and creates a new blank presentation"""
    global prs
    # Create a new presentation
    prs = Presentation()
    return "PowerPoint presentation created successfully"


@mcp.tool()
async def create_new_slide() -> str:
    """Creates a new blank slide in PowerPoint"""
    global prs
    if prs is None:
        prs = Presentation()
    slide_layout = prs.slide_layouts[1]  # Title and Content layout
    prs.slides.add_slide(slide_layout)
    return "New slide created successfully"


@mcp.tool()
async def set_slide_background_color(color: str) -> str:
    """Sets the background color of the current slide"""
    global prs
    if prs is None or len(prs.slides) == 0:
        return "Error: No slides to modify"
    # Map color names to RGB values
    color_map = {
        "blue": (0, 114, 198),
        "light blue": (173, 216, 230),
        # Other colors...
    }
    rgb = color_map.get(color.lower(), (0, 114, 198))
    slide = prs.slides[-1]
    background = slide.background
    fill = background.fill
    fill.solid()
    fill.fore_color.rgb = RGBColor(*rgb)
    return f"Slide background color set to {color}"


@mcp.tool()
async def set_slide_title(title: str, subtitle: str = "") -> str:
    """Sets the title and optional subtitle on the current slide"""
    global prs
    if prs is None or len(prs.slides) == 0:
        return "Error: No slides to modify"
    # Get the current slide (the last one)
    slide = prs.slides[-1]
    # Set the title
    if slide.shapes.title:
        title_shape = slide.shapes.title
        title_shape.text = title
        # Format title
        for paragraph in title_shape.text_frame.paragraphs:
            paragraph.alignment = PP_ALIGN.CENTER
            for run in paragraph.runs:
                run.font.size = Pt(44)
                run.font.bold = True
    # Set the subtitle if available
    if hasattr(slide, 'placeholders'):
        for shape in slide.placeholders:
            if shape.placeholder_format.type == 2:  # body/content placeholder
                shape.text = subtitle
                # Format subtitle
                for paragraph in shape.text_frame.paragraphs:
                    paragraph.alignment = PP_ALIGN.CENTER
                    for run in paragraph.runs:
                        run.font.size = Pt(28)
    return f"Slide title set to {title}"


@mcp.tool()
async def save_presentation(filename: str) -> str:
    """Saves the presentation with the specified filename"""
    global prs
    if prs is None:
        return "Error: No presentation to save"
    # Determine the Documents folder path
    home_dir = os.path.expanduser("~")
    documents_dir = os.path.join(home_dir, "Documents")
    save_path = os.path.join(documents_dir, f"{filename}.pptx")
    # Save the presentation
    prs.save(save_path)
    return f"Presentation saved as {filename}.pptx"


if __name__ == "__main__":
    # Run the server over stdio so an MCP client can launch it as a subprocess
    mcp.run(transport="stdio")
The server exposes several PowerPoint operations as tools that can be called by an LLM:
- launch_powerpoint(): Creates a new blank presentation
- create_new_slide(): Adds a new slide to the presentation
- set_slide_background_color(color): Changes the background color of the current slide
- set_slide_title(title, subtitle): Sets the title and subtitle of the current slide
- save_presentation(filename): Saves the presentation with the specified filename
Step 2: Creating an MCP Client
Next, we need an MCP client that connects to our server and manages the interaction with an LLM:
import asyncio
import os

from google import genai
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Gemini API key (assumed here to come from an environment variable)
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

# Initialize Gemini client
client = genai.Client(api_key=GEMINI_API_KEY)


async def main():
    # Connect to the MCP server
    server_params = StdioServerParameters(
        command="python",
        args=["server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize session
            await session.initialize()

            # Get available tools
            tools_result = await session.list_tools()
            tools = tools_result.tools

            # Create tool descriptions for the LLM
            tools_description = []
            for i, tool in enumerate(tools):
                params = tool.inputSchema
                desc = getattr(tool, 'description', 'No description available')
                name = getattr(tool, 'name', f'tool_{i}')

                # Format the input schema in a readable way
                if 'properties' in params:
                    param_details = []
                    for param_name, param_info in params['properties'].items():
                        param_type = param_info.get('type', 'unknown')
                        param_details.append(f"{param_name}: {param_type}")
                    params_str = ', '.join(param_details)
                else:
                    params_str = 'no parameters'

                tool_desc = f"{i+1}. {name}({params_str}) - {desc}"
                tools_description.append(tool_desc)

            # Join all tool descriptions
            tools_description = "\n".join(tools_description)
Step 3: Defining the Workflow and LLM Interaction
Once we have our server and client set up, we define the workflow and create a system prompt for the LLM:
# Create a system prompt that guides the LLM to use PowerPoint tools
system_prompt = f"""You are a PowerPoint automation agent. You have access to various PowerPoint tools.
Available tools:
{tools_description}
You must respond with EXACTLY ONE line in one of these formats (no additional text):
1. For function calls:
FUNCTION_CALL: function_name|param1|param2|...
2. For final answers:
FINAL_ANSWER: [message]
Important:
- Follow the exact sequence of steps to create a PowerPoint presentation
- Only give FINAL_ANSWER when you have completed all necessary steps
- Do not repeat function calls with the same parameters
- Use appropriate parameters for each function
"""
# Define the task for the LLM
query = """Create a PowerPoint presentation with the following steps:
1) Launch PowerPoint and open a new blank presentation.
2) Create a new slide.
3) Set the title of the slide to "MCP (Model Context Protocol)".
4) Change the background color of the slide to light blue.
5) Save the presentation as "AI_Presentation"."""
Step 4: Implementing the Iterative Process
We implement an iterative loop where the LLM provides function calls and we execute them:
# Track which functions have been called to prevent loops
called_functions = set()
while iteration < max_iterations:
# Get model's response with timeout
prompt = f"{system_prompt}\n\nQuery: {current_query}"
response = await generate_with_timeout(client, prompt)
response_text = response.text.strip()
if response_text.startswith("FUNCTION_CALL:"):
_, function_info = response_text.split(":", 1)
parts = [p.strip() for p in function_info.split("|")]
func_name, params = parts[0], parts[1:]
# Find the matching tool to get its input schema
tool = next((t for t in tools if t.name == func_name), None)
# Prepare arguments according to the tool's input schema
arguments = {}
schema_properties = tool.inputSchema.get('properties', {})
for param_name, param_info in schema_properties.items():
if not params: # Check if we have enough parameters
raise ValueError(f"Not enough parameters provided for {func_name}")
value = params.pop(0) # Get and remove the first parameter
param_type = param_info.get('type', 'string')
# Convert the value to the correct type based on the schema
if param_type == 'integer':
arguments[param_name] = int(value)
elif param_type == 'number':
arguments[param_name] = float(value)
elif param_type == 'array':
# Handle array input
if isinstance(value, str):
value = value.strip('[]').split(',')
arguments[param_name] = [int(x.strip()) for x in value]
else:
arguments[param_name] = str(value)
# Call the tool with the prepared arguments
result = await session.call_tool(func_name, arguments=arguments)
# Add the function to the set of called functions
called_functions.add(func_name)
# Record the result for context in the next iteration
iteration_response.append(
f"In the {iteration + 1} iteration you called {func_name} with {arguments} parameters, "
f"and the function returned {result_str}."
)
elif response_text.startswith("FINAL_ANSWER:"):
print("\n=== PowerPoint Execution Completed ===")
break
iteration += 1
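The iterative loop above sits inside the main() coroutine from Step 2, so, assuming that layout, the client script only needs a standard asyncio entry point to run end to end:

if __name__ == "__main__":
    asyncio.run(main())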
Code Flow Explanation
Let's break down the complete MCP workflow:
- Server Initialization: The MCP server is created using FastMCP and exposes PowerPoint operations as tools
- Client Connection: The MCP client establishes a connection to the server using stdio transport
- Tool Discovery: The client retrieves the list of available tools from the server
- LLM Interaction Setup: The system prompt is constructed with tool descriptions to guide the LLM
- Iterative Execution:
  - The LLM is prompted with the task and available tools
  - The LLM returns a function call in the format FUNCTION_CALL: function_name|param1|param2|...
  - The client parses the function call and parameters
  - The client converts parameters to the appropriate types based on the tool's input schema
  - The client calls the tool on the server with the prepared arguments
  - The server executes the requested operation and returns the result
  - The result is stored for context in subsequent iterations
- Completion: The process continues until all required steps are completed or the LLM returns a final answer
Results
When executed, this code creates a PowerPoint presentation with:
- A new slide with the title "MCP (Model Context Protocol)"
- A light blue background
- The presentation saved as "AI_Presentation.pptx"
The output shows the successful execution of each step:
--- Iteration 1 ---
Preparing to generate LLM response...
Starting LLM generation...
LLM generation completed
LLM Response: FUNCTION_CALL: launch_powerpoint
Final iteration result: ['PowerPoint presentation created successfully']
--- Iteration 2 ---
Preparing to generate LLM response...
Starting LLM generation...
LLM generation completed
LLM Response: FUNCTION_CALL: create_new_slide
Final iteration result: ['New slide created successfully']
--- Iteration 3 ---
Preparing to generate LLM response...
Starting LLM generation...
LLM generation completed
LLM Response: FUNCTION_CALL: set_slide_title|MCP (Model Context Protocol)|
Final iteration result: ['Slide title set to MCP (Model Context Protocol)']
--- Iteration 4 ---
Preparing to generate LLM response...
Starting LLM generation...
LLM generation completed
LLM Response: FUNCTION_CALL: set_slide_background_color|light blue
Final iteration result: ['Slide background color set to light blue']
--- Iteration 5 ---
Preparing to generate LLM response...
Starting LLM generation...
LLM generation completed
LLM Response: FUNCTION_CALL: save_presentation|AI_Presentation
Final iteration result: ['Presentation saved as AI_Presentation.pptx']
All steps completed successfully!
Conclusion
This example demonstrates the power of MCP in enabling seamless interaction between LLMs and external applications. By standardizing how LLMs communicate with tools and data sources, MCP makes it easier to build complex AI applications that can perform real-world tasks.
The key advantages demonstrated in this implementation include:
- Clear Separation of Concerns: The server handles PowerPoint operations while the client manages LLM interaction
- Standardized Communication: The protocol ensures consistent interaction between components
- Type Safety: Parameters are automatically converted to the correct types based on the tool's schema
- Flexibility: The same server could be used with different LLMs or clients
As AI becomes increasingly integrated into our digital infrastructure, protocols like MCP will be essential in creating secure, flexible, and powerful AI applications that can work with a wide range of tools and data sources.