TalkO11yToMe

Name: TalkO11yToMe
Availability: InStock
Author: Sebastian Vietz

Learning what it means to use MCP by trying to interact with Dynatrace MCP server.

GitHub

GitHub Stars

User Rating

Not Rated

Favorites

Views

Forks

Issues

README

TalkO11yToMe: Learning MCP Servers with Dynatrace Integration

Welcome to your MCP (Model Context Protocol) learning journey! This project will help you understand MCP servers, their use cases, and specifically how to integrate with Dynatrace tenants for powerful observability-driven AI workflows.

🎉 PROJECT STATUS: FULLY OPERATIONAL

All 6 Dynatrace tools are now production-ready with:

✅ Standardized Configuration: Shared dotenv-based config eliminates 200+ lines of duplicate code
✅ DQL Query Polling: Proper handling of async query execution (202 → poll → results)
✅ OAuth Authentication: Working across all tools with 5-minute bearer tokens
✅ Real Data Retrieval: Successfully finding production problems, logs, and metrics
✅ Clean Architecture: lib/ vs tools/ separation for maintainability

📊 Code Quality Improvements:

Before: 40+ lines of environment parsing per tool × 6 tools = 240+ lines
After: 6 lines using shared config × 6 tools = 36 lines
Eliminated: 204+ lines of duplicate code with enhanced validation

🚨 IMPORTANT: Grail vs Classic Environment Support

Your Dynatrace environment type determines which tools and configuration you need:

🆕 Grail Environment (Recommended - Modern Platform)

URL Pattern: https://xxx.apps.dynatrace.com
Authentication: OAuth Bearer tokens required
API Endpoints: /platform/classic/environment-api/v2/
Primary Tool: grail-log-query.js ✅ Production Ready
MCP Server: ✅ Ready to Build - Complete Design Available

🏛️ Classic Environment (Legacy Platform)

URL Pattern: https://xxx.live.dynatrace.com
Authentication: API tokens OR OAuth
API Endpoints: /api/v2/
Primary Tool: classic-log-query.js ✅ Production Ready
MCP Server: ✅ Ready to Build - Complete Design Available

📖 See Environment Detection Guide below for setup details.

📁 Project Structure

TalkO11yToMe/
├── lib/                    # 🔧 Shared Infrastructure
│   ├── config.js          #    → Standardized dotenv configuration
│   └── demo-dotenv.js     #    → Configuration demonstration
├── tools/                  # 🚀 Production Tools (6 tools)
│   ├── grail-log-query.js        #    → Primary tool for Grail environments
│   ├── grail-business-analytics.js #  → DQL and business events (Grail)
│   ├── classic-log-query.js      #    → Primary tool for Classic environments
│   ├── classic-api-client.js     #    → Comprehensive API client (both)
│   ├── dynatrace-oauth-tool.js   #    → Authentication testing
│   └── dynatrace-monitor.js      #    → Visual monitoring dashboard
├── tests/                  # 🧪 Comprehensive Test Suite
│   ├── test-suite.js      #    → Main test runner with 8 validation tests
│   ├── test-config.js     #    → Test configuration and scenarios
│   ├── test-history.js    #    → Test history tracking and analysis
│   ├── results/           #    → Organized timestamped test results
│   └── README.md          #    → Complete testing documentation
├── docs/                   # 📚 Documentation
│   ├── FIXES_SUMMARY.md          #    → Complete technical solution summary
│   ├── TOOLS_GUIDE.md            #    → Detailed tool usage and examples
│   ├── DYNATRACE_LOGS_SOLUTION.md #  → Environment setup guide
│   └── [additional guides...]
└── env/                    # 🔐 Environment Configuration
    └── .env.dev                  #    → Your Dynatrace credentials

📚 Documentation

Quick Reference

🚀 Tools Guide - Complete tool documentation and usage examples
🔧 Technical Solution - Detailed environment setup and troubleshooting
📊 Implementation Summary - Complete project transformation details
🎯 AI Integration Demo - Step-by-step AI analysis examples
⚡ Cursor IDE Integration - Practical workflows and AI-powered development patterns
🤖 MCP Server Design - Complete architecture and implementation plan with official TypeScript SDK

🤖 What is MCP?

The Model Context Protocol (MCP) is an open standard that enables AI systems (like Claude, Cursor, VS Code Copilot) to securely connect to the tools and data your business already uses. Think of it as a "universal adapter" for AI.

Key Benefits:

Standardized Integration: No more custom integrations for each AI tool
Secure Data Access: Controlled access to your systems
Real-time Context: AI gets up-to-date information from your production systems
Extensible: Easy to add new capabilities and data sources

🏗️ Understanding MCP Architecture

MCP consists of three main components:

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    Host     │◄───┤   Client    │◄───┤   Server    │
│ (AI Agent)  │    │(Translator) │    │ (Data/Tool) │
└─────────────┘    └─────────────┘    └─────────────┘

Host: Your AI application (Claude Desktop, VS Code, etc.)
Client: Handles communication and translates requests
Server: Provides specific functionality (Dynatrace data, file access, APIs, etc.)

MCP Features:

Tools: Functions that AI can call (query metrics, create alerts, etc.)
Resources: Data sources AI can read (logs, dashboards, configurations)
Prompts: Templates for common AI interactions

🔍 Dynatrace MCP Server

The official Dynatrace MCP server enables AI assistants to interact with your Dynatrace environment for real-time observability insights.

Capabilities:

Problem Management: List and analyze production problems
Security Issues: Access vulnerability and security problem details
DQL Queries: Execute Dynatrace Query Language for logs and events
Slack Integration: Send alerts via Slack connectors
Workflow Automation: Set up notification workflows
Entity Ownership: Get ownership information for services

Use Cases:

Real-time Debugging: Get production context while coding
Incident Response: AI-assisted troubleshooting with live data
Security Analysis: Correlate code vulnerabilities with production issues
Performance Optimization: Query metrics and logs in natural language

🚀 Setup Instructions

Prerequisites

Node.js 18+ (we use v20.19.2 LTS)
Access to Dynatrace tenant(s)
Dynatrace API token OR OAuth client credentials
AI client (Cursor IDE recommended)

Environment Configuration

Copy the environment template:

cp env/env.template env/.env.dev

Edit env/.env.dev with your credentials:

# Dynatrace Environment
DT_ENVIRONMENT=https://your-environment-id.live.dynatrace.com

# API Token (simpler setup)
API_TOKEN=dt0c01.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

# OAuth (optional - for official MCP server)
OAUTH_CLIENT_ID=dt0s02.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
OAUTH_CLIENT_SECRET=dt0s02.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.YYYYYYYYYYYYYYYYYYYYYYYYYYYY
OAUTH_RESOURCE_URN=urn:dynatrace:environment:your-environment-id

Test your configuration:

# Test shared configuration
node lib/demo-dotenv.js

# Verify tools work
node tools/dynatrace-monitor.js

# Run comprehensive test suite
node tests/test-suite.js

🔍 Environment Detection

Check your Dynatrace URL to determine your environment type:

Grail Environment Detection

https://xxx.apps.dynatrace.com  ← Contains ".apps."

→ Use: grail-log-query.js (OAuth required)

Classic Environment Detection

https://xxx.live.dynatrace.com  ← Contains ".live."

→ Use: classic-log-query.js (API token supported)

🎯 Quick Start Demo

For Grail Environments:

# Run primary tool for modern Dynatrace platforms
node tools/grail-log-query.js "error" now-2h

# Business analytics with DQL polling
node tools/grail-business-analytics.js query "fetch logs | limit 5"

# Visual monitoring dashboard
node tools/dynatrace-monitor.js

For Classic Environments:

# Run primary tool for legacy Dynatrace platforms
node tools/classic-log-query.js search "error" now-2h

# Comprehensive API client with auto-detection
node tools/classic-api-client.js problems 10

# Visual monitoring dashboard
node tools/dynatrace-monitor.js

🎯 Recent Testing Results:

5 log records retrieved with rich metadata (scanned 418,085 records in 16ms)
17 Lambda problems detected including high error rates and timeouts
OAuth tokens working with 5-minute bearer token lifecycle
DQL polling functional - proper 202 → poll → results workflow

📖 For detailed tool usage, see Tools Guide.

🎯 Real-World Use Cases

1. AI-Powered Incident Response

Use the AI Integration Demo to:

Analyze Lambda function errors automatically
Generate incident response priorities
Create security assessments
Design monitoring dashboards

2. Production Debugging Context

# Get context for coding session
node tools/dynatrace-monitor.js

# Search for specific errors with polling
node tools/grail-log-query.js "timeout" now-1h

3. Daily Health Monitoring

# Morning standup dashboard
node tools/dynatrace-monitor.js

# Lambda function health check
node tools/classic-api-client.js lambda-problems now-24h

📊 Architecture Improvements

Shared Configuration (`lib/config.js`)

Before: Each tool had 40+ lines of duplicate environment parsing
After: Single shared config with dotenv standardization

// All tools now use this simple pattern:
const config = require('../lib/config');
const dt = await config.getDynatraceConfig();
// 6 lines replace 40+ lines of custom parsing

Benefits Achieved:

✅ 204+ lines eliminated - Massive code reduction
✅ Standardized error handling - Consistent across all tools
✅ Environment validation - Automatic Grail vs Classic detection
✅ Enhanced security - Proper credential masking in logs
✅ Better maintainability - Single source of configuration truth

📚 Resources

Official MCP Resources:

Dynatrace MCP Integration:

Official Dynatrace MCP Server (Classic environments only)
Dynatrace API Documentation
Dynatrace DQL Documentation

Project Documentation:

Complete Tools Guide - All tools with examples and best practices
Technical Implementation - Environment setup, troubleshooting, and architecture details
AI Integration Examples - Step-by-step AI analysis workflows
Cursor IDE Integration - Development workflows and practical AI patterns

🎉 Ready to get started? Check out the Tools Guide for comprehensive usage examples and best practices!

🧪 Test Suite

We've included a comprehensive test suite to validate all tools and infrastructure:

Quick Test Run

# Run all tests (takes ~8 seconds)
node tests/test-suite.js

What Gets Tested

✅ File Structure: All 9 expected files present
✅ Shared Configuration: Dotenv integration working
✅ Authentication: OAuth token generation (5-minute lifecycle)
✅ All 6 Tools: Grail, Classic, and Universal tools
✅ Real Data Access: Production problems and logs
✅ DQL Polling: Async query execution (202 → poll → results)

Expected Results

📊 TEST RESULTS SUMMARY
Total Tests: 8
✅ Passed: 8
❌ Failed: 0
📈 Success Rate: 100.0%

📋 See Test Suite Documentation for detailed usage and troubleshooting.

🚀 Project Roadmap

🎯 Current Status: Production Ready

All tools operational with comprehensive testing and organized structure. Ready for next phase enhancements.

📋 Upcoming Priorities

🌐 HTTP Improvements: Migrate from fetch to axios for better error handling and request interceptors
🤖 MCP Server Development: Build custom MCP server exposing tools as AI-accessible capabilities
⚡ Performance Optimization: Parallel testing, performance benchmarking, load testing

📚 Planning Documents

Complete Project Roadmap - Comprehensive feature roadmap and implementation timeline
Axios Migration Plan - Detailed plan for HTTP improvements
MCP Server Design - Complete architecture and implementation with official TypeScript SDK

💡 Ideas & Contributions

The roadmap includes advanced features like:

Natural language querying ("Show me Lambda errors from the last hour")
Multi-environment support and comparison
Predictive analytics and anomaly detection
Integration with other observability platforms

📝 Contributing: Review the TODO.md for ways to contribute to future development

🎯 Focus Areas: HTTP improvements and MCP server development are immediate priorities**

Author Information

Sebastian Vietz