memory-engine

Name: memory-engine
Author: Janek Wenning

An open-source knowledge management system that converts unstructured text into a graph database with automatic relationship discovery and semantic search. Built with JanusGraph and Milvus, designed to work with both commercial and local AI models. Currently in alpha development.

Memory Engine

A semantic knowledge management system that combines graph-based knowledge representation with vector embeddings for information storage, retrieval, and synthesis.

🌟 Overview

Memory Engine is an experimental knowledge management system that transforms unstructured text into a structured, searchable knowledge graph. It combines graph databases with vector embeddings to create a foundation for applications that can understand, connect, and reason about information.

⚠️ Important Notice

This is a personal open-source project developed for learning and research purposes. No guarantees are made regarding reliability, security, or suitability for production use. Use at your own risk.

🚧 Project Status

This project is currently in active development (v0.5.0 - Orchestrator Integration) and should be considered experimental.

Vision

Our goal is to create a truly open and accessible knowledge management system that works with:

Any AI model: Commercial APIs (OpenAI, Anthropic, Google) and local models (Ollama, Hugging Face)
Any deployment: From laptop development to distributed production systems
Any data: Text, documents, structured data, and multimedia content

We aim to eliminate dependency on paid APIs by providing full support for local model execution, making advanced knowledge management accessible to everyone.

🎯 What Memory Engine Does

Input: Unstructured text, documents, or data

Output: Structured knowledge with automatic relationships and semantic search capabilities

Core Functions

Knowledge Ingestion: Feed text/documents → Engine extracts entities, facts, and relationships → Stores in graph database
Knowledge Retrieval: Query in natural language → Engine searches semantically → Returns relevant information with context
Automatic Processing: The engine handles relationship discovery, quality assessment, versioning, and optimization internally

Key Features

🧠 AI & Language Models

Multi-LLM Support: Five LLM providers (Gemini, OpenAI, Anthropic, Ollama, HuggingFace)
LLM Independence: Fallback chains and circuit breaker pattern for resilience
Local Operation: Complete offline capabilities with Ollama and HuggingFace Transformers
Automatic Relationship Discovery: Detects and creates relationships between knowledge entities
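
The fallback chain and circuit breaker pattern listed above can be illustrated with a small, self-contained sketch. This is not Memory Engine's implementation: `CircuitBreaker`, `FallbackChain`, and their parameters are hypothetical names chosen for this example, and the providers are plain callables standing in for real LLM clients.

```python
import time
from typing import Callable, List, Optional, Tuple

# Hypothetical provider signature: takes a prompt, returns generated text.
Provider = Callable[[str], str]


class CircuitBreaker:
    """Blocks a provider for `cooldown` seconds after `max_failures` consecutive errors."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, let a trial call through (half-open).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


class FallbackChain:
    """Tries providers in order, skipping any whose breaker is open."""

    def __init__(self, providers: List[Tuple[str, Provider]],
                 max_failures: int = 3, cooldown: float = 30.0) -> None:
        self._entries = [(name, call, CircuitBreaker(max_failures, cooldown))
                         for name, call in providers]

    def complete(self, prompt: str) -> str:
        errors = []
        for name, call, breaker in self._entries:
            if not breaker.allow():
                errors.append(f"{name}: circuit open")
                continue
            try:
                result = call(prompt)
            except Exception as exc:  # any provider error triggers fallback
                breaker.record_failure()
                errors.append(f"{name}: {exc}")
                continue
            breaker.record_success()
            return result
        raise RuntimeError("all providers failed: " + "; ".join(errors))
```

A chain built as `FallbackChain([("ollama", local_call), ("openai", remote_call)])` would prefer the first provider and only reach the second when the first raises or its breaker is open.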

⚡ Performance & Production

Advanced Caching: Multi-level caching with TTL, memory limits, and intelligent invalidation
Connection Pooling: Health monitoring and configurable pool management
Query Optimization: Prepared statements and batch processing for high throughput
Memory Management: Garbage collection optimization and automatic resource cleanup
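
To make "caching with TTL and memory limits" concrete, here is a minimal single-level sketch. The class name `TTLCache` and its parameters are assumptions for illustration only; the engine's actual cache is multi-level and its API may look different.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Tuple


class TTLCache:
    """Illustrative cache: entries expire after `ttl` seconds, and the
    least recently used entry is evicted once `max_entries` is exceeded."""

    def __init__(self, max_entries: int = 128, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_entries = max_entries
        self.ttl = ttl
        self.clock = clock
        self._data: "OrderedDict[Hashable, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Expired entries are dropped lazily on access.
            del self._data[key]
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used

    def invalidate(self, key: Hashable) -> None:
        self._data.pop(key, None)
```

Injecting the clock keeps the expiry logic testable without sleeping; a real deployment would also need locking if the cache is shared across threads.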

🛠️ Operations & Management

Health Monitoring: Comprehensive system health checks and service monitoring
CLI Tools: Complete command-line interface for all management operations
Migration Tools: Backend migration utilities with multiple strategies
Backup & Restore: Automated backup with compression and retention policies

🔌 Extensibility

Plugin Architecture: Custom storage backends, LLM providers, and embedding providers
Data Export/Import: Multiple formats (JSON, CSV, XML, GraphML, Cypher, Gremlin, RDF)
Metrics Collection: Prometheus-compatible metrics with counters, gauges, histograms
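
One common way to structure a plugin architecture is a registry keyed by plugin type and name. The sketch below shows that general idea; `register`, `create`, and `InMemoryStorage` are hypothetical names, not Memory Engine's plugin API.

```python
from typing import Any, Callable, Dict

# Plugin kinds mirror the three extension points named above.
_REGISTRY: Dict[str, Dict[str, type]] = {"storage": {}, "llm": {}, "embedding": {}}


def register(kind: str, name: str) -> Callable[[type], type]:
    """Class decorator that records a plugin under (kind, name)."""
    def decorator(cls: type) -> type:
        if kind not in _REGISTRY:
            raise ValueError(f"unknown plugin kind: {kind!r}")
        if name in _REGISTRY[kind]:
            raise ValueError(f"{kind} plugin {name!r} is already registered")
        _REGISTRY[kind][name] = cls
        return cls
    return decorator


def create(kind: str, name: str, **options: Any) -> Any:
    """Instantiate a registered plugin, passing options to its constructor."""
    try:
        cls = _REGISTRY[kind][name]
    except KeyError:
        raise LookupError(f"no {kind} plugin named {name!r}") from None
    return cls(**options)


@register("storage", "memory")
class InMemoryStorage:
    """Toy storage backend used only to demonstrate registration."""

    def __init__(self, capacity: int = 1000) -> None:
        self.capacity = capacity
        self.nodes: Dict[str, dict] = {}
```

With this layout, `create("storage", "memory", capacity=5)` returns an `InMemoryStorage` instance, and a configuration value can select the backend by name.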

🔍 Knowledge Management

Semantic Search: Multi-provider vector embeddings with modular vector stores
Modular Storage: Choose from JanusGraph, SQLite, or JSON backends
Quality Enhancement: Automated quality assessment and contradiction resolution
Version Control: Complete change tracking and rollback capabilities
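
Semantic search over embeddings usually comes down to ranking stored vectors by similarity to a query vector. The following sketch shows that ranking step with cosine similarity in plain Python. The function names and the toy two-dimensional vectors are assumptions for illustration; in Memory Engine the vectors come from an embedding provider and the search runs in a vector store such as Milvus.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k node IDs whose vectors are most similar to the query."""
    scored = [(node_id, cosine_similarity(query, vector))
              for node_id, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: node ID -> embedding (real embeddings have hundreds of dimensions).
index = {"ml": [1.0, 0.0], "cooking": [0.0, 1.0], "ai": [0.9, 0.1]}
nearest = top_k([1.0, 0.0], index, k=2)
```

This brute-force scan is linear in the number of stored vectors; dedicated vector stores use approximate indexes to avoid that cost at scale.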

🔐 Security & Integration

Basic Security Features: Authentication, RBAC, encryption, and audit logging (educational purposes)
Privacy Controls: Fine-grained knowledge privacy levels and access control
Flexible Integration: MCP (Module Communication Protocol) interface for external systems
Agent Support: Google ADK integration for conversational knowledge interactions

🚀 Quick Start

Prerequisites

Python 3.8+
Docker & Docker Compose (optional, for JanusGraph/Milvus)
An API key for at least one LLM provider, or a local model setup:
- Google Gemini API key
- OpenAI API key
- Anthropic API key
- Local models with Ollama or HuggingFace (no API key needed)

1. Installation

# Clone the repository
git clone https://github.com/Celebr4tion/memory-engine.git
cd memory-engine

# Run automated setup
./scripts/setup.sh

The setup script will:

Check Python version compatibility
Create virtual environment
Install dependencies
Create configuration template
Set up development tools

2. Environment Setup

# Edit the .env file created by setup
# Set your preferred LLM provider API keys (at least one required)
GOOGLE_API_KEY="your-gemini-api-key"           # For Gemini
OPENAI_API_KEY="your-openai-api-key"           # For OpenAI GPT
ANTHROPIC_API_KEY="your-anthropic-api-key"     # For Claude
HUGGINGFACE_API_KEY="your-hf-api-key"          # For HuggingFace API (optional)

# Optional: Set environment (defaults to development)
ENVIRONMENT="development"
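
Before starting the engine, it can help to confirm which provider keys are visible to the Python process. The helper below is a sketch and not part of Memory Engine; it only uses the variable names shown above and assumes the values from `.env` have already been loaded into the process environment (for example, exported in the shell).

```python
import os
from typing import Dict, List, Mapping

# Environment variable names taken from the .env template above.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return providers whose API key variable is set to a non-blank value."""
    return [provider for provider, var in PROVIDER_ENV_VARS.items()
            if env.get(var, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("API keys found for:", ", ".join(found))
    else:
        print("No API keys found; only local models (Ollama, HuggingFace) are usable.")
```

The check only tests for presence, so an unedited placeholder such as "your-openai-api-key" would still count as configured.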

3. Start Infrastructure (Optional)

For production storage backends:

# Start JanusGraph and Milvus (optional, for production storage)
cd docker
docker-compose up -d

# Wait for services to initialize (2-3 minutes)
docker-compose logs -f

For development, you can use lightweight storage backends (SQLite/JSON) that don't require external services.

4. Basic Usage

from memory_core.core.knowledge_engine import KnowledgeEngine
from memory_core.model.knowledge_node import KnowledgeNode

# Initialize the system
engine = KnowledgeEngine()
engine.connect()

# Create knowledge from text
node = KnowledgeNode(
    content="Machine learning is a subset of artificial intelligence",
    source="AI Textbook",
    rating_truthfulness=0.9
)

# Save to knowledge graph
node_id = engine.save_node(node)
print(f"Created knowledge node: {node_id}")

# Retrieve and explore
retrieved = engine.get_node(node_id)
print(f"Content: {retrieved.content}")

5. CLI Management (v0.4.0+) & Orchestrator Features (v0.5.0+)

Memory Engine includes a CLI for management operations:

# Initialize a new Memory Engine instance
memory-engine init --backend=sqlite --embedding=sentence_transformers

# Check system health
memory-engine health-check --detailed

# Migrate between storage backends
memory-engine migrate --from=sqlite --to=janusgraph --verify

# Export knowledge graph data
memory-engine export --format=json --output=backup.json --include-metadata

# Import data from various formats
memory-engine import --file=data.json --merge-duplicates

# Create system backups
memory-engine backup --strategy=full --compression=gzip

# Restore from backup
memory-engine restore --backup=backup_12345 --clear-existing

# Manage plugins
memory-engine plugins list --type=storage
memory-engine plugins install custom-backend

# Configuration management
memory-engine config show --section=storage
memory-engine config set storage.backend janusgraph
memory-engine config validate

# System status
memory-engine status
memory-engine version

# Orchestrator Integration (v0.5.0+)
# Start streaming MCP operations
memory-engine mcp stream-query --query="knowledge about AI" --batch-size=50

# Manage event system
memory-engine events list --status=pending
memory-engine events replay --from-timestamp=1234567890

# Module registry management
memory-engine modules list --capabilities
memory-engine modules register my-custom-module

# Advanced GraphQL-like queries
memory-engine query build --type=nodes --filter="content contains 'AI'" --limit=10
memory-engine query execute --query-file=complex_query.json

📖 Documentation

Document	Description
📋 Setup Guide	Complete installation and configuration instructions
⚙️ Configuration	Basic configuration and environment setup
🔧 Advanced Configuration	Advanced configuration system
🏗️ Architecture	System architecture and component interactions
🏗️ Project Structure	Detailed project organization and structure
📡 API Reference	Complete API documentation including MCP interface
🔐 Security Framework	Authentication, RBAC, encryption, and privacy controls
🔧 Troubleshooting	Common issues and solutions

💻 Examples

Explore practical examples in the examples/ directory:

Basic Usage: Core operations and workflows
Knowledge Extraction: Text processing and knowledge extraction
MCP Integration: Using the Module Communication Protocol
Security Framework: Authentication, RBAC, encryption, and privacy controls
Advanced Queries: Complex querying and analytics
Knowledge Synthesis: Question answering and insight discovery

Run Examples

# Ensure infrastructure is running
cd docker && docker-compose up -d

# Run basic usage example
python examples/basic_usage.py

# Run knowledge extraction demo
python examples/knowledge_extraction.py

# Test MCP interface
python examples/mcp_client_example.py

# Try configuration system
python examples/config_example.py

🧪 Testing

Memory Engine includes a comprehensive test suite organized by type:

# Run all tests
./scripts/test.sh all

# Run only unit tests (fast, no external dependencies)
./scripts/test.sh unit

# Run integration tests (requires JanusGraph and Milvus)
./scripts/test.sh integration

# Run tests with coverage report
./scripts/test.sh coverage

# Run specific test file
./scripts/test.sh --file config_manager

Test organization:

Unit Tests (tests/unit/): Fast, isolated tests
Integration Tests (tests/integration/): Tests requiring external services
Component Tests (tests/): End-to-end component testing

🏗️ Architecture

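Memory Engine uses a layered architecture, with application interfaces on top, the Knowledge Engine core in the middle, and processing, storage, and model components underneath: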

┌─────────────────────────────────────────────────────────────────┐
│                    Application Layer                            │
├─────────────────┬─────────────────┬─────────────────┬───────────┤
│   Python API    │   MCP Interface │  Knowledge Agent│ REST API  │
├─────────────────┴─────────────────┴─────────────────┴───────────┤
│                    Knowledge Engine Core                        │
├─────────────────┬─────────────────┬─────────────────┬───────────┤
│   Knowledge     │   Relationship  │    Versioning   │  Rating   │
│   Processing    │   Extraction    │    Manager      │  System   │
├─────────────────┼─────────────────┼─────────────────┼───────────┤
│   Graph Store   │   Vector Store  │   Embedding     │  LLM API  │
│  (JanusGraph)   │   (Milvus)      │   Manager       │ (Gemini)  │
└─────────────────┴─────────────────┴─────────────────┴───────────┘

Core Components

Modular Graph Storage: Multiple backend options (JanusGraph, SQLite, JSON file)
Vector Database (Milvus): Enables semantic similarity search
Embedding System: Generates and manages vector representations
Processing Pipeline: Extracts and structures knowledge from text
Versioning System: Tracks changes and enables rollbacks
MCP Interface: Standardized API for external integration

Storage Backend Options

Choose the storage backend that fits your deployment needs:

🏢 JanusGraph: Production-grade distributed graph database
💾 SQLite: Single-user deployments with SQL capabilities
📄 JSON File: Development and testing with human-readable storage

🔧 Technology Stack

Component	Technology	Purpose
Graph Storage	JanusGraph / SQLite / JSON	Knowledge relationships
Vector Database	Milvus / ChromaDB / NumPy	Similarity search
LLM Providers	Gemini / OpenAI / Anthropic / Ollama / HuggingFace	Knowledge extraction
Embedding Providers	Gemini / OpenAI / Sentence Transformers / Ollama	Vector generation
Agent Framework	Google ADK	Conversational interfaces
Web Framework	FastAPI	REST API endpoints
Language	Python 3.8+	Core implementation

🧪 Development

Running Tests

# Unit tests only
pytest tests/ -k "not integration" -v

# All tests (requires infrastructure)
pytest tests/ -v

# With coverage
pytest tests/ --cov=memory_core --cov-report=html

Development Setup

# Install development dependencies
pip install pytest pytest-cov black isort mypy

# Format code
black memory_core/ tests/
isort memory_core/ tests/

# Type checking
mypy memory_core/

# Pre-commit hooks
pip install pre-commit
pre-commit install

📊 Performance

Performance characteristics will vary depending on your hardware, data complexity, and configuration. We recommend testing with your specific use case and data to establish realistic benchmarks.

🤝 Contributing

We welcome contributions! Please see our contributing guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes with tests
Ensure all tests pass (pytest)
Format code (black . && isort .)
Commit changes (git commit -m 'Add amazing feature')
Push to branch (git push origin feature/amazing-feature)
Open a Pull Request

Development Standards

Code Quality: All code must pass linting and type checking
Testing: Maintain >90% test coverage
Documentation: Update docs for any API changes
Performance: Benchmark performance-critical changes

📝 License

This project is licensed under the Hippocratic License 3.0 - an ethical source license that promotes responsible use of software while protecting human rights and environmental sustainability.