ai-knowledge

Author: lizhe-0423

ai-knowledge is a Java knowledge base system built on Spring AI that uses Retrieval-Augmented Generation (RAG). It handles document parsing, vector embedding, semantic search, and question answering, and can run embeddings locally through Ollama or in the cloud through OpenAI.

🚀 AI Knowledge Base Retrieval System

An enhanced RAG (Retrieval-Augmented Generation) intelligent knowledge base system built on the Spring AI framework, with Ollama and OpenAI integration.

📖 Project Overview

This project is a knowledge base system built around Retrieval-Augmented Generation (RAG) and aimed at enterprise use. It combines several large language models to cover the whole pipeline, from document parsing to question answering.

✨ Core Features

🔍 RAG (Retrieval-Augmented Generation)

Key Capabilities:

📄 Multi-format Document Processing: Support for PDF, Word, Markdown, and other document formats via Apache Tika
🔗 Git Repository Integration: Automatic repository cloning and code analysis using JGit
🧠 Dual Embedding Models:
- Local nomic-embed-text model via Ollama for privacy and cost control
- OpenAI text-embedding-ada-002 for high-quality embeddings
🗄️ Vector Storage: PostgreSQL with pgvector extension for persistent vector storage
🔄 Flexible Model Switching: Configuration-based switching between local and cloud models

Technical Benefits:

Enhanced search accuracy through semantic understanding
Cost-effective hybrid model approach
Scalable vector storage solution
Privacy-preserving local processing option
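
As a rough illustration of what semantic search over stored embeddings involves, the sketch below ranks entries by cosine similarity in plain Java. The class `SimilaritySketch`, the `Entry` record, and the two-dimensional vectors are made up for this example. The project itself delegates this step to its vector store, which may use a different distance metric or an index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy in-memory nearest-neighbour search; all names here are hypothetical. */
final class SimilaritySketch {

    /** A stored chunk of text together with its embedding vector. */
    record Entry(String text, double[] vector) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k entries most similar to the query, best match first. */
    static List<Entry> topK(List<Entry> entries, double[] query, int k) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingDouble((Entry e) -> cosine(e.vector(), query)).reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        List<Entry> store = List.of(
                new Entry("pgvector setup", new double[] {1.0, 0.0}),
                new Entry("redis caching", new double[] {0.0, 1.0}),
                new Entry("vector indexes", new double[] {0.9, 0.1}));
        for (Entry e : topK(store, new double[] {1.0, 0.0}, 2)) {
            System.out.println(e.text());
        }
    }
}
```

Real embeddings have hundreds or thousands of dimensions, so a linear scan like this only suits small collections. That is one reason to keep vectors in a database with vector indexing.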

🤖 AI-Powered Q&A System

Core Workflow:

Document Ingestion: Parse and chunk documents using the Spring AI Tika integration
Vector Embedding: Convert text to vectors using the selected embedding model
Semantic Search: Retrieve relevant documents from the vector database
Answer Generation: Generate contextual responses using OpenAI GPT models
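
To make the first and last steps of the workflow above more concrete, here is a minimal plain-Java sketch of chunking and prompt assembly. It is not the project's code: `RagStepsSketch`, `chunk`, and `buildPrompt` are hypothetical names, fixed-size character windows stand in for whatever splitter the project actually uses, and the prompt wording is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of two workflow steps: chunking and prompt assembly. Names are hypothetical. */
final class RagStepsSketch {

    /** Splits text into windows of at most `size` characters that overlap by `overlap` characters. */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    /** Builds a prompt that places the retrieved chunks before the user's question. */
    static String buildPrompt(List<String> retrieved, String question) {
        StringBuilder sb = new StringBuilder("Answer using only the context below.\n\nContext:\n");
        for (int i = 0; i < retrieved.size(); i++) {
            sb.append('[').append(i + 1).append("] ").append(retrieved.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> chunks = chunk("abcdefghij", 4, 1);
        System.out.println(chunks);
        System.out.println(buildPrompt(chunks.subList(0, 2), "What comes after d?"));
    }
}
```

Overlapping windows reduce the chance that a sentence relevant to a question is cut in half at a chunk boundary, at the cost of storing some text twice.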

Application Scenarios:

Enterprise knowledge management
Technical documentation Q&A
Code repository analysis and search
Intelligent customer support

🏗️ Technical Architecture

Supported AI Models

Ollama Models: Local deployment with nomic-embed-text for embedding
OpenAI GPT Series: Cloud-based models for text generation and embedding
Extensible Framework: Easy integration of additional model providers

Core Technology Stack

Backend Framework: Spring Boot 3.2.3 with Spring AI
Vector Database: PostgreSQL with pgvector extension
Caching: Redis for performance optimization
Document Processing: Apache Tika for multi-format support
API Documentation: Swagger UI with Knife4j enhancements
Containerization: Docker support for easy deployment

Key Dependencies

Spring AI BOM for AI model integration
Redisson for Redis operations
JGit for Git repository handling
FastJSON for JSON processing
HikariCP for database connection pooling

🚀 Quick Start

Prerequisites

Java 17+
PostgreSQL with pgvector extension
Redis server
Ollama (for local models)
OpenAI API key (for cloud models)

Configuration

Database Setup: Configure the PostgreSQL connection in application-dev.yml
AI Models: Set up Ollama locally or configure OpenAI API credentials
Vector Storage: Choose between SimpleVectorStore (in-memory) and PgVectorStore (persistent)
Embedding Model: Set spring.ai.rag.embed to select the embedding model

Running the Application

# Clone the repository
git clone <repository-url>

# Navigate to the project directory
cd ai-knowledge

# Run with Maven
mvn spring-boot:run -pl dev-tech-app

The application will start on port 8090 with Swagger UI available at /swagger-ui.html.

📊 System Architecture

📊 RAG Workflow

🔧 Configuration Options

Embedding Model Selection

Local Model: Set spring.ai.rag.embed=nomic-embed-text to keep data local and avoid API costs
Cloud Model: Set spring.ai.rag.embed=text-embedding-ada-002 for higher-quality embeddings
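
One way such a switch could be wired up is to map the configured value to a provider and fail fast on anything unrecognized. The sketch below is an assumption about how this might look, not the project's implementation. `EmbedSelectionSketch`, `Provider`, and `resolve` are hypothetical names, and only the two model names come from this README.

```java
import java.util.Locale;

/** Sketch of mapping the spring.ai.rag.embed value to a provider. Names are hypothetical. */
final class EmbedSelectionSketch {

    enum Provider { OLLAMA, OPENAI }

    /** Resolves a configured model name to the provider that serves it. */
    static Provider resolve(String configuredModel) {
        if (configuredModel == null || configuredModel.isBlank()) {
            throw new IllegalArgumentException("spring.ai.rag.embed is not set");
        }
        String model = configuredModel.trim().toLowerCase(Locale.ROOT);
        switch (model) {
            case "nomic-embed-text":
                return Provider.OLLAMA;
            case "text-embedding-ada-002":
                return Provider.OPENAI;
            default:
                throw new IllegalArgumentException("unsupported embedding model: " + configuredModel);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("nomic-embed-text"));
        System.out.println(resolve("text-embedding-ada-002"));
    }
}
```

Vectors produced by different embedding models are not comparable with each other, so documents indexed with one model generally need to be re-embedded after switching to the other.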