kreuzberg

Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.

GitHub Stars: 2,340

Forks: 95
Issues: 5

Installation
Difficulty: Intermediate
Estimated Time: 10-20 minutes

Prerequisites

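Required software and versions:
Python: 3.9 or higher (recent releases may require a newer minimum; check the project's pyproject.toml)
pip: A recent version (upgrade with python -m pip install --upgrade pip)
Pandoc and Tesseract: Installed separately as system packages; they are used for some document formats and for OCR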

Installation Steps

1. Clone Repository

bash
git clone https://github.com/Goldziher/kreuzberg.git
cd kreuzberg

2. Install Dependencies

bash
# Install the package and its dependencies from the cloned source tree
pip install .

3. Verify Environment

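Confirm that the installation succeeded. Running pip check reports any missing or conflicting dependencies, and importing the package from a Python shell confirms that it is visible to the interpreter you intend to use.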
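The verification step can also be scripted. The following is a minimal sketch, not part of the project itself: the function name check_environment, the minimum Python version, and the list of external tools are assumptions for illustration (the tools are taken from the project description), so adjust them to match the prerequisites for the release you install.

```python
import importlib.util
import shutil
import sys

# Assumed minimum version; set this to the value given in the prerequisites.
MIN_PYTHON = (3, 9)
# Assumed external tools, based on the project description.
SYSTEM_TOOLS = ("pandoc", "tesseract")


def check_environment(package="kreuzberg", tools=SYSTEM_TOOLS, min_python=MIN_PYTHON):
    """Return a dict mapping each check name to True (passed) or False (failed)."""
    results = {"python": sys.version_info[:2] >= tuple(min_python)}
    # find_spec returns None when a top-level package cannot be located.
    results[package] = importlib.util.find_spec(package) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run the script with the same interpreter you used for the install, so that the package lookup reflects the right environment.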

Troubleshooting

Common Issues

Issue: Dependencies fail to install.
Solution: Check that your Python and pip versions meet the prerequisites, upgrade pip if it is out of date, and rerun the install command.
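When checking versions, it helps to confirm which interpreter and which pip are actually in use, since a mismatch between the two is a common cause of failed installs. The sketch below collects that information using only the standard library; the helper names pip_version and report are hypothetical and not part of kreuzberg.

```python
import subprocess
import sys


def pip_version():
    """Return the output of `python -m pip --version`, or None if pip is unavailable."""
    proc = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip() if proc.returncode == 0 else None


def report():
    """Collect the interpreter path, the Python version, and the pip version."""
    return {
        "executable": sys.executable,
        "python": ".".join(map(str, sys.version_info[:3])),
        "pip": pip_version(),
    }


if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value}")
```

Invoking pip through sys.executable, rather than a bare pip command, ensures the reported pip belongs to the interpreter that runs the script.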