FastAPI-BitNet

FastAPI-BitNet provides a REST API for managing and interacting with `llama.cpp`-based BitNet model instances. Designed for developers and researchers, it gives programmatic control over those instances for automated testing, benchmarking, and interactive chat sessions. Key features include session management, batch operations, and model benchmarking.

GitHub Stars: 34

Forks: 8
Issues: 0

Installation

Difficulty: Intermediate
Estimated Time: 10-20 minutes

Prerequisites

Docker Desktop: Latest version
Conda: Latest version (or another Python environment manager)
Python: 3.10+

Installation Steps

1. Set Up the Python Environment

Create and activate a Conda environment:
bash
conda create -n bitnet python=3.11
conda activate bitnet

Install the Hugging Face CLI, which is used to download the models:
bash
pip install -U "huggingface_hub[cli]"

Download Microsoft's official BitNet model:
bash
huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir app/models/BitNet-b1.58-2B-4T
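
Before building the image, it can help to confirm that the download actually landed in the expected directory. The following is a minimal sketch, not part of the project: the helper name `find_gguf_files` is hypothetical, and it assumes the repository ships its weights as files with a `.gguf` extension directly inside the `--local-dir` target used above.

```python
from pathlib import Path
from typing import List

# Directory passed to --local-dir in the download command above.
MODEL_DIR = Path("app/models/BitNet-b1.58-2B-4T")


def find_gguf_files(model_dir: Path = MODEL_DIR) -> List[Path]:
    """Return the .gguf files found directly inside model_dir, sorted by name.

    Hypothetical helper: returns an empty list if the directory is missing.
    """
    if not model_dir.is_dir():
        return []
    return sorted(
        p for p in model_dir.iterdir() if p.is_file() and p.suffix == ".gguf"
    )


def describe(model_dir: Path = MODEL_DIR) -> str:
    """Build a one-line-per-file summary with sizes in megabytes."""
    files = find_gguf_files(model_dir)
    if not files:
        return f"No .gguf files found in {model_dir}"
    return "\n".join(
        f"{p.name}: {p.stat().st_size / 1e6:.1f} MB" for p in files
    )
```

Calling `print(describe())` from the project root reports what was downloaded, or says that nothing was found.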

2. Running the Application

Using Docker (Recommended)

1. Build the Docker image:
bash
docker build -t fastapi_bitnet .

2. Run the Docker container:
bash
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet

Local Development

You can run the application directly with Uvicorn:
bash
uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload
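
Whichever way the server is started, a script that talks to it may need to wait until port 8080 accepts connections. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, not something the project provides. Note that a successful TCP connection only shows that something is listening. With Docker port publishing in particular, the port may accept connections before the application inside the container is ready, so treat this as a first check rather than a health check.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8080,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Returns True on the first successful connection, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A caller would typically run `wait_for_port()` once after `docker run` or `uvicorn` and abort with a clear message if it returns `False`.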

Configuration

Basic Configuration

FastAPI-BitNet Setup

No configuration files are required. Set environment variables as needed, for example:
bash
export API_KEY="your-api-key"
export DEBUG="true"
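
This page does not say how the application consumes these variables, so the following is only a sketch of one common way to read them in Python. The `Settings` class and `load_settings` function are hypothetical names, and the set of strings treated as "true" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables shown above."""
    api_key: Optional[str]
    debug: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    debug_raw = env.get("DEBUG", "false").strip().lower()
    return Settings(
        # Treat an empty API_KEY the same as an unset one.
        api_key=env.get("API_KEY") or None,
        # Assumed truthy spellings; anything else counts as False.
        debug=debug_raw in {"1", "true", "yes", "on"},
    )
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to test without touching the real environment.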

Advanced Configuration

Security Settings

Store API keys in environment variables or secure configuration files
Set appropriate file access permissions
Adjust logging levels

Performance Tuning

Configure timeout values
Limit concurrent executions
Set up caching
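
The page does not describe server-side settings for these items, so the sketch below illustrates two of them on the client side instead: capping how many requests are in flight and caching replies to repeated prompts. `make_cached` and `run_limited` are hypothetical helpers, and `send` stands for any function that takes a prompt and returns a reply. Caching model replies is only sensible when identical prompts are expected to give identical answers, which is an assumption about how the model is sampled.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import Callable, List, Sequence


def make_cached(send: Callable[[str], str],
                maxsize: int = 256) -> Callable[[str], str]:
    """Wrap send so repeated identical prompts are answered from memory."""
    return lru_cache(maxsize=maxsize)(send)


def run_limited(send: Callable[[str], str], prompts: Sequence[str],
                max_workers: int = 4) -> List[str]:
    """Call send for each prompt with at most max_workers requests in flight.

    Results are returned in the same order as the prompts.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, prompts))
```

Per-request timeouts belong inside `send` itself, for example by passing `timeout=` to `requests.post`.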

Configuration Example

Basic Configuration

json
{
  "mcpServers": {
    "bitnet-mcp": {
      "command": "python",
      "args": ["-m", "app.main"],
      "env": {
        "API_KEY": "your-api-key"
      }
    }
  }
}
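
When this file is generated or edited by a script, a small shape check can catch typos before a client tries to load it. The sketch below builds the same structure as the JSON example and checks only what that example shows (`command`, `args`, `env`). Both function names are hypothetical, and the validation rules are assumptions drawn from the example rather than from any schema.

```python
import json
from typing import List


def build_mcp_config(api_key: str, server_name: str = "bitnet-mcp") -> dict:
    """Return a config dict with the same shape as the JSON example above."""
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": ["-m", "app.main"],
                "env": {"API_KEY": api_key},
            }
        }
    }


def check_mcp_config(config: dict) -> List[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["'mcpServers' must be a non-empty object"]
    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        if not isinstance(entry.get("command"), str):
            problems.append(f"{name}: 'command' must be a string")
        args = entry.get("args", [])
        if not (isinstance(args, list) and all(isinstance(a, str) for a in args)):
            problems.append(f"{name}: 'args' must be a list of strings")
        env = entry.get("env", {})
        if not (isinstance(env, dict)
                and all(isinstance(v, str) for v in env.values())):
            problems.append(f"{name}: 'env' values must be strings")
    return problems
```

`json.dumps(build_mcp_config("your-api-key"), indent=2)` produces text equivalent to the example, ready to write to whichever file the client reads.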

Examples

Basic Usage

Interactive Chat Using the API

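python
import requests

def send_prompt(prompt, base_url="http://localhost:8080"):
    """POST a single prompt to the /chat endpoint and return the decoded JSON reply."""
    response = requests.post(
        f"{base_url}/chat",
        json={"prompt": prompt},
        timeout=60,  # inference can be slow; adjust to your hardware
    )
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return response.json()

result = send_prompt("Hello, what is your name?")
print(result)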

Executing Batch Operations

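python
import requests

prompts = ["Hello", "How are you?"]

# Send all prompts in a single request to the /batch endpoint.
response = requests.post(
    "http://localhost:8080/batch",
    json={"prompts": prompts},
    timeout=120,  # a batch takes longer than a single prompt
)
response.raise_for_status()  # raise on HTTP errors
print(response.json())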
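
For long prompt lists, one very large request can run into timeouts. A possible approach, sketched below, is to split the list into fixed-size chunks and send each chunk separately. `chunked` and `send_in_chunks` are hypothetical helpers, and `post_batch` stands for any function that sends one list of prompts to `/batch` and returns the decoded reply. Whether the server imposes a batch size limit is not stated on this page.

```python
from typing import Any, Callable, Iterator, List, Sequence


def chunked(prompts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of prompts, each with at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(prompts), size):
        yield list(prompts[start:start + size])


def send_in_chunks(prompts: Sequence[str], size: int,
                   post_batch: Callable[[List[str]], Any]) -> List[Any]:
    """Send prompts chunk by chunk and collect one reply per chunk, in order."""
    return [post_batch(chunk) for chunk in chunked(prompts, size)]
```

With `requests`, `post_batch` could be a function that calls `requests.post("http://localhost:8080/batch", json={"prompts": chunk}, timeout=120)` and returns `response.json()`.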

Use Cases

Researchers testing multiple BitNet model instances simultaneously to compare performance.
Developers building interactive chatbots that generate responses based on user input.
System administrators monitoring resource usage to find optimal server configurations.
Data scientists benchmarking model performance to identify areas for improvement.
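
As a small illustration of the benchmarking use case, the sketch below measures client-side latency for any callable that sends a prompt. It does not use the project's own benchmarking features, which this page does not describe. `time_calls` and `summarize` are hypothetical helpers, and the measured times include network and serialization overhead, not just model inference.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def time_calls(fn: Callable[[str], object], inputs: Iterable[str],
               clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Call fn once per input and return the elapsed seconds for each call."""
    latencies = []
    for item in inputs:
        start = clock()
        fn(item)
        latencies.append(clock() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Summarize a non-empty list of latencies in seconds."""
    ordered = sorted(latencies)
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

Passing a function such as the `send_prompt` example to `time_calls`, along with a list of prompts, gives per-request timings that `summarize` reduces to a few comparable numbers.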