FastAPI-BitNet
FastAPI-BitNet provides a robust REST API for managing and interacting with `llama.cpp`-based BitNet model instances. Designed for developers and researchers, it allows programmatic control of automated testing, benchmarking, and interactive chat sessions. Key features include session management, batch operations, and model benchmarking.
GitHub Stars: 34 | Forks: 8 | Issues: 0 | Views: 30 | Favorites: 0 | User Rating: Not Rated
Installation

Difficulty: Intermediate
Estimated Time: 10-20 minutes
Prerequisites
Docker Desktop: Latest version
Conda: Latest version (or another Python environment manager)
Python: 3.10+
Installation Steps
1. Set Up the Python Environment
Create and activate a Conda environment, then download the BitNet model weights:

```bash
conda create -n bitnet python=3.11
conda activate bitnet
pip install -U "huggingface_hub[cli]"
huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir app/models/BitNet-b1.58-2B-4T
```
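After the download finishes, you can sanity-check that the GGUF weights landed in the expected folder. A minimal sketch, assuming the path from the `--local-dir` argument above (the exact filenames inside it may differ):

```python
from pathlib import Path

# Path mirrors the --local-dir argument in the download command above.
model_dir = Path("app/models/BitNet-b1.58-2B-4T")

# List any GGUF files that were downloaded; exact filenames may vary.
gguf_files = sorted(model_dir.glob("*.gguf"))
if gguf_files:
    for f in gguf_files:
        print(f"{f.name}: {f.stat().st_size / 1e9:.2f} GB")
else:
    print(f"No .gguf files found in {model_dir} - check the download step.")
```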
2. Running the Application
Using Docker (Recommended)

1. Build the Docker image:

```bash
docker build -t fastapi_bitnet .
```

2. Run the Docker container:

```bash
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
```
Running Locally (Without Docker)

Alternatively, start the server directly with Uvicorn:

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload
```
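Once the container (or Uvicorn process) is up, FastAPI serves its interactive OpenAPI docs at /docs by default, which makes a convenient smoke test. A minimal sketch, assuming the server listens on localhost:8080 as configured above:

```python
import time
import requests

# FastAPI serves interactive API docs at /docs by default.
URL = "http://localhost:8080/docs"

# Poll for up to ~30 seconds while the container starts.
for _ in range(30):
    try:
        if requests.get(URL, timeout=2).status_code == 200:
            print(f"Server is up: {URL}")
            break
    except requests.ConnectionError:
        pass
    time.sleep(1)
else:
    print("Server did not become ready - check `docker logs ai_container`.")
```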
Configuration
Basic Configuration
FastAPI-BitNet Setup
No specific configuration files are required, but you can set environment variables as needed:

```bash
export API_KEY="your-api-key"
export DEBUG="true"
```
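How these variables are consumed depends on the application code. As an illustration only (not necessarily how FastAPI-BitNet itself reads them), environment-based settings are typically picked up like this:

```python
import os

# Illustrative only: read the variables exported above, with safe defaults.
API_KEY = os.getenv("API_KEY", "")                      # empty string means no key configured
DEBUG = os.getenv("DEBUG", "false").lower() == "true"   # treat anything but "true" as off

if DEBUG:
    print("Debug mode enabled; API key configured:", bool(API_KEY))
```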
Advanced Configuration
Security Settings
Store API keys in environment variables or secure configuration files
Set appropriate file access permissions
Adjust logging levels
Performance Tuning
Configure timeout values
Limit concurrent executions (see the client-side sketch after this list)
Set up caching
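Server-side tuning depends on your deployment, and the project's own settings are not shown here. On the client side, concurrent load against the API can be capped with a worker pool; a minimal sketch reusing the /chat endpoint from the Examples section (everything else is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
import requests

def ask(prompt):
    # /chat endpoint and payload shape taken from the Examples section below.
    return requests.post(
        "http://localhost:8080/chat",
        json={"prompt": prompt},
        timeout=60,
    ).json()

prompts = [f"Prompt {i}" for i in range(10)]

# Cap in-flight requests at 4 regardless of how many prompts are queued.
with ThreadPoolExecutor(max_workers=4) as pool:
    for result in pool.map(ask, prompts):
        print(result)
```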
Configuration Example
Basic Configuration
```json
{
  "mcpServers": {
    "bitnet-mcp": {
      "command": "python",
      "args": ["-m", "app.main"],
      "env": {
        "API_KEY": "your-api-key"
      }
    }
  }
}
```
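If you want to sanity-check the file before handing it to an MCP client, a small sketch like the following can catch typos (the filename `mcp_config.json` is hypothetical; save the JSON above wherever your client expects it):

```python
import json
from pathlib import Path

# "mcp_config.json" is a hypothetical filename; place the JSON above wherever
# your MCP client expects its configuration.
config = json.loads(Path("mcp_config.json").read_text())

# Check that every registered server declares a command and a list of args.
for name, server in config.get("mcpServers", {}).items():
    assert "command" in server, f"{name}: missing 'command'"
    assert isinstance(server.get("args", []), list), f"{name}: 'args' must be a list"
    print(f"{name}: {server['command']} {' '.join(server['args'])}")
```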
Examples
Basic Usage
Interactive Chat Using the API
```python
import requests

def send_prompt(prompt):
    response = requests.post(
        'http://localhost:8080/chat',
        json={"prompt": prompt}
    )
    return response.json()

result = send_prompt("Hello, what is your name?")
print(result)
```
Executing Batch Operations
```python
import requests

prompts = ["Hello", "How are you?"]
response = requests.post(
    'http://localhost:8080/batch',
    json={"prompts": prompts}
)
print(response.json())
```
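For anything beyond quick experiments you will usually want timeouts and basic error handling around these calls. A minimal sketch reusing the /chat endpoint shown above (the endpoint path and payload shape come from the example; the function name and defaults are illustrative):

```python
import requests

def send_prompt_safe(prompt, base_url="http://localhost:8080", timeout=30):
    """Send a prompt to the /chat endpoint with a timeout and basic error handling."""
    try:
        response = requests.post(
            f"{base_url}/chat",
            json={"prompt": prompt},
            timeout=timeout,
        )
        response.raise_for_status()  # surface 4xx/5xx responses as exceptions
        return response.json()
    except requests.Timeout:
        return {"error": f"request timed out after {timeout}s"}
    except requests.RequestException as exc:
        return {"error": str(exc)}

print(send_prompt_safe("Summarize BitNet in one sentence."))
```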
Use Cases
Researchers testing multiple BitNet model instances simultaneously to compare performance.
Developers building interactive chatbots that generate responses based on user input.
System administrators monitoring resource usage to find optimal server configurations.
Data scientists benchmarking model performance to identify areas for improvement (a rough latency sketch follows below).
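The project advertises benchmarking features, but only the /chat and /batch endpoints appear in this page's examples. As a rough stand-in, the sketch below times repeated calls to the documented /chat endpoint to get a coarse latency number; everything other than the /chat route is illustrative:

```python
import statistics
import time
import requests

def rough_chat_latency(prompt, runs=5, base_url="http://localhost:8080"):
    """Time repeated calls to the /chat endpoint and report coarse latency stats."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        requests.post(f"{base_url}/chat", json={"prompt": prompt}, timeout=120)
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "mean_s": round(statistics.mean(timings), 3),
        "min_s": round(min(timings), 3),
        "max_s": round(max(timings), 3),
    }

print(rough_chat_latency("Explain 1-bit quantization in one sentence."))
```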