Installation Guide

Requirements

QuantLLM requires Python 3.10 or later. The following are the core dependencies:

PyTorch >= 2.0.0
Transformers >= 4.30.0
CUDA Toolkit (optional, but recommended for GPU support)
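
The version floors above can be checked before installing anything else. The following is a minimal sketch using only the standard library; the helper names (parse_version, check_requirements) are illustrative and not part of QuantLLM, and the version parser is deliberately simple (it ignores pre-release and local suffixes).

```python
import sys
from importlib import metadata

# Minimum versions taken from the requirements list above.
MINIMUMS = {"torch": (2, 0, 0), "transformers": (4, 30, 0)}


def parse_version(text):
    """Turn '2.1.0+cu118' into (2, 1, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return a dict mapping each name to 'ok', 'too old ...', or 'missing'."""
    report = {"python": "ok" if sys.version_info >= (3, 10) else "too old"}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= minimum
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

A "missing" entry for torch or transformers is expected on a fresh environment, since pip installs them as dependencies of quantllm.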

Installation Methods

1. From PyPI (Recommended)

Basic installation:

pip install quantllm

With GGUF support (recommended for deployment):

pip install "quantllm[gguf]"

With development tools:

pip install "quantllm[dev]"

2. From Source

git clone https://github.com/codewithdark-git/QuantLLM.git
cd QuantLLM
pip install -e .

For development installation:

pip install -e ".[dev,gguf]"

Hardware Requirements

Minimum Requirements:

CPU: 4+ cores
RAM: 16GB+
Storage: 10GB+ free space
Python: 3.10+

Recommended for Large Models:

CPU: 8+ cores
RAM: 32GB+
GPU: NVIDIA GPU with 8GB+ VRAM
CUDA: 11.7 or later
Storage: 20GB+ free space
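
Part of these requirements can be checked from Python. The sketch below uses only the standard library; hardware_report is a hypothetical helper, not a QuantLLM function, and its defaults mirror the minimum requirements listed above. Note that os.cpu_count() reports logical CPUs, which may exceed the number of physical cores, and that RAM is not checked because the standard library has no portable way to query it.

```python
import os
import shutil

GIB = 1024 ** 3


def hardware_report(path=".", min_cores=4, min_free_gib=10):
    """Compare logical CPU count and free disk space against the given minimums."""
    cores = os.cpu_count() or 0  # cpu_count() can return None
    free_gib = shutil.disk_usage(path).free / GIB
    return {
        "cores": cores,
        "cores_ok": cores >= min_cores,
        "free_gib": round(free_gib, 1),
        "disk_ok": free_gib >= min_free_gib,
    }


if __name__ == "__main__":
    for key, value in hardware_report().items():
        print(f"{key}: {value}")
```

To check against the recommended figures for large models instead, call hardware_report(min_cores=8, min_free_gib=20), and pass the directory where models will be stored as path.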

GGUF Support

GGUF (GGML Universal Format) support requires additional dependencies:

llama-cpp-python >= 0.2.0
ctransformers >= 0.2.0 (optional)

These are automatically installed with:

pip install "quantllm[gguf]"
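
To see which of these backends is present without importing them (importing can be slow or fail on a broken build), something like the following may help. It is a sketch under the assumption that the packages use their usual import names, llama_cpp and ctransformers; available_backends is an illustrative helper, not part of QuantLLM.

```python
from importlib.util import find_spec

# PyPI name -> import name. llama-cpp-python is imported as llama_cpp.
GGUF_BACKENDS = {
    "llama-cpp-python": "llama_cpp",
    "ctransformers": "ctransformers",
}


def available_backends(backends=GGUF_BACKENDS):
    """Map each PyPI package name to True if its module can be located."""
    return {pkg: find_spec(module) is not None for pkg, module in backends.items()}


if __name__ == "__main__":
    for pkg, found in available_backends().items():
        print(f"{pkg}: {'installed' if found else 'not found'}")
```

This only confirms that the module can be located; it does not prove that the compiled extension loads correctly on your platform.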

Verify Installation

Verify the installation by running the following in a Python session:

import quantllm
from quantllm.quant import GGUFQuantizer

# Check GGUF support
print(f"GGUF Support: {GGUFQuantizer.CT_AVAILABLE}")

# Check CUDA availability
import torch
print(f"CUDA Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA Version: {torch.version.cuda}")
    print(f"GPU Device: {torch.cuda.get_device_name(0)}")

Common Issues

1. CUDA Compatibility

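If you encounter CUDA errors, reinstall PyTorch from the wheel index that matches your CUDA version (cu118 below is the CUDA 11.8 build):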

# Install PyTorch with specific CUDA version
pip install torch --index-url https://download.pytorch.org/whl/cu118

2. Memory Issues

For large models, enable memory optimization:

from quantllm.quant import GGUFQuantizer

quantizer = GGUFQuantizer(
    model_name="large-model",  # placeholder: replace with your model
    cpu_offload=True,
    chunk_size=500,
    gradient_checkpointing=True,
)

3. GGUF Conversion Issues

If GGUF conversion fails:

Ensure llama-cpp-python is installed and up to date:
```
pip install llama-cpp-python --upgrade
```

Check that the optional ctransformers backend imports cleanly on your system:

python -c "from ctransformers import AutoModelForCausalLM; print('GGUF support available')"

Next Steps

Read the Getting Started guide
Check out the tutorials (tutorials/index)
See the advanced usage guide (advanced_usage/index) for advanced features