Installation Guide
==================

Requirements
------------

QuantLLM requires Python 3.10 or later. The following are the core dependencies:

* PyTorch >= 2.0.0
* Transformers >= 4.30.0
* CUDA Toolkit (optional, but recommended for GPU support)
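
The minimum versions above can be checked from Python before installing. The following is a minimal sketch that uses only the standard library; ``parse_version`` and ``check_requirements`` are hypothetical helper names for illustration, not part of QuantLLM:

.. code-block:: python

    import re
    import sys
    from importlib import metadata

    # Minimum versions taken from the requirements list above.
    MIN_PYTHON = (3, 10)
    MIN_PACKAGES = {"torch": (2, 0, 0), "transformers": (4, 30, 0)}


    def parse_version(text):
        """Turn a string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
        parts = []
        for piece in text.split("+")[0].split(".")[:3]:
            match = re.match(r"\d+", piece)  # keep leading digits only
            parts.append(int(match.group()) if match else 0)
        while len(parts) < 3:  # pad short versions such as '4.30'
            parts.append(0)
        return tuple(parts)


    def check_requirements():
        """Return a list of problems; an empty list means all checks passed."""
        problems = []
        if sys.version_info[:2] < MIN_PYTHON:
            problems.append("Python 3.10 or later is required")
        for name, minimum in MIN_PACKAGES.items():
            try:
                installed = parse_version(metadata.version(name))
            except metadata.PackageNotFoundError:
                problems.append(f"{name} is not installed")
                continue
            if installed < minimum:
                wanted = ".".join(str(n) for n in minimum)
                problems.append(f"{name} is older than {wanted}")
        return problems


    if __name__ == "__main__":
        for line in check_requirements() or ["All core requirements satisfied"]:
            print(line)

The version parser is deliberately simple and ignores pre-release suffixes; a project that already depends on ``packaging`` could use ``packaging.version`` instead.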

Installation Methods
--------------------

1. From PyPI (Recommended)
~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic installation:

.. code-block:: bash

    pip install quantllm

With GGUF support (recommended for deployment):

.. code-block:: bash

    pip install "quantllm[gguf]"

With development tools:

.. code-block:: bash

    pip install "quantllm[dev]"

2. From Source
~~~~~~~~~~~~~~

.. code-block:: bash

    git clone https://github.com/codewithdark-git/QuantLLM.git
    cd QuantLLM
    pip install -e .

For a development installation with GGUF support:

.. code-block:: bash

    pip install -e ".[dev,gguf]"

Hardware Requirements
---------------------

Minimum Requirements:
~~~~~~~~~~~~~~~~~~~~~

* CPU: 4+ cores
* RAM: 16GB+
* Storage: 10GB+ free space
* Python: 3.10+

Recommended for Large Models:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* CPU: 8+ cores
* RAM: 32GB+
* GPU: NVIDIA GPU with 8GB+ VRAM
* CUDA: 11.7 or later
* Storage: 20GB+ free space
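
Part of this checklist can be automated. The sketch below checks CPU core count and free disk space against the minimum figures above using only the standard library. ``check_hardware`` is a hypothetical helper, and RAM is not checked because the standard library has no portable call for it:

.. code-block:: python

    import os
    import shutil


    def check_hardware(path=".", min_cores=4, min_free_gb=10):
        """Compare this machine against the minimum CPU and storage figures."""
        cores = os.cpu_count() or 0  # cpu_count() may return None
        free_gb = shutil.disk_usage(path).free / 1024**3
        return {
            "cpu_cores": cores,
            "cpu_ok": cores >= min_cores,
            "free_disk_gb": round(free_gb, 1),
            "disk_ok": free_gb >= min_free_gb,
        }


    if __name__ == "__main__":
        for key, value in check_hardware().items():
            print(f"{key}: {value}")

Pass ``min_cores=8`` and ``min_free_gb=20`` to test against the recommended figures for large models instead.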

GGUF Support
------------

Support for GGUF, the model file format used by llama.cpp, requires additional dependencies:

* llama-cpp-python >= 0.2.0
* ctransformers >= 0.2.0 (optional)

These are automatically installed with:

.. code-block:: bash

    pip install "quantllm[gguf]"
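
To confirm that the optional backends are importable without loading them, a short check along these lines may help. It is a sketch using the standard library; ``gguf_backends`` is a hypothetical helper, and the import name ``llama_cpp`` is the module that ``llama-cpp-python`` installs:

.. code-block:: python

    from importlib import util

    # Map PyPI package names to the module names they install.
    GGUF_MODULES = {
        "llama-cpp-python": "llama_cpp",
        "ctransformers": "ctransformers",
    }


    def gguf_backends():
        """Report which GGUF backends can be found, without importing them."""
        return {
            package: util.find_spec(module) is not None
            for package, module in GGUF_MODULES.items()
        }


    if __name__ == "__main__":
        for package, found in gguf_backends().items():
            print(f"{package}: {'found' if found else 'missing'}")

A package reported as found can still fail at import time, for example when its compiled extension does not match the platform, so treat this as a first check only.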

Verify Installation
-------------------

You can verify your installation by running:

.. code-block:: python

    import torch

    # Importing the package confirms that it is installed
    import quantllm
    from quantllm.quant import GGUFQuantizer

    # Check GGUF support
    print(f"GGUF Support: {GGUFQuantizer.CT_AVAILABLE}")

    # Check CUDA availability
    print(f"CUDA Available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"CUDA Version: {torch.version.cuda}")
        print(f"GPU Device: {torch.cuda.get_device_name(0)}")

Common Issues
-------------

1. CUDA Compatibility
~~~~~~~~~~~~~~~~~~~~~

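If you encounter CUDA errors, the installed PyTorch build may not match your CUDA version. Reinstall PyTorch from the wheel index for your CUDA version (``cu118`` targets CUDA 11.8):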

.. code-block:: bash

    # Install a PyTorch build for a specific CUDA version (here CUDA 11.8)
    pip install torch --index-url https://download.pytorch.org/whl/cu118
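
The ``cu118`` suffix in that URL encodes the CUDA major and minor version. As an illustration, the sketch below builds the index URL from a version string. ``pytorch_index_url`` is a hypothetical helper, and PyTorch does not publish wheels for every CUDA release, so check the PyTorch installation page before relying on a generated URL:

.. code-block:: python

    def pytorch_index_url(cuda_version):
        """Build the PyTorch wheel index URL for a CUDA version such as '11.8'."""
        major, minor = cuda_version.split(".")[:2]
        # The suffix is 'cu' followed by the major and minor digits.
        return f"https://download.pytorch.org/whl/cu{int(major)}{int(minor)}"


    if __name__ == "__main__":
        url = pytorch_index_url("11.8")
        print(f"pip install torch --index-url {url}")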

2. Memory Issues
~~~~~~~~~~~~~~~~

For large models, enable memory optimization:

.. code-block:: python

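    from quantllm.quant import GGUFQuantizer

    # Replace "large-model" with the name of the model you are quantizing
    quantizer = GGUFQuantizer(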
        model_name="large-model",
        cpu_offload=True,
        chunk_size=500,
        gradient_checkpointing=True
    )
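
Whether these options are needed depends largely on how much memory the model weights occupy. The sketch below gives a rough estimate from parameter count and bit width. It covers weights only and ignores activations and any working buffers, and ``estimate_weight_gib`` is a hypothetical helper, not part of QuantLLM:

.. code-block:: python

    def estimate_weight_gib(num_params, bits_per_weight):
        """Approximate memory for model weights alone, in GiB."""
        total_bytes = num_params * bits_per_weight / 8
        return total_bytes / 1024**3


    if __name__ == "__main__":
        params = 7_000_000_000  # example size, not tied to a specific model
        for bits in (16, 8, 4):
            print(f"{bits}-bit: {estimate_weight_gib(params, bits):.1f} GiB")

If the 16-bit estimate is close to or above your available RAM or VRAM, the memory options are likely worth enabling.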

3. GGUF Conversion Issues
~~~~~~~~~~~~~~~~~~~~~~~~~

If GGUF conversion fails:

1. Ensure ``llama-cpp-python`` is installed and up to date:

   .. code-block:: bash

       pip install llama-cpp-python --upgrade

2. Check that ``ctransformers`` imports cleanly on your system:

   .. code-block:: bash

       python -c "from ctransformers import AutoModelForCausalLM; print('GGUF support available')"

Next Steps
----------

* Read the :doc:`getting_started` guide
* Check out :doc:`tutorials/index`
* See :doc:`advanced_usage/index` for advanced features