1
0
mirror of https://github.com/containers/ramalama.git synced 2026-02-05 06:46:39 +01:00
Files
ramalama/CLAUDE.md
Daniel J Walsh b9f671646e Apply suggestions from code review
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-01-06 10:19:20 -05:00

3.9 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

RamaLama is a CLI tool for managing and serving AI models using containers. It provides a container-centric approach to AI model management, supporting multiple model registries (Hugging Face, Ollama, OCI registries) and automatic GPU detection with appropriate container image selection.

Build and Development Commands

Setup

make install-requirements    # Install dev dependencies via pip

Testing

# Unit tests (pytest via tox)
make unit-tests              # Run unit tests
make unit-tests-verbose      # Run with full trace output
tox                          # Direct tox invocation

# E2E tests (pytest)
make e2e-tests               # Run with Podman (default)
make e2e-tests-docker        # Run with Docker
make e2e-tests-nocontainer   # Run without container engine

# System tests (BATS)
make bats                    # Run BATS system tests
make bats-nocontainer        # Run in nocontainer mode
make bats-docker             # Run with Docker

# All tests
make tests                   # Run unit tests and system-level integration tests

Running a single test

# Unit test
tox -- test/unit/test_cli.py::test_function_name -vvv

# E2E test
tox -e e2e -- test/e2e/test_basic.py::test_function_name -vvv

# Single BATS file
RAMALAMA=$(pwd)/bin/ramalama bats -T test/system/030-run.bats

Code Quality

make validate                # Run all validation (codespell, lint, format check, man-check, type check)
make lint                    # Run flake8 + shellcheck
make check-format            # Check black + isort formatting
make format                  # Auto-format with black + isort
make type-check              # Run mypy type checking
make codespell               # Check spelling

Documentation

make docs                    # Build manpages and docsite

Architecture

Source Structure (ramalama/)

  • cli.py - Main CLI entry point, argparse setup, subcommand dispatch
  • config.py - Configuration constants and loading
  • common.py - Shared utility functions, GPU detection (get_accel())
  • engine.py - Container engine abstraction (Podman/Docker)

Transport System (ramalama/transports/)

Handles pulling/pushing models from different registries:

  • base.py - Base Transport class defining the interface
  • transport_factory.py - New() and TransportFactory for creating transports
  • huggingface.py, ollama.py, oci.py, modelscope.py, rlcr.py - Registry-specific implementations
  • Transports are selected via URL scheme prefixes: huggingface://, ollama://, oci://, etc.

Model Store (ramalama/model_store/)

Manages local model storage:

  • global_store.py - GlobalModelStore for model management
  • store.py - Low-level storage operations
  • reffile.py - Reference file handling for tracking model origins

Command System (ramalama/command/)

  • factory.py - assemble_command() builds runtime commands (llama.cpp, vllm, mlx)
  • context.py - Command execution context
  • schema.py - Inference spec schema handling

Key Patterns

  • GPU Detection: get_accel() in common.py detects GPU type (CUDA, ROCm, Vulkan, etc.) and selects appropriate container image
  • Container Images: GPU-specific images at quay.io/ramalama/{ramalama,cuda,rocm,intel-gpu,...}
  • Inference Engines: llama.cpp (default), vllm, mlx (macOS only) - configured via YAML specs in inference-spec/engines/

Test Structure

  • test/unit/ - pytest unit tests (fast, no external dependencies)
  • test/e2e/ - pytest end-to-end tests (marked with @pytest.mark.e2e)
  • test/system/ - BATS shell tests for full CLI integration testing

Code Style

  • Python 3.10+ required
  • Line length: 120 characters
  • Formatting: black + isort
  • Type hints encouraged (mypy checked)
  • Commits require DCO sign-off (git commit -s)