mirror of https://github.com/containers/bootc.git synced 2026-02-05 06:45:13 +01:00

docs: Add comprehensive ostree-ext container storage documentation

Document how container images are stored as ostree commits, including:

container/mod.rs:
- On-disk storage structure (ref namespace, layer storage, merge commit)
- Import flow from manifest fetch through merge commit creation
- Tar stream format and connection to deployments
- Signature verification options
- Key types and submodules

container/store.rs:
- Reference namespace constants and their purposes
- Three-step import process (create, prepare, execute)
- Layer types (commit, component, derived) and their handling
- Merge commit metadata keys
- Layer caching and deduplication strategy
- Garbage collection behavior
- Example usage

lib.rs:
- Add key modules section highlighting container, tar, sysroot, chunking

This complements the recent installation documentation by explaining how
container images are actually stored on disk in the ostree repository.

Assisted-by: OpenCode (Claude Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Colin Walters
2026-01-23 15:26:23 -05:00
parent b17ca33ba9
commit 5c52b25ef9
3 changed files with 283 additions and 20 deletions


@@ -1,29 +1,164 @@
//! # APIs bridging OSTree and container images
//!
//! This module contains APIs to bidirectionally map between a single OSTree commit and a container image wrapping it.
//! Because container images are just layers of tarballs, this builds on the [`crate::tar`] module.
//! This module provides the core infrastructure for bidirectionally mapping between
//! OCI/Docker container images and OSTree repositories. It enables bootable container
//! images to be fetched from registries, stored efficiently, and deployed as ostree
//! commits.
//!
//! To emphasize this, the current high level model is that this is a one-to-one mapping - an ostree commit
//! can be exported (wrapped) into a container image, which will have exactly one layer. Upon import
//! back into an ostree repository, all container metadata except for its digested checksum will be discarded.
//! ## Overview
//!
//! Container images are fundamentally layers of tarballs. This module leverages the
//! [`crate::tar`] module to import container layers as ostree content, and exports
//! ostree commits back to container images. The key insight is that ostree's
//! content-addressed object storage maps naturally to OCI layer deduplication.
//!
//! When a container image is imported ("pulled"), each layer becomes an ostree commit.
//! These layer commits are then merged into a single "merge commit" that represents
//! the complete filesystem state. This merge commit is what gets deployed as a
//! bootable system.
//!
//! ## On-Disk Storage Structure
//!
//! Container images are stored in the ostree repository (typically `/sysroot/ostree/repo/`)
//! using a structured reference (ref) namespace:
//!
//! ### Reference Namespace
//!
//! - **`ostree/container/blob/<escaped-digest>`**: Each OCI layer is stored as a
//! separate ostree commit. The digest (e.g., `sha256:abc123...`) is escaped using
//! [`crate::refescape`] to be valid as an ostree ref. For example:
//! `ostree/container/blob/sha256_3A_abc123...`
//!
//! - **`ostree/container/image/<escaped-image-reference>`**: Points to the "merge
//! commit" for a pulled image. The image reference (e.g., `docker://quay.io/org/image:tag`)
//! is escaped similarly. This is the ref that deployments point to.
//!
//! - **`ostree/container/baseimage/<project>/<index>`**: Used to protect base images
//! from garbage collection. Tooling that builds derived images locally should write
//! refs under this prefix to prevent the base layers from being pruned.
//!
//! ### Layer Storage
//!
//! Each container layer is stored as an ostree commit with a special structure:
//!
//! - **OSTree "chunk" layers**: Layers that are part of the base ostree commit use
//! the "object set" format - the filenames in the commit *are* the object checksums.
//! This enables efficient reconstruction of the original ostree commit.
//!
//! - **Derived layers**: Non-ostree layers (e.g., from `RUN` commands in a Containerfile)
//! are imported as regular filesystem trees and stored as standard ostree commits.
//!
//! ### The Merge Commit
//!
//! The merge commit (`ostree/container/image/...`) combines all layers into a single
//! filesystem tree. It contains critical metadata in its commit metadata:
//!
//! - `ostree.manifest-digest`: The OCI manifest digest (e.g., `sha256:...`)
//! - `ostree.manifest`: The complete OCI manifest as JSON
//! - `ostree.container.image-config`: The OCI image configuration as JSON
//!
//! This metadata enables round-tripping: an imported image can be re-exported with
//! its original manifest structure preserved.
//!
//! ## Import Flow
//!
//! The import process (implemented in [`store::ImageImporter`]) follows these steps:
//!
//! 1. **Manifest fetch**: Contact the registry via containers-image-proxy (skopeo)
//! to retrieve the image manifest and configuration.
//!
//! 2. **Layout parsing**: Analyze the manifest to identify:
//!    - The base ostree layer (identified by the `ostree.final-diffid` label)
//!    - Component/chunk layers (split object sets)
//!    - Derived layers (non-ostree content)
//!
//! 3. **Layer caching check**: For each layer, check if an ostree ref already exists
//! for that digest. Cached layers are skipped, enabling efficient incremental updates.
//!
//! 4. **Layer import**: For uncached layers:
//!    - Fetch the compressed tarball from the registry
//!    - Decompress and parse the tar stream
//!    - Import content into ostree (handling xattrs via `bare-split-xattrs` format)
//!    - Create an ostree commit and write the layer ref
//!
//! 5. **Merge commit creation**: Overlay all layers (processing OCI whiteout files)
//! to create a unified filesystem tree. Apply SELinux labeling if needed.
//! Store manifest/config metadata and write the image ref.
//!
//! 6. **Garbage collection**: Prune layer refs that are no longer referenced by any
//! image or deployment.
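
The whiteout handling mentioned in step 5 can be illustrated with a small, self-contained sketch. It is not the ostree-ext implementation: the `Tree` type, `whiteout_target`, `apply_layer`, and `merge_layers` are hypothetical names, files are modelled as an in-memory map rather than ostree objects, and opaque whiteouts (`.wh..wh..opq`) are deliberately left out. It only assumes the OCI convention that an entry named `.wh.<name>` hides `<name>` from lower layers.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a filesystem tree: path -> file content.
type Tree = BTreeMap<String, String>;

/// If `path` names an OCI whiteout entry (`<dir>/.wh.<name>`), return the path it hides.
fn whiteout_target(path: &str) -> Option<String> {
    let (dir, name) = match path.rsplit_once('/') {
        Some((dir, name)) => (dir, name),
        None => ("", path),
    };
    let hidden = name.strip_prefix(".wh.")?;
    Some(if dir.is_empty() {
        hidden.to_string()
    } else {
        format!("{dir}/{hidden}")
    })
}

/// Overlay one layer onto `tree`. Whiteouts are applied first so that they
/// only hide content from lower layers, never files added by this same layer.
fn apply_layer(tree: &mut Tree, layer: &[(&str, &str)]) {
    for (path, _) in layer {
        if let Some(victim) = whiteout_target(path) {
            // Remove the hidden path itself and anything beneath it.
            let subtree = format!("{victim}/");
            tree.retain(|p, _| p != &victim && !p.starts_with(&subtree));
        }
    }
    for (path, content) in layer {
        if whiteout_target(path).is_none() {
            tree.insert(path.to_string(), content.to_string());
        }
    }
}

/// Fold all layers, lowest first, into a single merged tree.
fn merge_layers(layers: &[Vec<(&str, &str)>]) -> Tree {
    let mut tree = Tree::new();
    for layer in layers {
        apply_layer(&mut tree, layer);
    }
    tree
}
```

With a base layer containing `usr/bin/foo` and a derived layer containing `usr/bin/.wh.foo` plus `usr/bin/bar`, `merge_layers` yields a tree with only `usr/bin/bar`.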
//!
//! ## Tar Stream Format
//!
//! The tar format used for ostree layers is documented in [`crate::tar`]. Key points:
//!
//! - Uses `bare-split-xattrs` repository mode to handle extended attributes
//! - XAttrs are stored in separate `.file-xattrs` objects, avoiding tar xattr complexity
//! - `/etc` in container images maps to `/usr/etc` in ostree (the "3-way merge" location)
//! - Hardlinks are used for deduplication within layers
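
As a rough illustration of the `/etc` to `/usr/etc` mapping listed above, the following sketch rewrites a container path into the location it would occupy in the ostree tree. The function name `map_container_path` is hypothetical and the real import code in `crate::tar` handles many more cases; this only shows the path rule itself.

```rust
/// Hypothetical helper: map a path from a container layer to its ostree location.
/// Only the `/etc` -> `/usr/etc` rule is modelled; every other path passes through.
fn map_container_path(path: &str) -> String {
    // Work on a relative path so `/etc/passwd` and `etc/passwd` behave the same.
    let rel = path.trim_start_matches('/');
    if rel == "etc" {
        "usr/etc".to_string()
    } else if let Some(rest) = rel.strip_prefix("etc/") {
        format!("usr/etc/{rest}")
    } else {
        rel.to_string()
    }
}
```

Matching on `etc/` rather than on the bare prefix `etc` keeps unrelated names such as `etcetera/` from being rewritten.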
//!
//! ## Connection to Deployments
//!
//! When bootc deploys an image, it creates an ostree deployment whose "origin" file
//! references the container image. The origin contains:
//!
//! - The [`OstreeImageReference`] specifying the image and signature verification method
//! - The merge commit checksum
//!
//! On subsequent boots, bootc can compare the deployed commit against the registry
//! manifest to detect available updates.
//!
//! ## Signatures
//!
//! OSTree supports GPG and ed25519 signatures natively, and it's expected by default that
//! when booting from a fetched container image, one verifies ostree-level signatures.
//! For ostree, a signing configuration is specified via an ostree remote. In order to
//! pair this configuration together, this library defines a "URL-like" string schema:
//! OSTree supports GPG and ed25519 signatures natively. When fetching container images,
//! signature verification can be configured via [`SignatureSource`]:
//!
//! `ostree-remote-registry:<remotename>:<containerimage>`
//! - `OstreeRemote(name)`: Verify using the named ostree remote's keyring
//! - `ContainerPolicy`: Defer to containers-policy.json (requires explicit allow)
//! - `ContainerPolicyAllowInsecure`: Use containers-policy.json defaults (not recommended)
//!
//! A concrete instantiation might be e.g.: `ostree-remote-registry:fedora:quay.io/coreos/fedora-coreos:stable`
//! This library defines a URL-like schema to combine signature verification with
//! image references:
//!
//! To parse and generate these strings, see [`OstreeImageReference`].
//! - `ostree-remote-registry:<remotename>:<containerimage>` - Verify via ostree remote
//! - `ostree-image-signed:<transport>:<image>` - Use container policy
//! - `ostree-unverified-registry:<image>` - No verification (not recommended)
//!
//! ## Layering
//! Example: `ostree-remote-registry:fedora:quay.io/fedora/fedora-bootc:latest`
//!
//! A key feature of container images is support for layering. At the moment, support
//! for this is [planned but not implemented](https://github.com/ostreedev/ostree-rs-ext/issues/12).
//! See [`OstreeImageReference`] for parsing and generating these strings.
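
To make the three string forms above concrete, here is a minimal parsing sketch. It is not the crate's `OstreeImageReference` parser: `Verification`, `ParsedRef`, and `parse_image_reference` are hypothetical names, the transport is kept as a plain string, and the variant names follow the descriptions in the list above rather than the real `SignatureSource` enum.

```rust
/// Hypothetical mirror of the three verification modes described above.
#[derive(Debug, PartialEq)]
enum Verification {
    OstreeRemote(String),
    ContainerPolicy,
    Unverified,
}

/// Hypothetical parsed form of a URL-like image reference string.
#[derive(Debug, PartialEq)]
struct ParsedRef {
    verification: Verification,
    transport: String,
    image: String,
}

fn parse_image_reference(s: &str) -> Option<ParsedRef> {
    // The first `:` separates the scheme from the rest; image names may
    // contain further colons (tags, ports), so only split once at each step.
    let (scheme, rest) = s.split_once(':')?;
    let (verification, transport, image) = match scheme {
        "ostree-remote-registry" => {
            let (remote, image) = rest.split_once(':')?;
            (Verification::OstreeRemote(remote.to_string()), "registry", image)
        }
        "ostree-image-signed" => {
            let (transport, image) = rest.split_once(':')?;
            (Verification::ContainerPolicy, transport, image)
        }
        "ostree-unverified-registry" => (Verification::Unverified, "registry", rest),
        _ => return None,
    };
    if image.is_empty() {
        return None;
    }
    Some(ParsedRef {
        verification,
        transport: transport.to_string(),
        image: image.to_string(),
    })
}
```

Splitting only at the first colon of each segment is what lets `quay.io/fedora/fedora-bootc:latest` keep its tag intact.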
//!
//! ## Layering and Derived Images
//!
//! Container image layering is fully supported. A typical bootable image structure:
//!
//! 1. **Base ostree layer**: Contains the core OS as an ostree commit
//! 2. **Chunk layers**: Split objects for efficient updates (optional)
//! 3. **Derived layers**: Additional content from Containerfile `RUN` commands
//!
//! The `ostree.final-diffid` label in the image configuration marks where the
//! ostree content ends and derived content begins. This enables:
//!
//! - Efficient layer sharing between images with the same base
//! - Proper SELinux labeling of derived content using the base policy
//! - Round-trip export preserving the layer structure
//!
//! ## Key Types
//!
//! - [`Transport`]: OCI/Docker transport (registry, oci-dir, containers-storage, etc.)
//! - [`ImageReference`]: Container image reference with transport
//! - [`OstreeImageReference`]: Image reference plus signature verification method
//! - [`SignatureSource`]: How to verify image signatures
//! - [`store::ImageImporter`]: Main import orchestrator
//! - [`store::PreparedImport`]: Analysis of layers to fetch
//! - [`store::LayeredImageState`]: State of a pulled image
//! - [`ManifestDiff`]: Comparison between two image manifests
//!
//! ## Submodules
//!
//! - [`store`]: Core storage and import logic
//! - [`deploy`]: Integration with ostree deployments
//! - [`skopeo`]: Skopeo subprocess management for registry operations
use anyhow::anyhow;
use cap_std_ext::cap_std;


@@ -1,9 +1,129 @@
//! APIs for storing (layered) container images as OSTree commits
//! # Storing layered container images as OSTree commits
//!
//! # Extension of encapsulation support
//! This module implements the core storage and import logic for container images in
//! ostree. It handles fetching images from registries, caching layers as ostree commits,
//! and creating merge commits that represent the complete image filesystem.
//!
//! This code supports ingesting arbitrary layered container images from an ostree-exported
//! base. See [`encapsulate`][`super::encapsulate()`] for more information on encapsulation of images.
//! ## Overview
//!
//! The primary entry point is [`ImageImporter`], which orchestrates the import of a
//! container image. The import process efficiently handles incremental updates by
//! caching each layer as a separate ostree commit, only fetching layers that aren't
//! already present.
//!
//! ## Reference Namespace Constants
//!
//! Layers and images are stored using these ref prefixes (defined as constants in this module):
//!
//! - `ostree/container/blob`: Individual OCI layers stored as commits
//! - `ostree/container/image`: Merge commits for complete images
//! - [`BASE_IMAGE_PREFIX`] (`ostree/container/baseimage`): Protected base images (public)
//!
//! Layer refs use escaped digests (e.g., `sha256:abc...` becomes `sha256_3A_abc...`)
//! via [`crate::refescape`] to conform to ostree ref naming requirements.
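
The escaping shown above (`sha256:abc...` becoming `sha256_3A_abc...`) can be approximated with the sketch below. It is a simplified stand-in, not the `crate::refescape` implementation: `escape_ref_component` and `layer_ref` are hypothetical names, and the exact set of characters passed through unescaped is an assumption. The one behavior taken from the text is that `:` becomes `_3A_`, its hexadecimal byte value wrapped in underscores.

```rust
/// Hypothetical, simplified escape: keep a conservative set of characters,
/// double any literal `_`, and hex-encode everything else as `_XX_`.
fn escape_ref_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' | b'.' | b'-' => out.push(byte as char),
            // Assumption: a literal underscore is doubled so decoding stays unambiguous.
            b'_' => out.push_str("__"),
            other => out.push_str(&format!("_{other:02X}_")),
        }
    }
    out
}

/// Build the ref name for a layer digest under the blob prefix named above.
fn layer_ref(digest: &str) -> String {
    format!("ostree/container/blob/{}", escape_ref_component(digest))
}
```

For example, `layer_ref("sha256:abc123")` produces `ostree/container/blob/sha256_3A_abc123`.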
//!
//! ## Import Process
//!
//! A typical import flow:
//!
//! 1. **Create importer**: [`ImageImporter::new`] initializes the proxy connection
//! to the container registry (via containers-image-proxy/skopeo).
//!
//! 2. **Prepare import**: [`ImageImporter::prepare`] fetches the manifest and
//! analyzes which layers need to be downloaded:
//!    - Returns [`PrepareResult::AlreadyPresent`] if the image is unchanged
//!    - Returns [`PrepareResult::Ready`] with a [`PreparedImport`] containing
//!      the download plan
//!
//! 3. **Execute import**: [`ImageImporter::import`] downloads missing layers and
//! creates the merge commit:
//!    - Each layer is fetched, decompressed, and imported as an ostree commit
//!    - The merge commit overlays all layers, processing whiteouts
//!    - Image metadata (manifest, config) is stored in commit metadata
//!
//! ## Layer Types
//!
//! The manifest layout is parsed to identify different layer types:
//!
//! - **Commit layer**: The base ostree commit layer (identified by `ostree.final-diffid`)
//! - **Component layers**: Additional ostree "chunk" layers containing split objects
//! - **Derived layers**: Non-ostree layers from Containerfile `RUN` commands
//!
//! Each layer type is handled differently during import:
//!
//! - Ostree layers use object-set import mode for efficient reconstruction
//! - Derived layers are imported as regular filesystem trees with SELinux labeling
//!
//! ## Merge Commit Metadata
//!
//! The merge commit stores essential metadata for image management:
//!
//! - `ostree.manifest-digest`: The canonical manifest digest (e.g., `sha256:...`)
//! - `ostree.manifest`: Complete OCI manifest as canonical JSON
//! - `ostree.container.image-config`: OCI image configuration as canonical JSON
//!
//! This metadata enables:
//! - Detecting when updates are available
//! - Re-exporting images with preserved structure
//! - Querying image state via [`query_image`] and [`query_image_commit`]
//!
//! ## Layer Caching and Deduplication
//!
//! Layers are cached by their content digest, enabling:
//!
//! - **Incremental updates**: Only changed layers are downloaded
//! - **Cross-image sharing**: Images sharing layers reuse cached commits
//! - **Efficient storage**: Ostree's content-addressed storage deduplicates files
//!
//! The `query_layer` function checks if a layer is already cached by looking up
//! its ref. During import, cached layers are skipped entirely.
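
The cache check described above amounts to a set lookup per layer. The sketch below models it with plain collections. `layers_to_fetch` here is a hypothetical free function, not the `PreparedImport` method of the same name, and the cached set stands in for whatever refs already exist in the repository.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: given the layer digests listed in a manifest and the
/// set of digests that already have a layer ref, return the ones still to download.
fn layers_to_fetch<'a>(manifest_layers: &[&'a str], cached: &BTreeSet<String>) -> Vec<&'a str> {
    manifest_layers
        .iter()
        .copied()
        // A layer is skipped entirely when its digest is already cached.
        .filter(|digest| !cached.contains(*digest))
        .collect()
}
```

Because the lookup key is the content digest, two images that share a base layer hit the same cache entry, which is what makes incremental updates and cross-image sharing cheap.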
//!
//! ## Garbage Collection
//!
//! Unreferenced layers are automatically pruned after imports via [`gc_image_layers`]:
//!
//! 1. Collect all layer digests referenced by stored images and deployments
//! 2. List all layer refs under `ostree/container/blob/`
//! 3. Remove refs for layers not in the referenced set
//!
//! Note: This only removes refs; actual object pruning requires a separate
//! call to `ostree::Repo::prune`.
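
The three steps above form a simple mark-and-sweep over ref names. The following sketch shows that shape using in-memory collections. `unreferenced_layer_refs` and `LAYER_PREFIX` are hypothetical names, and the real `gc_image_layers` works against an `ostree::Repo` rather than a slice of strings.

```rust
use std::collections::BTreeSet;

/// Prefix under which layer refs live, as named in the text above.
const LAYER_PREFIX: &str = "ostree/container/blob/";

/// Hypothetical helper: return the layer refs that no image or deployment references.
/// `all_refs` is every ref in the repository; `referenced` is the "mark" set
/// collected from stored images and deployments.
fn unreferenced_layer_refs(all_refs: &[String], referenced: &BTreeSet<String>) -> Vec<String> {
    all_refs
        .iter()
        // Only layer refs are candidates; image and baseimage refs are left alone.
        .filter(|r| r.starts_with(LAYER_PREFIX))
        .filter(|r| !referenced.contains(r.as_str()))
        .cloned()
        .collect()
}
```

As the note above says, deleting these refs does not by itself free disk space; the objects are only reclaimed by a later prune.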
//!
//! ## Key Types
//!
//! - [`ImageImporter`]: Main import orchestrator with progress tracking
//! - [`PrepareResult`]: Result of preparing an import (already present vs. ready)
//! - [`PreparedImport`]: Detailed import plan with layer analysis
//! - [`ManifestLayerState`]: Per-layer state (descriptor, ref, cached commit)
//! - [`LayeredImageState`]: Complete state of a pulled image
//! - [`CachedImageUpdate`]: Cached metadata for pending updates
//! - [`ImportProgress`]: Progress events for layer fetches
//! - [`LayerProgress`]: Byte-level progress for a single layer
//!
//! ## Example Usage
//!
//! ```ignore
//! use ostree_ext::container::store::{ImageImporter, PrepareResult};
//! use ostree_ext::container::OstreeImageReference;
//!
//! // `repo` is assumed to be an already-opened `ostree::Repo`.
//! let imgref: OstreeImageReference =
//!     "ostree-unverified-registry:quay.io/fedora/fedora-bootc:latest".parse()?;
//! let mut importer = ImageImporter::new(&repo, &imgref, Default::default()).await?;
//!
//! match importer.prepare().await? {
//!     PrepareResult::AlreadyPresent(state) => {
//!         println!("Image already at {}", state.manifest_digest);
//!     }
//!     PrepareResult::Ready(prep) => {
//!         println!("Fetching {} layers", prep.layers_to_fetch().count());
//!         let state = importer.import(prep).await?;
//!         println!("Imported {}", state.merge_commit);
//!     }
//! }
//! ```
//!
//! ## See Also
//!
//! - [`super::encapsulate`]: Export ostree commits to container images
//! - [`crate::tar`]: Tar stream format for layer content
use super::*;
use crate::chunking::{self, Chunk};


@@ -2,7 +2,15 @@
//!
//! This crate builds on top of the core ostree C library
//! and the Rust bindings to it, adding new functionality
//! written in Rust.
//! written in Rust.
//!
//! ## Key Modules
//!
//! - [`container`]: Bidirectional mapping between OCI container images and ostree commits.
//! This is the core of bootc's ability to deploy container images as bootable systems.
//! - [`tar`]: Lossless export and import of ostree commits as tar archives.
//! - [`sysroot`]: Extensions for managing ostree deployments.
//! - [`chunking`]: Splitting ostree commits into layers for efficient container updates.
// See https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html
#![deny(missing_docs)]