diff --git a/crates/ostree-ext/src/container/mod.rs b/crates/ostree-ext/src/container/mod.rs
index 77d8e171..3550b6bf 100644
--- a/crates/ostree-ext/src/container/mod.rs
+++ b/crates/ostree-ext/src/container/mod.rs
@@ -1,29 +1,164 @@
 //! # APIs bridging OSTree and container images
 //!
-//! This module contains APIs to bidirectionally map between a single OSTree commit and a container image wrapping it.
-//! Because container images are just layers of tarballs, this builds on the [`crate::tar`] module.
+//! This module provides the core infrastructure for bidirectionally mapping between
+//! OCI/Docker container images and OSTree repositories. It enables bootable container
+//! images to be fetched from registries, stored efficiently, and deployed as ostree
+//! commits.
 //!
-//! To emphasize this, the current high level model is that this is a one-to-one mapping - an ostree commit
-//! can be exported (wrapped) into a container image, which will have exactly one layer. Upon import
-//! back into an ostree repository, all container metadata except for its digested checksum will be discarded.
+//! ## Overview
+//!
+//! Container images are fundamentally layers of tarballs. This module leverages the
+//! [`crate::tar`] module to import container layers as ostree content, and exports
+//! ostree commits back to container images. The key insight is that ostree's
+//! content-addressed object storage maps naturally to OCI layer deduplication.
+//!
+//! When a container image is imported ("pulled"), each layer becomes an ostree commit.
+//! These layer commits are then merged into a single "merge commit" that represents
+//! the complete filesystem state. This merge commit is what gets deployed as a
+//! bootable system.
+//!
+//! ## On-Disk Storage Structure
+//!
+//! Container images are stored in the ostree repository (typically `/sysroot/ostree/repo/`)
+//! using a structured reference (ref) namespace:
+//!
+//! ### Reference Namespace
+//!
+//! - **`ostree/container/blob/`**: Each OCI layer is stored as a
+//!   separate ostree commit. The digest (e.g., `sha256:abc123...`) is escaped using
+//!   [`crate::refescape`] to be valid as an ostree ref. For example:
+//!   `ostree/container/blob/sha256_3A_abc123...`
+//!
+//! - **`ostree/container/image/`**: Points to the "merge
+//!   commit" for a pulled image. The image reference (e.g., `docker://quay.io/org/image:tag`)
+//!   is escaped similarly. This is the ref that deployments point to.
+//!
+//! - **`ostree/container/baseimage//`**: Used to protect base images
+//!   from garbage collection. Tooling that builds derived images locally should write
+//!   refs under this prefix to prevent the base layers from being pruned.
+//!
+//! ### Layer Storage
+//!
+//! Each container layer is stored as an ostree commit with a special structure:
+//!
+//! - **OSTree "chunk" layers**: Layers that are part of the base ostree commit use
+//!   the "object set" format - the filenames in the commit *are* the object checksums.
+//!   This enables efficient reconstruction of the original ostree commit.
+//!
+//! - **Derived layers**: Non-ostree layers (e.g., from `RUN` commands in a Containerfile)
+//!   are imported as regular filesystem trees and stored as standard ostree commits.
+//!
+//! ### The Merge Commit
+//!
+//! The merge commit (`ostree/container/image/...`) combines all layers into a single
+//! filesystem tree. It contains critical metadata in its commit metadata:
+//!
+//! - `ostree.manifest-digest`: The OCI manifest digest (e.g., `sha256:...`)
+//! - `ostree.manifest`: The complete OCI manifest as JSON
+//! - `ostree.container.image-config`: The OCI image configuration as JSON
+//!
+//! This metadata enables round-tripping: an imported image can be re-exported with
+//! its original manifest structure preserved.
+//!
+//! ## Import Flow
+//!
+//! The import process (implemented in [`store::ImageImporter`]) follows these steps:
+//!
+//! 1. **Manifest fetch**: Contact the registry via containers-image-proxy (skopeo)
+//!    to retrieve the image manifest and configuration.
+//!
+//! 2. **Layout parsing**: Analyze the manifest to identify:
+//!    - The base ostree layer (identified by the `ostree.final-diffid` label)
+//!    - Component/chunk layers (split object sets)
+//!    - Derived layers (non-ostree content)
+//!
+//! 3. **Layer caching check**: For each layer, check if an ostree ref already exists
+//!    for that digest. Cached layers are skipped, enabling efficient incremental updates.
+//!
+//! 4. **Layer import**: For uncached layers:
+//!    - Fetch the compressed tarball from the registry
+//!    - Decompress and parse the tar stream
+//!    - Import content into ostree (handling xattrs via `bare-split-xattrs` format)
+//!    - Create an ostree commit and write the layer ref
+//!
+//! 5. **Merge commit creation**: Overlay all layers (processing OCI whiteout files)
+//!    to create a unified filesystem tree. Apply SELinux labeling if needed.
+//!    Store manifest/config metadata and write the image ref.
+//!
+//! 6. **Garbage collection**: Prune layer refs that are no longer referenced by any
+//!    image or deployment.
+//!
+//! ## Tar Stream Format
+//!
+//! The tar format used for ostree layers is documented in [`crate::tar`]. Key points:
+//!
+//! - Uses `bare-split-xattrs` repository mode to handle extended attributes
+//! - XAttrs are stored in separate `.file-xattrs` objects, avoiding tar xattr complexity
+//! - `/etc` in container images maps to `/usr/etc` in ostree (the "3-way merge" location)
+//! - Hardlinks are used for deduplication within layers
+//!
+//! ## Connection to Deployments
+//!
+//! When bootc deploys an image, it creates an ostree deployment whose "origin" file
+//! references the container image. The origin contains:
+//!
+//! - The [`OstreeImageReference`] specifying the image and signature verification method
+//! - The merge commit checksum
+//!
+//! On subsequent boots, bootc can compare the deployed commit against the registry
+//! manifest to detect available updates.
 //!
 //! ## Signatures
 //!
-//! OSTree supports GPG and ed25519 signatures natively, and it's expected by default that
-//! when booting from a fetched container image, one verifies ostree-level signatures.
-//! For ostree, a signing configuration is specified via an ostree remote. In order to
-//! pair this configuration together, this library defines a "URL-like" string schema:
+//! OSTree supports GPG and ed25519 signatures natively. When fetching container images,
+//! signature verification can be configured via [`SignatureSource`]:
 //!
-//! `ostree-remote-registry::`
+//! - `OstreeRemote(name)`: Verify using the named ostree remote's keyring
+//! - `ContainerPolicy`: Defer to containers-policy.json (requires explicit allow)
+//! - `ContainerPolicyAllowInsecure`: Use containers-policy.json defaults (not recommended)
 //!
-//! A concrete instantiation might be e.g.: `ostree-remote-registry:fedora:quay.io/coreos/fedora-coreos:stable`
+//! This library defines a URL-like schema to combine signature verification with
+//! image references:
 //!
-//! To parse and generate these strings, see [`OstreeImageReference`].
+//! - `ostree-remote-registry::` - Verify via ostree remote
+//! - `ostree-image-signed::` - Use container policy
+//! - `ostree-unverified-registry:` - No verification (not recommended)
 //!
-//! ## Layering
+//! Example: `ostree-remote-registry:fedora:quay.io/fedora/fedora-bootc:latest`
 //!
-//! A key feature of container images is support for layering. At the moment, support
-//! for this is [planned but not implemented](https://github.com/ostreedev/ostree-rs-ext/issues/12).
+//! See [`OstreeImageReference`] for parsing and generating these strings.
+//!
+//! ## Layering and Derived Images
+//!
+//! Container image layering is fully supported. A typical bootable image structure:
+//!
+//! 1. **Base ostree layer**: Contains the core OS as an ostree commit
+//! 2. **Chunk layers**: Split objects for efficient updates (optional)
+//! 3. **Derived layers**: Additional content from Containerfile `RUN` commands
+//!
+//! The `ostree.final-diffid` label in the image configuration marks where the
+//! ostree content ends and derived content begins. This enables:
+//!
+//! - Efficient layer sharing between images with the same base
+//! - Proper SELinux labeling of derived content using the base policy
+//! - Round-trip export preserving the layer structure
+//!
+//! ## Key Types
+//!
+//! - [`Transport`]: OCI/Docker transport (registry, oci-dir, containers-storage, etc.)
+//! - [`ImageReference`]: Container image reference with transport
+//! - [`OstreeImageReference`]: Image reference plus signature verification method
+//! - [`SignatureSource`]: How to verify image signatures
+//! - [`store::ImageImporter`]: Main import orchestrator
+//! - [`store::PreparedImport`]: Analysis of layers to fetch
+//! - [`store::LayeredImageState`]: State of a pulled image
+//! - [`ManifestDiff`]: Comparison between two image manifests
+//!
+//! ## Submodules
+//!
+//! - [`store`]: Core storage and import logic
+//! - [`deploy`]: Integration with ostree deployments
+//! - [`skopeo`]: Skopeo subprocess management for registry operations
 
 use anyhow::anyhow;
 use cap_std_ext::cap_std;
diff --git a/crates/ostree-ext/src/container/store.rs b/crates/ostree-ext/src/container/store.rs
index 0ef4d980..284b1168 100644
--- a/crates/ostree-ext/src/container/store.rs
+++ b/crates/ostree-ext/src/container/store.rs
@@ -1,9 +1,129 @@
-//! APIs for storing (layered) container images as OSTree commits
+//! # Storing layered container images as OSTree commits
 //!
-//! # Extension of encapsulation support
+//! This module implements the core storage and import logic for container images in
+//! ostree. It handles fetching images from registries, caching layers as ostree commits,
+//! and creating merge commits that represent the complete image filesystem.
 //!
-//! This code supports ingesting arbitrary layered container images from an ostree-exported
-//! base. See [`encapsulate`][`super::encapsulate()`] for more information on encapsulation of images.
+//! ## Overview
+//!
+//! The primary entry point is [`ImageImporter`], which orchestrates the import of a
+//! container image. The import process efficiently handles incremental updates by
+//! caching each layer as a separate ostree commit, only fetching layers that aren't
+//! already present.
+//!
+//! ## Reference Namespace Constants
+//!
+//! Layers and images are stored using these ref prefixes (defined as constants in this module):
+//!
+//! - `ostree/container/blob`: Individual OCI layers stored as commits
+//! - `ostree/container/image`: Merge commits for complete images
+//! - [`BASE_IMAGE_PREFIX`] (`ostree/container/baseimage`): Protected base images (public)
+//!
+//! Layer refs use escaped digests (e.g., `sha256:abc...` becomes `sha256_3A_abc...`)
+//! via [`crate::refescape`] to conform to ostree ref naming requirements.
+//!
+//! ## Import Process
+//!
+//! A typical import flow:
+//!
+//! 1. **Create importer**: [`ImageImporter::new`] initializes the proxy connection
+//!    to the container registry (via containers-image-proxy/skopeo).
+//!
+//! 2. **Prepare import**: [`ImageImporter::prepare`] fetches the manifest and
+//!    analyzes which layers need to be downloaded:
+//!    - Returns [`PrepareResult::AlreadyPresent`] if the image is unchanged
+//!    - Returns [`PrepareResult::Ready`] with a [`PreparedImport`] containing
+//!      the download plan
+//!
+//! 3. **Execute import**: [`ImageImporter::import`] downloads missing layers and
+//!    creates the merge commit:
+//!    - Each layer is fetched, decompressed, and imported as an ostree commit
+//!    - The merge commit overlays all layers, processing whiteouts
+//!    - Image metadata (manifest, config) is stored in commit metadata
+//!
+//! ## Layer Types
+//!
+//! The manifest layout is parsed to identify different layer types:
+//!
+//! - **Commit layer**: The base ostree commit layer (identified by `ostree.final-diffid`)
+//! - **Component layers**: Additional ostree "chunk" layers containing split objects
+//! - **Derived layers**: Non-ostree layers from Containerfile `RUN` commands
+//!
+//! Each layer type is handled differently during import:
+//!
+//! - Ostree layers use object-set import mode for efficient reconstruction
+//! - Derived layers are imported as regular filesystem trees with SELinux labeling
+//!
+//! ## Merge Commit Metadata
+//!
+//! The merge commit stores essential metadata for image management:
+//!
+//! - `ostree.manifest-digest`: The canonical manifest digest (e.g., `sha256:...`)
+//! - `ostree.manifest`: Complete OCI manifest as canonical JSON
+//! - `ostree.container.image-config`: OCI image configuration as canonical JSON
+//!
+//! This metadata enables:
+//! - Detecting when updates are available
+//! - Re-exporting images with preserved structure
+//! - Querying image state via [`query_image`] and [`query_image_commit`]
+//!
+//! ## Layer Caching and Deduplication
+//!
+//! Layers are cached by their content digest, enabling:
+//!
+//! - **Incremental updates**: Only changed layers are downloaded
+//! - **Cross-image sharing**: Images sharing layers reuse cached commits
+//! - **Efficient storage**: Ostree's content-addressed storage deduplicates files
+//!
+//! The `query_layer` function checks if a layer is already cached by looking up
+//! its ref. During import, cached layers are skipped entirely.
+//!
+//! ## Garbage Collection
+//!
+//! Unreferenced layers are automatically pruned after imports via [`gc_image_layers`]:
+//!
+//! 1. Collect all layer digests referenced by stored images and deployments
+//! 2. List all layer refs under `ostree/container/blob/`
+//! 3. Remove refs for layers not in the referenced set
+//!
+//! Note: This only removes refs; actual object pruning requires a separate
+//! call to `ostree::Repo::prune`.
+//!
+//! ## Key Types
+//!
+//! - [`ImageImporter`]: Main import orchestrator with progress tracking
+//! - [`PrepareResult`]: Result of preparing an import (already present vs. ready)
+//! - [`PreparedImport`]: Detailed import plan with layer analysis
+//! - [`ManifestLayerState`]: Per-layer state (descriptor, ref, cached commit)
+//! - [`LayeredImageState`]: Complete state of a pulled image
+//! - [`CachedImageUpdate`]: Cached metadata for pending updates
+//! - [`ImportProgress`]: Progress events for layer fetches
+//! - [`LayerProgress`]: Byte-level progress for a single layer
+//!
+//! ## Example Usage
+//!
+//! ```ignore
+//! use ostree_ext::container::{OstreeImageReference, store::{ImageImporter, PrepareResult}};
+//!
+//! let imgref: OstreeImageReference = "ostree-unverified-registry:quay.io/fedora/fedora-bootc:latest".parse()?;
+//! let mut importer = ImageImporter::new(&repo, &imgref, Default::default()).await?;
+//!
+//! match importer.prepare().await? {
+//!     PrepareResult::AlreadyPresent(state) => {
+//!         println!("Image already at {}", state.manifest_digest);
+//!     }
+//!     PrepareResult::Ready(prep) => {
+//!         println!("Fetching {} layers", prep.layers_to_fetch().count());
+//!         let state = importer.import(prep).await?;
+//!         println!("Imported {}", state.merge_commit);
+//!     }
+//! }
+//! ```
+//!
+//! ## See Also
+//!
+//! - [`super::encapsulate`]: Export ostree commits to container images
+//! - [`crate::tar`]: Tar stream format for layer content
 
 use super::*;
 use crate::chunking::{self, Chunk};
diff --git a/crates/ostree-ext/src/lib.rs b/crates/ostree-ext/src/lib.rs
index 5de91010..d2f1191b 100644
--- a/crates/ostree-ext/src/lib.rs
+++ b/crates/ostree-ext/src/lib.rs
@@ -2,7 +2,15 @@
 //!
 //! This crate builds on top of the core ostree C library
 //! and the Rust bindings to it, adding new functionality
-//! written in Rust.
+//! written in Rust.
+//!
+//! ## Key Modules
+//!
+//! - [`container`]: Bidirectional mapping between OCI container images and ostree commits.
+//!   This is the core of bootc's ability to deploy container images as bootable systems.
+//! - [`tar`]: Lossless export and import of ostree commits as tar archives.
+//! - [`sysroot`]: Extensions for managing ostree deployments.
+//! - [`chunking`]: Splitting ostree commits into layers for efficient container updates.
 
 // See https://doc.rust-lang.org/rustc/lints/listing/allowed-by-default.html
 #![deny(missing_docs)]
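Review note: the new docs state that a layer digest such as `sha256:abc123...` is stored under `ostree/container/blob/` as `sha256_3A_abc123...`. The sketch below illustrates that mapping only. It is not the `crate::refescape` implementation: the function names `escape_digest_for_ref` and `layer_ref` are hypothetical, and the set of characters passed through unescaped, as well as the doubling of `_`, are assumptions made for illustration.

```rust
/// Hypothetical, simplified escaper for illustration; the real logic lives
/// in `crate::refescape` and may differ in which characters it passes through.
fn escape_digest_for_ref(digest: &str) -> String {
    let mut out = String::with_capacity(digest.len());
    for c in digest.chars() {
        match c {
            // Assumption: these characters are already valid in a ref name.
            'a'..='z' | 'A'..='Z' | '0'..='9' | '.' | '-' | '/' => out.push(c),
            // Assumption: a literal underscore is doubled so that escape
            // sequences stay unambiguous.
            '_' => out.push_str("__"),
            // Anything else becomes `_XX_`, where XX is the uppercase hex
            // code point (so ':' turns into `_3A_`).
            other => out.push_str(&format!("_{:02X}_", other as u32)),
        }
    }
    out
}

/// Build the layer ref name under the `ostree/container/blob` prefix
/// named in the module docs.
fn layer_ref(digest: &str) -> String {
    format!("ostree/container/blob/{}", escape_digest_for_ref(digest))
}
```

Calling `layer_ref("sha256:abc123")` in this sketch yields `ostree/container/blob/sha256_3A_abc123`, matching the form shown in the docs. Real callers should use the helpers in `crate::refescape` rather than reimplementing the escaping.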