From 49b9435aa55fd4e450fbf41de446c6dd6e0beb87 Mon Sep 17 00:00:00 2001
From: Ashleigh Brennan
Date: Tue, 16 Sep 2025 11:28:53 -0500
Subject: [PATCH] CNV-15425: Clean up NUMA conceptual docs

---
 modules/cnf-about-numa-aware-scheduling.adoc | 20 -------------------
 .../cnf-numa-aware-scheduling.adoc           |  5 +++++
 snippets/about-numa.adoc                     | 14 +++++++++++++
 3 files changed, 19 insertions(+), 20 deletions(-)
 create mode 100644 snippets/about-numa.adoc

diff --git a/modules/cnf-about-numa-aware-scheduling.adoc b/modules/cnf-about-numa-aware-scheduling.adoc
index 07d586d02b..aa22421829 100644
--- a/modules/cnf-about-numa-aware-scheduling.adoc
+++ b/modules/cnf-about-numa-aware-scheduling.adoc
@@ -6,38 +6,18 @@
 [id="cnf-about-numa-aware-scheduling_{context}"]
 = About NUMA-aware scheduling
 
-[discrete]
-[id="introduction-to-numa_{context}"]
-== Introduction to NUMA
-
-Non-Uniform Memory Access (NUMA) is a compute platform architecture that allows different CPUs to access different regions of memory at different speeds. NUMA resource topology refers to the locations of CPUs, memory, and PCI devices relative to each other in the compute node. Colocated resources are said to be in the same _NUMA zone_. For high-performance applications, the cluster needs to process pod workloads in a single NUMA zone.
-
-[discrete]
-[id="performance-considerations_{context}"]
-== Performance considerations
-
-NUMA architecture allows a CPU with multiple memory controllers to use any available memory across CPU complexes, regardless of where the memory is located. This allows for increased flexibility at the expense of performance. A CPU processing a workload using memory that is outside its NUMA zone is slower than a workload processed in a single NUMA zone. Also, for I/O-constrained workloads, the network interface on a distant NUMA zone slows down how quickly information can reach the application. High-performance workloads, such as telecommunications workloads, cannot operate to specification under these conditions.
-
-[discrete]
-[id="numa-aware-scheduling_{context}"]
-== NUMA-aware scheduling
-
 NUMA-aware scheduling aligns the requested cluster compute resources (CPUs, memory, devices) in the same NUMA zone to process latency-sensitive or high-performance workloads efficiently. NUMA-aware scheduling also improves pod density per compute node for greater resource efficiency.
 
-[discrete]
 [id="integration-with-node-tuning-operator_{context}"]
 == Integration with Node Tuning Operator
 
 By integrating the Node Tuning Operator's performance profile with NUMA-aware scheduling, you can further configure CPU affinity to optimize performance for latency-sensitive workloads.
 
-[discrete]
 [id="default-scheduling-logic_{context}"]
 == Default scheduling logic
 
 The default {product-title} pod scheduler scheduling logic considers the available resources of the entire compute node, not individual NUMA zones. If the most restrictive resource alignment is requested in the kubelet topology manager, error conditions can occur when admitting the pod to a node. Conversely, if the most restrictive resource alignment is not requested, the pod can be admitted to the node without proper resource alignment, leading to worse or unpredictable performance. For example, runaway pod creation with `Topology Affinity Error` statuses can occur when the pod scheduler makes suboptimal scheduling decisions for guaranteed pod workloads without knowing if the pod's requested resources are available. Scheduling mismatch decisions can cause indefinite pod startup delays. Also, depending on the cluster state and resource allocation, poor pod scheduling decisions can cause extra load on the cluster because of failed startup attempts.
-
-[discrete]
 [id="numa-aware-pod-scheduling-diagram_{context}"]
 == NUMA-aware pod scheduling diagram
diff --git a/scalability_and_performance/cnf-numa-aware-scheduling.adoc b/scalability_and_performance/cnf-numa-aware-scheduling.adoc
index 087481169d..7b793a8c4a 100644
--- a/scalability_and_performance/cnf-numa-aware-scheduling.adoc
+++ b/scalability_and_performance/cnf-numa-aware-scheduling.adoc
@@ -12,6 +12,11 @@ Learn about NUMA-aware scheduling and how you can use it to deploy high performa
 
 The NUMA Resources Operator allows you to schedule high-performance workloads in the same NUMA zone. It deploys a node resources exporting agent that reports on available cluster node NUMA resources, and a secondary scheduler that manages the workloads.
 
+[id="cnf-numa-aware-scheduling-about-numa_{context}"]
+== About NUMA
+
+include::snippets/about-numa.adoc[]
+
 include::modules/cnf-about-numa-aware-scheduling.adoc[leveloffset=+1]
 
 include::modules/cnf-numa-resource-scheduling-strategies.adoc[leveloffset=+1]
diff --git a/snippets/about-numa.adoc b/snippets/about-numa.adoc
new file mode 100644
index 0000000000..a90001af57
--- /dev/null
+++ b/snippets/about-numa.adoc
@@ -0,0 +1,14 @@
+// Snippets included in the following assemblies and modules:
+//
+// *scalability_and_performance/cnf-numa-aware-scheduling.adoc
+
+:_mod-docs-content-type: SNIPPET
+
+Non-uniform memory access (NUMA) architecture is a multiprocessor architecture model where CPUs do not access all memory in all locations at the same speed. Instead, CPUs can gain faster access to memory that is in closer proximity to them, or _local_ to them, but slower access to memory that is farther away.
+
+A CPU with multiple memory controllers can use any available memory across CPU complexes, regardless of where the memory is located. However, this increased flexibility comes at the expense of performance.
+
+_NUMA resource topology_ refers to the physical locations of CPUs, memory, and PCI devices relative to each other in the compute node. In a NUMA architecture, a _NUMA zone_ is a group of CPUs together with the memory that is local to them. Colocated resources are said to be in the same NUMA zone, and CPUs in a zone have faster access to that zone's local memory than CPUs outside the zone.
+A workload that uses memory outside the NUMA zone of its CPU is processed more slowly than a workload that runs entirely in a single NUMA zone. Similarly, for I/O-constrained workloads, a network interface in a distant NUMA zone slows down how quickly information can reach the application.
+
+Applications can achieve better performance by containing data and processing within the same NUMA zone. For high-performance workloads and applications, such as telecommunications workloads, the cluster must process pod workloads in a single NUMA zone so that the workload can operate to specification.
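The "Default scheduling logic" section above refers to guaranteed pod workloads, which are the pods the kubelet topology manager considers for its most restrictive (`single-numa-node`) alignment. As a minimal illustration of that prerequisite (the pod name, container name, and image are placeholders, not part of this patch), a pod gets the `Guaranteed` QoS class only when every container's CPU and memory requests equal its limits:

```yaml
# Illustrative sketch only: names and image are placeholders.
# Equal requests and limits give the pod the Guaranteed QoS class,
# which the kubelet topology manager considers for restrictive
# (single-numa-node) resource alignment.
apiVersion: v1
kind: Pod
metadata:
  name: numa-guaranteed-example
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest
    resources:
      requests:
        cpu: "4"
        memory: 8Gi
      limits:
        cpu: "4"      # must equal the CPU request for Guaranteed QoS
        memory: 8Gi   # must equal the memory request for Guaranteed QoS
```

If requests and limits differ, or are omitted, the pod falls into the `Burstable` or `BestEffort` QoS class and is not subject to single-NUMA-node alignment, which is one way the scheduling mismatches described above can arise.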