// Module included in the following assemblies:
//
// * microshift_ai/microshift-rhoai.adoc
:_mod-docs-content-type: CONCEPT
[id="microshift-rhoai-workflow_{context}"]
= Workflow for using {rhoai} with {microshift-short}

Using {rhoai} with {microshift-short} requires the following general workflow:

Getting your AI model ready::
* Choose the artificial intelligence (AI) model that best aligns with your edge application and the decisions that need to be made at {microshift-short} deployment sites.
* In the cloud or data center, develop, train, and test your model.
* Plan for the system requirements and additional resources your AI model requires to run.

Setting up the deployment environment::
* Configure your {op-system-bundle} for the specific hardware that your deployment runs on, including drivers and device plugins.
* To enable GPU or other hardware accelerators for {microshift-short}, follow the installation guidance specific to your edge device. For example, to use an NVIDIA GPU accelerator, begin by reading link:https://docs.nvidia.com/datacenter/cloud-native/edge/latest/nvidia-gpu-with-device-edge.html#running-a-gpu-accelerated-workload-on-red-hat-device-edge[Running a GPU-Accelerated Workload on Red Hat Device Edge] (NVIDIA documentation).
* For troubleshooting, consult the device documentation or product support.
+
[TIP]
====
Using only a driver and device plugin, instead of an Operator, might be more resource-efficient.
====

Installing the {microshift-short} {rhoai} RPM::
* Install the `microshift-ai-model-serving` RPM package.
* Restart {microshift-short} if you are adding the RPM while {microshift-short} is running.
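+
A minimal sketch of these two steps, assuming an RPM-based {op-system-bundle} system with `sudo` access, might look like the following example:
+
[source,terminal]
----
$ sudo dnf install -y microshift-ai-model-serving
$ sudo systemctl restart microshift
----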

Getting ready to deploy::
* Package your AI model into an OCI image, otherwise known as the ModelCar format; see the example `Containerfile` that follows these steps. If you already have S3-compatible storage or a persistent volume claim set up, you can skip this step, but only the ModelCar format is tested and supported for {microshift-short}.
* Select a model-serving runtime, which acts as your model server. Configure the runtime by using the `ServingRuntime` and `InferenceService` custom resources; see the examples that follow these steps.
** Copy the `ServingRuntime` custom resource (CR) from the default `redhat-ods-applications` namespace to your own namespace.
** Create the `InferenceService` CR.
* Optional: Create a `Route` object so that your model can be reached from outside the cluster.
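+
The following is a minimal sketch of the ModelCar packaging step. It assumes that your trained model files are in a local `models/` directory, that the serving runtime reads the model from `/models` inside the image, and that `quay.io/example/my-model:latest` is a registry location you control; all of these names are placeholders:
+
[source,dockerfile]
----
# Minimal base image; the ModelCar image only needs to carry the model files.
FROM registry.access.redhat.com/ubi9/ubi-micro:latest

# Copy the model files into /models, where the model server expects to find them.
COPY ./models /models/

# Run as a non-root user so that the model files are readable in the pod.
USER 1001
----
+
[source,terminal]
----
$ podman build -t quay.io/example/my-model:latest .
$ podman push quay.io/example/my-model:latest
----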
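+
The following commands and manifest are a minimal sketch of the serving runtime, inference service, and route steps. The namespace `ai-demo`, the model name `my-model`, the `<runtime-name>` value, the `onnx` model format, and the OCI image reference are placeholders; match them to your environment and to the runtime that you copy:
+
[source,terminal]
----
$ oc get servingruntime -n redhat-ods-applications
$ oc get servingruntime <runtime-name> -n redhat-ods-applications -o yaml > my-runtime.yaml
----
+
After removing the cluster-specific metadata from `my-runtime.yaml` and setting `metadata.namespace` to your own namespace, apply it and then create an `InferenceService` CR similar to the following:
+
[source,yaml]
----
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model
  namespace: ai-demo
spec:
  predictor:
    model:
      modelFormat:
        name: onnx # a format that the copied runtime supports
      runtime: <runtime-name> # name of the ServingRuntime copy in your namespace
      storageUri: oci://quay.io/example/my-model:latest # the ModelCar image you pushed
----
+
If clients outside the cluster need access, one option is to expose the predictor service, assuming the inference service creates a service named `my-model-predictor`:
+
[source,terminal]
----
$ oc expose service my-model-predictor -n ai-demo
----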

Using your model::
* Make requests against the model server. For example, another pod running in your {microshift-short} deployment that is attached to a camera can stream an image to the model-serving runtime. The model-serving runtime prepares that image as data for model inferencing. If the model was trained for the binary identification of bees, the AI model outputs the likelihood that the image data shows a bee.
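+
A minimal sketch of such a request, assuming the model is exposed through a route at `<route-host>`, is named `my-model`, and is served by a runtime that implements the KServe v2 REST inference protocol (all of these are assumptions and placeholders), might look like the following:
+
[source,terminal]
----
$ curl -s http://<route-host>/v2/models/my-model/ready

$ curl -s -X POST http://<route-host>/v2/models/my-model/infer \
    -H "Content-Type: application/json" \
    -d '{"inputs": [{"name": "<input-tensor-name>", "shape": [1, 3, 224, 224], "datatype": "FP32", "data": [...]}]}'
----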