Run build_rag.sh from the bind-mounted context dir, rather than copying
the script and requirements files into the image.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
Clean the docsite directory as part of "make clean". Remove generated
doc files from the repository; their presence creates a risk of
publishing incorrect documentation.
Signed-off-by: John Wiele <jwiele@redhat.com>
Standardize on multi-stage builds for all images, which avoids including
development tools and libraries in the final images, reducing image size.
Install all llama.cpp binaries and shared libraries for consistency with
upstream images. Avoid installing unnecessary (and large) .a files from
the installation directory.
Call build_llama.sh to install runtime dependencies in the final images
so package versions can be kept consistent between build and final images.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
Remove the Tekton pipelines for building the bats image, and the bats-integration
test. Run tests as part of the ramalama image build using the e2e image.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
Redirect the output from setvars.sh so commands that parse output (like
"ramalama bench") aren't confused.
If run with a TTY and no args, start an interactive shell for convenience and
consistency with other images.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
Refactor the fetch_metadata method to first attempt to fetch GGUF
metadata from a manifest, and then to attempt to fetch safetensors
metadata from the repo tree. The repo tree listing may be paginated,
although no Hugging Face model seems to be big enough to require
pagination yet.
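
The handling of a potentially paginated repo tree can be sketched roughly
as below. This is an illustration only, not the code in this change:
iter_tree_entries and fetch_page are hypothetical names, and the sketch
assumes each page yields a list of entries plus the URL of the next page
(or None when there is no next page).

```python
from typing import Callable, Iterator, Optional, Tuple

# Hypothetical page fetcher: takes a URL and returns (entries, next_url).
# A real one would issue an HTTPS request and work out the next-page URL
# from the response; that detail is an assumption, not taken from this change.
Page = Tuple[list, Optional[str]]


def iter_tree_entries(first_url: str, fetch_page: Callable[[str], Page]) -> Iterator[dict]:
    """Yield every entry from a listing that may span several pages."""
    url: Optional[str] = first_url
    seen = set()
    while url is not None and url not in seen:
        seen.add(url)  # guard against a listing that links back to itself
        entries, url = fetch_page(url)
        yield from entries
```

Passing the fetcher in as a callable keeps the loop testable without
network access, which matters while no real repo exercises the
multi-page path.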
To fetch a safetensors model, download the set of files from the repo
with HTTPS requests. Add an "other_files" category for safetensors
files that are neither JSON config files nor .safetensors model files,
such as the tokenizer.model file.
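
A minimal sketch of the three-way split described above, assuming files
are classified by extension alone. Only the "other_files" name comes from
this change; categorize_repo_files and the other two category names are
hypothetical.

```python
from pathlib import PurePosixPath


def categorize_repo_files(paths):
    """Split repo file paths into JSON config, model, and other files."""
    # "config_files" and "safetensors_files" are illustrative names only.
    categories = {"config_files": [], "safetensors_files": [], "other_files": []}
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        if suffix == ".json":
            categories["config_files"].append(path)
        elif suffix == ".safetensors":
            categories["safetensors_files"].append(path)
        else:
            # Anything else, for example tokenizer.model
            categories["other_files"].append(path)
    return categories
```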
Remove code to download models from huggingface.co with the
huggingface CLI. It didn't work correctly anyway. Stub out the
_collect_cli_files and in_existing_cache methods for the huggingface
transport.
Enable the bats test of downloading safetensors models from
huggingface. Remove the bats test and e2e test of downloading models with
the huggingface CLI. Adjust the test expectation for the split safetensors
model.
Fixes: #1493
Signed-off-by: John Wiele <jwiele@redhat.com>
Just check that the command output reports that it's pulling from cache.
Pull and pull-from-cache timing will vary depending on network and disk, so it
is not safe to assume that pulling from disk will be twice as fast as pulling
from the network.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
Disable image pulling when using HIP_VISIBLE_DEVICES; otherwise the large
ROCm image will be pulled.
Skip all of the generate tests when --nocontainer is true.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
Remove build and installation of whisper.cpp, and installation of ffmpeg.
Rename build_llama_and_whisper.sh to build_llama.sh.
Update Containerfiles to reference new script name.
Consolidate management of cmake args in build_llama.sh.
Remove references to whisper-server in various locations.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>