Eric Curtin
057c19e8d2
Remove libexec files
...
This is breaking nocontainer invocations: Python package managers
don't recognize libexec files and replace the shebang.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-11 05:20:49 +01:00
Daniel J Walsh
d98adcbc9f
Merge pull request #1499 from containers/update-shortnames
...
This is not a multi-model model
2025-06-10 23:43:49 -04:00
Eric Curtin
6959d73d30
Merge pull request #1501 from alaviss/push-tumrzqxpzvkn
...
amdkfd: add constants for heap types
2025-06-10 17:41:49 -05:00
Leorize
309766dd8c
amdkfd: add constants for heap types
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:22:30 -05:00
Eric Curtin
4808a49de0
Merge pull request #1500 from alaviss/push-pwxuznmnqptr
...
Only enumerate ROCm-capable AMD GPUs
2025-06-10 17:02:17 -05:00
Leorize
db4a7d24af
Apply formatting fixes
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:20:18 -05:00
Leorize
93e36ac24e
Extract VRAM minimum into a constant
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:37 -05:00
Leorize
ecb9fb086f
Extract amdkfd utilities to its own module
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:20 -05:00
Leorize
fab87654cb
Only enumerate ROCm-capable AMD GPUs
...
Discover AMD graphics devices using AMDKFD topology instead of
enumerating the PCIe bus. This interface exposes a lot more information
about potential devices, allowing RamaLama to filter out unsupported
devices.
Currently, devices older than GFX9 are filtered, as they are no longer
supported by ROCm.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 14:54:48 -05:00
Eric Curtin
9bc76c2757
This is not a multi-model model
...
Although the other gemma ones are. Point the user towards a single
gguf.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 18:43:06 +01:00
Daniel J Walsh
83a75f16f7
Merge pull request #1492 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
2025-06-10 08:42:14 -04:00
Daniel J Walsh
8a9f6a0291
Merge pull request #1496 from containers/fix-build
...
Install uv to fix build issue
2025-06-10 08:32:17 -04:00
Eric Curtin
b21556b513
Install uv to fix build issue
...
Run the install-uv.sh script.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:14:56 +01:00
Daniel J Walsh
4be8cbc71e
Merge pull request #1495 from containers/dont-use-llvmpipe
...
There's a change that we want that avoids using software rasterizers
2025-06-10 08:08:50 -04:00
Eric Curtin
b4a3375d94
There's a change that we want that avoids using software rasterizers
...
It avoids using llvmpipe when Vulkan is built in and falls back to
ggml-cpu.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:05:31 +01:00
Daniel J Walsh
7bdd073b59
Merge pull request #1491 from makllama/xd/fix_hf
...
Fix #1489
2025-06-10 05:25:40 -04:00
renovate[bot]
5b849722cb
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-10 09:22:45 +00:00
Daniel J Walsh
5925bb6908
Merge pull request #1490 from rhatdan/llama-stack
...
Make sure llama-stack URL is shown to user
2025-06-10 05:22:05 -04:00
Xiaodong Ye
ae0775afd1
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:45:47 +08:00
Xiaodong Ye
6f020d361c
Fix #1489
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:39:26 +08:00
Daniel J Walsh
764fc2d829
Make sure llama-stack URL is shown to user
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 09:50:04 +02:00
Daniel J Walsh
b64d82276c
Merge pull request #1471 from rhatdan/oci
...
Throw exception when using OCI without engine
2025-06-10 03:36:20 -04:00
Daniel J Walsh
041c05d2b8
Throw exception when using OCI without engine
...
Fixes: https://github.com/containers/ramalama/issues/1463
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 08:46:01 +02:00
Daniel J Walsh
97a14e9c2d
Merge pull request #1486 from containers/remove-duplicate-line-on-restapi
...
Only print this in the llama-stack case
2025-06-10 00:09:54 -04:00
Eric Curtin
2368da00ac
Only print this in the llama-stack case
...
In the llama.cpp case it doesn't make as much sense; llama-server
prints this string when it's ready to serve, like so:
main: server is listening on http://0.0.0.0:8080 - starting the main loop
This can be printed seconds or minutes too early potentially in
the llama.cpp case.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-09 15:25:08 +01:00
Daniel J Walsh
c62acfbba6
Merge pull request #1484 from rhatdan/VERSION
...
Bump to v0.9.1
v0.9.1
2025-06-09 08:37:35 -04:00
Daniel J Walsh
9c639fc651
Bump to v0.9.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
bbcfb7c0f1
Fix llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
3317372625
Merge pull request #1474 from rhatdan/demos
...
Update demos to show serving models.
2025-06-09 03:35:06 -04:00
Daniel J Walsh
cd2a8c3539
Update demo scripts to show serve
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 09:34:36 +02:00
Daniel J Walsh
fe6d90461f
Merge pull request #1472 from rhatdan/llama-stack
...
Fix handling of generate with llama-stack
2025-06-09 03:29:53 -04:00
Daniel J Walsh
e4ea40a1b8
Merge pull request #1483 from containers/renovate/huggingface-hub-0.x
...
fix(deps): update dependency huggingface-hub to ~=0.32.4
2025-06-09 00:14:15 -04:00
renovate[bot]
9627b5617b
fix(deps): update dependency huggingface-hub to ~=0.32.4
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-08 20:35:25 +00:00
Eric Curtin
4a10c02716
Merge pull request #1481 from ieaves/imp/dev-dependency-groups
...
Adds dev dependency groups
2025-06-08 15:34:54 -05:00
Daniel J Walsh
4fe7ae73a1
Fix stopping of llama-stack based containers by name
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-08 11:54:24 +02:00
Daniel J Walsh
2ca6b57dc3
Fix handling of generate with llama-stack
...
The llama-stack API is not working without the --generate option.
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-07 10:36:46 +02:00
Ian Eaves
f65529bda7
adds dev dependency groups
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-06 18:12:33 -05:00
Nathan Weinberg
268e47ccc0
Merge pull request #1478 from nathan-weinberg/stack-bump
...
chore: bump 'ramalama-stack' version to 0.2.0
2025-06-05 16:15:03 -04:00
Nathan Weinberg
c59a507426
chore: bump 'ramalama-stack' version to 0.2.0
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-05 15:00:11 -04:00
Daniel J Walsh
fc9b33e436
Merge pull request #1477 from containers/no-warmup
...
Don't warmup by default
2025-06-05 14:46:30 -04:00
Eric Curtin
8d2041a0bb
Don't warmup by default
...
llama-server by default warms up the model with an empty run for
performance reasons. We can warm up ourselves with a real query.
Warming up was causing issues and delaying start time.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 19:42:41 +01:00
Daniel J Walsh
a67d8c1f6a
Merge pull request #1476 from containers/env-var
...
Call set_gpu_type_env_vars rather than set_accel_env_vars
2025-06-05 14:05:08 -04:00
Eric Curtin
882011029c
Call set_gpu_type_env_vars rather than set_accel_env_vars
...
For GPU detection.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 16:43:48 +01:00
Daniel J Walsh
f07a062124
Merge pull request #1475 from containers/env-var
...
Do not override a small subset of env vars
2025-06-05 11:00:31 -04:00
Eric Curtin
ff446f96fb
Do not override a small subset of env vars
...
RamaLama does not try to detect a GPU if the user has already set
certain env vars. Make this list smaller.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 14:01:45 +01:00
Daniel J Walsh
ef7bd2a004
Merge pull request #1467 from rhatdan/llama-stack
...
llama-stack container build fails with == 1.5.0
2025-06-05 01:39:04 -04:00
Daniel J Walsh
b990ef0392
Merge pull request #1469 from containers/timeout-change
...
Change timeouts
2025-06-04 20:13:44 -04:00
Eric Curtin
0bcf3b8308
Merge pull request #1468 from waltdisgrace/documentation_improvements
...
Documentation improvements
2025-06-04 11:38:55 -05:00
Eric Curtin
0455e45073
Change timeouts
...
The most we want to sleep between request attempts is 100ms; a request
every 100ms isn't that expensive.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-04 17:37:11 +01:00
Grace Chin
c4777a9ccc
Add documentation about running tests
...
Signed-off-by: Grace Chin <gchin@redhat.com>
2025-06-04 11:55:57 -04:00