ramalama

mirror of https://github.com/containers/ramalama.git synced 2026-02-05 15:47:26 +01:00

Author	SHA1	Message	Date
Eric Curtin	057c19e8d2	Remove libexec files This is breaking nocontainer invocations, python package managers don't recognize libexec files and replace the shebang. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-11 05:20:49 +01:00
Daniel J Walsh	d98adcbc9f	Merge pull request #1499 from containers/update-shortnames This is not a multi-model model	2025-06-10 23:43:49 -04:00
Eric Curtin	6959d73d30	Merge pull request #1501 from alaviss/push-tumrzqxpzvkn amdkfd: add constants for heap types	2025-06-10 17:41:49 -05:00
Leorize	309766dd8c	amdkfd: add constants for heap types Signed-off-by: Leorize <leorize+oss@disroot.org>	2025-06-10 17:22:30 -05:00
Eric Curtin	4808a49de0	Merge pull request #1500 from alaviss/push-pwxuznmnqptr Only enumerate ROCm-capable AMD GPUs	2025-06-10 17:02:17 -05:00
Leorize	db4a7d24af	Apply formatting fixes Signed-off-by: Leorize <leorize+oss@disroot.org>	2025-06-10 15:20:18 -05:00
Leorize	93e36ac24e	Extract VRAM minimum into a constant Signed-off-by: Leorize <leorize+oss@disroot.org>	2025-06-10 15:17:37 -05:00
Leorize	ecb9fb086f	Extract amdkfd utilities to their own module Signed-off-by: Leorize <leorize+oss@disroot.org>	2025-06-10 15:17:20 -05:00
Leorize	fab87654cb	Only enumerate ROCm-capable AMD GPUs Discover AMD graphics devices using AMDKFD topology instead of enumerating the PCIe bus. This interface exposes a lot more information about potential devices, allowing RamaLama to filter out unsupported devices. Currently, devices older than GFX9 are filtered, as they are no longer supported by ROCm. Signed-off-by: Leorize <leorize+oss@disroot.org>	2025-06-10 14:54:48 -05:00
Eric Curtin	9bc76c2757	This is not a multi-model model Although the other gemma ones are. Point the user towards a single gguf. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-10 18:43:06 +01:00
Daniel J Walsh	83a75f16f7	Merge pull request #1492 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372	2025-06-10 08:42:14 -04:00
Daniel J Walsh	8a9f6a0291	Merge pull request #1496 from containers/fix-build Install uv to fix build issue	2025-06-10 08:32:17 -04:00
Eric Curtin	b21556b513	Install uv to fix build issue Run the install-uv.sh script. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-10 13:14:56 +01:00
Daniel J Walsh	4be8cbc71e	Merge pull request #1495 from containers/dont-use-llvmpipe There's a change that we want that avoids using software rasterizers	2025-06-10 08:08:50 -04:00
Eric Curtin	b4a3375d94	There's a change that we want that avoids using software rasterizers It avoids using llvmpipe when Vulkan is built in and falls back to ggml-cpu. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-10 13:05:31 +01:00
Daniel J Walsh	7bdd073b59	Merge pull request #1491 from makllama/xd/fix_hf Fix #1489	2025-06-10 05:25:40 -04:00
renovate[bot]	5b849722cb	chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372 Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>	2025-06-10 09:22:45 +00:00
Daniel J Walsh	5925bb6908	Merge pull request #1490 from rhatdan/llama-stack Make sure llama-stack URL is shown to user	2025-06-10 05:22:05 -04:00
Xiaodong Ye	ae0775afd1	Address review comments Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-06-10 16:45:47 +08:00
Xiaodong Ye	6f020d361c	Fix #1489 Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-06-10 16:39:26 +08:00
Daniel J Walsh	764fc2d829	Make sure llama-stack URL is shown to user Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-10 09:50:04 +02:00
Daniel J Walsh	b64d82276c	Merge pull request #1471 from rhatdan/oci Throw exception when using OCI without engine	2025-06-10 03:36:20 -04:00
Daniel J Walsh	041c05d2b8	Throw exception when using OCI without engine Fixes: https://github.com/containers/ramalama/issues/1463 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-10 08:46:01 +02:00
Daniel J Walsh	97a14e9c2d	Merge pull request #1486 from containers/remove-duplicate-line-on-restapi Only print this in the llama-stack case	2025-06-10 00:09:54 -04:00
Eric Curtin	2368da00ac	Only print this in the llama-stack case In the llama.cpp case it doesn't make as much sense, llama-server prints this string when it's ready to be served like so: main: server is listening on http://0.0.0.0:8080 - starting the main loop This can be printed seconds or minutes too early potentially in the llama.cpp case. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-09 15:25:08 +01:00
Daniel J Walsh	c62acfbba6	Merge pull request #1484 from rhatdan/VERSION Bump to v0.9.1 v0.9.1	2025-06-09 08:37:35 -04:00
Daniel J Walsh	9c639fc651	Bump to v0.9.1 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-09 14:37:05 +02:00
Daniel J Walsh	bbcfb7c0f1	Fix llama-stack Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-09 14:37:05 +02:00
Daniel J Walsh	3317372625	Merge pull request #1474 from rhatdan/demos Update demos to show serving models.	2025-06-09 03:35:06 -04:00
Daniel J Walsh	cd2a8c3539	Update demo scripts to show serve Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-09 09:34:36 +02:00
Daniel J Walsh	fe6d90461f	Merge pull request #1472 from rhatdan/llama-stack Fix handling of generate with llama-stack	2025-06-09 03:29:53 -04:00
Daniel J Walsh	e4ea40a1b8	Merge pull request #1483 from containers/renovate/huggingface-hub-0.x fix(deps): update dependency huggingface-hub to ~=0.32.4	2025-06-09 00:14:15 -04:00
renovate[bot]	9627b5617b	fix(deps): update dependency huggingface-hub to ~=0.32.4 Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>	2025-06-08 20:35:25 +00:00
Eric Curtin	4a10c02716	Merge pull request #1481 from ieaves/imp/dev-dependency-groups Adds dev dependency groups	2025-06-08 15:34:54 -05:00
Daniel J Walsh	4fe7ae73a1	Fix stopping of llama-stack based containers by name Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-08 11:54:24 +02:00
Daniel J Walsh	2ca6b57dc3	Fix handling of generate with llama-stack llama-stack API is not working without --generate command. Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com> Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>	2025-06-07 10:36:46 +02:00
Ian Eaves	f65529bda7	adds dev dependency groups Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>	2025-06-06 18:12:33 -05:00
Nathan Weinberg	268e47ccc0	Merge pull request #1478 from nathan-weinberg/stack-bump chore: bump 'ramalama-stack' version to 0.2.0	2025-06-05 16:15:03 -04:00
Nathan Weinberg	c59a507426	chore: bump 'ramalama-stack' version to 0.2.0 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-06-05 15:00:11 -04:00
Daniel J Walsh	fc9b33e436	Merge pull request #1477 from containers/no-warmup Don't warmup by default	2025-06-05 14:46:30 -04:00
Eric Curtin	8d2041a0bb	Don't warmup by default llama-server by default warms up the model with an empty run for performance reasons. We can warm up ourselves with a real query. Warming up was causing issues and delaying start time. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-05 19:42:41 +01:00
Daniel J Walsh	a67d8c1f6a	Merge pull request #1476 from containers/env-var Call set_gpu_type_env_vars rather than set_accel_env_vars	2025-06-05 14:05:08 -04:00
Eric Curtin	882011029c	Call set_gpu_type_env_vars rather than set_accel_env_vars For GPU detection. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-05 16:43:48 +01:00
Daniel J Walsh	f07a062124	Merge pull request #1475 from containers/env-var Do not override a small subset of env vars	2025-06-05 11:00:31 -04:00
Eric Curtin	ff446f96fb	Do not override a small subset of env vars RamaLama does not try to detect GPU if the user has already set certain env vars. Make this list smaller. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-05 14:01:45 +01:00
Daniel J Walsh	ef7bd2a004	Merge pull request #1467 from rhatdan/llama-stack llama-stack container build fails with == 1.5.0	2025-06-05 01:39:04 -04:00
Daniel J Walsh	b990ef0392	Merge pull request #1469 from containers/timeout-change Change timeouts	2025-06-04 20:13:44 -04:00
Eric Curtin	0bcf3b8308	Merge pull request #1468 from waltdisgrace/documentation_improvements Documentation improvements	2025-06-04 11:38:55 -05:00
Eric Curtin	0455e45073	Change timeouts The most we want to sleep between request attempts is 100ms; a request every 100ms isn't that expensive. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2025-06-04 17:37:11 +01:00
Grace Chin	c4777a9ccc	Add documentation about running tests Signed-off-by: Grace Chin <gchin@redhat.com>	2025-06-04 11:55:57 -04:00

2360 Commits