mirror of https://github.com/containers/ramalama.git synced 2026-02-05 06:46:39 +01:00

3151 Commits

Author SHA1 Message Date
Mike Bonnet
dc54155ba2 Merge pull request #2003 from rhatdan/VERSION
Bump to v0.12.4
v0.12.4
2025-10-06 13:34:00 -07:00
Daniel J Walsh
7d00e37e38 Bump to v0.12.4
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-10-06 11:41:01 -04:00
Daniel J Walsh
82047b1f1f Merge pull request #1999 from rhatdan/llama.cpp
Bump the versions of llama.cpp and whisper.cpp
2025-10-06 11:39:34 -04:00
Daniel J Walsh
de6c684d7a Merge pull request #1982 from rhatdan/max
Add unified --max-tokens CLI argument for output token limiting
2025-10-06 11:15:49 -04:00
Daniel J Walsh
eb6acd26fa Add unified --max-tokens CLI argument for output token limiting
Implements GitHub issue #1981 by adding a unified --max-tokens argument
that works across different AI runtimes with appropriate parameter mapping.

Changes:
- Add max_tokens field to BaseConfig with default value 0 (unlimited)
- Add --max-tokens CLI argument to run, serve, and perplexity commands
- Implement runtime-specific parameter mapping:
  * llama.cpp: maps to -n parameter
  * MLX: maps to --max-tokens parameter
  * vLLM: maps to --max-tokens parameter
- Add daemon service support for max_tokens parameter
- Include comprehensive unit and e2e tests

The --max-tokens argument accepts integer values where 0 means unlimited
output tokens, providing a consistent interface across all supported
AI runtimes while maintaining compatibility with existing runtime_args.
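
The mapping described above could be sketched roughly as follows. This is only an illustration, not the code from the commit: the helper name `max_tokens_args`, the lowercase runtime keys, and the choice to emit no flag at all when the value is 0 are assumptions. Only the flag names come from the commit message.

```python
from typing import List

# Flag used by each runtime, as listed in the commit message.
# The dictionary keys are hypothetical identifiers for the runtimes.
MAX_TOKENS_FLAG = {
    "llama.cpp": "-n",
    "mlx": "--max-tokens",
    "vllm": "--max-tokens",
}


def max_tokens_args(runtime: str, max_tokens: int) -> List[str]:
    """Return the extra runtime arguments for an output token limit (hypothetical helper)."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be >= 0")
    if max_tokens == 0:
        # Assumption: 0 means unlimited, so no flag is passed to the runtime.
        return []
    return [MAX_TOKENS_FLAG[runtime], str(max_tokens)]
```

With this sketch, `max_tokens_args("llama.cpp", 128)` yields `["-n", "128"]`, while a value of 0 yields an empty list for every runtime.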

Update man pages and config documentation for --max-tokens

Fixes #1981

Signed-off-by: Assistant <assistant@cursor.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-10-06 10:30:58 -04:00
Daniel J Walsh
c5b86deeda Bump the versions of llama.cpp and whisper.cpp
Fixes: https://github.com/containers/ramalama/issues/1997

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-10-06 10:22:31 -04:00
Daniel J Walsh
902df29e22 Merge pull request #2000 from olliewalsh/generic_llamacpp
Fix llama.cpp build instruction set
2025-10-03 06:25:06 -04:00
Daniel J Walsh
ba6ff658ee Merge pull request #1994 from bmahabirbu/linux-demo-update
chore: update demo script with RAG and mcp based on dev conf presenta…
2025-10-03 06:16:50 -04:00
Brian
5882abf932 chore: update demo script with RAG and mcp based on dev conf presentation
Signed-off-by: Brian <bmahabir@bu.edu>
2025-10-02 21:52:13 -04:00
Oliver Walsh
1103de0f07 Fix llama.cpp build instruction set
GGML_NATIVE=OFF does not work as one would expect.
Set SOURCE_DATE_EPOCH instead to emulate an rpm build environment.

Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-10-02 21:11:24 +00:00
Daniel J Walsh
ea05811098 Merge pull request #1998 from olliewalsh/typo
Fix typo in llama.cpp engine spec
2025-10-02 06:26:59 -04:00
Oliver Walsh
89cd6c360b Fix typo in llama.cpp engine spec
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-10-02 10:59:17 +01:00
Daniel J Walsh
f2a97075dd Merge pull request #1988 from ramalama-labs/imp/docsite
updates docsite and adds docsite to the make docs process
2025-10-02 05:22:25 -04:00
Daniel J Walsh
da142a9fec Merge pull request #1990 from jwieleRH/tox
Fix --exclude-dir arguments to grep in Makefile and add .tox.
2025-10-02 05:20:13 -04:00
Daniel J Walsh
8eb4fffae1 Merge pull request #1991 from containers/konflux/mintmaker/main/pycqa-isort-6.x
Update pre-commit hook pycqa/isort to v6.1.0
2025-10-02 05:17:56 -04:00
Daniel J Walsh
f6e70d629b Merge pull request #1992 from containers/renovate/react-monorepo
Update react monorepo to v19.2.0
2025-10-02 05:17:23 -04:00
renovate[bot]
52378bac18 Update react monorepo to v19.2.0
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-10-01 22:04:46 +00:00
red-hat-konflux-kflux-prd-rh03[bot]
f1a12467b6 Update pre-commit hook pycqa/isort to v6.1.0
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-10-01 20:05:38 +00:00
John Wiele
01e2c0bcc9 Fix --exclude-dir arguments to grep in Makefile and add .tox.
Use the GNU Make addprefix function instead of relying on shell brace
expansion to generate --exclude-dir arguments for grep.

GNU Make uses /bin/sh to run recipes by default. /bin/sh may not
support brace expansion at all. On Fedora and RHEL, /bin/sh is an
alias for bash. On Ubuntu, /bin/sh is an alias for dash. On macOS,
/bin/sh is an alias for zsh. Those all handle braces differently. In
particular, dash on Ubuntu does not do brace expansion within a word
with the result that PYTHON_SCRIPTS is generated incorrectly.
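
For readers unfamiliar with the GNU Make `addprefix` function, its effect can be mimicked in a few lines of Python. This is a sketch for illustration only: the Python function and the `.venv` directory name are assumptions, and only `.tox` and the `--exclude-dir` option are taken from the commit message.

```python
from typing import Iterable, List


def addprefix(prefix: str, names: Iterable[str]) -> List[str]:
    """Prepend prefix to every name, like $(addprefix prefix,names) in GNU Make."""
    return [prefix + name for name in names]


# Hypothetical directory list; the real Makefile may exclude other directories.
exclude_dirs = [".venv", ".tox"]
grep_args = addprefix("--exclude-dir=", exclude_dirs)
print(" ".join(grep_args))  # prints: --exclude-dir=.venv --exclude-dir=.tox
```

Because the expansion happens before the shell sees the command, the result does not depend on whether /bin/sh supports brace expansion.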

Signed-off-by: John Wiele <jwiele@redhat.com>
2025-10-01 14:44:16 -04:00
Ian Eaves
5cd9722231 lint
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-10-01 12:28:15 -05:00
Ian Eaves
b32ce823f4 updates docsite and adds docsite to the make docs process
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-10-01 12:12:48 -05:00
Daniel J Walsh
ccc6e6a61f Merge pull request #1986 from engelmi/add-perplexity-and-bench
Add perplexity and bench
2025-10-01 11:03:41 -04:00
Michael Engel
5d00a839f5 Use sys.prefix instead of hard-coded /usr prefix for default config dir list
By using sys.prefix instead of /usr, ramalama is able to use
configuration installed inside a virtual environment (e.g. one created
via tox).
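
A minimal sketch of the idea, assuming a hypothetical helper and hypothetical directory names (the actual list in ramalama may differ):

```python
import os
import sys
from typing import List


def default_config_dirs(prefix: str = sys.prefix) -> List[str]:
    """Build a default config directory list rooted at the interpreter prefix.

    Inside a virtual environment sys.prefix points at the environment,
    so configuration installed there is found without hard-coding /usr.
    The directory names below are assumptions for illustration.
    """
    return [
        os.path.join(prefix, "share", "ramalama"),
        "/etc/ramalama",
        os.path.expanduser(os.path.join("~", ".config", "ramalama")),
    ]
```

Outside a virtual environment `sys.prefix` is typically `/usr` on Linux distributions, so the first entry matches the previous hard-coded behaviour there.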

Signed-off-by: Michael Engel <mengel@redhat.com>
2025-10-01 16:02:43 +02:00
Michael Engel
0991c83259 Use yaml anchor for options
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-10-01 16:02:43 +02:00
Michael Engel
e5264f450c Removed obsolete gpu_args function
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-10-01 16:02:43 +02:00
Michael Engel
403c310b07 Moved hard-coded bench to llama.cpp spec
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-10-01 16:02:43 +02:00
Michael Engel
299e54ecb4 Moved hard-coded perplexity to llama.cpp spec
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-10-01 16:02:43 +02:00
Daniel J Walsh
602468e1bf Merge pull request #1978 from telemaco/e2e-pytest-run-cmd
Add e2e pytest test for run command
2025-09-30 16:54:00 -04:00
Daniel J Walsh
7da8ac82cc Merge pull request #1959 from engelmi/inference-engine-spec
Inference engine spec
2025-09-30 16:11:27 -04:00
Daniel J Walsh
39a2cf7908 Merge pull request #1958 from rhatdan/chat
Fix handling of API_KEY in ramalama chat
2025-09-30 16:04:45 -04:00
Michael Engel
ad742219a8 Replace equals with substring or equals check in is_healthy
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Michael Engel
09277eecb8 Replace eval() with more Jinja templating
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Michael Engel
2753c371b1 Move ensure_model_exists from serve and run to cli functions
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Michael Engel
afe9913bb7 Replace hard-coded serve and run command with builder
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Michael Engel
fcdbb52a49 Removed obsolete model.py file
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Michael Engel
5aa484d4c4 Added command builder based on external specification
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-09-30 17:57:10 +02:00
Daniel J Walsh
9ca6bdca18 Fix handling of API_KEY in ramalama chat
First, all RamaLama environment variables should be prefixed with
RAMALAMA, so rename API_KEY to RAMALAMA_API_KEY.

Also add api_key to ramalama.conf config files.
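
One way the lookup could work is sketched below. The helper name and the precedence (environment variable first, then the `api_key` config value) are assumptions, not something the commit message states.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    config: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the API key for the chat command (hypothetical helper).

    Assumption: RAMALAMA_API_KEY in the environment wins over the
    api_key entry from the config file. The unprefixed API_KEY
    variable is ignored.
    """
    return environ.get("RAMALAMA_API_KEY") or config.get("api_key")
```

Passing the environment as a parameter keeps the sketch easy to exercise without modifying the real process environment.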

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-09-30 10:34:07 -04:00
Daniel J Walsh
f7c3111f5b Merge pull request #1970 from containers/renovate/macos-15.x
chore(deps): update dependency macos to v15
2025-09-29 08:28:32 -04:00
Daniel J Walsh
78f1f0c132 Merge pull request #1972 from jeffmaury/GH-1971
fix: rename file with : in name
2025-09-29 08:27:51 -04:00
Daniel J Walsh
411737d4ca Merge pull request #1973 from telemaco/e2e-pytest-help-cmd
Add e2e pytest test for help command
2025-09-29 08:26:36 -04:00
Daniel J Walsh
d75e9b8e25 Merge pull request #1975 from containers/konflux/references/main
chore(deps): update konflux references to abf231c
2025-09-29 08:26:09 -04:00
Daniel J Walsh
961427f78a Merge pull request #1980 from futursolo/fix-xe-driver
fix(intel): detect xe driver
2025-09-29 08:21:27 -04:00
Roberto Majadas
86d6079ef5 Add e2e pytest test for run command
- Migrate the `test/system/030-run.bats` Bats test to e2e pytest.
- Basic dry-run functionality with and without prompts
- Various parameter combinations, including environment variables, security settings, pull policies, and runtime arguments
- Error handling for invalid inputs and conflicting options
- Actual model execution scenarios
- RAG and non-existent image handling

Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-09-29 13:56:34 +02:00
Jeff MAURY
da93662f77 fix: convert : in file name
Signed-off-by: Jeff MAURY <jmaury@redhat.com>
2025-09-29 12:19:49 +02:00
Kaede Hoshikawa
e2190bb9d7 fix: apply Gemini suggestion with a quirk
Signed-off-by: Kaede Hoshikawa <11693215+futursolo@users.noreply.github.com>
2025-09-29 10:15:56 +00:00
Kaede Hoshikawa
9fca773249 fix(intel): detect xe driver
Signed-off-by: Kaede Hoshikawa <11693215+futursolo@users.noreply.github.com>
2025-09-29 10:15:56 +00:00
red-hat-konflux-kflux-prd-rh03[bot]
e4e81d02c1 chore(deps): update konflux references to abf231c
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-09-27 16:04:51 +00:00
Roberto Majadas
8ee65eaa54 Add e2e pytest test for help command
- Migrate the `test/system/015-help.bats` Bats test to e2e pytest.
- Dynamically discover and verify the help output for all subcommands.
- Check default values and override precedence for options like image, engine, and store path.
- Validate error messages for invalid arguments and missing parameters.

Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-09-26 17:46:40 +02:00
Jeff MAURY
e868b0fada fix: rename file with : in name
Fixes #1971

Signed-off-by: Jeff MAURY <jmaury@redhat.com>
2025-09-26 10:22:38 +02:00
renovate[bot]
fcdd71be7f chore(deps): update dependency macos to v15
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-09-25 22:10:44 +00:00