Implements GitHub issue #1981 by adding a unified --max-tokens argument
that works across different AI runtimes with appropriate parameter mapping.
Changes:
- Add max_tokens field to BaseConfig with default value 0 (unlimited)
- Add --max-tokens CLI argument to run, serve, and perplexity commands
- Implement runtime-specific parameter mapping:
  * llama.cpp: maps to the -n parameter
  * MLX: maps to the --max-tokens parameter
  * vLLM: maps to the --max-tokens parameter
- Add daemon service support for max_tokens parameter
- Include comprehensive unit and e2e tests
The --max-tokens argument accepts integer values where 0 means unlimited
output tokens, providing a consistent interface across all supported
AI runtimes while maintaining compatibility with existing runtime_args.
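The runtime-specific mapping can be sketched as follows. This is an
illustration only, not the code from this commit: the function name
max_tokens_args and the dictionary keys are hypothetical, and only the
flag names (-n, --max-tokens) come from the list above. The sketch
assumes that 0 (unlimited) is expressed by omitting the flag.

```python
# Hypothetical sketch: translate a unified max_tokens value into
# runtime-specific arguments. Flag names are taken from the commit
# message; everything else is an assumption for illustration.
RUNTIME_FLAGS = {
    "llama.cpp": "-n",
    "mlx": "--max-tokens",
    "vllm": "--max-tokens",
}


def max_tokens_args(runtime: str, max_tokens: int) -> list[str]:
    """Return the extra command-line arguments for the given runtime."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be >= 0")
    if max_tokens == 0:
        # Assumption: 0 (unlimited) means the flag is simply not passed.
        return []
    return [RUNTIME_FLAGS[runtime], str(max_tokens)]
```

For example, max_tokens_args("llama.cpp", 256) yields ["-n", "256"],
while any runtime with max_tokens set to 0 yields an empty list.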
Update man pages and config documentation for --max-tokens
Fixes #1981
Signed-off-by: Assistant <assistant@cursor.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
GGML_NATIVE=OFF does not work as one would expect.
Set SOURCE_DATE_EPOCH instead to emulate an rpm build environment.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
Use the GNU Make addprefix function instead of relying on shell brace
expansion to generate --exclude-dir arguments for grep.
GNU Make uses /bin/sh to run recipes by default. /bin/sh may not
support brace expansion at all. On Fedora and RHEL, /bin/sh is an
alias for bash. On Ubuntu, /bin/sh is an alias for dash. On macOS,
/bin/sh is an alias for zsh. These shells handle braces differently. In
particular, dash on Ubuntu does not do brace expansion within a word,
with the result that PYTHON_SCRIPTS is generated incorrectly.
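For readers unfamiliar with addprefix, its effect is roughly that of
the Python sketch below. Make performs the expansion itself before the
recipe reaches the shell, so the result no longer depends on which
shell /bin/sh points to. The directory names are hypothetical; the
Makefile's actual list is not reproduced here.

```python
def addprefix(prefix: str, names: str) -> str:
    """Mimic GNU Make's $(addprefix prefix,names...) on a whitespace-separated list."""
    return " ".join(prefix + name for name in names.split())


# Hypothetical directory names, for illustration only.
excludes = addprefix("--exclude-dir=", ".venv venv .tox build")
print(excludes)
# prints: --exclude-dir=.venv --exclude-dir=venv --exclude-dir=.tox --exclude-dir=build
```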
Signed-off-by: John Wiele <jwiele@redhat.com>
By using sys.prefix instead of a hard-coded /usr, ramalama is able to
use configuration installed inside a virtual environment (e.g. one
created via tox).
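A minimal sketch of the idea, assuming the installed configuration
lives in a share/<app> directory under the prefix. The function name
installed_config_dir and the exact subdirectory are hypothetical and
may differ from what ramalama actually uses.

```python
import os
import sys


def installed_config_dir(app: str = "ramalama") -> str:
    """Return a share directory under the active Python prefix."""
    # Inside a virtual environment sys.prefix points at the venv root;
    # otherwise it is the interpreter's installation prefix (often /usr).
    # Assumption: configuration is installed under <prefix>/share/<app>.
    return os.path.join(sys.prefix, "share", app)
```

Because sys.prefix follows the active interpreter, the same lookup
works for a system install and for an environment created by tox.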
Signed-off-by: Michael Engel <mengel@redhat.com>
All RamaLama environment variables should be prefixed with
RAMALAMA_, so rename API_KEY to RAMALAMA_API_KEY.
Also add api_key to the ramalama.conf config files.
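One way the two sources could be combined is sketched below. The
helper name resolve_api_key and the precedence (environment variable
over config file) are assumptions for illustration, not a description
of the actual implementation.

```python
import os


def resolve_api_key(config: dict, environ=os.environ):
    """Return the API key from the environment, else from the config, else None."""
    # Assumption: RAMALAMA_API_KEY takes precedence over api_key in ramalama.conf.
    return environ.get("RAMALAMA_API_KEY") or config.get("api_key")
```

With this sketch, setting RAMALAMA_API_KEY in the environment overrides
an api_key value loaded from the config file.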
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
- Migrate the `test/system/030-run.bats` bats test to e2e pytest.
- Basic dry-run functionality with and without prompts
- Various parameter combinations, including environment variables, security settings, pull policies, and runtime arguments
- Error handling for invalid inputs and conflicting options
- Actual model execution scenarios
- RAG and non-existent image handling
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
- Migrate the `test/system/015-help.bats` bats test to e2e pytest.
- Dynamically discover and verify the help output for all subcommands.
- Check default values and override precedence for options like image, engine, and store path.
- Validate error messages for invalid arguments and missing parameters.
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>