commit 941ae1b2d9
Author: Michael Engel <mengel@redhat.com>
Date: 2025-10-23 14:19:49 +02:00

Added --max-tokens to llama.cpp inference spec
Relates to: https://github.com/containers/ramalama/pull/1982

Previously, the --max-tokens parameter was wired into the daemon's internal
command factory. With the introduction of the spec, that command factory has
been replaced by the spec itself, and the --max-tokens option has been added
to the llama.cpp spec.
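
A minimal sketch of what such a spec-driven option mapping could look like.
The InferenceSpec and Option names below are illustrative assumptions, not
RamaLama's actual classes; the only detail taken from llama.cpp itself is
that llama-server caps generation length with its --n-predict flag.

    from dataclasses import dataclass, field


    @dataclass
    class Option:
        """Maps a daemon-level option name to a backend CLI flag.

        Hypothetical helper for illustration; not RamaLama's real API.
        """
        name: str  # option exposed by the daemon, e.g. "max-tokens"
        flag: str  # flag passed to the inference binary


    @dataclass
    class InferenceSpec:
        """Declarative replacement for an internal command factory."""
        binary: str
        options: list[Option] = field(default_factory=list)

        def build_command(self, settings: dict) -> list[str]:
            """Assemble the backend command line from daemon settings."""
            cmd = [self.binary]
            for opt in self.options:
                if opt.name in settings:
                    cmd += [opt.flag, str(settings[opt.name])]
            return cmd


    # llama.cpp spec carrying the newly added max-tokens option;
    # llama-server expresses the generation limit via --n-predict.
    LLAMA_CPP = InferenceSpec(
        binary="llama-server",
        options=[Option(name="max-tokens", flag="--n-predict")],
    )

    print(LLAMA_CPP.build_command({"max-tokens": 512}))
    # -> ['llama-server', '--n-predict', '512']

The point of the spec approach, as described above, is that adding an option
like --max-tokens becomes a declarative entry rather than a code change in a
command factory.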

Signed-off-by: Michael Engel <mengel@redhat.com>