Replace [RUNPOD_API_KEY] with your Runpod API key and [RUNPOD_ENDPOINT_ID] with your vLLM endpoint ID.

Parameter | Type | Description |
---|---|---|
temperature | float | Controls randomness (0.0-1.0) |
max_tokens | int | Maximum number of tokens to generate |
top_p | float | Nucleus sampling parameter (0.0-1.0) |
top_k | int | Limits sampling to the k most likely tokens |
stop | string or array | Sequence(s) at which to stop generation |
repetition_penalty | float | Penalizes repetition (1.0 = no penalty) |
presence_penalty | float | Penalizes tokens that have already appeared in the text, regardless of how often |
frequency_penalty | float | Penalizes tokens in proportion to how often they have already appeared |
min_p | float | Minimum probability threshold relative to most likely token |
best_of | int | Number of candidate completions to generate server-side, from which the best are returned |
use_beam_search | boolean | Whether to use beam search instead of sampling |
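
To illustrate how the parameters in this table might be combined, here is a minimal Python sketch that builds a request without sending it. The URL pattern (`https://api.runpod.ai/v2/<endpoint_id>/runsync`) and the payload layout (`input.prompt` plus `input.sampling_params`) are assumptions about the worker's native API, and `build_payload` and `build_request` are hypothetical helper names, so check them against your endpoint's documentation before relying on them.

```python
import json
import urllib.request

# Assumed URL pattern for a synchronous serverless call; verify for your endpoint.
RUNPOD_URL = "https://api.runpod.ai/v2/{endpoint_id}/runsync"


def build_payload(prompt, **sampling_params):
    """Build a request body.

    Assumes the worker reads the prompt from `input.prompt` and the
    table's parameters from `input.sampling_params`.
    """
    return {"input": {"prompt": prompt, "sampling_params": sampling_params}}


def build_request(api_key, endpoint_id, payload):
    """Wrap the payload in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        RUNPOD_URL.format(endpoint_id=endpoint_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_payload(
    "Explain nucleus sampling in one sentence.",
    temperature=0.7,         # moderate randomness
    max_tokens=128,          # cap on generated tokens
    top_p=0.9,               # nucleus sampling threshold
    stop=["\n\n"],           # stop at the first blank line
    repetition_penalty=1.1,  # values above 1.0 discourage repetition
)
request = build_request("[RUNPOD_API_KEY]", "[RUNPOD_ENDPOINT_ID]", payload)

# To send the request, substitute real credentials and call:
#     urllib.request.urlopen(request)
```

Passing the parameters as keyword arguments keeps the call site close to the table: any parameter you omit is left out of the payload, so the server's default applies.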