Temperature
The lower the temperature, the sharper the distribution; the higher the temperature, the flatter the distribution, giving low-probability items a better chance of being sampled.
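As a minimal sketch of this effect, assuming the common formulation in which raw scores (logits) are divided by the temperature before softmax, the snippet below prints the distribution for three temperatures. The function name `softmax_with_temperature` and the example logits are hypothetical, chosen only for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature the top item takes the largest share of the probability mass; at the highest, the mass spreads more evenly across all three items.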
The same model can produce noticeably different response styles under different inference parameters. The model does not "suddenly change its mind"; what differs is the strategy used to select the next token at the final step. Temperature changes the sharpness of the distribution, Top-k and Top-p trim the candidate set, and Greedy always picks the highest-probability item.
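The three selection strategies can be sketched as small functions over a probability list. This is an illustrative sketch, not the demo's actual implementation: the helper names (`greedy`, `top_k_filter`, `top_p_filter`, `sample`) and the example probabilities are hypothetical, and renormalizing the surviving items after trimming is an assumption.

```python
import random

def greedy(probs):
    """Return the index of the highest-probability item."""
    return max(range(len(probs)), key=lambda i: probs[i])

def renormalize(probs):
    """Rescale so the remaining probabilities sum to 1 (assumed behavior)."""
    total = sum(probs)
    return [p / total for p in probs]

def top_k_filter(probs, k):
    """Keep the k most probable items and set the rest to 0."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    return renormalize([p if i in keep else 0.0 for i, p in enumerate(probs)])

def top_p_filter(probs, p):
    """Keep the smallest top-ranked set whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize([q if i in keep else 0.0 for i, q in enumerate(probs)])

def sample(probs, rng=random):
    """Draw one index at random, weighted by probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical next-token distribution (illustrative only).
probs = [0.5, 0.3, 0.15, 0.05]
print("greedy:", greedy(probs))
print("top-k (k=2):", top_k_filter(probs, 2))
print("top-p (p=0.9):", top_p_filter(probs, 0.9))
print("sampled from top-k:", sample(top_k_filter(probs, 2)))
```

Greedy is deterministic, while Top-k and Top-p still sample at random, only from a trimmed candidate set. Top-k keeps a fixed number of candidates, whereas the size of the Top-p set depends on how concentrated the distribution is.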
Pick a scenario, switch between decoding strategies, then drag the parameters. You will see directly how the set of retained candidate tokens, their probability distribution, and the sampling results change.
The bar chart below shows the distribution that actually participates in sampling under the current strategy. Items cut off by the strategy are shown with probability 0.
This set of cards runs the current scenario under Greedy, Temperature, Top-k, and Top-p side by side, so you can quickly compare "stability" against "diversity."