Reward#
What are rewards for?#
Rewards signal good actions: in RL, a better action receives more reward. The reward is used to update the agent so that it tends to take actions that yield more reward in the future.
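As a rough illustration of how a reward can steer an agent toward better actions, here is a minimal sketch of a two-action setting. The environment (`REWARDS`), the function name `run_bandit`, and the `step_size` value are all assumptions made for this example; they are not part of any particular library or of the method described here.

```python
# Hypothetical two-action environment: action 1 always pays more than action 0.
REWARDS = {0: 0.2, 1: 1.0}


def run_bandit(steps=20, step_size=0.5):
    # The agent's running estimate of how much reward each action yields.
    values = {0: 0.0, 1: 0.0}
    history = []
    for t in range(steps):
        if t < len(values):
            # Try every action once so each estimate has some evidence.
            action = t
        else:
            # Afterwards, pick the action with the highest estimated reward.
            action = max(values, key=values.get)
        reward = REWARDS[action]
        # Update: move the estimate toward the reward that was just observed.
        values[action] += step_size * (reward - values[action])
        history.append(action)
    return values, history


values, history = run_bandit()
```

After the two initial trials, the agent keeps choosing the action with the higher reward, which is the behaviour the reward signal is meant to encourage. Real agents usually add exploration and deal with noisy rewards, both of which this sketch leaves out.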
Rewards in deep learning#
In deep learning, a reward is still a scalar. It enters the loss function that is used to update the model. One thing worth noting: a larger reward is better, whereas a loss should be as small as possible. A common way to reconcile the two is to set \( loss = -reward \), so that minimizing the loss is equivalent to maximizing the reward.
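The sign flip can be sketched with plain gradient descent. The example below assumes a toy, differentiable reward \( reward(\theta) = -(\theta - 3)^2 \) with a single parameter `theta`; this reward function, the learning rate, and the function names are hypothetical and chosen only to keep the example self-contained. In practice a deep learning framework would compute the gradient automatically.

```python
def reward(theta):
    # Toy reward (an assumption for this sketch): largest when theta == 3.
    return -(theta - 3.0) ** 2


def loss(theta):
    # The loss is the negated reward, so a lower loss means a higher reward.
    return -reward(theta)


def loss_grad(theta):
    # Hand-derived derivative of loss(theta) = (theta - 3)^2.
    return 2.0 * (theta - 3.0)


def minimize_loss(theta=0.0, lr=0.1, steps=100):
    # Ordinary gradient descent on the loss.
    for _ in range(steps):
        theta -= lr * loss_grad(theta)
    return theta


theta = minimize_loss()
```

Descending the loss moves `theta` toward 3, which is exactly where the reward is highest, so the optimizer never needs to know that it is "maximizing" anything.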