Monitoring your agent



What do you monitor when training Actor-Critics? I suppose the critic loss is useful, but are there other sources of information?

Thanks !


So far, here’s what I’ve found:

  • Critic loss (mean and variance)
  • Policy (actor) loss (mean and variance)
  • Entropy seems to be an interesting metric. Entropy measures the dispersion of the policy output: whether the action probabilities are spread out, or one is much higher than the others.
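To make the entropy point concrete, here's a minimal sketch of computing the entropy of a categorical policy output with NumPy (the function name and values are my own illustration, not from the thread):

```python
import numpy as np

def policy_entropy(probs):
    """Shannon entropy of a categorical policy output.

    High entropy  -> probabilities are spread out (exploratory policy).
    Low entropy   -> one action dominates (near-deterministic policy).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Clip to avoid log(0) for actions assigned zero probability.
    probs = np.clip(probs, 1e-12, 1.0)
    return float(-np.sum(probs * np.log(probs)))

# Uniform over 4 actions: maximum entropy, log(4) ~= 1.386
print(policy_entropy([0.25, 0.25, 0.25, 0.25]))

# Near-deterministic policy: entropy close to 0
print(policy_entropy([0.97, 0.01, 0.01, 0.01]))
```

Logging this per batch shows whether the policy is collapsing to a single action too early (entropy dropping fast) or failing to commit (entropy staying near the maximum).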