I applied the discrete SAC code to a custom environment with a discrete action space. During training, the critic loss increased instead of decreasing, and it grew very large, exceeding 200. What could be causing this, and how can I fix it? Thanks.