cntk.contrib.deeprl.agent.agent module

Base class for defining an agent.

class AgentBaseClass(o_space, a_space)

Bases: object

Base class for defining an agent.

end(reward, next_state)

Observe the last reward and state of the episode, which then terminates.

Parameters:
    reward (float) – amount of reward returned after previous action.
    next_state (object) – observation provided by the environment.

evaluate(o)

Choose an action for the given observation without updating the agent’s status.

Parameters:	o (object) – observation provided by the environment.
Returns:	action chosen by agent.
Return type:	int
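
As a rough illustration of the difference between evaluate and the training-time calls, the sketch below uses a hypothetical stand-in class, CountingAgent, that only mirrors the method names documented here. It does not import or subclass the real AgentBaseClass, and its constructor argument, its deterministic policy, and the steps_observed counter are assumptions made for the example.

```python
class CountingAgent:
    """Hypothetical stand-in that mirrors the documented method names.

    It is not the CNTK class; observations are assumed to be integers.
    """

    def __init__(self, num_actions):
        self._num_actions = num_actions
        # Example of "status" that only the training-time calls modify.
        self.steps_observed = 0

    def _policy(self, o):
        # Deterministic toy policy so the example is easy to follow.
        return o % self._num_actions

    def start(self, state):
        self.steps_observed += 1
        return self._policy(state), {}

    def step(self, reward, next_state):
        self.steps_observed += 1
        return self._policy(next_state), {}

    def evaluate(self, o):
        # Read-only: picks an action but leaves the counter untouched.
        return self._policy(o)


agent = CountingAgent(num_actions=3)
first_action, _ = agent.start(4)          # training-time call, updates status
count_before = agent.steps_observed
eval_action = agent.evaluate(5)           # evaluation call, no status update
count_after = agent.steps_observed
```

In this sketch count_before and count_after are equal, which is the property the description of evaluate refers to.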

save_parameter_settings(filename)

Save parameter settings to file.

start(state)

Start a new episode.

Parameters:	state (object) – observation provided by the environment.
Returns:
    action (int) – action chosen by agent.
    debug_info (dict) – auxiliary diagnostic information.
Return type:	tuple of (int, dict)

step(reward, next_state)

Observe one transition and choose an action.

Parameters:
    reward (float) – amount of reward returned after previous action.
    next_state (object) – observation provided by the environment.
Returns:
    action (int) – action chosen by agent.
    debug_info (dict) – auxiliary diagnostic information.
Return type:	tuple of (int, dict)
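
The start, step, and end methods are meant to be called in that order over an episode. The following is a minimal sketch of such a loop, not the library's own training driver. FixedActionAgent, CorridorEnv, and run_episode are hypothetical names introduced for illustration; the agent only mirrors the documented method signatures, and the environment's reset/step interface is an assumption.

```python
class FixedActionAgent:
    """Hypothetical agent with the documented start/step/end signatures."""

    def __init__(self, action):
        self._action = action
        self.episodes_finished = 0

    def start(self, state):
        # Begin an episode: return (action, debug_info).
        return self._action, {}

    def step(self, reward, next_state):
        # Observe one transition and choose the next action.
        return self._action, {"last_reward": reward}

    def end(self, reward, next_state):
        # Observe the final transition; no action is returned.
        self.episodes_finished += 1


class CorridorEnv:
    """Toy environment (an assumption): reach position `length` to finish."""

    def __init__(self, length=3):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves one cell to the right, action 0 stays in place.
        self.pos += action
        done = self.pos >= self.length
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


def run_episode(agent, env, max_steps=100):
    """Drive one episode: start once, step per transition, end on termination."""
    state = env.reset()
    action, _ = agent.start(state)
    total_reward = 0.0
    for _ in range(max_steps):
        next_state, reward, done = env.step(action)
        total_reward += reward
        if done:
            agent.end(reward, next_state)
            return total_reward
        action, _ = agent.step(reward, next_state)
    # Step budget exhausted before termination; end() is not called here.
    return total_reward


agent = FixedActionAgent(action=1)
total = run_episode(agent, CorridorEnv(length=3))
```

Note that end takes the same arguments as step but returns no action, since no further decision is needed once the episode has terminated.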