cntk.contrib.deeprl.agent.shared.models module
A set of predefined models used by Q learning or Actor-Critic.
class Models
Bases: object
A set of predefined models to approximate Q or log of pi (policy).
The loss function needs to be ‘cross_entropy_with_softmax’ for policy gradient methods.
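To illustrate why a softmax cross-entropy loss pairs naturally with a model of log pi, here is a minimal plain-Python sketch. It does not call CNTK; `log_softmax` and `softmax_cross_entropy` are hypothetical stand-ins, and the logits are made-up values. With a one-hot target, the loss is simply the negative log-probability of the chosen action.

```python
import math

def log_softmax(logits):
    """Return log(softmax(logits)), using the log-sum-exp trick for stability."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_norm for z in logits]

def softmax_cross_entropy(logits, action):
    """Cross entropy between softmax(logits) and a one-hot target at `action`.

    Hypothetical stand-in for a 'cross_entropy_with_softmax' style loss.
    """
    return -log_softmax(logits)[action]

logits = [2.0, 0.5, -1.0]      # made-up raw network outputs for 3 actions
log_pi = log_softmax(logits)   # log-probability of each action under the policy
loss = softmax_cross_entropy(logits, action=0)
```

Under this reading, the raw network outputs act as unnormalized log-probabilities, and minimizing the loss raises the log-probability of the selected action.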
static dueling_network(shape_of_inputs, number_of_outputs, model_hidden_layers, loss_function=None, use_placeholder_for_input=False)
Dueling network to approximate the Q function.
See paper at https://arxiv.org/pdf/1511.06581.pdf.
Parameters:
- shape_of_inputs – tuple of array (input) dimensions.
- number_of_outputs – dimension of the output; equals the number of possible actions.
- model_hidden_layers – in the form of “[comma-separated integers, [comma-separated integers], [comma-separated integers]]”. Each integer is the number of nodes in a hidden layer. The first set of integers represents the shared component of the dueling network, the second set corresponds to the state value function V, and the third set corresponds to the advantage function A.
- loss_function – if not specified, squared loss is used by default.
- use_placeholder_for_input – if true, the inputs have to be replaced later with the actual input_variable.
Returns: a Python dictionary with string-valued keys including ‘inputs’, ‘outputs’, ‘loss’ and ‘f’.
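The three-part layout of model_hidden_layers and the way the V and A streams are combined can be sketched in plain Python. This is only an illustration: `split_dueling_layers` and `combine_streams` are hypothetical helpers, not part of CNTK, and the mean-subtracted aggregation follows the linked paper. Whether this module combines the streams in exactly this way is an assumption.

```python
import ast

def split_dueling_layers(spec):
    """Hypothetical helper: split a dueling layer spec into its three parts.

    `spec` looks like "[64, 32, [16], [8]]": bare integers are the shared
    layers, the first nested list is the V stream, the second the A stream.
    """
    parsed = ast.literal_eval(spec)
    shared = [n for n in parsed if isinstance(n, int)]
    branches = [b for b in parsed if isinstance(b, list)]
    if len(branches) != 2:
        raise ValueError("expected exactly two nested lists (V and A streams)")
    value_layers, advantage_layers = branches
    return shared, value_layers, advantage_layers

def combine_streams(value, advantages):
    """Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'), as in the paper."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]

shared, value_layers, advantage_layers = split_dueling_layers("[64, 32, [16], [8]]")
q_values = combine_streams(1.0, [1.0, 2.0, 3.0])  # made-up V and A values
```

Subtracting the mean advantage keeps V and A identifiable: shifting every advantage by a constant leaves the resulting Q values unchanged.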
static feedforward_network(shape_of_inputs, number_of_outputs, model_hidden_layers, loss_function=None, use_placeholder_for_input=False)
Feedforward network to approximate Q or log of pi.
Parameters:
- shape_of_inputs – tuple of array (input) dimensions.
- number_of_outputs – dimension of the output; equals the number of possible actions.
- model_hidden_layers – string representing a list of integers, each giving the number of nodes in the corresponding hidden layer.
- loss_function – if not specified, squared loss is used by default.
- use_placeholder_for_input – if true, the inputs have to be replaced later with the actual input_variable.
Returns: a Python dictionary with string-valued keys including ‘inputs’, ‘outputs’, ‘loss’ and ‘f’.
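As a rough sketch of how the three arguments determine the network's size, the following plain-Python helper derives the weight-matrix shapes of a fully connected stack. `dense_layer_shapes` is a hypothetical name, not a CNTK function. It assumes the input is flattened, every layer is dense, and the string is written like "[32, 16]".

```python
import ast
import math

def dense_layer_shapes(shape_of_inputs, number_of_outputs, model_hidden_layers):
    """Hypothetical helper: (fan_in, fan_out) for each dense layer.

    Assumes a flattened input and fully connected layers throughout.
    """
    hidden = ast.literal_eval(model_hidden_layers)
    if not isinstance(hidden, list) or not all(
        isinstance(n, int) and n > 0 for n in hidden
    ):
        raise ValueError("model_hidden_layers must be a list of positive integers")
    input_dim = math.prod(shape_of_inputs)  # flatten the input dimensions
    sizes = [input_dim] + hidden + [number_of_outputs]
    # Pair each layer's width with the next to get the weight shapes.
    return list(zip(sizes[:-1], sizes[1:]))

# Made-up example: 4 input features, 2 actions, hidden layers of 32 and 16 nodes.
shapes = dense_layer_shapes((4,), 2, "[32, 16]")
```

With CNTK installed, the corresponding call would presumably be Models.feedforward_network((4,), 2, "[32, 16]"), going by the signature above. The exact string syntax accepted is an assumption here.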
-
static