cntk.layers.blocks module

Basic building blocks that are semantically not layers (not used in a layered fashion), e.g. the LSTM block.

ForwardDeclaration(name='forward_declaration')[source]

Helper for recurrent network declarations. Returns a placeholder variable with an added method resolve_to() to be called at the end to close the loop. This is used for explicit graph building with recurrent connections.

Example

>>> # create a graph with a recurrent loop to compute the length of an input sequence
>>> from cntk.layers.typing import *
>>> x = C.input_variable(**Sequence[Tensor[2]])
>>> ones_like_input = C.sequence.broadcast_as(1, x)  # sequence of scalar ones of same length as input
>>> out_fwd = ForwardDeclaration()  # placeholder for the state variables
>>> out = C.sequence.past_value(out_fwd, initial_state=0) + ones_like_input
>>> out_fwd.resolve_to(out)
>>> length = C.sequence.last(out)
>>> x0 = np.reshape(np.arange(6,dtype=np.float32),(1,3,2))
>>> x0
    array([[[ 0.,  1.],
            [ 2.,  3.],
            [ 4.,  5.]]], dtype=float32)
>>> length(x0)
    array([ 3.], dtype=float32)
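The same pattern extends to other recurrences. As a small variation on the example above (a sketch reusing the imports and the toy input x0 defined there), the loop below accumulates the input values themselves instead of a constant 1, giving the per-dimension running sum of the sequence:

>>> x = C.input_variable(**Sequence[Tensor[2]])
>>> sum_fwd = ForwardDeclaration()
>>> s = C.sequence.past_value(sum_fwd, initial_state=0) + x
>>> sum_fwd.resolve_to(s)
>>> total = C.sequence.last(s)
>>> result = total(x0)   # sums [0,1] + [2,3] + [4,5] -> [6., 9.]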
Returns:

a placeholder variable with a method resolve_to() that resolves it to another variable

Return type:

Variable
GRU(shape, cell_shape=None, activation=tanh, init=glorot_uniform(), init_bias=0, enable_self_stabilization=False, name='')[source]

Layer factory function to create a GRU block for use inside a recurrence. The GRU block implements one step of the recurrence and is stateless. It accepts the previous state as its first argument, and outputs its new state.

Example

>>> # a gated recurrent layer
>>> from cntk.layers import *
>>> gru_layer = Recurrence(GRU(500))
Parameters:
  • shape (int or tuple of ints) – vector or tensor dimension of the output of this layer
  • cell_shape (tuple, defaults to None) – if given, then the output state is first computed at cell_shape and linearly projected to shape
  • activation (Function, defaults to tanh()) – function to apply at the end, e.g. relu
  • init (scalar or NumPy array or cntk.initializer, defaults to glorot_uniform) – initial value of weights W
  • init_bias (scalar or NumPy array or cntk.initializer, defaults to 0) – initial value of the bias b
  • enable_self_stabilization (bool, defaults to False) – if True then add a Stabilizer() to all state-related projections (but not the data input)
  • name (str, defaults to '') – the name of the Function instance in the network
Returns:

A function (prev_h, input) -> h that implements one step of a recurrent GRU layer.

Return type:

Function
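As an additional sketch (not part of the reference; the dimensions are arbitrary and import cntk as C is assumed), the block can also be applied directly to an explicit previous state and an input, which illustrates the (prev_h, input) -> h signature:

>>> prev_h = C.input_variable(3)   # explicit previous state
>>> x_t = C.input_variable(2)      # one input step
>>> h = GRU(3)(prev_h, x_t)        # one GRU step: (prev_h, input) -> h
>>> h.shape
    (3,)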

LSTM(shape, cell_shape=None, activation=tanh, use_peepholes=False, init=glorot_uniform(), init_bias=0, enable_self_stabilization=False, name='')[source]

Layer factory function to create an LSTM block for use inside a recurrence. The LSTM block implements one step of the recurrence and is stateless. It accepts the previous state as its first two arguments, and outputs its new state as a two-valued tuple (h,c).

Example

>>> # a typical recurrent LSTM layer
>>> from cntk.layers import *
>>> lstm_layer = Recurrence(LSTM(500))
Parameters:
  • shape (int or tuple of ints) – vector or tensor dimension of the output of this layer
  • cell_shape (tuple, defaults to None) – if given, then the output state is first computed at cell_shape and linearly projected to shape
  • activation (Function, defaults to tanh()) – function to apply at the end, e.g. relu
  • use_peepholes (bool, defaults to False) – if True, use peephole connections, i.e. let the gates also see the cell state
  • init (scalar or NumPy array or cntk.initializer, defaults to glorot_uniform) – initial value of weights W
  • init_bias (scalar or NumPy array or cntk.initializer, defaults to 0) – initial value of the bias b
  • enable_self_stabilization (bool, defaults to False) – if True then add a Stabilizer() to all state-related projections (but not the data input)
  • name (str, defaults to '') – the name of the Function instance in the network
Returns:

A function (prev_h, prev_c, input) -> (h, c) that implements one step of a recurrent LSTM layer.

Return type:

Function
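As an additional sketch (not part of the reference; the dimensions are arbitrary and import cntk as C is assumed), one LSTM step applied to an explicit previous state (prev_h, prev_c) and an input yields the pair (h, c):

>>> prev_h = C.input_variable(3)
>>> prev_c = C.input_variable(3)
>>> x_t = C.input_variable(2)
>>> h, c = LSTM(3)(prev_h, prev_c, x_t).outputs   # one step -> (h, c)
>>> h.shape, c.shape
    ((3,), (3,))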

RNNStep(shape, cell_shape=None, activation=sigmoid, init=glorot_uniform(), init_bias=0, enable_self_stabilization=False, name='')[source]

Layer factory function to create a plain RNN block for use inside a recurrence. The RNN block implements one step of the recurrence and is stateless. It accepts the previous state as its first argument, and outputs its new state.

Example

>>> # a plain relu RNN layer
>>> from cntk.layers import *
>>> relu_rnn_layer = Recurrence(RNNStep(500, activation=C.relu))
Parameters:
  • shape (int or tuple of ints) – vector or tensor dimension of the output of this layer
  • cell_shape (tuple, defaults to None) – if given, then the output state is first computed at cell_shape and linearly projected to shape
  • activation (Function, defaults to sigmoid()) – function to apply at the end, e.g. relu
  • init (scalar or NumPy array or cntk.initializer, defaults to glorot_uniform) – initial value of weights W
  • init_bias (scalar or NumPy array or cntk.initializer, defaults to 0) – initial value of the bias b
  • enable_self_stabilization (bool, defaults to False) – if True then add a Stabilizer() to all state-related projections (but not the data input)
  • name (str, defaults to '') – the name of the Function instance in the network
Returns:

A function (prev_h, input) -> h where h = activation(input @ W + prev_h @ R + b)

Return type:

Function
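To make the formula above concrete, here is a plain-NumPy restatement of a single step (illustrative shapes and values; this is not the CNTK implementation, and tanh is chosen as the activation only for illustration):

>>> # h = activation(x_t @ W + prev_h @ R + b)
>>> W = 0.1 * np.ones((2, 3), dtype=np.float32)   # input-to-hidden weights
>>> R = 0.5 * np.eye(3, dtype=np.float32)         # recurrent (hidden-to-hidden) weights
>>> b = np.zeros(3, dtype=np.float32)             # bias
>>> prev_h = np.zeros(3, dtype=np.float32)        # previous state
>>> x_t = np.array([1., 2.], dtype=np.float32)    # one input step
>>> h = np.tanh(x_t @ W + prev_h @ R + b)         # new state, shape (3,)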

RNNUnit(shape, cell_shape=None, activation=sigmoid, init=glorot_uniform(), init_bias=0, enable_self_stabilization=False, name='')[source]

This is a deprecated name for RNNStep(). Use that name instead.

Stabilizer(steepness=4, enable_self_stabilization=True, name='')[source]

Layer factory function to create a Droppo self-stabilizer. It multiplies its input by a learned scalar.

It takes enable_self_stabilization as a flag that allows the stabilizer to be disabled, which is useful when the flag is set as a global default.

Note

Some other layers (specifically, recurrent units like LSTM()) also have the option to use the Stabilizer() layer internally. That is enabled by passing enable_self_stabilization=True to those layers. When used together, the convention is that the user inserts an explicit Stabilizer() for the main data input, while the recurrent layer owns the stabilizer(s) for its internal recurrent connection(s).

Note

Unlike the original paper, which proposed a linear or an exponential scalar, CNTK uses a sharpened Softplus: 1/steepness * ln(1 + e^(steepness * beta)). The Softplus behaves linearly for values around and above 1 (like the linear scalar) while guaranteeing positivity (like the exponential variant), and is more robust because it avoids exploding gradients.

Example

>>> # recurrent model with self-stabilization
>>> from cntk.layers import *
>>> with default_options(enable_self_stabilization=True): # enable stabilizers by default for LSTM()
...     model = Sequential([
...         Embedding(300),
...         Stabilizer(),           # stabilizer for main data input of recurrence
...         Recurrence(LSTM(512)),  # LSTM owns its own stabilizers for the recurrent connections
...         Stabilizer(),
...         Dense(10)
...     ])
Parameters:
  • steepness (int, defaults to 4) – sharpness of the Softplus that maps the learned parameter to the scaling factor (see the note above)
  • enable_self_stabilization (bool, defaults to True) – a flag that allows the stabilizer to be disabled. Useful if this is a global default
  • name (str, defaults to '') – the name of the Function instance in the network
Returns:

A function that multiplies its input by a learned scalar

Return type:

Function
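The sharpened Softplus from the note above can be sketched in NumPy as follows (beta stands for the learned parameter; the values are illustrative only):

>>> # scale = 1/steepness * ln(1 + e^(steepness * beta)); the input is multiplied by scale
>>> steepness = 4
>>> beta = 1.0                                            # the learned parameter
>>> scale = np.log1p(np.exp(steepness * beta)) / steepness
>>> bool(scale > 1.0)                                     # close to beta, and always positive
    True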

UntestedBranchError(name)[source]