15.3 Cs Unit Specifications

The CsUnit is fairly simple. As described for backpropagation (see section 14.1.3 Bp Unit Specifications), the bias weight is implemented as a connection object on the unit. The unit also records the minus and plus phase activation values, act_m and act_p (primarily for display purposes), and the change in activation state for this unit, da.

The different flavors of constraint satisfaction algorithm differ only in their activation function, so each has its own type of unit spec. All of these derive from a common CsUnitSpec, which holds the parameters shared by the different algorithms: the noise and gain, the step size taken in updating activations, and schedules for adapting noise and gain over time:

CsConSpec_SPtr bias_spec
Points to the connection spec that controls the learning of the bias weight on the unit.
MinMaxRange real_range
The actual range within which unit activations are kept. Typically, units are held within some tolerance of the absolute act_range values, which prevents saturation in the computation of inverse-sigmoid functions (e.g., taking the log of zero) and avoids related numerical problems.
Random noise
These are the parameters of the noise added to the unit. Zero-mean GAUSSIAN noise is the standard form to use. Noise is always added to the activations in an amount proportional to the square root of the step size (except in the BoltzUnitSpec, where noise is intrinsic to the activation function and none is added).
float step
The step size to take in updating activation values. A smaller step leads to smoother updating, but longer settling times.
float gain
The sharpness of the sigmoidal activation function, or 1 over the temperature for the Boltzmann units. A higher gain makes the units act more like binary units, while a lower gain makes their response more continuous and graded.
ClampType clamp_type
This controls the way in which external inputs (from the environment) are applied to the network. HARD_CLAMP means that the activation is set to exactly the ext value from the environment. HARD_FAST_CLAMP is like hard clamping, but optimized so that all of the inputs from clamped layers are computed once at the start of settling, saving considerable computational overhead; it should not be used if the inputs are noisy, since the noise will not be included! SOFT_CLAMP means that the external input is added into the net input of the unit, instead of forcing the activation to take on the external value. SOFT_THEN_HARD_CLAMP performs soft clamping in the minus phase and hard clamping in the plus phase.
float clamp_gain
When soft clamping, this parameter determines how strongly the external input contributes to the unit net input. It simply multiplies the value in the ext field.
Random initial_act
Controls the random initialization of unit activations in the InitState function.
bool use_annealing
Controls whether an annealing schedule is used to adapt the variance of the noise distribution over time (noise_sched).
Schedule noise_sched
This schedule contains values that are multiplied by the var parameter of the noise field to give an effective variance level. The schedule is indexed by the cycle count from the settle process: each point in the schedule gives a variance multiplier for a particular cycle count, and intermediate cycles receive linearly interpolated multiplier values (see the sketch following this parameter list).
bool use_sharp
Controls whether a sharpening schedule is used to adapt the gain parameter over time (gain_sched).
Schedule gain_sched
This is a schedule for the gain multiplier. The effective gain is the gain parameter times the value from this schedule. The schedule works just like the noise_sched described above.
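
To make the schedule semantics concrete, the following is a minimal sketch of the interpolation just described. The GetVal function and the pair-based representation are illustrative stand-ins, not the actual Schedule class:

     #include <cstddef>
     #include <utility>
     #include <vector>

     // Stand-in for the Schedule object: (cycle, multiplier)
     // control points, sorted by cycle count.
     using Schedule = std::vector<std::pair<int, float>>;

     // Multiplier for a given settle cycle, linearly interpolated
     // between the two surrounding control points.
     float GetVal(const Schedule& sched, int cycle) {
       if (sched.empty()) return 1.0f;
       if (cycle <= sched.front().first) return sched.front().second;
       if (cycle >= sched.back().first) return sched.back().second;
       for (std::size_t i = 1; i < sched.size(); ++i) {
         if (cycle <= sched[i].first) {
           float frac = float(cycle - sched[i - 1].first) /
                        float(sched[i].first - sched[i - 1].first);
           return sched[i - 1].second +
                  frac * (sched[i].second - sched[i - 1].second);
         }
       }
       return sched.back().second;
     }

For example, with an annealing schedule of {{0, 1.0}, {50, 0.1}, {100, 0.0}}, the effective variance at cycle 25 would be the var parameter multiplied by the value interpolated halfway between 1.0 and 0.1.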

The basic CsUnitSpec uses the inverse-logistic activation function developed by Movellan and McClelland, 1994. Thus, the change in activation is a function of the difference between the actual net input and the inverse logistic of the current activation value. This formulation is an exact solution to the objective function used in their derivation.
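
As an illustration (not the PDP++ source), this update can be sketched as follows, assuming a 0-1 act_range with a 0.001 tolerance standing in for real_range; the function name and fixed constants are assumptions:

     #include <algorithm>
     #include <cmath>
     #include <random>

     // Sketch of the inverse-logistic update (after Movellan &
     // McClelland, 1994).  Returns the new activation value.
     float CsUpdate(float act, float net, float gain, float step,
                    float noise_var, std::mt19937& rng) {
       const float lo = 0.001f, hi = 0.999f;  // real_range
       act = std::clamp(act, lo, hi);         // keep the log finite
       // Inverse logistic of the current activation, scaled by 1/gain.
       float inv_logit = std::log(act / (1.0f - act)) / gain;
       float da = net - inv_logit;            // da as stored on the unit
       // Noise is scaled by the square root of the step size.
       std::normal_distribution<float> noise(0.0f, std::sqrt(noise_var));
       return std::clamp(act + step * da + std::sqrt(step) * noise(rng),
                         lo, hi);
     }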

The SigmoidUnitSpec uses a simple sigmoidal function of the net input, similar to the formulation of Hopfield, 1984, and the same as the RBp units described in section 14.2.3 RBp Unit Specifications. The time_avg parameter selects whether time averaging (i.e., as a function of the step parameter) is applied to the ACTIVATION or to the NET_INPUT, as in the RBp implementation. As described there, the NET_INPUT option allows units to settle faster.
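
A minimal sketch of the two time-averaging modes, with an illustrative enum and function name and a 0-1 sigmoid assumed:

     #include <cmath>

     enum class TimeAvg { ACTIVATION, NET_INPUT };

     float Logistic(float x) { return 1.0f / (1.0f + std::exp(-x)); }

     // act and net_avg persist across cycles in the caller's state.
     float SigmoidUpdate(float& act, float& net_avg, float net,
                         float gain, float step, TimeAvg mode) {
       if (mode == TimeAvg::ACTIVATION) {
         // Move the activation a fraction 'step' toward its target.
         act += step * (Logistic(gain * net) - act);
       } else {
         // NET_INPUT: time-average the net input, then squash it.
         net_avg += step * (net - net_avg);
         act = Logistic(gain * net_avg);
       }
       return act;
     }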

The BoltzUnitSpec implements a binary activation function like that used in the Boltzmann machine and in the network of Hopfield, 1982. Here, the unit takes on a 0 or 1 value probabilistically as a sigmoidal function of the net input. The gain of this sigmoid can also be represented by its inverse, known as temperature by analogy with similar systems in statistical physics. Thus, the spec has a temp parameter, which is used to update the gain parameter (gain = 1 / temp). Noise is intrinsic to this function, and is not added in any other way.
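
The stochastic update itself is simple; here is a sketch in which the function name is an assumption and the gain is expressed through its inverse, temp:

     #include <cmath>
     #include <random>

     // Unit goes to 1 with probability given by a sigmoid of the
     // net input at the current temperature (gain = 1 / temp).
     int BoltzUpdate(float net, float temp, std::mt19937& rng) {
       float p_on = 1.0f / (1.0f + std::exp(-net / temp));
       std::bernoulli_distribution flip(p_on);
       return flip(rng) ? 1 : 0;  // noise is intrinsic to the sampling
     }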

The IACUnitSpec implements the interactive activation and competition function. This requires two new parameters, rest and decay. If the net input to the unit is positive, the activation is increased by net * (max - act); if it is negative, the activation is decreased by net * (act - min). In either case, the activation also decays toward the resting value by subtracting off a decay * (act - rest) term. IAC also has the option of sending activation to other units only when it is above some threshold (send_thresh). This requires a different way of computing the net input to units, so it must be selected with the use_send_thresh flag and by setting the update_mode in the CsCycle process to SYNC_SENDER_BASED. Pressing ReInit or NewInit at any level of process at or above the CsTrial process will check these settings for consistency and prompt to change them.
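
A sketch of the basic IAC update rule described above (the default bounds here are assumptions; classic IAC uses roughly min = -0.2 and max = 1.0):

     #include <algorithm>

     float IacUpdate(float act, float net, float step,
                     float rest, float decay,
                     float min = -0.2f, float max = 1.0f) {
       // Positive net drives act toward max, negative net toward min.
       float da = (net > 0.0f) ? net * (max - act) : net * (act - min);
       da -= decay * (act - rest);  // always decay toward rest
       return std::clamp(act + step * da, min, max);
     }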

The LinearCsUnitSpec computes activation as a simple linear function of the net input.

The ThreshLinCsUnitSpec computes activation as a threshold-linear function of the net input: net input at or below the threshold gives an activity of 0, and activity is (net - threshold) above that.
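
Both linear variants amount to one-line functions; the names here are illustrative:

     #include <algorithm>

     float LinearAct(float net) { return net; }

     float ThreshLinAct(float net, float threshold) {
       // Zero at or below threshold, (net - threshold) above it.
       return std::max(0.0f, net - threshold);
     }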