The Statistic Process is an extension of the basic Process object that computes values and makes them available for recording and display in logs. The basic Stat object defines an interface for computing and reporting data. The schedule processes use this interface to supervise the running of stats and the reporting of their data to the logs.
Each statistic object can operate in one of two capacities. The first is as the original computer (or collector) of some kind of data. For example, a squared-error statistic (SE_Stat) knows how to go through a network and compute the squared difference between target values and actual activations. Typically, this would be performed after every event is presented to the network, since that is when the relevant information is available in the state variables of the network.
The second capacity of a statistic is as an aggregator of data computed by another statistic. This is needed, for example, to compute the sum of the squared errors over all of the trials in an epoch. When operating in aggregation mode, a statistic works from the data in the statistic it is aggregating from, instead of going out and collecting data from the network itself.
Typically, the statistic and its aggregators are all of the same type (e.g., they are all SE_Stats), and the aggregated values appear in the same member variable in which the originally computed value appears. Thus, to set a stopping criterion on an aggregated stat value, for example, look in that same member variable on the aggregator.
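The two capacities described above can be illustrated with a minimal sketch. It is not the actual SE_Stat class: the names SketchSEStat, Unit, se, from, Init, Collect, and Aggregate are all hypothetical, and the aggregator here simply sums. The point is only that the collector and its aggregator are the same type and keep their result in the same member.

```cpp
#include <vector>

// Hypothetical stand-in for the state of one output unit after an event.
struct Unit {
    double target;
    double activation;
};

// Simplified squared-error stat (an assumption, not the real SE_Stat).
struct SketchSEStat {
    double se = 0.0;                     // computed OR aggregated value lives here
    const SketchSEStat* from = nullptr;  // non-null: the stat being aggregated from

    // Reset at the start of this stat's processing level (e.g., each epoch).
    void Init() { se = 0.0; }

    // Collector capacity: compute squared error directly from the network.
    void Collect(const std::vector<Unit>& output_units) {
        se = 0.0;
        for (const Unit& u : output_units) {
            const double diff = u.target - u.activation;
            se += diff * diff;
        }
    }

    // Aggregator capacity: fold in the lower-level stat's current value.
    void Aggregate() {
        if (from != nullptr) se += from->se;
    }
};
```

In use, a trial-level stat would call Collect after each event, and an epoch-level stat with from pointing at it would call Aggregate after each trial, so that the epoch-level se holds the running sum over trials.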
Each statistic knows how to create a series of aggregators all the way
up the processing hierarchy. This is done with the CreateAggregates
function on the stat, which is available as an option when a statistic
is created. Thus, one always creates a statistic at the processing
level where it will do the original computation. If aggregates of this
value are needed at higher levels, make sure the CreateAggregates field
is checked when the stat is created, or call the function yourself later
(e.g., from the Actions menu of a stat edit dialog). You can also call
UpdateAllAggregators to make sure the aggregators' names reflect any
changes (i.e., in the layer or the network aggregation operator), and
FindAggregator to find the immediate aggregator of the current stat.
It is recommended that you use the NewStat menu from the .processes menu of the project to create a new statistic, or use the Project Viewer (see section 9.2 The Project Viewer). This will bring up a dialog with the default options for where to create the stat (i.e., at what processing level), as suggested by the stat itself (each stat knows where it should do its original computation).
There are several different kinds of aggregation operators that can be
used to aggregate information over processing levels, including summing,
averaging, etc. The operator is selected as part of the time_agg member
of the statistic. See below for descriptions of the different operators.
Note that all aggregation statistics reside in the loop_stats group of
the schedule processes, since they need to be run after every loop of
the lower level statistic to collect its values and aggregate them over
time.
In addition to aggregating information over levels of processing,
statistics often aggregate information over objects in the network.
Thus, for example, the SE_Stat typically computes the sum of all the
squared error terms over the output units in the network. The
particular form of aggregation that a stat performs over network objects
is controlled by the net_agg member. Thus, it is possible to have the
SE_Stat compute the average error over output units instead of the sum
by changing this variable.
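As a rough illustration of how the same kind of operator can apply over network objects (net_agg) and over time (time_agg), the sketch below uses a hypothetical AggOp enum with only the two operators named in the text, summing and averaging. The names AggOp, Aggregate, and SquaredErrors are assumptions made for this example and are not taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical operator set; only summing and averaging are named in the text.
enum class AggOp { Sum, Avg };

// Combine a list of values with the chosen operator (0.0 for an empty list).
double Aggregate(AggOp op, const std::vector<double>& vals) {
    if (vals.empty()) return 0.0;
    double total = 0.0;
    for (double v : vals) total += v;
    return op == AggOp::Avg ? total / static_cast<double>(vals.size()) : total;
}

// One squared-error term per output unit for a single event.
std::vector<double> SquaredErrors(const std::vector<double>& targets,
                                  const std::vector<double>& acts) {
    std::vector<double> out;
    const std::size_t n =
        targets.size() < acts.size() ? targets.size() : acts.size();
    for (std::size_t i = 0; i < n; ++i) {
        const double d = targets[i] - acts[i];
        out.push_back(d * d);
    }
    return out;
}
```

Calling Aggregate(AggOp::Sum, SquaredErrors(targets, acts)) plays the role of the default sum over output units, and switching to AggOp::Avg corresponds to changing net_agg to an average. Applying Aggregate again to the list of per-trial results plays the role of time_agg at the next level up.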
Finally, the name of a statistic, both as recorded in the log and as it
appears in the name field, is automatically set to reflect the kinds of
aggregation being performed. The first three-letter prefix (if there
are two) reflects the time_agg operator. The second three-letter prefix
(or the only one) reflects the net_agg operator. In addition, the layer
name is included in the name if the layer pointer is non-NULL. The stat
name field is not automatically set if it does not contain the type name
of the stat, so if you want to give a stat a custom name, don't include
the type name in it.
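The naming rules above can be sketched as follows. The exact format the software produces is not given here, so the underscore separator, the position of the layer name, and the helper names AutoName and UpdateName are all assumptions made for illustration. Only the order of the two prefixes and the "skip if the type name is absent" rule come from the text.

```cpp
#include <string>

// Hypothetical helper: builds "<time>_<net>_<layer>_<type>".
// Separator and layer position are assumptions, not the documented format.
std::string AutoName(const std::string& time_prefix,  // empty for the original collector
                     const std::string& net_prefix,
                     const std::string& layer_name,   // empty when the layer pointer is NULL
                     const std::string& type_name) {
    std::string name;
    if (!time_prefix.empty()) name += time_prefix + "_";
    name += net_prefix + "_";
    if (!layer_name.empty()) name += layer_name + "_";
    name += type_name;
    return name;
}

// Regenerate the name only if the current one still contains the type name;
// a custom name without the type name is left untouched.
std::string UpdateName(const std::string& current,
                       const std::string& time_prefix,
                       const std::string& net_prefix,
                       const std::string& layer_name,
                       const std::string& type_name) {
    if (current.find(type_name) == std::string::npos) return current;
    return AutoName(time_prefix, net_prefix, layer_name, type_name);
}
```

Under these assumptions, a stat whose name was set by hand to something like my_error would keep that name when its aggregation operators change, while one still carrying the type name would be renamed.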