pyhctsa.operations.symbolic.surprise

pyhctsa.operations.symbolic.surprise(y, what_prior='dist', memory=0.2, num_groups=3, coarse_grain_method='quantile', num_iters=500, random_seed=0)

Quantifies how surprised you would be by the next data point, given recent memory.

Coarse-grains the time series into a sequence of symbols drawn from an alphabet of a given size (num_groups), then quantifies the surprise of a process whose local memory holds the previous memory values of the symbolic string. For each sample, the ‘information gained’, log(1/p), is estimated using expectations calculated from the previous memory samples.
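As an illustration of the information-gain estimate described above, here is a minimal sketch for the simplest case, where p is taken as the relative frequency of the current symbol within the preceding memory samples. The function name info_gain_dist and the treatment of an unseen symbol (p = 0) as infinite surprise are assumptions made for this example, not the library's confirmed behaviour.

```python
import math


def info_gain_dist(symbols, i, memory):
    """Information gain log(1/p) for symbols[i] (hypothetical helper).

    p is estimated as the relative frequency of symbols[i] among the
    `memory` samples that immediately precede index i.
    """
    if i < memory:
        raise ValueError("need at least `memory` samples before index i")
    window = symbols[i - memory:i]
    p = window.count(symbols[i]) / memory
    # Assumption: a symbol never seen in the window gives infinite surprise.
    return math.inf if p == 0 else math.log(1 / p)


symbols = [1, 2, 1, 1, 3, 1, 2]
print(info_gain_dist(symbols, 4, 4))  # symbol 3 is absent from [1, 2, 1, 1]
print(info_gain_dist(symbols, 5, 4))  # symbol 1 fills half of [2, 1, 1, 3]
```

A common symbol yields a small gain and a rare one a large gain, which is the sense in which the measure captures surprise.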

Parameters:
y : array-like

The input time series.

what_prior : {'dist', 'T1', 'T2'}, optional

The type of information to store in memory:

  • 'dist': the values of the time series in the previous memory samples (default),

  • 'T1': the one-point transition probabilities in the previous memory samples,

  • 'T2': the two-point transition probabilities in the previous memory samples.
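One possible reading of the one-point transition option can be sketched as follows: the probability of the current symbol is estimated from what followed earlier occurrences of the preceding symbol inside the memory window. The helper name t1_probability and the choice to return 0 when the preceding symbol has no earlier follower in the window are assumptions for illustration only.

```python
def t1_probability(symbols, i, memory):
    """Estimate p(symbols[i] | symbols[i-1]) from the memory window.

    Hypothetical helper: counts which symbols followed each earlier
    occurrence of symbols[i-1] within the preceding `memory` samples.
    """
    if i < memory:
        raise ValueError("need at least `memory` samples before index i")
    window = symbols[i - memory:i]
    prev, cur = symbols[i - 1], symbols[i]
    # Symbols that came directly after `prev` inside the window.
    followers = [window[k + 1] for k in range(len(window) - 1)
                 if window[k] == prev]
    if not followers:
        return 0.0  # assumption: no evidence gives zero probability
    return followers.count(cur) / len(followers)


symbols = [1, 2, 1, 2, 1, 3, 1, 2]
# Window is [2, 1, 2, 1, 3, 1]; symbol 1 was followed by 2 once and 3 once.
print(t1_probability(symbols, 7, 6))
```

A two-point version would condition on the two preceding symbols instead of one.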

memory : float, optional

The memory length: either a number of samples, or a proportion of the time-series length if between 0 and 1. Default is 0.2.
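The two interpretations of memory can be sketched as below. The helper resolve_memory is hypothetical, and rounding the proportion to the nearest whole sample (with a floor of one sample) is an assumption; the library may round differently.

```python
def resolve_memory(memory, n):
    """Convert `memory` to a sample count for a series of length n.

    Hypothetical helper: values strictly between 0 and 1 are read as a
    proportion of n; anything else is read as a number of samples.
    """
    if 0 < memory < 1:
        # Assumption: round to the nearest sample, keeping at least one.
        return max(1, round(memory * n))
    return int(memory)


print(resolve_memory(0.2, 1000))  # proportion of the series length
print(resolve_memory(50, 1000))   # explicit number of samples
```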

num_groups : int, optional

The number of groups to coarse-grain the time series into. Default is 3.

coarse_grain_method : {'quantile', 'updown', 'embed2quadrants'}, optional

The coarse-graining or symbolization method:

  • 'quantile': equiprobable alphabet by value of each time-series datapoint (default),

  • 'updown': equiprobable alphabet by incremental changes in the time-series values,

  • 'embed2quadrants': 4-letter alphabet given by the quadrant in which each data point resides in a 2D embedding space.
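To make the idea of an equiprobable alphabet concrete, here is a minimal sketch of quantile-style symbolization that assigns each point a symbol according to its rank. The function quantile_symbols, the rank-based assignment, and the labels 1 to num_groups are assumptions for this example; the library may compute quantile thresholds and handle ties differently.

```python
def quantile_symbols(y, num_groups=3):
    """Map each value of y to a symbol in 1..num_groups by rank.

    Hypothetical helper: each symbol covers an (almost) equal share of
    the data points, so the resulting alphabet is equiprobable.
    """
    n = len(y)
    # Indices of y ordered from the smallest value to the largest.
    order = sorted(range(n), key=lambda k: y[k])
    symbols = [0] * n
    for rank, k in enumerate(order):
        symbols[k] = rank * num_groups // n + 1
    return symbols


y = [0.1, 2.5, -1.0, 0.7, 3.2, -0.4]
print(quantile_symbols(y))  # [2, 3, 1, 2, 3, 1]
```

With six points and three groups, each symbol is used exactly twice: the two smallest values map to 1, the two middle values to 2, and the two largest to 3.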

num_iters : int, optional

The number of iterations to repeat the procedure for. Default is 500.

random_seed : int, optional

Whether (and how) to reset the random seed. Default is 0.

Returns:

Summaries of the series of information gains.

Return type:

dict