distancematrix.generator

Package Contents

Classes

Euclidean

Class capable of efficiently calculating parts of the Euclidean distance matrix between two series.

ZNormEuclidean

Class capable of efficiently calculating parts of the z-normalized distance matrix between two series.

FilterGenerator

Helper class that provides a standard way to create an ABC using inheritance.

class distancematrix.generator.Euclidean(rb_scale_factor=2.0)

Bases: distancematrix.generator.abstract_generator.AbstractGenerator

Class capable of efficiently calculating parts of the Euclidean distance matrix between two series, where each entry in the distance matrix equals the Euclidean distance between two subsequences, one from each series.
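To make the meaning of one matrix entry concrete, here is a minimal brute-force sketch in plain Python. It is not the library's implementation, which is described as efficient; `euclidean_distance_matrix` is a hypothetical helper, and the layout (one row per query subsequence, one column per series subsequence) follows the axis description given for `prepare`.

```python
import math

def euclidean_distance_matrix(series, query, m):
    """Naive reference: entry [i][j] is the distance between
    query[i:i+m] and series[j:j+m]. Hypothetical helper, not library code."""
    n_cols = len(series) - m + 1  # horizontal axis: series subsequences
    n_rows = len(query) - m + 1   # vertical axis: query subsequences
    matrix = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            squared = sum((query[i + k] - series[j + k]) ** 2 for k in range(m))
            row.append(math.sqrt(squared))
        matrix.append(row)
    return matrix

series = [0.0, 1.0, 2.0, 3.0, 4.0]
query = [1.0, 2.0, 4.0]
dm = euclidean_distance_matrix(series, query, m=2)
# dm has 2 rows (query subsequences) and 4 columns (series subsequences).
```

The cost of this version grows with the product of both lengths and m, which is the kind of work a dedicated generator is meant to avoid.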

This generator can handle streaming data.

prepare_streaming(self, m, series_window, query_window=None)

Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)

  • query_window – number of values to keep in memory for query, or None to indicate a self-join; the vertical axis of the distance matrix will have length (query_window - m + 1)

Returns

a bound generator that supports streaming
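The axis lengths stated in the parameter descriptions can be checked with a small calculation. This is a sketch only: `distance_matrix_shape` is a hypothetical helper, and it assumes that for a self-join (query_window is None) the vertical axis has the same length as the horizontal one.

```python
def distance_matrix_shape(m, series_window, query_window=None):
    """Return (rows, columns) of the distance matrix once the windows are full.
    Hypothetical helper for illustration, not part of the library."""
    if query_window is None:
        # Assumption: a self-join compares the series window with itself.
        query_window = series_window
    if m < 1 or m > series_window or m > query_window:
        raise ValueError("m must be between 1 and the window size")
    rows = query_window - m + 1      # vertical axis
    columns = series_window - m + 1  # horizontal axis
    return (rows, columns)

shape_ab = distance_matrix_shape(10, series_window=100, query_window=50)
shape_self = distance_matrix_shape(10, series_window=100)
```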

prepare(self, m, series, query=None)

Create a bound non-streaming generator for the given series and query sequences.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series – 1D array, used as the horizontal axis of a distance matrix

  • query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join

Returns

a bound generator
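As an illustration of what `query=None` means, the following plain-Python sketch treats a missing query as a self-join, so the series is compared against itself. `naive_matrix` is a hypothetical stand-in that mirrors the parameter order of `prepare`; it does not call the library.

```python
import math

def subsequence_distance(a, b):
    """Euclidean distance between two equal-length subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def naive_matrix(m, series, query=None):
    """Hypothetical stand-in: rows follow query, columns follow series."""
    if query is None:
        query = series  # self-join
    columns = [series[j:j + m] for j in range(len(series) - m + 1)]
    rows = [query[i:i + m] for i in range(len(query) - m + 1)]
    return [[subsequence_distance(r, c) for c in columns] for r in rows]

series = [3.0, 1.0, 4.0, 1.0, 5.0]
self_join = naive_matrix(3, series)
# A self-join yields a square, symmetric matrix with a zero main diagonal.
```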

class distancematrix.generator.ZNormEuclidean(noise_std=0.0, rb_scale_factor=2.0)

Bases: distancematrix.generator.abstract_generator.AbstractGenerator

Class capable of efficiently calculating parts of the z-normalized distance matrix between two series, where each entry in the distance matrix equals the Euclidean distance between two z-normalized (zero mean and unit variance) subsequences, one from each series.

This generator can handle streaming data.

Subsequences with standard deviation <= 1e-6 will be treated as flat sequences to avoid problems with numerical stability.
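The z-normalization and the flat-sequence threshold can be sketched as follows. The 1e-6 threshold comes from the text above; mapping a flat subsequence to all zeros is an assumption made for this illustration, since the text only says such subsequences are treated as flat. `znormalize` and `znorm_distance` are hypothetical helpers.

```python
import math

FLAT_STD_THRESHOLD = 1e-6  # threshold stated in the class description

def znormalize(subseq):
    """Return (normalized values, is_flat) for one subsequence."""
    m = len(subseq)
    mean = sum(subseq) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in subseq) / m)
    if std <= FLAT_STD_THRESHOLD:
        # Assumption: represent a flat subsequence as all zeros
        # instead of dividing by a near-zero standard deviation.
        return [0.0] * m, True
    return [(x - mean) / std for x in subseq], False

def znorm_distance(a, b):
    """Euclidean distance between the z-normalized subsequences."""
    za, _ = znormalize(a)
    zb, _ = znormalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))

same_shape = znorm_distance([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
both_flat = znorm_distance([5.0, 5.0, 5.0], [5.0, 5.0, 5.0])
flat_vs_ramp = znorm_distance([5.0, 5.0, 5.0], [1.0, 2.0, 3.0])
```

Under this assumption, two subsequences with the same shape but different offset and scale are at distance close to zero, two flat subsequences are at distance zero, and a flat subsequence is at distance sqrt(m) from any non-flat one.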

prepare_streaming(self, m, series_window, query_window=None)

Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)

  • query_window – number of values to keep in memory for query, or None to indicate a self-join; the vertical axis of the distance matrix will have length (query_window - m + 1)

Returns

a bound generator that supports streaming
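The idea of a fixed-size window that must receive data before any distances exist can be sketched with a bounded deque. `SlidingWindow` is a hypothetical class written for this illustration; it is not the bound generator returned by `prepare_streaming` and does not show the library's methods.

```python
from collections import deque

class SlidingWindow:
    """Keeps at most `window` values; hypothetical, for illustration only."""

    def __init__(self, window, m):
        self.m = m
        # A bounded deque discards the oldest values once it is full.
        self.values = deque(maxlen=window)

    def append(self, new_values):
        self.values.extend(new_values)

    def num_subsequences(self):
        # No subsequence of length m exists until m values have arrived.
        return max(0, len(self.values) - self.m + 1)

w = SlidingWindow(window=5, m=3)
counts = [w.num_subsequences()]
for chunk in ([1.0, 2.0], [3.0], [4.0, 5.0, 6.0]):
    w.append(chunk)
    counts.append(w.num_subsequences())
```

Once the window is full, the count stays at (window - m + 1), which matches the axis length given in the parameter description.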

prepare(self, m, series, query=None)

Create a bound non-streaming generator for the given series and query sequences.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series – 1D array, used as the horizontal axis of a distance matrix

  • query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join

Returns

a bound generator

class distancematrix.generator.FilterGenerator(generator, invalid_data_function=is_not_finite, rb_scale_factor=2.0)

Bases: distancematrix.generator.abstract_generator.AbstractGenerator

Helper class that provides a standard way to create an ABC using inheritance.
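The description above does not say what the class does beyond its signature, which takes a wrapped `generator` and an `invalid_data_function` defaulting to `is_not_finite`. The sketch below is only a guess at the underlying idea, assuming that a subsequence counts as invalid when any of its values is flagged. `is_not_finite_sketch` and `invalid_subsequences` are hypothetical names and do not reproduce the library's functions.

```python
import math

def is_not_finite_sketch(values):
    """Per-value flags: True where a value is NaN or infinite.
    Hypothetical counterpart of the default named in the signature."""
    return [not math.isfinite(v) for v in values]

def invalid_subsequences(values, m, invalid_data_function=is_not_finite_sketch):
    """Flag each length-m subsequence that contains at least one invalid value.
    Assumption for illustration, not confirmed by the documentation."""
    flags = invalid_data_function(values)
    return [any(flags[i:i + m]) for i in range(len(values) - m + 1)]

series = [1.0, 2.0, float("nan"), 4.0, 5.0, 6.0]
mask = invalid_subsequences(series, m=2)
```

A single invalid value affects every subsequence that overlaps it, so one NaN flags up to m consecutive subsequences.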

prepare_streaming(self, m, series_window, query_window=None)

Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)

  • query_window – number of values to keep in memory for query, or None to indicate a self-join; the vertical axis of the distance matrix will have length (query_window - m + 1)

Returns

a bound generator that supports streaming

prepare(self, m, series, query=None)

Create a bound non-streaming generator for the given series and query sequences.

Parameters
  • m – the size of the subsequences used to calculate distances between series and query

  • series – 1D array, used as the horizontal axis of a distance matrix

  • query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join

Returns

a bound generator