distancematrix.generator
Submodules
Package Contents
Classes
Euclidean – Class capable of efficiently calculating parts of the Euclidean distance matrix between two series.
ZNormEuclidean – Class capable of efficiently calculating parts of the z-normalized distance matrix between two series.
FilterGenerator – Helper class that provides a standard way to create an ABC using inheritance.
- class distancematrix.generator.Euclidean(rb_scale_factor=2.0)
Bases:
distancematrix.generator.abstract_generator.AbstractGenerator
Class capable of efficiently calculating parts of the Euclidean distance matrix between two series, where each entry in the distance matrix equals the Euclidean distance between two subsequences, one taken from each series.
This generator can handle streaming data.
- prepare_streaming(self, m, series_window, query_window=None)
Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)
query_window – number of values to keep in memory for query; the vertical axis of the distance matrix will have length (query_window - m + 1). Use None to indicate a self-join.
- Returns
a bound generator that supports streaming
- prepare(self, m, series, query=None)
Create a bound non-streaming generator for the given series and query sequences.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series – 1D array, used as the horizontal axis of a distance matrix
query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join
- Returns
a bound generator
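The definitions above can be made concrete with a brute-force sketch. It does not use the distancematrix package and says nothing about how Euclidean computes its results efficiently; it only illustrates what an entry of the distance matrix means and how the axis lengths follow from m. The function name euclidean_distance_matrix and the sample data are hypothetical.

```python
import math
from collections import deque


def euclidean_distance_matrix(m, series, query=None):
    """Brute-force reference: rows follow query, columns follow series."""
    if query is None:
        query = series  # self-join: compare the series with itself
    n_cols = len(series) - m + 1  # length of the horizontal axis
    n_rows = len(query) - m + 1   # length of the vertical axis
    return [
        [math.dist(query[r:r + m], series[c:c + m]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


series = [0.0, 1.0, 3.0, 2.0, 5.0]
query = [1.0, 3.0, 2.0, 4.0]
dm = euclidean_distance_matrix(3, series, query)
print(len(dm), len(dm[0]))  # rows = 4 - 3 + 1, columns = 5 - 3 + 1

# Streaming view (illustration only): keep the last series_window values,
# so the matrix width stays at series_window - m + 1 once the window is full.
series_window = 4
window = deque(maxlen=series_window)
for value in [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]:
    window.append(value)
stream_dm = euclidean_distance_matrix(3, list(window))  # self-join on the window
```

The nested loops cost time proportional to the number of entries times m, which is the kind of cost a dedicated generator is meant to avoid; the sketch is only useful as a reference for checking small cases by hand.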
- class distancematrix.generator.ZNormEuclidean(noise_std=0.0, rb_scale_factor=2.0)
Bases:
distancematrix.generator.abstract_generator.AbstractGenerator
Class capable of efficiently calculating parts of the z-normalized distance matrix between two series, where each entry in the distance matrix equals the Euclidean distance between two z-normalized (zero mean, unit variance) subsequences, one taken from each series.
This generator can handle streaming data.
Subsequences with standard deviation <= 1e-6 will be treated as flat sequences to avoid problems with numerical stability.
- prepare_streaming(self, m, series_window, query_window=None)
Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)
query_window – number of values to keep in memory for query; the vertical axis of the distance matrix will have length (query_window - m + 1). Use None to indicate a self-join.
- Returns
a bound generator that supports streaming
- prepare(self, m, series, query=None)
Create a bound non-streaming generator for the given series and query sequences.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series – 1D array, used as the horizontal axis of a distance matrix
query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join
- Returns
a bound generator
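As a rough illustration of the z-normalized distance described for ZNormEuclidean, the sketch below normalizes two subsequences and then takes their Euclidean distance. It is plain Python, not the package's implementation. The 1e-6 threshold comes from the class description; mapping a flat subsequence to all zeros is an assumption made here for the sake of the example, and the names znormalize and znorm_distance are hypothetical.

```python
import math
import statistics

FLAT_STD = 1e-6  # threshold quoted in the class description


def znormalize(subsequence):
    """Rescale to zero mean and unit (population) variance."""
    mean = statistics.fmean(subsequence)
    std = statistics.pstdev(subsequence)
    if std <= FLAT_STD:
        # Assumption for this sketch: a flat subsequence becomes all zeros,
        # which avoids dividing by a (near-)zero standard deviation.
        return [0.0] * len(subsequence)
    return [(x - mean) / std for x in subsequence]


def znorm_distance(a, b):
    """Euclidean distance between the z-normalized versions of a and b."""
    return math.dist(znormalize(a), znormalize(b))


same_shape = znorm_distance([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
flat_vs_ramp = znorm_distance([5.0, 5.0, 5.0], [1.0, 2.0, 3.0])
flat_vs_flat = znorm_distance([5.0, 5.0, 5.0], [7.0, 7.0, 7.0])
```

Under these assumptions, two subsequences with the same shape but different offset and scale end up at a distance of approximately zero, and a flat subsequence sits at distance sqrt(m) from any non-flat one, because a z-normalized subsequence of length m has a squared norm of m.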
- class distancematrix.generator.FilterGenerator(generator, invalid_data_function=is_not_finite, rb_scale_factor=2.0)
Bases:
distancematrix.generator.abstract_generator.AbstractGenerator
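Helper class that provides a standard way to create an ABC using inheritance. (This text is the docstring of Python's abc.ABC and does not describe FilterGenerator itself. Judging by the constructor signature, the class wraps another generator and uses invalid_data_function, by default is_not_finite, to identify invalid data.)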
- prepare_streaming(self, m, series_window, query_window=None)
Create a bound generator that supports streaming data. The generator will need to receive data before any distances can be calculated.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series_window – number of values to keep in memory for series; the horizontal axis of the distance matrix will have length (series_window - m + 1)
query_window – number of values to keep in memory for query; the vertical axis of the distance matrix will have length (query_window - m + 1). Use None to indicate a self-join.
- Returns
a bound generator that supports streaming
- prepare(self, m, series, query=None)
Create a bound non-streaming generator for the given series and query sequences.
- Parameters
m – the size of the subsequences used to calculate distances between series and query
series – 1D array, used as the horizontal axis of a distance matrix
query – 1D array, used as the vertical axis of a distance matrix, or None to indicate a self-join
- Returns
a bound generator
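This page does not say what FilterGenerator does with invalid data, so the following is only a sketch of one plausible reading of its signature: values flagged by an invalid-data function make every subsequence that contains them invalid, and distances involving an invalid subsequence are reported as infinity. That behavior, the single-argument form of the invalid-data function, and all names below (is_not_finite_mask, invalid_subsequences, filtered_distance_matrix) are assumptions for illustration, not the package's API.

```python
import math


def is_not_finite_mask(data):
    """True for every value that is NaN or infinite (hypothetical stand-in)."""
    return [not math.isfinite(x) for x in data]


def invalid_subsequences(data, m, invalid_data_function=is_not_finite_mask):
    """True for each length-m subsequence holding at least one invalid value."""
    mask = invalid_data_function(data)
    return [any(mask[i:i + m]) for i in range(len(data) - m + 1)]


def filtered_distance_matrix(m, series, query):
    """Euclidean distances, with infinity wherever invalid data is involved."""
    bad_cols = invalid_subsequences(series, m)  # horizontal axis
    bad_rows = invalid_subsequences(query, m)   # vertical axis
    return [
        [
            math.inf if bad_rows[r] or bad_cols[c]
            else math.dist(query[r:r + m], series[c:c + m])
            for c in range(len(bad_cols))
        ]
        for r in range(len(bad_rows))
    ]


series = [1.0, 2.0, float("nan"), 4.0, 5.0, 6.0]
query = [1.0, 2.0, 3.0]
fdm = filtered_distance_matrix(2, series, query)
```

With m = 2, the single NaN in series invalidates the two subsequences that overlap it, so two columns of fdm are infinite while the remaining entries are ordinary Euclidean distances.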