repytah

repytah is a Python package that builds aligned hierarchies, a representation for sequential data streams.

Citing repytah

Please cite repytah using the following:

  1. Jia et al., repytah: A Python package that builds aligned hierarchies for sequential data streams. Python package version 0.1.1, 2023. [Online]. Available: [https://github.com/smith-tinkerlab/repytah](https://github.com/smith-tinkerlab/repytah).

Why repytah

Sequential data streams often have repeated elements that build on each other, creating hierarchies. Therefore, the goal of the Python package repytah is to extract these repetitions and their relationships to each other in order to form aligned hierarchies, a low-dimensional representation for sequential data that synthesizes and aligns all possible hierarchies of the meaningful repetitive structure in a sequential data stream along a common time axis.

An example of sequential data streams is music-based data streams, such as songs. In addition to visualizing the structure of a song, aligned hierarchies can be embedded into a classification space with a natural notion of distance and post-processed to narrow the exploration of a music-based data stream to certain lengths of structure, or to address numerous MIR tasks, including the cover song task, the segmentation task, and the chorus detection task.

For future work, based on the aligned hierarchies, we can build aligned sub-hierarchies. Aligned sub-hierarchies are a collection of individual aligned hierarchies for each repeated pattern in a given sequential data. With aligned sub-hierarchies, we can better deal with tasks that require a degree of flexibility. For example, we can implement the aligned sub-hierarchies to find all versions of the same piece of music based on a given version of the recording, which might involve different expressions including the number of repeats.

Installing from conda package-manager or from pip

Anaconda

The safest way to install repytah is through conda. In terminal, navigate to the directory that contains environment.yml and execute:

conda env create -f environment.yml
conda activate repytahenv
conda install -c conda-forge repytah

PyPI

In terminal, navigate to the directory that contains requirements.txt and execute the following command:

pip install -r requirements.txt
pip install repytah

Warning: Might encounter problems if Python version is not >= 3.7, <3.11.

Install from pip or conda

PyPI

The latest stable release is available on PyPI, and you can install it by running:

pip install repytah

Anaconda

If you use Anaconda, you can install the package using conda-forge:

conda install -c conda-forge repytah

Source

To build repytah from source, you need to first clone the repo using git clone git@github.com:smith-tinkerlab/repytah.git. Then build with python setup.py build. Then, to install repytah, say python setup.py install.

Alternatively, you can download or clone the repository and use pip to handle dependencies:

unzip repytah.zip
pip install -e repytah

or:

git clone https://github.com/smith-tinkerlab/repytah.git
pip install -e repytah-main

By calling pip list you should see repytah now as an installed package:

repytah (0.x.x, /path/to/repytah)

Utilities

utilities.py

This module, when imported, allows search.py, transform.py and assemble.py in the repytah package to run smoothly.

The module contains the following functions:

  • create_sdm

    Creates a self-dissimilarity matrix; this matrix is found by creating audio shingles from feature vectors, and finding cosine distance between shingles.

  • find_initial_repeats

    Finds all diagonals present in thresh_mat, removing each diagonal as it is found.

  • stretch_diags

    Fills out diagonals in binary self-dissimilarity matrix from diagonal starts and lengths.

  • add_annotations

    Adds annotations to each pair of repeated structures according to their length and order of occurrence.

  • __find_song_pattern

    Stitches information about repeat locations from thresh_diags matrix into a single row.

  • reconstruct_full_block

    Creates a record of when pairs of repeated structures occur, from the first beat in the song to the last beat of the song. Pairs of repeated structures are marked with 1’s.

  • get_annotation_lst

    Gets one annotation marker vector, given vector of lengths key_lst.

  • get_y_labels

    Generates the labels for visualization.

  • reformat [Only used for creating test examples]

    Transforms a binary matrix representation of when repeats occur in a song into a list of repeated structures detailing the length and occurrence of each repeat.

repytah.utilities.create_sdm(fv_mat, num_fv_per_shingle)

Creates self-dissimilarity matrix; this matrix is found by creating audio shingles from feature vectors, and finding the cosine distance between shingles.

Parameters
  • fv_mat (np.ndarray) – Matrix of feature vectors where each column is a time step and each row includes feature information i.e. an array of 144 columns/beats and 12 rows corresponding to chroma values.

  • num_fv_per_shingle (int) – Number of feature vectors per audio shingle.

Returns

Self-dissimilarity matrix with paired cosine distances between shingles.

Return type

self_dissim_mat (np.ndarray)

repytah.utilities.find_initial_repeats(thresh_mat, bandwidth_vec, thresh_bw)

Looks for the largest repeated structures in thresh_mat. Finds all repeated structures, represented as diagonals present in thresh_mat, and then stores them with their start/end indices and lengths in a list. As each diagonal is found, they are removed to avoid identifying repeated sub-structures.

Parameters
  • thresh_mat (np.ndarray[int]) – Thresholded matrix that we extract diagonals from.

  • bandwidth_vec (np.ndarray[1D,int]) – Array of lengths of diagonals to be found. Should be 1, 2, 3,…, n where n is the number of timesteps.

  • thresh_bw (int) – One less than smallest allowed repeat length.

Returns

List of pairs of repeats that correspond to diagonals in thresh_mat.

Return type

all_lst (np.ndarray[int])

repytah.utilities.stretch_diags(thresh_diags, band_width)

Creates a binary matrix with full length diagonals from a binary matrix of diagonal starts and length of diagonals.

Parameters
  • thresh_diags (np.ndarray) – Binary matrix where entries equal to 1 signals the existence of a diagonal.

  • band_width (int) – Length of encoded diagonals.

Returns

Logical matrix with diagonals of length band_width starting at each entry prescribed in thresh_diag.

Return type

stretch_diag_mat (np.ndarray[bool])

repytah.utilities.add_annotations(input_mat, song_length)

Adds annotations to the pairs of repeats in input_mat.

Parameters
  • input_mat (np.ndarray) – List of pairs of repeats. The first two columns refer to the first repeat of the pair. The third and fourth columns refer to the second repeat of the pair. The fifth column refers to the repeat lengths. The sixth column contains any previous annotations, which will be removed.

  • song_length (int) – Number of audio shingles in the song.

Returns

List of pairs of repeats with annotations marked.

Return type

anno_list (np.ndarray)

repytah.utilities.__find_song_pattern(thresh_diags)

Stitches information from thresh_diags matrix into a single row, song_pattern, that shows the time steps containing repeats; From the full matrix that decodes repeat beginnings (thresh_diags), the locations, or beats, where these repeats start are found and encoded into the song_pattern array.

Parameters

thresh_diags (np.ndarray) – Binary matrix with 1 at the start of each repeat pair (SI,SJ) and 0 elsewhere. WARNING: Must be symmetric.

Returns

Row where each entry represents a time step and the group that time step is a member of.

Return type

song_pattern (np.ndarray)

repytah.utilities.reconstruct_full_block(pattern_mat, pattern_key)

Creates a record of when pairs of repeated structures occur, from the first beat in the song to the end. This record is a binary matrix with a block of 1’s for each repeat encoded in pattern_mat whose length is encoded in pattern_key.

Parameters
  • pattern_mat (np.ndarray) – Binary matrix with 1’s where repeats begin and 0’s otherwise.

  • pattern_key (np.ndarray) – Vector containing the lengths of the repeats encoded in each row of pattern_mat.

Returns

Binary matrix representation for pattern_mat with blocks of 1’s equal to the length’s prescribed in pattern_key.

Return type

pattern_block (np.ndarray)

repytah.utilities.get_annotation_lst(key_lst)

Creates one annotation marker vector, given vector of lengths key_lst.

Parameters

key_lst (np.ndarray[int]) – Array of lengths in ascending order.

Returns

Array of one possible set of annotation markers for key_lst.

Return type

anno_lst_out (np.ndarray[int])

repytah.utilities.get_y_labels(width_vec, anno_vec)

Generates the labels for visualization with width_vec and anno_vec.

Parameters
  • width_vec (np.ndarray[int]) – Vector of widths for a visualization.

  • anno_vec (np.ndarray[int]) – Array of annotations for a visualization.

Returns

Labels for the y-axis of a visualization. Each label contains the width and annotation number of an essential structure component.

Return type

y_labels (np.ndarray[str])

repytah.utilities.reformat(pattern_mat, pattern_key)

Transforms a binary array with 1’s where repeats start and 0’s otherwise into a list of repeated structures. This list consists of information about the repeats including length, when they occur and when they end.

Every row has a pair of repeated structure. The first two columns are the time steps of when the first repeat of a repeated structure start and end. Similarly, the second two columns are the time steps of when the second repeat of a repeated structure start and end. The fifth column is the length of the repeated structure.

Reformat is not used in the main process for creating the aligned hierarchies. It is helpful when writing example inputs for the tests.

Parameters
  • pattern_mat (np.ndarray) – Binary array with 1’s where repeats start and 0’s otherwise.

  • pattern_key (np.ndarray) – Array with the lengths of each repeated structure in pattern_mat.

Returns

Array with the time steps of when the pairs of repeated structures start and end organized.

Return type

info_mat (np.ndarray)

Transform

transform.py

This module contains functions that transform matrix inputs into different forms that are of use in bigger functions where they are called. These functions focus mainly on overlapping repeated structures and annotation markers.

The module contains the following functions:
  • remove_overlaps

    Removes any pairs of repeats with the same length and annotation marker where at least one pair of repeats overlap in time.

  • __create_anno_remove_overlaps

    Turns rows of repeats into marked rows with annotation markers for the start indices and zeroes otherwise. After removing the annotations that have overlaps, the function creates separate arrays for annotations with overlaps and annotations without overlaps. Finally, the annotation markers are checked and fixed if necessary.

  • __separate_anno_markers

    Expands vector of non-overlapping repeats into a matrix representation. The matrix representation is a visual record of where all of the repeats in a song start and end.

repytah.transform.remove_overlaps(input_mat, song_length)

Removes any pairs of repeat length and specific annotation marker where there exists at least one pair of repeats that overlap in time.

Parameters
  • input_mat (np.ndarray[int]) – List of pairs of repeats with annotations marked. The first two columns refer to the first repeat or the pair, the second two refer to the second repeat of the pair, the fifth column refers to the length of the repeats, and the sixth column contains the annotation markers.

  • song_length (int) – Number of audio shingles.

Returns

A tuple (lst_no_overlaps, matrix_no_overlaps, key_no_overlaps, annotations_no_overlaps, all_overlap_lst). All variables have data type np.ndarray[int].

lst_no_overlaps is a list of pairs of repeats with annotations marked where all the repeats of a given length and with a specific annotation marker do not overlap in time.

matrix_no_overlaps is a matrix representation of lst_no_overlaps with one row for each group of repeats.

key_no_overlaps is a vector containing the lengths of the repeats encoded in each row of matrix_no_overlaps.

annotations_no_overlaps is a vector containing the annotation markers of the repeats encoded in each row of matrix_no_overlaps.

all_overlap_lst is a list of pairs of repeats where for each pair of repeat length and specific annotation marker, there exists at least one pair of repeats that do overlap in time.

repytah.transform.__create_anno_remove_overlaps(k_mat, song_length, band_width)

Turns k_mat into marked rows with annotation markers for the start indices and zeroes otherwise. After removing the annotations that have overlaps, the function outputs k_lst_out which only contains rows that have no overlaps, then takes the annotations that have overlaps from k_lst_out and puts them in overlap_lst. Lastly, it checks if the proper sequence of annotation markers was given and fix them if necessary.

Parameters
  • k_mat (np.ndarray) – List of pairs of repeats of length 1 with annotations marked. The first two columns refer to the first repeat of the pair, the second two refer to the second repeat of the pair, the fifth column refers to the length of the repeats, and the sixth column contains the annotation markers.

  • song_length (int) – Number of audio shingles.

  • band_width (int) – Length of repeats encoded in k_mat.

Returns

A tuple (pattern_row, k_lst_out, overlap_lst) where all variables have data type np.ndarray.

pattern_row marks where non-overlapping repeats occur, marking start indices with annotation markers and 0’s otherwise.

k_lst_out is a list of pairs of repeats of length band_width that contain no overlapping repeats with annotations marked.

overlap_lst is a list of pairs of repeats of length band_width that contain overlapping repeats with annotations marked.

repytah.transform.__separate_anno_markers(k_mat, song_length, band_width, pattern_row)

Expands pattern_row, a row vector that marks where non-overlapping repeats occur, into a matrix representation or np.array. The dimension of this array is twice the pairs of repeats by song_length. k_mat provides a list of annotation markers that is used in separating the repeats of length band_width into individual rows. Each row will mark the start and end time steps of a repeat with 1’s and 0’s otherwise. The array is a visual record of where all of the repeats in a song start and end.

Parameters
  • k_mat (np.ndarray) – List of pairs of repeats of length band_width with annotations marked. The first two columns refer to the start and end time steps of the first repeat of the pair, the second two refer to the start and end time steps of second repeat of the pair, the fifth column refers to the length of the repeats, and the sixth column contains the annotation markers. We will be indexing into the sixth column to obtain a list of annotation markers.

  • song_length (int) – Number of audio shingles.

  • band_width (int) – Length of repeats encoded in k_mat.

  • pattern_row (np.ndarray) – Row vector of the length of the song that marks where non-overlapping repeats occur with the repeats’ corresponding annotation markers and 0’s otherwise.

Returns

A tuple (pattern_mat, pattern_key, anno_id_lst) where all variables have data type np.ndarray.

pattern_mat is a matrix representation where each row contains a group of repeats marked.

pattern_key is a column vector containing the lengths of the repeats encoded in each row of pattern_mat.

anno_id_lst is a column vector containing the annotation markers of the repeats encoded in each row of pattern_mat.

Assemble

assemble.py

This module finds and forms essential structure components, which are the smallest building blocks that form every repeat in the song.

These functions ensure that each time step of a song is contained in at most one of the song’s essential structure components by checking that there are no overlapping repeats in time. When repeats overlap, they undergo a process where they are divided until there are only non-overlapping pieces left.

The module contains the following functions:

  • breakup_overlaps_by_intersect

    Extracts repeats in input_pattern_obj that has the starting indices of the repeats, into the essential structure components using bw_vec, that has the lengths of each repeat.

  • check_overlaps

    Compares every pair of groups, determining if there are any repeats in any pairs of the groups that overlap.

  • __compare_and_cut

    Compares two rows of repeats labeled RED and BLUE, and determines if there are any overlaps in time between them. If there are overlaps, we cut the repeats in RED and BLUE into up to 3 pieces.

  • __num_of_parts

    Determines the number of blocks of consecutive time steps in a list of time steps. A block of consecutive time steps represents a distilled section of a repeat.

  • __inds_to_rows

    Expands a vector containing the starting indices of a piece or two of a repeat into a matrix representation recording when these pieces occur in the song with 1’s. All remaining entries are marked with 0’s.

  • __merge_based_on_length

    Merges repeats that are the same length, as set by full_bandwidth, and are repeats of the same piece of structure.

  • __merge_rows

    Merges rows that have at least one common repeat. These common repeat(s) must occur at the same time step and be of a common length.

  • hierarchical_structure

    Distills the repeats encoded in matrix_no_overlaps (and key_no_overlaps) to the essential structure components and then builds the hierarchical representation. Optionally outputs visualizations of the hierarchical representations.

repytah.assemble.breakup_overlaps_by_intersect(input_pattern_obj, bw_vec, thresh_bw)

Extracts repeats in input_pattern_obj that has the starting indices of the repeats, into the essential structure components using bw_vec, that has the lengths of each repeat. The essential structure components are the smallest building blocks that form every repeat in the song.

Parameters
  • input_pattern_obj (np.ndarray) – Binary matrix with 1’s where repeats begin and 0’s otherwise.

  • bw_vec (np.ndarray) – Vector containing the lengths of the repeats encoded in input_pattern_obj.

  • thresh_bw (int) – One less than the smallest allowable repeat length.

Returns

A tuple (pattern_no_overlaps, pattern_no_overlaps_key) where all variables have data type np.ndarray.

pattern_no_overlaps is a binary matrix with 1’s where repeats of essential structure components begin.

pattern_no_overlaps_key is a vector containing the lengths of the repeats of essential structure components in pattern_no_overlaps.

repytah.assemble.check_overlaps(input_mat)

Compares every pair of repeat groups and determines if there are any repeats in any pairs of the groups that overlap.

Parameters

input_mat (np.ndarray) – Binary matrix with blocks of 1’s equal to the length of repeats to be checked for overlaps.

Returns

Logical array where (i,j) = 1 if row i of input_mat and row j of input_mat overlap and (i,j) = 0 elsewhere.

Return type

overlap_mat (np.ndarray)

repytah.assemble.__compare_and_cut(red, red_len, blue, blue_len)

Compares two rows of repeats labeled RED and BLUE, and determines if there are any overlaps in time between them. If there is, then we cut the repeats in RED and BLUE into up to 3 pieces.

Parameters
  • red (np.ndarray) – Binary row vector encoding a set of repeats with 1’s where each repeat starts and 0’s otherwise.

  • red_len (np.ndarray) – Length of repeats encoded in red.

  • blue (np.ndarray) – Binary row vector encoding a set of repeats with 1’s where each repeat starts and 0’s otherwise.

  • blue_len (np.ndarray) – Length of repeats encoded in blue.

Returns

A tuple (union_mat, union_length) where all variables have data type np.ndarray.

union_mat is a binary matrix representation of up to three rows encoding non-overlapping repeats cut from red and blue.

union_length is a vector containing the lengths of the repeats encoded in union_mat.

repytah.assemble.__num_of_parts(input_vec, input_start, input_all_starts)

Determines the number of blocks of consecutive time steps in a list of time steps. A block of consecutive time steps represents a distilled section of a repeat. This distilled section will be replicated and the starting indices of the repeats within it will be returned.

Parameters
  • input_vec (np.ndarray) – Vector that contains one or two parts of a repeat that are overlap(s) in time that may need to be replicated.

  • input_start (np.ndarray) – Starting index for the part to be replicated.

  • input_all_starts (np.ndarray) – Starting indices for replication.

Returns

A tuple (start_mat, length_vec) where all variables have data type np.ndarray.

start_mat is an array of one or two rows containing the starting indices of the replicated repeats.

length_vec is a column vector containing the lengths of the replicated parts.

repytah.assemble.__inds_to_rows(start_mat, row_length)

Expands a vector containing the starting indices of a piece or two of a repeat into a matrix representation recording when these pieces occur in the song with 1’s. All remaining entries are marked with 0’s.

Parameters
  • start_mat (np.ndarray) – Matrix of one or two rows, containing the starting indices.

  • row_length (int) – Length of the rows.

Returns

Binary matrix of one or two rows, with 1’s where the starting indices and 0’s otherwise.

Return type

new_mat (np.ndarray)

repytah.assemble.__merge_based_on_length(full_mat, full_bw, target_bw)

Merges repeats that are the same length, as set by full_bw, and are repeats of the same piece of structure.

Parameters
  • full_mat (np.ndarray) – Binary matrix with ones where repeats start and zeroes otherwise.

  • full_bw (np.ndarray) – Length of repeats encoded in input_mat.

  • target_bw (np.ndarray) – Lengths of repeats that we seek to merge.

Returns

A tuple (out_mat, one_length_vec) where all variables have data type np.ndarray.

out_mat is a binary matrix with 1’s where repeats start and 0’s otherwise with rows of full_mat merged if appropriate.

one_length_vec is a vector that contains the length of repeats encoded in out_mat.

repytah.assemble.__merge_rows(input_mat, input_width)

Merges rows that have at least one common repeat; said common repeat(s) must occur at the same time step and be of common length.

Parameters
  • input_mat (np.ndarray) – Binary matrix with ones where repeats start and zeroes otherwise.

  • input_width (int) – Length of repeats encoded in input_mat.

Returns

Binary matrix with ones where repeats start and zeroes otherwise.

Return type

merge_mat (np.ndarray)

repytah.assemble.hierarchical_structure(matrix_no_overlaps, key_no_overlaps, sn, vis=False)

Distills the repeats encoded in matrix_no_overlaps (and key_no_overlaps) to the essential structure components and then builds the hierarchical representation. Optionally shows visualizations of the hierarchical structure via the vis argument.

Parameters
  • matrix_no_overlaps (np.ndarray) – Binary matrix with 1’s where repeats begin and 0’s otherwise.

  • key_no_overlaps (np.ndarray) – Vector containing the lengths of the repeats encoded in matrix_no_overlaps.

  • sn (int) –

  • length (Song) –

  • shingles. (which is the number of audio) –

  • vis (bool) – Shows visualizations if True (default = False).

Returns

A tuple (full_visualization, full_key, full_matrix_no_overlaps, full_anno_lst) where all variables have data type np.ndarray.

full_visualization is a binary matrix representation for full_matrix_no_overlaps with blocks of 1’s equal to the lengths prescribed in full_key.

full_key is a vector containing the lengths of the hierarchical structure encoded in full_matrix_no_overlaps.

full_matrix_no_overlaps is a binary matrix with 1’s where hierarchical structure begins and 0’s otherwise.

full_anno_lst is a vector containing the annotation markers of the hierarchical structure encoded in each row of full_matrix_no_overlaps.

Search

search.py

This module holds functions used to find and record the diagonals in the thresholded matrix, T. These functions prepare the diagonals found for transformation and assembling later. The module contains the following functions:

  • find_complete_list

    Finds all smaller diagonals (and the associated pairs of repeats) that are contained in pair_list, which is composed of larger diagonals found in find_initial_repeats.

  • __find_add_rows

    Finds pairs of repeated structures, represented as diagonals of a certain length, k, that neither start nor end at the same time steps as previously found pairs of repeated structures of the same length.

  • find_all_repeats

    Finds all the diagonals present in thresh_mat. This function is nearly identical to find_initial_repeats except for two crucial differences. First, we do not remove diagonals after we find them. Second, there is no smallest bandwidth size as we are looking for all diagonals.

  • find_complete_list_anno_only

    Finds annotations for all pairs of repeats found in find_all_repeats. This list contains all the pairs of repeated structures with their starting/ending indices and lengths.

repytah.search.find_complete_list(pair_list, song_length)

Finds all smaller diagonals (and the associated pairs of repeats) that are contained in pair_list, which is composed of larger diagonals found in find_initial_repeats.

Parameters
  • pair_list (np.ndarray) – List of pairs of repeats found in earlier steps (bandwidths MUST be in ascending order). If you have run find_initial_repeats before this script, then pair_list will be ordered correctly.

  • song_length (int) – Song length, which is the number of audio shingles.

Returns

List of pairs of repeats with smaller repeats added.

Return type

final_lst (np.ndarray)

repytah.search.__find_add_rows(lst_no_anno, check_inds, k)

Finds pairs of repeated structures, represented as diagonals of a certain length, k, that that start at the same time step, or end at the same time step, or neither start nor end at the same time step as previously found pairs of repeated structures of the same length.

Parameters
  • lst_no_anno (np.ndarray) – List of pairs of repeats.

  • check_inds (np.ndarray) – List of starting indices for repeats of length k that we use to check lst_no_anno for more repeats of length k.

  • k (int) – Length of repeats that we are looking for.

Returns

List of newly found pairs of repeats of length K that are contained in larger repeats in lst_no_anno.

Return type

add_rows (np.ndarray)

repytah.search.find_all_repeats(thresh_mat, bw_vec)

Finds all the diagonals present in thresh_mat. This function is nearly identical to find_initial_repeats, with two crucial differences. First, we do not remove diagonals after we find them. Second, there is no smallest bandwidth size as we are looking for all diagonals.

Parameters
  • thresh_mat (np.ndarray) – Thresholded matrix that we extract diagonals from.

  • bw_vec (np.ndarray) – Vector of lengths of diagonals to be found. Should be 1, 2, 3, …, n where n = number of timesteps.

Returns

Pairs of repeats that correspond to diagonals in thresh_mat.

Return type

all_lst (np.ndarray)

repytah.search.find_complete_list_anno_only(pair_list, song_length)

Finds annotations for all pairs of repeats found in find_all_repeats. This list contains all the pairs of repeated structures with their starting/ending indices and lengths.

Parameters
  • pair_list (np.ndarray) – List of pairs of repeats. WARNING: Bandwidths must be in ascending order.

  • song_length (int) – Number of audio shingles in song.

Returns

List of pairs of repeats with smaller repeats added and with annotation markers.

Return type

out_lst (np.ndarray)

Function Pipeline

Before we dive deep into the implementation of the package via the walk-through of a complete example, a function pipeline is shown below to illustrate the function calls in each step. For an in-depth look at each function, check the corresponding Jupyter Notebooks for each module.

_images/function_pipeline.jpg

A Quick Start to Use the Package repytah

This notebook demonstrates how to use repytah to create aligned hierarchies for a music-based data stream. Before diving into the details, we’ll walk through a brief example program.

The example input is a csv file containing Chroma feature vectors for each beat of Chopin’s Mazurka Op.30, No.1.

[1]:
# Standard imports
import scipy.io as sio
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Import the package repytah
from repytah import *

# Make the images clear
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

To skip visualizing the middle steps and get the output directly, try the following code:

[2]:
file_in = load_ex_data('../repytah/data/mazurka30-1.csv').to_numpy()
file_out = "hierarchical_out_file.mat"
num_fv_per_shingle = 12
thresh = 0.02
csv_to_aligned_hierarchies(file_in, file_out, num_fv_per_shingle, thresh, True)
_images/quick_start_5_0.png
_images/quick_start_5_1.png
_images/quick_start_5_2.png
_images/quick_start_5_3.png
_images/quick_start_5_4.png

To visualize the middle steps, try the following code:

Load the input file and modify the music based data to a matrix representation:

[4]:
# Load the input file
file_in = load_ex_data('../repytah/data/mazurka30-1.csv').to_numpy()
fv_mat = file_in

# Number of feature vectors per shingle
num_fv_per_shingle = 12

# Create the self-dissimilarity matrix
self_dissim_mat = create_sdm(fv_mat, num_fv_per_shingle)

# Produce a visualization
SDM = plt.imshow(self_dissim_mat, cmap="RdBu")
plt.title('Self-Dissimilarity Matrix')
plt.show()
_images/quick_start_8_0.png

Threshold the above matrix:

[12]:
song_length = self_dissim_mat.shape[0]
thresh = 0.02

# Threshold the SDM to produce a binary matrix
thresh_dist_mat = (self_dissim_mat <= thresh)

# Produce a visualization
SDM = plt.imshow(thresh_dist_mat, cmap="Greys")
plt.title('Thresholded Matrix')
plt.show()
_images/quick_start_10_0.png

Find and group each pair of repeats:

[13]:
all_lst = find_initial_repeats(thresh_dist_mat, np.arange(1, song_length + 1), 0)
complete_lst = find_complete_list(all_lst, song_length)
mat_no_overlaps, key_no_overlaps = remove_overlaps(complete_lst, song_length)[1:3]

Find the essential structure components of the song and build the aligned hierarchies:

[14]:
output_tuple = hierarchical_structure(mat_no_overlaps, key_no_overlaps, song_length, vis=True)
_images/quick_start_14_0.png
_images/quick_start_14_1.png
_images/quick_start_14_2.png
_images/quick_start_14_3.png
_images/quick_start_14_4.png

To visualize the repeated structures in the actual music score, the examples in color yellow and purple are shown: 5379e89a0fb647c18980998426c0b0f7 0161f4fa27534729a159f40d102cb007 4b5a8b91a1274b10907d44e55d5ced41

This notebook is just a quick walk-through that demonstrates how to use the package repytah, more technical details are in the `example <https://github.com/smith-tinkerlab/repytah/blob/main/docs/example_vignette.ipynb>`__ notebook.

Example of Creating Aligned Hierarchies for a Mazurka Score

In this example, we will walk through the elements of the repytah package that pertain to the creation of aligned hierarchies for a music-based data stream (eg. a song).

Beginning with features (such as MFCCs and chroma features) for each timestep in your music-based data stream, there are several steps to this process:

  1. Create the self-dissimilarity matrix (SDM).

  2. Highlight pairs of timesteps that are close enough to be considered as repetitions of each other. (In other words, threshold the SDM)

  3. Find pairs of structure repetitions (represented as diagonals within the thresholded SDM).

  4. Find any pairs of structure repetitions not found in step 2, and group the structure repetitions.

  5. Remove any repeated structures that have overlapped instances.

  6. Distill the collection of repeated structures into the essential structure components , i.e. the smallest meaningful repetitions on which all larger repeats are constructed. Each timestep will be contained in no more than one essential structure component

Note: The walk-through of this example is very similar to the code in example.py found in this package.

We begin by importing the necessary packages:

[1]:
# NumPy and SciPy are required for mathematical operations
import scipy.io as sio
import numpy as np

# Pandas is used to import the csv
import pandas as pd

# Import all modules from repytah
from repytah import *

# Matplotlib is used to display outputs
import matplotlib.pyplot as plt

# Make the images clear
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

# Hide code (optional)
import ipywidgets as widgets
from IPython.display import display, HTML
javascript_functions = {False: "hide()", True: "show()"}
button_descriptions  = {False: "Show code", True: "Hide code"}
def toggle_code(state):
    output_string = "<script>$(\"div.input\").{}</script>"
    output_args   = (javascript_functions[state],)
    output        = output_string.format(*output_args)
    display(HTML(output))
def button_action(value):
    state = value.new
    toggle_code(state)
    value.owner.description = button_descriptions[state]
state = True
toggle_code(state)
button = widgets.ToggleButton(state, description = button_descriptions[state])
button.observe(button_action, "value")
display(button)

Step 0: Create the SDM

In this phase, there are a few crucial details, namely importing the data file that we would like to unearth hierarchical structural information for and determining the appropriate dissimilarity measure to use. If you already have (symmetrical) matrix representations for your data stream, then you may find it more appropriate to load your matrix and then skip ahead to Step 1 or Step 2.

This step assumes that your music-based data stream (ie. a recording or score) has already had your preferred features extracted (like chroma or MFCCs) and is arranged into columns such that each column represents a time step (or beat). We refer to this as a feature vector matrix as each feature vector is laid out as column within one cohesive matrix.

Importing data for structure analysis

Note: For this demonstration, we are using the load_ex_data() built-in function from the example module to load our data. To recreate our example, you could do the same. This is an optional task and the normal method of reading in a data file will work.

We are using Chopin’s Mazurka Op.6, No.1 as input for this demonstration.

[2]:
# Import csv
file_in = load_ex_data('data/input.csv').to_numpy()

fv_mat = file_in
Creating the SDM

In just one line, we define the self-dissimilarity matrix. This function create_sdm, uses feature vectors to create an audio shingle for each time step and represents these shingles as vectors by stacking the relevant feature vectors on top of each other. Then, the cosine distance is found between these shingles.

[3]:
# Number of feature vectors per shingle
num_fv_per_shingle = 12

# Create the self-dissimilarity matrix
self_dissim_mat = create_sdm(fv_mat, num_fv_per_shingle)

print('self_dissim_mat:\n', self_dissim_mat)

# Produce a visualization
SDM = plt.imshow(self_dissim_mat, cmap="RdBu")
plt.title('Self-Dissimilarity Matrix')
plt.show()

self_dissim_mat:
 [[0.         0.36468911 0.58429924 ... 0.57425539 0.77909953 0.82434591]
 [0.36468911 0.         0.41660702 ... 0.68829818 0.64039838 0.80480641]
 [0.58429924 0.41660702 0.         ... 0.72072778 0.59670143 0.57191158]
 ...
 [0.57425539 0.68829818 0.72072778 ... 0.         0.60941866 0.70720396]
 [0.77909953 0.64039838 0.59670143 ... 0.60941866 0.         0.39416349]
 [0.82434591 0.80480641 0.57191158 ... 0.70720396 0.39416349 0.        ]]
_images/example_vignette_6_1.png

Step 1: Threshold the SDM

In this step, the self-dissimilarity matrix is thresholded to produce a binary matrix of the same dimensions. This matrix is used to identify repeated structures, which are represented by diagonals of the same length.

[4]:
song_length = self_dissim_mat.shape[0]
thresh = 0.02

# Threshold the SDM to produce a binary matrix
thresh_dist_mat = (self_dissim_mat <= thresh)

print('thresh_dist_mat:\n', thresh_dist_mat)

# Produce a visualization
SDM = plt.imshow(thresh_dist_mat, cmap="Greys")
plt.title('Thresholded Matrix')
plt.show()
thresh_dist_mat:
 [[ True False False ... False False False]
 [False  True False ... False False False]
 [False False  True ... False False False]
 ...
 [False False False ...  True False False]
 [False False False ... False  True False]
 [False False False ... False False  True]]
_images/example_vignette_8_1.png

Step 2: Find the diagonals present and store all pairs of repeats found in a list

The diagonals in the thresholded matrix are found and recorded in an array of repeats. find_initial_repeats does this by looking for the largest repeated structures in thresh_dist_mat, which are illustrated in the above Thresholded Matrix diagram. Once all repeated structures are found, which are represented as diagonals present in thresh_dist_mat, they are stored with their start/end indices and lengths in a list. As each diagonal is found, it is removed to avoid identifying repeated sub-structures.

Below is the listing of the pairs of repeats found by find_initial_repeats:

[5]:
all_lst = find_initial_repeats(thresh_dist_mat, np.arange(1, song_length + 1), 0)

print('all_lst:\n', all_lst)
all_lst:
 [[  2   2  26  26   1]
 [  2   2  74  74   1]
 [  2   2 146 146   1]
 [  2   2 218 218   1]
 [  2   2 314 314   1]
 [ 26  26  50  50   1]
 [ 50  50  74  74   1]
 [ 50  50 146 146   1]
 [ 50  50 218 218   1]
 [ 50  50 314 314   1]
 [247 250 271 274   4]
 [124 156 292 324  33]
 [196 228 292 324  33]
 [  2  36 290 324  35]
 [ 50  84 290 324  35]
 [  3  38 123 158  36]
 [ 51  86 195 230  36]
 [  3  39 195 231  37]
 [  1  38  49  86  38]
 [ 51 122 123 194  72]
 [ 87 158 159 230  72]
 [  1 325   1 325 325]]

Let’s take moment to examine what we have and time it back to the original thresholded SDM. The first ten pairs listed in all_lst are repeats of length 1. We also note that the last “pair” of repeats is the whole score being matched to itself. For now, we set this “repeat” and the ones that are of length 1 aside.

In the image below, for each pair of repeats, we color one of the two diagonals associated to that pairing. The second one is the matching diagonal when flipped over the main diagonal.

For example, the red diagonal represents the pair of repeats that start at beats 124 and 292 and are 33 beats long, and the blue diagonal represents the pair of repeats that start at beat 51 and 123 and are 72 beats long.

[8]:
visualize_all_lst(thresh_dist_mat)
_images/example_vignette_12_0.png

Step 3: Find any diagonals in the thresholded matrix T that are contained in larger diagonals in T but not found in step 2, then group pairs of repeats

Any smaller diagonals that are contained in larger diagonals and are not found in step 2 are found and added to the array of repeats in this step. All possible repeat lengths are looped over in all_lst, and larger repeats containing smaller repeats are broken up into up to 3 sections each, the part before the overlap, the overlap, and the part after the overlap. With this, a more complete list of repeated structures is created.

[9]:
complete_lst = find_complete_list(all_lst, song_length)

print('complete_lst:\n', complete_lst)
complete_lst:
 [[  1   1  49  49   1   1]
 [  2   2  26  26   1   2]
 [  2   2  50  50   1   2]
 [  2   2  74  74   1   2]
 [  2   2 146 146   1   2]
 [  2   2 218 218   1   2]
 [  2   2 290 290   1   2]
 [  2   2 314 314   1   2]
 [ 26  26  50  50   1   2]
 [ 26  26  74  74   1   2]
 [ 26  26 146 146   1   2]
 [ 26  26 218 218   1   2]
 [ 26  26 314 314   1   2]
 [ 50  50  74  74   1   2]
 [ 50  50 146 146   1   2]
 [ 50  50 218 218   1   2]
 [ 50  50 290 290   1   2]
 [ 50  50 314 314   1   2]
 [ 74  74 146 146   1   2]
 [ 74  74 218 218   1   2]
 [ 74  74 314 314   1   2]
 [146 146 218 218   1   2]
 [146 146 314 314   1   2]
 [218 218 314 314   1   2]
 [  3   3 123 123   1   3]
 [  3   3 195 195   1   3]
 [ 51  51 123 123   1   3]
 [ 51  51 195 195   1   3]
 [ 39  39 231 231   1   4]
 [  1   2  49  50   2   1]
 [  2   3 290 291   2   2]
 [ 50  51 290 291   2   2]
 [ 37  38  85  86   2   3]
 [ 37  38 157 158   2   3]
 [ 85  86 229 230   2   3]
 [157 158 229 230   2   3]
 [ 37  39 229 231   3   1]
 [247 250 271 274   4   1]
 [ 27  36 315 324  10   1]
 [ 75  84 315 324  10   1]
 [147 156 315 324  10   1]
 [219 228 315 324  10   1]
 [ 27  38  75  86  12   1]
 [ 27  38 147 158  12   1]
 [ 75  86 219 230  12   1]
 [147 158 219 230  12   1]
 [ 27  39 219 231  13   1]
 [124 145 292 313  22   1]
 [196 217 292 313  22   1]
 [  3  25 123 145  23   1]
 [  3  25 195 217  23   1]
 [ 51  73 123 145  23   1]
 [ 51  73 195 217  23   1]
 [  2  25 290 313  24   1]
 [ 50  73 290 313  24   1]
 [  1  25  49  73  25   1]
 [  4  36 124 156  33   1]
 [  4  36 196 228  33   1]
 [  4  36 292 324  33   1]
 [ 52  84 124 156  33   1]
 [ 52  84 196 228  33   1]
 [ 52  84 292 324  33   1]
 [124 156 196 228  33   1]
 [124 156 292 324  33   1]
 [196 228 292 324  33   1]
 [  3  36 291 324  34   1]
 [ 51  84 291 324  34   1]
 [  2  36  50  84  35   1]
 [  2  36 290 324  35   1]
 [ 50  84 290 324  35   1]
 [  3  38  51  86  36   1]
 [  3  38 123 158  36   1]
 [  3  38 195 230  36   1]
 [ 51  86 123 158  36   1]
 [ 51  86 195 230  36   1]
 [123 158 195 230  36   1]
 [ 87 122 159 194  36   2]
 [  3  39 195 231  37   1]
 [ 87 123 159 195  37   2]
 [  1  38  49  86  38   1]
 [ 85 122 157 194  38   2]
 [ 75 122 147 194  48   1]
 [ 87 145 159 217  59   1]
 [ 51 122 123 194  72   1]
 [ 87 158 159 230  72   2]]

It is clear that complete_list is much longer that all_lst, which makes sense because we are adding smaller pieces of the larger repeats, when an overlap in time has occurred between a smaller repeat and a larger one.

For example, the repeat of length 72 starting at beat 51 overlaps with the smaller repeat of length 48 starting at beat 75 (the purple section).

The image below visualizes the idea of distilling smaller parts from larger repeats in step 2 by showing some of the smaller repetitions found within the larger repeats.

[10]:
visualize_complete_lst(thresh_dist_mat)
_images/example_vignette_16_0.png

Step 4: Remove any repeated structure that has at least two repeats that overlap in time

In this step, repeated structures with the same annotation and length are removed if they are overlapping.

This is done by looping over all possible repeat lengths, finding all the groups of repeats of the same length. For each of those groups, remove_overlaps determines whether there exists any pair of repeats that overlaps in time. If a pair like that exists, all the overlapping repeats are removed.

Along with a list of all the repeats with no overlaps, this step also forms a matrix, a list of the associated lengths of the repeats in the matrix, the annotations of the repeats in the matrix, and a list of the overlaps. The matrix is a binary matrix that visualizes the repeats, representing the start of a repeat with a 1. This matrix, in combination with the list of the lengths of the repeats, can be used to visualize the repeats.

[11]:
output_tuple = remove_overlaps(complete_lst, song_length)

print('List with no overlaps:\n', output_tuple[0])
print('Matrix with no overlaps:\n', output_tuple[1])
print('Lengths of the repeats in the matrix:', output_tuple[2])
print('Annotations of the repeats in the matrix:', output_tuple[3])
print('List of overlaps:\n', output_tuple[4])
List with no overlaps:
 [[  1   1  49  49   1   1]
 [  2   2  26  26   1   2]
 [  2   2  50  50   1   2]
 [  2   2  74  74   1   2]
 [  2   2 146 146   1   2]
 [  2   2 218 218   1   2]
 [  2   2 290 290   1   2]
 [  2   2 314 314   1   2]
 [  3   3 123 123   1   3]
 [  3   3 195 195   1   3]
 [ 26  26  50  50   1   2]
 [ 26  26  74  74   1   2]
 [ 26  26 146 146   1   2]
 [ 26  26 218 218   1   2]
 [ 26  26 314 314   1   2]
 [ 39  39 231 231   1   4]
 [ 50  50  74  74   1   2]
 [ 50  50 146 146   1   2]
 [ 50  50 218 218   1   2]
 [ 50  50 290 290   1   2]
 [ 50  50 314 314   1   2]
 [ 51  51 123 123   1   3]
 [ 51  51 195 195   1   3]
 [ 74  74 146 146   1   2]
 [ 74  74 218 218   1   2]
 [ 74  74 314 314   1   2]
 [146 146 218 218   1   2]
 [146 146 314 314   1   2]
 [218 218 314 314   1   2]
 [  1   2  49  50   2   1]
 [  2   3 290 291   2   2]
 [ 37  38  85  86   2   3]
 [ 37  38 157 158   2   3]
 [ 50  51 290 291   2   2]
 [ 85  86 229 230   2   3]
 [157 158 229 230   2   3]
 [ 37  39 229 231   3   1]
 [247 250 271 274   4   1]
 [ 27  36 315 324  10   1]
 [ 75  84 315 324  10   1]
 [147 156 315 324  10   1]
 [219 228 315 324  10   1]
 [ 27  38  75  86  12   1]
 [ 27  38 147 158  12   1]
 [ 75  86 219 230  12   1]
 [147 158 219 230  12   1]
 [ 27  39 219 231  13   1]
 [124 145 292 313  22   1]
 [196 217 292 313  22   1]
 [  3  25 123 145  23   1]
 [  3  25 195 217  23   1]
 [ 51  73 123 145  23   1]
 [ 51  73 195 217  23   1]
 [  2  25 290 313  24   1]
 [ 50  73 290 313  24   1]
 [  1  25  49  73  25   1]
 [  4  36 124 156  33   1]
 [  4  36 196 228  33   1]
 [  4  36 292 324  33   1]
 [ 52  84 124 156  33   1]
 [ 52  84 196 228  33   1]
 [ 52  84 292 324  33   1]
 [124 156 196 228  33   1]
 [124 156 292 324  33   1]
 [196 228 292 324  33   1]
 [  3  36 291 324  34   1]
 [ 51  84 291 324  34   1]
 [  2  36  50  84  35   1]
 [  2  36 290 324  35   1]
 [ 50  84 290 324  35   1]
 [  3  38  51  86  36   1]
 [  3  38 123 158  36   1]
 [  3  38 195 230  36   1]
 [ 51  86 123 158  36   1]
 [ 51  86 195 230  36   1]
 [ 87 122 159 194  36   2]
 [123 158 195 230  36   1]
 [  3  39 195 231  37   1]
 [ 87 123 159 195  37   2]
 [  1  38  49  86  38   1]
 [ 85 122 157 194  38   2]
 [ 75 122 147 194  48   1]
 [ 87 145 159 217  59   1]
 [ 51 122 123 194  72   1]
 [ 87 158 159 230  72   2]]
Matrix with no overlaps:
 [[1 0 0 ... 0 0 0]
 [0 1 0 ... 0 0 0]
 [0 0 1 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
Lengths of the repeats in the matrix: [ 1  1  1  1  2  2  2  3  4 10 12 13 22 23 24 25 33 34 35 36 36 37 37 38
 38 48 59 72 72]
Annotations of the repeats in the matrix: [1 2 3 4 1 2 3 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 2 1 1 1 2]
List of overlaps:
 []

Step 5: Find the essential structure components of the song and build the aligned hierarchies

Building the aligned hierarchies takes two final steps: finding the essential structure components and using them to create the full aligned hierarchies. hierarchical_structure first finds the essential structure components with breakup_overlaps_by_intersect. After this, modified versions of steps 2-4 take place to create the full hierarchical structure.

Essential Structure Components

The essential structure components are the structure building blocks; that is they are the smallest meaningful repeated structures. All repeated structures within a piece are constructed with right and left unions of the essential structure components. Each timestep can be in at most one essential structure component. The first panel in the below image shows the essential structure components for our example piece. The visualization is organized such that there is one row per type of repeat. Notice that we have nine types of repeats in the below example.

Creating the aligned hierarchies

After finding the essential structure components for a song, we build the aligned hierarchies using a process whose result is akin to taking right and left unions of the essential structure components. The process begins by creating a list of the essential structure components in the order they appear. We assign each essential structure component the number of the row that it sits in. For consectutive timesteps where there is not essential structure component, we use 0 for that block. In the below example, this list would be: 1,2,3,4,2,5,6,7,0,1,2,3,4,2,5,6,8,3,4,2,5,6,8,3,4,2,5,6,7,0,9,0,9,0,2,3,4,2,5,0.

Using this list, a thresholded dissimilarity matrix is created. Using steps similar to steps two through four above, we can extract information about the combinations of essential structures. From the thresholded dissimilarity matrix, all diagonals are extracted, starting with the longest one, but are not removed. This way, all possible combinations of essential structure components are found. Then, all these combinations are grouped and checked for overlapping repeated combinations. The final product is then restructured to show the widths and annotations for each type of repeated structure (represented by each row).

The figures below represent the final result we get, including two of the most important images: the essential structure components (top most panel) and the complete hierarchical structure (bottom panel).

The figure below with the name Threshold Self-dissimilarity matrix of the ordering Essential Structure Components shows the square thresholded self-dissimilarity matrix such that the (i,j) entry is 1 if the following three conditions are true:

  • A repeat of an essential structure component is the i-th item in the ordering.

  • A repeat of an essential structure component is the j-th item in the ordering.

  • The repeat occurring in the i-th place of the ordering and the one occurring in the j-th place of the ordering are repeats of the same essential structure component.

The figure below with the name Repeated ordered sublists of the Essential Structure Components is the result of extracting all the diagonals and getting pairs of repeated ordered sublists of the essential structure components. The repetitive copies of the same repeat and overlaps are also removed.

The second to last figure labeled Repeated ordered sublists of the Essential Structure Components with leading index highlighted contains the same repeats as Repeated ordered sublists of the Essential Structure Components but only notes the starting item in black and shows the rest of the repeat in gray.

The aligned hierarchies are shown in the last image. This is the result of stretching each essential structure component to its correct length (ie. the one noted in the first image below). This transformation results in a visualization that is the number of time steps of the original song.

[12]:
(mat_no_overlaps, key_no_overlaps) = output_tuple[1:3]

# Distill non-overlapping repeats into essential structure components and
# use them to build the hierarchical representation
output_tuple = hierarchical_structure(mat_no_overlaps, key_no_overlaps, song_length, vis=True)

_images/example_vignette_20_0.png
_images/example_vignette_20_1.png
_images/example_vignette_20_2.png
_images/example_vignette_20_3.png
_images/example_vignette_20_4.png

Changelog

v0.1.0

Initial public release.

Index

:ref:genindex