Assemble
assemble.py
This module finds and forms essential structure components, which are the smallest building blocks that form every repeat in the song.
These functions ensure that each time step of a song is contained in at most one of the song’s essential structure components by checking that no repeats overlap in time. When repeats overlap, they are divided repeatedly until only non-overlapping pieces remain.
The module contains the following functions:
- breakup_overlaps_by_intersect
Breaks the repeats in input_pattern_obj, which encodes the starting indices of the repeats, into the essential structure components, using bw_vec, which holds the length of each repeat.
- check_overlaps
Compares every pair of repeat groups and determines whether any repeat in one group overlaps a repeat in the other.
- __compare_and_cut
Compares two rows of repeats labeled RED and BLUE, and determines if there are any overlaps in time between them. If there are overlaps, we cut the repeats in RED and BLUE into up to 3 pieces.
- __num_of_parts
Determines the number of blocks of consecutive time steps in a list of time steps. A block of consecutive time steps represents a distilled section of a repeat.
- __inds_to_rows
Expands a vector containing the starting indices of a piece or two of a repeat into a matrix representation recording when these pieces occur in the song with 1’s. All remaining entries are marked with 0’s.
- __merge_based_on_length
Merges repeats that are the same length, as set by full_bw, and are repeats of the same piece of structure.
- __merge_rows
Merges rows that have at least one common repeat. These common repeat(s) must occur at the same time step and be of a common length.
- hierarchical_structure
Distills the repeats encoded in matrix_no_overlaps (and key_no_overlaps) to the essential structure components and then builds the hierarchical representation. Optionally outputs visualizations of the hierarchical representations.
- repytah.assemble.breakup_overlaps_by_intersect(input_pattern_obj, bw_vec, thresh_bw)
Breaks the repeats in input_pattern_obj, which encodes the starting indices of the repeats, into the essential structure components, using bw_vec, which holds the length of each repeat. The essential structure components are the smallest building blocks that form every repeat in the song.
- Parameters
input_pattern_obj (np.ndarray) – Binary matrix with 1’s where repeats begin and 0’s otherwise.
bw_vec (np.ndarray) – Vector containing the lengths of the repeats encoded in input_pattern_obj.
thresh_bw (int) – One less than the smallest allowable repeat length.
- Returns
A tuple (pattern_no_overlaps, pattern_no_overlaps_key) where all variables have data type np.ndarray.
pattern_no_overlaps is a binary matrix with 1’s where repeats of essential structure components begin.
pattern_no_overlaps_key is a vector containing the lengths of the repeats of essential structure components in pattern_no_overlaps.
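The input encoding described above can be read as a list of (row, start, length) triples. The sketch below only illustrates that encoding; it is not the library's implementation. The helper name `starts_with_lengths` is hypothetical, plain Python lists stand in for np.ndarray, and time steps are assumed to be 0-based.

```python
def starts_with_lengths(pattern, bw_vec):
    """List (row, start, length) for every 1 in a start-indicator matrix."""
    repeats = []
    for row_idx, (row, length) in enumerate(zip(pattern, bw_vec)):
        for t, flag in enumerate(row):
            if flag == 1:
                repeats.append((row_idx, t, length))
    return repeats


# Hypothetical toy input: two groups of repeats in a song of 10 time steps.
pattern = [
    [1, 0, 0, 0, 0, 1, 0, 0, 0, 0],  # repeats start at steps 0 and 5
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # repeats start at steps 2 and 7
]
bw_vec = [4, 2]  # row 0 repeats have length 4, row 1 repeats have length 2

repeats = starts_with_lengths(pattern, bw_vec)
```

In this toy input, the length-4 repeat starting at step 0 covers steps 0 to 3 and so overlaps the length-2 repeat starting at step 2, which is the kind of situation the function is described as resolving.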
- repytah.assemble.check_overlaps(input_mat)
Compares every pair of repeat groups and determines whether any repeat in one group overlaps a repeat in the other.
- Parameters
input_mat (np.ndarray) – Binary matrix with blocks of 1’s equal to the length of repeats to be checked for overlaps.
- Returns
Logical array where (i,j) = 1 if row i of input_mat and row j of input_mat overlap and (i,j) = 0 elsewhere.
- Return type
overlap_mat (np.ndarray)
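A minimal sketch of a pairwise overlap check of this kind, using plain lists in place of np.ndarray. The helper name `overlap_matrix` is hypothetical, and filling only the entries with i < j is an assumption about the convention, not something the description above states.

```python
def overlap_matrix(block_mat):
    """Mark (i, j) with 1 when rows i and j share a time step (i < j only)."""
    n = len(block_mat)
    overlaps = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Two rows overlap if some time step is covered by both.
            if any(a == 1 and b == 1 for a, b in zip(block_mat[i], block_mat[j])):
                overlaps[i][j] = 1
    return overlaps


# Hypothetical toy input: each row holds blocks of 1's covering whole repeats.
block_mat = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
]
result = overlap_matrix(block_mat)
```

Here only rows 0 and 1 share time steps (steps 2 and 3), so only entry (0, 1) is set.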
- repytah.assemble.__compare_and_cut(red, red_len, blue, blue_len)
Compares two rows of repeats labeled RED and BLUE, and determines if there are any overlaps in time between them. If there are, then we cut the repeats in RED and BLUE into up to 3 pieces.
- Parameters
red (np.ndarray) – Binary row vector encoding a set of repeats with 1’s where each repeat starts and 0’s otherwise.
red_len (np.ndarray) – Length of repeats encoded in red.
blue (np.ndarray) – Binary row vector encoding a set of repeats with 1’s where each repeat starts and 0’s otherwise.
blue_len (np.ndarray) – Length of repeats encoded in blue.
- Returns
A tuple (union_mat, union_length) where all variables have data type np.ndarray.
union_mat is a binary matrix representation of up to three rows encoding non-overlapping repeats cut from red and blue.
union_length is a vector containing the lengths of the repeats encoded in union_mat.
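The idea of cutting two overlapping repeats into up to three pieces can be sketched for a single RED repeat and a single BLUE repeat. This is an illustration under stated assumptions, not the library's code: the helper name `cut_pair` is hypothetical, each repeat is given as a start and a length, and time steps are 0-based.

```python
def cut_pair(red_start, red_len, blue_start, blue_len):
    """Split two repeats into the steps only in red, in both, and only in blue."""
    red = set(range(red_start, red_start + red_len))
    blue = set(range(blue_start, blue_start + blue_len))
    pieces = {
        "red_only": sorted(red - blue),
        "shared": sorted(red & blue),
        "blue_only": sorted(blue - red),
    }
    # Drop empty pieces, so fewer than three may remain.
    return {name: steps for name, steps in pieces.items() if steps}


# RED covers steps 0-3 and BLUE covers steps 2-5, so they overlap on steps 2-3.
pieces = cut_pair(0, 4, 2, 4)
```

When one repeat sits strictly inside the other, the leftover piece of the longer repeat consists of two separate blocks of time steps, which is why a later step has to count blocks of consecutive time steps.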
- repytah.assemble.__num_of_parts(input_vec, input_start, input_all_starts)
Determines the number of blocks of consecutive time steps in a list of time steps. A block of consecutive time steps represents a distilled section of a repeat. This distilled section will be replicated and the starting indices of the repeats within it will be returned.
- Parameters
input_vec (np.ndarray) – Vector that contains one or two parts of a repeat that overlap in time and may need to be replicated.
input_start (np.ndarray) – Starting index for the part to be replicated.
input_all_starts (np.ndarray) – Starting indices for replication.
- Returns
A tuple (start_mat, length_vec) where all variables have data type np.ndarray.
start_mat is an array of one or two rows containing the starting indices of the replicated repeats.
length_vec is a column vector containing the lengths of the replicated parts.
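Counting blocks of consecutive time steps, and replicating each block at every occurrence of the repeat, might look like the sketch below. The helper names and the offset rule (each block is shifted by its distance from the start of the original part) are assumptions made for illustration; the input is assumed to be sorted and 0-based.

```python
def consecutive_blocks(time_steps):
    """Split a sorted list of time steps into runs of consecutive values."""
    blocks = []
    for t in time_steps:
        if blocks and t == blocks[-1][-1] + 1:
            blocks[-1].append(t)
        else:
            blocks.append([t])
    return blocks


def replicate_parts(time_steps, part_start, all_starts):
    """Return the replicated starts and the length of every block."""
    blocks = consecutive_blocks(time_steps)
    # Assumed rule: shift each occurrence by the block's offset within the part.
    start_mat = [[s + (block[0] - part_start) for s in all_starts] for block in blocks]
    length_vec = [len(block) for block in blocks]
    return start_mat, length_vec


# A repeat starting at step 0 that also occurs at step 20; after cutting,
# steps 0-1 and 6-7 of it remain, which form two blocks.
start_mat, length_vec = replicate_parts([0, 1, 6, 7], part_start=0, all_starts=[0, 20])
```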
- repytah.assemble.__inds_to_rows(start_mat, row_length)
Expands a vector containing the starting indices of a piece or two of a repeat into a matrix representation recording when these pieces occur in the song with 1’s. All remaining entries are marked with 0’s.
- Parameters
start_mat (np.ndarray) – Matrix of one or two rows, containing the starting indices.
row_length (int) – Length of the rows.
- Returns
Binary matrix of one or two rows, with 1’s where the starting indices are and 0’s otherwise.
- Return type
new_mat (np.ndarray)
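Expanding starting indices into binary rows can be sketched as follows. The helper name `starts_to_rows` is hypothetical, plain lists stand in for np.ndarray, and 0-based indexing is an assumption; the library may index time steps differently.

```python
def starts_to_rows(start_mat, row_length):
    """Build one binary row per list of starts, with 1's at the start indices."""
    rows = []
    for starts in start_mat:
        row = [0] * row_length
        for s in starts:
            row[s] = 1
        rows.append(row)
    return rows


# Two pieces of a repeat: one starting at steps 0 and 5, one at steps 2 and 7.
rows = starts_to_rows([[0, 5], [2, 7]], row_length=10)
```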
- repytah.assemble.__merge_based_on_length(full_mat, full_bw, target_bw)
Merges repeats that are the same length, as set by full_bw, and are repeats of the same piece of structure.
- Parameters
full_mat (np.ndarray) – Binary matrix with ones where repeats start and zeroes otherwise.
full_bw (np.ndarray) – Length of repeats encoded in full_mat.
target_bw (np.ndarray) – Lengths of repeats that we seek to merge.
- Returns
A tuple (out_mat, one_length_vec) where all variables have data type np.ndarray.
out_mat is a binary matrix with 1’s where repeats start and 0’s otherwise with rows of full_mat merged if appropriate.
one_length_vec is a vector that contains the length of repeats encoded in out_mat.
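Since only repeats of the same length are candidates for merging, a natural first step is to group rows by length. The sketch below shows that grouping only, not the merge itself; the helper name `group_rows_by_length` and the return shape are hypothetical.

```python
def group_rows_by_length(full_mat, full_bw, target_bw):
    """Collect rows whose repeat length is a target; leave the rest untouched."""
    groups = {bw: [] for bw in target_bw}
    untouched = []
    for row, bw in zip(full_mat, full_bw):
        if bw in groups:
            groups[bw].append(row)
        else:
            untouched.append((row, bw))
    return groups, untouched


# Hypothetical toy input: rows 0 and 2 encode repeats of length 2.
full_mat = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
]
full_bw = [2, 3, 2]
groups, untouched = group_rows_by_length(full_mat, full_bw, target_bw=[2])
```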
- repytah.assemble.__merge_rows(input_mat, input_width)
Merges rows that have at least one common repeat; said common repeat(s) must occur at the same time step and be of common length.
- Parameters
input_mat (np.ndarray) – Binary matrix with ones where repeats start and zeroes otherwise.
input_width (int) – Length of repeats encoded in input_mat.
- Returns
Binary matrix with ones where repeats start and zeroes otherwise.
- Return type
merge_mat (np.ndarray)
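A minimal sketch of merging rows that share a starting time step, assuming all rows encode repeats of the same length. The helper name `merge_rows_sketch` is hypothetical and the row order of the result is a choice made for this illustration, not a documented property of the library.

```python
def merge_rows_sketch(rows):
    """Union every group of rows that share at least one start position."""
    merged = []  # rows here never share a start with one another
    for row in rows:
        row = list(row)
        keep = []
        for other in merged:
            if any(a and b for a, b in zip(row, other)):
                # Common start found: absorb the other row with an element-wise OR.
                row = [a | b for a, b in zip(row, other)]
            else:
                keep.append(other)
        merged = keep + [row]
    return merged


# Rows 0 and 1 both have a repeat starting at step 3, so they are merged.
rows = [
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0, 0],
]
merge_mat = merge_rows_sketch(rows)
```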
- repytah.assemble.hierarchical_structure(matrix_no_overlaps, key_no_overlaps, sn, vis=False)
Distills the repeats encoded in matrix_no_overlaps (and key_no_overlaps) to the essential structure components and then builds the hierarchical representation. Optionally shows visualizations of the hierarchical structure via the vis argument.
- Parameters
matrix_no_overlaps (np.ndarray) – Binary matrix with 1’s where repeats begin and 0’s otherwise.
key_no_overlaps (np.ndarray) – Vector containing the lengths of the repeats encoded in matrix_no_overlaps.
sn (int) – Song length, which is the number of audio shingles.
vis (bool) – Shows visualizations if True (default = False).
- Returns
A tuple (full_visualization, full_key, full_matrix_no_overlaps, full_anno_lst) where all variables have data type np.ndarray.
full_visualization is a binary matrix representation for full_matrix_no_overlaps with blocks of 1’s equal to the lengths prescribed in full_key.
full_key is a vector containing the lengths of the hierarchical structure encoded in full_matrix_no_overlaps.
full_matrix_no_overlaps is a binary matrix with 1’s where hierarchical structure begins and 0’s otherwise.
full_anno_lst is a vector containing the annotation markers of the hierarchical structure encoded in each row of full_matrix_no_overlaps.
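The relationship between full_matrix_no_overlaps, full_key, and full_visualization can be sketched as expanding each start marker into a block of 1's of the prescribed length. The helper name `blocks_from_starts` and the toy values are hypothetical, and plain lists stand in for np.ndarray.

```python
def blocks_from_starts(start_mat, key):
    """Replace every start marker with a block of 1's of that row's length."""
    vis = []
    for row, length in zip(start_mat, key):
        out = [0] * len(row)
        for t, flag in enumerate(row):
            if flag == 1:
                # Clip at the end of the song so the block stays in range.
                for k in range(t, min(t + length, len(row))):
                    out[k] = 1
        vis.append(out)
    return vis


# Hypothetical toy values: a song of 8 time steps with two rows of structure.
full_matrix_no_overlaps = [
    [1, 0, 0, 0, 0, 1, 0, 0],  # structure of length 3 starts at steps 0 and 5
    [0, 0, 0, 1, 0, 0, 0, 0],  # structure of length 2 starts at step 3
]
full_key = [3, 2]
full_visualization = blocks_from_starts(full_matrix_no_overlaps, full_key)
```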