Tag: musical models

How Does Isomer Work?

Isomer is a suite of software tools that produce abstracted representations of the characteristic trends expressed in existing musical models. The system then uses these observed trends to create query and transformation algorithms, with the ultimate goal of generating of new/hybrid materials.

Isomer’s ten software modules are separated into two categories: representation and processing. Summary descriptions of each module are presented below.

Model Representation

Audio
Isomer-Audio analyzes audio files and populates a MySQL database with raw audio feature data at various time resolutions. The feature data remains uninterpreted but is organized into streams depending on the type of processing employed (i.e. blind source separation, spectral analysis).

Symbolic
Isomer-Symbolic ingests symbolic (e.g. MIDI) files and populates a MySQL database with raw event data. The event data is organized into streams (separated by track) containing collections of raw NodeEvents.

Model (A.R.E.)
Isomer-Model normalizes raw analysis data (symbolic and/or audio) and populates a single (or multiple) MySQL database(s) with observed and interpreted feature representations. Multiple model abstractions of the same input may exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model.

Query
Isomer-Query allows users to mine Isomer-generated model data for specific values and relevant trends. High dimensional queries (specified using time and stream/sub-stream ranges) can be made to model databases. Query results constitute the subset of model features available for subsequent processing.

Segment
Isomer-Segment detects perceptual changes over any subset of vectors in a given model. Once this new gestalt-map of the model is defined, Isomer-Segment searches for points of alignment between the various sub-streams (melody, texture, rhythm, harmony) and segments the model accordingly.

Model Processing

Similarity
To maximize usability of the representations, the system must be capable of flexibly comparing their traits to determine potential influence and overall relatedness. Isomer-Similarity performs this task by matching range limits or geometric contours of model data segments for any combination of vectors.

Classify
Isomer-Classify applies machine learning classifiers to expose relationships between groups of specific model features and trends found in the larger corpus or externally sourced data (such as human-generated meta-descriptions).

Transform
Isomer-Transform combines machine-learned feature trends with an extensive library of transform algorithms. Transformations can be applied across parameters and across any number of models, giving Isomer the ability to generate original musical material.

Render
Isomer-Render is responsible for orchestrating Isomer-Transform output and rendering performances. Today, final results are auditioned by humans. In future updates, Isomer-Render will employ a series of fitness functions to allow users to establish targeted output.

Keyword
Isomer-Keyword is a standalone module designed to collect, catalog and datamine human-generated descriptors. Its purpose is to provide a framework for researching the associative connections between the keywords themselves. From there, the content representation produced by Isomer can be analyzed to discover points of contact; where musical features closely associate with keyword descriptors.

This modular architecture makes Isomer ideal for a wide range of applications. Modules can be connected in myriad ways to create a wide range of workflows options.

Model-Based Music Generation

Trained Description Tagging

May 25, 2017
Representing Musical Models (the Isomer way)
Isomer is a suite of software that produces an abstracted representation of the characteristic trends present in musical language through the analysis of symbolic or audio input. Isomer uses this abstracted representation to execute query and transformation algorithms, with the ultimate goal of generating of new/hybrid materials.

Isomer assembles a unified model from raw analysis data (symbolic and/or audio) and populates a single (or multiple) database(s) with observed and interpreted feature data representations. Multiple model abstractions can exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model. Model data is separated into four musical components: melody, harmony, rhythm and timbre depending on source availability.

Data Representation Overview

Isomer provides normalized storage of feature data in a Cartesian coordinate (grid) format. This allows input from various sources to combine in a single model representation suitable for further analysis and abstraction.

Cartesian Grid Format

Horizontal Axis

The horizontal (X) axis represents time with sampled data appearing at a regular, user-defined time interval (default = 2 ms). Each vertical column of data is called a slice.

Why is 2ms the default Slice window?

2ms is the smallest time interval at which humans can detect multiple transients and allows for complete capture of MIDI files with resolution of 30bpm @ 960ppq (or 60bpm @ 480ppq).

Vertical Axis

The vertical (Y) axis contains individual streams (type is dependent on input). Each stream forms a horizontal row that is further divided into four sub-stream categories: melody, rhythm, timbre and harmony.

Each grid point contains multidimensional data for each of the stream sub-categories in time at a resolution defined by the user.

Observed and Extrapolated Data Fields

The source of input data (audio or symbolic) is determined at the time of model creation. Once the model is generated, the original format (audio or symbolic) of the raw data is no longer an issue, although input format can limit the parameters available (i.e. symbolic input contains no texture data).

Regardless of the input format, Isomer attempts to collect data without interpretation using the most accurate methods of analysis available. These data fields are treated as empirical observations. However, during model creation Isomer also calculates additional parameters to better prepare the data for use in a musical context. These are extrapolated data fields.

The grid format described above requires two coordinates: stream/sub-stream (Y) and a series of analysis windows (X). Each point in the grid contains data for the following parameters:

* indicates an extrapolated data field
- Melody
  - Pitch (hertz, midi)
  - Pitch IR *
  - IR Equity *
  - Proximity Equity *
  - Registral Return *
- Rhythm
  - Band (describes Bark Scale freq range where onset is detected)
  - Intensity (percentage, velocity)
  - Onset Equity *
- Timbre
  - RMS
  - Spectral Stability
  - Spectral Flatness
  - MFCC (13)
- Harmony
  - Pitches (hertz, midi, intensity)
  - Average Chroma
  - Total Pitches *
  - Tension *
Multi-Event Query Parameters

Querying musical data in Isomer allows users to mine the models for musically relevant trends. The user supplies coordinates as time and stream/sub-stream ranges. This input is translated into the matrix coordinates (as defined by the model) before the query is executed.

The interface returns the results as time points (X-axis) and streams/sub-streams (Y-axis) for a given model and allows on-demand calculation of any/all of the following data for any grid coordinates or range of coordinates:
- Melody
  - Registral Direction
  - Registral Return
  - Proximity Ratio
  - Interval Range
  - Melodic Accent
- Rhythm
  - Temporal Direction
  - Temporal Instability
  - IOI Range
  - NPVI IOI
  - NPVI Duration
  - Agogic Accent
  - Syncopation
- Timbre
  - Texture Fingerprint
  - Dynamic (Timbre) Accent
- Harmony
  - Tension Direction
  - Tension Instability
  - Average Tension
Increasing Understanding Through Multiple Perspectives

Isomer extrapolates parameters in addition to the observed data only after window-based quantization takes place. The interpretation that necessarily occurs during quantization can create desirable variations in the resulting model representation. By generating several models from the same source input with varying window sizes, Isomer can create multiple representations of a single input source.

The power of implementing a series of interpreted abstractions is that the raw data can be represented multiple times at varying depths from the original (raw) observations regardless of the model input format, giving the user a variety of ways to view and examine the model. For symbolic input, this provides a useful way to control the level of detail captured in the model. In the case of audio input, the user can find the most appropriate representation for a specific input file; a problem inherent in the musical analysis of raw audio.

For example, determining the location of event onsets, “beats”, or perceptual pulse points may require a high-resolution view of feature data, while the detection of more expansive musical elements like harmonic rhythm may benefit significantly from a more general reduction of the raw DSP data.

In conjunction with the Isomer’s query function, multiple versions of the abstraction layer can exist simultaneously in discrete databases. This design is a key feature of Isomer that encourages the emergence of a desired representation schema, as dictated by the specific task the user has in mind.
January 9, 2015
Inferring Meaning from Expectation

Of all the sublime moments in operatic literature, few surpass “Pur ti miro, pur ti godo” (I gaze at you, I possess you), the final love duet between Nero and Poppea in Monteverdi’s “L’incoronazione di Poppea”.

This enduring masterpiece aims to represent love’s supreme power over all other forces through the unifying of the couple in unbreakable bonds of love.

(more…)

October 19, 2012