Category: Isomer Software

Isomer is a suite of software tools that produce abstracted representations of the characteristic trends expressed in existing musical models. The system then uses these observed trends to create query and transformation algorithms, with the ultimate goal of generating new/hybrid materials.

  • Isomer On Record

    Isomer On Record

    An EP featuring music collaboratively composed with Isomer between 2017 and 2018 is now available on a wide range of streaming sites.

    Here’s the track list:

    1. Dark Halls (4:32)
    2. Bundles of Superstition (1:51)
    3. Structural Quarters (1:43)
    4. Other Tiny Magic (3:14)
    5. After October (1:59)

    Enjoy at your favorite musical watering hole!

    Answer is Waves on Spotify
    Answer is Waves on Apple Music
    Answer is Waves on Tidal
    Answer is Waves on Google Play
    Answer is Waves on YouTube Music

  • Composing Music with Isomer

    Composing Music with Isomer

    After graduating from Eastman in 2002, I got involved with music informatics research and spent several years developing music search and discovery technologies. During that time, I was exposed to emerging AI that led me to imagine potential solutions to existing challenges in computational creativity. Today, I’m working to develop methods for human and software collaboration with the goal of generating novel musical expression grounded in familiar perceptual structures.

    Unlike other approaches to computational creativity, I’m not interested in directly modeling the creative process in humans. Instead, I’m searching for a musical result that speaks emotionally through a context-aware, emergent grammar — a language that feels natural and engaging, even when its surface is unfamiliar. It’s a tall order, and something that I’ll be working toward for a long time to come.

    What is Isomer?

    Since early 2012, I’ve been developing my own software (called Isomer), which allows me to move flexibly between interpretations of fixed audio sources, symbolic performances, and detailed orchestrational renderings. (You can read more about how Isomer thinks about music here.)

    What makes working with Isomer unique is the advanced role the software plays in musically interpreting the raw input data. Over time, Isomer is learning how human-composed music creates and satisfies expectation (emotional tension) and applying this knowledge to the output it generates.

    Making Machines Musical

    Applying deep learning technologies to artistic expression is a popular trend. But even in the best cases, this approach generally results in simple, distorted copies of the original input. While it can produce uncanny effects, the results remain tightly bound to the limitations of the input models. In my view, creativity requires the ability to forge new connections between existing contexts, and the missing ingredient in current machine learning approaches is the conscious application of context within the creative medium. Put more precisely: the contextual classification of the musical elements that define musical intent and expectation.

    These critical elements do not appear to reveal themselves with current deep learning methods, which is why Isomer must become capable of deciding whether or not to apply learned rules to guide musical expression within an ever-changing artistic context. A simple example of this might be the application of a crescendo to a rising melodic line. There isn’t a single, ideal crescendo that will work in every case. Determining whether it’s an appropriate addition, and working out exactly how it should be executed, depends heavily on the melodic, harmonic, and timbral environment. Even if Isomer can learn how to apply an appropriate solution, the contextually dependent decision of whether or not to do so still remains.
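
    As a loose illustration of that kind of decision, here is a minimal sketch (plain Python, not Isomer code; the feature names and thresholds are hypothetical) of a rule that first checks whether a crescendo fits the context at all, and only then decides how steep it should be:

    ```python
    # Hypothetical sketch: deciding whether (and how) to apply a crescendo.
    # Nothing here comes from Isomer; it only illustrates how a learned rule
    # can be accepted, rejected, or reshaped by its musical context.

    def plan_crescendo(pitches, velocities, harmonic_tension):
        """Return a per-note velocity ramp, or None if a crescendo doesn't fit."""
        n = len(pitches)
        ascending = n > 1 and all(b >= a for a, b in zip(pitches, pitches[1:]))
        headroom = 127 - max(velocities)       # room left before dynamics clip
        if not ascending or headroom < 10:
            return None                        # context rejects the learned rule
        # Scale the ramp by how much the harmony is already building tension.
        steepness = min(headroom, 10 + int(20 * harmonic_tension))
        return [velocities[i] + round(steepness * i / (n - 1)) for i in range(n)]

    # A rising four-note line over moderately tense harmony.
    print(plan_crescendo([60, 62, 65, 67], [64, 66, 64, 65], harmonic_tension=0.5))
    ```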

    More complex examples include: how to finalize a musical gesture (e.g. registral return, agogic accent, dynamic contrast, none/all of the above?), how to effect a harmonic transition (i.e. adjust harmonic tension within a specific context), and how to orchestrate evolving variations of a specific melodic idea. These examples show how important the interpretation of musical events within their near-term context can be, and it’s for this reason that context awareness is the primary focus of ongoing development. With that in mind, let’s take a look at how I collaborate with Isomer throughout the composition process.

    Creative Collaboration: Isomer’s Role

    Isomer’s first job in the process is to reverse engineer audio input to determine how its spectral components change over time. Additionally, Isomer searches for perceptually important transient attack points or “event onsets”. Each event onset defines an adaptive analysis window that provides rhythmic context and helps Isomer develop large-scale formal pacing.
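
    Isomer’s own analysis code isn’t shown here, but this general step (locating event onsets so they can anchor adaptive analysis windows) can be sketched with an off-the-shelf library such as librosa:

    ```python
    # Sketch of onset detection with librosa; not Isomer's implementation.
    import librosa

    y, sr = librosa.load("input.wav", sr=None)    # keep the file's native sample rate

    # Perceptually salient attack points, returned in seconds.
    onset_times = librosa.onset.onset_detect(y=y, sr=sr, units="time", backtrack=True)

    # Each consecutive pair of onsets bounds one adaptive analysis window.
    windows = list(zip(onset_times[:-1], onset_times[1:]))
    print(f"{len(onset_times)} onsets -> {len(windows)} analysis windows")
    ```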

    Within each window, partial data is extracted, normalized, and stored. A picture of the composite harmonic layers is assembled (from bottom to top), allowing Isomer to define a series of layered monophonic lines. These lines are then shaped into useful musical material using learned principles of musical expectation and universal construction.
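
    The partial-extraction step can be sketched in a similarly generic way (again with librosa and SciPy rather than Isomer’s code): pick the spectral peaks in one frame of an analysis window and read them from bottom to top as candidate partials.

    ```python
    # Hypothetical sketch: per-frame partial extraction via STFT peak picking.
    import numpy as np
    import librosa
    from scipy.signal import find_peaks

    y, sr = librosa.load("input.wav", sr=None)
    S = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))   # magnitude spectrogram
    freqs = librosa.fft_frequencies(sr=sr, n_fft=2048)

    frame = S[:, min(40, S.shape[1] - 1)].copy()   # one frame inside some analysis window
    frame /= frame.max() or 1.0                    # normalize so the threshold is relative
    peaks, _ = find_peaks(frame, height=0.1)       # bins that stand out as partials

    # Bottom-to-top ordering: lower partials first, ready to seed layered monophonic lines.
    for f, mag in sorted(zip(freqs[peaks], frame[peaks]))[:5]:
        print(f"{f:7.1f} Hz  amplitude {mag:.2f}")
    ```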

    Once the model is established (click here for details), Isomer can search for musically relevant patterns and modify the output based on composer input. For some applications, the composer may wish to keep the original analysis generally intact, while in other cases, it may be desirable to ask Isomer to add, remove, or modify events to create a result that is more predictable and universally appealing in its construction.

    Creative Collaboration: The Composer’s Role

    What’s left for the human composer to contribute, you ask? First, the composer must curate the input source(s). Whether it’s a series of short “found” sounds or a meticulously crafted concrète work, the input source has a significant impact on the harmonic content and dramatic pacing of the resulting work.

    Speaking of output, Isomer’s analysis often results in hundreds of tracks, but only a few (maybe 24 or so?) are likely to be used in the completed work. The composer must evaluate the potential for each of these output tracks and determine which will make it into the piece.

    In early versions of the software, the composer completely determined the orchestration. Today, Isomer sets out a basic orchestrational plan, which the composer then modifies. In future versions, Isomer will evaluate the timbral signatures of orchestral options to determine the most musically useful combinations of instruments to employ within the varying emotional states of the work.

    And finally, the composer must digitally realize and produce the resulting piece. Digital production isn’t traditionally considered part of the compositional process, but for me (as with many composers of electronic music) it is. Machine listening is still a long way from dealing with detailed production choices, and these decisions can have an enormous effect on the resulting piece. So for now, production must be up to the composer.

    A Final Thought

    Keep in mind that while exploring creative collaboration with computers is an interesting topic, these process details really don’t define anything too important. What’s truly important is the experience gained by listening. And that will always remain the biggest challenge for anyone involved in creative pursuits — computationally collaborative, or otherwise.

  • How Does Isomer Work?

    How Does Isomer Work?

    Isomer is a suite of software tools that produce abstracted representations of the characteristic trends expressed in existing musical models. The system then uses these observed trends to create query and transformation algorithms, with the ultimate goal of generating new/hybrid materials.

    Isomer’s ten software modules are separated into two categories: representation and processing. Summary descriptions of each module are presented below.

    Model Representation

    Audio
    Isomer-Audio analyzes audio files and populates a MySQL database with raw audio feature data at various time resolutions. The feature data remains uninterpreted but is organized into streams depending on the type of processing employed (e.g. blind source separation, spectral analysis).
    Symbolic
    Isomer-Symbolic ingests symbolic (e.g. MIDI) files and populates a MySQL database with raw event data. The event data is organized into streams (separated by track) containing collections of raw NodeEvents.
    Model (A.R.E.)
    Isomer-Model normalizes raw analysis data (symbolic and/or audio) and populates a single (or multiple) MySQL database(s) with observed and interpreted feature representations. Multiple model abstractions of the same input may exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model.
    Query
    Isomer-Query allows users to mine Isomer-generated model data for specific values and relevant trends. High dimensional queries (specified using time and stream/sub-stream ranges) can be made to model databases. Query results constitute the subset of model features available for subsequent processing.
    Segment
    Isomer-Segment detects perceptual changes over any subset of vectors in a given model. Once this new gestalt-map of the model is defined, Isomer-Segment searches for points of alignment between the various sub-streams (melody, texture, rhythm, harmony) and segments the model accordingly.

    Model Processing

    Similarity
    To maximize usability of the representations, the system must be capable of flexibly comparing their traits to determine potential influence and overall relatedness. Isomer-Similarity performs this task by matching range limits or geometric contours of model data segments for any combination of vectors.
    Classify
    Isomer-Classify applies machine learning classifiers to expose relationships between groups of specific model features and trends found in the larger corpus or externally sourced data (such as human-generated meta-descriptions).
    Transform
    Isomer-Transform combines machine-learned feature trends with an extensive library of transform algorithms. Transformations can be applied across parameters and across any number of models, giving Isomer the ability to generate original musical material.
    Render
    Isomer-Render is responsible for orchestrating Isomer-Transform output and rendering performances. Today, final results are auditioned by humans. In future updates, Isomer-Render will employ a series of fitness functions to allow users to establish targeted output.
    Keyword
    Isomer-Keyword is a standalone module designed to collect, catalog, and data-mine human-generated descriptors. Its purpose is to provide a framework for researching the associative connections between the keywords themselves. From there, the content representation produced by Isomer can be analyzed to discover points of contact: places where musical features closely associate with keyword descriptors.
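
    To make the Classify idea a little more concrete, here is a minimal, generic sketch (scikit-learn, with invented feature values and tags; it does not reflect Isomer’s internals or data) of mapping segment-level feature vectors to human-generated descriptors:

    ```python
    # Generic classification sketch with scikit-learn; the feature values and
    # tags are invented for illustration and are not Isomer data.
    from sklearn.ensemble import RandomForestClassifier

    # Each row: [average tension, onset density, spectral flatness] for one segment.
    features = [
        [0.80, 6.0, 0.10],
        [0.20, 1.5, 0.40],
        [0.75, 5.5, 0.15],
        [0.15, 1.0, 0.35],
    ]
    tags = ["agitated", "calm", "agitated", "calm"]   # human-generated descriptors

    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(features, tags)
    print(clf.predict([[0.70, 5.0, 0.20]]))           # expected: ['agitated']
    ```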

     

    This modular architecture makes Isomer ideal for a wide range of applications. Modules can be connected in myriad ways to create a wide range of workflow options.

    Model-Based Music Generation

    (Figure: Isomer Music Generation)

    Trained Description Tagging

    (Figure: Isomer Tagging Process Flow)

  • How Can I Use Isomer?

    How Can I Use Isomer?

    Isomer’s flexible architecture makes it ideal for a wide range of research, creative, and industry applications. The current application roadmap focuses on musical applications; however, there’s nothing to prevent Isomer’s representation and processing algorithms from being applied to other multi-vector, time-series data.

    While not the current focus, exploration into non-musical applications is also under consideration.

    Industry Applications

    Audio Tagging
    Isomer can correlate audio tracks from a music library with detailed, human-generated descriptions of mood, texture, and commercial usage. From there, Isomer can apply this learned taxonomy to new audio tracks (as metadata) for use in text-based search engines. This application may also be used to clean and unify existing metadata across multiple libraries.
    SFX Classifier
    Isomer can classify one-shot audio files from a sound effect library based on pitch content (melodic or harmonic), timbral trends, rhythmic profile, and/or length.
    Symbolic Similarity
    Isomer is capable of comparing symbolic (MIDI) files for similarity in terms of melodic or rhythmic contour, motivic or harmonic content, and more. Because Isomer’s representations are perceptual in nature, close (but not exact) copies can be detected using a user-defined threshold (see the sketch after this list).
    Audio Search
    Isomer can directly compare audio files for similarity in terms of pitch (melodic or harmonic), timbral, and/or rhythmic trends.
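
    Here is the sketch referenced above for Symbolic Similarity (a generic illustration, not Isomer’s algorithm; the distance measure and threshold are arbitrary choices): melodies are compared as interval sequences, so exact transpositions match perfectly and near-copies fall within a user-defined threshold.

    ```python
    # Generic contour-similarity sketch; not Isomer's algorithm.

    def interval_contour(midi_pitches):
        return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

    def contour_distance(a, b):
        ca, cb = interval_contour(a), interval_contour(b)
        n = min(len(ca), len(cb))
        if n == 0:
            return float("inf")
        return sum(abs(x - y) for x, y in zip(ca, cb)) / n   # mean interval difference

    original = [60, 62, 64, 65, 67]      # C D E F G
    transposed = [62, 64, 66, 67, 69]    # same contour, up a whole tone
    unrelated = [60, 65, 63, 70, 68]     # different interval profile

    threshold = 1.0                       # user-defined tolerance (semitones per step)
    for name, candidate in [("transposed", transposed), ("unrelated", unrelated)]:
        d = contour_distance(original, candidate)
        print(f"{name}: distance {d:.2f} -> {'match' if d <= threshold else 'no match'}")
    ```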

    Creative Applications

    Music Composition
    Isomer’s ability to convert existing music into highly precise or abstract perceptual models makes it ideal for developing a library of musical models. Once analyzed, these models can be used in any number of ways to influence novel musical output. Over time, and in combination with machine learning techniques, the system may be capable of developing unique compositional styles.
    Sonification
    By deconstructing an audio signal into its component parts and then recombining and orchestrating these parts, Isomer can generate output (either symbolic or audio) to creatively imitate the original source signal.
    Sample Replacement
    A single seed sample can generate a list of similar options from a user-defined library based on fundamental pitch content (melodic), harmonic trends, timbral trends, rhythmic profile, and/or length.
    Automated Orchestration
    Using either symbolic or audio models tagged with human-generated descriptions, Isomer can orchestrate musical ideas for either score or audio rendering.

     

    Can I Use Isomer In My Own Projects?

    Unfortunately, Isomer isn’t available for public use. At least not yet.

    While the software has proven to be extremely robust (development is test-driven with long-run reliability and portability as primary goals), the modules are currently managed via CLI. In other words, by modern standards, it’s incredibly cumbersome to use.

    This is by design, however. The top priority is to keep the system maximally flexible while work is completed on all modules and I discover the strengths and weaknesses of the system. Once testing is complete, I’ll look at releasing feature-limited versions of Isomer for use in various environments and applications.

    Of course, I’m always open to new ideas, so if you have something in mind, feel free to contact me with details.

  • Representing Musical Models (the Isomer way)

    Representing Musical Models (the Isomer way)

    Isomer is a suite of software that produces an abstracted representation of the characteristic trends present in musical language through the analysis of symbolic or audio input. Isomer uses this abstracted representation to execute query and transformation algorithms, with the ultimate goal of generating new/hybrid materials.

    Isomer assembles a unified model from raw analysis data (symbolic and/or audio) and populates a single (or multiple) database(s) with observed and interpreted feature data representations. Multiple model abstractions can exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model. Model data is separated into four musical components (melody, harmony, rhythm, and timbre), depending on source availability.

    Data Representation Overview

    Isomer provides normalized storage of feature data in a Cartesian coordinate (grid) format. This allows input from various sources to combine in a single model representation suitable for further analysis and abstraction.
    (Figure: Isomer-Model Data Representation)

    Cartesian Grid Format

    Horizontal Axis

    The horizontal (X) axis represents time, with sampled data appearing at a regular, user-defined time interval (default = 2 ms). Each vertical column of data is called a slice.

    Why is 2 ms the default slice window?

    2 ms is the smallest time interval at which humans can detect multiple transients, and it allows for complete capture of MIDI files with a resolution of 30 bpm @ 960 ppq (or 60 bpm @ 480 ppq).
    (Figure: Isomer-Model Cartesian Grid)
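
    The tick arithmetic behind that claim is easy to check: at 30 bpm a quarter note lasts two seconds, so 960 ticks per quarter comes out to roughly 2.08 ms per tick, just above the 2 ms slice.

    ```python
    # Checking the default slice window against MIDI tick resolution.
    def tick_ms(bpm, ppq):
        return 60_000 / bpm / ppq     # milliseconds per MIDI tick

    print(tick_ms(30, 960))   # ~2.083 ms per tick
    print(tick_ms(60, 480))   # ~2.083 ms per tick
    # Both are >= 2 ms, so a 2 ms slice never skips over a tick.
    ```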

    Vertical Axis

    The vertical (Y) axis contains individual streams (type is dependent on input). Each stream forms a horizontal row that is further divided into four sub-stream categories: melody, rhythm, timbre and harmony.
    (Figure: Isomer-Model Sub-Streams)
    Each grid point contains multidimensional data for each of the sub-stream categories, sampled in time at the resolution defined by the user.
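
    As a rough sketch (illustrative only, not Isomer’s actual schema), one grid point might be held in a structure like this, with the field contents following the parameter list in the next section:

    ```python
    # Illustrative sketch of one grid point (slice x stream); not Isomer's schema.
    from dataclasses import dataclass, field

    @dataclass
    class GridPoint:
        slice_index: int                              # X: which slice (e.g. 2 ms step)
        stream: str                                   # Y: which input stream
        melody: dict = field(default_factory=dict)    # e.g. {"pitch_hz": ..., "pitch_midi": ...}
        rhythm: dict = field(default_factory=dict)    # e.g. {"band": ..., "intensity": ...}
        timbre: dict = field(default_factory=dict)    # e.g. {"rms": ..., "mfcc": [...]}
        harmony: dict = field(default_factory=dict)   # e.g. {"pitches": [...], "tension": ...}

    point = GridPoint(slice_index=1200, stream="track_1",
                      melody={"pitch_hz": 440.0, "pitch_midi": 69})
    print(point.melody["pitch_midi"])
    ```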

    Observed and Extrapolated Data Fields

    The source of input data (audio or symbolic) is determined at the time of model creation. Once the model is generated, the original format (audio or symbolic) of the raw data is no longer an issue, although the input format can limit the parameters available (e.g. symbolic input contains no texture data).

    Regardless of the input format, Isomer attempts to collect data without interpretation using the most accurate methods of analysis available. These data fields are treated as empirical observations. However, during model creation Isomer also calculates additional parameters to better prepare the data for use in a musical context. These are extrapolated data fields.

    The grid format described above requires two coordinates: stream/sub-stream (Y) and a series of analysis windows (X). Each point in the grid contains data for the following parameters:

    * indicates an extrapolated data field

    • Melody
      • Pitch (hertz, midi)
      • Pitch IR *
      • IR Equity *
      • Proximity Equity *
      • Registral Return *
    • Rhythm
      • Band (describes Bark Scale freq range where onset is detected)
      • Intensity (percentage, velocity)
      • Onset Equity *
    • Timbre
      • RMS
      • Spectral Stability
      • Spectral Flatness
      • MFCC (13)
    • Harmony
      • Pitches (hertz, midi, intensity)
      • Average Chroma
      • Total Pitches *
      • Tension *
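
    As one example of how an extrapolated field might be derived from the observed ones, here is a deliberately simplified registral-return check (my own reading of the idea, not Isomer’s actual rule): the third note of a group reverses direction and lands close to where the first note began.

    ```python
    # Simplified registral-return sketch; not Isomer's actual rule.
    def registral_return(p1, p2, p3, tolerance=2):
        moved_away = p2 != p1
        reversed_direction = (p3 - p2) * (p2 - p1) < 0
        returned_near_origin = abs(p3 - p1) <= tolerance
        return moved_away and reversed_direction and returned_near_origin

    print(registral_return(60, 67, 61))   # leap up, step back near the origin -> True
    print(registral_return(60, 67, 72))   # keeps rising -> False
    ```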

    Multi-Event Query Parameters

    Querying musical data in Isomer allows users to mine the models for musically relevant trends. The user supplies coordinates as time and stream/sub-stream ranges. This input is translated into the matrix coordinates (as defined by the model) before the query is executed.
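
    A minimal sketch of that translation step (hypothetical names and values; the real interface is not shown here): a time range in seconds and a stream/sub-stream pair are mapped onto grid coordinates before the query runs.

    ```python
    # Hypothetical sketch of translating a user query into grid coordinates.
    SLICE_MS = 2          # default slice window from the model definition
    SUB_STREAMS = ["melody", "rhythm", "timbre", "harmony"]

    def to_grid(start_s, end_s, stream_index, sub_stream):
        x_range = (int(start_s * 1000 // SLICE_MS), int(end_s * 1000 // SLICE_MS))
        y = (stream_index, SUB_STREAMS.index(sub_stream))
        return x_range, y

    # Query the melody sub-stream of stream 3 between 12.0 s and 15.5 s.
    print(to_grid(12.0, 15.5, stream_index=3, sub_stream="melody"))
    # -> ((6000, 7750), (3, 0))
    ```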

    The interface returns the results as time points (X-axis) and streams/sub-streams (Y-axis) for a given model and allows on-demand calculation of any/all of the following data for any grid coordinates or range of coordinates:

    • Melody
      • Registral Direction
      • Registral Return
      • Proximity Ratio
      • Interval Range
      • Melodic Accent
    • Rhythm
      • Temporal Direction
      • Temporal Instability
      • IOI Range
      • NPVI IOI
      • NPVI Duration
      • Agogic Accent
      • Syncopation
    • Timbre
      • Texture Fingerprint
      • Dynamic (Timbre) Accent
    • Harmony
      • Tension Direction
      • Tension Instability
      • Average Tension

    Increasing Understanding Through Multiple Perspectives

    Isomer extrapolates parameters in addition to the observed data only after window-based quantization takes place. The interpretation that necessarily occurs during quantization can create desirable variations in the resulting model representation. By generating several models from the same source input with varying window sizes, Isomer can create multiple representations of a single input source.

    The power of implementing a series of interpreted abstractions is that the raw data can be represented multiple times at varying depths from the original (raw) observations, regardless of the model input format, giving the user a variety of ways to view and examine the model. For symbolic input, this provides a useful way to control the level of detail captured in the model. For audio input, it helps the user find the most appropriate representation for a specific input file, a problem inherent in the musical analysis of raw audio.

    For example, determining the location of event onsets, “beats”, or perceptual pulse points may require a high-resolution view of feature data, while the detection of more expansive musical elements like harmonic rhythm may benefit significantly from a more general reduction of the raw DSP data.
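
    That trade-off can be sketched generically with librosa (not Isomer’s code): a small hop length gives onset-level detail, while a much longer window and hop give the coarser view better suited to harmonic rhythm.

    ```python
    # Generic multi-resolution sketch with librosa; not Isomer's implementation.
    import librosa

    y, sr = librosa.load("input.wav", sr=None)

    # Fine-grained view: useful for locating onsets and pulse points.
    onset_env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=64)

    # Coarse view: chroma over long windows and hops, better suited to harmonic rhythm.
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=8192, hop_length=4096)

    print(onset_env.shape, chroma.shape)   # many fine frames vs. far fewer coarse frames
    ```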

    In conjunction with Isomer’s query function, multiple versions of the abstraction layer can exist simultaneously in discrete databases. This design is a key feature of Isomer that encourages the emergence of a desired representation schema, as dictated by the specific task the user has in mind.

  • Automating Musical Descriptions: A Case Study

    Automating Musical Descriptions: A Case Study

    It’s widely accepted that music elicits similar emotional responses from culturally connected groups of human listeners. Less clear is how various aspects of musical language contribute to these effects.

    Funded by a Mellon Foundation Research Grant through Dickinson College Digital Humanities, our research into this question leverages cognition-based machine listening algorithms and network analysis of musical descriptors to identify the connections between musical affect and language.

    (more…)