Tag: music cognition

  • Debussy is Overrated

    Debussy is Overrated

    or… “How Others Did It So Much Better”

    The following is a video script I prepared, but never filmed. In the end, I decided I had better things to do, but perhaps the idea is of interest?


    VIDEO: Me playing the Drowning Church

    VOICE OVER
    When I first encountered Debussy, it was here. With this piece.

    At the time, my hands weren’t yet big enough to reach the wide spreads… and strange shapes… but that’s not why I found it difficult. It was difficult because I couldn’t find a way to convincingly express anything musical.

    It was like the composer needed to convince himself that gravity didn’t apply to him – in my hands, the music couldn’t escape a suffocating heaviness… everything… just… went nowhere…

    Debussy was not for me.


    VIDEO: Greg speaks directly to the camera while music continues in the background

    Of course, Debussy’s influence then and now is undeniable. But I’m still not convinced. And 40 years later, I think I know why…


    CHANNEL SPLASH


    VIDEO: Greg speaks to camera
    TEXT OVERLAY: “Debussy Oversimplified”

    Debussy’s legacy and influence come down to a few things:

    • he rejected common practice counterpoint and voice leading
    • relied on parallel harmony that broke formal conventions
    • and most important – he put sound texture above all else (symbolism)

    VIDEO: composer headshots of Ravel, Satie, Scriabin…

    BUT the thing is… other composers around that time were making similar choices – advancing musical language – and they were writing better music.

    VIDEO: New Location – Greg Speaks to camera

    Musical events are connected in time by our brains – this is an unavoidable principle of musical language. One event next to another – implies a range of NEXT options… and if we go too far outside what was implied… the connection can be lost.

    Ignoring this is like deciding to ignore physics just so you can build a perpetual motion machine.

    Let me show you what I mean.


    VIDEO: Me playing Moons Over My Hammy
    TEXT OVERLAY: “Sweet Anticipation”

    VOICE OVER
    Let’s take the opening of one of Debussy’s most famous works – Clair de Lune.

    If at any moment…

    ANYTHING can happen…

    The connection is lost. The implication isn’t strong enough to suggest a satisfying resolution.

    A sense of anticipation is never created – because just about anything… can happen… and all we can do is take it in.

    Anticipation requires that we look forward to something… and THAT depends on what we JUST heard.


    VIDEO: New Location – Greg Speaks to camera

    If you know this piece, you realize that I just started adding random bits of… whatever. If you don’t know this piece, you probably weren’t bothered.

    Now… creating a sense of suspended animation – of floating aimlessly in a hazy mist – can be done in so many wonderfully compelling ways – ways that maintain and deepen connections to the way our brains listen – and ways that still feel as “magical” as anything you can dream up…


    VIDEO: Me playing Gymnopédie No. 1
    TEXT OVERLAY: “Suspended Animation Doesn’t Exist Without Movement”

    VOICE OVER
    Here the composer sets up several expectations…

    A “looping” accompaniment of sorts…

    And a beautiful melody that has clear direction…

    Even if it takes some time to complete…

    This is Erik Satie. He wrote this in 1888, and in many ways it is the earliest “ambient” music.

    But it creates a sense of stillness through the illusion of suspended animation. Expectations are set forth and satisfied on multiple levels and we don’t have to ignore the way our brains embrace music!


    VIDEO: Me playing Scriabin Op. 74
    TEXT OVERLAY: “To Break Rules, You Must First Know The Rules”

    VOICE OVER
    In this case, Alexander Scriabin has discarded traditional harmony in favor of symmetrical harmonic structures. He’s breakin’ the rules. (what a rebel!)

    But a sense of anticipation with a relatively limited set of options is maintained.

    And yet we’re absolutely floating in a fragrant cloud of misty goodness – gravity is intact. And the anticipation is sweet.


    VIDEO: New Location – Greg Speaks to camera

    So there it is.

    40 years after I struggled to give this music direction…

    I learned that it doesn’t have any. At least not to my ears…

    VIDEO: Me playing the Drowning Church

    VOICE OVER
    I’d like to know what you think about Debussy and my take… Leave your thoughts in the comments below. I look forward to you showing me the error of my ways!

  • How Does Isomer Work?

    How Does Isomer Work?

    Isomer is a suite of software tools that produce abstracted representations of the characteristic trends expressed in existing musical models. The system then uses these observed trends to create query and transformation algorithms, with the ultimate goal of generating new/hybrid materials.

    Isomer’s ten software modules are separated into two categories: representation and processing. Summary descriptions of each module are presented below.

    Model Representation

    Audio
    Isomer-Audio analyzes audio files and populates a MySQL database with raw audio feature data at various time resolutions. The feature data remains uninterpreted but is organized into streams depending on the type of processing employed (e.g. blind source separation, spectral analysis).
    Symbolic
    Isomer-Symbolic ingests symbolic (e.g. MIDI) files and populates a MySQL database with raw event data. The event data is organized into streams (separated by track) containing collections of raw NodeEvents (see the ingestion sketch after this list).
    Model (A.R.E.)
    Isomer-Model normalizes raw analysis data (symbolic and/or audio) and populates a single (or multiple) MySQL database(s) with observed and interpreted feature representations. Multiple model abstractions of the same input may exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model.
    Query
    Isomer-Query allows users to mine Isomer-generated model data for specific values and relevant trends. High dimensional queries (specified using time and stream/sub-stream ranges) can be made to model databases. Query results constitute the subset of model features available for subsequent processing.
    Segment
    Isomer-Segment detects perceptual changes over any subset of vectors in a given model. Once this new gestalt-map of the model is defined, Isomer-Segment searches for points of alignment between the various sub-streams (melody, texture, rhythm, harmony) and segments the model accordingly.
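
    As a concrete illustration of the symbolic side, here is a minimal sketch of how per-track event streams might be pulled from a MIDI file. It uses the mido library; the NodeEvent fields and the ingest function are illustrative assumptions, not Isomer-Symbolic’s actual schema or API.

    ```python
    # Minimal sketch of symbolic ingestion (cf. Isomer-Symbolic above).
    # NodeEvent fields are illustrative assumptions, not Isomer's schema.
    from dataclasses import dataclass
    from typing import Dict, List

    import mido  # third-party MIDI library, used here only for illustration


    @dataclass
    class NodeEvent:
        track: int     # source stream (one stream per MIDI track)
        tick: int      # absolute time in MIDI ticks
        note: int      # MIDI note number
        velocity: int  # MIDI velocity


    def ingest(path: str) -> Dict[int, List[NodeEvent]]:
        """Return one stream of raw, uninterpreted NodeEvents per track."""
        midi = mido.MidiFile(path)
        streams: Dict[int, List[NodeEvent]] = {}
        for i, track in enumerate(midi.tracks):
            tick = 0
            events: List[NodeEvent] = []
            for msg in track:
                tick += msg.time  # message times are deltas in ticks
                if msg.type == "note_on" and msg.velocity > 0:
                    events.append(NodeEvent(i, tick, msg.note, msg.velocity))
            streams[i] = events
        return streams
    ```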

    Model Processing

    Similarity
    To maximize usability of the representations, the system must be capable of flexibly comparing their traits to determine potential influence and overall relatedness. Isomer-Similarity performs this task by matching range limits or geometric contours of model data segments for any combination of vectors (see the contour-comparison sketch after this list).
    Classify
    Isomer-Classify applies machine learning classifiers to expose relationships between groups of specific model features and trends found in the larger corpus or externally sourced data (such as human-generated meta-descriptions).
    Transform
    Isomer-Transform combines machine-learned feature trends with an extensive library of transform algorithms. Transformations can be applied across parameters and across any number of models, giving Isomer the ability to generate original musical material.
    Render
    Isomer-Render is responsible for orchestrating Isomer-Transform output and rendering performances. Today, final results are auditioned by humans. In future updates, Isomer-Render will employ a series of fitness functions to allow users to establish targeted output.
    Keyword
    Isomer-Keyword is a standalone module designed to collect, catalog, and data-mine human-generated descriptors. Its purpose is to provide a framework for researching the associative connections between the keywords themselves. From there, the content representation produced by Isomer can be analyzed to discover points of contact, where musical features closely associate with keyword descriptors.
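
    To make the contour-matching idea in Isomer-Similarity more concrete, here is a minimal sketch of one way two feature segments could be compared by shape rather than by absolute values. The z-scoring and correlation shown are an assumed approach for illustration, not the module’s actual algorithm.

    ```python
    # Hypothetical contour comparison: match the shape of two equal-length segments.
    import numpy as np


    def contour_similarity(a: np.ndarray, b: np.ndarray) -> float:
        """Pearson correlation of two feature contours (1.0 = identical shape)."""
        a = (a - a.mean()) / (a.std() + 1e-9)
        b = (b - b.mean()) / (b.std() + 1e-9)
        return float(np.mean(a * b))


    melody_1 = np.array([60, 62, 64, 65, 67], dtype=float)  # rising pitch contour
    melody_2 = np.array([48, 50, 52, 53, 55], dtype=float)  # same shape, lower register
    print(contour_similarity(melody_1, melody_2))            # ~1.0
    ```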

     

    This modular architecture makes Isomer ideal for a wide range of applications. Modules can be connected in myriad ways to create many different workflows.
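
    As one example, a model-based generation workflow might chain the modules roughly as sketched below. The stand-in functions only pass labels along, since the actual module interfaces are not described here; the ordering follows the module descriptions above.

    ```python
    # Hypothetical module chaining for model-based generation (stand-in functions).
    def symbolic_ingest(path):   return f"streams({path})"
    def model_build(streams):    return f"model({streams})"
    def segment(model):          return f"segments({model})"
    def query(model, segments):  return f"selection({segments})"
    def transform(selection):    return f"material({selection})"
    def render(material):        print(f"auditioning {material}")

    streams   = symbolic_ingest("input.mid")  # Isomer-Symbolic (or Isomer-Audio)
    model     = model_build(streams)          # Isomer-Model
    segments  = segment(model)                # Isomer-Segment
    selection = query(model, segments)        # Isomer-Query
    material  = transform(selection)          # Isomer-Transform
    render(material)                          # Isomer-Render
    ```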

    Model-Based Music Generation

    Isomer Music Generation

    Trained Description Tagging

    Isomer Tagging Process Flow

  • Representing Musical Models (the Isomer way)

    Representing Musical Models (the Isomer way)

    Isomer is a suite of software that produces an abstracted representation of the characteristic trends present in musical language through the analysis of symbolic or audio input. Isomer uses this abstracted representation to execute query and transformation algorithms, with the ultimate goal of generating new/hybrid materials.

    Isomer assembles a unified model from raw analysis data (symbolic and/or audio) and populates a single (or multiple) database(s) with observed and interpreted feature data representations. Multiple model abstractions can exist simultaneously in discrete databases, a key feature of Isomer that encourages the emergence of a desired representation schema for the musical model. Model data is separated into four musical components: melody, harmony, rhythm, and timbre, depending on source availability.

    Data Representation Overview

    Isomer provides normalized storage of feature data in a Cartesian coordinate (grid) format. This allows input from various sources to combine in a single model representation suitable for further analysis and abstraction.
    Isomer-Model Data Representation

    Cartesian Grid Format

    Horizontal Axis

    The horizontal (X) axis represents time with sampled data appearing at a regular, user-defined time interval (default = 2 ms). Each vertical column of data is called a slice.

    Why is 2 ms the default slice window?

    2 ms is the smallest time interval at which humans can detect multiple transients, and it allows complete capture of MIDI files with resolutions of 30 bpm @ 960 ppq (or 60 bpm @ 480 ppq).
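
    A quick back-of-the-envelope check of that claim, using standard MIDI timing (nothing Isomer-specific):

    ```python
    # One MIDI tick lasts 60,000 ms / (bpm * ppq).
    def tick_ms(bpm: float, ppq: int) -> float:
        return 60_000.0 / (bpm * ppq)

    print(tick_ms(30, 960))  # ~2.083 ms per tick
    print(tick_ms(60, 480))  # ~2.083 ms per tick -> no two ticks share one 2 ms slice
    ```
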
    Isomer-Model Cartesian Grid

    Vertical Axis

    The vertical (Y) axis contains individual streams (type is dependent on input). Each stream forms a horizontal row that is further divided into four sub-stream categories: melody, rhythm, timbre and harmony.
    Isomer-Model Sub-Streams
    Each grid point contains multidimensional data for each of the stream sub-categories in time at a resolution defined by the user.

    Observed and Extrapolated Data Fields

    The source of input data (audio or symbolic) is determined at the time of model creation. Once the model is generated, the original format (audio or symbolic) of the raw data is no longer relevant, although input format can limit the parameters available (e.g. symbolic input contains no texture data).

    Regardless of the input format, Isomer attempts to collect data without interpretation using the most accurate methods of analysis available. These data fields are treated as empirical observations. However, during model creation Isomer also calculates additional parameters to better prepare the data for use in a musical context. These are extrapolated data fields.

    The grid format described above requires two coordinates: stream/sub-stream (Y) and a series of analysis windows (X). Each point in the grid contains data for the following parameters:

    * indicates an extrapolated data field

    • Melody
      • Pitch (hertz, midi)
      • Pitch IR *
      • IR Equity *
      • Proximity Equity *
      • Registral Return *
    • Rhythm
      • Band (describes Bark Scale freq range where onset is detected)
      • Intensity (percentage, velocity)
      • Onset Equity *
    • Timbre
      • RMS
      • Spectral Stability
      • Spectral Flatness
      • MFCC (13)
    • Harmony
      • Pitches (hertz, midi, intensity)
      • Average Chroma
      • Total Pitches *
      • Tension *
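
    One way to picture a single grid point in code is the sketch below. The field names mirror the list above; the types, defaults, and container layout are illustrative assumptions.

    ```python
    # Hypothetical layout of one grid point (types and defaults are assumptions).
    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple


    @dataclass
    class Melody:
        pitch_hz: Optional[float] = None
        pitch_midi: Optional[int] = None
        pitch_ir: Optional[float] = None          # * extrapolated
        ir_equity: Optional[float] = None         # * extrapolated
        proximity_equity: Optional[float] = None  # * extrapolated
        registral_return: Optional[bool] = None   # * extrapolated


    @dataclass
    class Rhythm:
        band: Optional[int] = None            # Bark-scale band of the detected onset
        intensity: Optional[float] = None     # percentage / velocity
        onset_equity: Optional[float] = None  # * extrapolated


    @dataclass
    class Timbre:
        rms: Optional[float] = None
        spectral_stability: Optional[float] = None
        spectral_flatness: Optional[float] = None
        mfcc: List[float] = field(default_factory=list)  # 13 coefficients


    @dataclass
    class Harmony:
        pitches: List[Tuple[float, int, float]] = field(default_factory=list)  # (hz, midi, intensity)
        average_chroma: Optional[float] = None
        total_pitches: Optional[int] = None   # * extrapolated
        tension: Optional[float] = None       # * extrapolated


    @dataclass
    class GridPoint:
        slice_index: int  # X axis: which analysis window
        stream: str       # Y axis: source stream
        melody: Melody = field(default_factory=Melody)
        rhythm: Rhythm = field(default_factory=Rhythm)
        timbre: Timbre = field(default_factory=Timbre)
        harmony: Harmony = field(default_factory=Harmony)
    ```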

    Multi-Event Query Parameters

    Querying musical data in Isomer allows users to mine the models for musically relevant trends. The user supplies coordinates as time and stream/sub-stream ranges. This input is translated into the matrix coordinates (as defined by the model) before the query is executed.

    The interface returns the results as time points (X-axis) and streams/sub-streams (Y-axis) for a given model and allows on-demand calculation of any/all of the following data for any grid coordinates or range of coordinates:

    • Melody
      • Registral Direction
      • Registral Return
      • Proximity Ratio
      • Interval Range
      • Melodic Accent
    • Rhythm
      • Temporal Direction
      • Temporal Instability
      • IOI Range
      • NPVI IOI
      • NPVI Duration
      • Agogic Accent
      • Syncopation
    • Timbre
      • Texture Fingerprint
      • Dynamic (Timbre) Accent
    • Harmony
      • Tension Direction
      • Tension Instability
      • Average Tension

    Increasing Understanding Through Multiple Perspectives

    Isomer extrapolates parameters in addition to the observed data only after window-based quantization takes place. The interpretation that necessarily occurs during quantization can create desirable variations in the resulting model representation. By generating several models from the same source input with varying window sizes, Isomer can create multiple representations of a single input source.

    The power of implementing a series of interpreted abstractions is that the raw data can be represented multiple times at varying depths from the original (raw) observations regardless of the model input format, giving the user a variety of ways to view and examine the model. For symbolic input, this provides a useful way to control the level of detail captured in the model. In the case of audio input, the user can find the most appropriate representation for a specific input file, addressing a problem inherent in the musical analysis of raw audio.

    For example, determining the location of event onsets, “beats”, or perceptual pulse points may require a high-resolution view of feature data, while the detection of more expansive musical elements like harmonic rhythm may benefit significantly from a more general reduction of the raw DSP data.
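
    A sketch of what that multiple-perspective setup could look like in practice; the build_model stand-in and the database naming are assumptions for illustration.

    ```python
    # Hypothetical multiple-perspective model creation: one discrete database per window size.
    def build_model(source: str, window_ms: int, database: str) -> None:
        print(f"analyzing {source} at {window_ms} ms -> {database}")  # stand-in for Isomer-Model


    for window_ms in (2, 10, 50, 250):  # fine onset detail ... broad harmonic rhythm
        build_model("example_input.wav", window_ms, f"example_model_{window_ms}ms")
    ```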

    In conjunction with Isomer’s query function, multiple versions of the abstraction layer can exist simultaneously in discrete databases. This design is a key feature of Isomer that encourages the emergence of a desired representation schema, as dictated by the specific task the user has in mind.

  • The Pleasures of Music

    The Pleasures of Music

    Much of what currently passes as interdisciplinary work in computational creativity is profoundly lacking in artistic experience and understanding. This is a pervasive and critical flaw — after all, the creative side of the equation is our primary goal, is it not?

    To help us consider a more appropriate balance, let’s briefly examine how human composers and analysts approach their work.

    Music as Data

    Our first challenge is that musicians don’t generally think of raw musical materials as data. Immediately we find a conflict between the computational and creative — how can we compute without data? The answer lies with how we collect, represent, and ultimately select musical elements for processing. The key issue we need to consider is context.

    Interpreting the musical context around individual events is what allows us to develop a meaningful picture of how groups of musical events function. Understanding these musical effects requires computational systems to make aesthetic judgments based on experience (or training in this case). I’m developing my own approaches to this set of challenges, but suffice it to say…

    Any composer that deals with musical materials as anything but a means to achieve musical effect is not likely to produce results worth exploring or repeating.

    A Composer’s View

    Composers seek combinations of musical patterns that take on an irrepressible life of their own. As someone with decades of experience attempting this, I can say that it’s an intimate and perilous journey with few external guiding references. Every step forward simultaneously removes options and presents new opportunities, and when a combination of musical events expressing strong potential finally presents itself, there are no guarantees.

    Unfortunately, this first-hand description of the compositional process isn’t very helpful when designing creative algorithms. To do that, we need ways of defining and comparing the effect of musical ideas to learn what sinks and what floats. In recent years my work as composer, pianist, and researcher has focused on developing a computation-friendly approach to this problem, and I believe reasonable solutions aren’t that far off.

    What About the Analyst?

    On the other hand, analysis (whether computational or theoretical) encourages the deconstruction and justification of existing musical ideas by developing a highly-relational network of connections between events with clearly definable attributes. Strange as it may seem, some of these relationships may be carefully planned and executed by the composer, but many are not.

    Ultimately, analysis aims to reveal a tightly woven fabric of methodically cataloged connections, regardless of the composer’s design or intent, and without the natural aesthetic judgments mandated by the compositional process.

    Analysis is a necessary component of model-based computational composition. But for analysis to become directly relevant to the compositional process, it must be capable of addressing a wide range of emotional potential to provide a meaningful musical experience. In other words, it must evolve to encompass the pleasures that music provides to us.

    Pleasures of Music

    What exactly do I mean by the “pleasures” of music?

    Well, I don’t mean the types of emotional associations that attach themselves to musical textures, forms, and ideas through cultural conditioning and repetition. No. I’m speaking of an aurally perceptible, inner geometry that attaches itself to our collective nervous system and forces us to repeat and examine it for generations upon generations.

    While it may be difficult to codify them, these designs are recognizable and valued because we are human. We find a pleasure in their architecture that universally speaks to us, regardless of time and cultural distance. And they did not come to exist in a vacuum — they have been under constant development for thousands of years through the myriad generations that precede us. Lucky for us, these designs can be found throughout the musical canon.

    To create music worthy of sustained exploration and repetition, computational creativity must seek out the patterns and trends that elucidate these universal pleasures. This process begins with attempting to understand fundamental musical aspects like memory, identity, expectation, and surprise, and with developing a data-driven understanding of the expressive qualities of musical character.

  • Automating Musical Descriptions: A Case Study

    Automating Musical Descriptions: A Case Study

    It’s widely accepted that music elicits similar emotional responses from culturally connected groups of human listeners. Less clear is how various aspects of musical language contribute to these effects.

    Funded by a Mellon Foundation Research Grant through Dickinson College Digital Humanities, our research into this question leverages cognition-based machine listening algorithms and network analysis of musical descriptors to identify the connections between musical affect and language.
