| A brief statement of the problem(s) on which the group worked
Many user interfaces in computer audio and music production systems impede
creativity by slowing the ability to capture inspiration, leading to customer
dissatisfaction. For example, these interfaces are:
- Overly technical, intimidating users who do not have experience
in traditional recording methods.
- Inflexible, frustrating users whose workflow is different
from the rigid one designed in the product.
- Difficult to navigate, making it hard to balance screen space
and locate necessary functions.
- Difficult to configure, making it hard to access features
and add or remove system components.
These problems promote customer disaffection, increase frustration, lower
productivity, destroy inspiration, and reduce customer loyalty as users
move to other products (i.e., don’t upgrade).
A brief statement of the group’s solutions to those problems
- As part of product planning, identify personas and potential workflows
for targeted users.
- Applications should deliver an adaptive UI environment based on the
user’s needs and background. This can be done in various ways
such as:
- Querying users when the application is first run to identify
their degree and type of experience.
- Querying users as they start a new project to determine the likely
workflow.
- Previewing the environment and guiding users through the workflow,
allowing them to refine the environment as they go.
- Deemphasizing elements of the workspace that are seldom used
or not applicable (at that time).
- Explore other metaphors for visualizing and interacting with audio
data, such as using frequency displays in conjunction with waveform
displays.
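The idea of deemphasizing seldom-used workspace elements can be sketched as a usage counter. This is a minimal illustration, not a description of any shipping product: the element names, the `deemphasized` function, and the 5% threshold are all assumptions made for the example.

```python
from collections import Counter


def deemphasized(usage, all_elements, min_share=0.05):
    """Return the workspace elements used less often than min_share.

    usage        -- Counter mapping element name to number of uses
    all_elements -- every element the workspace can show (hypothetical names)
    min_share    -- assumed threshold: fraction of all recorded uses
    """
    total = sum(usage.values())
    if total == 0:
        # No history yet (e.g., first run): deemphasize nothing.
        return set()
    # A Counter returns 0 for elements that were never used.
    return {name for name in all_elements if usage[name] / total < min_share}


usage = Counter({"record": 60, "mixer": 38, "tempo_map": 2})
elements = ["record", "mixer", "tempo_map", "surround_panner"]
print(sorted(deemphasized(usage, elements)))
```

A real application would presumably also decay old counts and take the current task into account ("not applicable at that time"), which this sketch does not attempt.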
The action item list
- Flesh out the personas and workflows: Pat Azzarello and Philip Merrill
[completed]
- Survey alternative interfaces, such as ones used in children’s
software: David Battino [ongoing]
- Survey interfaces used by DAWs and plug-ins: Ron Kuper [ongoing]
- Explain how Yamaha is integrating software and hardware through Studio
Connections: Kevin MacManus [ongoing]
- Solicit input from top producers and engineers and publish results:
Philip Merrill/David Battino [ongoing; see also The Art of Digital
Music]
- Contribute and refine additional personas: Todd Hager [ongoing]
Expanded problem statement
- Most GUIs replicate the tools used for music and sound manipulation
instead of representing the processes.
- People approach song-making many different ways, and we need to help
them regardless of how they joined the game. The current model is either
keyboardist- or recording engineer-focused, and often that isn’t
the customer’s background.
- The current model doesn’t look like it’s making music.
Songs don’t look like songs; there is no obvious visual (static)
depiction of which way time flows.
- The most complicated issue for most users is managing the connections
between hardware and software (as well as between multiple software
packages).
- Hardware and software aren’t automatically and seamlessly integrated.
- Reviewers slam innovative interfaces for being nonstandard.
- None of today’s software products support all the different
ways that people want to create music or audio.
- Because of the “democratization of recording,” musicians
have to be their own engineers. The technical left-brained stuff isn’t
separated from the creative right-brained stuff in the UI.
- The UI is not as compartmentalized as it should be, so users can’t
focus on the task at hand.
- Eye candy sells products, but too much candy is bad for you. (Often
what looks good isn’t necessarily the most intuitive or usable
UI.)
Expanded solution description
The scope of the solution encompasses:
- Modular workspaces that change depending on context or user
experience and background.
- “Adaptive workflow” — a way of following
the individual work habits and personality of the user, step by step.
- Defined needs for hardware integration.
- Exploring alternatives to waveform displays and timeline.
1. Modular Workspaces
The group’s discussion about developing better user interfaces
for audio software began with the simple question, “Why do so many
music programs look like spreadsheets?” The goal of music software
is to facilitate and record creativity, not perform calculations, so if
a user wants a multitrack project to look like something else, he should
be able to do that. (Similarly, if a user does want to use popular
spreadsheet programs to manage tracks, why not enable that as well?) We
also noted the challenge of managing 70-plus tracks stacked like a downtown
office building.
[Figure: Why do so many music programs look like spreadsheets? Doesn’t
that unnecessarily “box in” the user? To demonstrate the
resemblance, we simulated a digital audio workstation in Microsoft Excel.]
At the other extreme, making user interfaces look like physical objects
(mixing consoles, effects processors, analog synthesizers, etc.) helps
a segment of the population but intimidates the rest. We shouldn’t
throw out those paradigms if they still work (e.g., transport controls),
but we must balance intuitive appearance with usability. For example,
it’s intuitive that moving a fader up will increase the amount of
something, but faders consume a lot of screen space and, unless grouped,
can be addressed only one at a time with a mouse. Moreover, there are
many times musicians will want to make sweeping, gestural changes to the
music rather than surgical ones.
Configuring the Modular Workspace
We feel that a successful user interface for audio software should connect
a segmentable demographic with a workflow that can be represented using
project work-modules like virtual Lego pieces. The appearance and use
of the display should support the needs of people who purchased the product
specifically for a certain project.
We considered various means for customizing the UI. For example, the
program could present a split screen, and the user would select instruments
on the left side, building a personal avatar graphic on the right. The
user could confirm that this graphic pictured his personal situation (segment).
For configuring gear, the user (or program) could fill an onscreen rack
incrementally, or the program could configure a workflow, with the user
confirming that the details pictured matched his project needs. If standardized,
this data could be managed and shared via XML.
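As a rough illustration of sharing such data via XML, the sketch below round-trips a user profile through Python's standard `xml.etree.ElementTree` module. No schema was standardized by the group; the element names (`userProfile`, `item`) and the profile keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def profile_to_xml(profile):
    """Serialize a user/project profile (a dict) to an XML string.

    Keys must be valid XML element names. List values become a group
    of <item> children; everything else is stored as text.
    """
    root = ET.Element("userProfile")  # hypothetical root element name
    for key, value in profile.items():
        if isinstance(value, (list, tuple)):
            group = ET.SubElement(root, key)
            for entry in value:
                ET.SubElement(group, "item").text = str(entry)
        else:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def profile_from_xml(xml_text):
    """Rebuild the profile dict. All values come back as strings."""
    profile = {}
    for child in ET.fromstring(xml_text):
        items = child.findall("item")
        if items:
            profile[child.tag] = [item.text for item in items]
        else:
            profile[child.tag] = child.text
    return profile


profile = {"segment": "songwriter", "instruments": ["guitar", "voice"]}
xml_text = profile_to_xml(profile)
print(xml_text)
print(profile_from_xml(xml_text) == profile)
```

The sketch ignores empty lists and non-string types, which a real interchange format would need to define.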
To construct an appropriate production environment, the DAW would survey
the user at the first startup and customize itself based on the response.
Questions might include:
- What instrument do you play?
- Do you have a DJ background?
In other words, who are you, and what do you want to do?
A person who reads music, but has no traditional recording experience,
would likely be a good candidate to either:
- Enter their music in notation and have it rendered with a software
instrument
- Record their instrument, utilizing tempo and bar lines, etc.
A DJ with little or no musical training would be unlikely to use the
same methodology.
Here is a simplified survey example:
[Figure: Survey Flowchart]
User Setup Questionnaire
- We recommend that all complex music software include some version
of this questionnaire and make use of the information to tailor the
UI. Different apps would have different surveys; for example, Logic
Pro would be able to make some assumptions about the user’s skill
level. The layout could resemble an avatar builder—as you answer
the survey your avatar fills in and you can see how it matches up. Potential
questions:
- Do you play an instrument? [Yes/No]
- Can you read music? [Yes/A little/No]
- Have you ever recorded anything? [Yes, on a computer; Yes, on a tape
deck; Yes, in a recording studio; No]
- What kinds of gear will you be using? [Keyboards; Guitars; Effects;
Mixing Surface; Turntable; Microphone]
- Are you a professional or an amateur? (Skilled or unskilled? Complicated
or simple? Familiar with traditional production paradigm or not? Clued
or clueless?)
- How old are you? [Child; Young Adult; Adult; Mature] (This could
be qualified by humorous questions such as, What band did Paul McCartney
play in? [Wings; Beatles; Michael Jackson; Who is Paul McCartney?])
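One way to picture how survey answers could tailor the UI is a small rule table that maps responses to workspace modules. The sketch below is an assumption-laden illustration: the field names, the module names, and the rules themselves are invented for the example, loosely following the cases described above (a music reader with no recording experience versus a DJ).

```python
from dataclasses import dataclass, field


@dataclass
class SurveyAnswers:
    plays_instrument: bool
    reads_music: str           # "yes", "a little", or "no"
    recording_experience: str  # "computer", "tape", "studio", or "none"
    gear: list = field(default_factory=list)


def choose_workspace(answers):
    """Pick workspace modules (hypothetical names) from survey answers."""
    modules = ["transport"]  # transport controls are kept for everyone
    if "turntable" in answers.gear:
        modules.append("loop_browser")
    if answers.reads_music == "yes":
        modules.append("notation_view")
    if answers.plays_instrument:
        modules.append("audio_recorder")
    if answers.recording_experience in ("computer", "studio"):
        modules.append("mixer")
    return modules


reader = SurveyAnswers(True, "yes", "none", ["keyboards"])
dj = SurveyAnswers(False, "no", "none", ["turntable"])
print(choose_workspace(reader))
print(choose_workspace(dj))
```

Keeping the rules in one function makes it easy for each application to substitute its own survey, as recommended above.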
User Personas: from Granny to Grammy
During and after the Project Bar-B-Q conference, we developed personas
characterizing a variety of users. The simplest extreme is Grandma seated
at a piano wanting to record herself playing and singing a song for her
grandchildren. With a tape recorder, it would be easy for her to press
Record, and then perform the music and mail the cassette tape to her grandkids.
It should be just that easy using today’s digital tools.
One member mentioned meeting a 50-something woman who had purchased an
expensive and powerful laptop and DAW software, hoping to record her song
ideas. Even with great equipment costing more than $3,000, she was unable
to do what grandma had been able to do so easily back in the era of tape.
It is unwise to abandon dissatisfied customers, leaving them in the position
of believing money they spent was a disappointing waste. As mentioned
above, that is a problem our approach would remedy.
Personas 1: Simple Recording/Performing Scenarios
Name: Grandma
Summary: Wants to record herself playing piano and singing
“Happy Birthday” to her granddaughter.
Description: The simple analog example of recording voice and piano
live to tape requires mentally substituting a device that does the same
thing digitally. Here is a potential production scenario:
[Figure: Simple Recording Scenario]
Name: Choir Director
Summary: Wants to record church choir and distribute recording
to congregation.
Description: This is a directly practical example including placing
mics for straight-to-device recording, although many worship institutions
have dedicated sound systems with central sound boards.
Name: Songwriter
Summary: Plays acoustic guitar, wants drum and bass accompaniment.
Description: Many a guitar-playing, songwriting singer has needed
rudimentary backing both for practice and performance; these can also
be accompaniment in multitrack audio for demos and listen-back sketches.
Name: Cover-Band Guy
Summary: Wants to learn the organ solo in “Smokin’”
by Boston.
Description: Like lessons in a language lab, many a cover-band
guy needs to listen to a solo over and over again, ideally slowing down
the tempo without changing the pitch. Recording oneself and listening
back to judge accuracy and style are helpful.
Name: MIDI Gear Junkie
Summary: Wants to play some basic tracks (bones) to jam with.
Description: From a compositional point of view, being able to
rapidly combine ideas as they are generated is ideal for creating the
“bones” of a piece, ready to be arranged and through-composed
later. This allows personal jamming, and because MIDI gear offers a variety
of sounds, can easily result in the accumulation of keyboards and rack-mount
modules.
Personas 2: Amateur/Consumer Recording Scenarios
Another source of Personas is music and recording newbies — certainly
a significant slice of the consumer market. Although many customers will
be unsophisticated about things like music notation or how to play an
instrument, others will enjoy more advanced knowledge and experience and
want UI tools to support their skill level.
Name: Britney Wannabe
Summary: Mall booth karaoke
Description: Many teens go to the mall to have glamour photos or
portrait photos taken, so it’s easy to imagine an audio version
as karaoke-to-CD, or even DVD with some video. A teen singer would want
song selection, key adjustment for vocal range, vocal enhancement, and
possibly visible music notation tracking the lyrics.
Name: Teenage Bedroom Hobbyist
Summary: Kid with a guitar writing a song for his girlfriend.
Description: Unlike a songwriter who has worked to assemble a personal
configuration of gear, a casual Romeo might delight in lying on his bed
singing a one-time romantic ballad to someone the song is about...and
then hope she likes it.
Name: College Kid
Summary: Selects a group of tracks to play together to create
a mix CD.
Description: Popularized by High Fidelity, the mix tape
or CD takes playlist-making to a high level in the art of self-expression,
especially when the playlist is specifically created for a known individual.
Name: Weekend Warrior
Summary: Plays electric guitar, owns a 4-track, wants to make
his band demo.
Description: What Grandma could have done easily with a tape recorder,
the Weekend Warrior could once have done easily with a Portastudio or
similar 4-track multitrack tape recorder that allows bouncing and simple
mixing. Basically, Weekend Warrior thinks his band sounds great and just
wants to be able to burn CDs and e-mail MP3s of it.
Name: Forty-something Guy with Disposable Income
Summary: Used to play in a band back in the day, now jams with
his buddies; wants to impress friends and kids
Description: This is distinct from Weekend Warrior because Forty-somethings
like this are very well-known and desirable customers of gear manufacturers,
since they can afford really good equipment and buy more regularly. This
emphasis on quality extends to the finished product they want if they
are multitrack recording. It may still be friends, family, and CD Baby,
but the finished quality should be enough to impress the friends of a
teenage daughter — the project output should be mall-worthy.
Name: Postal Service
Summary: Two guys collaborating by mailing CDs back and forth.
Description: This is actually the name of a real band. According
to AllMusic.com, Dntel’s Jimmy Tamborello worked with Death Cab
for Cutie’s Ben Gibbard by snail mail “with Tamborello sending
electronic pieces and Gibbard adding guitars, vocals and lyrics.”
The team was forced to promote the United States Postal Service in order
to keep their name.
Name: Elementary School Kid
Summary: Edutainment, creativity, experiencing interactive
music.
Description: This is where the first reference to Fisher-Price
came up, later extended to the idea that advanced users might want to
be able to create their own toys to work with. Both in the educational
music market as well as just for general fun with music-making, all sorts
of equipment exists to encourage interactivity with music. There are video
games now driven by banging on a drum.
Name: Dance or Gym Teacher
Summary: Wants interactive control over a playlist.
Description: An exercise instructor wants a dynamic playlist of
tracks to energize a class. S/he might want to be able to select a track
easily with trick modes. A dance instructor might also want to use loops,
since learning a routine often involves repeating a certain section of
choreography.
Personas 3: Pro/Prosumer Recording Scenarios
Name: Interapp Collaborator
Summary: A project studio user running Cubase takes
raw tracks into a pro studio to overdub and mix.
Description: At a certain advanced level, it becomes essential
to be able to maintain the integrity of a file that represents the master
output of a project. Depending on circumstances, such as interoperability
between two particular products, this is generally either very difficult
or very easy.
Name: Power-User Mix Engineer
Summary: Wants efficient editing and comping of recorded tracks,
primarily using a QWERTY keyboard.
Description: Speed is of the essence for a certain kind of high-powered
mix engineer. These professionals are highly practiced with their music
production configuration and often resort to numerous time-saving ways
of working, including using keyboard shortcuts to navigate the commands
offered to a software package’s user.
Name: Small Project Studio
Summary: A semi-pro who records demos and small bands for hire.
Description: Multipurpose multitracking studios can be reasonably
small and take on a wide range of projects for bands, singers, etc. These
end up being a low-end pro version of the Grandma persona. Surround sound
is probably a must now.
Name: Soundtrack Composer
Summary: Composer wants to sync to video.
Description: The composer receives a video from the director and
wants to be able to play along with the video and record his instrument.
Later, he will adjust tempo, bar lines, etc., render it in notation, copy
parts, and distribute to an orchestra for final recording.
In some cases the Soundtrack Composer will utilize a cue sheet, or DVD/VCR
to write the music (using pad and paper), set up a project that reflects
this handwritten roadmap in the DAW, and then record parts along with
picture, mix it, and deliver it to the director.
Name: Video Post
Summary: Wants to produce sound effects, foley, musical cues,
ADR, dimensional effects, surround sound, and run a networked, multi-user
studio.
Description: Like the Soundtrack Composer, the Video Post persona
will receive a video (with or without audio). Video post houses generally
do not create music (i.e., record individual instruments or parts), but
rather drop audio events into a timeline that is synced with the picture.
These events may be sound effects, sound design, or music cues created
by the Soundtrack Composer.
Name: The Game Composer
Summary: Wants to create video game audio and music.
Description: See 2005 Project Bar-B-Q report New
Approaches for Developing Interactive Audio Production Systems.
Name: Frank Filipetti recording James Taylor on Martha’s
Vineyard
Summary: High-end location recording.
Description: Frank has a lot of resources at his disposal. He’s
looking for a high quality system in limited space. In the analog days,
Frank would have to bring a large tape deck with him, along with all of
his other recording equipment (compressors, gates, microphones, mixing
console). In this case, Frank is likely to replace the tape deck with
a computer, though perhaps not the other gear. He wants it to be portable,
unobtrusive, reliable, and very likely quiet.
Unlike many of the previous personas, Frank requires the ability to record
many tracks simultaneously.
Name: Jimmy Jam Comping on the Plane
Summary: Takes Janet Jackson tracks on his notebook and assembles
them on the plane.
Description: Jimmy has already recorded tracks in his studio. He
gets on an airplane and auditions versions of individual tracks, identifying
the “best” performance, and adding them to his “final”
project. Since he isn’t recording anything from the analog domain
on the flight, he only needs to play back the audio, and in small spurts.
Most of his time is going to be spent listening to a limited number of
playback channels, though he will likely want to validate his edits against
the complete track (many playback channels).
Name: Grammy-Winning Producer
Summary: Wants to record eight people or an orchestra in a
room.
Description: Usually utilizing a recording studio, the producer
insulates himself from the actual recording process. His requirements
are that the performance be captured faithfully and without compromise.
He doesn’t care about technology as much as he cares about the musicianship,
and gets very upset when the engineer asks the performer to sing
again because there was an artifact introduced through the recording chain
(though he generally has patience during the same session with the rap
artists’ chains clinking in the background :-)). He may require one
audio capture channel, multiple channels, and even video sync.
Name: The Personalizer
Summary: Wants to make distribution-related tweaks such as
callouts and station/customer customization.
Description: A maker of children’s videos wants to personalize
audio so that its customers can buy the songs with their child’s
name included. (“Are you sleeping, brother Peter.”) The Personalizer
generally remixes the original content (or reassembles it from subgroups/stems),
inserting only the content that is absolutely necessary (name, station
call letters, etc.).
Name: Mastering Engineer
Summary: Takes mixed tracks and produces a production-quality
master.
Description: The mastering engineer’s key requirement is
high quality, and his main goal is to stabilize the audio and solve problems.
In the end, whether he plays back the mastered recording in real time
to an analog deck or renders it within the DAW, he wants what he hears
as he edits to be what he hears when he plays it back. He also has invested
a huge amount of capital in processing (plug-ins and outboard), which
often allow much greater flexibility than usually utilized during a mix.
Name: A&R Guy
Summary: Wants to supervise the workflow to get demos in the
proper format for submission.
Description: The A&R guy is closely related to the DJ in workflow.
He doesn’t add instruments, effects, or vocals. He doesn’t
mix from the constituent parts. He assembles the various songs in order,
sometimes modifying them at a high level (fadeouts, maybe cutting out
sections), and adjusts their volume. Eventually he’ll burn a CD
with the material and wants to have track indices, etc.
Personas 4: DJ-Related Recording Scenarios
Name: Open-Mic Performer
Summary: Wants backing tracks.
Description: A singer transfers existing songs from CD and removes
the vocal part with special processing. She then adjusts the key to match
her vocal range. She burns the result onto a CD so that she can take it
to her gig.
Name: Remixer
Summary: Starts with an existing song and reworks it for the
dancefloor or simply an alternate sound.
Description: Rips the song from CD and processes it with EQ. He
rearranges the original track using copy and paste. He adds a few drum
loops to give it a more pounding beat. He also records a few keyboard
lines using an external synthesizer to improve the flow of the track.
Name: Masherupper
Summary: Takes two or more songs and combines them, maybe adding
loops and other new parts.
Description: Rearranges the original tracks using copy, paste,
time-stretching, and pitch-shifting. He may add a few drum loops to give
it a new rhythmic backbone. He may also add a few keyboard lines using
an external synthesizer to improve the flow of the track.
Name: Acid Looper
Summary: Assembles songs from sound libraries.
Description: The looper relies heavily on sound libraries and could
be considered a collage artist. He spends much of his time searching the
libraries and previewing material for possible inclusion. Much of his
work is trial, error, and refinement. When he finishes a track he renders
it to an audio file and posts it on the Internet.
Personas 5: Notation-Related Recording Scenarios
Name: The Orchestrator
Summary: Wants to work in notation and either print it out
or render it.
Description: For this user, accurate, legible display and printing
are likely the most important features.
Name: The Transcriber
Summary: Wants to compose a piece and get it down in sheet
music and/or tablature.
Description: This user is more concerned with speed and ease-of-use
than orchestral filigree. Lyric handling may be important, as may intelligent
transposition.
Personas 6: Education-Related Recording/Performance Scenarios
Name: Music Educator
Summary: Prepares lessons, uses sequencer to teach orchestration,
composition, and perhaps synthesis or pop music production.
Description: Multi-user support may be useful.
Name: Ivory Tower Academic
Summary: Wants to explore new frontiers of sound and computer-aided
composition.
Description: Thinks shrink-wrapped software is too mainstream and
limited; wants to use Csound, Max, or Reaktor to build music atomically.
Personas 7: Other Recording Scenarios
Name: Disabled User
Summary: Visually impaired, diminished fine motor control.
Description: Needs support within the DAW for alternative controllers
(flexible mapping), or uses operating system accessibility tools, such
as a zoned QWERTY keyboard that doesn’t require him to hit an individual
key, but an area of the keyboard.
Name: Auto-accompaniment Keyboard User
Summary: Wants to export their work into software to tweak.
Description: Starts a song by playing along with the keyboard’s
backing tracks, then edits the complete performance in a computer, adjusting
such aspects as phrasing, dynamics, notes, and tempo.
Name: Al Gore Rhythm Method
Summary: Uses tools such as KARMA for getting new ideas (on
one extreme) or for rapid soundtrack development.
Description: Similar to the Auto-accompaniment Keyboard User, but
relies on the software to make compositional suggestions, not just harmonic
enhancements.
2. Adaptive Workflow
The concept of adaptive workflow is to define the steps (and the transitions
between them) for creating music on a DAW, and then to guide the user
through those steps interactively.
Again, an initial survey could help set up the project. For example:
- Will you be recording new material (audio or MIDI) or using existing
material?
- Will you use the software in a live performance?
- Will you be recording original music or someone else’s?
At times, the adaptive workflow may shift to a tutorial, or even a creativity-stimulating
Oblique Strategy.
A newcomer might prefer easy visual UIs, whereas an advanced user should
be able to customize whatever is easiest for their project. The software
could build a template file for a specific user and project, containing
the user/project details. A beginner receiving such a file by e-mail would
be able to complete projects more easily.
[Figure: Workflow Diagram]
As the software guides the user through the steps, it should balance
right-brain tasks and left-brain tasks.
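The notion of defined steps and transitions can be sketched as a small state table. The step names and allowed transitions below are hypothetical and are not taken from the group's workflow diagram; the point is only that the software can always tell the user which steps are reachable next.

```python
# Hypothetical workflow: each step maps to the steps the user may go to next.
WORKFLOW = {
    "setup": ["record", "import"],
    "record": ["edit"],
    "import": ["edit"],
    "edit": ["mix", "record"],
    "mix": ["export", "edit"],
    "export": [],
}


def next_steps(current):
    """Return the steps the UI should offer from the current step."""
    return WORKFLOW[current]


def advance(current, choice):
    """Move to the chosen step, rejecting transitions the table does not allow."""
    if choice not in WORKFLOW[current]:
        raise ValueError(f"cannot go from {current!r} to {choice!r}")
    return choice


def initial_path(recording_new_material):
    """Use a project-survey answer to pick the first step after setup."""
    return ["setup", "record" if recording_new_material else "import"]


step = "setup"
for choice in ["record", "edit", "mix", "export"]:
    step = advance(step, choice)
print(step)
print(initial_path(recording_new_material=False))
```

A tutorial or other prompt could be attached to any transition without changing the table's structure.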
3. Hardware Integration
With modern music production, tasks bounce between software and hardware
constantly. To make that hybrid system work well, the group developed
these recommendations:
- Hardware discovery should be automatic.
- Configuration of newly discovered hardware should happen automatically
in applications.
- Configurations should be storable, recallable, and transportable.
- Hardware functionality needs to be scalable, e.g., it should be easy
to add a second control surface to a setup.
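The "storable, recallable, and transportable" recommendation can be illustrated with a plain-text configuration that survives a save/load round trip. This is a minimal sketch using JSON; the device fields (`id`, `type`, `faders`) and the `version` key are assumptions, and automatic discovery itself is outside the scope of the example.

```python
import json


def add_device(devices, device):
    """Return a new device list with `device` added, unless its id is already present."""
    if any(existing["id"] == device["id"] for existing in devices):
        return list(devices)
    return devices + [device]


def save_config(devices):
    """Serialize the setup to text that can be stored or sent to another machine."""
    return json.dumps({"version": 1, "devices": devices}, indent=2, sort_keys=True)


def load_config(text):
    """Recall a stored setup."""
    return json.loads(text)["devices"]


# Scaling the setup: adding a second control surface is one more entry.
setup = [{"id": "surface-1", "type": "control_surface", "faders": 8}]
setup = add_device(setup, {"id": "surface-2", "type": "control_surface", "faders": 8})
stored = save_config(setup)
print(load_config(stored) == setup)
```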
Some integration solutions have begun to emerge, including
4. Explore Alternatives to Timeline Displays
In addition to discussing ways to escape the “tyranny of the spreadsheet,”
we considered ways to render musical or control data. Here is a partial
list.
- Musical arranging “blocks”
- Tracks
- Piano roll
- Notation view
- Waveform
- Frequency
- Envelopes
- Text (event list, lyrics)
- Section (SMPTE cue sheet)
- Arrangement / Playlist
- Media Pool
Rationale for Current (Wave) Paradigm
- Gives a visual distinction between MIDI and audio tracks
- Makes it relatively easy to see the location of parts
- Users can “see” dynamic information (including issues
like clipping)
Problems with Current (Wave) Display Paradigm
- Conveys amplitude information only
- Loudness is logarithmic but waveforms are linear
- Waves are drawn filled in, but that seems to be a waste of space
- Meaningful locations in time aren’t always obvious: bar lines,
phrasing, when a specific note begins, where the verse or chorus begins,
etc.
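To make the linear-versus-logarithmic point concrete, the sketch below converts linear sample amplitudes to decibels relative to full scale (dBFS), a common logarithmic measure that tracks perceived level more closely than raw amplitude does. The function name and the -96 dB display floor are assumptions chosen for the example.

```python
import math


def amplitude_to_dbfs(amplitude, floor_db=-96.0):
    """Convert a linear amplitude (1.0 = full scale) to dBFS.

    Silence would be minus infinity, so values are clamped to floor_db
    (an assumed display floor).
    """
    magnitude = abs(amplitude)
    if magnitude == 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(magnitude))


# Halving the amplitude repeatedly moves the level down in equal dB steps,
# while the drawn waveform shrinks by ever-smaller absolute amounts.
for amplitude in (1.0, 0.5, 0.25, 0.125):
    print(amplitude, round(amplitude_to_dbfs(amplitude), 1))
```

On a linear waveform display, the bottom 60 dB of a signal occupies about a thousandth of the track height, which is one reason quiet passages look like flat lines.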
Other Ways to Represent Sound
- Frequency domain: 2-D FFT, 3-D FFT
- Wavelet transform
- Musical octaves
- Metadata, automatically determine “sounds like”
- Visualizations
- Cycling ’74 Radial (circular rather
than linear display)
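A frequency-domain view starts from a transform of the sample data. The sketch below computes magnitudes with a naive discrete Fourier transform rather than a true FFT (same result, much slower) so that it needs only the standard library; the function names and the test signal are assumptions for illustration.

```python
import cmath
import math


def sine_wave(cycles, length):
    """Generate `length` samples containing `cycles` full sine periods."""
    return [math.sin(2.0 * math.pi * cycles * t / length) for t in range(length)]


def dft_magnitudes(samples):
    """Return normalized magnitudes for bins 0..N/2 of a real signal.

    Naive O(N^2) DFT; a real display would use an FFT and a window function.
    """
    n = len(samples)
    if n == 0:
        return []
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total) / n)
    return magnitudes


mags = dft_magnitudes(sine_wave(cycles=4, length=32))
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print(peak_bin)
```

Stacking such magnitude lists for successive blocks of audio gives the 2-D (time versus frequency) picture mentioned in the list above.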
Way Out of the Box
- MadPlayer (“Death Star bombing run” interface)
- AudioPad (a gestural interface developed at the MIT Media Lab)
- StikAx (a remixing toy)
- Sony Block Jam (building blocks that make or shape music)
- Rendered 3-D graphics for workspace
- Virtual reality
- Using transparency for overlays
- 3-D rendered visualization (à la Creative Lava)
- Force feedback
- Animated avatars
- Stage/mix plot (use icons for each track, move instruments into position)
- Rhetorical question: Is it possible to build a DAW that has no menus
at all, where everything is direct interaction?
- The Minority Report UI
- Audio feedback, e.g., for scrubbing
- Using a spreadsheet as a DAW
Differentiators for Faders and Widgets in Common Use Today
- Fader
- Taper
- Fine/coarse control modifiers
- Numeric feedback
- Photorealism
- Orientation
- Size on screen
- Grouped manipulation
- Knob
- –inf to 1
- Rotary encoder, like a data wheel
- Incremental
- Mapping to other products’ layouts
- Button (on/off)
- Button (radio)
- Peak/VU Meter
- Alphanumeric display
- Numeric data entry
- Musical note data entry
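Of the differentiators above, taper is the easiest to show in code: it is the curve that maps a fader's physical position to gain. The sketch below uses a simple linear-in-decibels taper; the -60 dB to +6 dB range and the function name are assumptions, and commercial faders typically use more elaborate piecewise curves.

```python
def fader_to_gain(position, min_db=-60.0, max_db=6.0):
    """Map a fader position in [0, 1] to a linear gain multiplier.

    The bottom of the travel is treated as full mute (minus infinity dB);
    elsewhere the position is interpolated linearly in decibels.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    if position == 0.0:
        return 0.0
    level_db = min_db + position * (max_db - min_db)
    return 10.0 ** (level_db / 20.0)


for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(position, fader_to_gain(position))
```

With these assumed limits, unity gain sits at about 91% of the travel (60/66), which shows how the choice of range determines where the fader has its finest resolution.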
Items from the brainstorming lists that the group thought were worth reporting
- Personalizing the user interface: Peter Drescher suggested that software
and objects such as cell phones could have a customizable musical “personality.”
Other reference material
|