
“We as programmers can ‘compute’ sound”– An Interview with Perry Cook


“Physics-Based Sound Synthesis” has only been live for one week, but it has already become one of our most popular courses. I recently sat down with Perry Cook to talk about the class and sound synthesis as a whole.

Why/when did you first become interested in the subject matter of “Physics-Based Sound Synthesis”?

Oh my.  I guess in my undergrad engineering studies in the ’80s… Actually no, in my undergrad music studies—a full decade before that. I took an acoustics class when I was in Conservatory, and was really into all of the (relatively simple) equations—like calculating wavelengths from frequencies and estimating reverberation times from room dimensions and materials.  Just algebra, but I thought it was cool. Working as a sound engineer after that, I knew that the acoustical contractors we paid to come in and help us equalize our rooms and speaker systems had tools and equipment that could measure such things.
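For the curious, those calculations really are just algebra. Here is a quick sketch in Python (illustrative only, not course material), using the standard wavelength relation and Sabine’s reverberation-time formula:

```python
# Two of the "simple equations" from acoustics class (illustrative sketch).
SPEED_OF_SOUND = 343.0  # meters/second in air at roughly 20 degrees C

def wavelength(frequency_hz):
    """Wavelength = speed of sound / frequency."""
    return SPEED_OF_SOUND / frequency_hz

def sabine_rt60(volume_m3, absorption_m2):
    """Sabine's formula: RT60 is about 0.161 * V / A (A in square-meter sabins)."""
    return 0.161 * volume_m3 / absorption_m2

print(wavelength(440.0))         # ~0.78 m for concert A
print(sabine_rt60(300.0, 50.0))  # ~0.97 s for a smallish room
```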

I was interested in how these things worked, which eventually led me to go back to school for another degree, in engineering.  As a senior in that program, I wrote a paper called “Numerical Solutions of Boundary Value Problems in Musical Acoustics,” which went on to win a national IEEE Student Paper Award, and which covers a big part of this course.  When I got to Stanford, I was delighted to connect with CCRMA and Julius Smith (co-instructor for this course), who is one of the world’s experts on this topic.  So you can see, this course is a culmination of 30 years of my own interests, work, and study.


Can you talk briefly about the history of traditional sampling synthesis as well as this form of digital sound synthesis?

It’s odd, really, because most people who know about synthesis think that analog came before digital (which it sort of did, but not by very long), and that sampling came before parametric synthesis.  Actually, as far back as the 1950s, people at research labs were investigating, writing about, and even synthesizing sound digitally.  Sampling was considered far too expensive to do in practice (back then memory was measured in kilobytes, not gigabytes or more).

John Kelly, Carol Lochbaum, and Max Mathews (the father of computer music) created the first digitally synthesized singing in 1960 at Bell Labs.  And the model was a digital simulation of an acoustic tube vocal tract.  Not waveforms of speech, but solving the wave equation in a model of a human head!
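The Kelly–Lochbaum approach models the vocal tract as a chain of cylindrical tube sections, with a partial reflection wherever the cross-sectional area changes. As a hedged sketch of just that core calculation (the area values below are invented for illustration):

```python
# Core of a Kelly-Lochbaum-style tube model (illustrative sketch only):
# the reflection coefficient at each junction between adjacent tube sections.
def reflection_coefficients(areas):
    """k_i = (A_i - A_{i+1}) / (A_i + A_{i+1}) for neighboring sections."""
    return [(a1 - a2) / (a1 + a2) for a1, a2 in zip(areas, areas[1:])]

# A crude, made-up 8-section area function (cm^2), vaguely vowel-like:
print(reflection_coefficients([2.6, 1.8, 1.5, 1.2, 2.0, 3.5, 5.0, 6.0]))
```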

Sampling didn’t come to us in the music world until the late 1980s, when memory was cheap enough to actually store and reproduce digitally recorded and edited sounds.  Physical modeling has made a few attempts to find its way into the mainstream, but has never gotten much traction.  An increasing number of physics-based digital algorithms are now in use in synthesizers (plug-ins for computer DAW software), some to model acoustic instruments, but ironically many more to model the sound and behaviors of old analog synthesizers, guitar amps and tube distortion, and other historical aspects of the electrical/electronic analog world.  (We touch on this just a bit in this course.)
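As one tiny taste of that analog-modeling trend (a deliberate oversimplification, not from the course), a memoryless waveshaper is about the crudest way to evoke tube-style saturation; real virtual-analog models are far more involved:

```python
import numpy as np

def tube_drive(x, gain=4.0):
    # Soft clipping via tanh: a crude stand-in for tube saturation.
    # Real tube/amp models add filtering, bias, and dynamic behavior.
    return np.tanh(gain * np.asarray(x))

print(tube_drive([0.1, 0.5, 1.0]))  # small signals pass, big ones flatten
```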


What are the largest technical challenges for this type of sound synthesis?

Sampling has an advantage in that there is just one model: record a sound and store it.  Then, play it back, possibly with some modifications (playback rate, looping, filtering).  Physical modeling requires a different model or computer algorithm for each family of sound-making things.  Stringed instruments are different from drums.  Trumpet/Trombone/Tuba are all one model, just with different parameters.  But flute and clarinet, although similar to each other (and to brass as well), require modifications and additions to the algorithms.  So physical modeling has a conceptual disadvantage, and no hardware to accelerate it (unlike sampling, which can be done with one single set of dedicated and consistent computer operations).  The graphics people standardized on the triangle as a basic unit, so graphics acceleration chips could be designed to compute and display millions of triangles quickly.  Physical modeling has no such primitive (except perhaps the delay line, but much more is needed to build a virtual sound-making object).  Also, physical models require the parameters (often many of them) to be controlled to make the sound.  A good physical model of a violin is just as hard, or harder, to “play” than a real one.  So we have to (or get to) build software to control the physical models as well.
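That delay-line primitive is exactly what drives the classic Karplus–Strong plucked string. Here is a minimal sketch (illustrative Python, not the course’s STK code) of how a delay line plus a simple loop filter “computes” a pluck:

```python
import numpy as np

def pluck(frequency=220.0, duration=1.5, sample_rate=44100, damping=0.996):
    """Karplus-Strong sketch: a noise burst circulating in a damped delay line."""
    delay_len = int(sample_rate / frequency)       # delay length sets the pitch
    buf = np.random.uniform(-1.0, 1.0, delay_len)  # noise burst models the pluck
    out = np.empty(int(sample_rate * duration))
    idx = 0
    for n in range(out.size):
        nxt = (idx + 1) % delay_len
        out[n] = buf[idx]
        # Averaging two adjacent samples is a one-zero lowpass "loop filter";
        # the damping factor models energy lost on each round trip.
        buf[idx] = damping * 0.5 * (buf[idx] + buf[nxt])
        idx = nxt
    return out

string = pluck(frequency=196.0)  # roughly the G below middle C
```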

Now this might all sound like much more trouble than it’s worth, but in games and interactive systems, many of the parameters to “play” the physical models are there already.  The game logic knows how hard and where the car is hitting the wall.  It knows how the character’s feet are landing on the ground, and what the materials on the ground are.  All of those can feed into the physics-based sound synthesis to yield the right sounds pretty much automatically, once everything is implemented and set into motion.  For sample-based synthesis, a layer of software is required to figure out which sound to play for these conditions.  That could be fairly computationally gnarly, and lots and lots (and lots and lots) of pre-recorded/authored sound would have to be stored and available, to give the game the liveliness that we’d expect.
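To make that concrete, here is a hypothetical sketch (the material data and function names are invented) of how a game’s collision report could drive a tiny two-mode “modal” impact model directly:

```python
import numpy as np

# Invented mode frequencies (Hz) and decay times (s) for two materials.
MATERIALS = {
    "wood":  ([180.0, 410.0],  [0.30, 0.15]),
    "metal": ([523.0, 1730.0], [1.20, 0.80]),
}

def impact(material, speed, sample_rate=44100):
    """One collision sound: each mode is a decaying sinusoid."""
    freqs, decays = MATERIALS[material]
    t = np.arange(int(sample_rate * 3 * max(decays))) / sample_rate
    amp = min(speed / 10.0, 1.0)  # harder hit -> louder (crude mapping)
    out = np.zeros_like(t)
    for f, tau in zip(freqs, decays):
        out += amp * np.exp(-t / tau) * np.sin(2 * np.pi * f * t)
    return out / len(freqs)

# The game's collision callback already knows these parameters:
sound = impact("metal", speed=6.5)
```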

Is there any reason (other than technical ignorance or personal preference) that someone may want to choose traditional sampling synthesis over digital modeling?

Absolutely!  Sampling can give us instant realism and recognition for many types of sounds.  The voices of known players in a video football game should be actual recordings of those people, especially if they’re famous athletes.  We are nowhere near being able to synthesize individual human voices with the kind of realism that samples give us.  If we want a bird call in the distance, it’s probably fine to just play a pre-recorded bird.  But if it happens very often, we need to vary it some, like having multiple recordings of that bird.  One of the things we teach in this course is a technique we call “parametric sampling,” which works for some classes of sounds and objects.  There we start with a recording and extract its parameters; then the recording can be thrown away and the sound resynthesized from the parameters, with super-flexible modifications.  It’s like having our cake and eating it too!!
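As a rough illustration of that analyze-then-resynthesize idea (a deliberately naive sketch, not the course’s actual parametric-sampling method), one might keep only the strongest spectral peaks of a recording and rebuild the sound, transposed at will, from those parameters alone:

```python
import numpy as np

def analyze(samples, sample_rate=44100, n_partials=8):
    """Naive analysis: (frequency, relative amplitude) of the biggest FFT bins."""
    windowed = samples * np.hanning(samples.size)
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(samples.size, 1.0 / sample_rate)
    peaks = np.argsort(spectrum)[-n_partials:]
    return [(freqs[i], spectrum[i] / spectrum.max()) for i in peaks]

def resynthesize(partials, duration=1.0, sample_rate=44100,
                 transpose=1.0, decay=3.0):
    """Rebuild from parameters alone; the original recording is not needed."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    env = np.exp(-decay * t)  # assumed decay envelope (a modeling choice)
    out = np.zeros_like(t)
    for f, a in partials:
        out += a * env * np.sin(2 * np.pi * f * transpose * t)
    return out / max(len(partials), 1)

# e.g., given some `recording` array: up three semitones, half as long
# modified = resynthesize(analyze(recording), duration=0.5, transpose=2**(3/12))
```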


What was the impetus for offering this course – both at CCRMA/Princeton and on Kadenze?

My life’s interests, research, and writing.  Julius’ life’s work in research, teaching, and writing.  I turned my CCRMA summer course into a set of open-source software (STK) and a book (the one we use in this course).  Julius turned his into a vast set of online notes and books.  Kadenze offered us the opportunity to bring this to the world.

What are your goals for students of this course?

To come out knowing that we as audio programmers can “compute” sound, just like graphics programmers compute character movements, waves, hair, skin, and lots of other cool things we’ve seen developed in computer graphics (CG) over the last couple of decades.  We’re not way behind in our understanding of how to do this, but the movie, game, and other interactive industries have been slow to adopt it.  They still opt to record and manipulate sounds, which is cool, but there’s so much awesome stuff to do in interactive, physics-based sound computation.  When students finish this course, they’ll have seen a lot of code and examples of how to do this, and demos of it controlling lots of real-time interactive things, AND it won’t be eating huge amounts of CPU in the process.  Much of this is surprisingly efficient, and it can scale for quality/memory/cycles.


Can you speak to the specific benefits of offering this course in an online format?

Bringing it to the world, for sure.  Getting the word out to lots of folks who would likely never be taking our courses at Princeton or Stanford, and letting them do it on pretty much their own terms.  So much of this course is learning that these algorithms exist, and seeing the demos and hearing the sounds in the video presentations, but then getting the actual code that implements them and playing with it yourself.  Perfect for online, I think.

Can you speak to the challenges (both technical and instructional)?

The number of students taking the course.  It’s great to see them helping each other in the forums, though.  Also, trying to make sure that all the software for the course (the physical modeling algorithms and other tools) works on all platforms.  Since the course and assignments are so much about making and modifying algorithms, everybody needs to be able to do that.  We’re trying to test on all of these: Windows (XP, Vista, 7, 8, 8.1, 10), Mac (Snow Leopard through El Capitan), Linux (various flavors)… ugh!!

The course has been running for about a week – has there been anything significant to report? Surprises? Challenges?

The enthusiasm has been great!

What is your hope for the upcoming weeks of instruction?

To survive :-)  To see even more interaction and discussion amongst the students, and to hear some reports that they’re using what they’ve learned in their own work (be it hobby or profession).

“Physics-Based Sound Synthesis for Games and Interactive Systems” is now open on Kadenze. Enroll today. To learn more about the course, visit kadenze.com, or read our blog post on the course. If you have questions about the course or would like to hear more from Perry, please send us an email at communications@kadenze.com.



