If you are anything like me, you are curious about the current state and the future of Web Audio. So I asked one of the Web Audio API spec editors, Mozilla’s Paul Adenot, if I could ask him some questions. He said sure, and was kind enough to take the time to answer them in detail. Here are his answers, stuffed with lots of useful information.
Have you ever come across digital clipping in web audio apps? I certainly have, several times (mostly in my own apps, though). This undesired effect occurs when you play several sound sources at the same time and their sum is louder than the maximum of 0 dBFS. Since a digital system is unable to reproduce higher amplitudes, you will hear nasty distortion and get an ugly waveform with its peaks chopped off.
I just wanted to mention that I did this thing called Beatsketch last year. It lets you make music on the web without having to know much about making music.
BeatSketch from Sebastian Zimmer is a collaborative music production tool that Sebastian developed for his Master’s degree in Computer Science. A song consists of multiple tracks, and each track is backed by a grid-based sequencer. Any changes you make are synchronised between connected collaborators immediately. It also supports mixing the final song down to a WAV file for downloading. An impressive set of features and a very useful exploration of possible methods of implementing collaborative working.
Chris Lowis on Web Audio Weekly #43
Has anybody created an emoji keyboard that’s actually a piano keyboard for writing musical notation? Wanted a quick way to tweet a melody.
— AudioGrains (@AudioGrainsBlog) January 5, 2016
Inspired by @AudioGrains tweet, I made this little Emoji Piano.
Emoji Piano lets you create simple melodies and encodes them with Unicode emojis which you can share and tweet.
THREE.js developer Mr.doob has posted an important comment on this.
Playing around with it, I got the idea to use the Web Audio API to spatialize the sound of an object within the matrix, so that a person wearing headphones could not only see, but also hear where an object is located.
Since the Web Audio API is great, you can do that with ease.
I LOVE Web Audio. It’s one of the most fun things in the browser right now. But doing more and more stuff with it, I came to realize that there are three limits that prevent this technology from making traditional pro audio software obsolete, at least at the moment:
When I open a big session in Samplitude, it is likely to use up to 2 GB of memory. Chrome, on the other hand, allows about 200 MB of memory per web page. If your script tries to allocate more, the page crashes. That’s a good thing: older machines and mobile devices have their hardware limits and you don’t want to push them too hard. But if you work with AudioBuffers at high sampling rates such as 192 kHz, as pro users do, you may reach this limit very quickly, if the browser supports such high sampling rates at all. I did reach the limit several times. Implementations are required to support sample rates for an OfflineAudioContext “only” up to 96 kHz.
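To put numbers on this, here is a rough back-of-the-envelope sketch. It assumes that decoded audio is held as 32-bit float samples (4 bytes each), which is how the Web Audio spec describes AudioBuffer data; actual browser internals may differ, and the function name is just made up for this example.

```typescript
// Rough memory footprint of uncompressed audio held in memory.
// Assumption: every sample is a 32-bit float (4 bytes).
function audioBufferBytes(sampleRate: number, channels: number, seconds: number): number {
  const BYTES_PER_SAMPLE = 4;
  return sampleRate * channels * seconds * BYTES_PER_SAMPLE;
}

function toMegabytes(bytes: number): number {
  return bytes / (1024 * 1024);
}

// A four-minute stereo song at two different sampling rates.
const cdQuality = toMegabytes(audioBufferBytes(44100, 2, 240));  // roughly 81 MB
const hiRes = toMegabytes(audioBufferBytes(192000, 2, 240));     // roughly 352 MB
```

So a single stereo song at 192 kHz is already well beyond a 200 MB budget, before any copies are made for processing.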
A browser is a browser. It’s a very universal piece of software you can do all sorts of things with. Since browsers are not dedicated pieces of software built for audio synthesis and manipulation, they usually use your system’s standard audio driver. On Windows, this is WASAPI (Windows Audio Session API). WASAPI (Shared Mode) isn’t suitable for pro audio applications, as it introduces round-trip latencies well over 20 ms. With Windows 10, this has gotten better. But it still cannot compete with drivers dedicated to real-time audio processing, like ASIO. In the best case, ASIO allows for latencies of about 2 ms. This can be less than the time a sound needs to travel from a speaker to your ear through the air.
People (like me) once proposed that Chrome should implement ASIO support. But let’s be realistic: That is unlikely to happen.
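The comparison between driver latency and the travel time of sound can be checked with a little arithmetic. This is only a sketch: it assumes a speed of sound of about 343 m/s (dry air at around 20 °C), and the function name is made up.

```typescript
// Assumption: speed of sound in dry air at about 20 °C.
const SPEED_OF_SOUND_M_PER_S = 343;

// Distance (in metres) that sound travels through air in the given time.
function distanceForLatency(latencyMs: number): number {
  return (SPEED_OF_SOUND_M_PER_S * latencyMs) / 1000;
}

const asioBestCase = distanceForLatency(2);   // about 0.69 m
const wasapiShared = distanceForLatency(20);  // about 6.9 m
```

In other words, 2 ms of latency is like sitting about 70 cm away from your speaker, while 20 ms is like listening to it from the other end of a large room.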
I made another little tool after CAAT, the WAV Builder. This time it’s not about testing filter algorithms, but about synthesizing waveforms which are then rendered and saved as a WAV file. It sometimes helps me to test stuff. If it helps you too, that’s great!
WAV Builder uses the great Recorder.js.
CAAT, the Custom Audio Algorithm Tester, is a page that lets you try out your own simple audio filter algorithms.
Just (mis)use the textarea for coding and listen to what you get. There are some examples on how you would do basic things.
It helps me sometimes, when I just want to check something out very quickly.
Of course, this is a very inefficient way to implement audio filter algorithms for several reasons. This is just a demo. If you’re interested in how to implement algorithms the right way, I recommend looking into the Web Audio API’s Audio Worklets or watching the talk “C++ in the Audio Industry” by Timur Doumler.
I hate squashed and over-compressed music. It quickly leads to ear fatigue, is often distorted and sounds dull and lo-fi compared to dynamic music. And although the Loudness War apparently is over, there’s still the need for proper loudness metering, so that people don’t fall into the trap of making their music too loud and destroying the liveliness of their precious recordings.
A few years ago, the European Broadcasting Union (EBU) released a recommendation on how to measure loudness and how to distribute audio material with the right loudness. After that, some metering plugins for DAWs popped up, but I haven’t seen anything like that for the web.
That’s why I created something called LoudEv, an open-source online loudness evaluator, which is compliant with EBU R128.
LoudEv uses the Web Audio API, Web Workers and the great wavesurfer.js by katspaugh to do its thing: Analyzing an audio file (on the client side, no server upload necessary) and then creating a two-dimensional loudness map of the song as well as a dynamics map. The loudness map shows the song’s short-term loudness over time. The dynamics map shows the peak to short-term loudness ratio and indicates if and which sections of a song are too loud. If it gets red-ish, the dynamic range is at or below 8 LU. If it’s black, you can hardly call that music anymore. If most sections of your song are green-ish, you’re on the safe side. This color scheme derives from the recommendations of mastering engineer Ian Shepherd. According to him, your masters should never become louder than -10 LUFS to prevent a potential loss of punch, impact and space in your mix. You should listen to him, he knows what he is talking about and his masters sound great.
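The value behind the dynamics map is simple to compute once you have a peak level and a short-term loudness for a section. The following is a minimal sketch and not LoudEv’s actual code: the function names are made up, and only the 8 LU threshold is taken from the description above.

```typescript
// Peak to short-term loudness ratio in LU.
// peakDbfs: peak level of the section in dBFS (e.g. -1)
// shortTermLufs: short-term loudness of the same section in LUFS (e.g. -9)
function peakToShortTermLU(peakDbfs: number, shortTermLufs: number): number {
  return peakDbfs - shortTermLufs;
}

// A section counts as too loud when its dynamic range is at or below the threshold.
function isTooLoud(ratioLU: number, thresholdLU: number = 8): boolean {
  return ratioLU <= thresholdLU;
}

const squashed = peakToShortTermLU(-1, -8);  // 7 LU, would be drawn red-ish
const dynamic = peakToShortTermLU(-1, -14);  // 13 LU, on the safe side
```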
The technical side
To obtain the subjective loudness of a piece, the EBU recommendation requires R128-compliant meters to apply some filters (a high-pass and a shelving filter) to the signal. These filters are described in the ITU loudness standard document. But unfortunately, the document does not provide frequency, Q or gain values for these filters. Instead, it gives us filter coefficients for a biquad filter that only works with audio at a sampling rate of 48 kHz.
Having obtained a 48 kHz version of my audio, I decided to implement the biquad filter function myself, after learning that, as of today, the creation of custom IIR filters with the Web Audio API hasn’t been implemented in Chrome yet.
Due to my initial lack of knowledge in implementing biquad filters myself, I had a tough time with the biquad filter equation, but then the great Audio EQ Cookbook by Robert Bristow-Johnson came to the rescue and showed me code that I could use:
y[n] = (b0/a0)*x[n] + (b1/a0)*x[n-1] + (b2/a0)*x[n-2] - (a1/a0)*y[n-1] - (a2/a0)*y[n-2]
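Translated into code, the equation above could look like the following sketch. This is not LoudEv’s actual source. The function and constant names are made up, and the two coefficient sets are the 48 kHz values as I know them from ITU-R BS.1770, so please double-check them against the standard before relying on them.

```typescript
interface BiquadCoefficients {
  b0: number; b1: number; b2: number;
  a0: number; a1: number; a2: number;
}

// Direct implementation of the difference equation above.
function biquad(input: ArrayLike<number>, c: BiquadCoefficients): Float64Array {
  const output = new Float64Array(input.length);
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0; // x[n-1], x[n-2], y[n-1], y[n-2]
  for (let n = 0; n < input.length; n++) {
    const x0 = input[n];
    const y0 = (c.b0 / c.a0) * x0 + (c.b1 / c.a0) * x1 + (c.b2 / c.a0) * x2
             - (c.a1 / c.a0) * y1 - (c.a2 / c.a0) * y2;
    output[n] = y0;
    x2 = x1; x1 = x0;
    y2 = y1; y1 = y0;
  }
  return output;
}

// Assumption: coefficients for 48 kHz audio, to be verified against ITU-R BS.1770.
const SHELVING_48K: BiquadCoefficients = {
  b0: 1.53512485958697, b1: -2.69169618940638, b2: 1.19839281085285,
  a0: 1, a1: -1.69065929318241, a2: 0.73248077421585,
};
const HIGHPASS_48K: BiquadCoefficients = {
  b0: 1, b1: -2, b2: 1,
  a0: 1, a1: -1.99004745483398, a2: 0.99007225036621,
};

// Runs one channel of 48 kHz audio through both weighting filters in series.
function kWeight(samples: ArrayLike<number>): Float64Array {
  return biquad(biquad(samples, SHELVING_48K), HIGHPASS_48K);
}
```

Keeping the four state variables in plain local variables avoids any index juggling at the start of the buffer, where x[n-1] and friends do not exist yet and are treated as zero.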
So finally, I had my R128-compliant values for the short-term loudness.
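For reference, here is a minimal sketch of how short-term loudness values can be derived once the filtering is done. It assumes that each channel has already been passed through the two weighting filters, that only mono or stereo material is measured (channel weight 1.0), and that the window is 3 seconds long, as EBU R128 defines short-term loudness. The function names and the hop size are made up for this example.

```typescript
// Loudness of one block of already filtered audio, following the ITU formula:
// -0.691 + 10 * log10(sum of the mean squares of all channels)
function blockLoudness(channels: ArrayLike<number>[], start: number, length: number): number {
  let sumOfMeanSquares = 0;
  for (const channel of channels) {
    let squares = 0;
    for (let n = start; n < start + length; n++) {
      squares += channel[n] * channel[n];
    }
    sumOfMeanSquares += squares / length; // weight 1.0 for mono, left and right
  }
  // Note: a completely silent block yields -Infinity.
  return -0.691 + 10 * Math.log10(sumOfMeanSquares);
}

// One loudness value (in LUFS) per hop, each measured over a 3-second window.
function shortTermLoudness(
  channels: ArrayLike<number>[],
  sampleRate: number,
  hopSeconds: number = 1,
): number[] {
  const windowLength = Math.round(3 * sampleRate);
  const hop = Math.round(hopSeconds * sampleRate);
  const totalLength = channels[0].length;
  const values: number[] = [];
  for (let start = 0; start + windowLength <= totalLength; start += hop) {
    values.push(blockLoudness(channels, start, windowLength));
  }
  return values;
}
```

Plotting these values over time gives you the raw data for a loudness map.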
Measuring True Peak
After that, I tried to implement a true-peak meter which considers inter-sample peaks. The recommendation suggests the following way to do this: Resample (upsample/zero-stuff + interpolate) the signal to 192 kHz and then look for the sample with the largest absolute value (see Annex 2 of the ITU document).
Even though this should work in theory, I had to learn that creating an OfflineAudioContext and an AudioBuffer at 192 kHz often results in a crash.
Chrome allows for a memory limit of about 200 MB per web page. This limit is reached very quickly when you deal with 192 kHz audio.
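One way to sidestep the memory problem might be to never store the upsampled signal at all and only keep track of the largest interpolated value. The following is a rough sketch of that idea under stated assumptions: it interpolates with a Hann-windowed sinc kernel, which is my own stand-in and not the FIR filter from the ITU document, and the function names, the factor of 4 (48 kHz to 192 kHz) and the kernel width are all made up for this example.

```typescript
// Hann-windowed sinc kernel. t is the distance to an original sample,
// measured in upsampled steps; span is where the window reaches zero.
function interpolationKernel(t: number, factor: number, span: number): number {
  if (t === 0) return 1;
  const x = (Math.PI * t) / factor;
  const sinc = Math.sin(x) / x;
  const hann = 0.5 * (1 + Math.cos((Math.PI * t) / span));
  return sinc * hann;
}

// Estimates the true peak (linear, not dB) of one channel by evaluating the
// interpolated signal at `factor` points per original sample.
function truePeak(samples: ArrayLike<number>, factor: number = 4, halfWidth: number = 12): number {
  const span = halfWidth * factor;
  const upsampledLength = (samples.length - 1) * factor + 1;
  let peak = 0;
  for (let m = 0; m < upsampledLength; m++) {
    // Only original samples within `halfWidth` of position m contribute.
    const kMin = Math.max(0, Math.ceil((m - span) / factor));
    const kMax = Math.min(samples.length - 1, Math.floor((m + span) / factor));
    let value = 0;
    for (let k = kMin; k <= kMax; k++) {
      value += samples[k] * interpolationKernel(m - k * factor, factor, span);
    }
    peak = Math.max(peak, Math.abs(value));
  }
  return peak;
}

// A sine at a quarter of the sampling rate, shifted by 45 degrees: every
// sample is ±0.707, but the waveform in between reaches roughly 1.0.
const shiftedSine = Array.from({ length: 64 }, (_, n) => Math.sin((Math.PI * n) / 2 + Math.PI / 4));
const samplePeak = Math.max(...shiftedSine.map(Math.abs));
const interSamplePeak = truePeak(shiftedSine);
```

This trades memory for CPU time, since every upsampled point is computed from scratch, so it is probably best run inside a Web Worker.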
Next, I will try the filter suggested by the ITU document. It provides filter coefficients for a FIR interpolation of an upsampled (zero-stuffed) signal. Russel McClellan at iZotope has written an insightful assessment of this filter.
That is the end of the first chapter of bringing R128 onto the web. There’s a lot going on with the Web Audio spec at the moment so I expect to be able to do things soon that I cannot do now.
Let me know if you know things that I don’t know or if you wish to contribute. The source code of LoudEv is on GitHub.
Try it out here and let me know what you think:
Please be aware that it only works with mono/stereo files and audio file types that are supported by your browser. Both Chrome and Firefox accept MP3, for example.