
How We Tested

Still, stuff the science. We wanted to see whether the choice of lossless or lossy audio format made a difference when tracks were listened to by reasonably ordinary subjects, including members of the TR team. We won't pretend that we took the most scientifically rigorous approach or brought out an armoury of test equipment to check and compare waveforms. Instead, we ripped four tracks from CD to FLAC using DBPowerAmp CD Ripper, then used the freeware WavePad editor to create thirty-second excerpts from those files for testing purposes. We then used DBPowerAmp converter to make two MP3 encodes of those tracks, one at a constant bit rate (CBR) of 192kbps, and one at 320kbps. The LAME encoder, widely considered the best for high-bit-rate MP3, was selected for encoding duties. 192kbps is widely considered the minimum bit rate for decent-quality MP3 audio. 320kbps is the highest bit rate the MP3 format supports and the top-end standard for most MP3 players, and it is the one adopted by online music stores such as 7 Digital or Play.com. We wanted to see whether our guinea pigs could spot the difference between these files and the original FLACs.

FLAC gives you true CD quality audio, but check out the impact on your hard disk space!
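To put rough numbers on that disk-space point, here is a minimal Python sketch of the arithmetic for one thirty-second excerpt. It assumes "kbps" means 1,000 bits per second and ignores headers and tags; the function names are made up for illustration. FLAC is left out because its size depends on the music, but it will sit somewhere below the uncompressed CD figure.

```python
# Rough file-size arithmetic for a thirty-second excerpt.
# Assumptions: 1 kbps = 1000 bits per second; header and tag overhead ignored.

def cbr_size_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Size of a constant-bit-rate stream in bytes."""
    return bitrate_kbps * 1000 * seconds // 8


def cd_pcm_size_bytes(seconds: int) -> int:
    """Uncompressed CD audio: 44,100 samples/s, 2 channels, 2 bytes per sample."""
    return 44_100 * 2 * 2 * seconds


EXCERPT_SECONDS = 30

sizes = {
    "MP3 192kbps": cbr_size_bytes(192, EXCERPT_SECONDS),
    "MP3 320kbps": cbr_size_bytes(320, EXCERPT_SECONDS),
    "CD PCM (uncompressed)": cd_pcm_size_bytes(EXCERPT_SECONDS),
}

for name, size in sizes.items():
    print(f"{name}: {size / 1_000_000:.2f} MB")
```

On these assumptions the uncompressed excerpt is roughly seven times the size of the 192kbps MP3 and a little over four times the size of the 320kbps one.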

Our test tracks went onto an Asus notebook. The kind chaps at hifi headphones had provided us with an iBasso D3 Python USB DAC and headphone amplifier - similar to the iBasso D2 we reviewed earlier in the year, but with enhanced sound quality and a little more oomph in the output stages. We used this to provide the audio output. Into the D3 we plugged a pair of BeyerDynamic DT770 Pro headphones, ensuring that our test subjects would get excellent (though not ridiculously high-end) audio without being bothered by any background noise (though there wasn't much inside the TR office used for testing).

Each test subject heard each track in two versions. Massive Attack's Small Time, Shot Away and Radiohead's There There were heard in both 192kbps MP3 and FLAC formats, while Maxwell's Ascension and Yumeji's Theme from the In the Mood for Love soundtrack were heard in 320kbps and FLAC. In each case, the subject simply had to say which version sounded best. We gave them two listens to each version, and the option to listen again if they wanted. We then jotted down their findings, plus any comments they had as to why they had made a particular judgement.

No test subject could see the screen during testing, so all tests were conducted blind. We took every step possible to ensure that the subjects did not know which version of a track they were listening to at any time.
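For anyone wanting to repeat this kind of blind comparison at home, one way to keep the listener (and, ideally, the operator) from guessing the order is to let a random number generator decide it. The sketch below is only an illustration of that idea, not a description of what we did: the `build_plan` function and the per-subject seed are hypothetical, and only the track pairings come from the test described above.

```python
import random

# Track pairings as described in the article; everything else is illustrative.
TRACKS = {
    "Small Time, Shot Away": "MP3 192kbps",
    "There There": "MP3 192kbps",
    "Ascension": "MP3 320kbps",
    "Yumeji's Theme": "MP3 320kbps",
}


def build_plan(seed: int) -> list[tuple[str, str, str]]:
    """Return (track, first version, second version) in a shuffled order."""
    rng = random.Random(seed)  # a seed per subject makes the plan reproducible
    plan = []
    for track, lossy in TRACKS.items():
        versions = ["FLAC", lossy]
        rng.shuffle(versions)  # randomise which version is played first
        plan.append((track, versions[0], versions[1]))
    rng.shuffle(plan)  # randomise the order of the tracks as well
    return plan


for track, first, second in build_plan(seed=1):
    print(f"{track}: play {first}, then {second}")
```

The plan stays with the operator; the subject only ever hears "version one" and "version two".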

Ada Mari Frost

June 14, 2013, 6:07 am

High-quality FLAC sounds warmer than MP3. Most people equate that warm sound with not being crisp, which is how you get a lot of people thinking the MP3s are better.


January 4, 2014, 1:41 pm

Those are the worst graphs ever. The axes are so poorly labelled that I can't understand what they are supposed to show.


February 16, 2014, 2:40 pm

This article is a few years old now and out of date, but even taking that into consideration...

Unless I missed it, no mention was made of the MP3 decoder being used. They're not all created equal with respect to accurate decoding. WMP had a poor decoder prior to version 7. The Winamp MP3 decoder prior to version 2.666 was known to be substandard.

Constant bitrate is never going to be the best method for audio encoding any more than it'd be the best method for encoding video.

Bitrate is not the only variable which affects quality. The LAME MP3 encoder has settings for CBR encoding which use different algorithms and result in different encoding speeds at exactly the same bitrate. The encoding time for 5 minutes of stereo audio using my (aging) PC is around 7 seconds for the LAME default of q3, and 38 seconds for q0 (CBR 192k, LAME 3.99). The article makes no mention of the quality setting used.

From version 3.99, LAME's CBR encoding uses the psy model from the VBR code. Even today, at the same average (lower) bitrate, CBR encoding will never match the quality of VBR encoding. Back in 2009, it possibly wouldn't have done so even at very high bitrates.

The golden-eared folks over at hydrogenaudio have a page dedicated to LAME which states any of the LAME VBR presets from V3 to V0 should normally be "transparent", i.e. it's not possible to distinguish the MP3 from the original. The average bitrate for V3 is around 175 kbps. For V0 it's 245 kbps.

The article refers to lossless audio and bit depths/sample rates higher than those used for CD sounding better, a claim which seems to have been debunked.

Hannes Minkema

June 8, 2014, 2:00 pm

"Only one person could accurately pinpoint which tracks were MP3 and which tracks were FLACs in every case."

I appreciate the effort, and I believe the general conclusion is about right. Yet it is unwarranted to state that only one person *can* (or *could*, which is the past tense of *can*) accurately distinguish MP3 from FLAC. The statement should have been that only one person *did* this. But this, by itself, says *nothing* about his general capacity to do so.

This is not nitpicking. I am sure that many readers misinterpret this statement, and believe that this one guy actually has better ears than all the rest. Heck, you might even believe this yourself. But there is no proof of that.

Take one hundred people. Ask them to draw the Queen of Hearts from a stack of playing cards. Two, maybe three of them will do so at the first draw. How is that possible? Should we conclude that they are psychic? Of course not. They are just lucky.

Take one hundred people. Ask them to tell which of two playing cards is red, and which is black. Fifty of them will be totally right at the first guess. Ask these fifty people to do it again. Twenty-five will succeed. Ask them again: twelve of them will be so lucky as to perform the trick thrice. Six will do the trick four times in a row, and three are so 'psychic' that they manage to guess right five times in a row. Incredible! Not.
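The halving above is easy to check. A minimal Python sketch, assuming every guess is an independent 50/50 shot (the function name is mine):

```python
def expected_survivors(people: int, rounds: int) -> float:
    """Expected number of people who guess right in every one of `rounds` tries."""
    return people * 0.5 ** rounds


for rounds in range(1, 6):
    print(f"after {rounds} guess(es): {expected_survivors(100, rounds)} people")
```

That gives 50, 25, 12.5, 6.25 and 3.125, which I rounded down to whole people above.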

They have no special gifts, of course. They are just assisted by blind chance. The next time they are put to the same test, they most probably won't be so lucky. So much for their 'general capacity'.

During the MP3-to-FLAC trials, the one guy who was able to guess right four times out of four trials was assisted by blind chance too. You can't erase blind chance. That's why inferential statistics take 'blind chance' into account. And that's why the statement 'one guy *could* do it' is misleading. He couldn't. He just did. The combination of his ability AND blind chance did the trick. But that's not the same.

The question was not if one person *could* distinguish MP3 and FLAC, but whether a more general null hypothesis was rejected or not. That null hypothesis would be that, out of seven people each trying four times to distinguish MP3 from FLAC, no more guesses turned out to be correct than can be accounted for by blind chance. It is rejected only if a result at least as good as the observed one would occur by chance less than 5% of the time.
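To make that concrete, here is a rough Python sketch of the numbers involved. It rests on assumptions of mine, not on anything the article reports: each of the 28 guesses (seven people, four trials) is treated as an independent coin flip under the null hypothesis, and the test is one-sided.

```python
from math import comb


def p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


PEOPLE, TRIALS = 7, 4
TOTAL = PEOPLE * TRIALS  # 28 guesses in all

# Chance that one pure guesser gets four out of four.
p_perfect = 0.5**TRIALS

# Chance that at least one of seven pure guessers gets four out of four.
p_someone_perfect = 1 - (1 - p_perfect) ** PEOPLE

# Smallest number of correct guesses out of 28 that beats chance at the 5% level.
critical = next(k for k in range(TOTAL + 1) if p_at_least(k, TOTAL) < 0.05)

print(f"one guesser scores 4/4: {p_perfect:.4f}")
print(f"at least one of seven scores 4/4: {p_someone_perfect:.4f}")
print(f"correct guesses needed out of {TOTAL}: {critical}")
```

On these assumptions there is roughly a 36% chance that at least one of seven pure guessers scores four out of four, and the group as a whole would need at least 19 correct guesses out of 28 before chance could be ruled out at the 5% level.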

Anyone with scientific training could have told you this. It is Statistics 101.
