40

In the never-ending debate in the audiophile community about sound quality and what humans can or cannot hear, it is very often cited that the upper limit of the audible range of human hearing is 20 kHz, give or take. Some indicate that this is a conservative estimate, and that the actual upper limit is lower than that (~18 kHz), while others suggest that sounds could be heard or otherwise perceived up to about 25 kHz-30 kHz:

Sampling rates higher than about 50 kHz to 60 kHz cannot supply more usable information for human listeners.

And some others suggest that there is substantial variation between individuals around the upper limit:

The human range is commonly given as 20 to 20,000 Hz, though there is considerable variation between individuals, especially at high frequencies

So is there any biological evidence whatsoever that young, healthy humans can hear or otherwise perceive (or sense) sound waves above 20 kHz? And what would be a conservative estimate of the absolute upper limit of the audible spectrum for humans (i.e. usable sound information for human ears and senses)?

landroni

4 Answers

47

Yes, we can. By means of bone conduction we can hear up to 50 kHz, and values up to 150 kHz have been reported in the young (Pumphrey, 1950). However, it is indeed generally agreed that 20 kHz is the upper acoustical hearing limit through air conduction. The reason for this is debated, but the transfer function of the ossicle chain in the middle ear is a suspected culprit in setting the upper frequency limit to 20 kHz (Hemila et al., 1995).

Hence, using normal speakers or headphones, 20 kHz is a very reasonable absolute upper limit. Note that the Nyquist criterion necessitates higher sampling rates (at least 40 kHz), so your statement of using 50k-60k sound cards is correct. If you decide on using bone conduction aids, you might start to think about using higher sampling rates still.
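
As a rough illustration of the arithmetic behind that statement, here is a minimal Python sketch. The function names are made up for this example, and it covers only the idealized Nyquist relation (sampling rate = 2 × highest frequency). Real converters need some headroom above that for the anti-aliasing filter.

```python
def min_sampling_rate_hz(max_frequency_hz: float) -> float:
    """Idealized lower bound on the sampling rate: twice the highest frequency."""
    return 2.0 * max_frequency_hz


def max_representable_frequency_hz(sampling_rate_hz: float) -> float:
    """Highest frequency a sampling rate can represent (half the rate)."""
    return sampling_rate_hz / 2.0


# Air conduction: a 20 kHz hearing limit needs at least 40 kHz sampling.
print(min_sampling_rate_hz(20_000))            # 40000.0
# CD audio at 44.1 kHz covers content up to 22.05 kHz.
print(max_representable_frequency_hz(44_100))  # 22050.0
# The 50 kHz bone conduction figure quoted above would need 100 kHz sampling.
print(min_sampling_rate_hz(50_000))            # 100000.0
```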

Here is an example of a commercially available bone conduction headset (AfterShokz):

[Image: bone conduction headset]

These devices have the potential to increase the upper limit because they bypass the middle ear and hence circumvent its limiting transfer function. They induce vibrations in the temporal bone, which travel via the bone directly to the inner ear. See the following picture from The High Tech Society:

[Image: bone conduction pathway from the temporal bone to the inner ear]

As a side note: as you grow older, hearing sensitivity at high frequencies is severely reduced, and even the 6 kHz range is severely affected in the elderly (picture from John Perr's website):

[Image: HFHL (high-frequency hearing loss with age)]

Disclaimer: I haven't looked into the upper frequency limits that bone conduction headsets can actually reach. I am only talking about theoretical limits.

References
- Pumphrey, Nature 1950; 166:571
- Hemila et al., Hear Res 1995; 85:31-44

AliceD
  • Thanks! I'm curious what you mean by this: "the maximal acoustic frequencies are quite dramatically undersampled at 50 kHz". This source is making a solid argument that technically the analog sound wave is perfectly recreated up to half the sampling rate (as per Nyquist theory). So if you sample at 96kHz (commonly done), you can recreate perfectly up to 48kHz. Do you mean that some things like harmonics may be missed? – landroni Jan 21 '15 at 12:54
  • 1
    Yeah, I see your point, and that's what I was thinking, too. But that guy seems to be making a serious argument. According to my understanding of his arguments, whether you use 44.1 kHz or 192 kHz, the data is always sufficient to perfectly recreate the exact original sinewave (up to the bandwidth limit, here ~22 kHz), as per the theory. In each case, there is only one band-limited signal that passes through each sample point, and it is a unique solution (see minutes 5-8 of the video). – landroni Jan 21 '15 at 13:05
  • One other bit... Do I understand correctly that the reported "substantial variation between individuals" implies that very often adults will have an effective upper limit lower than the generally agreed upon 20 kHz? – landroni Jan 21 '15 at 13:38
  • 1
    Correct. Sensitivity at 16 kHz, for example, dwindles in young adults – AliceD Jan 21 '15 at 13:41
  • 2
    Note that the Nyquist sampling limit is the minimum sampling rate necessary to have any hope of reproducing the input signal before the signal becomes unreproducible. At 2f the reproduced signal is potentially distorted. This works well if you don't care about amplitude distortions but only care about the relative phases of the signal (such as for a digital square-wave signal) but is very poor at accurately capturing full analog signals. For applications other than audio (such as networking, imaging etc) the rule of thumb is to use 1.5 or 2 times the Nyquist frequency. In other words, 3f or 4f. – slebetman Jan 22 '15 at 03:46
  • Doesn't hurt to note that Shannon-Nyquist assumes you start with a time-based function f(x), NOT a live audio stream. It assumes the creation of stream-accurate samples has already occurred, and concerns encoding them in a reproducible manner. This is not equivalent to the sampling rate required for recording, which can be as high as 10x the maximum desired frequency. For illustration, consider 2 22khz tones immediately next to each other in a 44khz sampling window; the window will express a single 22khz tone, losing the second one. S-N is almost universally mis-applied in this manner. – Asher Oct 27 '22 at 19:08
  • I think your first sentence misrepresents Pumphrey. He said "As the frequency was varied continuously from 12 to 100 kc./s., the pitch rose with it to 15 kc./s. and stayed there as the frequency was further increased." In other words, he found evidence that high-frequency oscillations could be heard through aliasing at lower frequencies, not that sampling rates above 44.1 kHz are needed to reproduce perceived sounds accurately, even when using bone conduction. – benrg Feb 07 '23 at 20:36
8

If all the processes through which a signal passes are linear, then it makes sense to speak in terms of a maximum useful-content frequency. If a signal passes through non-linear stages, however, it is possible that frequency content which would in and of itself be above the range of hearing may interact with other frequency content which is also above that range, in such a fashion as to produce artifacts which are well within the range of hearing.

A rather annoying example of that may be observed when a GSM cell phone is located near audio equipment. All of the frequency content transmitted by the phone exceeds the upper limit of human hearing by multiple orders of magnitude, and yet the annoying buzz picked up by the audio equipment clearly does not.

What happens is that the frequency content of the cell phone's transmissions contains numerous frequencies which are separated by tens or hundreds of hertz, and many amplifier stages don't completely filter out the radio-frequency content, but are unable to process it without distortion. This distortion causes the amplifier to output sum and difference frequencies, some of which are very much in audible range.

Many kinds of objects and materials will reflect sound waves in a way that varies non-linearly with the sounds being reflected. If a diaphragm which had more freedom of movement in one direction than the other were hit with a mixture of 100,000 Hz and 100,100 Hz tones, it would "buzz" at 100 Hz, whereas it would not do so if hit by either tone alone; further, a conventional recording of the combined tones by a high-quality microphone would detect nothing, so playing it back in the presence of the diaphragm would yield no buzz.
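
To make the diaphragm example concrete, here is a small Python sketch under loudly stated assumptions: the "diaphragm" is a made-up asymmetric response that passes positive excursions fully and negative ones at only 20%, and the simulation rate of 1 MHz is chosen just so the two ultrasonic tones can be represented. It is an illustration of intermodulation, not a model of any real material or ear.

```python
import math

SAMPLE_RATE = 1_000_000  # Hz; simulation rate, high enough for the 100 kHz tones
N = 50_000               # samples = 0.05 s, a whole number of cycles of every tone


def two_tone(n: int) -> float:
    """Sum of a 100,000 Hz and a 100,100 Hz tone at sample index n."""
    t = n / SAMPLE_RATE
    return math.sin(2 * math.pi * 100_000 * t) + math.sin(2 * math.pi * 100_100 * t)


def asymmetric(x: float) -> float:
    """Hypothetical diaphragm: moves freely one way, only 20% as far the other."""
    return x if x >= 0 else 0.2 * x


def amplitude_at(samples: list, freq_hz: float) -> float:
    """Amplitude of the freq_hz component, found by correlating with cos and sin."""
    w = 2 * math.pi * freq_hz / SAMPLE_RATE
    re = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


linear = [two_tone(n) for n in range(N)]   # what a linear system passes along
bent = [asymmetric(x) for x in linear]     # after the asymmetric "diaphragm"

linear_100 = amplitude_at(linear, 100)  # essentially zero: no 100 Hz content
bent_100 = amplitude_at(bent, 100)      # clearly non-zero: the 100 Hz "buzz"
print(linear_100, bent_100)
```

The linear mixture contains no energy at 100 Hz, which matches the point that a linear recording chain would pick up nothing audible. The difference tone appears only after the asymmetric stage.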

It would be rare for aesthetically-pleasing audio content to have frequency content over 20 kHz which contributed materially to its aesthetic aspects. It would certainly be possible, however, to construct frequency content over 20 kHz which could be heard in many common environments, and whose perceived sound would vary in ways that would not be possible using only frequencies below 20 kHz. It would also not be implausible that some kinds of musical instruments (e.g. handbells) might produce mixtures of high-frequency content which would sound different to different people who found them pleasing, in ways which could not be mimicked using only directly-audible frequencies.

It may be possible for an audio technician working with a functional MRI team to create, for an individual, a sound which was indistinguishable from the original, but for another individual that recreated sound might sound nothing like the original.

Peter Mortensen
supercat
  • Very interesting... But how would that work in the context of audio recording and playback? Say a handbell produced a mixture of high frequency content which would have somehow made its way into the audible spectrum, what would the recording pick up? The ultrasound, or its impact on the audible spectrum? And upon playback, what would make the listener tickle: hearing the recorded audible spectrum, or the ultrasound which would dynamically interfere with the audible spectrum? So, to ensure that the peculiar handbell sounds aren't lost, is it imperative to reproduce the high-freq content? – landroni Jan 21 '15 at 19:16
  • 2
    @landroni: I have neither the tools nor expertise to identify the extent to which frequency aliasing caused by intermodulation of ultrasonic-frequency content affects perception in non-contrived cases. I would expect that the eardrum would introduce some distortion because of its asymmetrical loading, and that the details vary between one person and the next, but I don't know the extent of the effect or its variation. I do know that the auditory experience of hearing a handbell concert is different from that of hearing a recording, but I have no idea whether the difference... – supercat Jan 21 '15 at 19:26
  • 1
    ...stems from 20Khz content, 30Khz content, 40Khz content, or something else entirely. A recording at a 44KHz sample rate must not capture anything above 22KHz, though if room acoustics or the microphone caused a combination of higher-frequency sounds to produce a lower-frequency sound, the recording would capture that lower frequency. My main point is that it's entirely plausible that capturing higher frequency content may, in at least some cases, make a recording sound better even if the content itself would not be directly perceptible. – supercat Jan 21 '15 at 19:32
  • So am I getting this right: imagine that a handbell concert was recorded at 192 kHz sampling. Then when played back at 192 kHz (without any downsampling), it would be possible that the higher frequency content in the 20-96 kHz band could combine in such a way as to produce a lower-frequency sound which would then be audible by the listener... Correct? And in that case, would the playback feature that lower-freq content twice: once from the recording, and once from the playback containing the high-freq sounds which would once more combine to create low-freq waves? I'm slightly confused here... – landroni Jan 21 '15 at 19:54
  • Or is it instead the case that when the hi-freq content (20-96 kHz) combine in the live performance and the 192 kHz recording equipment captures that lower frequency sound (0-20kHz), and if you downsampled the recording from 192 kHz to 44.1 kHz and played it back, that even then the low-freq content (caused by the hi-freq combination) would still be audible to the listener? In that case the 192 kHz high-resolution recording would be redundant for playback purposes to humans, and safely discarded while keeping just the 44.1 kHz downsampled version... – landroni Jan 21 '15 at 20:01
  • I guess the ultimate purpose of my question is to determine whether playback of 192 kHz recordings is utterly unnecessary, as all content that we may hear (below more or less 20 kHz) is already within the 44.1 kHz samples (including low-freq sounds generated by high-freq waves)... And whether all recorded info above 44.1 kHz sampling is simply irrelevant for playback purposes... – landroni Jan 21 '15 at 20:24
  • 1
    I suspect the most important kind of intermodulation distortion from the standpoint of human perception would occur within the ears of each individual listener. One listener might perceive a combination of equal-strength 60kHz and 55kHz as a 5kHz tone whose strength is 20dB down; another person might perceive the combination as a 5kHz tone whose strength is 25dB down. For the first listener, a recording where the higher frequencies were replaced with a 5kHz tone that was attenuated 20dB would be indistinguishable from the original, but the second listener might find... – supercat Jan 21 '15 at 20:38
  • 1
    ...such a recording to be a tiny bit more shrill than the original (for him a proper recording would have attenuated by 25dB). I would expect the differences to generally be subtle in any case. – supercat Jan 21 '15 at 20:40
  • ...but either way, if recorded at 192 kHz sampling, both the 60kHz and 55kHz waves would not recreate the 5kHz tone when played back, correct? From what I understand, the 5kHz tone would be a product of both high-frequencies and the acoustics of the venue, and would already be recorded within the 44.1 kHz samples... When playing back the recording, humans would only hear what was recorded below 22 kHz, while all those high-freq waves would no longer have the same effect as they did in the venue... – landroni Jan 21 '15 at 21:13
  • 1
    If the listener's ear, when given 60KHz and 55Khz at some amplitude would create a 5KHz signal at a lower amplitude, capturing both frequencies on a 192KHz recording and playing them back would achieve a similar result. Someone with a good computer model of a particular listener's ear might be able to produce a 44.1KHz recording which that person would find indistinguishable from the higher-quality one, but other people might find the two recordings to be different. – supercat Jan 21 '15 at 21:32
  • capturing both frequencies on a 192KHz recording and playing them back would achieve a similar result. Very entertaining! Thanks so much for all these explanations. So this would mean that in some instances, capturing and replaying >20 kHz frequencies could create something different below 20 kHz, something unique for some ears... Apart from the handbells example, what other musical instruments could provoke such runaway high-frequencies? Flutes? And would such effects be confined to classical music, or other genres like jazz could also be affected by such phenomena? – landroni Jan 21 '15 at 21:40
  • 1
    @landroni: A C4 handbell sounds a roughly-262Hz fundamental; a C5 is an octave up from that, C6 is two octaves up, etc. Mallmark sells handbells up to C9 (about 8KHz). I can't think of any other instruments other than pipe organs or electronic synthesizers which allow a musician to play notes with a fundamental that high. Since the harmonic richness of a pipe is a function of its height/width ratio and high-pitched pipes are very short, I'm not sure how much predictable ultrasonic harmonic content the highest-pitch pipes have. – supercat Jan 22 '15 at 00:15
  • Interesting... And a related follow-up. As mentioned in the OP, Wikipedia says that "Sampling rates higher than about 50 kHz to 60 kHz cannot supply more usable information for human listeners", implying that reproducing frequencies of only up to 25 kHz-30 kHz could be useful for the hearing experience... And I suspect via the mechanism you are proposing: hi-freq waves combining to generate an audible low-freq tone. So in your opinion is there a reasonable upper-bound for such high-frequencies that could conspire to create aesthetically-pleasing low-freq content? I am trying to understand... – landroni Jan 22 '15 at 06:35
  • ...whether, to capture most "useful" cases of intermodulation of ultrasonic-frequency content, the commonly used 96 kHz sampling rate is sufficient to cover such cases (capturing up to 48 kHz frequencies, well above the Wikipedia-suggested 30 kHz upper limit); or whether going to 192 kHz sampling rates (up to 96 kHz frequencies) is necessary. So would one need to play back 96 kHz or 192 kHz samples in order to potentially cover these higher-freq-generated and aesthetically-pleasing low-freq content? My intuition is that 96 kHz sampling is sufficiently ample to cover most such cases... – landroni Jan 22 '15 at 06:42
  • 1
    I would suggest that, given a 96kHz recording, it should be possible to produce a 44kHz recording aesthetically indistinguishable from the original by applying a tiny amount of distortion or other processing before down-conversion, but the kind of processing required for best results might vary depending upon the nature of the original sound. If an individual would perceive a 96kHz original as aesthetically superior to a "straight" 44kHz conversion, some combination of processing parameters would probably yield a 44kHz file the person would consider to be better yet. – supercat Jan 22 '15 at 16:25
  • 1
    If one knew in advance exactly how to configure the microphone and recording, it wouldn't be necessary to record at 96kHz; the advantage of recording at 96kHz would stem from the fact that one could experiment with parameters in post production and then yield a result essentially equivalent to having recorded at 44kHz with exactly the right parameters. – supercat Jan 22 '15 at 16:30
  • We're drifting far more into music/recording issues than biology, however. – supercat Jan 22 '15 at 16:45
  • Absolutely. Thank you so much for all the wonderful explanations! To revert to biology, from what I hear, the bottom line is that adult humans do not hear above 20 kHz. But it is perfectly conceivable that ultrasonic waves interact to generate audible content (below 20 kHz). This is likely to be very subtle, and to vary from one individual to another. This is an incredible insight, that may or may not validate the need to record and play back at higher sampling resolutions (e.g. 96 kHz). – landroni Jan 22 '15 at 17:51
1

Taken from "24/192 Music Downloads... and why they make no sense":

Sampling rate and the audible spectrum

I'm sure you've heard this many, many times: The human hearing range spans 20Hz to 20kHz. It's important to know how researchers arrive at those specific numbers.

First, we measure the 'absolute threshold of hearing' across the entire audio range for a group of listeners. This gives us a curve representing the very quietest sound the human ear can perceive for any given frequency as measured in ideal circumstances on healthy ears. Anechoic surroundings, precision calibrated playback equipment, and rigorous statistical analysis are the easy part. Ears and auditory concentration both fatigue quickly, so testing must be done when a listener is fresh. That means lots of breaks and pauses. Testing takes anywhere from many hours to many days depending on the methodology.

Then we collect data for the opposite extreme, the 'threshold of pain'. This is the point where the audio amplitude is so high that the ear's physical and neural hardware is not only completely overwhelmed by the input, but experiences physical pain. Collecting this data is trickier. You don't want to permanently damage anyone's hearing in the process.

The upper limit of the human audio range is defined to be where the absolute threshold of hearing curve crosses the threshold of pain. To even faintly perceive the audio at that point (or beyond), it must simultaneously be unbearably loud.

At low frequencies, the cochlea works like a bass reflex cabinet. The helicotrema is an opening at the apex of the basilar membrane that acts as a port tuned to somewhere between 40Hz to 65Hz depending on the individual. Response rolls off steeply below this frequency.

Thus, 20Hz - 20kHz is a generous range. It thoroughly covers the audible spectrum, an assertion backed by nearly a century of experimental data.

[Image: approximate equal loudness curves]

Above: Approximate equal loudness curves derived from Fletcher and Munson (1933) plus modern sources for frequencies > 16kHz. The absolute threshold of hearing and threshold of pain curves are marked in red. Subsequent researchers refined these readings, culminating in the Phon scale and the ISO 226 standard equal loudness curves. Modern data indicates that the ear is significantly less sensitive to low frequencies than Fletcher and Munson's results.


This seems to imply that it's highly, highly improbable that anything above 20 kHz could be heard by the human ear, and that in most realistic conditions even that threshold would never be reached. I'm curious if others more knowledgeable can confirm or contradict this...

landroni
0

Yes, perhaps 10% of humans can hear up to 28 kHz, if it's loud enough:

  • Ashihara K. Hearing thresholds for pure tones above 16 kHz. J Acoust Soc Am. 2007 Sep;122(3):EL52. doi: 10.1121/1.2761883. PMID: 17927307.

Hearing thresholds for pure tones between 16 and 30 kHz were measured by an adaptive method. The maximum presentation level at the entrance of the outer ear was about 110 dB SPL. To prevent the listeners from detecting subharmonic distortions in the lower frequencies, pink noise was presented as a masker. Even at 28 kHz, threshold values were obtained from 3 out of 32 ears.

Absolute thresholds for pure tones have been studied by many groups of researchers. The absolute threshold usually starts to increase sharply when the signal frequency exceeds about 15 kHz. It reaches about 80 dB SPL at the frequency of 20 kHz. Above 20 kHz, however, only limited data have been reported. According to recent studies, ultrasounds seem to be inaudible as long as their level does not exceed about 85 dB SPL.

The present results show that some humans can perceive tones up to at least 28 kHz when their level exceeds about 100 dB SPL.

Many people who think they've heard higher than 20 kHz are actually hearing intermodulation products or aliasing in poor-quality audio systems, though. Typical speaker systems have nonlinearities that can cause spurious tones.

The linked paper confirmed that this wasn't due to subharmonics, though, so 28 kHz is a real world limit.

endolith