8

The upper limit of hearing is approximately 15 kHz, depending on age and other factors. According to the principles of digital signal processing, such an upper limit would mean that the auditory system samples at a rate of at least 30 kHz.

Now suppose an ultrasonic signal, say a 40 kHz acoustic tone: why would I hear nothing, rather than that signal aliased down by a 30 kHz sampling rate?
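
For readers unfamiliar with the DSP premise, here is a minimal NumPy sketch of what the question assumes (the 30 kHz "sampling rate" is purely hypothetical): a 40 kHz tone sampled at 30 kHz is indistinguishable, sample for sample, from a 10 kHz tone.

```python
import numpy as np

fs = 30_000                 # hypothetical "auditory sampling rate" (Hz)
f_in = 40_000               # ultrasonic tone (Hz)
f_alias = abs(f_in - fs)    # predicted alias: 10 kHz

t = np.arange(0, 0.01, 1 / fs)               # 10 ms of sample instants
x_ultra = np.sin(2 * np.pi * f_in * t)       # the 40 kHz tone, sampled at 30 kHz
x_alias = np.sin(2 * np.pi * f_alias * t)    # a genuine 10 kHz tone, same sampling

# The two sample sequences are identical up to floating-point rounding,
# which is exactly why a discrete sampler could not tell them apart.
print(np.allclose(x_ultra, x_alias, atol=1e-6))   # True
```

So if the ear really did sample discretely at 30 kHz, the DSP prediction would be a 10 kHz percept, which is what makes the absence of any percept interesting.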

AliceD
Bzrs
  • One thing I could not understand... if you can't hear above 15 kHz ("upper limit"), then your auditory system should sample below 15 kHz, shouldn't it? – Always Confused Dec 14 '16 at 06:13
  • 2
    In order to hear a sound of a given frequency, you have to sample at twice that frequency (really, above twice that frequency). This is the Nyquist rate. Not sure if you're interested enough, but this Youtube video on the concept is really good: https://www.youtube.com/watch?v=yWqrx08UeUs – Bzrs Dec 14 '16 at 06:32
  • When a digital sound is played back by a speaker, the speaker diaphragm travels through every point in space between two digital data values. Therefore, all sounds transmitted in air are generated and heard non-digitally; they are continuous signals limited only by atomic scale, far finer than any digital storage. Frequency is a mathematical notion, but the cellular structures of the human ear are not mathematical, so you can't characterise an ear in a cleanly mathematical way. – bandybabboon Dec 15 '16 at 12:17
  • 1
    The answer to your question is in here. I scanned through it, but I don't have time right now to write up a formal answer (someone else is welcome to if they would like). Short answer is that, according to this, the exact mechanics of sound perception are still not pinned down, but it likely has something to do with a lack of space within the cochlea, since vibrations at different areas correspond to the perception of different frequencies. – kingfishersfire Dec 15 '16 at 18:16
  • 2
    I don't have time to type up a whole answer right now, but the premise of your question has a major flaw: the Nyquist rate that you are referencing by talking about necessary sampling rates applies to discrete sampling of a waveform; this isn't how auditory information is represented in the cochlea or auditory nerve or anywhere in the brain that we know of, so it does not apply. You could look into "phase locking" which occurs at lower frequencies, but higher frequencies can just as easily be represented by their envelope, or phase information can be distributed across a population. – Bryan Krause Dec 15 '16 at 19:39
  • @BryanKrause I would like to see your entire answer later. It is a conceptual simplification but there are discrete aspects to how sound is received and encoded, at least potentially: since frequency is tonotopically represented, and the spiral ganglion has fibers that innervate different sections of it, they could potentially be 'sampling' that section of the basilar membrane even though the entire waveform is complex and excites many places on the cochlea. – Bzrs Dec 15 '16 at 20:01
  • @BryanKrause The auditory nerve conserves that tonotopy: http://images.slideplayer.com/18/5700795/slides/slide_38.jpg Relatedly, the actual auditory neurons fire discretely, e.g. at a certain rate, not continuously, and presumably some aspect of auditory perception is limited by the max rate (~1 kHz) although this could be overcome by groups of neurons (volley principle). – Bzrs Dec 15 '16 at 20:05
  • @kingfishersfire I have found that page before and love it, thank you for reminding me of its existence. It's getting bookmarked. – Bzrs Dec 15 '16 at 20:06
  • @Bzrs I am quite familiar with tonotopic organization of the cochlea and auditory nerve, my point is that this has NOTHING to do with the sampling rate. You can have a part of the cochlea respond to 50kHz tones (humans don't, but mice do, and vocalize in that range), but no need to fire at 50kHz. Also although action potentials are (roughly) point processes, that is not the same as discrete sampling... – Bryan Krause Dec 15 '16 at 21:01
  • ...Discrete sampling of a point process would imply you have a series of 0s and 1s, but an action potential, even though it is a point process, can occur at any arbitrary time point: even if your maximum firing rate is 1000 Hz, you could have a spike at t=0s, or t=.001s, or t=.0001s, or t=.00001s. This is actually a necessary concept behind the "volley principle" you mention. At the highest frequencies, there is no evidence for volley principle, instead cells sensitive to basilar membrane vibrations in the high freq area spike w/ respect to the amplitude of those vibrations, not phase. – Bryan Krause Dec 15 '16 at 21:04
  • @BryanKrause I get what you are saying, just a few points: (1) detection of up to 90 kHz has been reported in humans through bone conduction, see https://www.ncbi.nlm.nih.gov/pubmed/2063208 (2) amplitude would seem to be a poor way to code high frequencies given their greater attenuation.... – Bzrs Dec 15 '16 at 21:28
  • For echolocating animals, which hear well into the hundreds of kHz, it seems particularly problematic. I doubt you could get large displacements of the BM in response to high frequencies without help (active amplification), but immunogold labeling has shown similar amounts of prestin in the base and apex, which seems inconsistent with amplification being more important at the base. https://www.researchgate.net/publication/44651053_The_ultrastructural_distribution_of_prestin_in_outer_hair_cells_A_post-embedding_immunogold_investigation_of_low-frequency_and_high-frequency_regions_of_the_rat_cochlea – Bzrs Dec 15 '16 at 21:29
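
To make the distinction drawn in the comments above concrete (a point process versus a fixed sampling grid), here is a toy sketch with illustrative numbers only, not a model of the auditory nerve: a 1 kHz sampler only "exists" at 1 ms grid points, whereas a spike train with a 1 kHz maximum rate can still place each event at an arbitrary real-valued time.

```python
import numpy as np

rng = np.random.default_rng(0)

# A discrete sampler at 1 kHz: observations can only occur at multiples of 1 ms.
fs = 1_000
grid_times = np.arange(0, 0.01, 1 / fs)

# A point process with a 1 kHz *maximum* rate (1 ms refractory period):
# intervals are at least 1 ms long, but each spike time is an arbitrary real number.
spike_times, t = [], 0.0
while t < 0.01:
    t += 0.001 + rng.exponential(0.002)
    spike_times.append(t)

print(grid_times[:3])                  # [0.    0.001 0.002]  (quantized instants)
print(np.round(spike_times[:3], 5))    # e.g. [0.00137 0.00432 0.00634]  (continuous times)
```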

3 Answers

4

Short answer
The cochlea is a tonotopic map whose physically determined boundaries set the range of frequencies perceived. Ultrasonic sound waves simply do not have a correlate on this map.

Background
The cochlea is a frequency analyzer that basically translates acoustic frequencies into a place map. High frequencies are encoded basally (up to 20 kHz), low frequencies apically (down to 20 Hz or so). Hence, it acts pretty much as a Fourier-analyzing system (Fig. 1). This way of analyzing sounds is referred to as the place-coding theory of pitch. The place where a frequency is encoded depends mainly on the physical characteristics of the basilar membrane in the cochlea. Every part is sensitive to a slightly different frequency than the next. This is caused by gradual variations in the stiffness and width of the basilar membrane, among other, less important, factors such as hair cell length. These physical characteristics determine the resonant frequency of each part of the basilar membrane. Hence, incoming sounds are decomposed mechanically: each frequency produces a vibration maximum (a travelling-wave peak) at a particular spot in the cochlea.

Fig. 1. Tonotopic map of the inner ear. Source: Ternopil State Medical University

The frequencies mentioned are those of the acoustic pressure waves entering the outer and middle ear. The cochlea translates these into fluid pressure differences. Hair cells in the cochlea pick up these fluid pressure differences and translate them into changes in membrane potential.

The sampling rate of hair cells is pretty much infinite, as they work on a continuous membrane voltage, i.e., they are analogue.

The secondary neurons, the spiral ganglion cells, translate these voltage changes into neural spikes and carry them through the auditory nerve to the brain.

Neural spiking follows the acoustic frequencies up to roughly 1 kHz (the frequency-following response). This phenomenon is referred to as the temporal code of pitch hearing. Above that, refractory characteristics cause single fibers to fire on only one in every few wavelength periods. So at the upper limit of hearing, say 20 kHz, a ganglion cell may fire only once every 20 wavelengths or so. That is no problem, as many other fibers do the same thing: stochastics cause the wavelength to be nicely encoded across a population of responsive fibers. Furthermore, the auditory cortex contains a tonotopic map, meaning that high frequencies are processed elsewhere than lower frequencies. In other words, the auditory nerve doesn't need to encode the incoming wave faithfully.
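
As an illustration of that population idea, here is a toy simulation (illustrative numbers only, not a physiological model): each fiber is phase-locked to a 20 kHz tone but fires on only about 1 in 20 cycles, yet the pooled activity of 200 such fibers still carries the 20 kHz periodicity.

```python
import numpy as np

rng = np.random.default_rng(1)

f_tone = 20_000          # tone at the (nominal) upper limit of hearing (Hz)
period = 1.0 / f_tone
n_fibers = 200
duration = 0.05          # 50 ms of stimulation

spikes = []
for _ in range(n_fibers):
    # Each fiber fires on only ~1 in 20 cycles (about 1 kHz, a refractory-limited rate),
    # but the spikes it does fire are phase-locked to the tone (with a little jitter).
    cycles = np.arange(0.0, duration, period)
    fires = rng.random(cycles.size) < 1 / 20
    jitter = rng.normal(0.0, period / 20, cycles.size)
    spikes.append(cycles[fires] + jitter[fires])

pooled = np.concatenate(spikes)

# Fine-binned population histogram and its spectrum:
bin_width = 1e-6                                    # 1 microsecond bins
counts, _ = np.histogram(pooled, bins=np.arange(0.0, duration, bin_width))
spectrum = np.abs(np.fft.rfft(counts - counts.mean()))
freqs = np.fft.rfftfreq(counts.size, d=bin_width)

print(freqs[np.argmax(spectrum)])   # ~20000 Hz: the pooled activity preserves the periodicity
```

Whether real high-frequency fibers actually phase-lock like this is disputed (see the comments below); the sketch only shows that a rate limit per fiber does not by itself destroy the periodicity carried by the population.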

A nice example of this is the cochlear implant: it stimulates the auditory nerve with electrical currents, and the place of the electrodes determines the pitch, not their pulse rate (although pulse rate can have some effect).

Now why do you not hear ultrasound? Simply because the basilar membrane does not contain regions sensitive to frequencies above 20 kHz or so. This frequency-to-place relation is described by the Greenwood map, which is species-dependent.
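
To put a number on "no place on the map", here is a small sketch using Greenwood's frequency-position function with its commonly quoted human parameters (a simplified fit, not an exact description of any individual cochlea):

```python
import numpy as np

# Greenwood's frequency-position function: F(x) = A * (10**(a*x) - K),
# where x is the fractional distance from the cochlear apex (0 = apex, 1 = base).
# Commonly quoted human parameters (Greenwood, 1990):
A, a, K = 165.4, 2.1, 0.88

def greenwood_freq(x):
    """Characteristic frequency (Hz) at fractional place x along the basilar membrane."""
    return A * (10 ** (a * x) - K)

def greenwood_place(f):
    """Inverse map: the fractional place whose characteristic frequency is f (Hz)."""
    return np.log10(f / A + K) / a

print(round(greenwood_freq(1.0)))          # ~20677 Hz: highest frequency that has a place
print(round(greenwood_place(40_000), 3))   # ~1.136, i.e. beyond the basal end of the membrane
```

A 40 kHz tone would need a place beyond x = 1, past the basal end of the human basilar membrane, so there is simply no region tuned to it.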

AliceD
  • Thanks for the answer. Just a few questions: re: "The sampling rate of hair cells is pretty much infinite, as they work on a continuous membrane voltage, i.e., they are analogue" don't they have to depolarize, though, which is a discrete event? Is your statement that there is no real limit at all related to this study which showed that OHCs are not limited by their membrane time constant? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3143834/ – Bzrs Dec 15 '16 at 22:44
  • 2
    @Bzrs The point is that hair cell voltage is continuous - those cells don't fire spikes. So, no, depolarization is not a discrete event. The most likely explanation is that mechanical deflection of the hair cells causes a channel to open; that open probability is a function of the magnitude of the deflection. The depolarization is analog, meaning that it is continuous. You could depolarize by 1 mV, or 2 mV, or 0.5 mV, or 0.1 mV, or anything in between. Here we are talking about the inner hair cells, the primary sensory cells, not the OHCs. – Bryan Krause Dec 15 '16 at 22:59
  • @BryanKrause I know IHCs actually transduce the sound, just was thinking there would have to be a similar mechanism behind their analog quality and it appears that there is (it being the probability of open channels). It's a bit confusing to me since one could see discreteness in some aspects of this, e.g. neurotransmitter release from the IHC is also continuous/analog due to channel state being so, but the release itself is quantal (released in packets of constant quantity) according to "Neurotransmitters and Synaptic Transmission" by Sewell in "The Cochlea" (Springer Auditory Handbook). – Bzrs Dec 15 '16 at 23:08
  • 1
    @Bzrs Yes, there are some aspects that are still somewhat discrete, like neurotransmitter release, but even that is continuous on average: see law of large numbers. Essentially, the membrane potential is setting a probability of release, and release events are frequent enough that the output is roughly continuous, especially because there will be some postsynaptic temporal filtering that smooths the individual release events. – Bryan Krause Dec 15 '16 at 23:22
  • @Christiaan if freqs above 20 kHz have no representation in the human cochlea, what is your interpretation of the human ultrasound via bone conduction studies? This study claims to rule out lower freq stimulation (https://www.ncbi.nlm.nih.gov/pubmed/23384569) and another by the same group shows inhibition of tinnitus after 30 kHz BCU, which they say is evidence that their BCU stimulus activated the cochlea base (https://www.ncbi.nlm.nih.gov/pubmed/24530434). Your answer http://biology.stackexchange.com/a/27901/28436 invokes ossicles and not cochlear representation as the limiter on high freqs. – Bzrs Dec 15 '16 at 23:33
  • Also, some interesting findings: BM motion of up to 100 kHz detected in guinea pig, despite it being well over the upper freq limit for them: https://www.ncbi.nlm.nih.gov/pubmed/16603325

    Responses of up to 100 kHz recorded in single OHCs: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC16347/

    Presumably the analog/continuous voltage changes induce conformational changes in prestin particles such that the OHCs can change length at this rate. Perhaps, as with volley principle, there are enough prestin particles that only a subset need to be shrinking/enlarging to cause movement each cycle?

    – Bzrs Dec 16 '16 at 04:47
  • Why should the OHCs amplify at such high rates (essentially following a 100 kHz signal) if detection at this high frequency will only matter on the place code? At this point we are obviously butting against the limit of knowledge on the topic, but...it's fascinating stuff. – Bzrs Dec 16 '16 at 04:49
  • Sorry, final comment: the OHCs amplifying 100 kHz signals is not inconsistent with the place code. Rather, it serves to boost the strength of the response which is likely needed for high frequencies. So its purpose is not encoding frequency so much as it is amplifying the signal; and failing to keep up with the high frequency would cause distortion through subtraction/addition (similar to what is posited for prestin's mechanism of conferring frequency selectivity). – Bzrs Dec 16 '16 at 05:57
  • @Bzrs Per this answer: http://biology.stackexchange.com/a/27901 , "the transfer function of the ossicle chain in the middle ear is a suspected culprit in setting the upper frequency limit to 20 kHz (Hemila et al., 1995)." The answer has a link to that paper. So... we have a 20kHz LP filter in the acoustic path to the hair cells; even if the latter were doing "sampling", we still wouldn't get aliasing. otoh, bone conduction bypasses the middle ear, explaining how we can "hear" ultrasonics by that path. – Jamie Hanrahan Jan 06 '17 at 21:19
  • @JamieHanrahan The ossicles do not set the upper frequency limit. That's an outdated idea. See http://www.pnas.org/content/99/20/13206.full.pdf – Bzrs Jan 20 '17 at 15:30
2

I think you are misapplying the concept of aliasing.

Digital acoustics is described in a mathematical sense; aliasing is a mathematical concept. Real-life acoustics is described in a physical sense, which deals with reflection, absorption, phase change, harmonic modes, weights... The perception of sound is discussed in terms of psychoacoustics, cortical structures and individual nerve-impulse detection thresholds. Consider, for example, the explanation of drum acoustics: it is not digital, and for no physical object will you see the word "aliasing" used: https://en.wikipedia.org/wiki/Vibrations_of_a_circular_membrane

Aliasing is a digital concept: we divide screens into pixels, and you can't make out objects narrower than a pixel. Representing a wave takes at least two data points per cycle, so the sampling rate must be at least twice the highest frequency; that is why 44.1 kHz CDs are used to encode sounds up to about 22 kHz.

I'll just tackle this precise question, aside from the misuse of the term aliasing: why would I hear nothing instead of the signal aliased at a 30 kHz sampling rate?

Pressure waves are continuous, physical sounds. A continuous sound or physical object cannot be subject to the digital distortion effect called aliasing, which refers, for example, to the spurious high frequencies generated in between two sampled points of a clock rate.

Because physical sound is continuous, it cannot suffer frequency distortion tied to a 15/30 kHz sampling rate; it can attenuate, physically interact with objects (including other sound pressure waves), and cause physical objects to resonate in different modes of movement.

Sound detection depends on the physical excitation of hairs and nerves, which must exceed a detection threshold. Physical objects do not develop radical, odd excitation modes when they absorb a frequency that is too high; they can resonate in different modes, but they do not flail about wildly or produce volume clipping and sound artefacts. Most of the time they have no frequency limit beyond which they go crazy. The closest you can get to strange frequency behaviour in physical objects is resonance, where the movement builds up into large kinetic motion, as with the Tacoma Narrows Bridge. You have to approach the ear as a physical model and not a digital one. I think of the resonant modes of structures in the ear as similar to a guitar string or a gong moving in 3D space. This gives you an idea of the nerve signals in the ear: https://www.youtube.com/watch?v=1JE8WduJKV4&t=17s

Almost all sound detected inside an ear has been distorted by reflection from its initial shape and source, and is thereby smudged into reverberation, similar to light travelling through frosted windows.

Human brains and tissues are not digital and quantized; they aren't even simply analog. They are cellular, with different types and sizes of receptor cells and nerves, variable and organic. You could only speak of aliasing for a perfectly regular matrix of equally sized cells in a 2D/3D pattern, like photoreceptors, except that our minds disregard information at cellular scales that isn't useful to us, as a biological version of aliasing would be.

If you study the function of the cochlea, you will find that its structures, hairs and membranes are very different from anything the digital aliasing concept describes.

Human ears do not collect high-frequency sounds the way the dished ears of bats, cats and dogs do; those ears are made of stiff cartilage that reflects higher frequencies well into the ear canal. High frequencies are absorbed very quickly by the skin, and it takes specialized organs to reflect them into stiff cartilage chambers lined with hairs, each part of which is adapted to further reflect and absorb different frequencies.

The cochlea is organic and cellular; it is like multiple electret microphone diaphragms and cilia all existing inside a complex organ that sends the vibrations to nerves. Sounds have to be collected by dishing and focused onto light, rigid membranes.

There is a great deal of distortion of all frequencies by the time they reach the ear. Sounds reflect off different surfaces, although high-frequency ones are absorbed more easily and therefore travel more directly from source to sensor, giving them more timing precision and more binaural precision.

Sounds don't tend to be generated at a single exact spot (point sources), so if you have, for example, an insect generating high frequencies, it will make a complex wave shape that excites a large envelope of air around it, like dropping 5-10 stones into water, and the outgoing wave will not be a simple form but a complex series of phase interactions, similar to water excited by a swimmer. In that sense it has some properties of a moiré pattern, but it isn't aliasing; it's complex wave and phase interaction.

An aliased oscillator, on the other hand, is a digital sound containing spurious frequencies: the ideal waveform has abrupt amplitude changes and therefore arbitrarily high harmonics, and a naive digital rendering folds those back into the audible band. This is different from nature, where sounds are continuous and not discrete sets of values.
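
To make that concrete, here is a hedged NumPy sketch (illustrative numbers): a naively generated digital sawtooth has harmonics above half the sampling rate, and those fold back as inharmonic partials that the intended waveform never contained.

```python
import numpy as np

fs = 44_100                  # CD-style sampling rate
f0 = 3_000.0                 # sawtooth fundamental (Hz)
t = np.arange(0, 1.0, 1 / fs)

# Naive digital sawtooth: the ideal waveform has harmonics at every multiple of f0,
# so every harmonic above fs/2 (22.05 kHz) folds back into the spectrum as an alias.
naive_saw = 2 * (t * f0 % 1.0) - 1.0

spectrum = np.abs(np.fft.rfft(naive_saw))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# Find the strongest component that is NOT near a true harmonic of f0: an aliased partial.
near_harmonic = np.isclose(freqs % f0, 0, atol=20) | np.isclose(freqs % f0, f0, atol=20)
alias_peak = freqs[~near_harmonic][np.argmax(spectrum[~near_harmonic])]
print(alias_peak)   # ~20100 Hz: the 8th harmonic (24 kHz) folded down to 44.1 - 24 = 20.1 kHz
```

Nothing like this folding happens to a continuous pressure wave in air; it only appears once a waveform has been reduced to discrete samples.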

As sound travels through air and through flesh, the high frequencies, which are simply the high-frequency sine-wave components of the overall sound, will be attenuated according to complex attenuation spectra that depend on the ambient air conditions and the angle of incidence at the reflective and transmitting structures of the ear.

bandybabboon
  • 1
    Nice answer, but can you add a couple of references or a good textbook to provide support/further information? – fileunderwater Dec 15 '16 at 13:45
  • 1
    I don't know that I understand what you're saying some of the time, but from your analogies I would guess you come from a visual background.

    Human ears also have cartilaginous pinnae that help amplify sound. The cochlea is not lined with hairs (did you mean hair cells?). And with regard to high frequency sounds, the place-code of the cochlea does come with particular frequencies being represented in a particular place.

    I know the ear is not digital and that signals do not come in without a lot of modification. It's a simplification, but my question was conceptual.

    – Bzrs Dec 15 '16 at 19:08
  • You can actually bypass the entire impedance from air to fluid/ossicle transfer function aspect of it if you transmit a signal through bone conduction, and indeed studies show that people (even the deaf) can hear higher frequencies this way. In which case the limitation may not be pre-cochlear.

    https://www.ncbi.nlm.nih.gov/pubmed/2063208 https://www.ncbi.nlm.nih.gov/pubmed/11234768

    – Bzrs Dec 15 '16 at 19:17
  • 1
    This answer makes some good points but a lot of the details are misleading or wrong, especially the comparisons between human and other animal hearing. I think it would be greatly improved if @comprehensible was able to source the real stuff and remove anything they can't find sources for. – Bryan Krause Dec 15 '16 at 19:42
  • Bzrs, that study talks about brain implants of some kind, ultrasound hearing-aid cortical implants. Cats and bats can hear above 60 kHz; they have forward-pointing ears with round dish structures that point towards the ear canal. Some primates have dished ears, but almost all of the human ear's convex shape points away from the inner ear. The major mistake in the text is that I have not considered how the human auditory cilia can react to signals of 15 kHz in a way that can be compared to a digital encoding at 30 kHz, which is necessary to detect a peak and trough at 15 kHz. – bandybabboon Dec 15 '16 at 20:37
  • Most of my understanding of sound comes from many years of reverse-engineering digital audio synthesizers, and from knowledge of physical modelling in audio, reverb, and direct versus indirect audio: vibration modes of gongs and drums, and the function of speakers and microphones. Most of the information is so obvious to me - the human nervous system doesn't have a digital clock rate at 44 kHz, or a binary function - that you have to tell me what I have stated too vaguely so that I can reference it. – bandybabboon Dec 15 '16 at 20:42
  • @comprehensible "Human ears collect high frequency sounds unlike the dished ears of bats and cats and dogs" - All of those non-human animals hear at higher frequencies than humans. This has to do with the physical properties of the cochlea, not the pinnae. "The sounds reflect of different surfaces, although high frequency ones absorb more easily and therefore are heard more linear from the source to the sensor, and have more time precision and more binaural precision." - The logic here is incorrect, and I'm not sure what point you are even making. – Bryan Krause Dec 15 '16 at 21:22
  • @comprehensible "High frequency sounds don't tend to be generated in the exact same spot" - By "exact same spot" I think you mean "precise frequency" - however the OP isn't talking about natural sounds necessarily, and it is certainly possible to artificially present tones fixed to any frequency. And your last paragraph is mostly jibberish, there is no difference in the "pureness" of sine waves at different frequencies, and probably the head itself is the main attenuation source, allowing for sound localization based on interaural level differences. – Bryan Krause Dec 15 '16 at 21:26
  • High frequencies are absorbed more readily by air, trees, stone, the ground, water, etc.; they don't travel as far. That's simple. What that means is that the higher pitched the sound, the less likely you are to have heard it as an Nth reflection... Aliasing is a kind of distortion, and indirect hearing is a temporal and spatial distortion, so I thought it was relevant to try to explain that... Animals have reflective objects surrounding their ears, which transmit an imprecise reflection of the sounds, and it further illustrates that we have temporal and phase distortions, not aliasing. – bandybabboon Dec 15 '16 at 21:39
  • You presume that I mean frequency when I say "generated in the same spot"? I mean a point source of sound, which doesn't exist in nature. You presume gibberishly yourself. Your judgement of gibberish seems to arise from a mind that doesn't even understand the concept of a "point source" of sound; you are confused. You need to consider that SOUND SOURCES aren't really idealised point sources. Depending on the application, you can shape speaker(s) so you have cylindrical waves, or dipole waves, or sections of a monopole wave..., or a combination of several patterns. – bandybabboon Dec 15 '16 at 21:53
  • I'm sorry, you're right, it was dismissive to say gibberish, I apologize. What I meant to say was that I don't think your points actually go toward answering the OP, because the OP was talking about tones, and in that context it doesn't matter what naturalistic sources of sounds are: I totally agree with you that pure tones are not directly relevant to natural hearing. There are still misunderstandings by the OP that are more relevant to answering the actual question. – Bryan Krause Dec 15 '16 at 22:15
  • Sorry for not writing it more sensibly. Having programmed anti-aliased oscillators and studied physical acoustics, I was confused by the question, and it left me in a state of incomprehension when I tried to interpret the wording. I have rewritten paragraphs 3/4 to take up and answer his question properly. – bandybabboon Dec 15 '16 at 22:42
0

I actually think I realized the answer this morning. The biological equivalent of 'sampling' would have to occur at the neural level, i.e. after successful transduction. Since both the properties of the basilar membrane (e.g. stiffness/thickness) and the transfer function of the ossicles would need to pass a high-frequency sound into the cochlea before any such sampling, the excessively high signal may never make it there in the first place.
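
In DSP terms this is the role of an anti-aliasing filter: if the path in front of the "sampler" removes everything above the limit, there is nothing left to fold down. Here is a hedged NumPy/SciPy sketch of that idea; the 8th-order 20 kHz low-pass filter and the 30 kHz "neural rate" are illustrative stand-ins, not measured auditory parameters.

```python
import numpy as np
from scipy import signal

fs_analog = 1_000_000                        # fine grid standing in for the continuous pressure wave
t = np.arange(0, 0.01, 1 / fs_analog)
ultrasound = np.sin(2 * np.pi * 40_000 * t)  # 40 kHz tone

# Stand-in for the mechanical roll-off of the middle ear / basilar membrane:
# a steep low-pass filter with a ~20 kHz corner (a toy filter, not a measured transfer function).
sos = signal.butter(8, 20_000, btype="low", fs=fs_analog, output="sos")
filtered = signal.sosfilt(sos, ultrasound)

# "Sample" both versions at a hypothetical ~30 kHz neural rate:
step = fs_analog // 30_000
rms = lambda x: np.sqrt(np.mean(x ** 2))
print(rms(ultrasound[::step]))   # ~0.71: a full-strength alias near 10 kHz would be "heard"
print(rms(filtered[::step]))     # a few thousandths: essentially nothing left to alias
```

The same logic is raised in Jamie Hanrahan's comment above: a pre-cochlear (or cochlear-mechanical) low-pass stage makes the aliasing question moot, regardless of how the later neural encoding works.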

Bzrs
  • I would agree with what you've said about the properties of the ear, but I'm not sure that the concept of sampling is a good fit on a neuronal level either. Though I would be very interested if you had some info about how it would relate. – kingfishersfire Dec 15 '16 at 19:57
  • I just found this and will be digging into it as best I can with my limited neuro background: http://audiology.pagesperso-orange.fr/en/cochlear-sampling-theory.pdf – Bzrs Dec 15 '16 at 20:09
  • Also, not sure if you saw my answer above, but considering the nature of spiral ganglion fibers innervating different places along the cochlea, and the tonotopic representation of frequency, there's at least potential for these to act as 'channels' which are sampled/filtered/processed in different ways. This was obviously the thought behind some cochlear implants but I admittedly don't know how much it is an accurate reflection of the biological ear. – Bzrs Dec 15 '16 at 20:10
  • This looks interesting. Thanks for sharing. It looks like our supposition about the limits of the cochlea may have been premature. – kingfishersfire Dec 15 '16 at 20:11
  • From what I can gather from skimming the paper is that there isn't a practical limit to high frequency detection. The multiplexing that is proposed (p. 53) that got us to 15-20 kHz could get us much higher. Though it seems like the paucity of cochlear space devoted to higher frequencies means the cochlea has a functional high pass filter. Fascinating paper though, I'm going to dive into it later when I have more time! – kingfishersfire Dec 15 '16 at 20:31
  • Yeah, that 'multiplexing' is essentially the volley principle as explained in the link you provided above. It does not satisfactorily explain the hearing of other mammals, however, such as echolocating bats and whales. To hear 200 kHz you'd need 200 IHCs responding (to be strict) or else have enough of them firing such that the brain could still reconstruct the signal from the incoming periods (e.g. for a 10 Hz signal you wouldn't need firing on all 10 cycles but if you got 1, 2, 5, 6, 9, the brain might still deduce the frequency). Harder with 1, 3, 5, 7, 9, though (would look like 5 Hz). – Bzrs Dec 15 '16 at 20:45
  • 1
    Multiplexing and volley principle is not necessary for detecting or perceiving sounds; phase information is not needed to perceive a sound at a particular frequency, only amplitude. There is no need for the brain to be able to reconstruct the exact waveform of an incoming sound to be able to perceive it, a spectrogram is almost certainly sufficient. Phase information is important for localization of low frequency sounds, however, so it is still important, just not necessary at all frequencies. – Bryan Krause Dec 15 '16 at 21:29