2

liveatc.net is available, but I am given to understand that its voice quality does not represent what someone actually sitting in an ATC tower hears. So I am looking for a source that offers more representative recordings.

EDIT: I intend to use these recordings to evaluate which speech-to-text engine (non-comprehensive list) works best in the context of aviation, so recordings that best mirror what happens in reality are important to reduce pre-processing.
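
For comparing engines on the same recordings, one common metric is word error rate (WER): the word-level edit distance between a human reference transcript and the engine output, divided by the number of reference words. The sketch below is only an illustration of that idea. The function name `word_error_rate` and the sample phrases are made up for this example, and it assumes plain lowercase, whitespace-separated transcripts with no special handling of callsigns or spoken digits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and engine output for one transmission.
reference = "cleared to land runway two seven"
engine_output = "cleared to land runway two eleven"
print(f"WER: {word_error_rate(reference, engine_output):.3f}")
```

Averaging this over many transmissions per engine would give a single number to compare, though real ATC phraseology would probably need some normalization (for example "niner" versus "nine") before scoring.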

Priyank
  • 9
    Go for a plane ride and listen to the sound quality... liveatc.net is pretty spot on! :D – Ryan Mortensen Jun 01 '16 at 06:12
  • 2
    If you would explain what you need these recordings for, it would be easier to come up with a useful answer. – 60levelchange Jun 01 '16 at 06:24
  • 2
    Are you interested in Tower communication first (approach, landing, T/O, departure)? Tower exchanges will usually be clearer than other communications, as the aircraft are closer. On the other hand, archiving audio streams takes storage resources, so I doubt the sampling of the archive will be better than what is transmitted once using multicast. – mins Jun 01 '16 at 06:48
  • I hope the edit clarifies things. I'm hoping there's something out there on the basis of the replies to this – Priyank Jun 01 '16 at 07:00
  • 2
    If you want to teach a speech-to-text engine you should use bad quality recordings instead of good ones. More often than not the quality of the radio transmissions is not very good. The human brain can "fill in the gaps" but a software program does not have the intelligence to do that. Using bad recordings will at least teach it to filter out the static a little bit more. – Ron Beyer Jun 01 '16 at 12:07
  • @RonBeyer: Just for sharing... Face Recognition Algorithms Surpass Humans. I believe this is the same for speech recognition, but I've no study at hand. Algorithms are actually able to cope with incomplete information if they are trained. The so-called "big data" raw material will be mostly live video, audio, and other unstructured data found in live data streams that won't be stored at all, only used. – mins Jun 01 '16 at 19:38
  • @mins Don't want to turn this into a discussion but I work with SR quite a bit, and SR is significantly more complicated than facial recognition. The biggest issues are things like inflection, pronunciation, accents, etc. Add in the different frequencies and volumes of voices along with noise and the issue is compounded. Facial recognition is easy, a face is a face, but the same word can sound very different depending on who speaks it. There is also no context in FR, where context plays a big part in SR to fill in the gaps. – Ron Beyer Jun 01 '16 at 20:24

1 Answer

4

LiveATC is probably your best shot. The quality you'll get on a stream depends on where the receiver is set up. Since aircraft communications are in VHF or UHF (military), reception is basically limited to line-of-sight range. This is why you'll often hear the aircraft side of the conversation without hearing ATC.
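
To put rough numbers on the line-of-sight point, here is a small sketch using the common 4/3-earth radio horizon approximation, d ≈ 4.12·√h (d in km, h in metres). The function names and the antenna heights are assumptions picked purely for illustration; terrain, buildings and transmitter power are ignored.

```python
import math


def radio_horizon_km(antenna_height_m: float) -> float:
    """Approximate radio horizon (4/3-earth model) for one antenna."""
    return 4.12 * math.sqrt(antenna_height_m)


def max_line_of_sight_km(height_a_m: float, height_b_m: float) -> float:
    """Approximate maximum line-of-sight range between two antennas."""
    return radio_horizon_km(height_a_m) + radio_horizon_km(height_b_m)


# Illustrative heights only: a hobbyist receiver on a rooftop,
# a tower antenna, and an airliner at cruise altitude.
receiver_m = 10.0
tower_m = 30.0
aircraft_m = 10_000.0

print(f"receiver to aircraft: {max_line_of_sight_km(receiver_m, aircraft_m):.0f} km")
print(f"receiver to tower:    {max_line_of_sight_km(receiver_m, tower_m):.0f} km")
```

Under these assumptions the airborne transmitter is within line of sight from several hundred kilometres away, while the ground-based ATC antenna is only reachable from a few tens of kilometres, which is consistent with a feed picking up pilots but not controllers.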

I would just browse the various streams on LiveATC until you find a good one. Or, if you're near an airport, set up your own receiver!

Andrew