
I have written a program to perform FastICA on a stereo WAV file, using the code from the Python MDP FastICA example.

With the audio examples I get very good results.

Then I tried a real-world recording using two mono computer microphones connected to the stereo mic input of my PC: mic 1 to the left channel and mic 2 to the right channel. I tested by playing some music in the background while talking in a quiet room.

However, running FastICA does not separate the signals at all. Is it possible that the quality of the microphones is too poor? Do I need to do anything to the recorded WAV file (16-bit signed PCM, 44100 Hz) before running FastICA?

You can download the recording here.
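
For reference, the core of what I am doing is essentially this (the file name is a placeholder; this follows the MDP example rather than being my exact code):

    import numpy as np
    from scipy.io import wavfile
    import mdp

    rate, data = wavfile.read('recording.wav')  # 16-bit signed PCM, 44100 Hz, stereo
    x = data.astype('float64')                  # MDP expects floating-point observations
    x -= x.mean(axis=0)                         # center each channel

    ica = mdp.nodes.FastICANode()
    ica.train(x)                                # x has shape (n_samples, n_channels)
    sources = ica.execute(x)                    # estimated sources, same shape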

Jeremy

3 Answers


ICA in its raw form is only suitable for phase-synchronised observation mixtures. Using microphones as you have described will introduce a phase delay, as pointed out by other posters. However, this phase delay can be used to great advantage. The best-known algorithm that deals with stereo separation in the presence of delays is DUET. The links are broken, but the references you are looking for are here: http://eleceng.ucd.ie/~srickard/bss.html

This is the paper you should look for:
A. Jourjine, S. Rickard, and O. Yilmaz, "Blind Separation of Disjoint Orthogonal Signals: Demixing N Sources from 2 Mixtures," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Vol. 5, pp. 2985–2988, Istanbul, Turkey, June 2000.
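
To give a flavour of the approach, here is a rough sketch of the DUET idea: estimate an amplitude ratio and a relative delay for each STFT bin, cluster the bins in that two-dimensional space, and resynthesise each cluster with a binary mask. The k-means step below stands in for the weighted-histogram peak picking used in the paper, and the window length is an arbitrary choice:

    import numpy as np
    from scipy.signal import stft, istft
    from scipy.cluster.vq import kmeans2

    def duet_separate(x1, x2, fs, n_sources=2, nperseg=1024):
        # STFTs of both channels
        f, t, X1 = stft(x1, fs=fs, nperseg=nperseg)
        _, _, X2 = stft(x2, fs=fs, nperseg=nperseg)
        eps = 1e-10
        R = (X2 + eps) / (X1 + eps)            # inter-channel ratio per TF bin
        a = np.abs(R)
        alpha = a - 1.0 / a                    # symmetric attenuation
        omega = 2 * np.pi * f[:, None] + eps   # avoid division by zero at DC
        delta = -np.angle(R) / omega           # relative delay per TF bin
        # crude clustering in (attenuation, delay) space
        feats = np.column_stack([alpha.ravel(), delta.ravel()])
        _, labels = kmeans2(feats, n_sources, minit='++')
        labels = labels.reshape(alpha.shape)
        sources = []
        for k in range(n_sources):
            # binary mask: keep only the bins assigned to source k
            _, xk = istft(X1 * (labels == k), fs=fs, nperseg=nperseg)
            sources.append(xk)
        return sources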

Dan Barry
  • A question: if the mixes are coming from more than one spatially separate mic, then how can we ever have phase synchronicity? In all the texts I see this example of multiple mics used ubiquitously, but if phase synchronicity is such an issue, why is it not mentioned? I'm just trying to understand here; I am new to the ICA scene. – Spacey Dec 06 '11 at 16:09
  • @Mohammad: I suspect the phase synchronicity is present in other applications, and they just use audio applications to make it more familiar to readers. – endolith Dec 06 '11 at 20:10
  • @Mohammad it is possible with spaced microphones to have phase synchronicity for one source. Imagine a source being captured with 2 microphones placed equidistant on either side of the source. The path length from source to microphone is the same in each case, and the signals will be received in phase at both mics, but only for that source. You can add more mics equidistantly along different spatial dimensions to further reject unwanted signals. Some EEG analysis techniques avail of this. You should also note that the phase delay between each mic will be a function of frequency (due to wavelength). – Dan Barry Dec 07 '11 at 11:54
  • @DanBarry Thanks Dan, interesting point about the EEG. Let me just clarify: I of course know that if sensors are equidistant from a source we get the same delay... :-) What I was trying to get at was applications where such things cannot be controlled (speakers in a room with a number of sensors). ICA is said to work in such cases, but 99% of the time we won't have phase synchronicity. If it is touted as a working algorithm in this case, yet is sensitive to those phase issues, then what is missing here?... Thanks! – Spacey Dec 08 '11 at 17:01
  • Hmm.. I think this is the same concept as the algorithm I thought of. Clustering amplitude ratios vs delay for each STFT component and then reassembling the clusters. Oh well... :) – endolith Dec 12 '11 at 20:31
  • @Mohammad I only deal with audio myself and, as you have said, ICA is often mentioned as a possible solution to the separation problem, but I can't remember any specific papers which use ICA in echoic multi-mic scenarios. If they do, they must be using additional processing. I should also mention that the algorithm I suggested above, DUET, uses a very close microphone pair separated by less than 10 mm. You may find ICA applicable for close microphone pairs or standard dual-capsule stereo mics. But the general ICA formulation requires the instantaneous mixing model to work properly. – Dan Barry Dec 13 '11 at 12:41
  • @endolith yes it was a nice solution and it works! I also thought of a similar solution back in 2003 and was disappointed to find something similar, but mine was sufficiently different that I managed to patent it. I was lucky enough to be the first to develop a realtime source separation algorithm back in 2004. Demos of the original are here: http://www.audioresearchgroup.com/main.php?page=Demos and the new improved one can be seen in action in the video demo here: http://www.riffstation.com – Dan Barry Dec 13 '11 at 12:46
  • @DanBarry: I also thought of it around 2003. :) I asked a professor if I could implement it as a final project and he shot it down and said people use ICA for this, so I didn't look into it any more. Only years later did I realize that ICA doesn't work in this case and methods like this are actually better. In Riffstation it's used to isolate the guitar from other instruments? – endolith Dec 13 '11 at 14:13
  • @endolith : That's correct. There is a free download of it available as of yesterday if you want to try it for yourself. – Dan Barry Dec 17 '11 at 21:51

As I say further down the page:

it turns out that ICA doesn’t actually work well when the signals occur at different delays in the different sensor channels; it assumes instantaneous mixing (that the signals are in perfect sync with each other in all the different recordings). Delay would happen in a real-life situation with performers and microphones, since each source is a different distance from each microphone.

I'd guess that this delay between channels is the reason. If you look closely at the two waveforms, you will probably see that some sounds occur sooner in one channel than the other, and others vice versa.

To prove that it's not the quality of the microphones, you could record two different signals with one microphone at different times, then mix them together so that some of each signal is in each channel, and see if ICA works in that case.
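
For example, something along these lines should separate cleanly, because the mixing is instantaneous (file names and mixing weights are placeholders):

    import numpy as np
    from scipy.io import wavfile
    import mdp

    rate, s1 = wavfile.read('speech.wav')   # two separate mono recordings
    _,    s2 = wavfile.read('music.wav')    # made with the same microphone
    n = min(len(s1), len(s2))
    s = np.column_stack([s1[:n], s2[:n]]).astype('float64')

    # Instantaneous mix: each channel is a weighted sum, with no delay
    A = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
    mixed = s @ A.T                         # shape (n_samples, 2)

    sources = mdp.fastica(mixed)            # should recover s1 and s2,
                                            # up to scaling and ordering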

endolith
  • I've tried it. It should be a delay problem, as you suggest: mixing separate recordings, FastICA produces almost perfect results.

    I need to find some way to cope with the delay...

    – Jeremy Dec 04 '11 at 15:54
  • @Jeremy: I think you would need a different algorithm, then. – endolith Dec 05 '11 at 03:02
  • Do you know any BSS algorithm that can cope with delay? – Jeremy Dec 05 '11 at 14:53
  • Actually, when I record and clap my hands to produce a loud, sharp noise, I cannot notice any delay in Audacity. – Jeremy Dec 05 '11 at 14:55
  • @Jeremy: If you're exactly the same distance from both mics, there will be no delay. :) But if both sources are coplanar with this midpoint, then the levels of each will be the same and they will not be distinguishable by ICA. If they are at different angles, they will have different delays and also will not be distinguishable by ICA. I don't know of an algorithm that works with delays, but they probably exist. (I had an idea for one, which motivated looking into this, but I don't know if it would actually work.) – endolith Dec 05 '11 at 15:09
  • Then I just wonder whether all the ICA sample recordings online are artificial, only manually mixed rather than really recorded. – Jeremy Dec 05 '11 at 15:16
  • @Jeremy: Yes, I'm sure they're mixed directly without any delay. "These examples are artificially mixed on DEC Alpha station 200 after recording each source signal separately." Though another mentions "Fast convergence speed was achieved by using a time-delayed decorrelation method as a preprocessing step." which might be a way to use ICA for these? I don't know. – endolith Dec 06 '11 at 02:57
  • @Jeremy Perhaps I have missed something here, but if you simply need to correct for the delay between two signals that are from the same source, why not correlate them first, calculate the delay, and then shift the second signal by the negative of that amount relative to the first (see the sketch after this thread)? Then they would completely overlap, and you can run the ICA algo. – Spacey Dec 06 '11 at 04:10
  • @Mohammad: Would ICA work in that case, though? You could line up source 1 in both signals, but source 2 would become even more out of sync between the two signals. – endolith Dec 06 '11 at 05:34
  • @endolith Hmm... your comment got me thinking... However, I think ICA might still be OK here. My current understanding is the following: ICA doesn't 'care' about the delays, so long as both signals are present in each of the two mixes. The reason I don't think it cares is that it looks at the underlying statistics, not at phases or delays as such. The statistics differentiating the two sounds are constant throughout their overlap, and this is what ICA works on, so to speak. – Spacey Dec 06 '11 at 15:58
  • @endolith That being said, if at any one point you are comparing a mix (of the voice and music) to a mix (of just music), then yes, I would not expect ICA to work. As long as the two 'chunks' contain 2 signals (wherever they may be), ICA should work, in my understanding. (I am currently reading a fantastic book of ICA tutorials; I'm halfway through it, so I am no expert, but this is my current understanding.) – Spacey Dec 06 '11 at 16:00
  • @Mohammad: I'm not sure. The way I'm seeing it, after alignment, it would be 3 sources mixed into 2 signals (Source 1, Source 2, Source 2 Delayed). But since Source 2 and Source 2 Delayed are only in one signal each, maybe it can still handle it? – endolith Dec 06 '11 at 20:14
  • @endolith You might be right. Some new information: I checked the footnotes, and apparently the author of my book does say that he assumes all signals are not delayed relative to each other. :-/ In other words, the mixing matrix is simply one that changes amplitudes. Eh. Now it's even more confusing. :-) – Spacey Dec 12 '11 at 18:49
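
For reference, a minimal sketch of the cross-correlation alignment discussed in the comments above; note that it can only align one source at a time, and after shifting, the other source is even further out of sync:

    import numpy as np

    def estimate_delay(x1, x2):
        # Number of samples by which x2 should be shifted to best align with x1
        corr = np.correlate(x1, x2, mode='full')
        return np.argmax(corr) - (len(x2) - 1)

    # Align channel 2 to channel 1 for the dominant source:
    # d = estimate_delay(x1, x2)
    # x2_aligned = np.roll(x2, d)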

There is another algorithm which uses second-order statistics: AMUSE.

Here you can find an implementation in Python.
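
The core of AMUSE is short enough to sketch: whiten the mixtures, then eigendecompose a time-lagged covariance matrix. A bare-bones version (the default lag of 1 sample is an arbitrary choice; real implementations choose and combine lags more carefully):

    import numpy as np

    def amuse(x, lag=1):
        # x: mixtures with shape (n_channels, n_samples)
        x = x - x.mean(axis=1, keepdims=True)
        # whiten using the zero-lag covariance
        d, E = np.linalg.eigh(np.cov(x))
        W = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
        z = W @ x
        # symmetrized covariance at the chosen lag
        C = z[:, lag:] @ z[:, :-lag].T / (z.shape[1] - lag)
        C = (C + C.T) / 2
        # eigenvectors of the lagged covariance give the unmixing rotation
        _, V = np.linalg.eigh(C)
        return V.T @ z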

Matt L.