2

I am having the most difficult time mixing simple unsigned 8-bit/8kHz linear PCM audio samples in JavaScript (AppsScript). I've tried all the basic maths listed here, employing basic number arrays.

  1. Sign, add, clip, unsign. Subtract 128 from each byte, add them, clip, then add 128.
  2. Average: each output sample is (a[x] + b[x]) / 2.
  3. Viktor T. Toth's formula (linked).
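
For reference, the three approaches can be sketched roughly as follows. This assumes `a` and `b` are already true unsigned samples in 0..255; the function names are made up for illustration, and the piecewise form in `mixToth` is how the linked Toth page is usually quoted, so treat its exact shape as an assumption.

```javascript
// Sketch only: a and b are assumed to be unsigned 8-bit samples (0..255).

// 1. Sign, add, clip, unsign.
function mixAddClip(a, b) {
  var sum = (a - 128) + (b - 128);                   // signed sum, -256..254
  var clipped = Math.max(-128, Math.min(127, sum));  // saturate to signed 8-bit
  return clipped + 128;                              // back to 0..255
}

// 2. Average. The parentheses around the sum matter.
function mixAverage(a, b) {
  return Math.floor((a + b) / 2);
}

// 3. Toth-style formula (assumed piecewise form, see note above).
function mixToth(a, b) {
  var z = (a < 128 && b < 128)
    ? (a * b) / 128
    : 2 * (a + b) - (a * b) / 128 - 256;
  // Rounding can land on 256 for a = b = 255, so clamp to the byte range.
  return Math.max(0, Math.min(255, Math.round(z)));
}
```

All three return 128 (silence) when both inputs are 128, which makes a quick sanity check before trying real audio.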

Full example is below. This will ask you to authorize Google Drive to access the audio samples (no it does NOT touch anything of yours): https://script.google.com/macros/s/AKfycbzMOWV5Z5soG3uinrFc0WcscNXxsDq9elE9rzG0t703vsvDJZMS/exec

As you can see the 3 mixed results are downright horrible. I've always assumed the unsigned bytes were just offset by 128. Not the case? Do I need to fold half the byte to get a true linear representation? Does a basic number in JavaScript not translate appropriately into signed/unsigned representation? If that's the case the Java/AppsScript is somehow properly doing it with the two real samples.


Update: Here is some specific code; its output is at the link above. Alas, it doesn't quite work (byte folding in JavaScript?).
   var vegaArray = DriveApp.getFileById('0B-e9EqGm0pWPQ3RUTXFyUERDVTA').getBlob().getBytes();
   var fdraArray = DriveApp.getFileById('0B-e9EqGm0pWPaUJQUmFRQWctNG8').getBlob().getBytes();

   for (var i = 44; i < vegaArray.length; i++) {
     vegaArray[i] = Math.round(
       2 * (vegaArray[i] + fdraArray[i])
       - (vegaArray[i] * fdraArray[i] / 128)
       - 256
     );
   } // for
   // vegaArray[] is the mixed output
Jé Queue
  • 2
    Do you perform the addition with an 8-bit signed data type or a larger one, such as a 32-bit signed integer or a float? With an 8-bit container there can be severe wrap-around distortion after the addition, before you can even clip the saturated sums, which makes clipping useless as well. – Fat32 Dec 10 '16 at 00:41
  • @fat32, I believe the implicit numeric type in JavaScript is double, therefore the "byte" read is just [0.0,255.0] in the internal representation. If I output the array, I easily get values above and below the 8-bit range. – Jé Queue Dec 10 '16 at 01:05
  • 1
    I'm looking at a solution right now. the first thing I noticed is that both audio files are sized such that directly adding them will not clip, just FYI – benathon Dec 10 '16 at 02:00
  • 1
    and average-mixing should never clip. – robert bristow-johnson Dec 10 '16 at 02:18
  • FYI, it appears as though AppsScript flops the byte. Effectively b>127?b-256:b – Jé Queue Dec 10 '16 at 18:57
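
To see why that folding ruins a mix, here is a minimal sketch. It assumes the bytes arrive as signed values in -128..127, as observed in the comment above; `toSigned` and `toUnsigned` are hypothetical helper names, not Apps Script functions.

```javascript
// Hypothetical helpers: move a byte value between its signed and unsigned views.
function toSigned(u) { return u > 127 ? u - 256 : u; }   // 0..255    -> -128..127
function toUnsigned(s) { return s < 0 ? s + 256 : s; }   // -128..127 -> 0..255

// Two near-silent unsigned samples, one on each side of the 128 midpoint,
// as they would look after being read as signed bytes.
var a = toSigned(127);   // stays 127
var b = toSigned(129);   // becomes -127

// Averaging the signed views directly gives 0, which as an unsigned
// sample is the most negative excursion instead of silence.
var naive = (a + b) / 2;

// Unfolding first gives 128, which is silence, as expected.
var correct = (toUnsigned(a) + toUnsigned(b)) / 2;
```

Samples that hover around the midpoint keep flipping between the two halves of the range, so a mix computed on the folded values is full of large jumps.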

2 Answers

1

Assuming that abyte[] and bbyte[] are two arrays with samples valued from 0-255, this is the solution:

var combined = [];
for (var i = 0; i < abyte.length; i++)
{
    // abyte[i] and bbyte[i] must be unsigned values in 0..255
    var sample = 2 * (abyte[i] + bbyte[i]) - ((abyte[i] * bbyte[i]) / 128) - 256;
    combined[i] = Math.round(sample);
}

I have a full example with both wav files extracted into arrays (I did this using MATLAB). I also verified that it plays back fine in MATLAB.

https://jsfiddle.net/4hsfo96g/

benathon
  • But that's exactly what I've done. Edit above with code to address this specific answer. I'm thinking bytes are folded in JavaScript? – Jé Queue Dec 10 '16 at 02:54
  • 1
    Well, I can guarantee you mine works fine, so start comparing my outputs to yours. Also, you should edit your question to include code. – benathon Dec 10 '16 at 02:55
  • I just edited above to show the code that resembles exactly what you have. I'm going to try this in C, and see if it is simply the Number implicit type in JavaScript that is folding bytes? – Jé Queue Dec 10 '16 at 02:59
  • 1
    Maybe try a passthrough test where you call Math.round() on the samples from one array and nothing else, then play that output. If it's all messed up, you know the problem is in how you are writing the results and not in your DSP. – benathon Dec 10 '16 at 03:06
  • 1
    even though i've seen it in the Toth reference, can you explain how the multiplication of the two signals together is a functional part of "mixing" audio signals? is the generation of frequency components that exist in neither of the two inputs a function of "mixing"? – robert bristow-johnson Dec 10 '16 at 19:22
  • 1
    No, it's not mixing in the traditional sense. It's just added. But the layperson's term in audio is "mixing". – benathon Dec 10 '16 at 22:10
  • 1
    no port, the Toth algorithm is not just adding. that's the point. despite that Viktor T. Toth may have some reputation as a software designer, his audio mixing algorithm is shit. if you use that algorithm, your output will sound bad. if you add without scaling, you will need to apply some kind of saturation (and there are soft-clipping algs out there) to prevent wrap-around overflow (which is the worst sounding), and if you add and scale by $\frac12$, then it is impossible that two clean signals going in will come out distorted. – robert bristow-johnson Dec 10 '16 at 22:16
  • 1
    so the Toth algorithm is crap. averaging keeps the signals clean and never clips, but it puts in a -6 dB gain factor you might not want. if you want to mix with no dB loss, then you will need to saturate if the value gets too big to fit in the output word. if "hard clipping" is a problem, then "soft clipping" applied to the sum is what people oft do. would you like a simple soft clipping alg? – robert bristow-johnson Dec 10 '16 at 22:36
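
As a rough illustration of the soft-clipping idea, here is one common choice, a cubic curve applied to the sum. It is not necessarily the algorithm offered in the comment above; `mixSoftClip` is a hypothetical name, and the inputs are assumed to be true unsigned samples in 0..255.

```javascript
// Sketch only: a and b are assumed to be unsigned 8-bit samples (0..255).
function mixSoftClip(a, b) {
  // Signed sum, normalised so that the full range maps to about -1..1.
  var x = ((a - 128) + (b - 128)) / 256;

  // Cubic soft clipper: close to linear near zero, flattening towards +/-1.
  var y = 1.5 * x - 0.5 * x * x * x;

  // Scale back to the signed 8-bit range, then re-apply the 128 offset.
  return Math.round(y * 127) + 128;
}
```

With this scaling, quiet passages come out a little under 3 dB down rather than the 6 dB of plain averaging, and loud sums are rounded off instead of being cut flat.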
0

Found the answer through trial and error: the bytes are folded. AppsScript hands back each byte as a signed two's-complement value, so the upper half of the unsigned 8-bit range, [128,255], arrives as [-128,-1] in ascending order. Therefore, averaging or adding the values as-is was pointless.

The step-wise code in AppsScript:

   // See code above from where vegaArray and fdraArray come
   for(var i=44;i<vegaArray.length;i++) {
     // flop the bytes atop
     vegaArray[i] = vegaArray[i]<0?vegaArray[i]+256:vegaArray[i];
     fdraArray[i] = fdraArray[i]<0?fdraArray[i]+256:fdraArray[i];

     // now do the unsigned 8-bit stuff (avg, add)
     // floor keeps the result a whole number before it goes back into a byte
     vegaArray[i] = Math.floor((vegaArray[i] + fdraArray[i]) / 2);

     // flop back the bytes abaft
     vegaArray[i] = vegaArray[i]<128?vegaArray[i]:vegaArray[i]-256;

   } // for
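
The same steps can also be written with bit operations and wrapped in a function. This is only a sketch: `mixSignedBytes` and its parameters are hypothetical names, it assumes both arrays hold signed bytes as described above, and it leaves the header bytes (44 in the loop above) untouched without updating any length field in them.

```javascript
// Hypothetical helper: average two arrays of signed bytes that really
// carry unsigned 8-bit PCM. Bytes before headerLength are copied from x.
function mixSignedBytes(x, y, headerLength) {
  var out = x.slice();
  var n = Math.min(x.length, y.length);   // stop at the shorter input
  for (var i = headerLength; i < n; i++) {
    var a = x[i] & 0xFF;                  // signed -> unsigned 0..255
    var b = y[i] & 0xFF;
    var avg = (a + b) >> 1;               // integer average, still 0..255
    out[i] = (avg << 24) >> 24;           // unsigned -> signed -128..127
  }
  return out;
}
```

For example, `mixSignedBytes(vegaArray, fdraArray, 44)` would return the mixed bytes in the same signed form the inputs had.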
Jé Queue