Opened 12 years ago
Last modified 9 years ago
#9438 reopened bug
Mixer not resampling very well
Reported by: | Pete | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Add-Ons/Media/Mixer | Version: | R1/Development |
Keywords: | Cc: | degea@… | |
Blocked By: | Blocking: | #9704 | |
Platform: | All |
Description
If audio is generated at a different rate from the driver's sample rate, the output has crackles, hash and other artefacts. This seems to happen whatever the mismatched rates are. I originally noticed it with 96000 output and 48000 source, but, as I have now reduced my system's rate to 48000, I'm attaching two sample audio sweeps that are identical except that one was generated at 48000 and the other at 44100. If the output rate is not one of those, both will sound bad, but if you select one or the other of those rates in Media Prefs, the corresponding file will be a perfect sweep, and the other will have crackles and "ghost sweeps" in the background!
(My audio is HDA, but I assume that this is irrelevant because the resampling is done in the mixer.)
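For anyone wanting to regenerate comparable test files, here is a minimal sketch of a sweep generator. It is not the tool used for the attachments: the linear sweep shape, the 5-second duration, the 0.5 amplitude and the helper names `sweep_samples` and `write_wav` are all assumptions made for illustration. Only the 100 to 10000 Hz range comes from the attachment description.

```python
import math
import struct
import wave


def sweep_samples(rate, seconds=5.0, f0=100.0, f1=10000.0, amplitude=0.5):
    """Return a linear sine sweep from f0 to f1 Hz as floats in [-1, 1].

    Duration, amplitude and the linear shape are assumptions.
    """
    n = int(rate * seconds)
    k = (f1 - f0) / seconds  # sweep rate in Hz per second
    out = []
    for i in range(n):
        t = i / rate
        # Phase is the integral of the instantaneous frequency f0 + k*t.
        phase = 2.0 * math.pi * (f0 * t + 0.5 * k * t * t)
        out.append(amplitude * math.sin(phase))
    return out


def write_wav(path, samples, rate):
    """Write the samples as a mono 16-bit PCM WAV file."""
    frames = b"".join(
        struct.pack("<h", max(-32768, min(32767, round(s * 32767))))
        for s in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
```

Calling `write_wav("sweep48000.wav", sweep_samples(48000), 48000)` and the same with 44100 should give two files that differ only in sample rate, which is the property the test relies on.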
Attachments (4)
Change History (30)
by , 12 years ago
Attachment: | sweep44100.wav added |
---|
follow-up: 2 comment:1 by , 12 years ago
Downloaded and reproduced here; same setup (HDA, 48000 instead of 96000 Hz). Using old hrev44058
- crackles in the first second or so of playback of both files.
- ghost sweep only in the 44100 (the one that requires resampling) file.
- interestingly, the crackling is not there any more on "looping" (I'm using an old h-rev where MediaPlayer has a bug where it enables looping by default).
Wondering if #7879 could be related to this; especially since I can't seem to reproduce that bug in Alpha4.1 any more (didn't spend much time in 4.1 though, A4.1 has an irritating Desktop bug so I didn't migrate yet).
EDIT: this is a fairly temperamental bug/piece of code! Just double-clicked the files again after submitting this comment, and now they both have permanent crackling throughout the duration of playback, and it persists after looping back to beginning ad infinitum. The "ghost sweep" seems a more reliable "bug" (so to speak) since it continues to only occur in the resampled audio, not the apples-to-apples 48000 Hz file.
comment:2 by , 12 years ago
Replying to ttcoder:
Downloaded and reproduced here; same setup (HDA, 48000 instead of 96000 Hz). Using old hrev44058
Thanks for verifying...
Wondering if #7879 could be related to this; especially since I can't seem to reproduce that bug in Alpha4.1 any more
Haven't ever encountered that one, but I did forget to mention that I wondered if #1351 could be related -- though that's marked as 'fixed' too.
EDIT: this is a fairly temperamental bug/piece of code! Just double-clicked the files again after submitting this comment, and now they both have permanent crackling throughout the duration of playback, and it persists after looping back to beginning ad infinitum. The "ghost sweep" seems a more reliable "bug" (so to speak) since it continues to only occur in the resampled audio, not the apples-to-apples 48000 Hz file.
Huh. That has not happened to me. The matching file has always been clean.
comment:3 by , 12 years ago
Cc: | added |
---|---|
Version: | R1/alpha4.1 → R1/Development |
Tried both in hrev45325, with HDA output set to 192 kHz and I get ghost sweeps (but no crackles) in sweep48000.wav
comment:4 by , 12 years ago
I have found the same: https://dev.haiku-os.org/ticket/9704#comment:4 If I play wave files generated at 44100 Hz and the mixer output sample rate is set to a different frequency (e.g. anything from 48000 to 192000), I can hear clicks and crackles. If instead I set the mixer output sample rate to match the wave files (in this case 44100), I no longer hear clicks and crackles.
comment:5 by , 12 years ago
Blocking: | 9704 added |
---|
follow-up: 8 comment:6 by , 10 years ago
Which resampler are you using in media preferences? drop/repeat or linear interpolator? Does that make a difference?
comment:7 by , 10 years ago
Personally I noticed that "Linear interpolation" works far better: the difference is quite noticeable when I listen to music.
comment:8 by , 10 years ago
Replying to pulkomandy:
Which resampler are you using in media preferences? drop/repeat or linear interpolator? Does that make a difference?
Hah! I've always used Linear Interpolation, so that's where I noticed it. But your query just prompted me to try Drop/repeat, and it's many times worse!
comment:9 by , 10 years ago
Component: | Audio & Video → Add-Ons/Media/Mixer |
---|
comment:10 by , 10 years ago
In hrev48459 I added a tool to show what the resampler is doing. This is simpler than relying only on my ears to debug the thing. And indeed it shows that the linear interpolator isn't quite working as expected; attaching pictures.
by , 10 years ago
Attachment: | droprepeat.png added |
---|
by , 10 years ago
Attachment: | interpolate.png added |
---|
comment:11 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed in hrev48460. The linear interpolator had been completely broken ever since I implemented it in 2010. Now it works as expected. You can play with mixerToy to see the difference, too.
comment:12 by , 9 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
I'm sorry to say that I still get this problem (hrev49993), just as it was before! Did something get regressed?
follow-up: 14 comment:13 by , 9 years ago
Almost certainly not, because not a single line of code in the mixer has changed since my fixes. My guess is your problem is yet something else than what I fixed.
comment:14 by , 9 years ago
Replying to pulkomandy:
Almost certainly not, because not a single line of code in the mixer has changed since my fixes. My guess is your problem is yet something else than what I fixed.
Dunh... Well, do you not get the "ghost sweeps"? If you use (the default?) 48000 sampling, and play sweep44100.wav (in a reasonable audio system -- it's not that prominent through my laptop speaker, though still audible), do you not hear a lot of crackles and fainter sweeps in the background (that actually go up and down)? And as before, if I reset sample rate to 44100, that file plays cleanly, but sweep48000.wav has the ghosts.
I made sure I had an absolutely standard hrev49993. I do have the custom HDA that lets me adjust buffer size, but I removed that and used the packaged one.
I assume I'm correct in assuming that everything after the mixer must be at the system sample-rate? So it can't be an audio hardware problem.
Let's see if we can track this down, because it definitely makes sample-rate conversion unusable. (Sorry I didn't get to check out your revisions at the time.)
follow-up: 19 comment:15 by , 9 years ago
Pete: Are the crackles gone now? My guess is those were due to the bugs (out-of-memory access, etc) in the linear resampler that Pulkomandy fixed.
The ghost sweeps on the other hand are probably an artefact of linear resampling; which is not a good way to resample audio in general. In a work project I've used a resampler from xiph.org speexdsp library - it's BSD licensed and the entire implementation is self-contained in the resampler.c file. It offers a configurable quality parameter to select the number of samples in each window and tune the speed/quality tradeoff.
https://github.com/xiph/speexdsp/blob/master/libspeexdsp/resample.c
comment:16 by , 9 years ago
Various parts of the BeBook state that BeOS performs internal audio manipulation in floating-point format, but this extract from the BSoundPlayer documentation is the clearest:
By default, the audio format used by a BSoundPlayer is BSoundPlayer::B_AUDIO_FLOAT, which is to say that the audio is in floating-point format, where each sample ranges from -1.0 to 1.0. The Media Kit uses floating-point audio internally, so using floating-point audio whenever you can will improve your application's performance by cutting down on format conversions. However, if you want to use another format, you may do so by specifying a media_raw_audio_format when you instantiate the BSoundPlayer object.
So, my interpretation is that the mixer should do its internal processing representing everything in floating point audio. This gives us at least two advantages: we can use the various algorithm implementations out there, and we make it easy for audio users to avoid format conversions before the audio reaches the sound card. In that case, if the user ensures that no nodes are putting audio in a different format, this can give an adequate balance between performance and flexibility.
comment:17 by , 9 years ago
So, my interpretation is that the mixer should do its internal processing representing everything in floating point audio.
Reading deep into the BeBook as if it was an antique tome of perfect knowledge and interpreting it to infer implementation details of BeOS probably isn't all that useful. I very much like the BeBook and it contains lots of relevant information... but in the end we should strive to do whatever is best, regardless of how things worked in BeOS. The internals of the mixer node are not relevant in terms of API or even binary compatibility, that's what encapsulation is all about after all.
Doing all resampling calculations in floating point is reasonable, given today's CPUs and the fact that resampling is an inherently lossy process anyway. However, resampling is not always necessary, and when it isn't, it must still be possible to pass bit-exact streams through the system mixer node without any changes occurring. That means no int->float->int conversion (which is, again, a lossy process) anywhere... otherwise, people who care a lot about audio quality would be unhappy :-)
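How much an int->float->int round trip actually loses depends on the sample width and on the scaling used. The sketch below checks this under assumptions of its own: 32-bit floats, symmetric power-of-two scaling, and no gain or dither applied in between. The helper names are hypothetical and this is not the mixer's conversion code.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trip(value, bits):
    """int -> float32 in [-1, 1) -> int, using power-of-two scaling."""
    scale = float(1 << (bits - 1))
    as_float = to_float32(value / scale)
    restored = round(as_float * scale)
    # Clamp to the integer range, as a real converter would have to.
    return max(-(1 << (bits - 1)), min((1 << (bits - 1)) - 1, restored))


def count_changed_int16():
    """Number of 16-bit sample values altered by the round trip."""
    return sum(1 for v in range(-32768, 32768) if round_trip(v, 16) != v)
```

Under these assumptions every 16-bit value survives unchanged, because a float32 has a 24-bit significand, while 32-bit samples such as 33554433 do not. Any other scaling, any gain stage or any dithering between the two conversions would change 16-bit samples as well, so a pass-through path that skips the conversion remains the only way to guarantee bit-exact output.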
comment:18 by , 9 years ago
I think there are various clues and technical aspects that seem to reflect that part of the BeBook. If the problem is providing a way to do that, we can still add a field in the mixer settings to change the internal format to something different than float, but most audio apps (not only Haiku ones) out there are expecting float to be the native format. Other audio systems actually work this way, so I would expect people to complain more about not having proper support for float/double than about integer accuracy.
follow-up: 21 comment:19 by , 9 years ago
Replying to tangobravo:
Pete: Are the crackles gone now? My guess is those were due to the bugs (out-of-memory access, etc) in the linear resampler that Pulkomandy fixed.
I'm afraid not. (:-() As far as I can tell, the output is completely unchanged.
The ghost sweeps on the other hand are probably an artefact of linear resampling; which is not a good way to resample audio in general.
I'll be a bit surprised if they are. I suppose audio that is more complex than a sweep would have less prominent artefacts, but they could still sound pretty bad. (It's a pity I no longer have sound on my old BeOS machine. Would be nice to compare performance.)
Just in case there might be some 'folding' going on with out-of-range interpolated samples, I reduced the (MediaPlayer) volume by a factor of several (I recorded the file at a fairly high level), but it made no difference.
comment:20 by , 9 years ago
but most audio apps (not only Haiku ones) out there are expecting float to be the native format
I guess you mean DAWs and such, which support complex signal processing chains, so that they indeed use float/double/etc formats internally to avoid unnecessary losses along the chain. That's most certainly useful.
However, an even more common audio application is playback: play an mp3/wav/flac/etc file from disk. In that case, the samples will generally be integer-based. And as said on the ML, sound cards expect integers too. If you introduce a conversion to float and back to int anywhere between the file and the sound card, you degrade the playback quality. Sure, most people won't care (enthusiasts do). But since unnecessary conversions are, well, unnecessary, and easily avoided, I think we should not add preventable signal degradation. (Assuming the case that no resampling is needed here.)
we can still add a field in the mixer settings to change the internal format to something different than float
Being able to just connect any format to the mixer and make it automatically choose the best is a simpler (for the user!) and nicer solution. It would be annoying for users to have to switch this back and forth.
follow-up: 22 comment:21 by , 9 years ago
Replying to Pete:
Replying to tangobravo:
The ghost sweeps on the other hand are probably an artefact of linear resampling; which is not a good way to resample audio in general.
I'll be a bit surprised if they are. I suppose audio that is more complex than a sweep would have less prominent artefacts, but they could still sound pretty bad. (It's a pity I no longer have sound on my old BeOS machine. Would be nice to compare performance.)
Just in case there might be some 'folding' going on with out-of-range interpolated samples, I reduced the (MediaPlayer) volume by a factor of several (I recorded the file at a fairly high level), but it made no difference.
The Nyquist Theorem means you can perfectly reproduce an analog signal as long as your sampling rate is at least double the maximum frequency of the signal. Linear resampling introduces steps in the gradient of the signal (drop/repeat is much worse as it introduces steps in the value of the signal). Those steps will introduce higher-frequency components into the signal, harmonics of which will probably occur at audible frequencies. I could be wrong, but I have a strong suspicion there would be a pattern to these harmonics that would be particularly noticeable with a sweep - a little like how you get "beats" when there are two nearby frequencies.
Could test for sure by writing something to do the simple linear resampling offline to a wav, and then play that through the mixer (or on any other "known good" audio platform).
The float/int discussion is relatively separate; the speexdsp code I posted has both floating point and integer resamplers. I believe this is the code that is used by Audacity too. The key point is instead of just linearly filling in the new samples, it actually finds the frequencies present in the original signal and then interpolates using them.
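The offline test suggested above could start from something like the following sketch. It implements both simple methods and measures how far each one lands from a sine sampled directly at the output rate. This is an illustration only, not the Haiku mixer code; the function names and the 5 kHz test tone are assumptions.

```python
import math


def resample_drop_repeat(src, src_rate, dst_rate):
    """Pick the nearest earlier input sample (steps in the signal value)."""
    n = int(len(src) * dst_rate / src_rate)
    last = len(src) - 1
    return [src[min(last, int(i * src_rate / dst_rate))] for i in range(n)]


def resample_linear(src, src_rate, dst_rate):
    """Interpolate linearly between neighbouring input samples."""
    n = int(len(src) * dst_rate / src_rate)
    last = len(src) - 1
    out = []
    for i in range(n):
        pos = i * src_rate / dst_rate  # fractional read position in src
        j = int(pos)
        frac = pos - j
        a = src[min(j, last)]
        b = src[min(j + 1, last)]      # clamp at the end of the input
        out.append(a + (b - a) * frac)
    return out


def rms_error(resampled, freq, dst_rate):
    """RMS difference from the ideal sine sampled directly at dst_rate."""
    total = 0.0
    for i, s in enumerate(resampled):
        ideal = math.sin(2.0 * math.pi * freq * i / dst_rate)
        total += (s - ideal) ** 2
    return math.sqrt(total / len(resampled))


def compare(freq=5000.0, src_rate=44100, dst_rate=48000, n=4410):
    """Return (drop/repeat error, linear error) for a single test tone."""
    src = [math.sin(2.0 * math.pi * freq * i / src_rate) for i in range(n)]
    return (
        rms_error(resample_drop_repeat(src, src_rate, dst_rate), freq, dst_rate),
        rms_error(resample_linear(src, src_rate, dst_rate), freq, dst_rate),
    )
```

Both errors are non-zero and the linear one is the smaller of the two. That residual is the part of the output that does not belong to the original tone, and it is what would be heard as extra frequencies. Writing the resampled samples to a WAV file and playing it at the matching system rate would separate resampler artefacts from anything else in the playback chain.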
comment:22 by , 9 years ago
Replying to tangobravo:
The Nyquist Theorem means you can perfectly reproduce an analog signal as long as your sampling rate is at least double the maximum frequency of the signal. Linear resampling introduces steps in the gradient of the signal (drop/repeat is much worse as it introduces steps in the value of the signal). Those steps will introduce higher-frequency components into the signal, harmonics of which will probably occur at audible frequencies. I could be wrong, but I have a strong suspicion there would be a pattern to these harmonics that would be particularly noticeable with a sweep - a little like how you get "beats" when there are two nearby frequencies.
Could test for sure by writing something to do the simple linear resampling offline to a wav, and then play that through the mixer (or on any other "known good" audio platform).
I don't think I want to go to that length right now (:-/), but I did another experiment that pretty well indicates you are right.
I realized that sox has all sorts of resampling options, so I tried the dirtiest one it provides -- the 'q' option => cubic interpolation -- and sure enough there are almost the same ghosts there. No crackles though, which are definitely still in the mixer. If I let sox do the resampling in higher quality, there are no artefacts.
Another test was to generate a sweep from 10k to 15k for the mixer to resample, and the ghosts (and the crackles) are the same and just as prominent, even though the 'base' frequencies aren't. The crackles show up even if I use a fixed high frequency sine, but not if I use a very low amplitude or silence.
The float/int discussion is relatively separate; the speexdsp code I posted has both floating point and integer resamplers. I believe this is the code that is used by Audacity too. The key point is instead of just linearly filling in the new samples, it actually finds the frequencies present in the original signal and then interpolates using them.
That sounds expensive! I suppose the problem can be minimized otherwise, anyway. I always do live music generation at the system sample rate, so there's no difficulty. And if I have a recorded sample which is bad, I can always resample it with one of sox's high-quality options to get a usable version.
comment:23 by , 9 years ago
There are many ways to resample, with better interpolation methods, or by applying more filters to the input. They all have strengths and drawbacks (in terms of computation time or output quality, latency, phase shifts, etc).
However, the use of simple linear interpolation explains the extra frequencies to some extent, but not the crackles. So these will probably still be there no matter what we do with the resampling algorithm. They can be caused by a bug in the code (maybe there is some saturation), or a bug in something around the mixer (for example, the way the samples are chunked before being sent to it).
The mixer core code has no information about the sample rates of the input and output. All it gets is input and output buffers of different sizes and it fills one from the data in the other. I think the code works well as long as we stay inside a single buffer, but the chaining of multiple buffers may cause problems, especially if the buffers are not always of the same size, or if there are "gaps" in the output (even a single sample with a value of 0 would be very noticeable).
Now that Barrett made some fixes to reconnecting nodes in Cortex, it's possible to record the output of the mixer with SoundRecorder, which means we can more easily analyze it (my ears are not a perfect debugging tool :)). Let's see what we can discover in the resampled output then?
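To illustrate why chaining buffers is a plausible source of crackles, here is a sketch comparing two ways of resampling a signal that arrives in blocks. One resamples every block in isolation; the other carries the fractional read position and the last input sample across block boundaries. It is a hypothetical model, not the mixer's code: the class and function names, the 256-sample block size and the 1 kHz tone are all assumptions.

```python
import math


def resample_block(buf, src_rate, dst_rate):
    """Resample one buffer in isolation, clamping at its end."""
    n = len(buf) * dst_rate // src_rate
    last = len(buf) - 1
    out = []
    for i in range(n):
        j, rem = divmod(i * src_rate, dst_rate)
        a = buf[min(j, last)]
        b = buf[min(j + 1, last)]
        out.append(a + (b - a) * rem / dst_rate)
    return out


class StreamingLinear:
    """Linear resampler that carries its state from one buffer to the next."""

    def __init__(self, src_rate, dst_rate):
        self.src_rate = src_rate
        self.dst_rate = dst_rate
        self.num = 0     # read position times dst_rate, relative to buffer start
        self.prev = 0.0  # last sample of the previous buffer (index -1)

    def process(self, buf):
        """Resample a non-empty buffer, continuing from the previous call."""
        out = []
        limit = (len(buf) - 1) * self.dst_rate
        while self.num < limit:
            j, rem = divmod(self.num, self.dst_rate)  # j is -1 just before buf[0]
            a = self.prev if j < 0 else buf[j]
            b = buf[j + 1]
            out.append(a + (b - a) * rem / self.dst_rate)
            self.num += self.src_rate
        self.num -= len(buf) * self.dst_rate
        self.prev = buf[-1]
        return out


def max_jump(samples):
    """Largest difference between two consecutive samples."""
    return max(abs(y - x) for x, y in zip(samples, samples[1:]))


def demo(block=256, src_rate=44100, dst_rate=48000, freq=1000.0, n=4096):
    """Return (one-pass, per-block, streaming) resampled versions of a tone."""
    signal = [math.sin(2.0 * math.pi * freq * i / src_rate) for i in range(n)]
    blocks = [signal[i:i + block] for i in range(0, n, block)]
    whole = resample_block(signal, src_rate, dst_rate)
    naive = [s for blk in blocks for s in resample_block(blk, src_rate, dst_rate)]
    streamer = StreamingLinear(src_rate, dst_rate)
    streamed = [s for blk in blocks for s in streamer.process(blk)]
    return whole, naive, streamed
```

With these parameters the streaming version reproduces the one-pass result sample for sample, while the per-block version loses a fraction of an output sample at every boundary and shows a larger sample-to-sample jump there. Running `max_jump` over a SoundRecorder capture of the mixer output might be one way to check whether the real crackles line up with buffer boundaries.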
comment:24 by , 9 years ago
As already said, most audio processing is done in floating point, and this includes apps like MediaPlayer. So the mixer should probably choose the internal format more smartly instead of relying only on the multi_audio_node one. I would like to avoid having the mixer become a bottleneck because it chooses strange formats...
comment:26 by , 9 years ago
@Barrett: I still don't get your point. The sound card expects input in a specific format, and the role of the mixer is to convert everything to said format. If we make the mixer always use floats, you would still need something to perform the float->int conversion. The conversion is done during mixing, because this makes it possible to minimize errors and possibly to apply some compensation effects (dithering, for example).
I don't see a bottleneck, the conversion has to be done anyway. Also, the mixer doesn't have a single internal format, it has an input and an output format, and dedicated code for each possible combination (some of these can be generated easily in C++ by using templates). This allows the least loss possible (both for latency and precision), no matter what your input and output formats are.
So: the output must be in the soundcard format, and the inputs must be whatever you feed into it. The mixer has no decision to make here, it is a constrained system.
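As an illustration of the float->int step with dithering mentioned above, here is a minimal sketch using triangular (TPDF) dither. Whether the Haiku mixer dithers, and how, is not stated in this ticket; the function name, the 32767 scale factor and the seeded generator are assumptions made for the example.

```python
import random


def float_to_int16(samples, dither=True, rng=None):
    """Convert floats in [-1, 1] to 16-bit integers, optionally with TPDF dither."""
    rng = rng or random.Random(0)  # fixed seed so the example is repeatable
    out = []
    for s in samples:
        x = s * 32767.0
        if dither:
            # The difference of two uniform values is triangular noise
            # spanning just under +/-1 LSB.
            x += rng.random() - rng.random()
        out.append(max(-32768, min(32767, round(x))))
    return out
```

The dither trades a small amount of added noise for quantisation error that is less correlated with the signal, which is one reason to do the conversion at the point where all inputs have already been mixed.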
100..10000Hz sweep at 44100 fps