this post was submitted on 28 Apr 2024

General Discussion

[–] mindbleach 1 points 4 months ago

Quite OK Image format (QOI). One-page spec, comically fast, similar compression ratios.

I got thoroughly nerd-sniped by the same guy's Quite OK Audio (QOA) format, because he did a much less mic-drop job of that one. The target bitrate is high for a drop-in replacement for MP3 or Vorbis, and the complexity level is... weird. QOI is shockingly simple; QOA involves brute force and magic numbers. I thought I could do better.

I embraced the 64-bit "frame" concept, did some JavaScript for encoding and decoding as a real-time audio filter, and got one-bit samples sounding pretty good... in some contexts. Basically I implemented delta coding. Each one-bit sample goes up or down by a value specified in that frame, with separate up and down values, each defined in log scale using very few bits. Searching for good up/down values invites obsession, but guess-and-check works fine because each dataset is tiny. I settled on simulated annealing.
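A rough sketch of that one-bit delta scheme, in Python rather than JavaScript. Everything specific here is an assumption made for illustration: the step values are powers of two picked by a 3-bit code, bits are chosen greedily per sample, and the search is a plain exhaustive loop instead of the simulated annealing mentioned above. The function names are made up.

```python
from itertools import product

def decode_delta(start, up, down, bits):
    """Rebuild samples: a 1 adds `up`, a 0 adds `down` (down is negative)."""
    out, value = [], start
    for bit in bits:
        value += up if bit else down
        out.append(value)
    return out

def encode_delta(samples, start, up, down):
    """Greedy choice (an assumption): take whichever step lands closer."""
    bits, value, error = [], start, 0
    for target in samples:
        hi, lo = value + up, value + down
        bit = 1 if abs(hi - target) <= abs(lo - target) else 0
        value = hi if bit else lo
        error += (value - target) ** 2  # squared error against the input
        bits.append(bit)
    return bits, error

def best_steps(samples, start, code_bits=3):
    """Try every (up_code, down_code) pair; a frame is tiny, so this is cheap.

    Log-scale steps are modelled as powers of two: up = 2**up_code,
    down = -(2**down_code). That mapping is hypothetical.
    """
    best = None
    for up_code, down_code in product(range(1 << code_bits), repeat=2):
        up, down = 1 << up_code, -(1 << down_code)
        bits, error = encode_delta(samples, start, up, down)
        if best is None or error < best[0]:
            best = (error, up_code, down_code, bits)
    return best
```

For a frame that climbs by 4 and falls by 2, such as `[4, 8, 12, 16, 14, 12, 10, 8]` starting from 0, `best_steps` finds the codes for up = 4 and down = -2 and the bits decode back to the input exactly. Real audio won't be that tidy, which is where a smarter search starts to pay off.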

I ditched this shortly after doing double-delta coding. So instead of a 0 making sample N+1 be sample N plus the Down value, sample N+1 is always sample N plus the Change value, and a 0 makes Change equal Change plus the Down value (a 1 adds the Up value instead). This turns out to be really good at encoding a wiggly line, one millisecond at a time. If it's low-frequency. Low frequencies sound fantastic. Old-timey music? Gorgeous, slight hiss. Speech? Crystal clear. Techno? Complete honking garbage. Hilariously bad. Throw a high sine wave at delta coding and you get noise. With double-delta coding you get pleasant noise, but it's still nonsense bearing little resemblance to the input. It's not even a low-pass filter; the encoding method just chokes.
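The double-delta variant can be sketched the same way. Again this is an illustration under assumptions, not the original JavaScript: the greedy per-sample bit choice and the function names are invented here, and Change starts at whatever value the caller passes in.

```python
def decode_double_delta(start, change, up, down, bits):
    """Each bit nudges the slope (`change`); the slope is then added to the value."""
    out, value = [], start
    for bit in bits:
        change += up if bit else down  # a 1 adds Up, a 0 adds Down (negative)
        value += change                # sample N+1 = sample N + Change
        out.append(value)
    return out

def encode_double_delta(samples, start, change, up, down):
    """Greedy choice (an assumption): pick the slope update that lands closer."""
    bits, value, error = [], start, 0
    for target in samples:
        rise, fall = change + up, change + down
        bit = 1 if abs(value + rise - target) <= abs(value + fall - target) else 0
        change = rise if bit else fall
        value += change
        error += (value - target) ** 2
        bits.append(bit)
    return bits, error
```

A smooth curve like `[1, 3, 6, 8, 9, 9]` (start 0, Change 0, Up 1, Down -1) encodes with zero error as `[1, 1, 1, 0, 0, 0]`. A fast alternating input like `[10, -10, 10, -10]` can't be followed with the same settings, because the slope only moves one step per sample. That is at least consistent with the high-frequency choking described above.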

The clear fix would be re-implementing an initial test, where you specify high and low absolute values, and your one-bit samples just pick between them. It's naive carried-error quantization and it sounds like a child's toy that's never getting new batteries. But I'd do it alongside the delta options. Selecting which approach produces the least error would be done per-millisecond. You'd get occasional artifacting instead of mangled output or constant buzzing. I just ran out of steam and couldn't be arsed.
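Here is a minimal Python sketch of that unbuilt idea, so treat all of it as hypothetical: a two-level quantizer that carries its error forward, a plain delta encoder to compare against, and a selector that keeps whichever has the lower squared error for one block. The names, the greedy bit choice, and squared error as the metric are all assumptions.

```python
def encode_two_level(samples, high, low):
    """One bit per sample picks `high` or `low`; the miss is carried forward."""
    bits, carry, error = [], 0.0, 0.0
    for target in samples:
        wanted = target + carry            # add the error left over so far
        bit = 1 if abs(high - wanted) <= abs(low - wanted) else 0
        level = high if bit else low
        carry = wanted - level             # naive carried-error quantization
        error += (level - target) ** 2
        bits.append(bit)
    return bits, error

def encode_delta(samples, start, up, down):
    """Plain one-bit delta coding with a greedy bit choice, for comparison."""
    bits, value, error = [], start, 0.0
    for target in samples:
        hi, lo = value + up, value + down
        bit = 1 if abs(hi - target) <= abs(lo - target) else 0
        value = hi if bit else lo
        error += (value - target) ** 2
        bits.append(bit)
    return bits, error

def pick_mode(samples, start, up, down, high, low):
    """Encode one block both ways and keep the mode with the least error."""
    candidates = {
        "delta": encode_delta(samples, start, up, down),
        "two_level": encode_two_level(samples, high, low),
    }
    mode = min(candidates, key=lambda name: candidates[name][1])
    return mode, candidates[mode][0]
```

With up = 2, down = -2, high = 8, low = -8, a square-ish block like `[8, -8, 8, -8]` goes to the two-level mode and a gentle ramp like `[2, 4, 6, 8]` goes to delta. A real format would also have to spend a bit or two per block recording which mode was used.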