this post was submitted on 28 Jan 2024
Programmer Humor
I honestly don't know why async got so popular. It seems like the entire reason for doing it was that some blokes were hellbent on using JS on a server and couldn't fathom that a second thread might be a good idea.
If you are waiting for IO, why would you block your current thread and not let it do something else? Async does not only exist in JS.
De facto it is callback hell, though. Debugging async code is horrible, and let's be honest here: async is just global-synchronized (or whatever it's called in not-Java) with extra steps.
You have a very basic (mis)understanding of what async is and as such are misrepresenting it in your arguments.
After using both extensively I would argue async code is easier to read. It has a lot less nesting. And generally easier to read code is a good thing so I'm all for async.
A huge amount of time in apps is spent waiting for IO, database or web requests to complete.
Async prevents locking a thread during this wait.
If you're handling a large amount of requests in a web server, for example, it allows other requests to progress while waiting for these operations.
Threads are also expensive to start and manage.
Also handling threads manually is a pain in the ass.
That's a very common misconception. async is just a scheduling tool that runs at the end of the current event loop turn (the microtask queue). It still runs on the main thread and you can still lock up your UI. You'd need Web Workers for actual multi-threading.
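To make the scheduling claim above concrete, here is a minimal sketch of the ordering between synchronous code, microtasks, and timers. The function name runOrderDemo is made up for illustration; queueMicrotask and setTimeout are the standard calls. It assumes a current Node.js or browser runtime.

```javascript
// Records the order in which each piece of code actually runs.
function runOrderDemo() {
  const order = [];
  return new Promise((resolve) => {
    // Macrotask: runs on a later turn of the event loop.
    setTimeout(() => {
      order.push("timeout");
      resolve(order);
    }, 0);

    // Microtask: runs as soon as the current call stack is empty.
    queueMicrotask(() => order.push("microtask"));

    // An async function runs synchronously up to its first await...
    (async () => {
      order.push("async start");
      await null; // ...and the remainder is resumed from the microtask queue.
      order.push("async resumed");
    })();

    order.push("sync end");
  });
}

runOrderDemo().then((order) => console.log(order.join(" -> ")));
// prints "async start -> sync end -> microtask -> async resumed -> timeout"
```

Everything here runs on the same thread. The await only decides when the rest of the function is scheduled, not where it runs.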
It can lock up a UI doing cpu bound work. Making a web request, no. Preventing the ui thread from waiting on native IO is what async was created for.
Citation needed.
async is just a wrapper for Promises. IO isn't related, just commonly used with it. https://tc39.es/ecma262/multipage/control-abstraction-objects.html#sec-async-functions-abstract-operations-async-function-start
NodeJS's IO and fetch are just promises. (And NodeJS used to use callback(err, response) before adding promises.)
Yes I’m simplifying a LOT, but in the context of background web calls, that was what callbacks became so important for. XMLHttpRequest in IE 5 sparked the Ajax movement and adventures in nested callbacks.
Prior to that, the browser had window.setTimeout and its callback for delays and animation and such - but that’s it.
The main purpose of all this async callback stuff was originally, and arguably still is (in the browser), for allowing the ui event loop to run while network requests are made.
NodeJS didn’t come into the picture for almost 10 years or so.
Yeah, that's a big simplification and I get it. But the async syntax itself is syntax "sugar" for Promises. It's not like C# or Java/Android where it will spawn a thread. If you take a JSON of 1000 rows and attach a promise/await to each of them, you won't hit the next event loop until they all run to completion.
It's a common misconception that asynchronous means "run in background". It doesn't. It means run at end of current call stack.
And you STILL have to call setTimeout in your async executions or else you will stall your UI.
Again, async is NOT background. It's run later. async wraps Promise which wraps queueMicrotask.
Here is a stack overflow that explains it more in detail.
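A rough sketch of the point about 1000 rows, assuming a current Node.js or browser runtime. The helper names (yieldToEventLoop, processRows, demo) are invented for this example, and the 0 ms timer merely stands in for UI work waiting on the event loop.

```javascript
// Resolves on a later turn of the event loop (a macrotask), unlike `await null`.
const yieldToEventLoop = () => new Promise((resolve) => setTimeout(resolve, 0));

// Processes `rows` items; a `yieldEvery` of 0 means never yield to the event loop.
async function processRows(rows, yieldEvery, log) {
  for (let i = 0; i < rows; i++) {
    await null; // only a microtask hop: timers and rendering still have to wait
    if (yieldEvery > 0 && (i + 1) % yieldEvery === 0) {
      await yieldToEventLoop();
    }
  }
  log.push("rows done");
}

// Starts a 0 ms timer (standing in for UI work), then processes 1000 rows.
async function demo(yieldEvery) {
  const log = [];
  const timer = new Promise((resolve) =>
    setTimeout(() => {
      log.push("timer fired");
      resolve();
    }, 0)
  );
  await Promise.all([processRows(1000, yieldEvery, log), timer]);
  return log;
}

demo(0)
  .then((log) => {
    console.log("never yielding:", log.join(", "));
    return demo(100);
  })
  .then((log) => console.log("yielding every 100 rows:", log.join(", ")));
```

Without the setTimeout-based yield, all 1000 awaits drain from the microtask queue before the timer gets a turn, so "rows done" is logged first. With a yield every 100 rows, the timer fires before the rows finish.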
I’m well aware how async works in the single threaded js environment. All code blocks the main thread! Calling await on an async operation yields back.
You’re right, async is commonly mixed up with multi-threaded. And in fact in many languages the two things work hand in hand. I’m very aware of how it works in JavaScript.
We are agreeing. Don’t need more info.
If you need to get multiple pieces of data for one request Async is great, but why would you work on different requests in the same thread? Why slow down one request because the other needs a bunch of computation?
You aren't slowing down anything. If you didn't use async that thread would be blocked.
You'd need a thread per request even though they are sat doing nothing while waiting for responses.
Instead when you hit an await that thread is freed for other work and when the wait is over the rest of the code is scheduled to run.
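As an illustration of a thread being freed at an await, here is a small sketch with two simulated requests sharing one thread. fakeIo, handleRequest, and serveTwo are hypothetical names, and the timer delays stand in for real database or network waits.

```javascript
// Hypothetical stand-in for a database or web request: resolves after `ms`.
const fakeIo = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Handles one request; `events` records what the single thread did, in order.
async function handleRequest(id, ioMs, events) {
  events.push(`request ${id}: started`);
  await fakeIo(ioMs); // the thread is released here until the IO completes
  events.push(`request ${id}: finished`);
}

// Serves two requests concurrently on one thread.
async function serveTwo() {
  const events = [];
  await Promise.all([
    handleRequest("A", 50, events),
    handleRequest("B", 10, events),
  ]);
  return events;
}

serveTwo().then((events) => console.log(events.join("\n")));
```

Request B starts while A is still waiting and finishes first, even though no second thread exists. Neither request slows the other down unless one of them does heavy computation between awaits.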
Because the alternative is a series of ridiculously nested call backs that make code hard to read and manage?
I honestly can't fathom how anyone would dislike async programming.
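For comparison, a minimal sketch of the same two-step lookup written with nested callbacks and with async/await. getUser and getOrders are hypothetical functions in the callback(err, response) style mentioned earlier in the thread, and promisify here is a hand-rolled helper (Node ships a similar util.promisify).

```javascript
// Hypothetical callback-style API in the Node callback(err, response) shape.
function getUser(id, callback) {
  setTimeout(() => callback(null, { id, name: "user" + id }), 0);
}
function getOrders(user, callback) {
  setTimeout(() => callback(null, [user.name + "-order-1"]), 0);
}

// Callback version: every dependent step adds a level of nesting.
function summaryWithCallbacks(id, callback) {
  getUser(id, (err, user) => {
    if (err) return callback(err);
    getOrders(user, (err, orders) => {
      if (err) return callback(err);
      callback(null, `${user.name}: ${orders.length} order(s)`);
    });
  });
}

// Wraps a callback(err, response) function so it returns a Promise instead.
const promisify = (fn) => (...args) =>
  new Promise((resolve, reject) =>
    fn(...args, (err, result) => (err ? reject(err) : resolve(result)))
  );

// async/await version: the same steps read top to bottom.
async function summaryWithAwait(id) {
  const user = await promisify(getUser)(id);
  const orders = await promisify(getOrders)(user);
  return `${user.name}: ${orders.length} order(s)`;
}

summaryWithAwait(7).then((text) => console.log(text));
// prints "user7: 1 order(s)"
```

Both versions do the same work in the same order. The difference is only in how the waiting is written down.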
Async is good because threads are expensive; you might as well do something else when you need to wait for something anyway.
But only having async and no other thread when you need some computation is obviously awful (or when starting another thread is not easily manageable).
That's why I like Go: you just tell it you want to run something in parallel and it will manage the rest. Computational work: shift the current work to a new thread. Just waiting for IO: async.
The "do something while waiting for something else" is not a reason to use async. That's why blocking system calls and threads exist.
Threads don't need to be expensive. Max stack usage can be determined statically before choosing the size when spawning a thread.
Any other reasons?
If you compare the performance of async Rust vs. Rust with blocking syscalls there's not even a comparison. 'epoll' and the like are vastly more performant than blocking system IO; async is then simply a way to make that kind of system interface nice to program with, as you can ignore all that yielding and waking up and write straight-line code.
Now, if all you do is read a config file, yes, all that is absolutely overkill. If you're actually doing serious io though there's no way around this kind of stuff.
I assume by performance you mean CPU usage per io request. Each io call should require a switch to the kernel and back. When you do blocking io the switch back is delayed (switch to other threads while waiting), but not more taxing. How could it be possible for there to be a difference?
Because the kernel doesn't like you spawning 100k threads. Your RAM doesn't, either. Even all the stacks aside, the kernel needs to record everything in data structures which now are bigger and need longer to traverse. Each thread is a process which could e.g. be sent a signal, requiring keeping stuff around that rust definitely doesn't keep around (async functions get compiled to tight state machines).
Specifically with io_uring: You can fire off quite a number of requests, not incurring a context switch (kernel and process share a ring buffer), and later on check on the completion status of quite a number, also without having to context switch. If you're (exceedingly) lucky no io_uring call ever causes a context switch, as the kernel will work on that queue on another cpu. The whole thing is memory, not CPU, bound.
Anyhow, your mode of inquiry is fundamentally wrong in the first place: It doesn't matter whether you can explain why exactly async is faster (I probably did a horrible job and got stuff wrong), what matters is that benchmarks blow blocking io out of the water. That's the ground truth. As programmers, as a first approximation, our ideas and models of how things work are generally completely wrong.
Why do you say this?
Not if your stacks per thread are small.
These data structures must exist either in userland or the kernel. Moving them to the kernel won't help anything. Also, many of these data structures scale at log(n). Splitting half the elements to userland and keeping the other half gives you two structures with log(n/2), so 2log(n/2) = log(n^2/4). Clearly that's worse.
If signals were the reason async worked better, then the correct solution is to enable threads that opt-out of signals. Anything that slows down threads that isn't present in an async design should be opt-out-able. The state-machines that async compiles to, do not appear inherently superior to multiple less stateful threads managed by a fast scheduler.
As described here you would still need to do a switch to kernel mode and back for the syscalls. The extra work required from assuming processes are hostile to each other should be easy to avoid among threads known to have a common process as they are obviously not hostile to each other and share memory space anyway. The synchronization required to handle multiple tasks should be the same regardless if they are being run on the same thread by a user land scheduler or if they are running on multiple threads with an os scheduler.
I'm not interested in saying that async is the best because it appears to work well currently. That's not the right way to decide the future of how to do things. That's just a statement of how things are. I agree, if your only goal is to get the fastest thing now with no critical thought, then it does appear that async is faster. I am unconvinced it must fundamentally be the case.
Have you tried?
Page size is 4k; it doesn't get smaller. The kernel can't give out memory in more fine-grained amounts, at least not without requiring a syscall on every access, which would be prohibitively expensive.
That's what async does. It opts out of all the things, including having to do context switches when doing IO.
No, you don't: You can poll the data structure and the kernel can poll the data structure. No syscalls required. Kernel can do it on one core, the application on another, so in the extreme you don't even need to invoke the scheduler.
You can e.g. have a look at whether you can change the hardware to allow for arbitrarily small page sizes. The reaction of hardware designers will be first "are you crazy", then, upon explaining your issue, they'll tell you "well then just use async what's the problem".
Well too bad cause they are.
Go ahead and spin up a web worker and transfer a bunch of data to it and tell us how long you had to wait.
The only way I have heard threads are expensive, in the context of handling many io requests, is stack usage. You can tell the os to give less memory (statically determined stack size) to the thread when it's spawned, so this is not a fundamental issue to threads.
Time to transfer data to one thread is related to io speed. Why would this have anything to do with concurrency model?
Well I just told you another one, one actually relevant to the conversation at hand, since it's the only one you can use with JavaScript in the context of a web browser.
You can't say async is the fundamentally better model because threading is purposely crippled in the browser.
The conversation at hand is not "how to do io in the browser". It's "async is not inherently better than threads".
No, because async is fundamentally a paradigm for how to express asynchronous programming, i.e. situations where you need to wait for something else to happen. Threading is not an alternative to that; callbacks are.
Threads are callbacks.
Ok, I'm a c# developer and I use async await quite extensively. Is it different in JS? Or am I missing something?
Nah, they're very similar, really. You generally kick IO heavy stuff you don't need immediately off to async await.
There are a few more applications of it in C# since you don't have the "single thread" to work with like in JS. And the actual implementation under the hood is different, sure. But conceptually they're similar. Pretty sure JS was heavily influenced by C#'s implementation and syntax.
Async rust with the Tokio Framework is pretty cool. Need none of that JS bloat for async.
Honestly I can't wrap my head around how to effectively put computation into a thread, even with Tokio.
All I want is something like rayon where you got a task queue and you just yeet tasks into a free thread, and await when you actually need it
Might be too much JS/TS influence on me, or that I can't find a tutorial that would explain in a way that clicks for me
Tokio specifically says not to use it for CPU intensive tasks and rayon would be better for this: https://tokio.rs/tokio/tutorial
Tokio is for concurrency, not parallelism. Use it for IO stuff. They say rayon is good for that, but I haven't used that. If you just want something simple, I'd recommend working with threadpool.
Async Rust sucks. I hate how many libraries use it, forcing it apon you.
You suck, I hate how your comment was forced "apon" me. Anyone who claims that things they could easily avoid, if they're so opinionated against them, are "forced upon" them is always a pathetic person.
Have you programmed with rust a day in your life? Once you introduce one library that requires Tokio async you have to start wrapping all your calls that involve it in async shit.
So many better concurrency patterns out there.
And these libraries are not easily avoidable. Ex: most AWS libraries require it.
And forgive me for a stupid typo, I have had little sleep the last week but you are an asshole that thinks belittling people somehow makes you right so it doesn't really matter.
Imagine a web server or proxy that creates a new thread for every incoming request 💣
Yes, you're right, if it's a second or third thread it is/may be fine. But if you're i/o bound and your application has to manage quite a lot of that stuff in parallel, there is no way around delegating that workload to the kernel's event loop. Async/await is just a very convenient abstraction of that resource.
async/await is just callback() and queueMicrotask wrapped up into a neat package. It's not supposed to replace multi-threading and confusing it for such is dangerous since you can still stall your main/UI thread with Promises (which async also wraps).
(async and await are also technically different things, but for the sake of simplicity here, consider them a pair.)
Async got popular when the choices for clientside code in the browser were "Javascript" or "go fuck yourself." It's an utterly miserable way to avoid idiotic naive while() stalling. But it was the only way.