this post was submitted on 19 Apr 2024
508 points (98.1% liked)
Programmer Humor
19623 readers
5 users here now
Welcome to Programmer Humor!
This is a place where you can post jokes, memes, humor, etc. related to programming!
For sharing awful code theres also Programming Horror.
Rules
- Keep content in english
- No advertisements
- Posts must be related to programming or programmer topics
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I never looked into this, so I have some questions.
Isn't the overhead of a new function every time going to slow it down? Like I know that LLVM has special instructions for Haskell-functions to reduce overhead, but there is still more overhead than with a branch, right? And if you don't use Haskell, the overhead is pretty extensive, pushing all registers on the stack, calling new function, push buffer-overflow protection and eventual return and pop everything again. Plus all the other stuff (kinda language dependent).
I don't understand what advantage is here, except for stuff where recursive makes sense due to being more dynamic.
They aren't talking about using recursion instead of loops. They are talking about the map method for iterators. For each element yielded by the iterator, map applies a specified function/closure and collects the results in a new iterator (usually a list). This is a functional programming pattern that's common in many languages including Python and Rust.
This pattern has no risk of stack overflow since each invocation of the function is completed before the next invocation. The construct does expand to some sort of loop during execution. The only possible overhead is a single function call within the loop (whereas you could have written it as the loop body). However, that won't be a problem if the compiler can inline the function.
The fact that this is functional programming creates additional avenues to optimize the program. For example, a chain of maps (or other iterator adaptors) can be intelligently combined into a single loop. In practice, this pattern is as fast as hand written loops.
A great point in favour of maps is that each iteration is independent, so could theoretically be executed in parallel. This heavily depends on the language implementation, though.
Technically this is also possible with for loops, like with OpenMP
Imperative for loops have no guarantee at all that iterations could be executed in parallel.
You can do some (usually expensive, and never complete) analysis to find some cases, but smart compilers tend to work the best the dumbest you need them to be. Having a loop that you can just blindly parallelize will some times lead to it being parallel in practice, while having a loop where a PhD knows how to decide if you can parallelize will lead to sequential programs in practice.
While you do have a fair point, I was referring to the case where one is basically implementing a map operation as a for loop.
Compiler optimizations like function inlining are your friend.
Especially in functional languages, there are a lot of tricks a compiler can use to output more efficient code due to not needing to worry about possible side effects.
Also, in a lot of cases the performance difference does not matter.
I'm not familiar with any special LLVM instructions for Haskell. Regardless, LLVM is not actually a commonly used backend for Haskell (even though you can) since it's not great for optimizing the kind of code that Haskell produces. Generally, Haskell is compiled down to native code directly.
Haskell has a completely different execution model to imperative languages. In Haskell, almost everything is heap allocated, though there may be some limited use of stack allocation as an optimization where it's safe. GHC has a number of aggressive optimizations it can do (that is, optimizations that are safe in Haskell thanks to purity that are unsafe in other languages) to make this quite efficient in practice. In particular, GHC can aggressively inline a lot more code than compilers for imperative languages can, which very often can eliminate the indirection associated with function calls entirely. https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/generated-code goes into a lot more depth about the execution model if you're interested.
As for languages other than Haskell without such an execution model (especially imperative languages), it's true that there can be the overhead you describe, which is why the vast majority of them use iterators to achieve the effect, which avoids the overhead. Rust (which has mapping/filtering, etc. as a pervasive part of its ecosystem) does this, for example, even though it's a systems programming language with a great deal of focus on performance.
As for the advantage, it's really about expressiveness and clarity of code, in addition to eliminating the bugs so often resulting from mutation.
Interesting.
So it basically enables some more compiler magic. As an embedded guy I'll stay away from it, since I like my code being translated a bit more directly, but maybe I'll look into the generated code and see if I can apply some of the ideas for optimizations in the future.
I looked at the post again and they do talk about recursion for looping (my other reply talks about map over an iterator). Languages that use recursion for looping (like scheme) use an optimization trick called 'Tail Call Optimization' (TCO). The idea is that if the last operation in a function is a recursive call (call to itself), you can skip all the complexities of a regular function call - like pushing variables to the stack and creating a new stack frame. This way, recursion becomes as performant as iteration and avoids problems like stack overflow.
Not just calls to self - any time a function’s last operation is to call another function and return its result (a tail call), tail call elimination can convert it to a goto/jump.
Some languages have to optimize it with various tricks. There's a good reason why I call heavily functional "programmer wankery". It took me a while to run into an issue that was caused by a variable modified in a wrong way, which I fixed by saving the value of the variable before a call that seems to alter it. Probably I should have instead properly fix it so I could understand the actual root cause, but I have limited time to spend on things.