this post was submitted on 10 Mar 2024
24 points (92.9% liked)

No Stupid Questions (Developer Edition)

You see this with some apps (I think ReVanced is a popular example?) and games occasionally, and I've never been clear on how they do it.

[–] [email protected] 22 points 8 months ago (4 children)

Compiled binaries can be decompiled back into source code. It's not perfect by any means, but I was very surprised by how well it worked the first time I decompiled a .NET application. With the decompiled code as your base, you can then make changes and recompile a new binary. This glosses over a lot of detail, and there are other ways too, like obtaining a leaked copy of the source code.
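To make the "change it and rebuild" idea concrete, here is a minimal sketch that uses Python bytecode as a stand-in, since it can be inspected and patched with only the standard library. This is an analogy, not how any particular app mod is built: the `license_tier` function and the "trial"/"full" strings are made up for illustration, and real-world patching of .NET or Android apps uses different tooling.

```python
import types

# Hypothetical function standing in for a check inside a compiled app.
def license_tier():
    return "trial"

# The compiled form of the function is a code object; its constants
# (including the string "trial") are stored in co_consts.
original_code = license_tier.__code__

# Build a new constants tuple with the string swapped out.
patched_consts = tuple(
    "full" if const == "trial" else const
    for const in original_code.co_consts
)

# CodeType.replace (Python 3.8+) returns a copy of the code object with
# the given fields changed; the bytecode instructions stay the same.
patched_code = original_code.replace(co_consts=patched_consts)

# Wrap the patched code object in a new function so it can be called.
patched_license_tier = types.FunctionType(patched_code, globals())

print(license_tier())          # original behaviour
print(patched_license_tier())  # patched behaviour
```

The source of `license_tier` is never edited here. Only the compiled representation is changed, which is the same basic move as patching a shipped binary.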

[–] [email protected] 17 points 8 months ago

Yeah, it's particularly easy with Java and C#, as they don't compile all the way to machine code, but rather just to an intermediate representation (bytecode).

[–] [email protected] 14 points 8 months ago

The reason this works well for certain applications and not others comes down to programming language / framework and compilation optimization.

If the application was compiled directly into native machine code and optimized, it can still be decompiled, but the output won't be very human-readable: variable names, comments, and much of the original structure are usually gone. Programmers have to delve in and manually trace the code paths to figure out how it works. Fun fact: this is how a lot of the retro game decompilation projects work. Teams of volunteers go through the unreadable decompilations and work together to figure them out.

Dotnet and Java based applications are easier, because they don't usually get compiled directly into machine-executable binaries, and even when they do, they're often still easier to decompile. Instead, both are compiled to an intermediate language (IL) that is lower-level than the source but keeps a lot of its structure, such as class and method names, and that IL is then run by a runtime. Dotnet's IL is called Common Intermediate Language (CIL) and Java's is called Java bytecode. This sounds weird, but it's kinda cool, because it lets people create new languages without having to write a full native compiler. The new language's compiler only has to emit the intermediate language (that's how Kotlin and Scala run on the JVM, and F# on .NET), and the existing runtime takes it from there.
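As a rough illustration of why intermediate representations decompile so well, here is a small sketch using Python's standard `dis` module. Python bytecode is not CIL or Java bytecode, so treat it as an analogy only, and `apply_discount` is a made-up example function. The point is that the compiled form still carries the function name, the variable names, and the names of the functions it calls.

```python
import dis

# Hypothetical example function; any ordinary function would do.
def apply_discount(price, percent):
    discounted = price - price * percent / 100
    return round(discounted, 2)

# The compiled form of the function is stored in its code object.
code = apply_discount.__code__

# Names survive compilation: the function name, its arguments and
# locals, and the global names it refers to.
print("function name:", code.co_name)
print("local names:  ", code.co_varnames)
print("global names: ", code.co_names)

# Human-readable listing of the bytecode instructions. The exact
# opcodes differ between Python versions.
dis.dis(apply_discount)
```

With that much information left in the compiled form, a decompiler has a lot to work with. A stripped, optimized native binary usually keeps none of these names.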

[–] [email protected] 7 points 8 months ago* (last edited 8 months ago)

That's because .NET (by default) compiles to IL, which the JIT then compiles to machine code at run time.

Once code has been compiled to machine code, you're unlikely to get anything close to the original source back. Usually the best you get is assembly.

[–] [email protected] 4 points 8 months ago (1 children)

Are the tools involved typically called decompilers, or would you happen to know the different names they may go by? Trying to make sure I have some solid terms to guide my own research. Thanks for the response!

[–] [email protected] 6 points 8 months ago

Yep, decompiler is the correct term