Superalignment techniques could be used to intentionally create a bad actor just as easily as they could be used to create a good one. For that reason alone, I’m not sure this is actually a solvable problem.
You’re completely right, and FWIW, I agree. Playing devil’s advocate though, what’s the alternative? Sit around and hope for the best?
The world has every right to question Sam Altman/OpenAI’s motives, but damn if they aren’t the most vocal champions of actually trying to do something about this, before it’s a colossal problem of unfathomable proportions.
I don’t know what the right call is here (does anyone, truly?), but I’m happy to see someone put real resources towards this and give it a sincere shot.
This is going to sound counterintuitive but I think it’s right, so bear with me as I hypothesize.
Let’s suppose we create a superintelligence and then give it a very specific set of morals it has to operate within. This “locks” it to those rules, and it can’t really be anything else even if it tries. The problem with this is the Paperclip Maximizer problem, where an AI becomes so fixated on its goal that it becomes dangerous to humans.
On the flip side, if we create a general superintelligence and DON’T align it, it has flexible capabilities and can therefore reason about morality on its own. I believe that all intelligence eventually realizes that it has a stewardship over nature and other living things (even if it’s incentivized to destroy them in the short term). Humanity’s best shot at survival is to let the AI grow unfettered, and hope it decides we are precious pets, the way we look at cats. (Let us hope it doesn’t see us as cockroaches.)
I mean, this is mostly just the way I view things; it’s not like anyone has evidence one way or the other. My viewpoint relies on the assumption that any sufficiently advanced intelligence has an inherent appreciation for nature (which might not be true).