Smells like bullshit. The graphs in the source paper, showing something like 100% accuracy on every test, make it seem even more like bullshit. Did they evaluate the model on its own training data or what?
Maybe I'm wrong, but text is just too high a signal-to-noise medium to tell whether it was written by an AI. The false positive rate would be high enough to make a detector effectively useless. Does anyone have another perspective on this? If I'm missing some nuance here I'd love to understand more.
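
To put rough numbers on the false positive worry, here is a minimal base-rate sketch. The function name and every rate in it are made up for illustration; none of them come from the paper. It just applies Bayes' rule to ask: of all the texts a detector flags, what fraction were actually written by an AI?

```python
def flagged_precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Fraction of flagged texts that are really AI-written (Bayes' rule).

    tpr      -- share of AI-written texts the detector flags (hypothetical)
    fpr      -- share of human-written texts it wrongly flags (hypothetical)
    ai_share -- share of all texts that are AI-written (hypothetical)
    """
    true_pos = tpr * ai_share          # AI texts that get flagged
    false_pos = fpr * (1 - ai_share)   # human texts that get flagged
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Assumed scenario: 5% of submitted texts are AI-written, and the
    # detector catches 99% of them. Only the false positive rate varies.
    for fpr in (0.01, 0.05, 0.10):
        p = flagged_precision(tpr=0.99, fpr=fpr, ai_share=0.05)
        print(f"fpr={fpr:.2f} -> {p:.1%} of flagged texts are actually AI")
```

The point is that when most text is human-written, even a small false positive rate means a large share of the flagged texts are wrongly accused humans, which is the "effectively useless" scenario. Whether real detectors land in that regime depends on numbers I don't have.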