The fact that OpenAI stole content from everybody in order to make its model doesn’t make it less infringing.
Totally in agreement with you here. They did something wrong and should have to deal with that.
But my question is more about...
The problem with AI as it currently stands is that it has no actual comprehension of the prompt, or ability to make leaps of logic, nor does it have the ability to extend and build upon existing work to legitimately transform it, except by using other works already fed into its model
Is comprehension necessary for copyright infringement? Is it really about whether a creator can reason logically or extend concepts?
I think we have a definition problem with exactly what the issue is. This may be a little too philosophical, but what part of you isn't processing your historical experiences and generating derivative works? When I say "dog", the thing that pops into your head is an amalgamation of your past experiences and visuals of dogs. Is the only difference between you and a computer the fact that you had experiences with non-created works while the AI is explicitly fed created content?
AI could be created with a bit of randomness added in to make what it generates "creative" instead of derivative, but I'm wondering what level of pure noise needs to be added for the output to be considered created by the AI. Can any of us truly create something that isn't in some part derivative?
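For what it's worth, that "bit of randomness" is already a common knob in text generators, usually called temperature. Here is a rough sketch of the idea, not how any particular product implements it; the function name `sample_next` and the scores are made up for illustration.

```python
import math
import random

def sample_next(scores, temperature, rng):
    """Pick a key from `scores`; higher temperature means more randomness."""
    if temperature <= 0:
        # No randomness at all: always take the highest-scoring option.
        return max(scores, key=scores.get)
    words = list(scores)
    top = max(scores.values())
    # Softmax-style weights: dividing by temperature flattens or sharpens them.
    weights = [math.exp((scores[w] - top) / temperature) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores for the word that follows "the dog"
scores = {"barked": 2.0, "slept": 1.0, "sang": -1.0}
rng = random.Random(0)
print(sample_next(scores, 0, rng))  # always "barked"
print([sample_next(scores, 1.5, rng) for _ in range(5)])
```

At temperature 0 the output is fully determined by the scores. Turning it up only makes less likely words come out more often, so the noise changes which learned option gets picked without adding any options that weren't in the scores to begin with.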
There’s little actual fundamental difference between what ChatGPT does and what a procedurally generated game like most roguelikes do
Agreed. I think at this point we are in a strange place because most people think ChatGPT is a far bigger leap in technology than it truly is. Its biggest achievement was being able to process synthesized data fast enough to make it feel conversational.
What worries me is that we will set laws and legal precedent based on a fundamental misunderstanding of what the technology does. I fear that even had all the sample data been acquired legally, people would still make the same argument, thinking their creations exist inside the AI in some full context, when it's really just synthesized down to what is necessary to answer the question posed: "what's the statistically most likely next word of this sentence?"
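To make the "most likely next word" idea concrete, here is a toy sketch that just counts word pairs. It is nowhere near what ChatGPT actually does (that is a large neural network working on tokens), and the names `train_bigrams` and `most_likely_next` and the corpus are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def most_likely_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up training text
corpus = "the dog chased the cat and the dog barked at the mailman"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # dog
```

What this toy keeps after training is a table of follow-up counts, not the sentence it was built from. How far that intuition carries over to a model the size of ChatGPT is the part I think people are actually arguing about.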