In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.
In that respect, we are absolutely in agreement. We are meat computers in meat cages containing necessary support systems. That statement was, perhaps, an oversimplification.
Things like LLMs are attempts to model how the human brain works, but they are not identical to it, nor are LLMs, by themselves, capable of intelligence. If one argues instead that feeding data into an LLM and using it to produce something is the same as what a human author does, then the person using the LLM is clearly not the author, and claiming otherwise is plagiarism of the work of either the creator of the LLM or the LLM itself.
The argument that, legally, IP owners cannot specify that their works may not be used as feedstock for competing commercial products is rather absurd itself and would invalidate all but the most permissive open-source licenses as well as proprietary licenses. As pointed out elsewhere, this line of thought would allow one to steal leaked source code and use it to effectively clone existing software. Use of the source in this manner would be infringing on the owner’s IP rights.
Perhaps a good way to think about LLMs is as automated reverse engineering. They take data and statistically model it in order to characterize it. There is substantial case law on reverse engineering, and the EFF has a great FAQ on the topic: https://www.eff.org/issues/coders/reverse-engineering-faq