Meta, OpenAI, Anthropic, and others allegedly pirated millions of authors' works. Find out if your books and writings are affected and what you can do about it.
I’m still thinking through a lot of this, especially the questions around ownership and what permission even means online. It’s a complicated issue and I can see both sides.
It is complicated, Bette. But at face value, Neela's term, shoplifting at scale, rings true. I don't mind AI scraping the content that I create on the interwebs. In fact, I want it to so that I show up in search results. But my books? That's another matter.
Good coverage of the issue, Paul. My view: taking a person's labor without consent or compensation is THEFT, plain and simple, no matter how some may try to rationalize it. Remember Napster?
https://medium.com/@bairdbrightman/stop-thief-chatgpt-midjourney-ai-etc-6be4a058e098
Yep. I sure do. And LimeWire. And BitTorrent.
Thank you for raising awareness on this topic, Paul. This is plain theft, and it's sad that these LLMs take other people's work without their consent.
Another issue is that new authors can be tempted to use LLMs to write their own books and basically regurgitate the thoughts of other authors. I ran a test for my fantasy novel by asking ChatGPT to provide names for certain locations and characters, and most of them were names of locations and characters from other fantasy stories (I already knew most of them, as I devour fantasy books). LLMs do not create something new; they just respond based on the data that was fed into them. That's what authors who are tempted to write books using AI should understand.
Also, thank you for sharing practical steps on how to protect your work; this is very useful.
I think it depends on what you’re writing. A fantasy novel is one thing, but I write B2B technical content that would take hours or days to produce without AI’s help. It’s not copy/paste, but it saves a tremendous amount of time spent in research and ideation.
Start rant
AI companies LOVE to say, “We’re democratizing knowledge.” But whose knowledge, exactly? It’s not democratic if creators don’t consent or benefit. If the system requires piracy to progress, maybe we should question the system itself: not just its ethics, but its sustainability.
If I paint a moustache on the Mona Lisa, is it a new work, or is it still hers with my graffiti?
The argument that LLMs “transform” content feels intellectually lazy when the transformation is statistical, not creative. These models don’t reinterpret.
They remix. That matters to me.
I call it synthetic plagiarism at scale.
end rant
Happy Wednesday, Paul :)
Thanks, Neela. Happy Wednesday to you, too. You hit the nail on the proverbial head yet again. The pressure is building, so I have to believe the lawsuits will hold sway.
They will - I have been trying to write an article about AI ethics for almost 3 weeks now.
Hopefully I finish sometime soon lol
I can't wait to read it. I'll for sure share and perhaps cross-post to my readers.
I doubt "they" will give me the green light to publish but stranger things have happened lol
I hope they do.
This seems like a huge ethical consideration; thanks for sharing about it here.
It really is. Authors put in an enormous amount of time and effort, only to have AI scrape their content without so much as a "howdy-do." Um, no. As we say down south, "That dog won't hunt." There has to be some way to acknowledge or remunerate authors for use of their content.
Thanks for digging into this, Paul. I'm happy for the LLMs to train on all my free content (21 years of blog posts, for example). However, I don't agree to the stealing of my copyrighted, paid book content.
I'm with you there, David. There are advantages to training -- SEO (or AIO), for example -- but books we toiled over with blood, sweat, and tears, for which we were paid, are another matter entirely. Thanks for weighing in.
Well, matey, my vote is piracy theft! (And, yes, my book and a slew of my articles are found in LibGen...arr!)
Enjoyed the essay--good info and practical takeaways!
Thank you, Dee. Love it that you got my kitsch.