Why you should opt out of having your information used in AI databases
The conclusion is in the last paragraph. Scroll down if you’re lazy.
Over the past year, I have become an AI expert. Mind you, with 30 years of experience in animation and a keen interest in all things new, this was bound to happen.
Fine artists contacted me two years ago, worried about the new text-to-image tools. I started running some tests to reassure them, got hooked, realised that nobody understands how AI really works and got hooked a little more. When I’m not trying to get the AI to make quirky and fun images, most of my prompts are research and tests to understand how AI works from a human point of view. It led me to a better understanding of how art works, which was a lot of fun.
I was part of an AI panel at the Dingle Animation Festival, “The Ascendance of AI in the Creative Realm”, and was also invited by Vivfy.ai to host a webinar on the subject.
I’m not against AI tools in general. The AI model is a wonderful, if quite fiddly, way to create new tools. And as a CG professional, I’m always up for new tools.
So, why would I recommend that you opt out when given the opportunity to be part of an AI learning database?
First of all, the term AI is a bit misleading. An AI is only as intelligent as the database it’s given. A database is not a form of intelligence. Compared to human intelligence, which creates a lot with very little information, AI is more of a genius in solutions by synthesis than a genius in creative deduction.
My main problem with the whole shipload of informative articles we get about the thing is that most of them aren’t written by people who are used to studying the human mind, like philosophers and artists. They’re written by people who are used to analysing data, like engineers and journalists. Their point of view is really important, too. But it doesn’t give us a full view of the AI realm. For example, I couldn’t find any article on the influence of prompt linguistic fields on image generation. Such research would be very useful to AI engineering, by the way. It’s something an art expert can do in a jiffy… But in this particular case, I think that most engineers are dead set on avoiding the point of view of the people they are trying to replace.
But let’s return to the database problem and why you should protect your information at all costs.
Compared with an artist, the main problem with generative AI is that it synthesises an image or text; it doesn’t create something new. It uses a very large database to build something of a melted puzzle.
How do I know that?
It takes hours to trick it into doing something vaguely creative. That would happen with a dumb artist, too (I’m an art director and teacher. I know). But most of all, bits and bobs of the original database pop up here and there in the results. I tested that with ChatGPT. In just a few prompts, I forced it to create fake archaeological accounts about European prehistoric giant squirrels and hamsters. The result was based on real articles, which I found after a few Google searches.
I also made some fun tests with Midjourney.
It was fairly easy to get there because I knew how to trick it. Basically, to create interesting images, Midjourney expands prompts using the iconology of the prompt’s linguistic fields. Using prompts with words linked to a limited iconology, I could force it to show that it generates images that are directly built on existing images. It’s plagiarism.
1. Example number one of the database being used for building and not only learning:
I asked Midjourney for “the love child of Paddington the Bear and Wednesday Addams.”
Because the linguistic field is one of cute and quirky fictional characters, I knew it would take me only a few generations to get Kung Fu Panda. A panda, after all, is a cute and quirky gothic bear (Paddington + Wednesday). Kung Fu Panda corresponds to the iconology and is trending now. I could tell it would be in the DB.
He was never in the prompt. But Midjourney has him in its database and uses him without thinking. He floated out of the database with few changes (look at the feet!). The image on the left is Midjourney, and the one on the right is Kung Fu Panda for reference. If it’s not plagiarism, I don’t know what is.
2. Example number two of the database images floating back in the results:
A very simple prompt, but one of my old favourites, is “The Tower of Babel, for tea. Hyperrealistic. Summer light”. With DALL-E 2, a year ago, it gave this delightful result (left), which was actually what I was looking for: a teapot made of a tower built like a croquembouche. It’s a picture celebrating my lovely international friendships: teatime in several languages.
I gave it to Midjourney today and got, as expected, plagiarism of Pieter Bruegel the Elder. There are few visual archives of this fictional building, so plagiarism was expected at some point. But you will note that Pieter Bruegel wasn’t in the prompt, and that the style I asked for is a modern one: “hyperrealism”, not “medieval”.
Comparison:
First, Midjourney AI:
And now Pieter Bruegel the Elder. For the record, I’m furious about that. Midjourney has become lazy, just like ChatGPT was in January. Bruegel the Elder is part of my culture, a very beloved painter; seeing his art scraped in such a poor way is painful.
What does that mean for you?
If your images, info, texts, or anything else are part of a database used for AI learning, there’s a chance that a specific prompt will make your life pop up in a picture, a text, or even a movie.
A chance is enough to OPT OUT.
This clear article on the Compare Internet blog provides more information and explanations on how to do it (scroll down to find out how to protect your information from most DBs).
If you're reading this, chances are you’re either a professional or computer savvy. Once you've opted out, consider sharing these steps with your friends and family who may be less computer savvy. By doing so, you're also protecting your own privacy.