An image of an archeologist adventurer who wears a hat and uses a bullwhip

https://theaiunderwriter.substack.com/p/an-image-of-an-archeologist-adventurer

ionwake
Not sure if anyone is interested in this story, but I remember at the height of the PokemonGo craze I noticed there were no shirts for the different factions in the game. I can't remember what they were called, but something like Teamred or something. I set up an online shop just to sell a red shirt with the word on it. The next day my whole shop was taken offline for potential copyright infringement.

What I found surprising is that I didn't even have one sale. Somehow someone had notified Nintendo AND my shop had been taken down, for selling merch that didn't even exist on the market and, if I remember correctly, didn't even have any imagery on it or anything trademarkable, even if it was clearly meant for PokemonGo fans.

I'm not bitter, I just found it interesting how quick and ruthless they were. Like bros, I didn't even get a chance to make a sale. (Yes, and also I don't think I infringed anything.)

District5524
I asked Sora to turn a random image of my friend and myself into Italian plumbers. Nothing more, just the two words "Italian plumbers". The created picture was not shown to me because it was in violation of OpenAI's content policy. I then asked just to turn the guys in the picture into plumbers, but I asked this in Italian. Without me asking for it, Sora put me in overalls and gave me a baseball cap, and my friend another baseball cap. When I asked Sora to put a mustache on us, one of us received a red shirt as well, without being asked to. Starting with the same pic, when I asked to put one letter on each of the baseball caps, guess what: the letters chosen were M and L. These extra guardrails are not really useful when these image creation tools have such a strong, built-in bias towards copyright infringement. Should it mean that with time, Dutch pictures will have to include tulips, Italian plumbers will have to have a uniform with baseball caps with L and M, etc. just not to confuse AI tools?
Cthulhu_
You (and the article, etc) show what a lot of the "work" in AI is going into at the moment - creating guardrails against creating something that might get them in trouble, and / or customizing weights and prompts under water to generate stuff that isn't the obvious. I'm reminded of when Google's image generator came out and this customization bit them in the ass when it generated a black pope or Asian Vikings. AI tools don't do what you wish they did, they do what you tell them and what they are taught, and if 99% of their learning set associates Mario with prompts for Italian plumbers, that's what you'll get.

A possible (probably already existing) business is setting up truly balanced learning sets, that is, thousands of unique images that match the idea of an Italian plumber, with maybe 1% of Mario. But that won't be nearly as big a learning set as the whole internet is, nor will it be cheap to build compared to just scraping the internet.

rurp
>> they do what you tell them and what they are taught, and if 99% of their learning set associates Mario with prompts for Italian plumbers, that's what you'll get.

I thought that a lot of the issues were the opposite of this, where Google put their thumb on the scale to go against what the prompt asked. Like when someone would ask for a historically accurate picture of a US senator from the 1800s and repeatedly get women and non-white men. The training set for that prompt has to be overwhelmingly white men so I don't think it was just a matter of following the training data.

feoren
I remember all the hullaballoo about Asian Vikings and the like. It was so preposterous that Vikings would ever be Asian that it must be ultra-woke DEI mind-worms being forced onto AI! But of course, as far as the AI's concerned, it is even more preposterous that an Italian plumber would not be wearing red or green overalls with a mustache and a lettered baseball cap. I don't see any way you can get the AI to recognize that Vikings "should" be white people and not also think that Italian plumbers "should" look like that. Are they allowed to recombine their training data or must they strictly adhere to only what they've seen?

Of course the irony is that if the people who get offended whenever they see images of non-white people asked for a picture of "Vikings being attacked by Godzilla", they'd get worked up if any of the Vikings in the picture were Asian (how unrealistic!). It's a made-up universe! The image contains a damn (Asian) Kaiju in it, and everyone is supposed to be pissed because the Vikings are unrealistic!?

barbazoo
I feel like the golden and fun age of GenAI is already over.
echelon
OpenAI will eventually have competition for GPT 4o image generation.

They'll eventually have open source competition too. And then none of this will matter.

OmniGen is a good start, just woefully undertrained.

The VAR paper is open, from ByteDance, and supposedly the architecture this is based on.

Black Forest Labs isn't going to rest on their laurels. Their entire product offering just became worthless and lost traction. They're going to have to answer this.

I'd put $50 on ByteDance releasing an open source version of this in three months.

weinzierl
Many years ago I tried to order a t-shirt with the postscript tiger on the front from Spreadshirt.

It was removed on Copyright claims before I could order one item myself. After some back and forth they restored it for a day and let me buy one item for personal use.

My point is: Doesn't have to be Nintendo, doesn't have to be a snitch - overzealous anticipatory obedience by the shop might have been enough.

Xmd5a
>After some back and forth they restored it for a day and let me buy one item for personal use.

I used Spreadshirt to print a panel from the Tintin comic on a T-shirt, and I had no problem ordering it (it shows Captain Haddock moving through the jungle, swatting away the mosquitoes harassing him, giving himself a big slap on the face, and saying, 'Take that, you filthy beasts!').

beardyw
I bought Tintin T-shirts 40 years ago in Thailand (the "branded" choices were amazing). They were actually really good, still got them!
yojo
Twenty years ago, I worked for Google AdWords as a customer service rep. This was still relatively early days, and all ads still had some level of manual human review.

The big advertisers had all furnished us a list of their trademarks and acceptable domains. Any advertiser trying to use one that wasn’t on the allow-list had their ad removed at review time.

I suspect this could be what happened to you. If the platform you were using has any kind of review process for new shops, you may have run afoul of pre-registered keywords.

ghostly_s
Well the teams in Pokemon Go aren't quite as generic as Teamred: they are Team Instinct, Team Mystic, and Team Valor. Presumably Nintendo has trademarks on those phrases, and I’m sure all the big print on demand houses have an API for rights-holders to preemptively submit their trademarks for takedowns.

Nintendo is also famously protective of their IP: to give another anecdote, I just bought one of the emulator handhelds on Aliexpress that are all the rage these days, and while they don't advertise it they usually come preloaded with a buttload of ROMs. Mine did, including a number of Nintendo properties, but nary an Italian plumber to be found. The Nintendo fear runs deep.

djoldman
I don't condone or endorse breaking any laws.

That said, trademark laws like life of the author + 95 years are absolutely absurd. The ONLY reason to have any law prohibiting unlicensed copying of intangible property is to incentivize the creation of intangible property. The reasoning being that if you don't allow people to exclude 3rd party copying, then the primary party will assumedly not receive compensation for their creation and they'll never create.

Even in the case where the above is assumed true, the length of time that a protection should be afforded should be no more than the length of time necessary to ensure that creators create.

There are approximately zero people who decide they'll create something if they're protected for 95 years after their death but won't if it's 94 years. I wouldn't be surprised if it was the same for 1 year past death.

For that matter, this argument extends to other criminal penalties, but that's a whole other subject.

csallen
> The ONLY reason to have any law prohibiting unlicensed copying of intangible property is to incentivize the creation of intangible property.

That was the original purpose. It has since been coopted by people and corporations whose incentives are to make as much money as possible by monopolizing valuable intangible "property" for as long as they can.

And the chief strategic move these people have made is to convince the average person that ideas are in fact property. That the first person to think something and write it down rightfully "owns" that thought, and that others who express it or share it are not merely infringing copyright, they are "stealing."

This plan has largely worked, and now the average person speaks and thinks in these terms, and feels it in their bones.

econ
>the average person speaks and thinks in these terms,

(Trademarks aside) Even more surprising to me is how everyone seems concerned about the studios making enough money?! As if they should make any money at all. As if it is up to us to create a profitable game for them.

If they all go bankrupt today I won't lose any sleep over it.

People also try to make a living selling bananas and apples. Should we create an elaborate scheme for them to make sure they survive? Their product is actually important to have. Why can't they own the exclusive right to sell bananas similarly? If anyone can just sell apples it would hurt their profit.

It was long ago, but that is how things used to work. We do still have taxi medallions in some places and all kinds of legalized monopolies like it.

Perhaps there is some sector where it makes sense but I can't think of it.

If you want to make a movie you can just do a crowdfunder like Roberts Space Industries.

bluGill
> Even more surprising to me is how everyone seems concerned about the studios making enough money?! As if they should make any money at all. As if it is up to us to create a profitable game for them.

Do you want more games (movies, books...)? Then you want studios to make money on that type of game, because if they make money they have an incentive to make more. Now, if you are happy with the number and quality of free games from a few hard-core people who will make them even if they earn nothing, then you don't care. However, games generally take a lot of effort to create, so by paying people to make them we can ensure that people who want to make games actually have the time, as opposed to wanting to but instead having to spend hours in a field farming for their food.

Now it is true that games often do look alike and many are not worth making and such. However if you want more you need to ensure they make money so it is worth investing.

We can debate how much they should make and how long copyright should be for. However you want them to make money so they make more.

specproc
It's been a US-led project for the benefit of American corporations.

If I was running the trade emergency room in any European state right now, I'd have "stop enforcing US copyright" up there next to "reciprocal tariffs".

TeMPOraL
Unfortunately we have a bunch of copyright-friendly groups in EU, so this would only work in the "stop enforcing US copyright in retaliation" sense, but not likely in the "stop enforcing copyright because on the net, it's a scam" sense.
InDubioProRubio
Worked for China
jimmaswell
We were close to your viewpoint being the popular one, but sadly many (most?) independent content creators are so overtaken by fear of AI that they've done a 180. The same people who learned by tracing references to sell fanart of a copyrighted franchise (not complaining, I spend thousands on such things) accuse AI of stealing when it glances at their own work. We're entering a new golden age of creative opportunity and they respond by switching sides to the philosophy of intellectual property championed by Disney and Oracle (except for those companies' ironic use of AI themselves..).
egypturnash
We would prefer a world where we can use the skills we have spent a lifetime honing without having to compete with some asshole taking everything we’ve shared and stuffing it into a machine that spits out soulless clones of our work without any acknowledgment of our existence.
csallen
People aren't motivated by principle so much as they are by self interest.
raytopia
Trademark isn't copyright, those are two different things. Trademarks can be renewed roughly every 10 years [1] until the end of time and are about protecting a brand. Now copyright law lasts for "author plus 70 years. For anonymous works, pseudonymous works, or works made for hire, the copyright term is 95 years from the year of first publication or 120 years from creation, whichever comes first." [2]

Is copyright too long? Yes. Is it only that long to protect large media companies? Yes. But I would argue that AI companies are pushing the limits of fair use if not violating fair use, which is used as an affirmative defense, by the way, meaning that AI companies have to go to court to argue that what they are doing is okay. They don't just get to wave their hands and say everything is okay because what we're doing is fair use and we get to scrape the world's entire creative output for our own profit.

[1] https://www.uspto.gov/learning-and-resources/trademark-faqs#...

[2] https://www.copyright.gov/history/copyright-exhibit/lifecycl...

computerphage
Trademark isn't the same as Registered Trademark either, while we're at it
jl6
> There are approximately zero people who decide they'll create something if they're protected for 95 years after their death but won't if it's 94 years.

I’m sure you’re right for individual authors who are driven by a creative spark, but for, say, movies made by large studios, the length of copyright is directly tied to the value of the movie as an asset.

If that asset generates revenue for 120 years, then it’s slightly more valuable than an asset that generates revenue for 119 years, and considerably more valuable than an asset that generates revenue for 20 years.

The value of the asset is in turn directly linked to how much the studio is willing to pay for that asset. They will invest more money in a film they can milk for 120 years than one that goes public domain after 20.

Would studios be willing to invest $200m+ in movie projects if their revenue was curtailed by a shorter copyright term? I don’t know. Probably yes, if we were talking about 120->70. But 120->20? Maybe not.

A dramatic shortening of copyright terms is something of a referendum on whether we want big-budget IP to exist.

In a world of 20 year copyright, we would probably still have the LOTR books, but we probably wouldn’t have the LOTR movies.

AnthonyMouse
> If that asset generates revenue for 120 years, then it’s slightly more valuable than an asset that generates revenue for 119 years, and considerably more valuable than an asset that generates revenue for 20 years.

Not so, because of net present value.

The return from investing in normal stocks is ~10%/year, which compounds to ~6.7x (a gain of ~570%) over 20 years. Another way of saying this is that $1 in 20 years is worth ~$0.15 today. A dollar in 30 years is worth ~$0.06 today. A dollar in 40 years is worth ~$0.02 today. As a result, if a thing generates the same number of dollars every year, the net present value of the first 20 years is significantly more than the net present value of all the years from 20-120 combined, because money now or soon from now is worth so much more than money a long time from now. And that's assuming the revenue generated would be the same every year forever, when in practice it declines over time.
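
To make that arithmetic checkable, here is a minimal sketch in Python. It assumes a flat 10% discount rate and a level $1 of revenue at the end of each year; the function names are made up for illustration, and none of this models a real studio's cash flows.

```python
# Sketch: discounting a level revenue stream at an assumed flat 10% rate.

def discount_factor(rate: float, years: int) -> float:
    """Value today of $1 received `years` from now."""
    return 1.0 / (1.0 + rate) ** years


def present_value_of_level_stream(rate: float, first_year: int,
                                  last_year: int, payment: float = 1.0) -> float:
    """Sum of discounted payments made at the end of each year in the range."""
    return sum(payment * discount_factor(rate, year)
               for year in range(first_year, last_year + 1))


RATE = 0.10  # assumption: the ~10%/year figure from the comment

first_20 = present_value_of_level_stream(RATE, 1, 20)
years_21_to_120 = present_value_of_level_stream(RATE, 21, 120)

for horizon in (20, 30, 40):
    print(f"$1 in {horizon} years is worth ${discount_factor(RATE, horizon):.3f} today")
print(f"PV of years 1-20:   {first_20:.2f}")
print(f"PV of years 21-120: {years_21_to_120:.2f}")
```

Under these assumptions the first 20 years come out at several times the value of the following 100 years combined. A different discount rate or a revenue stream that grows over time would change the ratio.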

The reason corporations lobby for copyright term extensions isn't that they care one bit about extended terms for new works. It's because they don't want the works from decades ago to enter the public domain now, and they're lobbying to make the terms longer retroactively. But all of those works were already created and the original terms were sufficient incentive to cause them to be.

fashion-at-cost
Your analysis misses the incredibly important caveat that revenue rises with inflation – or sometimes even faster.

50 years ago, a movie ticket was $0.50 in revenue. Today, it’s $25. That’s a 50x increase… a dollar in 50 years might be worth $0.02 today, but a movie ticket in 50 years is worth about a movie ticket today.
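
A rough way to test this is to discount a price that itself grows every year. The sketch below is in Python and takes the figures above at face value (50 cents rising to $25 over 50 years). The 10% discount rate is borrowed from the parent comment, and the function names are hypothetical.

```python
# Sketch: discounting a price that grows over time (assumed figures).

def implied_annual_growth(start_price: float, end_price: float, years: int) -> float:
    """Constant yearly growth rate that turns start_price into end_price."""
    return (end_price / start_price) ** (1.0 / years) - 1.0


def future_ticket_in_todays_tickets(growth: float, discount_rate: float,
                                    years: int) -> float:
    """Present value of one future ticket, measured in tickets at today's price."""
    return ((1.0 + growth) / (1.0 + discount_rate)) ** years


growth = implied_annual_growth(0.50, 25.00, 50)
print(f"implied ticket price growth: {growth:.2%} per year")

at_10_percent = future_ticket_in_todays_tickets(growth, 0.10, 50)
print(f"ticket in 50 years, discounted at 10%: {at_10_percent:.2f} of a ticket today")
```

The future ticket is worth exactly one ticket today only when the price grows as fast as the discount rate. Any gap between the two rates compounds over the 50 years.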

jl6
> And that's assuming the revenue generated would be the same every year forever, when in practice it declines over time.

For the crown jewel IP that the studios are most interested in protecting, the opposite of this assumption is true. Star Wars, for example, is making more money than ever. Streaming revenues will probably invalidate that assumption for an even wider pool of back catalog properties.

dragonwriter
IIRC, of works that bring in any money to their creators, the vast majority is returned, for almost all works, in the first handful of years after creation. Sure, the big names you know have value longer, but those are a minuscule fraction of works.

Make copyright last for a fixed term of 25 years with optional 10-year renewals up to 95 years on an escalating fee schedule (say, $100k for the first decade and doubling every subsequent decade) and people—and studios—would have essentially the same incentive to create as they do now, and most works would get into the public domain far sooner.
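
For a sense of scale, here is a small Python sketch of that fee schedule. It reads the proposal as a 25-year base term, 10-year renewals capped at 95 years total, and a fee that starts at $100k and doubles with each renewal. That reading, and the function name, are assumptions for illustration only.

```python
# Sketch: escalating renewal fees under the proposal as read above.

def renewal_schedule(base_term: int = 25, renewal_years: int = 10,
                     max_term: int = 95, first_fee: int = 100_000):
    """Return (total term reached, fee paid) for each optional renewal."""
    schedule = []
    term, fee = base_term, first_fee
    while term + renewal_years <= max_term:
        term += renewal_years
        schedule.append((term, fee))
        fee *= 2  # assumption: the fee doubles with every renewal
    return schedule


schedule = renewal_schedule()
total_fees = sum(fee for _, fee in schedule)

for term, fee in schedule:
    print(f"extend to {term} years: ${fee:,}")
print(f"total to reach the full term: ${total_fees:,}")
```

Read this way, most of the cumulative cost sits in the last few renewals, so only works that still earn real money late in their life would be worth extending.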

Probably be fewer entirely lost works, as well, if you had firmer deposit requirements for works with extended copyrights (using the revenue from the extensions to fund preservation) with other works entering the public domain soon enough that they were less likely to be lost before that happened.

m000
> I’m sure you’re right for individual authors who are driven by a creative spark, but for, say, movies made by large studios, the length of copyright is directly tied to the value of the movie as an asset.

That would be fine, if the studios didn't want to have it both ways. They want to retain full copyright control over their "asset", but they also use Hollywood Accounting [1] to both avoid paying taxes and cheat contributors that have profit-sharing agreements.

If studios declare that they made a loss on producing and releasing something to get a tax break, the copyright term for that work should be reduced to 10 years tops.

[1] https://en.wikipedia.org/wiki/Hollywood_accounting

yencabulator
Next, they'd switch from Hollywood Accounting to Oilfield Accounting. Oh that wellhead is actually owned by this other company over there, we just purchased their product at a fair market rate while they were still in business, but now it seems that other company is going bankrupt and cannot do the environmental cleanup to even seal the wellhead, much less remove it.
RataNova
While I think the laws are broken, I also get why companies fight so hard to defend their IP: it is valuable, and they've built empires around it. But at some point, we have to ask: are we preserving culture or just hoarding it?
seadan83
What's missing is why the laws fight so hard too: we lack a view of the opposite of what we have (in the West), namely blatant and rampant piracy. The other extreme is really bad: creators of any type get pirated by organized crime. There was no video game or movie market in Eastern Europe, for example; you can't compete against large-scale piracy.

Which is to say, preservation without awareness of the threat will look like hoarding. A secondary question is to what extent is that threat real? Without seeing what true rampant piracy looks like, I think it would be easy to be ignorant of the threat.

MgB2
Idk, the models generating what are basically 1:1 copies of the training data from pretty generic descriptions feels like a severe case of overfitting to me. What use is a generative model that just regurgitates the input?

I feel like the less advanced generations, maybe even because of their limitations in terms of size, were better at coming up with something that at least feels new.

In the end, other than for copyright-washing, why wouldn't I just use the original movie still/photo in the first place?

jeroenhd
People like what they already know. When they prompt something and get a realistic looking Indiana Jones, they're probably happy about it.

To me, this article is further proof that LLMs are a form of lossy storage. People attribute special quality to the loss (the image isn't wrong, it's just got different "features" that got inserted) but at this point there's not a lot distinguishing a seed+prompt file+model from a lossy archive of media, be it text or images, and in the future likely video as well.

The craziest thing is that AI seems to have gathered some kind of special status that earlier forms of digital reproduction didn't have (even though those 64kbps MP3s from napster were far from perfect reproductions), probably because now it's done by large corporations rather than individuals.

If we're accepting AI-washing of copyright, we might as well accept pirated movies, as those are re-encoded from high-resolution originals as well.

AlienRobot
The year is 2030.

A new MCU movie is released, its 60 second trailer posted on Youtube, but I don't feel like watching the movie because I got bored after Endgame.

Youtube has very strict anti-scraping techniques now, so I use deep-scrapper to generate the whole trailer from the thumbnail and title.

I use deep-pirate to generate the whole 3 hour movie from the trailer.

I use deep-watcher to summarize the whole movie in a 60 second video.

I watch the video. It doesn't make any sense. I check the Youtube trailer. It's the same video.

balamatom
Probably the majority of people in the world already "accept pirated movies". It's just that, as ever, nobody asks people what they actually want. Much easier to tell them what to want, anyway.

To a viewer, a human-made work and an AI-generated one both amount to a series of stimuli that someone else made and you have no control over; and when people pay to see a movie, generally they don't do it with the intent to finance the movie company to make more movies -- they do it because they're offered the option to spend a couple hours watching something enjoyable. Who cares where it comes from -- if it reached us, it must be good, right?

The "special status" you speak of is due to AI's constrained ability to recombine familiar elements in novel ways. 64k MP3 artifacts aren't interesting to listen to; while a high-novelty experience such as learning a new culture or a new discipline isn't accessible (and also comes with expectations that passive consumption doesn't have.)

Either way, I wish the world gave people more interesting things to do with their brains than make a money, watch a movies, or some mix of the two with more steps. (But there isn't much of that left -- hence the concept of a "personal life" as reduced to breaking one's own and others' cognitive functioning then spending lifetimes routing around the damage. Positively fascinating /s)

yk
Tried Flux.dev with the same prompts [0] and it seems actually to be a GPT problem. Could be that in GPT the text encoder understands the prompt better and just generates the implied IP, or could be that a diffusion model is just inherently less prone to overfitting than a multimodal transformer model.

[0] https://imgur.com/a/wqrBGRF Image captions are the implied IP, I copied the prompts from the blog post.

jsemrau
DALL-E 3 already uses a model trained on synthetic data that takes the prompt and augments it. This might lead to the overfitting. It could also be, and this might be the simpler explanation, that it just looks up the right file from a RAG.
vjerancrnjak
If it overfits on the whole internet then it’s like a search engine that returns really relevant results with some lossy side effect.

A recent benchmark on the unseen 2025 Math Olympiad shows none of the models can problem-solve. They all accidentally or on purpose had prior solutions in the training set.

jks
You probably mean the USAMO 2025 paper. They updated their comparison with Gemini 2.5 Pro, which did get a nontrivial score. That Gemini version was released five days after USAMO, so while it's not entirely impossible for the data to be in its training set, it would seem kind of unlikely.

https://x.com/mbalunovic/status/1907436704790651166

MatthiasPortzel
The claim is that these models are training on data which include the problems and explanations. The fact that the first model trained after the public release of the questions (and crowdsourced answers) performs best is not a counter example, but is expected and supported by the claim.
jsemrau
The same timing is actually suspicious. And it would not be the first time something like this happened.
gertlex
What if the word "generic" were added to a lot of these image prompts? "generic image of an intergalactic bounty hunter from space" etc.

Certainly there's an aspect of people using the chat interface like they use google: describe xyz to try to surface the name of a movie. Just in this case, we're doing the (less common?) query of: find me the picture I can vaguely describe; but it's a query to an image /generating/ service, not an image search service.

squeaky-clean
Generic doesn't help. I was using the new image generator to try and make images for my Mutants and Masterminds game (it's basically D&D with superheroes instead of high fantasy), and it refuses to make most things citing that they are too close to existing IP, or that the ideas are dangerous.

So I asked it to make 4 random and generic superheroes. It created Batman, Supergirl, Green Lantern, and Wonder Woman. Then at about 90% finished it deleted the image and said I was violating copyright.

https://imgur.com/a/eG6kmqu

I doubt the model you interact with actually knows why the babysitter model rejects images, but it claims to know why, which leads to some funny responses. Here is its response to me asking for a superhero with a dark bodysuit, a purple cape, a mouse logo on their chest, and a spooky mouse mask on their face.

> I couldn't generate the image you requested because the prompt involved content that may violate policy regarding realistic human-animal hybrid masks in a serious context.

gertlex
Yeah... so much for that hope on my end! Thanks for testing.

(hard to formulate why I was too lazy to test myself :) )

enopod_
Looks to me like OpenAI drew their guardrails somewhere along a financial line. Generate a Mickey Mouse or a Pikachu? Disney and Pokemon will sue the sh*t out of you. Ghibli? Probably not powerful enough to risk a multimillion-dollar, years-long court battle.
gcmrtc
Strong with the weak, weak with the strong.
marc_io
This one is a keeper.
nticompass
I thought Disney had the rights to publish Ghibli movies in the US.
davidhaymond
They did, but the rights expired. GKIDS now has the theatrical and home video rights to Studio Ghibli films in the US (except for Grave of the Fireflies).
bufferoverflow
Mickey Mouse (the original one) is out of copyright, as of last year, AFAIR.
briandear
Ghibli isn’t a character, but a style. You can’t copyright it.
sejje
Yes, the only test will eventually be "Can you train AI on copyrighted works"
contravariant
I consider this article quite strong proof that generative AI is closer to copying than it is to creating a new derivative work.
briandear
For the downvotes:

https://www.copyright.gov/circs/circ01.pdf

“Copyright does not protect • Ideas, procedures, methods, systems, processes, concepts, principles, or discoveries”

Not sure why this is even controversial, this has been the case for a hundred years.

simianparrot
So many arguing that "copyright shouldn't be a thing" etc., ad nauseam, which is a fine philosophical debate. But it's also the law. And that means ChatGPT et al. also have to follow the law.

I really, really hope the multimedia-megacorps get together and class-action ChatGPT and every other closed, for-profit LLM corporation into oblivion.

There should not be a two-tier legal system. If it's illegal for me, it's illegal for Sam Altman.

Get to it.

nucleogenesis
> There should not be a two-tier legal system.

That’s a fine philosophical debate, but the law is designed by the rich to favor the rich and while there are a number of exceptions there is little you can do with the legal system without money and lots of it. So while having a truly just system would be neat it just isn’t in the cards for humanity (IMHO) so long as we allow entities to amass “fuck you” money and wield it to their liking.

fishpen0
There is more to it than copyright when you start going down the path of photorealism. As much as it is a picture of Indiana Jones, it is also a picture of Harrison Ford. As fun as it is to make hilarious videos of presidents sucking CEO toes, there has to be a line.

There is a lack of consent here that runs even deeper than what copyright was traditionally made to protect. It goes further than parody. We can't flip our standards back and forth depending on who the image is made to reproduce

simianparrot
I fully agree. But since the average Joe has no chance legally against ChatGPT, at least Disney and other megacorps could.
dartos
Sorry, but have you paid attention to the legal system in the states?

Large corporations and their execs live by different laws than the rest of us.

That’s how it is.

Anything else is, unfortunately, a fiction in this country.

simianparrot
And? Two wrongs don’t make a right.
dartos
There’s no “and.”

I’m just stating a fact. No discussion of wrong or right or whatever.

Just pointing out how there is no more rule of law in the US. Idk when exactly it disappeared, but it’s definitely not present anymore

jauntywundrkind
Obviously a horrible hideous theft machine.

One thing I would say, it's interesting to consider what would make this not so obviously bad.

Like, we could ask AI to assess the physical attributes of the characters it generated. Then ask it to permute some of those attributes. Generate some random tweaks: ok but brawny, short, and a different descent. Do similarly on some clothing colors. Change the game. Hit the "random character" button on the physical attributes a couple times.

There was an equally shatteringly-awful less-IP-theft (and as someone who thinks IP is itself incredibly ripping off humanity & should be vastly scoped down, it's important to me to not rest my arguments on IP violations).... An equally shattering recent incident for me. Having trouble finding it, don't remember the right keywords, but an article about how AI has a "default guy" type that it uses everywhere, a super generic personage, that it would use repeatedly. It was so distasteful.

The nature of 'AI as compression', as giving you the most median answer is horrific. Maybe maybe maybe we can escape some of this trap by iterating to different permutations, by injecting deliberate exploration of the state spaces. But I still fear AI, worry horribly when anyone relies on it for decision making, as it is anti-intelligent, uncreative in extreme, requiring human ingenuity to budge off its rock of oppressive hypernormality that it regurgitates.

areoform
Theft from whom and how?

Are you telling me that our culture should be deprived of the idea of Indiana Jones and the feelings that character inspires in all of us forever just because a corporation owns the asset?

Indiana Jones is 44 years old. When are we allowed to remix, recreate and expand on this like humanity has done since humans first started sitting down next to a fire and telling stories?

edit: this reminds me of this iconic scene from Dr. Strangelove, https://www.youtube.com/watch?v=RZ9B7owHxMQ

    Mandrake: Colonel... that Coca-Cola machine. I want you to shoot the lock off it. There may be some change in there.

    Guano: That's private property.

    Mandrake: Colonel! Can you possibly imagine what is going to happen to you, your frame, outlook, way of life, and everything, when they learn that you have obstructed a telephone call to the President of the United States? Can you imagine? Shoot it off! Shoot! With a gun! That's what the bullets are for, you twit!

    Guano: Okay. I'm gonna get your money for ya. But if you don't get the President of the United States on that phone, you know what's gonna happen to you?

    Mandrake: What?

    Guano: You're gonna have to answer to the Coca-Cola company.

I guess we all have to answer to the Walt Disney company.
calmbell
"idea of Indiana Jones and the feelings that character inspires in all of us forever just because a corporation owns the asset" is very different from the almost exact image of Indiana Jones.
GolfPopper
And a reason people are getting ticked at the AI companies is the hypocrisy. They're near-universally arguing that it's okay for them to treat copyright in a way that it is illegal for us to, apparently on the basis of, "we've got billions in investment capital, and applying the law equally will make it hard for us to get a return on that investment".
chongli
Exactly. The idea of Indiana Jones, the adventurer archaeologist more at home throwing a punch than reading a book, is neither owned by nor unique to Lucasfilm (Disney). There is a ton of media out there featuring this trope character [1]. Yes, the trope is overwhelmingly associated with the image of Harrison Ford in a fedora within the public consciousness, but copyright does not apply to abstract ideas such as tropes.

Some great video games to feature adventurer archaeologists:

* NetHack (One of the best roles in the game)

* Tomb Raider series (Lara Croft is a bona fide archaeologist)

* Uncharted series (Nathan Drake is more of a treasure hunter but he becomes an archaeologist when he retires from adventuring)

* Professor Layton series

* La-Mulana series (very obviously inspired by Indiana Jones, but not derivative)

* Spelunky (inspired by La-Mulana)

[1] https://tvtropes.org/pmwiki/pmwiki.php/Main/AdventurerArchae...

_ph_
Not forever. But 75 years after the death of the creator by current international agreement. I definitely think that the exact terms of copyright should be revisited - a lot of uses should be allowed after, say, 50 years from the publication of a work. But that needs to be agreed upon and converted into law. Till then, one should expect everyone, especially large corporations, to stick to the law.
saulpw
When Mickey Mouse was created (1928), copyright was 28 years that could be reupped once for an additional 28 years. So according to those terms, Mickey Mouse would have ascended to the public domain in 1984.

IMO any change to copyright law should not be applied retroactively. Make copyright law to be what is best for society and creators as a whole, not for lobbyists representing already copyrighted material.

fullstop
I mean, at least shouldn't we wait until Harrison Ford has passed?
littlecranky67
But I can hire an artist and ask him to draw me a picture of Indiana Jones, he creates a perfect copy and I hang it on my fridge. Where did I (or the artist) violate any copyright (or other) laws? It is the artist that is replaced by the AI, not the copyrighted IP.
rdtsc
> But I can hire an artist and ask him to draw me a picture of Indiana Jones,

Sure, assuming the artist has the proper license and franchise rights to make and distribute copies. You can go buy a picture of Indy today that may not be printed by Walt Disney Studios but by some other outfit or artists.

Or, you mean if the artist doesn't have a license to produce and distribute Indiana Jones images? Well they'll be in trouble legally. They are making "copies" of things they don't own and profiting from it.

Another question is whether that's practically enforceable.

> Where did I (or the artist) violate any copyright (or other) laws?

When they took payment and profited from making unauthorized copies.

> It is the artist that is replaced by the AI, not the copyrighted IP.

Exactly, that's why LLMs and the companies which create them are called "theft machines" -- they are reproducing copyrighted material. Especially the ones charging for "tokens". You pay them, they make money and produce unauthorized copies. Show that picture of Indy to a jury and I think it's a good chance of convincing them.

I am not saying this is good or bad, I just see this having a legal "bite" so to speak, at least in my pedestrian view of copyright law.

saaaaaam
The likeness of Indiana Jones is not protected in any way - as far as I know - that would stop a human artist creating, rendering and selling a work of art representing their creative vision of Indiana Jones. And even more so in a private context. Even if the likeness is protected (“archaeologist, adventurer, whip, hat”) then this protection would only be in certain jurisdictions and that protection is more akin to a design right where the likeness would need to be articulated AND registered. Many jurisdictions don’t require copyright registration and do not offer that sort of technical likeness registration.

If they traced a photo they might be violating the copyright of the photographer.

But if they are drawing an archaeologist adventurer with a whip and a hat based on their consumption and memory of Indiana Jones imagery there is very little anyone could do.

If that image was then printed on an industrial scale or printed onto t-shirts there is a (albeit somewhat theoretical) chance that in some jurisdictions sale of those products may be able to be restricted based on rights to the likeness. But that would be a stretch.

planb
> Or, you mean if the artist doesn't have a license to produce and distribute Indiana Jones images? Well they'll be in trouble legally. They are making "copies" of things they don't own and profiting from it.

Ok, my sister can draw, and she gifts me an image of my favorite Marvel hero she painted to hang on my wall. Should that be illegal?

Velorivox
That does infringe copyright...you're just unlikely to get in trouble for it. You might get a cease and desist if the owner of the IP finds out and can spare a moment for you.
RajT88
Totally agree. LLMs are just automating that infringement process.

If you make money off it, it's no longer fair use; it's infringement. Even if you don't make money off it, it's not automatically fair use.

My own favorite crazy story about copyright violations:

Metallica sued Green Jello for parodying Enter Sandman (including a lyric where it says "It's not Metallica"):

https://en.wikipedia.org/wiki/Electric_Harley_House_(of_Love...

They lost that case. The kicker? Metallica were guest vocalists on that album.

ryandrake
This doesn't make any sense to me. No media is getting copied, unless the drawing is exactly the same as an existing drawing. Shouldn't "copy"right apply to specific, tangible artistic works? I guess I don't understand how the fantasy of "IP" works.

What if the drawing is of Indiana Jones but he's carrying a bow and arrow instead of a whip? Is it infringement?

What if it's a really bad drawing of Indiana Jones, so bad that you can't really tell that it's the character? Is that infringement?

What if the drawing is of Indiana Jones, but in the style of abstract expressionism, so doesn't even contain a human shape? Is it infringement?

What if it's a good drawing that looks very much like Indiana Jones, but it's not! The person's name is actually Iowa Jim. Is that infringement?

What if it's just an image of an archeologist adventurer who wears a hat and uses a bullwhip, but otherwise doesn't look anything like Indiana Jones? Is it infringement?

piyh
Presumably the artist is a human who directly or indirectly paid money to view a film containing an archaeologist with the whip.

I don't think this is about reproduction as much as how you got enough data for that reproduction. The RIAA sent people to jail and ruined their lives for pirating. Now these companies are doing it and being valued for hundreds of billions of dollars.

RataNova
You're right, it's not just about reproduction, it's about how the data was collected
the_af
Plus the scale of it all.

A human friend can get tired, and there are only so many requests he/she can fulfill, and only at a limited rate. Even a team of human artists has a relatively low limit.

But Gen AI has very high limits and speeds, and it never gets tired. It seems unfair to me.

airstrike
Can we not call it "theft"? It's such a loaded term and doesn't really mean the same thing when we're talking about bits and bytes.
moolcool
OK, but then we need a common standard. If Facebook is allowed to use libgen, I should also be allowed.
bigyabai
Only if we stop calling software distribution "piracy" under the false pretenses that anything is being stolen.
airstrike
I'm OK with that too!
satvikpendem
> Obviously a horrible hideous theft machine [...] awful [...] horrific

Ah, I thought I knew this account from somewhere. It seems surprisingly easy to figure out what account is commenting just based on the words used, as I've commented that only a few active people on this site seem to use such strong words as shown here.

KronisLV
I think the cat is out of the bag when it comes to generative AI, the same way various LLMs for programming have been trained even on codebases that they had no business using, yet nobody has stopped them or will. It’s the same as what’s going to happen with deepfakes and such, as the technology inevitably gets better.

> Hayao Miyazaki’s Japanese animation company, Studio Ghibli, produces beautiful and famously labor intensive movies, with one 4 second sequence purportedly taking over a year to make.

It makes me wonder though - whether it’s more valuable to spend a year on a scene that most people won’t pay that much attention to (artists will understand and appreciate, maybe pause and rewind and replay and examine the details, the casual viewer just enjoy at a glance) or use tools in addition to your own skills to knock it out of the park in a month and make more great things.

A bit like how digital art has clear advantages over paper, while many revere the traditional art a lot, despite it taking longer and being harder. The same way someone who uses those AI assisted programming tools can improve their productivity by getting rid of some of the boilerplate or automating some refactoring and such.

AI will definitely cheapen the art of doing things the old way, but that’s the reality of it, no matter how much the artists dislike it. Some will probably adapt and employ new workflows, others stick to tradition.

M95D
It's a very clear difference between a cheap animation and Ghibli. Anyone can see it.

In the first case, there's only one static image for an entire scene, scrolled and zoomed, and if they feel generous, there would be an overlay with another static image that slides over the first at a constant speed and direction. It feels dead.

In the second case, each frame is different. There are chaotic motions such as wind, and there's character movement with a purpose, even in the background. There's always something happening in the animation; there's life.

paulluuk
There is a huge middle ground between "static image with another sliding static image" and "1 year of drawing per 4 second Ghibli masterpiece". From your comment it almost looks like you're suggesting that you have to choose either one or the other, but that is of course not true.

I bet that a good animator could make a really impressive 4-second scene if they were given a month, instead of a year. Possibly even if they were given a day.

So if we assume that there is not a binary "cheap animation vs masterpiece" but rather a sort of spectrum between the two, then the question is: at what point do enough people stop seeing the difference, that it makes economic sense to stay at that level, if the goal is to create as much high-quality content as possible?

M95D
Yes, that's the current trend in the western world. Money is all that matters. There's only lowest accepted quality. Anything above that is a waste of money, profits that are lost. Nobody wants masterpieces. There is no market for that.

That lowest-accepted quality also declines over time, as generations after generations of people become used to rock-bottom quality. In the end, there's only slop and AI will make the cheapest slop ever. Welcome to a brave new world. We don't even need people anymore. They're too expensive.

zipmapfoldright
anyone _can_ see it, but _most_ people don't (and don't care)

To be clear, I am not saying it's not valuable, only that to the vast majority, it's not.

soneca
I wonder if really great stuff is always for a minority. You have to have listened to a lot of classical music to tell a great interpretation of Mozart from a good one. To realize how great a chess move was, how magical a soccer play was, how deep the writing of a philosopher was. Not only for stuff that requires previous effort, but also the subjectiveness of art. Picasso will be really moving for a minority of people. The Godfather. Even Shakespeare.

Social media and generative AI may be good business because they capture the attention of the majority, but maybe they are not valuable to anyone.

whywhywhywhy
> but _most_ people don't (and don't care)

Perhaps it's not for everyone.

IanCal
Fundamentally I think this comes down to answering the question of "why are you creating this?".

There are many valid answers.

Maybe you want to create it to tell a story, and you have an overflowing list of stories you're desperate to tell. The animation may be a means to an end, and tools that help you get there sooner mean telling more stories.

Maybe you're pretty good at making things people like and you're in it for the money. That's fine, there are worse ways to provide for your family than making things people enjoy but aren't a deep thing for you.

Maybe you're in it because you love the act of creating it. Selling it is almost incidental, and the joy you get from it comes down to spending huge amounts of time obsessing over tiny details. If you had a source of income and nobody ever saw your creations, you'd still be there making them.

These are all valid in my mind, and suggest different reasons to use or not to use tools. Same as many walks of life.

I'd get the weeds gone in my front lawn quickly if I paid someone to do it, but I quite enjoy pottering around on a sunny day pulling them up and looking back at the end to see what I've achieved. I bake worse bread than I could buy, and could buy more and better bread I'm sure if I used the time to do contracting instead. But I enjoy it.

On the other hand, there are things I just want done and so use tools or get others to do it for me.

One positive view of AI tools is that it widens the group of people who are able to achieve a particular quality, so it opens up the door for people who want to tell the story or build the app or whatever.

A negative side is the economics where it may be beneficial to have a worse result just because it's so much cheaper.

mytailorisrich
> It makes me wonder though - whether it’s more valuable to spend a year on a scene that most people won’t pay that much attention to

In this case, yes it is.

People do pay attention to the result overall. Studio Ghibli has got famous because people notice what they produce.

Now people might not notice every single detail but I believe that it is this overall mindset and culture that enables the whole unique final product.

xandrius
I think most like the vibes, not the fact it took ages to make.
Qualitionion
It's the quality or level of detail.

Which might indicate an environment where quality is above quantity

happyraul
To me the question of what activity/method is more "valuable" in the context of art is kind of missing the point of art.
neomantra
> Maybe Studio Ghibli making it through the seemingly deterministic GPT guardrails was an OpenAI slip up, a mistake,

The author is so generous... but Sam Altman literally has a Ghibli-fied Social profile and in response to all this said OpenAI chooses its demos very carefully. His primary concern is that Ghibli-fying prompts are over-consuming their GPU resources, degrading the service by preventing other ChatGPT tasks.

gambiting
The official White House account has been posting ghiblified images too, Altman knows that as long as he's not critical of the current administration he's untouchable.
slig
>he's untouchable

Doesn't he have a pretty bad disagreement with Elon?

flessner
Everyone is talking about theft - I get it, but there's a subtler point being made here.

Current generation of AI models can't think of anything truly new. Everything is simply a blend of prior work. I am not saying that this doesn't have economic value, but it means these AI models are closer to lossy compression algorithms than they are to AGI.

The following quote by Sam Altman from about 5 years ago is interesting.

"We have made a soft promise to investors that once we build this sort-of generally intelligent system, basically we will ask it to figure out a way to generate an investment return."

That's a statement I wouldn't even dream about making today.

nearbuy
> Current generation of AI models can't think of anything truly new.

How could you possibly know this?

Is this falsifiable? Is there anything we could ask it to draw where you wouldn't just claim it must be copying some image in its training data?

mjburgess
Novelty in one medium arises from novelty in others, shifts to the external environment.

We got brass bands with brass instruments, synth music from synths.

We know therefore, necessarily, that there can be nothing novel from an LLM -- it has no live access to novel developments in the broader environment. If synths were invented after its training, it could never produce synth music (and so on).

The claim here is trivially falsifiable, and so obviously so that credulous fans of this technology bake it in to their misunderstanding of novelty itself: have an LLM produce content on developments which had yet to take place at the time of its training. It obviously cannot do this.

Yet an artist who paints with a new kind of black pigment can, trivially so.

nearbuy
Kind of a weird take that excludes the vast majority of human artwork that most people would consider novel. For all the complaints one might have of cubism, few would claim it's not novel. And yet it's not based on any new development in the external world but rather on mashing together different perspectives. Someone could have created the style 100 years earlier if they were so inclined, and had Picasso never existed, someone could create the novel style today just by "remixing" ideas from past art in that very particular way.
moffkalast
> arises from novelty in others, shifts to the external environment

> Everything is simply a blend of prior work.

I generally consider these two to be the same thing. If novelty is based on something else, then it's highly derivative and its novelty is very questionable.

A quantum random number generator is far more novel than the average human artist.

> have an LLM produce content on developments which had yet to take place at the time of its training. It obviously cannot do this.

Put someone in jail for the last 15 years, and ask them to make a smartphone. They obviously cannot do it either.

jedimastert
The problem with generating genuinely new art is it requires "inputs" that aren't art. It requires life experiences.
Davidzheng
I beseech you, in the bowels of Christ, think it possible that you may be mistaken.
kubanczyk
Oliver Cromwell, a letter to the General Assembly of the Church of Scotland, 3 August 1650
bbor
Disregarding the (common!) assumption that AGI will consist of one monolithic LLM instead of dozens of specialized ones, I think your comment fails to invoke an accurate, consistent picture of creativity/"truly new" cognition.

To borrow Chomsky's framework: what makes humans unique and special is our ability to produce an infinite range of outputs that nonetheless conform to a set of linguistic rules. When viewed in this light, human creativity necessarily depends on the "linguistic rules" part of that; without a framework of meaning to work within, we would just be generating entropy, not meaningful expressions.

Obviously this applies most directly to external language, but I hope it's clear how it indirectly applies to internal cognition and--as we're discussing here--visual art.

TL;DR: LLMs are definitely creative, otherwise they wouldn't be able to produce semantically-meaningful, context-appropriate language in the first place. For a more empirical argument, just ask yourself how a machine can generate a poem or illustration depicting [CHARACTER_X] in [PLACE_Y] doing [ACTIVITY_Z] in [STYLE_S] without being creative!

[1] Covered in the famous Chomsky v. Foucault debate, for the curious: https://www.youtube.com/watch?v=3wfNl2L0Gf8

flessner
This may not be apparent to an English speaker as the language has a rather fixed set of words, but in German, where creating new words is common, the lack of linguistic creativity is obvious.

As an example, let's talk about "vibe coding" - It's a new term describing heavy LLM usage in programming, usually associated with Generation Z.

If I am asking an LLM to generate a German translation for "vibe coder" it comes up with the neutral "Vibe-Programmierer". When asking it to be more creative it came up with "Schwingungsschmied" ("vibration smith"?) - What?

I personally came up with the following words:

* Gefühlsprogrammierer ("A programmer, that focuses on intuition and feeling.")

* Freischnauzeprogrammierer ("Free-mouthed programmer - highlighting straightforwardness and the creative expression of vibe coding." - colloquial)

Interestingly, LLMs can describe both these terms, they just can't create them naturally. I tested this on all major LLMs and the results were similar. Generating a picture of a "vibe coder" also highlights more of a moody atmosphere instead of the Generation Z aspects that are associated with it on social media nowadays.

Peritract
> a machine that can generate a poem or illustration depicting [CHARACTER_X] in [PLACE_Y] doing [ACTIVITY_Z] in [STYLE_S] without being creative

Your example disproves itself; that's a madlib. It's not creative, it's just rolling the dice and filling in the blanks. Complex dice and complex blanks are a difference of degree only, not creativity.

bbor
It's not filling in the blanks that's impressive, it's meaningfully combining them all into an objectively unique narrative, building upon those blanks at length.

Definitions are always up for debate on instrumental grounds, but I'm dubious of any definition of "creative" that excludes truly unique yet meaningful artifacts. The only thing past that is ineffable stuff, which is inherently not very helpful for scientific discussion.

burnished
Oooh those guardrails make me angry. I get why they are there (don't poke the bear) but it doesn't make me overlook the self serving hypocrisy involved.

Though I am also generally opposed to the notion of intellectual property whatsoever on the basis that it doesn't seem to serve its intended purpose and what good could be salvaged from its various systems can already be well represented with other existing legal concepts, e.g. deceptive behaviors being prosecuted as forms of fraud.

teddyh
The problem is people at large companies creating these AI models, wanting the freedom to copy artists’ works when using it, but these large companies also want to keep copyright protection intact, for their regular business activities. They want to eat the cake and have it too. And they are arguing for essentially eliminating copyright for their specific purpose and convenience, when copyright has virtually never been loosened for the public’s convenience, even when the exceptions the public asks for are often minor and laudable. If these companies were to argue that copyright should be eliminated because of this new technology, I might not object. But now that they come and ask… no, they pretend to already have, a copyright exception for their specific use, I will happily turn around and use their own copyright maximalist arguments against them.

(Copied from a comment of mine written more than three years ago: <https://news.ycombinator.com/item?id=33582047>)

ToValueFunfetti
I don't care for this line of argument. It's like saying you can't hold a position that trespassing should be illegal while also holding that commercial businesses should be legally required to have public restrooms. Yes, both of these positions are related to land rights and the former is pro- while the latter is anti-, but it's a perfectly coherent set of positions. OpenAI can absolutely be anti-copyright in the sense of whether you can train an NN on copyrighted data and pro-copyright in the sense of whether you can make an exact replica of some data and sell it as your own without making it into hypocrisy territory. It does suggest they're self-interested, but you have to climb a mountain in Tibet to find anybody who isn't.

Arguments that make a case that NN training is copyright violation are much more compelling to me than this.

belorn
The example you gave with public restrooms does not work because of two main concepts: they are usually getting paid for it by the government, and operating a company usually comes with benefits given by the government. Industry regulation as a concept is generally justified in that industries are getting "something" from society, and thus society can put in requirements in return.

A regulation that requires restaurants to have a public bathroom is more akin to a regulation that also requires restaurants to check ID when selling alcohol to young customers. Neither requirement has any relation to land rights; both relate to the right to operate a company that sells food to the public.

TremendousJudge
No, the exception they are asking for (we can train on copyrighted material and the image produced is non-copyright infringing) is copyright infringing in the most basic sense.

I'll prove it by induction: Imagine that I have a service where I "train" a model on a single image of Indiana Jones. Now you prompt it, and my model "generates" the same image. I sell you this service, and no money goes to the copyright holder of the original image. This is obviously infringement.

There's no reason why training on a billion images is any different, besides the fact that the lines are blurred by the model weights not being parseable.

jofla_net
I guess the best explanation for what we're witnessing is the notion that 'Money Talks', and sadly nothing more. To think that's all that fair use activists lacked in years past..
theshrike79
It's not just the guardrails, but the ham-fisted implementation.

Grok is supposed to be "uncensored", but there are very specific words you just can't use when asking it to generate images. It'll just flat out refuse or give an error message during generation.

But, again, if you go in a roundabout way and avoid the specific terms you can still get what you want. So why bother?

Is it about not wanting bad PR or avoiding litigation?

mrweasel
The implementation is what gets to me too. Fair enough that a company doesn't want their LLM used in a certain way. That's their choice, even if it's just to avoid getting sued.

How they then go about implementing those guardrails is pretty telling about their understanding of and control over what they've built, and their line of thinking. Clearly, at no point before releasing their LLMs onto the world did anyone stop and ask: Hey, how do we deal with these things generating unwanted content?

Resorting to blocking certain terms in the prompts is like searching for keywords in spam emails. "Hey Jim, I got another spam email from that Chinese tire place" - "No worry boss, I've configured the mail server to just delete any email containing the words China or tire".
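
To make the spam-filter comparison concrete, here is a minimal sketch of a keyword guardrail. It is an assumption about how such a filter might look, not how any vendor actually implements theirs; the blocklist and the `is_blocked` helper are made up for illustration.

```python
import re

# Hypothetical blocklist; real services do not publish theirs.
BLOCKED_TERMS = ["indiana jones", "mario", "luigi"]


def is_blocked(prompt):
    """Naive guardrail: reject a prompt if it contains any blocked term."""
    # Lowercase and collapse whitespace so trivial spacing tricks don't matter.
    normalized = re.sub(r"\s+", " ", prompt.lower())
    return any(term in normalized for term in BLOCKED_TERMS)


direct = "Draw Indiana Jones"
roundabout = "An image of an archeologist adventurer who wears a hat and uses a bullwhip"
false_positive = "A portrait of my uncle Mario in his garden"

print(is_blocked(direct))          # True: the literal name is caught
print(is_blocked(roundabout))      # False: the description sails through
print(is_blocked(false_positive))  # True: an innocent prompt is rejected
```

Same failure modes as the tire-shop filter: the roundabout prompt gets through, and an unrelated prompt gets blocked just for containing the word.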

Some journalist should go to a few of these AI companies and start asking questions about the long term effectiveness and viability of just blocking keywords in prompts.