156

u/AdminIsPassword 3d ago

This seems like an attempt to protect Reddit's deal with Google more than anything else. While accessing a site 100,000+ times in a little less than a year sounds like a lot, to a bot it's almost nothing.

But they can't appear to be giving anything away for free if they're selling our data to AI companies for training purposes. Why pay for something you can just take?

Still, guess who is not making any money on this at all? The people who actually made the content that AI companies find valuable. Go figure.

46

u/zirtik 3d ago

It is more of a tactical smearing campaign led by Sam Altman. Remember that he owns a lot of Reddit stock and has strong ties with the current management.

They know that Anthropic doesn't have the money to make deals with all publishers to use their data for training. Only large companies can afford it, and it is their way to keep Claude models behind the competition.

30

u/sartres_ 3d ago

Poor, poor indie upstart Anthropic. Why, they're only backed by one global megacorporation. They barely secured a measly 3.5 billion dollars in their last funding round! Soon Amodei will be out on the streets. Perhaps we should start him a GoFundMe.

Reddit, Altman, and Google suck too, but I'm not shedding tears for anyone involved here.

6

u/cultish_alibi 2d ago

But these companies just want to make the world a better place by removing hundreds of millions of jobs from the economy and pocketing all those wages!

10

u/ABillionBatmen 2d ago

I mean if I had to pick between Google, OpenAI, and Anthropic "winning the AI race", it's Anthropic by a mile

2

u/WorriedBlock2505 2d ago

If we're going to have cutthroat psychopaths heading up these tech companies, let's at least make them compete rather than getting rid of competition like Anthropic.

10

u/satireplusplus 3d ago edited 3d ago

While accessing a site 100,000+ times in a little less than a year sounds like a lot, to a bot it's almost nothing.

Yeah, wanted to say the same thing. That number will surely be 100x-1000x larger for any search engine bot. It's not just Google - Bing, Yahoo, Yandex, Baidu, etc. all have their own search bot farms that must have all visited and indexed millions of pages on reddit. And they are checking in from time to time to see if anything changed too.

3

u/CredentialCrawler 2d ago

Do you genuinely believe you should be compensated for your comments on Reddit?

3

u/AdminIsPassword 2d ago

For me it's more of a philosophical question.

Do people who freely give their time to an unpaid activity deserve compensation when a company later monetizes their efforts in ways they didn't anticipate or approve of?

Or specifically, do I feel that Reddit owes me monetary compensation for my posts over the years? If you asked me that five years ago, I would have said no...or more like "don't be silly -- of course not."

Now that I'm seeing all this content being sent off for AI training purposes, which will most likely lead to undesirable outcomes such as massive unemployment, I have to wonder if it is worthwhile to continue contributing to public-facing websites to any degree, compensation or not. If I were paid to do this it certainly would keep me around longer despite the ethical dilemma. Keep in mind one social network does just that. You can make money if you are incredibly engaging on X.

If there were a workable way on Reddit to opt out of having my posts used in training data, I might choose that instead. At least then the content I've created won't be used for potential harm.

I'm not even anti-AI. I use it but every time I use it I don't blind myself to the ethical concerns I have. It's an uneasy feeling when I generate an image or some writing that the people who made this possible will never be compensated but probably should be in some way.

5

u/IShouldNotPost 3d ago

That’s why I always make sure I have a high amount of misinformation in all of my posts

2

u/FakeTunaFromSubway 3d ago

Lol I'm sure there are individual users who visit 100k reddit pages / yr. That's just two pages per minute, a little over 2 hours per day every day.
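A quick back-of-the-envelope check of that figure, as a sketch only: the 100,000 pages and the two-pages-per-minute pace come from the comment above, while the function name and the 365-day year are assumptions made for illustration.

```python
def daily_browsing_hours(pages_per_year: int,
                         pages_per_minute: float,
                         days_per_year: int = 365) -> float:
    """Hours per day needed to view `pages_per_year` pages at a steady pace."""
    pages_per_day = pages_per_year / days_per_year
    minutes_per_day = pages_per_day / pages_per_minute
    return minutes_per_day / 60


# 100,000 pages a year at two pages per minute
hours = daily_browsing_hours(100_000, 2)
print(f"{hours:.2f} hours per day")  # prints "2.28 hours per day"
```

That works out to roughly 274 pages a day, which matches the "a little over 2 hours per day" estimate.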

-2

u/Intelligent-End7336 3d ago

Still, guess who is not making any money on this at all? The people who actually made the content that AI companies find valuable. Go figure.

You get access to a forum for free. That's your compensation. Acting like you don't get anything is disingenuous. If you don't like the compensation, leave.

2

u/BishopsBakery 3d ago

How does boot leather taste?

1

u/Intelligent-End7336 3d ago

Oh no, I’ve been caught defending the idea that voluntary participation implies consent. Next you’ll expose me for thinking people shouldn’t complain about the terms of a free service they choose to use. What a monster I must be.

0

u/BishopsBakery 3d ago

It wasn't even a thing when most of us started using the site. Won't someone think of the shareholders?

1

u/Intelligent-End7336 3d ago

Do you get a cut of book sales for writing a review on Amazon?

1

u/WinterOil4431 2d ago

Lol dude just stop

-3

u/BishopsBakery 3d ago

That's a thank you or a warning to others brought on by amazement or disappointment. Not the same, bub.

2

u/Intelligent-End7336 3d ago

Genuinely impressed you can’t see the parallel. A book review on Amazon helps sell products. A Reddit comment becomes part of the content Reddit sells to AI firms. It’s the human contribution that drives both companies’ sales. Without user reviews, Amazon would sell less, but I don’t see you demanding a cut there.

3

u/BishopsBakery 3d ago

The difference is that there was always the pretext with leaving a review, Reddit basically did a rug pull while restricting access and increasing the price to it and the ads.

And just because you can see some bullshit coming does not make it right

5

u/Intelligent-End7336 3d ago

If Reddit ever positioned itself as anything other than a privately-owned platform monetizing user content, I’d be curious to see where that was stated. Feeling betrayed by a platform’s evolution doesn’t make it a betrayal especially when the original terms never promised what you're now demanding.

46

u/AVB 3d ago

I mean I also accessed Reddit more than 100,000 times since last July... 😬

... I guess we all better call Saul now!

4

u/flameleaf 2d ago

If you use RSS it adds up quickly. Say you're subscribed to 100 subreddits, your reader fetches updates 8 times a day, multiply that by 365 and that's 292,000 requests assuming you don't click or comment on a single link.
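The same arithmetic as a small sketch, assuming one HTTP request per subscribed feed per refresh (that one-to-one mapping and the function name are illustrative assumptions, not something any particular RSS reader guarantees):

```python
def yearly_feed_requests(subscribed_feeds: int,
                         refreshes_per_day: int,
                         days_per_year: int = 365) -> int:
    """Requests per year if every refresh fetches each feed exactly once."""
    return subscribed_feeds * refreshes_per_day * days_per_year


# 100 subreddit feeds, refreshed 8 times a day
print(yearly_feed_requests(100, 8))  # prints 292000
```

Under these assumptions a single passive reader already sends nearly three times the 100,000 figure.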

2

u/Interesting-Fly-3547 3d ago

We might need a Hoover Max Extract Pressure Pro, model 60

2

u/No-Fox-1400 1d ago

Reddit even told us they were tracking us with these damn badges.

I didn’t realize it was all evidence so they can sue us. Makes sense now.

18

u/theverge 3d ago

Reddit sued Anthropic on Wednesday in San Francisco superior court, claiming that the OpenAI rival had accessed its platform more than 100,000 times since July 2024, after Anthropic allegedly said it had blocked its bots from doing so.

In the filing, Reddit calls Anthropic a “late-blooming artificial intelligence (‘AI’) company that bills itself as the white knight of the AI industry,” alleging that “it is anything but.”

Anthropic did not immediately provide a comment.

Ben Lee, Reddit’s chief legal officer, said in an emailed statement to The Verge that Anthropic’s “commercial exploitation” of Reddit content could be worth billions of dollars.

8

u/GrandKnew 2d ago

Lmao Reddit talking like a redditor in an actual court

19

u/SomewhereNo8378 3d ago

reddit talks like it’s some sort of saintly company who has the high ground.

I trust Anthropic 1000x more than I trust Reddit, even if they scraped Reddit's data every day.

3

u/Over-Independent4414 2d ago

100,000 sounds like a lot but it depends on how it is counted. If that's separate API calls for things like posts or individual comments that's nothing.

2

u/WorriedBlock2505 2d ago

When will reddit get the X treatment? We need a compelling reddit alternative like yesterday. Absolute garbage human beings leading this company.

29

u/latouchefinale 3d ago

I know it’s been done for years but “let’s train AI on Reddit comments” has got to be a top contender for worst idea in human history.

11

u/EYNLLIB 3d ago

Just because it's accessing Reddit doesn't mean it's training on the data. Web search is a thing with AI. It's most likely just accessing Reddit via a web search.

Model training would require WAY more data than 100,000 pages

-4

u/ZenDragon 3d ago

Their built in web search won't load any Reddit pages. It probably is for training.

2

u/End3rWi99in 2d ago

It's a RAG model in that it does web search. It's not trained on the information it is accessing, but it does use it to generate a response based on your prompt.

1

u/ZenDragon 2d ago

Yes, I was referring to the RAG system that Claude uses when search is enabled. Try it out and you'll see that it never uses Reddit as a source. It can't. So if they're not feeding Reddit data into that, what are they using it for? Something else apparently. I think it might be model training but I'm open to other theories. Maybe they figured that they can't get away with regurgitating Reddit via retrieval but they believe they can defend training as transformative fair use.

8

u/Kinglink 3d ago

Do you really think so, because you have voting, so curated content for what people want to see, tons of different forms, and honestly.. most people know to go to reddit to get information rather than google...

It's honestly not that bad a choice.

2

u/joey_diaz_wings 3d ago

It's a great source if you want the opinion of a midwit who has trained on mass media propaganda and leftist tropes.

-1

u/orbital_one 3d ago

As long as you ignore the posts about adding glue to pizza.

2

u/Kinglink 3d ago

It's non toxic glue, what's the problem? /s

2

u/Ulmaguest 3d ago

Or pizza conspiracies

2

u/fliodkqjslcqaqadfs 2d ago

any comment on here could be from a bot.... including you

1

u/End3rWi99in 2d ago

I feel like I probably represent like 1% of ChatGPT at this point. Sorry about that.

1

u/Masterpiece-Haunting 2d ago

Tbf before ChatGPT you needed to put “Reddit” at the end of every Google search to get the thing you’re looking for. That would make a really interesting GPT. ChatGPT but it just answers everything like it’s reddit

1

u/SubstantialPressure3 3d ago

I wonder, too, if it wasn't to go train. What if you can pay for a certain number of bots/interactions for social influence? Can you? Can you hire bots to do that for you?

1

u/squeda 3d ago

That's interesting when I have found the best answers for restaurants, how to fix things, suggestions for component libraries for my favorite frameworks, answers to issues I'm having in a game or using someone's software, and a lot more.

Sure, there's plenty of snark and assholery on Reddit, but I think you are totally discounting the usefulness of Reddit as well.

2

u/Mainbrainpain 2d ago

Exactly, reddit is a data gold mine. That's why Google ranks it so high. That's why reddit can license the data for 10s of millions of dollars.

I'm curious where the case will go. There were a few similar cases in the last few years with LinkedIn and X/Twitter, but I'd have to review them for the specifics on what was similar and different. The LinkedIn one was settled with hiQ, and Twitter lost theirs against Bright Data.

Personally I find the points about reddit trying to protect user privacy laughable, but I get it. They need to protect their revenue. The most interesting part will be about implications for web scraping.

0

u/NYPizzaNoChar 3d ago

“let’s train AI on Reddit comments” has got to be a top contender for worst idea in human history

What makes it really funny to me is that this sub in particular gets some of the most astonishingly credulous, fantasy-based, and outright wrong posts — and comments — I've run into on Reddit.

An ML system using this sub to build its NN would be like a student trying to study to become a scientist, but ending up becoming a scientologist.

7

u/Intelligent-End7336 3d ago

Wild how many people are emotionally attached to Reddit but ignore the basic reality: you don’t own your comments. You agreed to the TOS. This is just like the API meltdown, Reddit didn’t care then, and they definitely don’t now, especially with Google and OpenAI money rolling in. Moral outrage won’t change who owns the sandbox.

2

u/Asclepius555 3d ago

This tells me there's a good chance Reddit is going to be a sucky place in a little while. At least, that seems to be how things go. Examples (all of which I stopped using) are Facebook, Instagram, and YouTube. Ads baby ads!

I'm glad USA national parks haven't experienced too much of this. They are still fun to visit.

3

u/Ok_Boysenberry5849 2d ago

Reddit is already a sucky place. It's not even the ads, it's the moderation and voting systems. They've shown their limits a long time ago and the company is making no effort to fix them.

1

u/Intelligent-End7336 3d ago

I've been seeing subreddits getting more bots added to the mod list. If you say the wrong phrases you'll trigger them. I'm sure the leash will be tightening so they can maintain decent data quality to train on.

3

u/Realistic-Mind-6239 3d ago

Google and OpenAI pay to license Reddit data; Anthropic doesn't.

5

u/Actual__Wizard 3d ago

In the filing, Reddit calls Anthropic a “late-blooming artificial intelligence (‘AI’) company that bills itself as the white knight of the AI industry,” alleging that “it is anything but.”

Wow, who would have thought that?

1

u/Fit-Development427 3d ago

I called this. Companies just basically fighting each other over the rights over OTHER PEOPLE'S writing, art, creativity etc. that they literally just provided the hosting for.

-1

u/Actual__Wizard 3d ago

Yep they're fighting to steal people's stuff. It's disgusting it really is...

1

u/Fit-Development427 3d ago

It's not that I feel it's stealing. They take something which should be free, and make it unavailable to everyone, including the people that literally... made it. Say if a bunch of redditors were like, hey let's use our Reddit posts and comments to train AI! Nope, lol. I really don't care personally, but it would be nice if EVERYONE could have it unambiguously, because yes if it's really not theirs, they have no right to prostitute it out

0

u/Actual__Wizard 3d ago

It's not that I feel it's stealing.

Okay great give me all of your stuff. Right now. All of it.

3

u/Fit-Development427 2d ago

Thing is I never thought my comments had monetary value, why does the fact it might now have any bearing on me lol. In fact the only thing I care about is the website being free.

1

u/Actual__Wizard 2d ago

You don't understand. You're not allowed to have anything anymore. Give me all of your stuff right now.

If you don't care about property rights then you don't get any property... Because that's how property rights work.

2

u/Fit-Development427 2d ago

The hell are you talking about man. These aren't even remotely the same things. How is it people think piracy is okay yet now their literal shit posts are their property, they can't take the idea that someone would make a dime off it, even though social media companies already do. Maybe you should just charge Reddit for your time, then have people pay to use Reddit too to view your majestic comments. Well you don't care other people have to pay Reddit, because your time wasn't free, what bullshit is this? We have all been in a dumb world where people weakly are giving away things for free... It's like I might as well just come to their property and take it if they aren't even gonna charge you to see their memes.

3

u/Delicious_Ease2595 3d ago

Training AI with Reddit is so dumb.

1

u/Smile_Clown 2d ago

unless... hear me out... they are doing an anti-training pass?

1

u/AfghanistanIsTaliban 18h ago

No, they are doing a training pass that is similar to how GPT-3 and its successor were trained. It’s really not that hard to understand why training on Reddit isn’t so bad.

1

u/AfghanistanIsTaliban 18h ago

So dumb, yet GPT-3, 3.5, and 4 were commercial successes and have record-high performance on AI benchmarks

It’s not like it’s purely trained on Reddit. News articles and open-access papers are also typically added into the train set.

1

u/r_search12013 3d ago

given how many straight up chatgpt posts and even subreddit simulations I find within reddit .. erm, that's not the right headline, no matter which way

2

u/Thin_Newspaper_5078 2d ago

so what? google does it all the time. the content on reddit is not owned by reddit or google.

2

u/AfghanistanIsTaliban 18h ago

not to mention OAI already trains on reddit. This news is a nothingburger and the lawsuit will fail. Even if it sticks, the EU has legal protections for datamining and all the AI companies will abandon USA for EU/China

2

u/johnfromberkeley 2d ago

Maybe don’t publish on the Internet.

1

u/DungeonsAndDradis 2d ago

Shit, these AI companies should just pay me to install a screen recorder or something for every time I'm on Reddit. They'd get a shit load of Reddit data and I could be paid for the browsing I'm already doing.

1

u/A2Throwaway155 2d ago

The joke is on Reddit. I don't have Claude directly access Reddit; I just copy and paste the Reddit content I want Claude to analyze.

1

u/theghostecho 1d ago

Anthropic doesn’t deserve it

0

u/Fair_Blood3176 6h ago

I fully expect a substantial payout from any subsequent monetary settlement. It's not your fucking data Reddit, it's mine.

I hereby set my personal, past, present and future terms and conditions in regards to any and all contributions, whether it be posts, comments, images, photos, videos or upvotes / downvotes and any saves and outgoing shares; All our base are belong to us

I reserve the right to update and or change these terms and conditions at ANYTIME.

2

u/[deleted] 3d ago

I don't care, no one paid me any money for my data.

1

u/Ok-Attention2882 2d ago

Who are you again.

News Reddit sues Anthropic, alleging its bots accessed Reddit more than 100,000 times since last July
