• fjordo@feddit.uk · 8 days ago

    I wish these companies would realise that acting like this is a very fast way to get scraping outlawed altogether, which is a shame because it can be genuinely useful (archival, automation, etc.).

  • Fijxu@programming.dev · 7 days ago

    AI scraping is so cancerous. I host a public RedLib instance (redlib.nadeko.net), and thanks to BingBot and the Amazon bots my instance was constantly rate-limited because the number of requests they make is insane. What makes me even angrier is that these fucking fuckers use free, privacy-respecting services to access Reddit and scrape. THEY CAN’T BE SO GREEDY. Hopefully, blocking their user agents works fine ;)
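
    Conceptually it doesn’t take much. Here’s a minimal sketch of user-agent blocking as Python WSGI middleware; the substring list and the response are illustrative only, not RedLib’s actual mechanism:

        # Toy middleware: refuse requests whose User-Agent matches a known scraper.
        # The block list below is illustrative, not an actual deployed config.
        BLOCKED_AGENTS = ("bingbot", "amazonbot", "gptbot", "ccbot")

        def block_scrapers(app):
            def middleware(environ, start_response):
                ua = environ.get("HTTP_USER_AGENT", "").lower()
                if any(bot in ua for bot in BLOCKED_AGENTS):
                    start_response("403 Forbidden", [("Content-Type", "text/plain")])
                    return [b"No scraping.\n"]
                return app(environ, start_response)
            return middleware

    Of course, nothing stops a scraper from lying about its user agent, so this only deters the polite ones.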

  • grue@lemmy.world · 8 days ago

    ELI5: why can’t the AI companies just clone the git repos and do all the slicing and dicing (running git blame, etc.) locally, instead of running expensive queries on the projects’ servers?
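
    i.e. something like this rough Python sketch, where the repo URL and file path are just placeholders:

        import subprocess
        import tempfile

        # Placeholders; substitute any public repository and tracked file.
        REPO = "https://example.org/some-project.git"
        FILE = "src/main.c"

        with tempfile.TemporaryDirectory() as workdir:
            # One clone transfers the full history in a single operation...
            subprocess.run(["git", "clone", "--quiet", REPO, workdir], check=True)
            # ...after which blame, log, diff, etc. run entirely against the
            # local copy, costing the project's servers nothing further.
            blame = subprocess.run(
                ["git", "-C", workdir, "blame", FILE],
                capture_output=True, text=True, check=True,
            )
            print(blame.stdout)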

    • zovits@lemmy.world · 8 days ago

      It takes more effort and results in a static snapshot, without being able to track the evolution of the project. (Disclaimer: I don’t work with AI, but I’d bet this is the reason. I don’t intend to defend those scraping twatwaffles in any way, just to offer a possible explanation.)
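
      To be fair, the snapshot could be kept current with periodic incremental fetches, but that is exactly the extra machinery they’d have to build and run. A sketch, with the mirror path and interval made up:

          import subprocess
          import time

          # Hypothetical mirror location; assumes an initial clone already exists.
          MIRROR = "/srv/mirrors/some-project"

          while True:
              # An incremental fetch downloads only the new objects, keeping
              # the local history current; extra machinery the scrapers
              # apparently can't be bothered to run.
              subprocess.run(["git", "-C", MIRROR, "fetch", "--quiet"], check=True)
              time.sleep(6 * 60 * 60)  # refresh every six hours (arbitrary)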

  • RobotToaster@mander.xyz · 8 days ago

    If an AI is detecting bugs, the least it could do is file a pull request. These things are supposed to be master coders, right? 🙃

  • MonkderVierte@lemmy.ml · edited 8 days ago

    Assuming we could build a new internet from the ground up, what would be the solution? IPFS for load balancing?
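
    To make that idea concrete: in IPFS, content is addressed by the hash of its bytes, so any node holding a copy can serve it and the receiver can verify it, which is what spreads the load. A toy Python sketch of content addressing (a drastic simplification; real IPFS uses multihash-encoded CIDs and a DHT for lookup):

        import hashlib

        # Toy content-addressed store: data is located by the hash of its
        # content, so any node holding a copy can serve it verifiably.
        store = {}

        def put(data: bytes) -> str:
            address = hashlib.sha256(data).hexdigest()
            store[address] = data
            return address

        def get(address: str) -> bytes:
            data = store[address]
            # The address doubles as an integrity check.
            assert hashlib.sha256(data).hexdigest() == address
            return data

        addr = put(b"hello, distributed web")
        print(addr)
        print(get(addr))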