Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.

Tea@programming.dev · 15 hours ago

Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.

Dr. Moose@lemmy.world · edit-2 3 hours ago

Considering how many false positives Cloudflare serves I see nothing but misery coming from this.

weremacaque@lemmy.world · edit-2 3 hours ago

You have thirteen hours in which to solve the labyrinth before your baby AI becomes one of us, forever.

TorJansen@sh.itjust.works · 5 hours ago

And soon, the already AI-flooded net will be filled with so much nonsense that it becomes impossible for anyone to get some real work done. Sigh.

4am@lemm.ee · 10 hours ago

Imagine how much power is wasted on this unfortunate necessity.

Now imagine how much power will be wasted circumventing it.

Fucking clown world we live in

Demdaru@lemmy.world · 10 hours ago

On one hand, yes. On the other… imagine the frustration of management at the companies making and selling AI services. This is such a sweet thing to imagine.

Melvin_Ferd@lemmy.world · 4 hours ago

I just want to keep using uncensored AI that answers my questions. Why is this a good thing?

explodicle@sh.itjust.works · 51 minutes ago

Because it only harms bots that ignore the “no crawl” directive, so your AI remains uncensored.
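For anyone wondering what honoring a “no crawl” directive looks like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `ExampleAIBot` user agent are made up for illustration, and `may_fetch` is a hypothetical helper name, not anything from Cloudflare's announcement.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only; not taken from any real site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch("ExampleAIBot", "https://example.com/page"))
    print(may_fetch("SomeBrowser", "https://example.com/page"))
```

A crawler that runs a check like this before every request and skips disallowed URLs would, per the announcement, never be sent into the labyrinth in the first place.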

halfapage@lemmy.world · 9 hours ago

My dude, they’ll literally sell services to both sides of the market.

Demdaru@lemmy.world · 9 hours ago

I…uh…frick.

oldfart@lemm.ee · 11 hours ago

So the web is a corporate war zone now and you can choose feudal protection or being attacked from all sides. What a time to be alive.

theparadox@lemmy.world · 9 hours ago

There is also the corpo verified id route. In order to avoid the onslaught of AI bots and all that comes with them you’ll need to sacrifice freedom, anonymity, and privacy like a good little peasant to prove you aren’t a bot… and so will everyone else. You’ll likely be forced to deal with whatever AI bots are forced upon you while within the walls but better an enemy you know I guess?

quack@lemmy.zip · edit-2 10 hours ago

Generating content with AI to throw off crawlers. I dread to think of the resources we’re wasting on this utter insanity now, but hey who the fuck cares as long as the line keeps going up for these leeches.

AtomicHotSauce@lemmy.world · 15 hours ago

That’s just BattleBots with a different name.

aviationeast@lemmy.world · 15 hours ago

You’re not wrong.

IrateAnteater@sh.itjust.works · 14 hours ago

Ok, I now need a screensaver that I can tie to a cloudflare instance that visualizes the generated “maze” and a bot’s attempts to get out.

x1gma@lemmy.world · 12 hours ago

You probably just should let an AI generate that.

RelativeArea1@sh.itjust.works · edit-2 1 hour ago

This is such a fucking stupid situation: we finally got a somewhat faster internet, and now these bots messing with each other are hogging the bandwidth.

Dr. Moose@lemmy.world · 3 hours ago

Lol, website traffic accounts for like 1% of the bandwidth budget. One Netflix movie is like 20k web pages.

dual_sport_dork 🐧🗡️@lemmy.world · edit-2 13 hours ago

Especially since the solution I cooked up for my site works just fine and took a lot less work. It is simply to identify the incoming requests from these damn bots (which is not difficult, since they ignore all directives and sanity and try to slam your site with like 200+ requests per second, which makes 'em easy to spot) and IP ban them. This is considerably simpler, and doesn’t require an entire nuclear-plant-powered AI to combat the opposition’s nuclear-plant-powered AI.

In fact, anybody who doesn’t exhibit a sane crawl rate gets blocked from my site automatically. For a while, most of them were coming from Russian IP address zones for some reason. These days Amazon is the worst offender, I guess their Rufus AI or whatever the fuck it is tries to pester other retail sites to “learn” about products rather than sticking to its own domain.

Fuck 'em. Route those motherfuckers right to /dev/null.
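A rough sketch of that kind of rate-based ban, for the curious. This is not the commenter's actual setup: the class name `RateBanList`, the thresholds, and the in-memory storage are all assumptions made for illustration. It counts each IP's requests in a sliding window and bans any address that exceeds the limit.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class RateBanList:
    """Ban any IP that makes more than max_requests within window_seconds.

    Hypothetical helper; the default thresholds are assumptions, not
    values recommended anywhere in this thread.
    """

    def __init__(self, max_requests: int = 50, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.banned = set()

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Record one request from ip; return False if ip is (now) banned."""
        if ip in self.banned:
            return False
        if now is None:
            now = time.monotonic()
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_requests:
            self.banned.add(ip)
            del self._hits[ip]
            return False
        return True


if __name__ == "__main__":
    limiter = RateBanList(max_requests=3, window_seconds=1.0)
    for t in (0.0, 0.1, 0.2, 0.3):
        print(limiter.allow("203.0.113.7", now=t))
```

In a real deployment the ban would more likely be pushed to the firewall or reverse proxy rather than held in a Python set, and bans would probably need an expiry.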

Flagstaff@programming.dev · 12 hours ago

Geez, that’s a lot of requests!

dual_sport_dork 🐧🗡️@lemmy.world · 12 hours ago

It sure is. Needless to say, I noticed it happening.

morrowind@lemmy.ml · 12 hours ago

Cloudflare offers that too, but you can’t always tell

desktop_user@lemmy.blahaj.zone · 12 hours ago

The only problem with applying that solution to generic websites is that schools and institutions can have many legitimate users behind one IP address, and many sites don’t want to risk accidentally blocking one.

dual_sport_dork 🐧🗡️@lemmy.world · 12 hours ago

This is fair in those applications. I only run an ecommerce web site, though, so that doesn’t come into play.

AnthropomorphicCat@lemmy.world · 13 hours ago

So the world is now wasting energy and resources to generate AI content in order to combat AI crawlers, by making them waste more energy and resources. Great! 👍

brucethemoose@lemmy.world · edit-2 11 hours ago

The energy cost of inference is overstated. Small models, or “sparse” models like DeepSeek, are not expensive to run. Training is a one-time cost that still pales in comparison to, like, making aluminum.

Doubly so once inference goes more on-device.

Basically, only Altman and his tech bro acolytes want AI to be cost prohibitive so he can have a monopoly. Also, he’s full of shit, and everyone in the industry knows it.

AI as it’s implemented has plenty of enshittification, but the energy cost is kinda a red herring.

cultsuperstar@lemmy.world · 8 hours ago

I introduce to you, the Trace Buster Buster!

https://youtu.be/Iw3G80bplTg

If you’ve never seen the movie The Big Hit, it’s great.

GreenKnight23@lemmy.world · 10 hours ago

hey look it’s that “zip bomb” I mentioned.

fuck cloudflare though.

MTK@lemmy.world · 12 hours ago

I swear someone released this exact thing a few weeks ago

alecbowles@lemm.ee · 10 hours ago

We want names

Blackmist@feddit.uk · 9 hours ago

https://www.404media.co/developer-creates-infinite-maze-to-trap-ai-crawlers-in/

XeroxCool@lemmy.world · 15 hours ago

Will this further fuck up the inaccurate nature of AI results? While I’m rooting against shitty AI usage, the general population is still trusting it and making results worse will, most likely, make people believe even more wrong stuff.

ladel@feddit.uk · edit-2 14 hours ago

The article says it’s not poisoning the AI data, only providing valid facts. The scraper still gets content, just not the content it was aiming for.

E:

It is important to us that we don’t generate inaccurate content that contributes to the spread of misinformation on the Internet, so the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled.

ObsidianZed@lemmy.world · 9 hours ago

Until the AI generating the content starts hallucinating.

finitebanjo@lemmy.world · 11 hours ago

Cloudflare kind of real for this. I love it.

It makes perfect sense for them as a business, infinite automated traffic equals infinite costs and lower server stability, but at the same time how often do giant tech companies do things that make sense these days?

Dr. Moose@lemmy.world · 3 hours ago

I’m sorry how does it make sense?

ozymandias117@lemmy.world · 9 hours ago

Kind of seems like they simply installed this dude’s tarpit from a few months ago

https://zadzmo.org/code/nepenthes/
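To make the “infinite maze” idea concrete, here is a toy sketch. It is not the linked project's code and not Cloudflare's implementation; the function name `maze_page` and the `/maze/` URL prefix are made up. Every path deterministically yields a page of links to further generated paths, so a crawler that keeps following links never runs out of pages.

```python
import hashlib


def maze_page(path: str, links_per_page: int = 5) -> str:
    """Return an HTML page for path whose links all lead deeper into the maze.

    Hypothetical illustration: child paths are derived by hashing the
    current path, so the same URL always produces the same page and
    nothing has to be stored on the server.
    """
    links = []
    for i in range(links_per_page):
        digest = hashlib.sha256(f"{path}:{i}".encode("utf-8")).hexdigest()
        child = digest[:12]
        links.append(f'<li><a href="/maze/{child}">{child}</a></li>')
    return "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>"


if __name__ == "__main__":
    print(maze_page("/maze/start"))
```

A real tarpit would also need believable filler text and some way to keep well-behaved visitors out of the maze, which this sketch leaves out.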

Trapping misbehaving bots in an AI Labyrinth