

I disagree that Reddit's data would gain in value over time if they kept banning automation, because it is increasingly difficult to keep AI-generated material out of your dataset, no matter how hard you try to ban automation. Inevitably, some AI-generated material is going to get in.
It’s a problem in two ways:
- The vast majority of data on Reddit has already been sold, so you can’t rely on that data for future revenue
- The remaining, current data is polluted by AI and is therefore worth less than the historical data: the more AI-generated text ends up in your dataset, the more likely it is to lead to model collapse, where an LLM degrades after being trained on unverified output from other LLMs
I am firmly of the belief that sites like the Internet Archive will be some of the most valuable players in the AI space, because they hold an immense amount of untainted data created prior to 2019.
I’ve been doing that for years. I’ve been claiming to be a conservative and supporting things like universal healthcare. I even give it capitalist flair by saying that ensuring everyone has more money means I can then take that money by selling them shit they don’t need. How the hell am I supposed to sell my useless crap if everyone’s spending their money on rent?!
Ditto with stuff like housing the unhoused. I don’t want filthy drug addicts strewn about the streets taking up my park benches and constantly asking me for ‘bus money’! Get them houses so I don’t have to see them anymore! Also god I hate kids, especially when they’re just hanging around on the street being annoying and intimidating. Build some youth centres so they have somewhere to go and get them away from me!
Altruism through selfishness etc etc etc.