Collective Bargaining (that's unions, that's us) to address AI-driven power concentration

https://arxiv.org/pdf/2506.10272

This position paper argues that there is an urgent need to restructure markets for the information that goes into AI systems. Specifically, producers of information goods (such as journalists, researchers, and creative professionals) need to be able to collectively bargain with AI product builders in order to receive reasonable terms and a sustainable return on the informational value they contribute. We argue that without increased market coordination or collective bargaining on the side of these primary information producers, AI will exacerbate a large-scale “information market failure” that will lead not only to undesirable concentration of capital, but also to a potential “ecological collapse” in the informational commons. On the other hand, collective bargaining in the information economy can create market frictions and aligned incentives necessary for a pro-social, sustainable AI future. We provide concrete actions that can be taken to support a coalition-based approach to achieve this goal. For example, researchers and developers can establish technical mechanisms such as federated data management tools and explainable data value estimations, to inform and facilitate collective bargaining in the information economy. Additionally, regulatory and policy interventions may be introduced to support trusted data intermediary organizations representing guilds or syndicates of information producers.

Interesting paper, thanks for sharing!

I find it a bit optimistic, though. The main problem I have with it is that it fails to reckon with the fact that AI companies devour information in bad faith. Their willingness to DDoS the whole internet or pirate millions of books shows that pretty plainly. Information producers (e.g. journalists) are kind of powerless to do anything about it – if you publish, an AI company will just take your data and feed it to their model. The paper briefly acknowledges that AI companies take information in bad faith (p. 5) but doesn’t really address it in its argument.

The paper suggests “information-producing actors should, to the maximum feasible extent, seek to disseminate and license information in ways that prevent AI companies from training on it through backdoor access instead of negotiating for rights”, which is sensible enough in principle, but it’s not at all obvious how we should be disseminating and licensing information in those ways.
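The closest thing we have today at the “dissemination” layer is refusing service at the front door. Here’s a minimal sketch (my own illustration, not anything proposed in the paper) of a publisher turning away the publicly documented AI crawler user agents; the obvious catch, and exactly my point above, is that it only stops crawlers that identify themselves honestly:

```python
# Toy sketch: refuse to serve content to self-identified AI crawlers.
# The substrings below are publicly documented crawler user agents; the list
# needs constant upkeep, and a bad-faith scraper will simply spoof its UA.
from flask import Flask, abort, request

app = Flask(__name__)

AI_CRAWLER_UAS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider", "PerplexityBot")

@app.before_request
def block_ai_crawlers():
    ua = request.headers.get("User-Agent", "")
    if any(bot in ua for bot in AI_CRAWLER_UAS):
        abort(403)  # deny the request outright

@app.route("/articles/<slug>")
def article(slug: str):
    return f"full text of {slug}"
```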

This isn’t helped by the fact that Western courts, unfortunately, seem to be agreeing with the AI companies that their use of this data constitutes fair use, which categorically rules out any form of licensing that would protect information producers.

I suspect that the only viable legal avenue would involve citing the Computer Fraud and Abuse Act (or similar laws in other jurisdictions), which we could argue applies when LLM companies go out of their way to circumvent technologies deployed for the express purpose of denying them access to a computer system. But even that is quite limited in its effectiveness: it perhaps covers online journalism, but not print media, and the AI companies’ willingness to resort to piracy shows us that even then it might not really work (a court did recently rule that the piracy approach wasn’t going to fly, even as it accepted the fair use argument (ugh)).


I think the bigger issue with how all of this AI garbage is being ruled on ties into the broader problem of “move fast and break things” culture being tolerated by our legal institutions, where the rules (read: the laws) don’t apply if you’re a corporation. Whether it’s housing laws and regulations if you’re Airbnb, transport regulations if you’re Uber, or copyright law if you’re some AI slop corporation. Anna’s Archive is public enemy number one of the corporate state, right up until it’s useful (profitable) for all of the AI companies to feed their models with it. Laws are optional if you’re a corporation, and changing that is probably priority one in fixing this mess.


Back to the paper: another thing I wanted to remark on is its reliance on market interventions to fix the problem, which is a very left-liberal “solution”, and which tacitly legitimizes the copyright system and intellectual property markets in general, something I broadly disagree with. But setting aside the question of the legitimacy of intellectual property, the issue with this paper, in my view, is that markets, even with liberal oversight, permit a non-democratic allocation of resources towards something none of us asked for: AI slop and all of its negative externalities. The paper never questions the assumption that we should be providing data for AI models at all, even in some more economically equitable manner. Democratic oversight over the distribution of wealth and investment in our society would allow us to side-step the whole question by investing in things that matter to people instead. Or at least we could demand that AI use only meager resources for further R&D until it can be proven useful for anything.

In my most generous estimation, the paper could be read as making an appeal to liberal policymakers to “solve” the problem in the only way they understand, but the liberals aren’t going to be in charge for much longer. Either they stubbornly cling to power as it drains out from between their fingers and into the hands of the conservatives and the far right, or the leftists finally assert themselves and do away with these fucking markets once and for all.


I also had a thought on this passage:

CBI gives us a third state [besides (1) under private control of an agent, or (2) out in public], namely information held by trusted representatives of large classes of information producers, and who may then titrate information, with caution, to self-interested third-parties. These trusted representatives are not aiming simply to maximize power or wealth; they have broader, more complex concerns like diversity in information, the well-being of their set of stakeholders, and/or the maintenance and development of a particular aspect of culture. Because they are empowered to represent larger aggregated pools of data, they enjoy bargaining power that individual agents do not. By mitigating the influence of more narrowly self-interested agents within groups, they maximize the good for groups as a whole. By increasing potential returns on investment in information goods, they preserve incentives for knowledge production.

– p. 8

I think we already have an example of a “third state” information broker to examine: academic publishers (journals). They hold exactly the kind of aggregated pool of information the paper describes, and they enjoy enormous bargaining power because of it, yet they use that power to extract rents from the very researchers they nominally represent. I suspect we can all agree that these particular information brokers do not contribute positively to society.


Final thought: my take on this paper is pretty critical, but if creatives want to engage in collective bargaining with AI companies, we should support them, perhaps even to the point of investing more in technologies like Anubis/go-away, or even tech that poisons the well. Ultimately the word of the day is always solidarity. The outcome of such an action can only be positive for the working class.
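For the curious: the trick behind Anubis-style gating is a proof-of-work challenge, i.e. make every client burn a little CPU before it gets content, which is negligible for one human reader but expensive for a crawler hammering millions of pages. Here’s a toy sketch of the general idea (my own illustration of the technique, not Anubis’s actual protocol; the real tools run the solver in the visitor’s browser):

```python
# Toy proof-of-work gate: the server issues a random challenge, and the client
# must find a nonce whose SHA-256 hash has enough leading zero bits.
# Verification is a single hash, so the cost lands entirely on the client.
import hashlib
import os

DIFFICULTY_BITS = 20  # tune so a normal browser solves it in about a second

def make_challenge() -> str:
    return os.urandom(16).hex()

def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits

def verify(challenge: str, nonce: int) -> bool:
    """Server side: one cheap hash check per request."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= DIFFICULTY_BITS

def solve(challenge: str) -> int:
    """Client side: brute-force a nonce. This is the cost imposed on scrapers."""
    nonce = 0
    while not verify(challenge, nonce):
        nonce += 1
    return nonce
```

Well-poisoning tools work on the other end of the same asymmetry: instead of denying access, they serve generated garbage to suspected scrapers so that whatever does get taken degrades the model that eats it.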
