In dusty factories, cramped internet cafes and makeshift home offices around the world, millions of people sit at computers and tediously label data. These workers are the lifeblood of the booming artificial intelligence (AI) industry. Without them, products like ChatGPT simply wouldn't exist, because the data they label helps AI systems "learn". But despite the vital contribution of this workforce to an industry expected to be worth $407 billion by 2027, its people remain largely invisible and are frequently exploited.
Earlier this year, nearly 100 data labelers and AI workers from Kenya who do work for companies like Facebook, Scale AI and OpenAI published an open letter to US President Joe Biden in which they said: "Our working conditions amount to modern day slavery." To ensure AI supply chains are ethical, industry and governments must urgently address this issue. But the key question is: how?
What is data labeling?
Data labeling is the process of annotating raw data – such as images, videos or text – so that AI systems can recognize patterns and make predictions. Self-driving cars, for example, depend on labeled video footage to distinguish pedestrians from road signs. Large language models such as ChatGPT rely on labeled text to understand human language. These labeled datasets are the foundation of AI models. Without them, AI systems could not function effectively.
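To make the idea concrete, here is a minimal sketch of what labeled data looks like in practice. The sentences, label names and the toy word-counting step below are purely illustrative, not drawn from any real dataset or production system:

```python
from collections import defaultdict

# Raw data: text an AI system cannot yet interpret on its own.
# Human annotators attach a label to each sample -- this is data labeling.
labeled_samples = [
    ("A pedestrian is crossing the street.", "pedestrian"),
    ("The sign says the speed limit is 60.", "road_sign"),
    ("Someone is walking along the sidewalk.", "pedestrian"),
]

# A model then learns patterns from the (input, label) pairs -- here,
# simply which words tend to co-occur with each label.
word_counts = defaultdict(lambda: defaultdict(int))
for text, label in labeled_samples:
    for word in text.lower().split():
        word_counts[label][word] += 1

# The word "is" was seen twice in samples labeled "pedestrian".
print(word_counts["pedestrian"]["is"])  # prints 2
```

Real systems use far larger datasets and statistical models, but the human step is the same: someone has to write that label next to each sample.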
Tech giants like Meta, Google, OpenAI and Microsoft outsource much of this work to data labeling facilities in countries such as the Philippines, Kenya, India, Pakistan, Venezuela and Colombia. China is also becoming a global hub for data labeling. Outsourcing companies that facilitate this work include Scale AI, iMerit and Samasource. They are very large companies in their own right. Scale AI, headquartered in California, is now worth $14 billion.
Big tech companies like Alphabet (Google's parent company), Amazon, Microsoft, Nvidia and Meta have invested billions in AI infrastructure, from computing power and data storage to emerging computing technologies. Large-scale AI models can cost tens of millions of dollars to train. Once deployed, maintaining these models requires ongoing investment in data labeling, refinement and real-world testing.
But even though investments in AI are significant, revenues have not always met expectations. Many industries continue to view AI projects as experimental, with unclear paths to profitability. In response, many companies are cutting costs, and this affects the most vulnerable people at the very bottom of the AI supply chain: data labelers.
Low wages, dangerous working conditions
Companies involved in the AI supply chain are trying to reduce costs in particular by employing large numbers of data labelers in countries of the Global South, such as the Philippines, Venezuela, Kenya and India. Workers in these countries face stagnant or declining wages. For example, the hourly rate for AI data labelers in Venezuela ranges from 90 cents to $2. In comparison, the rate in the United States is between $10 and $25 per hour. In the Philippines, workers who label data for multi-billion dollar companies like Scale AI often earn well below the minimum wage.
Some labeling providers even resort to child labor. And there are many other labor issues within the AI supply chain. Many data labelers work in crowded and dusty environments that pose a serious risk to their health. They also often work as independent contractors, without access to protections such as health care or compensation.
The mental cost of data labeling work is also significant, with repetitive tasks, strict deadlines and rigid quality controls. Data labelers are sometimes asked to read and label hate speech or other abusive language or material, which has been proven to have negative psychological effects. Mistakes can lead to pay cuts or job losses. Yet labelers often face a lack of transparency in how their work is evaluated: they are frequently denied access to performance data, hindering their ability to improve or challenge decisions.
Making AI supply chains ethical
As AI development becomes more complex and companies strive to maximize profits, the need for ethical AI supply chains becomes urgent. Companies can help achieve this by applying a human rights-centered approach to design, deliberation and oversight across the entire AI supply chain. They must adopt fair wage policies, ensuring that data labelers receive a living wage that reflects the value of their contributions.
By integrating human rights into the supply chain, AI companies can foster a more ethical and sustainable industry, ensuring that worker rights and business responsibility align with long-term success. Governments should also create new regulations that mandate these practices, encouraging fairness and transparency. This includes transparency in performance evaluation and in the processing of personal data, allowing workers to understand how they are assessed and to challenge any inaccuracies.
Clear payment systems and redress mechanisms will ensure that workers are treated fairly. Instead of dismantling unions, as Scale AI did in Kenya in 2024, businesses should support the formation of digital unions or cooperatives. This would give workers a voice to advocate for better working conditions.
As users of AI products, we can all advocate for ethical practices by supporting companies that are transparent about their AI supply chains and committed to treating their workers fairly. Just as we reward producers of environmentally friendly and fair-trade physical goods, we can drive change by choosing digital services and smartphone apps that respect human rights standards, by promoting ethical brands on social media, and by voting with our wallets every day to hold tech giants accountable.
Ganna Pogrebna is the Executive Director of the AI and Cyber Futures Institute at Charles Sturt University.