By Nivedita S
Copyright thehindu
The world is moving towards an ‘automated economy’ in which machines relying on artificial intelligence (AI) systems produce quick, efficient and nearly error-free outputs. However, AI is not getting smarter on its own; it has been built on, and continues to rely on, human labour and energy resources. These systems are fed information and trained by workers who are mainly located in developing countries and are kept invisible by large tech companies.
Areas of human involvement
A machine cannot process the meaning behind raw data. Data annotators label raw images, audio, video and text, and this labelled material becomes the training set for AI and Machine Learning (ML) models. For example, a large language model (LLM) cannot recognise the colour ‘yellow’ unless the data has been labelled as such. Similarly, self-driving cars rely on video footage that has been labelled to distinguish between traffic signs and humans on the road. The higher the quality of the dataset, the better the output, and the more human labour goes into creating it.
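As a rough illustration of why labelling matters, the minimal Python sketch below shows a program that can name a colour only because humans tagged examples first. It is not any company's actual pipeline: the sample pixel values, the function names `train` and `predict`, and the simple nearest-average method are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical annotated examples: an (R, G, B) pixel value paired with
# the label a human annotator assigned to it.
LABELLED = [
    ((255, 255, 0), "yellow"),
    ((250, 240, 30), "yellow"),
    ((255, 0, 0), "red"),
    ((230, 20, 25), "red"),
    ((0, 0, 255), "blue"),
    ((20, 30, 240), "blue"),
]


def train(examples):
    """Average the colour values for each label (a nearest-average model)."""
    sums = defaultdict(lambda: [0, 0, 0])
    counts = defaultdict(int)
    for rgb, label in examples:
        for i, value in enumerate(rgb):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in sums[label])
        for label in sums
    }


def predict(model, rgb):
    """Return the label whose average colour is closest to the given pixel."""
    def squared_distance(centre):
        return sum((a - b) ** 2 for a, b in zip(centre, rgb))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(LABELLED)
print(predict(model, (245, 235, 10)))
```

If the rows tagged ‘yellow’ are removed from the examples, the program can no longer return that label at all; it can only choose among the labels that annotators supplied.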
Data annotators play a major role in training LLMs such as ChatGPT and Gemini. An LLM is trained in three steps: self-supervised learning, supervised learning and reinforcement learning. In the first step, the machine picks up information from large datasets on the Internet. Data labellers or annotators enter in the second and third steps, where the model is fine-tuned to give the most accurate response. Humans give feedback on the output the AI produces so that it generates better responses over time, and they also help remove errors and jailbreaks.
Tech companies in Silicon Valley outsource this meticulous annotation work mainly to workers in countries such as Kenya, India, Pakistan, China and the Philippines, who work long hours for low wages.
Data labelling tasks are of two types: those that do not require subject expertise, and more niche tasks that do. Several tech companies have been accused of employing non-experts for technical subjects that require prior knowledge. This is a contributing factor in the errors found in the output produced by AI. A data labeller from Kenya revealed that they were tasked with labelling medical scans for an AI system intended for use in healthcare services elsewhere, despite lacking relevant expertise.
However, because of the errors that result from this, companies are starting to ensure that experts handle such information before it is fed into the system.
Automated features requiring humans
Even features marketed as ‘fully automated’ are often underpinned by invisible human work. For example, our social media feeds are ‘automatically’ filtered to censor sensitive and graphic content. This is only possible because human moderators went through thousands of uncensored images, texts and audio clips and labelled such content as harmful. Daily exposure to such content has also been reported to cause severe mental health issues such as post-traumatic stress disorder, anxiety and depression among these workers.
Similarly, there are voice actors and screen actors behind AI-generated audio and video. Actors may be required to film themselves dancing or singing so that these machines can learn to recognise human movements and sounds. Children have also reportedly been engaged to perform such tasks.
In 2024, AI tech workers from Kenya sent a letter to former U.S. President Joe Biden describing the poor working conditions they are subjected to. “In Kenya, these US companies are undermining the local labor laws, the country’s justice system and violating international labor standards. Our working conditions amount to modern-day slavery,” the letter read. They said the content they have to annotate can range from pornography and beheadings to bestiality, that they work on it for more than eight hours a day, and that they are paid less than $2 an hour, which is very low in comparison to industry standards. They also face strict deadlines, with each task to be completed within a few seconds or minutes.
When workers raised their concerns with the companies, they were sacked and their unions were dismantled.
Most AI tech workers are engaged in online gig work and are unaware of which large tech company they are working for. This is because, to minimise costs, AI companies outsource the work through intermediary digital platforms. Subcontracted workers on these platforms are paid per “microtask” they perform. They are constantly surveilled, and if they fall short of the targeted output, they are fired. As a result, the labour network becomes fragmented and lacks transparency.
The advancement of AI is powered by such “ghost workers.” The lack of recognition for their work, and its informalisation, helps tech companies perpetuate this system of labour exploitation. There is a need for stricter laws and regulations on AI companies and digital platforms, covering not just their content in the digital space but also the labour supply chains powering AI, to ensure transparency, fair pay and dignity at work.