Facebook’s contract workers are looking at your private posts to train AI

A new report from Reuters reveals that outsourced workers are reviewing private posts on Facebook and Instagram in order to label them for artificial intelligence systems.

Like many technology companies, Facebook uses machine learning and artificial intelligence to classify content on its platforms. But to do this, the software must be able to identify different types of content, and to train these algorithms, it needs sample data that has been categorized and labeled by people, a process known as "data annotation."
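To make the idea of data annotation more concrete, here is a minimal, purely illustrative Python sketch of how human-labeled examples are used to fit a simple text classifier. The sample posts, label names, and the scikit-learn model are assumptions chosen for illustration, not Facebook's actual taxonomy or pipeline.

```python
# Illustrative only: "data annotation" produces hand-labeled examples like these.
# The label names below are hypothetical, not Facebook's actual categories.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each sample pairs a piece of content with a category assigned by a human annotator.
annotated_samples = [
    ("Had the best birthday dinner with the family!", "occasion_birthday"),
    ("Check out my new haircut", "content_selfie"),
    ("Homemade pasta night, recipe in the comments", "content_food"),
    ("So proud of everyone who ran the marathon today", "intent_inspire"),
]

texts, labels = zip(*annotated_samples)

# The model can only learn categories that annotators have already defined and applied.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

print(model.predict(["Look at this cake I baked for the party"]))
```

The point of the sketch is simply that every category a model can predict has to be defined and applied by people first; that is the work being done, at far greater scale, by the contractors described below.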

Reuters' report focuses on the Indian outsourcing company Wipro, which has employed as many as 260 workers to label posts according to five categories. These include the content of the post (is it a selfie, for example, or a photo of food); the occasion (is it a birthday or a wedding); and the author's intent (are they making a joke, trying to inspire others, or organizing a party?).

Wipro employees label a range of Facebook and Instagram content, including status updates, videos, photos, shared links, and Stories. Each piece of content is reviewed by two workers to check the accuracy of the labeling, and workers get through roughly 700 items each day.
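As a rough, hypothetical sketch of that dual-review step, the snippet below compares two independent labels for the same item and keeps only the items where the annotators agree; the record structure and field names are invented for illustration and are not described in the Reuters report.

```python
# Hypothetical dual-annotation check: each item is labeled by two workers, and
# only items where both picked the same label are treated as reliable training data.
annotations = [
    {"item_id": 1, "worker": "A", "label": "content_selfie"},
    {"item_id": 1, "worker": "B", "label": "content_selfie"},
    {"item_id": 2, "worker": "A", "label": "occasion_birthday"},
    {"item_id": 2, "worker": "B", "label": "occasion_wedding"},
]

# Collect the labels each item received.
labels_by_item = {}
for a in annotations:
    labels_by_item.setdefault(a["item_id"], []).append(a["label"])

# Keep items where both workers agree; flag the rest for another look.
agreed = {item: labels[0] for item, labels in labels_by_item.items() if len(set(labels)) == 1}
disputed = [item for item, labels in labels_by_item.items() if len(set(labels)) > 1]

print("agreed:", agreed)             # {1: 'content_selfie'}
print("needs re-review:", disputed)  # [2]
```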

Facebook confirmed to Reuters that the content reviewed by Wipro workers includes private posts shared with a select number of friends, and that the data sometimes includes users' names and other sensitive information. Facebook says it has 200 content labeling projects around the world, employing thousands of people in total.

"It's a fundamental part of what you need," he told Reuters Nipun Mathur, director of product management for the AI ​​of Facebook. "I do not see the need going away."

These data annotation projects are key to the development of AI, and they have become something like call center work, outsourced to countries where labor is cheaper.

In China, for example, huge offices of workers tag driving footage in order to train self-driving cars to identify cyclists and pedestrians. Most internet users have done this type of work without even knowing it: Google's CAPTCHA system, which asks you to identify objects in images to "prove" you're human, is used to digitize information and train AI.

This type of work is necessary, but it is worrying when the data in question is private. Recent reporting has highlighted how teams of workers tag sensitive data collected by Amazon Echo devices and Ring security cameras. When you talk to Alexa, you don't imagine that someone else will listen to your conversation, but that is exactly what can happen.

The problem is even more worrisome when work is outsourced to companies that may have lower security and privacy standards than large technology companies.

Facebook says its legal and privacy teams approve all data tagging efforts, and the company told Reuters that it recently introduced an audit system "to ensure that privacy expectations are met and that the parameters are working as expected."

However, the company may still be in breach of the European Union's GDPR, which sets strict limits on how companies can collect and use personal data.

Facebook says the data tagged by human workers is used to train a range of machine learning systems. These include recommending content in the company's Marketplace shopping feature; describing photos and videos for visually impaired users; and sorting posts so that certain ads don't appear next to political or adult content.