The vast majority of computer vision research leads to technology that surveils human beings, a new preprint study that analyzed more than 20,000 computer vision papers and 11,000 patents spanning three decades has found. Crucially, the study found that computer vision papers often refer to human beings as “objects,” a convention that both obfuscates how common surveillance of humans is in the field, and objectifies humans by definition.
This still just feels like a muddying of technical language. If you wrote an article about autopilot killing somebody and used “object” to refer to the victim, that would certainly be dehumanizing, but saying that an object detection algorithm performs poorly on humans doesn’t feel like the same thing.
Part of the problem is that, in general, we aren’t talking about specialized human detection models that incorporate things like pose estimation. Instead, it is almost always a general object detection algorithm, and referring to the same model differently depending on the subject just makes the language muddier.
I’m mostly familiar with AI within healthcare, and in my workplace, any released model goes through a number of conversations and evaluations covering its technical performance, its practical impact on patients, and its general ethics. Those conversations blend, but it’s harmful to make the language less clear in any one of those contexts.