This photograph, taken by a motion-activated camera trap, was identified as "zebra, moving" by an artificial intelligence model.
An increasingly popular, non-invasive way of monitoring wildlife is the use of motion-sensor cameras, commonly called camera traps. The images produced typically must be processed by humans who record which animals, if any, the cameras captured. A new study shows how an artificial intelligence (AI) program can be trained to label images nearly as accurately as humans do.
Researchers led by computer scientists Mohammad Norouzzadeh and Jeff Clune at the University of Wyoming applied such a program to 3.2 million camera-trap images collected in Serengeti National Park in Tanzania. As part of an online citizen science project called Snapshot Serengeti, tens of thousands of human volunteers had previously noted which species were present in each image.
The researchers used 1.4 million of these images, along with their volunteer-supplied labels, to train nine different AI programs, called “deep neural networks,” to label new images. Deep neural networks are loosely modeled on the way animal brains work. Millions of artificial neurons communicate with each other to learn to process visual information at multiple levels, starting with simple features such as lines and colors and building up to recognizing entire animals.
After training, the researchers tested each deep neural network on a set of 105,000 images that the networks had never previously “seen.” Measured against labels provided by human experts, the combined models identified species with 94.9 percent accuracy, quite close to the 96.6 percent accuracy obtained by human volunteers. The AI programs also provided a confidence score for each label. When limited to the labels they were confident about, the combined AI programs labeled 99.3 percent of images with 96.6 percent accuracy, matching the volunteers. For improved accuracy, the researchers suggest that the remaining low-confidence images can be labeled by humans and fed back into the training set of images.
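The confidence-based division of labor described above can be illustrated with a minimal Python sketch. It is not the study's code: the averaging of model outputs, the function names `combine` and `triage`, the 0.9 threshold, and the three sample images are all assumptions made up for illustration. The sketch averages the species probabilities from several models, accepts the top label when its averaged confidence clears the threshold, and queues everything else for a human.

```python
from typing import Dict, List, Tuple

Probs = Dict[str, float]  # species name -> predicted probability


def combine(model_outputs: List[Probs]) -> Probs:
    """Average the per-species probabilities from several models (assumed scheme)."""
    species = set().union(*model_outputs)
    n = len(model_outputs)
    return {s: sum(m.get(s, 0.0) for m in model_outputs) / n for s in species}


def triage(images: Dict[str, List[Probs]],
           threshold: float) -> Tuple[Dict[str, str], List[str]]:
    """Auto-label confident images; queue the rest for human review."""
    auto_labels: Dict[str, str] = {}
    needs_human: List[str] = []
    for image_id, outputs in images.items():
        averaged = combine(outputs)
        # Top species and its averaged probability, used as the confidence score.
        label, confidence = max(averaged.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            auto_labels[image_id] = label
        else:
            needs_human.append(image_id)
    return auto_labels, needs_human


# Made-up outputs from two hypothetical models for three images.
images = {
    "img_001": [{"zebra": 0.97, "wildebeest": 0.03},
                {"zebra": 0.99, "wildebeest": 0.01}],
    "img_002": [{"zebra": 0.55, "wildebeest": 0.45},
                {"zebra": 0.40, "wildebeest": 0.60}],
    "img_003": [{"lion": 0.92, "hyena": 0.08},
                {"lion": 0.96, "hyena": 0.04}],
}

auto_labels, needs_human = triage(images, threshold=0.9)  # 0.9 is arbitrary
coverage = len(auto_labels) / len(images)  # fraction labeled without a human
```

Here the two models disagree about `img_002`, so it falls below the threshold and goes to a human, while the other two images are labeled automatically. Raising the threshold trades coverage for accuracy, which is the trade-off behind the 99.3 percent figure.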
Such AI programs could save thousands of hours of labeling, because two or three thousand human-labeled images can be used to train them to label millions more. The AI program used in this study has already been trained and tested on a set of 3.4 million camera-trap images from North America, where it performed similarly well. (Proceedings of the National Academy of Sciences)