MIT and IBM research team releases "ObjectNet," a dataset made up entirely of odd images, to expose the "blind spots" of image recognition models
Image recognition models based on artificial intelligence (AI) are meant to accurately identify the objects that appear in photos and videos, and they are used in many applications, such as the perception systems of self-driving cars. In a self-driving car, for example, the object recognition accuracy of the model bears directly on the safety of the vehicle, so the dataset used to train the model plays a very important role. With this in mind, a team of researchers from the Massachusetts Institute of Technology (MIT) and IBM has created "ObjectNet," a dataset for image recognition models that covers a wide variety of objects.
This object-recognition dataset stumped the world's best computer vision models | MIT News
ObjectNet does not include a training set for training image recognition models; it consists only of a test set for verifying a model's accuracy. It contains 50,000 images, the same number as the test set of ImageNet, the crowdsourced dataset that sparked the AI boom.
ImageNet is a dataset of images collected from photo-sharing services such as Flickr, whereas ObjectNet is made up of photos that freelancers were paid to take. The objects were deliberately tipped on their sides, shot from unusual angles that people would not normally use, or photographed in intentionally cluttered rooms, so that the collected images are hard for image recognition to handle.
ImageNet (left) contains only clear, easy-to-understand photos. ObjectNet (right), by contrast, shows chairs placed in cluttered rooms or photographed from behind, and includes photos that are hard even for humans to identify.
Image recognition models improve their accuracy through deep learning on a dataset. However, even a dataset as large as ImageNet has blind spots: as in the example above, it contains no images such as the "back of a chair" or a "fallen chair." As a result, a model trained on a conventional dataset such as ImageNet cannot recognize an image accurately when it encounters an irregular case like the "back of a chair" or a "fallen chair."
ObjectNet also differs from other datasets in that it has no training set. Most datasets include both a training set for fitting the model and a test set for verifying its accuracy, but the two are often so similar that accuracy cannot be verified properly.
In fact, when leading image recognition models were tested on images from ImageNet and ObjectNet, they recognized ImageNet images correctly with up to 97% accuracy, but on ObjectNet the accuracy reportedly dropped to about 50-55%. This shows that the models cannot accurately recognize objects seen from behind. IBM researcher Dan Gutfreund, who was involved in developing ObjectNet, said, "This indicates that the architectures of the latest image recognition models do not incorporate the concept of recognizing the back of an object or unusual angles."
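The comparison above comes down to measuring the share of correct predictions on two separate test sets and looking at the difference. The following is a minimal sketch of that calculation, not the researchers' evaluation code. The function name `top1_accuracy` and the toy labels are hypothetical and were chosen only to illustrate the arithmetic.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labeled image is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical toy data: the same four objects, once photographed in a
# familiar way and once from unusual angles or in cluttered rooms.
true_labels = ["chair", "banana", "cup", "chair"]
familiar_predictions = ["chair", "banana", "cup", "chair"]
unusual_predictions = ["chair", "table", "cup", "stool"]

familiar_acc = top1_accuracy(familiar_predictions, true_labels)
unusual_acc = top1_accuracy(unusual_predictions, true_labels)

# The gap between the two scores is the kind of drop the article describes.
gap = familiar_acc - unusual_acc
print(f"familiar: {familiar_acc:.0%}, unusual: {unusual_acc:.0%}, gap: {gap:.0%}")
```

Because the model is never trained on the second set of images, the second score reflects how it handles photos it has genuinely never seen.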
Boris Katz, a research scientist at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and CBMM who took part in the research, says of image recognition models, "We need better, smarter algorithms." The ObjectNet results are to be presented at NeurIPS, the conference on neural information processing systems held from December 8 to 14, 2019.
Andrei Barbu, a research scientist at CSAIL and CBMM, said, "If you want to know how well an algorithm works in the real world, you need to test the image recognition model on images it has never seen before," explaining that ObjectNet is a dataset created to validate image recognition models rather than to build them.
Because the images for ObjectNet were collected through Amazon Mechanical Turk, they include photos taken not only in the United States but in countries around the world. As a result, the dataset contains a lot of variation: among the photos of bananas, for example, some are yellow and some are green.