Suppose you want to recognize an object, but you don’t have any images of it. Standard deep learning will fail without training samples. Now suppose that you do have knowledge about the object’s parts. Often, images of these (everyday) parts are available. We have developed a technique, ZERO, that recognizes unseen objects by analyzing their parts.

ZERO is based on deep learning. Its model is a graph, and the parts are analyzed by few-shot object detection. In our experiment, we recognize bicycles while only having images of wheels, saddles, handlebars and pedals. The performance is shown in the ROC graph below.

Interestingly, we found that you need quite a few images of the full object to beat our zero-shot graph!
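To make the part-based idea more concrete, here is a minimal sketch of how part detections could be combined into a single object score. It is not ZERO's actual implementation: the `Detection` class, the `PARTS` table, the `MAX_DIST` threshold and the scoring rule in `object_score` are all hypothetical choices made for illustration. The part detections are assumed to come from some few-shot detector, and the "graph" is reduced to required parts (nodes) plus a simple closeness check between every pair of parts (edges).

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Detection:
    part: str     # part label, e.g. "wheel"
    score: float  # detector confidence in [0, 1]
    x: float      # box centre, as a fraction of image width
    y: float      # box centre, as a fraction of image height


# Hypothetical object model for a bicycle: how many of each part we expect.
PARTS = {"wheel": 2, "saddle": 1, "handlebar": 1, "pedal": 1}

# Hypothetical edge constraint: parts of one object should lie close together.
MAX_DIST = 0.6


def object_score(detections):
    """Combine part detections into one score in [0, 1] (illustrative rule)."""
    chosen = []
    part_scores = []
    for part, count in PARTS.items():
        # Keep the most confident detections for each required part.
        candidates = [d for d in detections if d.part == part]
        best = sorted(candidates, key=lambda d: d.score, reverse=True)[:count]
        chosen.extend(best)
        # A missing part contributes zero evidence.
        part_scores.extend([d.score for d in best] + [0.0] * (count - len(best)))

    part_evidence = sum(part_scores) / len(part_scores)

    pairs = list(combinations(chosen, 2))
    if not pairs:
        return 0.0
    # Fraction of part pairs that satisfy the closeness constraint.
    close = sum(1 for a, b in pairs if hypot(a.x - b.x, a.y - b.y) <= MAX_DIST)
    return part_evidence * (close / len(pairs))


if __name__ == "__main__":
    bike = [
        Detection("wheel", 0.9, 0.30, 0.70),
        Detection("wheel", 0.8, 0.70, 0.70),
        Detection("saddle", 0.7, 0.45, 0.40),
        Detection("handlebar", 0.6, 0.65, 0.35),
        Detection("pedal", 0.5, 0.50, 0.70),
    ]
    print(object_score(bike))
```

Thresholding such a score at different values would give the operating points of an ROC curve. A real system would likely use learned part relations rather than the fixed distance check used here.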