A New Semantic-Based Tool Detection Method for Robots


  • Wenbai Chen Beijing Information Science & Technology University
  • Chao He
  • Chen W.Z.
  • Chen Q.L.
  • Wu P.L.


Functional Components, Mask-R-CNN Network, Tool Classification, Functional Semantics


Home helper robots have become more widely accepted owing to their strong image recognition ability. However, some common household tools remain challenging for robots to recognize, classify, and use. We designed a detection method for the functional components of common household tools based on the mask region-based convolutional neural network (Mask-R-CNN). The method is a multitask, multi-branch target detection algorithm that performs tool classification, target box regression, and semantic segmentation, and it recognizes the functional components of tools accurately. Compared with existing algorithms on the UMD Part Affordance dataset, the method exhibits effective instance segmentation and key point detection, with higher accuracy and robustness than two traditional algorithms. The proposed method helps the robot understand and use household tools better than traditional object detection algorithms.
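
As an illustration of the multitask structure described in the abstract, the sketch below combines the three branch losses in the way the standard Mask R-CNN formulation does: a classification term, a box regression term, and a per-pixel mask term, summed with equal weight. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy inputs, and the equal weighting are all hypothetical, and the paper's actual training configuration is not given here.

```python
import math

def cross_entropy(probs, label):
    """Classification branch: negative log-probability of the true class."""
    return -math.log(probs[label])

def smooth_l1(pred_box, true_box):
    """Box regression branch: smooth L1 summed over the four box offsets."""
    total = 0.0
    for p, t in zip(pred_box, true_box):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def mask_bce(pred_mask, true_mask):
    """Segmentation branch: mean per-pixel binary cross-entropy."""
    eps = 1e-7  # clamp probabilities to avoid log(0)
    pixels = [(p, t)
              for pred_row, true_row in zip(pred_mask, true_mask)
              for p, t in zip(pred_row, true_row)]
    loss = 0.0
    for p, t in pixels:
        p = min(max(p, eps), 1.0 - eps)
        loss -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return loss / len(pixels)

def multitask_loss(probs, label, pred_box, true_box, pred_mask, true_mask):
    """Sum of the three branch losses (equal weights are an assumption)."""
    return (cross_entropy(probs, label)
            + smooth_l1(pred_box, true_box)
            + mask_bce(pred_mask, true_mask))

# Toy example: two classes, one box, a 1x2 mask (all values hypothetical).
loss = multitask_loss(
    probs=[0.5, 0.5], label=0,
    pred_box=[0.0, 0.0, 0.0, 0.0], true_box=[0.5, 0.0, 0.0, 2.0],
    pred_mask=[[0.5, 0.5]], true_mask=[[1, 0]],
)
print(round(loss, 4))
```

Keeping the three terms as separate functions mirrors the branch layout: each branch can be inspected or reweighted on its own while a single scalar is still available for optimization.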


[1] Gibson, J. J. (1977). The theory of affordances. Hillsdale, USA, 1(2).

[2] Zhu, Y., Fathi, A., Fei-Fei, L. (2014, September). Reasoning about object affordances in a knowledge base representation. In European conference on computer vision (pp. 408-424). Springer, Cham. https://doi.org/10.1007/978-3-319-10605-2_27

[3] Greene, M. R., Baldassano, C., Esteva, A., Beck, D. M., Fei-Fei, L. (2014). Affordances provide a fundamental categorization principle for visual scenes. arXiv preprint arXiv:1411.5340. https://doi.org/10.1167/15.12.572

[4] Koppula, H. S., Saxena, A. (2014, September). Physically grounded spatio-temporal object affordances. In European Conference on Computer Vision (pp. 831-847). Springer, Cham. https://doi.org/10.1007/978-3-319-10578-9_54

[5] Stark, L., Bowyer, K. (1994). Function-based generic recognition for multiple object categories. CVGIP: Image Understanding, 59(1), 1-21. https://doi.org/10.1006/ciun.1994.1001

[6] Bohg, J., Kragic, D. (2009, June). Grasping familiar objects using shape context. In 2009 International Conference on Advanced Robotics (pp. 1-6). IEEE.

[7] Saxena, A., Driemeyer, J., Ng, A. Y. (2008). Robotic grasping of novel objects using vision. The International Journal of Robotics Research, 27(2), 157-173. https://doi.org/10.1177/0278364907087172

[8] Stark, M., Lies, P., Zillich, M., Wyatt, J., Schiele, B. (2008, May). Functional object class detection based on learned affordance cues. In International conference on computer vision systems (pp. 435-444). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79547-6_42

[9] Grabner, H., Gall, J., Van Gool, L. (2011, June). What makes a chair a chair? In CVPR 2011 (pp. 1529-1536). IEEE. https://doi.org/10.1109/CVPR.2011.5995327

[10] Kjellström, H., Romero, J., Kragic, D. (2011). Visual object-action recognition: Inferring object affordances from human demonstration. Computer Vision and Image Understanding, 115(1), 81-90. https://doi.org/10.1016/j.cviu.2010.08.002

[11] Zhu, Y., Zhao, Y., Chun Zhu, S. (2015). Understanding tools: Task-oriented object modeling, learning and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2855-2864). https://doi.org/10.1109/CVPR.2015.7298903

[12] Hassan, M., Dharmaratne, A. (2015, November). Attribute based affordance detection from human-object interaction images. In Image and Video Technology (pp. 220-232). Springer, Cham. https://doi.org/10.1007/978-3-319-30285-0_18

[13] Kemp, C. C., Edsinger, A. (2006, June). Robot manipulation of human tools: Autonomous detection and control of task relevant features. In Proc. of the Fifth Intl. Conference on Development and Learning (Vol. 42).

[14] Mar, T., Tikhanoff, V., Metta, G., Natale, L. (2015, November). Multi-model approach based on 3D functional features for tool affordance learning in robotics. In 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids) (pp. 482-489). IEEE. https://doi.org/10.1109/HUMANOIDS.2015.7363593

[15] Lenz, I., Lee, H., Saxena, A. (2015). Deep learning for detecting robotic grasps. The International Journal of Robotics Research, 34(4-5), 705-724. https://doi.org/10.1177/0278364914549607

[16] Redmon, J., Angelova, A. (2015, May). Real-time grasp detection using convolutional neural networks. In 2015 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1316-1322). IEEE. https://doi.org/10.1109/ICRA.2015.7139361

[17] Myers, A., Teo, C. L., Fermüller, C., Aloimonos, Y. (2015, May). Affordance detection of tool parts from geometric features. In 2015 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1374-1381). IEEE. https://doi.org/10.1109/ICRA.2015.7139369

[18] Abelha, P., Guerin, F., Schoeler, M. (2016, May). A model-based approach to finding substitute tools in 3d vision data. In 2016 IEEE International Conference on Robotics and Automation (ICRA) (pp. 2471-2478). IEEE. https://doi.org/10.1109/ICRA.2016.7487400

[19] Schoeler, M., Wörgötter, F. (2015). Bootstrapping the semantics of tools: Affordance analysis of real world objects on a per-part basis. IEEE Transactions on Cognitive and Developmental Systems, 8(2), 84-98. https://doi.org/10.1109/TAMD.2015.2488284

[20] Massa, F., Girshick, R. (2018). maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. [Online]. Available: https://github.com/facebookresearch/maskrcnn-benchmark. Accessed: Apr. 29, 2019.

[21] Wu, P., He, B., Kong, L. (2017). A classification method of household daily tools based on functional semantic combination of components. Robots, 39(06), 786-794.

[22] Lakani, S. R., Rodríguez-Sánchez, A. J., Piater, J. (2019). Towards affordance detection for robot manipulation using affordance for parts and parts for affordance. Autonomous Robots, 43(5), 1155-1172. https://doi.org/10.1007/s10514-018-9787-5
