Video Saliency Detection by using an Enhance Methodology Involving a Combination of 3DCNN with Histograms

Authors

  • Suresh Kumar R, Chennai Institute of Technology, India
  • Mahalakshmi P, SRM Institute of Science and Technology, Chennai, India
  • Jothilakshmi R, R.M.D. Engineering College, Chennai, India
  • Kavitha M S, R.M.K. Engineering College, Chennai, India
  • Balamuralitharan S, SRM Institute of Science and Technology, Chennai, India

DOI:

https://doi.org/10.15837/ijccc.2022.2.4299

Keywords:

Histogram of optical flow (HoF), Histogram of oriented gradient (HoG), Human Visual System (HVS), Saliency detection, salient object detection, salient region detection

Abstract

When viewing pictures or videos, the Human Visual System (HVS) tends to concentrate on the most important locations. Saliency detection replicates this behaviour: it identifies the regions of an image or video that stand out and attract human visual attention, which can benefit multimedia applications such as image and video retrieval and copy detection. Video saliency detection has received a lot of attention in recent decades, but computational modelling of spatial perception for video sequences is still limited, because temporal abstraction and its fusion with spatial saliency are challenging. Unlike salient object detection in still images, one of the most difficult aspects of video saliency detection is working out how to isolate and integrate spatial and temporal features. The two crucial steps in trajectory-based video classification methods are feature point identification and local feature extraction. In this paper, we propose a new spatio-temporal saliency detection method that uses an enhanced 3D convolutional neural network (3D CNN) combined with histograms of optical flow (HoF) and histograms of oriented gradients (HoG).
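The two histogram descriptors named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 8-bin layout, the magnitude weighting, and the toy inputs are all assumptions made for illustration. The flow vectors are taken as given, since estimating optical flow is outside the scope of the sketch.

```python
import math

def orientation_histogram(vectors, bins=8):
    """Magnitude-weighted, normalised histogram of vector directions.

    Hypothetical helper: bin count and weighting are illustrative choices.
    """
    hist = [0.0] * bins
    for dx, dy in vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # zero vectors carry no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def image_gradients(frame):
    """Central-difference (dx, dy) gradients at the interior pixels."""
    height, width = len(frame), len(frame[0])
    return [
        (frame[y][x + 1] - frame[y][x - 1], frame[y + 1][x] - frame[y - 1][x])
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]

# HoG-style descriptor: histogram of spatial gradients within one frame.
ramp = [[float(x) for x in range(4)] for _ in range(4)]  # toy 4x4 frame
hog = orientation_histogram(image_gradients(ramp))

# HoF-style descriptor: the same binning applied to per-pixel motion
# vectors, assumed here to come from some optical flow estimator.
flow = [(1.0, 2.0), (2.0, 4.0)]  # toy flow vectors
hof = orientation_histogram(flow)
```

Both descriptors share the same binning step and differ only in their input: spatial intensity gradients for HoG, temporal motion vectors for HoF. How these histograms are combined with the 3D CNN features is not specified in the abstract and is not modelled here.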

Published

2022-02-18