Optimizing Symbolic Execution Path Exploration with a Transfer Learning-Based Strategy

Te Sun; Dongqing Zhu; Lianying He; Dalin Zhang

doi:10.15837/ijccc.2025.5.6885

Authors

Te Sun, School of Cyberspace Science and Technology, Beijing Jiaotong University, Beijing, China
Dongqing Zhu, Beijing Jiaotong University, Beijing, China
Lianying He, Beijing Jiaotong University, Beijing, China
Dalin Zhang, Beijing Jiaotong University, Beijing, China

DOI:

https://doi.org/10.15837/ijccc.2025.5.6885

Keywords:

Transfer Learning, Symbolic Execution, Path Exploration Strategy, Reward Value, Symbolic State

Abstract

Symbolic execution is an important software analysis technique, but it suffers from path explosion, which reduces its efficiency. Existing path exploration strategies, such as random state search, typically adapt poorly to real-world programs and lack effective path selection mechanisms. To address these challenges, this paper proposes TLS (Transfer Learning Search), a transfer learning-based path exploration strategy for symbolic execution. We adopt a transfer learning method based on functional classification to optimize existing symbolic execution strategies. Specifically, real-world programs are classified into families according to their functional characteristics, and transfer learning is applied by freezing some layers of existing neural networks and training them on a set drawn from each program family, which better reflects that family's features. Multiple models are trained on the different training sets to adapt to the various program families. Experimental results show that this strategy addresses the problem of insufficient training data for real-world programs. Compared with traditional heuristics such as the random-path (rps) and random-state (rss) strategies, the approach significantly improves instruction coverage and branch coverage on specific program families. For example, in the test on the Grep program, branch coverage increased by approximately fifteen percentage points and more test cases were generated. The approach offers a new and effective solution to the adaptability problem of symbolic execution for complex programs.
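
The layer-freezing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-feature state encoding, the tiny two-layer network, the hand-picked "pretrained" weights, the family names text_search and sorting, and the helpers fine_tune and select_state are all assumptions made for illustration. The sketch freezes the hidden layer of a base model, retrains only the output layer on each family's (features, reward) samples, and then picks the pending symbolic state with the highest predicted reward.

```python
import copy
from dataclasses import dataclass


@dataclass
class Scorer:
    """Tiny feedforward net: score = w2 . relu(w1 x + b1) + b2."""
    w1: list    # hidden-layer weights (frozen during fine-tuning)
    b1: list
    w2: list    # output-layer weights (retrained per program family)
    b2: float

    def hidden(self, x):
        return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self.hidden(x))) + self.b2


def mse(model, samples):
    """Mean squared error between predicted and observed rewards."""
    return sum((model.predict(x) - r) ** 2 for x, r in samples) / len(samples)


def fine_tune(base, samples, epochs=300, lr=0.1):
    """Copy the base model and update only its output layer with SGD."""
    model = copy.deepcopy(base)          # the base model stays untouched
    for _ in range(epochs):
        for x, reward in samples:
            h = model.hidden(x)          # w1 and b1 are read, never written
            err = model.predict(x) - reward
            model.w2 = [w - lr * err * hi for w, hi in zip(model.w2, h)]
            model.b2 -= lr * err
    return model


def select_state(model, states):
    """Pick the pending state with the highest predicted reward."""
    return max(states, key=lambda s: model.predict(s["features"]))


# Hypothetical base model; these weights stand in for a pretrained network.
base = Scorer(w1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
              w2=[0.0, 0.0], b2=0.0)

# Hypothetical per-family training sets of (state features, reward) pairs.
family_data = {
    "text_search": [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)],
    "sorting":     [([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 1.0], 1.0)],
}

# One fine-tuned model per program family, all sharing the frozen layer.
models = {name: fine_tune(base, data) for name, data in family_data.items()}

pending = [{"id": "s1", "features": [0.2, 0.9]},
           {"id": "s2", "features": [0.8, 0.1]}]
for name, model in models.items():
    print(name, "->", select_state(model, pending)["id"])
```

Because only the output layer changes, each family needs far fewer samples than training a full model from scratch, which is the motivation the abstract gives for using transfer learning when real-world training data is scarce.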

