Late Adapter Tuning: A Cost-Effective Approach to Parameter-Efficient Fine-Tuning for Large Language Models

Authors

  • Zhengjie Gao, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Rongcheng Li, School of Computer Science, Chengdu University of Information Technology, Chengdu, Sichuan, China
  • Yuxin Fan, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Min Liao, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Xinyu Song, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China

DOI:

https://doi.org/10.15837/ijccc.2025.6.6928

Keywords:

Large Language Models, Parameter-Efficient Tuning, Adapter Tuning, Text Classification, Computational Efficiency

Abstract

Fine-tuning large language models (LLMs) is computationally prohibitive for individual researchers, especially in resource-constrained scenarios. While parameter-efficient fine-tuning (PEFT) methods address this challenge, existing approaches suffer from inefficiencies due to long backpropagation paths and hidden-vector distortion. To overcome these limitations, we propose Late Adapter Tuning (LAT), a novel PEFT method that reduces training cost by fine-tuning only a single hidden layer near the model’s output. LAT integrates a customized adapter architecture with hard prompting to preserve hidden-vector dimensions and shorten gradient-propagation paths. Experiments on four classification datasets demonstrate that LAT reduces training time by 2.4×, decreases GPU memory usage by 76.5%, and improves accuracy by 4.31% compared with full-parameter fine-tuning. Our work provides a practical solution for deploying LLMs in low-resource environments while advancing the theoretical understanding of gradient-efficient adaptation strategies.
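To make the setup described in the abstract concrete, the sketch below shows one way a late adapter could be wired up with PyTorch and Hugging Face Transformers: the pre-trained backbone is frozen, a residual bottleneck adapter that preserves the hidden dimension is attached near the output, and a hard (textual) prompt is prepended to the input so only the adapter and classifier head receive gradients. This is a minimal sketch under stated assumptions, not the paper's implementation: the backbone name (roberta-base), bottleneck size, prompt template, and the exact insertion point (here, after the final encoder layer rather than at a specific late hidden layer) are illustrative choices not specified in this abstract.

    # Minimal late-adapter sketch (illustrative only).
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class BottleneckAdapter(nn.Module):
        """Residual bottleneck adapter; output keeps the hidden dimension unchanged."""
        def __init__(self, hidden_size: int, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.up = nn.Linear(bottleneck, hidden_size)
            self.act = nn.GELU()

        def forward(self, hidden_states):
            # Residual connection avoids distorting the hidden vector.
            return hidden_states + self.up(self.act(self.down(hidden_states)))

    class LateAdapterClassifier(nn.Module):
        """Frozen backbone; only a late adapter and a classifier head are trained."""
        def __init__(self, model_name: str = "roberta-base", num_labels: int = 2):
            super().__init__()
            self.backbone = AutoModel.from_pretrained(model_name)
            for p in self.backbone.parameters():
                p.requires_grad = False  # freeze the entire pre-trained model
            hidden = self.backbone.config.hidden_size
            self.adapter = BottleneckAdapter(hidden)
            self.classifier = nn.Linear(hidden, num_labels)

        def forward(self, input_ids, attention_mask):
            # Gradients only flow through the adapter and head, so the
            # backpropagation path is short and frozen-layer activations
            # need not be kept for the backward pass.
            with torch.no_grad():
                out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
            pooled = out.last_hidden_state[:, 0]      # first-token representation
            return self.classifier(self.adapter(pooled))

    # Hard prompt wrapped around the raw input (template is hypothetical).
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    batch = tokenizer(["Review: the movie was great. Sentiment:"],
                      return_tensors="pt", padding=True)
    model = LateAdapterClassifier()
    logits = model(batch["input_ids"], batch["attention_mask"])

In this configuration only the adapter and classifier parameters appear in the optimizer, which is what yields the shorter gradient path and lower GPU memory footprint that the abstract reports.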

Published

2025-11-05
