Late Adapter Tuning: A Cost-Effective Approach to Parameter-Efficient Fine-Tuning for Large Language Models

Authors

  • Zhengjie Gao, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Rongcheng Li, School of Computer Science, Chengdu University of Information Technology, Chengdu, Sichuan, China
  • Yuxin Fan, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Min Liao, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China
  • Xinyu Song, School of Electronic and Information Engineering, Geely University of China, Chengdu, Sichuan, China

DOI:

https://doi.org/10.15837/ijccc.2025.6.6928

Keywords:

Large Language Models, Parameter-Efficient Tuning, Adapter Tuning, Text Classification, Computational Efficiency

Abstract

Fine-tuning large language models (LLMs) is computationally prohibitive for individual researchers, especially in resource-constrained scenarios. While parameter-efficient fine-tuning (PEFT) methods address this challenge, existing approaches suffer from inefficiencies due to long backpropagation paths and hidden-vector distortion. To overcome these limitations, we propose Late Adapter Tuning (LAT), a novel PEFT method that reduces training cost by fine-tuning only a single hidden layer near the model’s output. LAT integrates a customized adapter architecture with hard prompting to preserve hidden-vector dimensions and shorten gradient-propagation paths. Experiments on four classification datasets demonstrate that LAT reduces training time by 2.4×, decreases GPU memory usage by 76.5%, and improves accuracy by 4.31% compared with full-parameter fine-tuning. Our work provides a practical solution for deploying LLMs in low-resource environments while advancing the theoretical understanding of gradient-efficient adaptation strategies.
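To make the setup described in the abstract concrete, the sketch below shows one way a late adapter could be wired up with PyTorch and Hugging Face Transformers: the pre-trained backbone is frozen, a residual bottleneck adapter that preserves the hidden dimension is attached near the output, and a hard (textual) prompt is prepended to the input so only the adapter and classifier head receive gradients. This is a minimal sketch under stated assumptions, not the paper's implementation: the backbone name (roberta-base), bottleneck size, prompt template, and the exact insertion point (here, after the final encoder layer rather than at a specific late hidden layer) are illustrative choices not specified in this abstract.

    # Minimal late-adapter sketch (illustrative only).
    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class BottleneckAdapter(nn.Module):
        """Residual bottleneck adapter; output keeps the hidden dimension unchanged."""
        def __init__(self, hidden_size: int, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.up = nn.Linear(bottleneck, hidden_size)
            self.act = nn.GELU()

        def forward(self, hidden_states):
            # Residual connection avoids distorting the hidden vector.
            return hidden_states + self.up(self.act(self.down(hidden_states)))

    class LateAdapterClassifier(nn.Module):
        """Frozen backbone; only a late adapter and a classifier head are trained."""
        def __init__(self, model_name: str = "roberta-base", num_labels: int = 2):
            super().__init__()
            self.backbone = AutoModel.from_pretrained(model_name)
            for p in self.backbone.parameters():
                p.requires_grad = False  # freeze the entire pre-trained model
            hidden = self.backbone.config.hidden_size
            self.adapter = BottleneckAdapter(hidden)
            self.classifier = nn.Linear(hidden, num_labels)

        def forward(self, input_ids, attention_mask):
            # Gradients only flow through the adapter and head, so the
            # backpropagation path is short and frozen-layer activations
            # need not be kept for the backward pass.
            with torch.no_grad():
                out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
            pooled = out.last_hidden_state[:, 0]      # first-token representation
            return self.classifier(self.adapter(pooled))

    # Hard prompt wrapped around the raw input (template is hypothetical).
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    batch = tokenizer(["Review: the movie was great. Sentiment:"],
                      return_tensors="pt", padding=True)
    model = LateAdapterClassifier()
    logits = model(batch["input_ids"], batch["attention_mask"])

In this configuration only the adapter and classifier parameters appear in the optimizer, which is what yields the shorter gradient path and lower GPU memory footprint that the abstract reports.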

Published

2025-11-05
