Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., ... & Gelly, S. (2019, May). Parameter-efficient transfer learning for NLP. In _International Conference on Machine Learning_ (pp. 2790-2799). PMLR.
Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., & Tang, J. (2021). GPT understands, too. _arXiv preprint_ arXiv:2103.10385.
Liu, X., Ji, K., Fu, Y., Du, Z., Yang, Z., & Tang, J. (2021). P-Tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. _arXiv preprint_ arXiv:2110.07602.
Li, X. L., & Liang, P. (2021, August). Prefix-Tuning: Optimizing Continuous Prompts for Generation. In _Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)_ (pp. 4582-4597).
Lester, B., Al-Rfou, R., & Constant, N. (2021, November). The Power of Scale for Parameter-Efficient Prompt Tuning. In _Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing_ (pp. 3045-3059).
Gu, Y., Han, X., Liu, Z., & Huang, M. (2022, May). PPT: Pre-trained Prompt Tuning for Few-shot Learning. In _Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)_ (pp. 8410-8423).
Sun, T., He, Z., Qian, H., Huang, X., & Qiu, X. (2022). BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning. _arXiv preprint_ arXiv:2205.11200.
Jia, M., Tang, L., Chen, B. C., Cardie, C., Belongie, S., Hariharan, B., & Lim, S. N. (2022). Visual prompt tuning. _arXiv preprint_ arXiv:2203.12119.
Zhou, K., Yang, J., Loy, C. C., & Liu, Z. (2022). Learning to prompt for vision-language models. _International Journal of Computer Vision_, 1-12.
Zhou, K., Yang, J., Loy, C. C., & Liu, Z. (2022). Conditional prompt learning for vision-language models. In _Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition_ (pp. 16816-16825).
Rao, Y., Zhao, W., Chen, G., Tang, Y., Zhu, Z., Huang, G., ... & Lu, J. (2022). DenseCLIP: Language-guided dense prediction with context-aware prompting. In _Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition_ (pp. 18082-18091).
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021, July). Learning transferable visual models from natural language supervision. In _International Conference on Machine Learning_ (pp. 8748-8763). PMLR.
Chen, Y. C., Li, L., Yu, L., El Kholy, A., Ahmed, F., Gan, Z., ... & Liu, J. (2020, August). UNITER: Universal image-text representation learning. In _European Conference on Computer Vision_ (pp. 104-120). Springer, Cham.
Lu, J., Batra, D., Parikh, D., & Lee, S. (2019). ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. _Advances in Neural Information Processing Systems_, 32.
Wang, Z., Yu, J., Yu, A. W., Dai, Z., Tsvetkov, Y., & Cao, Y. (2021, September). SimVLM: Simple Visual Language Model Pretraining with Weak Supervision. In _International Conference on Learning Representations_.
Li, J., Li, D., Xiong, C., & Hoi, S. (2022). BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. _arXiv preprint_ arXiv:2201.12086.
Wang, P., Yang, A., Men, R., Lin, J., Bai, S., Li, Z., ... & Yang, H. (2022). Unifying architectures, tasks, and modalities through a simple sequence-to-sequence learning framework. _arXiv preprint_ arXiv:2202.03052.