HSM-LCR: A Unified Framework for Hierarchical Semantic Multi-Scale Memory and Long-Chain Reasoning
Abstract
Keywords
Full Text: PDF
References
Dani Aizerman, Soo Kim, and Rui Zhao. H-mem: Hierarchical memory networks for context reasoning. In AAAI Conference on Artificial Intelligence, 2022.
Rohan Anil, Yutian Bai, Xinying Chen, et al. LongNet: Scaling transformers to 1B tokens. In Proceedings of the 40th International Conference on Machine Learning (ICML), 2023.
Yutian Bai, Wenhao Chen, and Xin Zhang. MemoryBank: Towards efficient multi-scale context compression. arXiv preprint arXiv:2401.04110, 2024.
Iz Beltagy, Matthew E. Peters, and Arman Cohan. Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150, 2020.
Tom Brown, Benjamin Mann, Nick Ryder, et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
Roman Bulatov, Max Ryabinin, and Artem Gusev. Hierarchical transformers are more efficient long-range models. In International Conference on Learning Representations (ICLR), 2022.
Xinyu Chen, Wenhao Zhang, Yu Gu, et al. Ruler: Evaluating long-context language models at scale. arXiv preprint arXiv:2308.12288, 2023.
Peng Liu, Zhihong Wang, Yizhu Zhao, and Yun Chen. Ring attention with blockwise learning for long-range sequence modeling. In Empirical Methods in Natural Language Processing (EMNLP), 2023.
Yifan Liu, Qi Gao, and Chen Liang. Hierarchical transformers for structured document understanding. Transactions of the Association for Computational Linguistics (TACL), 2024.
Zhiyuan Liu, Xipeng Qiu, and Maosong Sun. Survey on long-context modeling in large language models. AI Open, 2024.
Bo Peng, Xinyu Han, Yuhan Shen, et al. MemGPT: Towards LLMs with external memory. arXiv preprint arXiv:2310.08560, 2023.
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, and Timothy P. Lillicrap. Scaling memory-augmented transformer models. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al. LLaMA: Open and efficient foundation language models. In International Conference on Learning Representations (ICLR), 2023.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
Lin Xu, Xi Chen, and Yichong Zhang. Hiermem: Hierarchical memory for long-term context reasoning. arXiv preprint arXiv:2312.02088, 2023.
Zheng Yuan, Yibo Fang, and Xingyu Chen. Dynamic semantic memory routing for long-context reasoning. In Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
Manzil Zaheer, Guru Guruganesh, Avinava Dubey, et al. Big Bird: Transformers for longer sequences. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
Hao Zhao, Xi Lin, and Tianqi Zhang. Multi-scale context integration for long-document transformers. IEEE Transactions on Neural Networks and Learning Systems, 2024.
DOI: http://dx.doi.org/10.70711/aitr.v3i9.9026