
HSM-LCR: A Unified Framework for Hierarchical Semantic Multi-Scale Memory and Long-Chain Reasoning

Fan Shi (Blaze Frank Stone)

Abstract


As large language models (LLMs) make breakthroughs in long-text understanding and generation, their ability to effectively utilize long contexts and perform deep reasoning has become the next bottleneck. We propose a Hierarchical Semantic Multi-Scale Memory and Long-Chain Reasoning Framework (HSM-LCR), which integrates hierarchical sparse attention (HSA) with multi-level memory fusion (MMF) to enable structured long-document modeling, dynamic memory management, and cross-scale reasoning. Through formal equations and pseudo-code, we define hierarchical semantic propagation and cross-level memory recall mechanisms. Simulated experiments show that HSM-LCR significantly improves reasoning depth and long-range information retention compared with methods such as Longformer, BigBird, and LongNet.
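The two building blocks named in the abstract can be pictured concretely. The sketch below is a minimal, illustrative PyTorch rendering of hierarchical sparse attention (local block attention combined with attention over block-level summaries) and a gated multi-level memory fusion step; all module names, dimensions, and the specific fusion rule are assumptions made for illustration and are not taken from the paper's formal equations.

```python
# Illustrative sketch only: the class names, sizes, and fusion rule are
# assumptions, not the HSM-LCR paper's definitions.
import torch
import torch.nn as nn


class HierarchicalSparseAttention(nn.Module):
    """Each token attends within its own block plus to mean-pooled block summaries."""

    def __init__(self, d_model: int, n_heads: int, block_size: int):
        super().__init__()
        self.block_size = block_size
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        nb = t // self.block_size                       # number of blocks
        blocks = x.view(b, nb, self.block_size, d)
        summaries = blocks.mean(dim=2)                  # (b, nb, d) coarse scale
        # Fine scale: attention restricted to each block.
        q = blocks.reshape(b * nb, self.block_size, d)
        local, _ = self.attn(q, q, q)
        local = local.reshape(b, t, d)
        # Coarse scale: every token also attends to all block summaries.
        coarse, _ = self.attn(x, summaries, summaries)
        return local + coarse


class MultiLevelMemoryFusion(nn.Module):
    """Gated fusion of current hidden states with a slower document-scale memory."""

    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, h: torch.Tensor, doc_memory: torch.Tensor) -> torch.Tensor:
        mem = doc_memory.expand(-1, h.size(1), -1)      # broadcast (b, 1, d) memory
        g = torch.sigmoid(self.gate(torch.cat([h, mem], dim=-1)))
        return g * h + (1 - g) * mem


if __name__ == "__main__":
    x = torch.randn(2, 64, 32)                          # (batch, tokens, d_model)
    hsa = HierarchicalSparseAttention(d_model=32, n_heads=4, block_size=16)
    mmf = MultiLevelMemoryFusion(d_model=32)
    h = hsa(x)
    doc_memory = h.mean(dim=1, keepdim=True)            # document-scale summary
    out = mmf(h, doc_memory)
    print(out.shape)                                    # torch.Size([2, 64, 32])
```

In this reading, the local branch bounds attention cost by the block size, the summary branch carries coarse long-range context, and the gate decides per token how much document-scale memory to mix back in.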

Keywords


Long-context language model; Hierarchical semantics; Hierarchical sparse attention; Multi-level memory fusion; Chain reasoning



References


Dani Aizerman, Soo Kim, and Rui Zhao. H-Mem: Hierarchical memory networks for context reasoning. In AAAI Conference on Artificial Intelligence, 2022.
Rohan Anil, Yutian Bai, Xinying Chen, et al. LongNet: Scaling transformers to 1B tokens. In Proceedings of the 40th International Conference on Machine Learning (ICML), 2023.
Yutian Bai, Wenhao Chen, and Xin Zhang. MemoryBank: Towards efficient multi-scale context compression. arXiv preprint arXiv:2401.04110, 2024.
Iz Beltagy, Matthew E. Peters, and Arman Cohan. Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150, 2020.
Tom Brown, Benjamin Mann, Nick Ryder, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems (NeurIPS), 2020.
Roman Bulatov, Max Ryabinin, and Artem Gusev. Hierarchical transformers are more efficient long-range models. In International Conference on Learning Representations (ICLR), 2022.
Xinyu Chen, Wenhao Zhang, Yu Gu, et al. RULER: Evaluating long-context language models at scale. arXiv preprint arXiv:2308.12288, 2023.
Peng Liu, Zhihong Wang, Yizhu Zhao, and Yun Chen. Ring attention with blockwise learning for long-range sequence modeling. In Empirical Methods in Natural Language Processing (EMNLP), 2023.
Yifan Liu, Qi Gao, and Chen Liang. Hierarchical transformers for structured document understanding. Transactions of the Association for Computational Linguistics (TACL), 2024.
Zhiyuan Liu, Xipeng Qiu, and Maosong Sun. Survey on long-context modeling in large language models. AI Open, 2024.
Bo Peng, Xinyu Han, Yuhan Shen, et al. MemGPT: Towards LLMs with external memory. arXiv preprint arXiv:2310.08560, 2023.
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, and Timothy P. Lillicrap. Scaling memory-augmented transformer models. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al. LLaMA: Open and efficient foundation language models. In International Conference on Learning Representations (ICLR), 2023.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 2017.
Lin Xu, Xi Chen, and Yichong Zhang. HierMem: Hierarchical memory for long-term context reasoning. arXiv preprint arXiv:2312.02088, 2023.
Zheng Yuan, Yibo Fang, and Xingyu Chen. Dynamic semantic memory routing for long-context reasoning. In ACL 2024, 2024.
Manzil Zaheer, Guru Guruganesh, Avinava Dubey, et al. BigBird: Transformers for longer sequences. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
Hao Zhao, Xi Lin, and Tianqi Zhang. Multi-scale context integration for long-document transformers. IEEE Transactions on Neural Networks and Learning Systems, 2024.




DOI: http://dx.doi.org/10.70711/aitr.v3i9.9026
