Reasoning Horizon Scaling - Efficient Architecture (I)

Both reasoning and agentic behavior require changes to our model architecture. From the reasoning perspective, humans are extraordinarily good at slow, persistent reasoning. Current LLMs have not achieved this, because of their limited memory capacity and poor efficiency when scaling to long contexts. From the agentic perspective, a Transformer's attention attends to all previous tokens every time a new token is processed: information about the past is stored locally in the past rather than carried forward into the future. The model therefore has no mental state, which keeps it far from being "agentic".
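
To make the contrast concrete, here is a minimal, illustrative sketch (not tied to any particular model; all names and shapes are hypothetical): standard attention re-reads the entire cache of past keys and values for every new token, whereas a recurrent-style update compresses the past into a fixed-size state that is carried forward.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_step(q, K, V):
    """One decoding step of causal attention: the new query must re-read
    every cached key/value, so per-token cost and memory grow with context."""
    scores = q @ K.T / np.sqrt(q.shape[-1])   # (1, t)
    return softmax(scores) @ V                # (1, d)

def recurrent_step(x, h, W_x, W_h):
    """A recurrent-style update: the past is summarized in a fixed-size
    state h carried forward, independent of how long the context is."""
    return np.tanh(x @ W_x + h @ W_h)

d, t = 16, 128
rng = np.random.default_rng(0)
q = rng.normal(size=(1, d))
K = rng.normal(size=(t, d))               # keys for all t past tokens
V = rng.normal(size=(t, d))               # values for all t past tokens
h = np.zeros((1, d))                      # fixed-size "mental state"
W_x, W_h = rng.normal(size=(d, d)), rng.normal(size=(d, d))

out_attn = attention_step(q, K, V)        # O(t) work per new token
out_rec = recurrent_step(q, h, W_x, W_h)  # O(1) state per new token
print(out_attn.shape, out_rec.shape)
```

The point of the sketch is only the asymmetry: the attention path keeps the past as a growing archive it must revisit at every step, while the recurrent path keeps a state that is brought forward into the future.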

December 4, 2024