Atlas: Learning to Optimally Memorize the Context at Test Time
ATLAS, a transformer-like architecture, can process a context window as large as ten million tokens
Reference:
[1] DeepLearning.AI