AGI Olympics V3 is an advanced benchmark test for measuring true AGI (Artificial General Intelligence) capabilities. It distinguishes between "long context" and "true memory," evaluating self-awareness and long-range dependencies through 8 tests. ALICE V3 achieved a score of 90.2% on this benchmark.
The self-awareness category is composed of 4 tests: self-recognition, identity consistency, self-improvement, and perspective-taking.
The long-range dependencies category evaluates long-term memory through 4 tests: context integration, learning retention, story coherence, and delayed tasks.
Tests 6.2 and 7.2 each require a 24-hour waiting period between two sessions. The delay separates short-term from long-term memory, so the result reflects true memory capability rather than recall from the current context. After completing Session 1, return once 24 hours have passed to take Session 2.
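The waiting rule above can be sketched as a small timestamp check. This is only an illustration, not the site's actual implementation: the function names, the key format, and the use of a plain dictionary as a stand-in for browser storage are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# From the text: Session 2 opens 24 hours after Session 1.
WAIT_PERIOD = timedelta(hours=24)


def record_session1(store: dict, test_id: str, now: datetime) -> None:
    """Remember when Session 1 of a test was completed (hypothetical key format)."""
    store[f"{test_id}:session1_done"] = now.isoformat()


def time_until_session2(store: dict, test_id: str, now: datetime) -> Optional[timedelta]:
    """Return the remaining wait, timedelta(0) once unlocked, or None if Session 1 is not done."""
    raw = store.get(f"{test_id}:session1_done")
    if raw is None:
        return None
    unlock_at = datetime.fromisoformat(raw) + WAIT_PERIOD
    return max(unlock_at - now, timedelta(0))


# Usage: a dict stands in for the browser's storage (assumption).
store = {}
start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
record_session1(store, "6.2", start)
print(time_until_session2(store, "6.2", start + timedelta(hours=5)))   # 19:00:00
print(time_until_session2(store, "6.2", start + timedelta(hours=25)))  # 0:00:00
```

Storing the completion time rather than a countdown keeps the check correct even if the page is closed and reopened during the wait.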
Even LLMs with 1-million-token context windows cannot achieve true long-term memory. ALICE's SynapticMemory layer implements a human-like memory system that compresses and stores information, then recalls it when needed.
Instead of reprocessing the full long context on every request, ALICE retrieves only the necessary information from compressed memories. This improves cost efficiency by over 100x.
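To see where a ratio of this size can come from, here is a back-of-the-envelope sketch. It assumes that per-request cost scales linearly with the number of tokens processed and ignores the cost of compressing and storing memories. The 8,000-token retrieval size is a made-up illustrative figure, not a measured ALICE value.

```python
def cost_ratio(full_context_tokens: int, retrieved_tokens: int) -> float:
    """Ratio of tokens processed per request: full context vs. retrieved memories only.

    Assumes cost is proportional to tokens processed (a simplification).
    """
    if retrieved_tokens <= 0:
        raise ValueError("retrieved_tokens must be positive")
    return full_context_tokens / retrieved_tokens


FULL_CONTEXT = 1_000_000  # the 1-million-token context size mentioned in the text
RETRIEVED = 8_000         # hypothetical size of the retrieved memories

ratio = cost_ratio(FULL_CONTEXT, RETRIEVED)
print(f"{ratio:.0f}x fewer tokens per request")  # 125x fewer tokens per request
```

Under these assumptions, any retrieval smaller than 10,000 tokens against a 1-million-token context gives a ratio above 100x.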
Test progress is saved only in your browser. After completing the tests, you can choose whether to submit your data anonymously. Waiting times for multi-session tests are also tracked in your browser's local storage.