Sebastian Raschka surveys recent LLM architectural innovations, including KV sharing, manifold-constrained hyper-connections (mHC), and compressed attention, with a focus on long-context efficiency.
Sebastian Raschka
magazine.sebastianraschka.com · Leading Thinkers · 3 items
Sebastian Raschka documents his personal workflow for studying and visualizing new LLM architectures, including tools and habits that underpin his articles and LLM-Gallery.
Sebastian Raschka breaks down the architecture of modern coding agents, covering tool use, context management, sandboxing, and evaluation loops in practical systems.