Transformer

9 articles tagged with this topic

Self-Attention Powers AI Context — But Few Firms Truly Understand It

Self-attention is the core of mainstream AI, enabling simultaneous word relationship analysis. Understanding it is key to evaluating AI costs and ROI.

May 6 · 2 min read

Transformer, Deep Learning

Transformer Book Read 3 Times: LLM Race Shifts from API Calls to Foundational Logic

A deep learning book read 3 times. While most only call LLM APIs, understanding principles like attention mechanisms now dictates AI app success and c

May 6 · 2 min read

minGPT, Andrej Karpathy

Million-Param GPT on Journey to the West: Demystifying LLMs Is the New Imperative

Training a million-param mini Chinese GPT on Journey to the West locally reflects the industry's urgent need to demystify the LLM black box and master

May 5 · 2 min read

Google, Transformer

7 Years of Transformer Dominance: LLM Architecture Awaits the Next Reshuffle

Transformer underpins LLMs via self-attention, fixing old algorithms' parallel and long-context flaws. Grasping it reveals LLM capability limits and b

May 4 · 2 min read

Transformer, Attention Mechanism

Transformer Attention Explained: The 2017 Engine Behind LLMs' Long Memory

Attention is a core LLM principle, solving AI amnesia by weighting key info. Understanding it isn't for coding—it reveals long-text limits and compute

May 3 · 2 min read

Quadtrix, Transformer

C++ Transformer From Scratch Demystifies LLMs, But Won't Shift Compute Paradigm

A zero-dependency C++17 GPT (0.83M params) demystifies LLMs, but its 75x efficiency lag vs. industrial frameworks proves foundational innovation still

May 2 · 2 min read

Transformer, Attention is all you need

Transformer: 7 Years, 120K Citations—Key to the LLM Race

Google's 2017 Transformer is the LLM bedrock, replacing RNNs with parallel attention. Grasping it reveals who takes shortcuts in the LLM race.

May 2 · 3 min read

Google, Seq2Seq

Decade of Seq2Seq: The True Technical Starting Point of LLMs

Google's 2014 Seq2Seq architecture is the shared technical foundation of LLMs like GPT and BERT. Understanding its encoder-decoder division and info b

May 1 · 2 min read

Transformer, Mechanistic Interpretability

Compiling a Calculator Into AI Weights: A New Path to Decode Transformers

A dev compiled an RPN interpreter into Transformer weights. The 1.1GB basic-math model's value: offering a new way to bypass training and decode AI in

Apr 30 · 2 min read