The creator of Keras on what LLMs actually are — and what they aren’t.


Key Insights

LLMs as vector functions: LLMs aren't discrete programs with conditional logic. They're vector functions: continuous mappings from input vectors to output vectors, implemented as learned curves.

They’re not like the discrete programs you might imagine, such as a Python program. They’re actually vector functions.
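
A minimal sketch of the distinction, using NumPy and hypothetical toy functions (nothing here comes from a real LLM): a discrete program branches on exact conditions, while a vector function maps input vectors to output vectors through a smooth curve, so nearby inputs give nearby outputs.

```python
import numpy as np

# Discrete program: exact conditions, hard branches, no notion of "nearby" inputs.
def discrete_program(token: str) -> str:
    if token == "hello":
        return "greeting"
    elif token == "bye":
        return "farewell"
    return "unknown"

# Vector function: a continuous mapping from input vectors to output vectors.
# A tiny randomly initialized MLP stands in for a trained model layer.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 8))

def vector_function(x: np.ndarray) -> np.ndarray:
    return np.tanh(x @ W1) @ W2  # a smooth curve, not a branch

x = rng.normal(size=8)
eps = 1e-3 * rng.normal(size=8)
# Nudging the input nudges the output continuously, unlike a hard branch.
print(np.linalg.norm(vector_function(x + eps) - vector_function(x)))
```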

Compression forces learning: With infinite memory, an LLM could just be a lookup table. But limited parameters force compression — so it learns predictive functions instead.
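
A toy illustration of the compression argument (the data and model here are made up for illustration): a lookup table with unlimited memory memorizes every observed pair exactly but has nothing to say about unseen inputs, while a model with far fewer parameters than data points is forced to fit a predictive function that generalizes.

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.uniform(-3, 3, size=1000)
ys = np.sin(xs)  # the "world" the model must predict

# Unlimited memory: a lookup table stores every observed pair verbatim...
table = dict(zip(xs.tolist(), ys.tolist()))
# ...but it has no answer for an input it never saw.
assert 0.12345 not in table

# Limited parameters: fit a degree-7 polynomial (8 parameters vs. 1000 rows).
coeffs = np.polyfit(xs, ys, deg=7)
# Compression forces a predictive function, so unseen inputs still work.
print(np.polyval(coeffs, 0.12345), np.sin(0.12345))
```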

Style transfer as efficiency: It's more compressive to learn style independently of content. That's why LLMs can do textual style transfer: they learn millions of independent predictive functions and combine them via interpolation.
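
A sketch of the interpolation idea in a made-up embedding space (all vectors below are hypothetical stand-ins, not outputs of a real model): if style and content are learned as separate directions, transferring style amounts to swapping a style offset, and blending styles is linear interpolation between style vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
content = rng.normal(size=64)  # "what is said"
formal = rng.normal(size=64)   # a "formal" style direction
casual = rng.normal(size=64)   # a "casual" style direction

def stylize(content_vec, style_vec):
    # Factorized representation: output = content + style.
    return content_vec + style_vec

# Style transfer: keep the content, swap the style offset.
formal_version = stylize(content, formal)
casual_version = formal_version - formal + casual

# Blended style: interpolate between the two style directions.
t = 0.3
blended = stylize(content, (1 - t) * formal + t * casual)
```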

Compositionality: Because these are vector functions, you can sum them and interpolate between them to produce new functions. This is fundamentally different from discrete programs.
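
A tiny sketch of what summing and interpolating functions means, using hypothetical toy functions: pointwise combinations of vector functions are themselves valid vector functions, an operation with no meaningful analogue for two arbitrary discrete programs.

```python
import numpy as np

def f(x):  # one learned vector function (toy stand-in)
    return np.tanh(x)

def g(x):  # another
    return np.sin(x)

def interpolate(f, g, t):
    # Pointwise interpolation yields a brand-new function, something
    # you cannot do with two arbitrary Python programs.
    return lambda x: (1 - t) * f(x) + t * g(x)

h = interpolate(f, g, 0.5)
print(h(np.array([0.1, 0.2])))
```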


Listen to the episode