People often say gen AI learns in the same way humans do. This is a very hard claim to justify. A small sample of the many ways they’re different:

- AI training involves data preprocessing (e.g., cleaning, tokenization); human learning doesn’t
- AI training involves minimizing a loss function; human learning doesn’t
- AI models encode what they learn in model weights; human learning doesn’t work that way
- ML systems can be optimized/altered by swapping in different models; human learning can’t
- AI training can be parallelized; human learning can’t
- AI training can crash and be restarted; human learning can’t
- Human learning is usually multimodal; AI training is usually not
- The biggest AI models take a few months to train; human expertise takes more like 20+ years
- AI uses far more training data than humans do (e.g., Yann LeCun says a standard LLM dataset is about 1E13 tokens, which would take a human roughly 170,000 years to read; a quick back-of-envelope check appears at the end of this post)
- Training an AI model produces a trained system that can be easily replicated (model code and weights); human learning produces no such artifact
- Training an AI model produces something that others can fine-tune; human learning doesn’t

This is obviously a very incomplete list. There are differences at absolutely every turn. Yes, some of the ideas in ML are inspired by ideas about the brain. But that doesn’t mean these processes are the same.

It’s hard to see the claim that AI training and human learning are the same as anything more than a misguided attempt to avoid copyright law. And even if training an AI model and human learning did work the same (which they don’t), there would *still* be good reason for the law to treat them differently; major issues of scale and consent more than warrant this.
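
A quick sanity check on the 170,000-years figure above. This is a minimal sketch, not LeCun’s exact arithmetic: the reading speed, words-per-token ratio, and hours of reading per day are assumed round numbers, not sourced values.

```python
# Back-of-envelope: how long would a human take to read 1E13 tokens?
# All rates below are assumptions chosen for illustration.
dataset_tokens = 1e13      # standard LLM pretraining set size, per LeCun
words_per_minute = 250     # assumed adult reading speed
words_per_token = 0.75     # assumed average for English text
hours_per_day = 8          # assumed reading time, every day, no breaks

tokens_per_minute = words_per_minute / words_per_token   # ~333 tokens/min
tokens_per_year = tokens_per_minute * 60 * hours_per_day * 365

years = dataset_tokens / tokens_per_year
print(f"{years:,.0f} years")   # -> 171,233 years
```

Under these assumptions the answer comes out to about 171,000 years, so the 170,000 figure is the right order of magnitude.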