Deep learning isn’t hard anymore
At least, building software with deep learning isn’t
In the not-so-distant past, a data science team would need a few things to effectively use deep learning:
- A novel model architecture, likely designed in-house
- Access to a large, and likely proprietary, data set
- The hardware or funds for large-scale model training
This bottlenecked deep learning, limiting it to the few projects that met those conditions.
Over the last couple of years, however, things have changed.
At Cortex, we are seeing users launch a new generation of products built on deep learning—and unlike before, these products aren’t all being built using one-of-a-kind model architectures.
The driver behind this growth is transfer learning.
What is transfer learning?
Transfer learning, broadly, is the idea that the knowledge accumulated in a model trained for a specific task—say, identifying flowers in a photo—can be transferred to another model to assist in making predictions for a different, related task—like identifying melanomas on someone’s skin.
Note: If you want a more technical dive into transfer learning, Sebastian Ruder has written a fantastic primer.
There are a variety of approaches to transfer learning, but one approach in particular—finetuning—is finding widespread adoption. In this approach, a team takes a pre-trained model and removes or retrains its last layers to refocus it on a new, related task.
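To make that concrete, here is a minimal sketch of finetuning an image classifier with PyTorch and torchvision. It assumes both libraries are installed and that a hypothetical `train_loader` yields batches of images and labels for the new task; the pre-trained layers are frozen, the final layer is swapped out, and only the new head is trained.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a model pre-trained on ImageNet.
model = models.resnet18(pretrained=True)

# Freeze the pre-trained layers so their weights are not updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for the related task,
# e.g. two classes such as "melanoma" vs. "benign".
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's parameters are optimized.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in train_loader:  # hypothetical DataLoader for the new dataset
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Because only the small new head is trained, this works with far less data and compute than training a model from scratch.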
For example, AI Dungeon is an open-world text adventure game that went viral for how convincing its AI-generated stories are: