How TabPFN “Sees” Letters: Classifying 26 Characters with the Many Class Extension

3 min read · Nov 16, 2025
Photo by Kevin Murray on Unsplash

I recently discovered something fascinating. TabPFN can literally “see” patterns in tabular data using its Many Class Classifier extension.

Of course, it’s not “seeing” in the visual sense like a CNN would. But when you look at how certain datasets encode shapes and textures as numbers, TabPFN starts to feel surprisingly perceptive.

Other tabular models can pick up these patterns too, just with lower accuracy, which makes them a bit myopic by comparison. In this experiment, TabPFN outperforms every baseline, including strong models like Random Forests.

Let’s go through this idea step by step and code it together.

Before explaining how it does that, I first need to introduce the dataset: the UCI Letter Recognition dataset.

UCI Letter Recognition dataset

The UCI Letter Recognition dataset is a classic benchmark in machine learning.

Its goal: classify printed capital letters (A–Z) based on their geometric properties.

Here’s what makes it interesting:

  • It has 20,000 samples of distorted black-and-white letters (A–Z) generated using 20 different fonts.
  • Instead of images, each letter is…
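To make the "letters as numbers" idea concrete, here is a minimal sketch of turning rows of this kind into a feature matrix and integer labels. It assumes the commonly distributed file layout, where each row is a capital letter followed by 16 integer features; that layout is an assumption here, not something spelled out above. The sample rows and the `parse_letter_rows` helper are hypothetical and only illustrate the shape of the data.

```python
import csv
import io

# Hypothetical sample rows in the assumed layout:
# a capital letter, then 16 integer features, comma-separated.
SAMPLE = """\
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
D,4,11,6,8,6,10,6,2,6,10,3,7,3,7,3,9
"""


def parse_letter_rows(text, n_features=16):
    """Split rows into a feature matrix X and integer labels y (A=0 ... Z=25)."""
    X, y = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        label, values = row[0], row[1:]
        if len(values) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(values)}")
        X.append([int(v) for v in values])
        y.append(ord(label) - ord("A"))  # map 'A'..'Z' to 0..25
    return X, y


X, y = parse_letter_rows(SAMPLE)
print(len(X), "rows,", len(X[0]), "features, labels:", y)
```

The point of the sketch is that the classifier never sees pixels: every letter reaches it as a short row of integers plus one of 26 class labels, which is the kind of input a tabular model expects.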

Written by Kürşat Kaya

I am a computer science master's student. I have a strong interest in board games.