PCA clearly explained — When, Why, How to use it and feature importance: A guide in Python
In this post I explain what PCA is, when and why to use it, and how to implement it in Python using scikit-learn. I also explain how to get the feature importance after a PCA analysis.
1. Introduction & Background
Principal Components Analysis (PCA) is a well-known unsupervised dimensionality reduction technique that constructs relevant features/variables through linear (linear PCA) or non-linear (kernel PCA) combinations of the original variables (features). In this post, we will only focus on the famous and widely used linear PCA method.
The construction of relevant features is achieved by linearly transforming correlated variables into a smaller number of uncorrelated variables. This is done by projecting (via a dot product) the original data onto the reduced PCA space using the eigenvectors of the covariance/correlation matrix, also known as the principal components (PCs).
The resulting projected data are essentially linear combinations of the original data capturing most of the variance in the data (Jolliffe 2002).
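As a minimal sketch of the idea above, the snippet below (using the Iris dataset purely as an example) computes the eigenvectors of the covariance matrix by hand, projects the centred data onto the top two principal components via a dot product, and checks that scikit-learn's `PCA` gives the same projection (up to the sign of each component, which is arbitrary):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Example dataset: 150 samples, 4 correlated features
X = load_iris().data

# Centre the data, then eigendecompose its covariance matrix
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1]        # sort by explained variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project (dot product) onto the top-2 principal components
X_manual = Xc @ eigvecs[:, :2]

# scikit-learn's PCA performs the same linear projection internally
X_sklearn = PCA(n_components=2).fit_transform(X)

# The two projections agree up to the sign of each component
assert np.allclose(np.abs(X_manual), np.abs(X_sklearn))
```

Sorting the eigenvectors by eigenvalue ensures the first PC captures the largest share of the variance, which is why keeping only the leading components preserves most of the information in the data.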