Start Using pandas: Say Goodbye to Excel in 2023!
Excel in 2023? Nah! Let’s learn and use the pandas library in Python!
Pandas is a powerful open-source data manipulation and analysis library for Python. It is widely used for cleaning, aggregating, and transforming data, and is an essential tool for many data scientists and analysts.
Pandas can handle tasks that are time-consuming in Excel and, more importantly, lets us automate repetitive ones. This is especially useful for people who dislike Excel macros, which can freeze, produce errors, and sometimes even crash the whole system when a buggy macro runs. Pandas lets you handle almost any Excel task you can imagine with just a few lines of code.
I recommend starting with the Spyder IDE, as its Variable Explorer provides an interface for viewing the DataFrames loaded into memory.
To start using pandas, you will need to install it. You can install pandas using pip, the Python package manager, by running the following command in your terminal:
pip install pandas
Once you have installed pandas, you can start using it in your Python scripts by importing it:
import pandas as pd
The pd alias is a common convention and will be used throughout this tutorial.
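A quick way to confirm that the installation worked is to import pandas and print its version. This is only a minimal sketch: the variable name installed_version is illustrative, and the version string you see depends on your own environment.

```python
import pandas as pd

# pd.__version__ holds the installed pandas version as a string
installed_version = pd.__version__
print(f"pandas version: {installed_version}")
```

If the import fails with ModuleNotFoundError, pandas was most likely installed into a different Python environment than the one running the script.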
Reading data
One of the most basic tasks in pandas is reading data into a DataFrame. A DataFrame is a two-dimensional size-mutable tabular data structure with rows and columns. You can think of it as a spreadsheet or a SQL table.
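To make the spreadsheet comparison concrete, here is a small sketch that builds a DataFrame by hand. The column names and values are made-up sample data, not something from a real file.

```python
import pandas as pd

# Hypothetical sample data: each dictionary key becomes a column
sales = pd.DataFrame({
    "product": ["apple", "banana", "cherry"],
    "units": [10, 25, 7],
    "price": [0.5, 0.25, 3.0],
})

# shape is (rows, columns), much like the used range of a worksheet
print(sales.shape)

# "Size-mutable" means columns (and rows) can be added after creation
sales["revenue"] = sales["units"] * sales["price"]
print(sales)
```

Adding the revenue column here plays the same role as typing a formula into a new spreadsheet column and filling it down.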
There are several ways to read data into a DataFrame. Some of the most common ones are:
pd.read_csv(): Read a CSV (comma-separated values) file
pd.read_excel(): Read an Excel file
pd.read_json(): Read a JSON (JavaScript Object Notation) file
pd.read_sql(): Read data from a SQL database
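The sketch below tries three of these readers without needing any files on disk. It assumes small in-memory samples: the CSV text, the JSON text, and the SQLite table named scores are all hypothetical and exist only for illustration.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV content held in memory instead of a file on disk
csv_text = "name,score\nAda,90\nLinus,85\n"
csv_df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical JSON content: a list of records, one object per row
json_text = '[{"name": "Ada", "score": 90}, {"name": "Linus", "score": 85}]'
json_df = pd.read_json(io.StringIO(json_text))

# Hypothetical in-memory SQLite database with a single table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ada", 90), ("Linus", 85)],
)
sql_df = pd.read_sql("SELECT name, score FROM scores", conn)
conn.close()

print(csv_df)
print(json_df)
print(sql_df)
```

In real use you would normally pass a file path (or a database connection to your own server) instead of these in-memory stand-ins. All three calls return an ordinary DataFrame, so everything downstream works the same regardless of where the data came from.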
Here’s an example of how to read an Excel file into a DataFrame using pd.read_excel():
# First, import pandas and load the Excel file
import pandas as pd
# Replace 'path/to/file.xlsx' with the path to your Excel file
df = pd.read_excel('path/to/file.xlsx')
# You can view the first few rows of the DataFrame using the head() method
print(df.head())
#…