Transforming and preparing data is a large part of a data analyst's job. One of the most common preparation tasks is renaming the columns of a pandas dataframe so that their names are readable and meaningful.
Column names like 'col1', 'x', 'var1' tell you nothing about the data. Better names like 'Sales', 'Profit', 'Country' make the meaning of the data clear.
In this guide, you'll learn several methods to rename columns in pandas, complete with code examples and use cases for each one.
By the end, you'll know how to:
- Get and set the column names attribute
- Use the rename() method for flexible renaming
- Use str.replace() for simple name changes
- Leverage set_axis() to change all names
- Choose the right method for different renaming needs
Follow along and you'll gain expert-level skills for renaming columns in pandas!
Why Rename Columns in Pandas?
Before we dive in, let's briefly discuss why you'd want to rename columns in the first place. Here are some key reasons:
- Readability – Descriptive names are easier to understand when reviewing or sharing code.
- Analysis – Column names impact how you access data. Better names can make analysis code simpler.
- Integration – If joining or merging data, matching column names is easier with standardized names.
- Maintenance – Code is easier to maintain when column names are meaningful.
As a rule of thumb, I like to rename columns in pandas right after loading the data. This avoids having to debug cryptic variable names later on.
Creating a Pandas DataFrame
Let's create a sample dataframe to demonstrate the renaming methods.
Import pandas and load data from a dictionary into a new dataframe df:
import pandas as pd
data = {'c1': [10, 20, 30],
        'c2': [40, 50, 60],
        'c3': [70, 80, 90]}
df = pd.DataFrame(data)
Preview the dataframe:
df

The column names c1, c2, c3 don't tell us much about this data. Let's rename them to something better.
Getting and Setting the Columns Attribute
Pandas stores the column names of a dataframe as the columns attribute:
df.columns
# > Index(['c1', 'c2', 'c3'], dtype='object')
To rename columns, you can reassign this attribute to a new list of names:
df.columns = ['X', 'Y', 'Z']
df

The column names changed from c1, c2, c3 to X, Y, Z.
Pros:
- Simple and straightforward to rename all columns
Cons:
- Not flexible – you must set all column names even if you only want to change a few
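The all-or-nothing behaviour is easy to check. The sketch below rebuilds the sample dataframe and tries to assign only two names to three columns, which pandas rejects with a ValueError. The variable mismatch_error is only an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]})

# Assigning fewer names than there are columns is rejected by pandas.
try:
    df.columns = ["X", "Y"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(type(mismatch_error).__name__)
print(list(df.columns))  # the failed assignment leaves the original names in place
```

In other words, the list you assign must contain exactly one name per column, in column order.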
Let's explore more flexible methods next.
Using the rename() Method
The rename() method allows you to selectively rename specific columns.
It takes a columns parameter that maps the old names to the new names:
df = pd.DataFrame(data)  # reset
df.rename(columns={'c1': 'Value1',
                   'c2': 'Value2'},
          inplace=True)
Now c1 was renamed to Value1, and c2 to Value2:
df

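The inplace=True parameter modifies the original dataframe directly. Without it, rename() leaves df unchanged and returns a renamed copy, which you would assign back with df = df.rename(...).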
Pros:
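- Selectively rename certain columns
- Only specify the columns you want to change
- Accepts either a dictionary or a function that maps old names to new ones
Cons:
- Old names that match no column are silently ignored unless you pass errors='raise'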
The key benefit of rename() is that you only have to specify the columns you want to change!
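Beyond a dictionary, rename() also takes a callable, and it returns a new dataframe when inplace is not set. The following is a minimal sketch of those behaviours on the same sample data. The names renamed, upper, prefixed and raised, and the sales_ prefix, are chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]})

# Without inplace=True, rename() returns a new dataframe and leaves df unchanged.
renamed = df.rename(columns={"c1": "Value1", "c2": "Value2"})

# A callable is applied to every column name.
upper = df.rename(columns=str.upper)
prefixed = df.rename(columns=lambda name: f"sales_{name}")

# By default, keys that match no column are silently ignored;
# errors="raise" turns that into a KeyError.
try:
    df.rename(columns={"missing": "Value9"}, errors="raise")
    raised = False
except KeyError:
    raised = True

print(list(renamed.columns))
print(list(upper.columns))
print(list(prefixed.columns))
print(raised)
```

Passing errors="raise" is a cheap safeguard against typos in the old names when a script runs unattended.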
Using str.replace() for Simple Renames
The str.replace() method can be used to easily rename columns based on a string replacement.
Resetting our dataframe:
df = pd.DataFrame(data)
Now we'll replace 'c1' with 'First':
df.columns = df.columns.str.replace('c1', 'First')
df

Only c1 was renamed to First.
To rename all columns:
df.columns = df.columns.str.replace('c1', 'First')
df.columns = df.columns.str.replace('c2', 'Second')
df.columns = df.columns.str.replace('c3', 'Third')
df

Pros:
- Simple syntax for quick renames
- Useful for replacing parts of column names
Cons:
- Matches substrings anywhere in a name, so replacing 'c1' would also change a column named 'c10'
Overall, str.replace() is great for simple string substitution renames.
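Because str.replace() works on substrings, it helps to see what happens when one column name contains another. The sketch below uses a hypothetical dataframe called wide, with a column c10 that is not part of the sample data, and compares a literal replacement with an anchored regular expression.

```python
import pandas as pd

# Hypothetical frame where one column name is a prefix of another.
wide = pd.DataFrame({"c1": [1], "c10": [2], "c2": [3]})

# Literal substring replacement also rewrites the "c1" inside "c10".
loose = wide.columns.str.replace("c1", "First", regex=False)

# Anchoring the pattern with ^ and $ restricts the match to the whole name.
exact = wide.columns.str.replace(r"^c1$", "First", regex=True)

# A regex can also rewrite a shared prefix across all columns at once.
prefixed = wide.columns.str.replace(r"^c", "col_", regex=True)

print(list(loose))
print(list(exact))
print(list(prefixed))
```

Setting regex explicitly is a deliberate choice here, since the default for this parameter has differed between pandas versions.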
Using set_axis() to Change All Names
The set_axis() method can rename all columns by passing a new list of names.
Resetting our dataframe one last time:
df = pd.DataFrame(data)
We can use set_axis() like this:
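df = df.set_axis(['Value1', 'Value2', 'Value3'], axis=1)
df

The axis=1 argument tells it to operate on the columns. set_axis() returns a new dataframe, so assign the result back to df. Older tutorials pass inplace=True here, but that parameter was removed in pandas 2.0.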
Pros:
- Simple way to completely rename all columns
- Takes a list of new names for all columns
Cons:
- Cannot selectively rename only certain columns
set_axis() is best when you want to completely replace all the column names in one shot.
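Since set_axis() returns a dataframe, it fits into a method chain, which direct assignment to the columns attribute cannot do. Here is a small sketch on the sample data. The Total column and the result variable are additions made only for this example.

```python
import pandas as pd

data = {"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]}

result = (
    pd.DataFrame(data)
    # Replace all column names in one step, without mutating anything in place.
    .set_axis(["Value1", "Value2", "Value3"], axis=1)
    # Later steps in the chain can use the new names right away.
    .assign(Total=lambda d: d["Value1"] + d["Value2"] + d["Value3"])
)

print(result)
```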
When to Use Each Method
Here's a recap of when to use each column renaming method:
- columns attribute – Reassign this directly when renaming all columns
- rename() – Use to flexibly rename a subset of columns
- str.replace() – Use for simple string replacements in names
- set_axis() – Use to wholly replace all column names
For selective renaming, rename() is usually the safer choice because it matches whole column names, while str.replace() is handy for pattern-based changes across many names.
But all these methods are good to have in your pandas toolkit!
Tips for Descriptive Column Names
Here are some tips to create descriptive, readable column names:
- Be explicit – Name columns so their meaning is clear from just reading the name (e.g. customer_count vs cnt)
- Be consistent – Use similar words and conventions across all column names (e.g. customer_name, product_name)
- Use underscores – Separate words with underscores instead of spaces or camelCase (e.g. customer_count)
- Avoid special characters – Stick to alphanumeric names and underscores
- Keep names concise – Overly long column names take up space and reduce readability
Following these naming best practices will ensure your pandas code is easy to understand.
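These tips can be applied mechanically by passing a cleaning function to rename(). The sketch below is one possible approach, not a standard pandas feature: to_snake_case is a hypothetical helper, and the messy headers in raw are made up for the example.

```python
import re

import pandas as pd

# Hypothetical dataframe with inconsistent, hard-to-type headers.
raw = pd.DataFrame(
    {" Customer Name ": ["Ann"], "Order-Total ($)": [12.5], "Region": ["EU"]}
)


def to_snake_case(name: str) -> str:
    """Lowercase a header and join its words with single underscores."""
    name = name.strip().lower()
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^0-9a-z]+", "_", name)
    # Drop underscores left over at either end.
    return name.strip("_")


cleaned = raw.rename(columns=to_snake_case)
print(list(cleaned.columns))
```

Note that this helper only handles ASCII letters and digits. Headers in other scripts would need a different pattern.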
Conclusion
You now know four different ways to rename columns in pandas:
- Get and set the columns attribute
- Use rename() to flexibly rename specific columns
- Use str.replace() to substitute parts of names
- Use set_axis() to replace all column names
Proper data preparation is crucial before analyzing or visualizing data with pandas. By renaming columns to descriptive names, you can make your data much easier to work with.
These column renaming skills will serve you well as a data analyst using pandas!