
How to Rename Pandas Columns: A Comprehensive 4-Method Guide

As a data analyst, transforming and preparing data for analysis is a large part of the job. One of the most common data preparation tasks is renaming columns in a pandas dataframe to create more readable and understandable column names.

Column names like 'col1', 'x', 'var1' tell you nothing about the data. By giving columns better names like 'Sales', 'Profit', 'Country', you can make the meaning of the data clearer.

In this comprehensive guide, you'll learn several methods to rename columns in pandas, complete with code examples and use cases for each one.

By the end, you'll know how to:

  • Get and set the column names attribute
  • Use the rename() method for flexible renaming
  • Use str.replace() for simple name changes
  • Leverage set_axis() to change all names
  • Choose the right method for different renaming needs

Follow along and you'll gain expert-level skills for renaming columns in pandas!

Why Rename Columns in Pandas?

Before we dive in, let's briefly discuss why you'd want to rename columns in the first place. Here are some key reasons:

  • Readability – Descriptive names are easier to understand when reviewing or sharing code.

  • Analysis – Column names impact how you access data. Better names can make analysis code simpler.

  • Integration – If joining or merging data, matching column names is easier with standardized names.

  • Maintenance – Code is easier to maintain when column names are meaningful.

As a rule of thumb, I like to rename columns in pandas right after loading the data. This avoids having to debug cryptic variable names later on.
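As a rough sketch of that habit, the snippet below renames columns in the same statement that loads the data. The CSV text and the new names (units_sold, revenue, cost) are made up for illustration; in real code the source would be your own file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
raw_csv = io.StringIO("c1,c2,c3\n10,40,70\n20,50,80\n")

# Rename immediately after loading so later code only ever sees clear names.
sales = pd.read_csv(raw_csv).rename(
    columns={"c1": "units_sold", "c2": "revenue", "c3": "cost"}
)

print(list(sales.columns))  # ['units_sold', 'revenue', 'cost']
```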

Creating a Pandas DataFrame

Let's create a sample dataframe to demonstrate the renaming methods.

Import pandas and load data from a dictionary into a new dataframe df:

import pandas as pd

data = {'c1': [10, 20, 30],
        'c2': [40, 50, 60],
        'c3': [70, 80, 90]}

df = pd.DataFrame(data)

Preview the dataframe:

df
   c1  c2  c3
0  10  40  70
1  20  50  80
2  30  60  90

The column names c1, c2, c3 don't tell us much about this data. Let's rename them to something better.

Getting and Setting the Columns Attribute

Pandas stores the column names of a dataframe as the columns attribute:

df.columns
# > Index(['c1', 'c2', 'c3'], dtype='object')

To rename columns, you can reassign this attribute to a new list of names:

df.columns = ['X', 'Y', 'Z']
df
    X   Y   Z
0  10  40  70
1  20  50  80
2  30  60  90

The column names changed from c1, c2, c3 to X, Y, Z.

Pros:

  • Simple and straightforward to rename all columns

Cons:

  • Not flexible – you must set all column names even if you only want to change a few
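To make that limitation concrete, here is a small sketch of what happens when the list is shorter than the number of columns. The variable names df_demo and mismatch_error are only for this illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]})

try:
    # Only two names for three columns: pandas rejects the assignment.
    df_demo.columns = ["X", "Y"]
except ValueError as err:
    mismatch_error = str(err)
    print("Rename failed:", mismatch_error)

# The original names are still in place after the failed assignment.
print(list(df_demo.columns))  # ['c1', 'c2', 'c3']
```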

Let's explore more flexible methods next.

Using the rename() Method

The rename() method allows you to selectively rename specific columns.

It takes a columns parameter that maps the old names to the new names:

df = pd.DataFrame(data)  # reset to the original column names

df.rename(columns={'c1': 'Value1',
                   'c2': 'Value2'},
          inplace=True)

c1 is now named Value1 and c2 is named Value2, while c3 is unchanged:

df
   Value1  Value2  c3
0      10      40  70
1      20      50  80
2      30      60  90

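The inplace=True parameter modifies the original dataframe directly. Without it, rename() returns a new dataframe and leaves df unchanged, so you would need to assign the result back (df = df.rename(...)).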

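Pros:

  • Selectively rename certain columns
  • Only specify the columns you want to change
  • Accepts a function as well as a dictionary, so names can be transformed programmatically (e.g. columns=str.upper)

Cons:

  • Old names that match no column are silently ignored by default, so a typo can go unnoticed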

The key benefit of rename() is that you only have to specify the columns you want to change!
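Besides a dictionary, the columns argument of rename() also takes a callable that is applied to every name. The sketch below shows that, plus the errors="raise" option for catching misspelled old names. The frame df_demo and the sales_ prefix are illustrative choices only.

```python
import pandas as pd

df_demo = pd.DataFrame({"c1": [10], "c2": [40], "c3": [70]})

# Without inplace=True, rename() returns a new DataFrame and leaves df_demo alone.
upper = df_demo.rename(columns=str.upper)
print(list(upper.columns))     # ['C1', 'C2', 'C3']

# Any callable works, for example a lambda that adds a prefix.
prefixed = df_demo.rename(columns=lambda name: f"sales_{name}")
print(list(prefixed.columns))  # ['sales_c1', 'sales_c2', 'sales_c3']

# A key that matches no column is ignored by default;
# errors="raise" turns that typo into a KeyError instead.
try:
    df_demo.rename(columns={"c9": "Value9"}, errors="raise")
except KeyError as err:
    print("Unknown column:", err)
```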

Using str.replace() for Simple Renames

Column labels expose the .str accessor, so df.columns.str.replace() renames columns by substituting text within their names.

Resetting our dataframe:

df = pd.DataFrame(data)

Now we'll replace 'c1' with 'First':

df.columns = df.columns.str.replace('c1', 'First')
df
   First  c2  c3
0     10  40  70
1     20  50  80
2     30  60  90

Only c1 was renamed to First.

To rename all columns:

df.columns = df.columns.str.replace('c1', 'First')
df.columns = df.columns.str.replace('c2', 'Second')
df.columns = df.columns.str.replace('c3', 'Third')
df
   First  Second  Third
0     10      40     70
1     20      50     80
2     30      60     90

Pros:

  • Simple syntax for quick renames
  • Useful for replacing parts of column names

Cons:

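  • Replaces the text wherever it appears in any column name, so replacing 'c1' would also alter a column called 'c10'
  • Needs one call per substitution when the new names don't share a pattern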

Overall, str.replace() is great for simple string substitution renames.
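Because the replacement works on substrings, it can help to be explicit about whether the pattern is literal text or a regular expression. The following sketch uses a made-up frame with a c10 column to show the difference; the col_ prefix is likewise just an example.

```python
import pandas as pd

df_demo = pd.DataFrame({"c1": [10], "c2": [40], "c10": [70]})

# Literal replacement matches substrings, so "c1" also hits "c10".
literal = df_demo.columns.str.replace("c1", "First", regex=False)
print(list(literal))    # ['First', 'c2', 'First0']

# Anchoring the pattern with a regex restricts the match to the whole name.
anchored = df_demo.columns.str.replace(r"^c1$", "First", regex=True)
print(list(anchored))   # ['First', 'c2', 'c10']

# A capture group rewrites every name that follows the same pattern.
patterned = df_demo.columns.str.replace(r"^c(\d+)$", r"col_\1", regex=True)
print(list(patterned))  # ['col_1', 'col_2', 'col_10']
```

Passing regex explicitly also keeps the behavior the same across pandas versions, since the default changed to regex=False in pandas 2.0.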

Using set_axis() to Change All Names

The set_axis() method renames all columns at once from a new list of names.

Resetting our dataframe one last time:

df = pd.DataFrame(data) 

We can use set_axis() like this:

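df = df.set_axis(['Value1', 'Value2', 'Value3'], axis=1)
df
   Value1  Value2  Value3
0      10      40      70
1      20      50      80
2      30      60      90

The axis=1 tells it to operate on the columns. set_axis() returns a new dataframe, so the result is assigned back to df; the inplace parameter that older tutorials use was removed in pandas 2.0.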

Pros:

  • Simple way to completely rename all columns from a list of new names
  • Returns a new dataframe, so it fits into method chains

Cons:

  • Cannot selectively rename only certain columns

set_axis() is best when you want to completely replace all the column names in one shot.
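Since set_axis() returns a new dataframe, one place it can be handy is in the middle of a method chain. This is a minimal sketch; the names q1, q2, q3 and total are invented for the example.

```python
import pandas as pd

data = {"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]}

# Rename every column, then immediately use the new names in the next step.
result = (
    pd.DataFrame(data)
    .set_axis(["q1", "q2", "q3"], axis=1)
    .assign(total=lambda d: d["q1"] + d["q2"] + d["q3"])
)

print(list(result.columns))  # ['q1', 'q2', 'q3', 'total']
```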

When to Use Each Method

Here's a recap of when to use each column renaming method:

  • columns attribute – Reassign this directly when renaming all columns
  • rename() – Use to flexibly rename a subset of columns
  • str.replace() – Use for simple string replacements in names
  • set_axis() – Use to wholly replace all column names

For selective renaming, rename() is usually the safer choice because it targets exact names, while str.replace() suits pattern-based changes across many names.

But all these methods are good to have in your pandas toolkit!
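To compare them side by side, the sketch below applies each of the four approaches to the same sample data and arrives at the same column names. The variable names (via_attribute and so on) are only labels for this comparison.

```python
import pandas as pd

data = {"c1": [10, 20, 30], "c2": [40, 50, 60], "c3": [70, 80, 90]}
new_names = ["Value1", "Value2", "Value3"]

# 1. Reassign the columns attribute (modifies the frame in place).
via_attribute = pd.DataFrame(data)
via_attribute.columns = new_names

# 2. rename() with an old-name -> new-name mapping.
via_rename = pd.DataFrame(data).rename(columns=dict(zip(data, new_names)))

# 3. str.replace() on the labels: every "c" becomes "Value".
via_replace = pd.DataFrame(data)
via_replace.columns = via_replace.columns.str.replace("c", "Value", regex=False)

# 4. set_axis() with the full list of new names.
via_set_axis = pd.DataFrame(data).set_axis(new_names, axis=1)

for frame in (via_attribute, via_rename, via_replace, via_set_axis):
    print(list(frame.columns))  # ['Value1', 'Value2', 'Value3'] each time
```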

Tips for Descriptive Column Names

Here are some tips to create descriptive, readable column names:

  • Be explicit – Name columns so their meaning is clear from just reading the name (e.g. customer_count vs cnt)

  • Be consistent – Use similar words and conventions across all column names (e.g. customer_name, product_name)

  • Use underscores – Separate words with underscores instead of spaces or camelCase (e.g. customer_count)

  • Avoid special characters – Stick to alphanumeric names and underscores

  • Keep names short – Long column names take up space and reduce readability

Following these naming best practices will ensure your pandas code is easy to understand.
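One way to apply these conventions automatically is to pass a small cleaning function to rename(). The helper to_snake_case below is a hypothetical example written for this guide, not a pandas function, and its rules are only one reasonable interpretation of the tips above.

```python
import re
import pandas as pd

def to_snake_case(name: str) -> str:
    """Hypothetical helper: convert a column label to lower snake_case."""
    # Collapse spaces and special characters into single underscores.
    name = re.sub(r"[^0-9A-Za-z]+", "_", str(name).strip())
    # Insert an underscore at camelCase boundaries (lowercase/digit then uppercase).
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return name.strip("_").lower()

messy = pd.DataFrame(columns=["Customer Name", "productID", "Total-Sales ($)"])
clean = messy.rename(columns=to_snake_case)

print(list(clean.columns))  # ['customer_name', 'product_id', 'total_sales']
```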

Conclusion

You should now have expert skills for renaming columns in pandas using 4 different methods:

  • Get and set the columns attribute
  • Use rename() to flexibly rename specific columns
  • Use str.replace() to substitute parts of names
  • Use set_axis() to replace all column names

Proper data preparation is crucial before analyzing or visualizing data with pandas. By renaming columns to descriptive names, you can make your data much easier to work with.

These column renaming skills will serve you well as a data analyst using pandas!


Written by Alexis Kestler

A female web designer and programmer - Now is a 36-year IT professional with over 15 years of experience living in NorCal. I enjoy keeping my feet wet in the world of technology through reading, working, and researching topics that pique my interest.