Using pandas dataframe’s apply function

The pandas apply function is very versatile: it can run complex logic on each row of a data frame. Take columns A and B in the table below as an example, where column C is the result we want to compute:

A	B	C
Alex	Dick	False
Alexandra	Babylon	True
Alexis	Baby	True
Boris	Charlie	False
Michael	Custom	False

The rule for computing C is:

If column A contains "Alex" and column B contains "Baby", return True; otherwise return False.

The function passed to apply can look something like this:

import re

def regexApply(s):
  # s is a single row (a pandas Series); read each column's value by name
  varA = s['A']
  varB = s['B']

  # re.search matches anywhere in the string, so a trailing ".*" is not needed
  if re.search("Alex", varA) and re.search("Baby", varB):
    return True
  else:
    return False

If the data is stored in df, call the function as follows (axis=1 passes each row, rather than each column, to the function):

df['C'] = df.apply(regexApply, axis=1)
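
Putting the pieces together, here is a minimal end-to-end sketch. It assumes the sample table is typed in by hand as a DataFrame (in practice the data would more likely come from a file or a database), and it shortens the function body to a single return while keeping the same rule.

```python
import re

import pandas as pd

# Sample data copied from the table above; column C is left out
# because it is the value being computed.
df = pd.DataFrame({
    "A": ["Alex", "Alexandra", "Alexis", "Boris", "Michael"],
    "B": ["Dick", "Babylon", "Baby", "Charlie", "Custom"],
})


def regexApply(s):
    # s is one row of df (a pandas Series), so columns are read by name.
    # bool() turns the match object (or None) into a plain True/False.
    return bool(re.search("Alex", s["A"]) and re.search("Baby", s["B"]))


# axis=1 hands each row to regexApply; the results become column C.
df["C"] = df.apply(regexApply, axis=1)
print(df)
```

Column C should then match the True/False values in the table above.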

The example may look simple, and some may argue that a lambda function could do the same job. The beauty of this approach is that, since it is a regular function, it can have as many lines of code as needed, and I have used it for relatively complex operations.
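
For comparison, the lambda version of the same rule might look like the sketch below. The DataFrame is again built by hand for illustration. It works for a one-line rule like this one, but a lambda is limited to a single expression, which is where a named function starts to pay off.

```python
import re

import pandas as pd

# Same hand-built sample data as in the table above.
df = pd.DataFrame({
    "A": ["Alex", "Alexandra", "Alexis", "Boris", "Michael"],
    "B": ["Dick", "Babylon", "Baby", "Charlie", "Custom"],
})

# The whole rule has to fit into one expression inside the lambda.
df["C"] = df.apply(
    lambda s: bool(re.search("Alex", s["A"]) and re.search("Baby", s["B"])),
    axis=1,
)
print(df)
```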

I hope you can benefit from this as well.
