Pipe function in Python Polars. Don't leave the pipe flow!

Are you exploring Polars as an alternative to Pandas? We love it for the pipe flow feeling! Learn in 3 lines how and when to use it.

Well, well, hanging out in Python and missing the pipe feeling? There is some good news! The pipe function in the Polars Python module lets you chain operations together by passing the result of one operation as the input to the next. For sure, this is not one of the greatest advantages that Polars offers over Pandas... but it does feel so nice!

Here you can find an example of how to keep the pipe flow even in Python!

import polars as pl
import random 

# Create a Polars DataFrame with base columns
df = pl.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'offensive_skill': [5, 30, 85],
    'defensive_skill': [92, 30, 10],
})

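# Define custom Polars functions to apply
def add_position_column(df):
    # String values are wrapped in pl.lit() so that recent Polars versions
    # treat them as literals rather than column names.
    return df.with_columns(
        pl.when(pl.col('defensive_skill') > 50).then(pl.lit('CB'))
        .when(pl.col('offensive_skill') > 50).then(pl.lit('FW'))
        .otherwise(pl.lit('bench'))
        .alias('position')
    )

def add_squad_number_column(df):
    # Each number is drawn at random once per call and kept as a string,
    # so it can share a column with the '-' placeholder.
    # Note: 'CD' does not match the 'CB' label assigned above, so
    # centre-backs fall through to '-'.
    return df.with_columns(
        pl.when(pl.col('position') == 'CD').then(pl.lit(str(random.choice(range(2, 6)))))
        .when(pl.col('position') == 'FW').then(pl.lit(str(random.choice(range(7, 19)))))
        .otherwise(pl.lit('-'))
        .alias('squad_number')
    )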

# Chain operations together using the pipe function

(
    df
    .pipe(add_position_column)
    .pipe(add_squad_number_column)
)
shape: (3, 5)
name       offensive_skill  defensive_skill  position  squad_number
str        i64              i64              str       str
"Alice"    5                92               "CB"      "-"
"Bob"      30               30               "bench"   "-"
"Charlie"  85               10               "FW"      "15"


Polars pipe and lazy evaluation

An extra trick is to use lazy evaluation to get the most out of query optimization and parallelization. You need a big enough DataFrame and complex enough operations for it to be worthwhile.

result = (
    df.lazy()
    .pipe(add_position_column)
    .pipe(add_squad_number_column)
    .collect()
)

result
shape: (3, 5)
name       offensive_skill  defensive_skill  position  squad_number
str        i64              i64              str       str
"Alice"    5                92               "CB"      "-"
"Bob"      30               30               "bench"   "-"
"Charlie"  85               10               "FW"      "12"


Carlos Vecina
Senior Data Scientist at Jobandtalent
