Conditionals (if_else)

from siuba.data import penguins
from siuba import _, summarize, group_by, if_else, transmute, case_when

penguins
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0       Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1       Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
...        ...        ...             ...            ...                ...          ...     ...   ...
342  Chinstrap      Dream            50.8           19.0              210.0       4100.0    male  2009
343  Chinstrap      Dream            50.2           18.7              198.0       3775.0  female  2009

344 rows × 8 columns

if_else for two cases

Use if_else() when values depend on only two cases, such as whether some condition is True or False. This is similar to a Python if else statement, but it applies to each value in a column.
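To make the comparison with a Python if else statement concrete, here is a minimal pure-Python sketch. It is not how siuba implements if_else(); the `label_one` helper and the short list of sample values are made up for illustration.

```python
# Hypothetical sample values, modeled on a few bill_length_mm entries.
bill_lengths = [39.1, 39.5, 50.8, 50.2]


def label_one(length):
    # A Python conditional expression decides for a single value.
    return "long" if length > 40 else "short"


# Covering a whole column means repeating that decision for every element.
labels = [label_one(x) for x in bill_lengths]
print(labels)  # ['short', 'short', 'long', 'long']
```

With if_else() the per-element loop is implicit: you pass the whole column and get back one result per row.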

Basics

if_else(penguins.bill_length_mm > 40, "long", "short")
0      short
1      short
       ...  
342     long
343     long
Length: 344, dtype: object

Use in a verb

transmute(
    penguins,
    bill_length = if_else(_.bill_length_mm > 40, "long", "short")
)
     bill_length
0          short
1          short
...          ...
342         long
343         long

344 rows × 1 columns

case_when for many cases

The case_when() function is a more general version of if_else(). It lets you check as many cases as you want and map each one to a resulting value.

Basics

case_when(penguins, {
    _.bill_depth_mm <= 18: "short",
    _.bill_depth_mm <= 19: "medium",
    _.bill_depth_mm > 19: "long"
})
0      medium
1       short
        ...  
342    medium
343    medium
Length: 344, dtype: object
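The cases above overlap: a depth of 17.4 satisfies both `<= 18` and `<= 19`, yet it is labelled "short". This follows from the first-match rule this page describes for case_when. The pure-Python sketch below imitates that rule for single values; the `first_match` helper, the sample depths, and the choice to return None when nothing matches are assumptions of the sketch, not siuba's implementation.

```python
def first_match(value, cases):
    """Return the result paired with the first predicate that accepts value."""
    for predicate, result in cases:
        if predicate(value):
            return result
    return None  # assumption of this sketch: no match yields None


# Same thresholds as the example above, written as (predicate, result) pairs.
cases = [
    (lambda d: d <= 18, "short"),
    (lambda d: d <= 19, "medium"),
    (lambda d: d > 19, "long"),
]

# Hypothetical sample of bill depths in mm.
depths = [18.7, 17.4, 19.0, 21.2]
labels = [first_match(d, cases) for d in depths]
print(labels)  # ['medium', 'short', 'medium', 'long']

# Swapping the first two cases changes the answer for 17.4, because it
# also satisfies d <= 19 and that case is now checked first.
swapped = [cases[1], cases[0], cases[2]]
print(first_match(17.4, swapped))  # medium
```

Under this rule, the order of the cases matters whenever their conditions overlap, so list the most specific condition first.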

Use in a verb

# also works
penguins >> case_when({ ... })

Set default when no match

Use True as the final case to set a value when no other case matches.

case_when(penguins, {
    _.bill_depth_mm.between(18, 19): "medium",
    True: "OTHER"
})
0      medium
1       OTHER
        ...  
342    medium
343    medium
Length: 344, dtype: object

Note that this works because, for each value, case_when checks for the first matching condition. The final True condition always matches, so every value that fails the earlier conditions falls through to it.
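The catch-all behaviour can be sketched in plain Python for a single value. This is an illustration of the first-match idea only, not siuba's code; `label_depth` is a hypothetical helper, and the four depths are the bill_depth_mm values from the rows shown at the top of this page.

```python
def label_depth(depth):
    # Cases are listed in priority order; the literal True always matches.
    cases = [
        (18 <= depth <= 19, "medium"),  # inclusive bounds, like .between(18, 19)
        (True, "OTHER"),
    ]
    # Take the result of the first case whose condition holds.
    return next(result for matched, result in cases if matched)


depths = [18.7, 17.4, 19.0, 18.7]
labels = [label_depth(d) for d in depths]
print(labels)  # ['medium', 'OTHER', 'medium', 'medium']
```

Because the True case matches everything, it belongs at the end; placed first, it would shadow every case after it.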