Across column apply

Use the across() function to apply the same transformation to multiple columns.

from siuba import _, across, Fx, group_by, mutate, summarize, filter, arrange
from siuba.data import mtcars

Basic use

mtcars >> mutate(across(_["mpg", "hp"], Fx - Fx.mean(), names="demeaned_{col}"))

	mpg	cyl	disp	hp	drat	wt	qsec	vs	am	gear	carb	demeaned_mpg	demeaned_hp
0	21.0	6	160.0	110	3.90	2.620	16.46	0	1	4	4	0.909375	-36.6875
1	21.0	6	160.0	110	3.90	2.875	17.02	0	1	4	4	0.909375	-36.6875
...	...	...	...	...	...	...	...	...	...	...	...	...	...
30	15.0	8	301.0	335	3.54	3.570	14.60	0	1	5	8	-5.090625	188.3125
31	21.4	4	121.0	109	4.11	2.780	18.60	1	1	4	2	1.309375	-37.6875

32 rows × 13 columns

Note three important pieces in the code above:

select: _["mpg", "hp"] chooses the columns to transform.
transform: Fx - Fx.mean() is the transformation, where Fx stands for the column being operated on.
rename: names= is an optional argument, specifying how to name the result. The {col} in "demeaned_{col}" gets replaced with the column name.
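
To make the three pieces concrete, here is a rough plain-pandas equivalent of the call above. It is only a sketch of the behavior, not how siuba implements across(). The toy data frame and the helper name demean_columns are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars.
cars = pd.DataFrame({"mpg": [21.0, 22.8, 18.7], "hp": [110, 93, 175], "cyl": [6, 4, 8]})


def demean_columns(df, cols, names="demeaned_{col}"):
    """Hypothetical helper: subtract each selected column's mean, adding new columns."""
    out = df.copy()
    for col in cols:                          # select: the columns to transform
        result = df[col] - df[col].mean()     # transform: plays the role of Fx - Fx.mean()
        out[names.format(col=col)] = result   # rename: fill {col} with the column name
    return out


demeaned = demean_columns(cars, ["mpg", "hp"])
print(demeaned)
```

As in the across() call, the original columns are kept and the transformed ones are appended under the new names.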

Any selection that can be passed to select() can also be used in across(). Note that you can use _[...] to combine selections.

mtcars >> summarize(across(_[_.startswith("m"), _.endswith("p")], Fx.mean()))

	mpg	disp	hp
0	20.090625	230.721875	146.6875
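
The combined selection above keeps every column whose name starts with "m" or ends with "p". A rough pandas sketch of the same name matching follows, using only the mtcars column names from the table shown earlier. It illustrates the matching rule and is not siuba's own selection code.

```python
import pandas as pd

# Column names copied from the mtcars output shown earlier.
columns = pd.Index(
    ["mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb"]
)

# Union of two name-based selections, kept in the original column order.
mask = columns.str.startswith("m") | columns.str.endswith("p")
selected = list(columns[mask])
print(selected)
```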

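To apply several transformations at once, pass a dictionary. Each key is appended to the name of the column it was computed from.

mtcars >> summarize(across(_["mpg", "hp"], {"avg": Fx.mean(), "std": Fx.std()}))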

	mpg_avg	mpg_std	hp_avg	hp_std
0	20.090625	6.026948	146.6875	68.562868
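
The column names in this output follow a {col}_{key} pattern. The pandas sketch below reproduces that naming on made-up data. The funcs mapping and the toy data frame are assumptions for illustration, and this is not how siuba builds the result internally.

```python
import pandas as pd

# Hypothetical toy data standing in for mtcars.
cars = pd.DataFrame({"mpg": [21.0, 22.8, 18.7], "hp": [110, 93, 175]})

# Hypothetical mapping of suffix -> function, mirroring {"avg": Fx.mean(), "std": Fx.std()}.
funcs = {"avg": lambda col: col.mean(), "std": lambda col: col.std()}

# One summary column per (column, function) pair, named "{col}_{key}".
summary = pd.DataFrame(
    {
        f"{col}_{key}": [fn(cars[col])]
        for col in ["mpg", "hp"]
        for key, fn in funcs.items()
    }
)
print(summary)
```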

across() also works on grouped data, where the transformation is applied within each group.

mtcars >> group_by(_.cyl) >> summarize(across(_[_.mpg, _.hp], Fx.mean()))
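
For comparison, a grouped summary like this one can be approximated in plain pandas with groupby(). The sketch below uses a small made-up data frame rather than mtcars, so the numbers are illustrative only.

```python
import pandas as pd

# Hypothetical toy data: two cars for each of two cylinder counts.
cars = pd.DataFrame(
    {
        "cyl": [4, 4, 6, 6],
        "mpg": [22.8, 24.4, 21.0, 21.4],
        "hp": [93, 62, 110, 110],
    }
)

# One row per cyl group, holding the mean of each selected column.
by_cyl = cars.groupby("cyl", as_index=False)[["mpg", "hp"]].mean()
print(by_cyl)
```

The result has one row per value of cyl, with the grouping column followed by the summarized columns.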