experimental.pivot.pivot_longer

experimental.pivot.pivot_longer(__data, *cols, *, names_to='name', names_prefix=None, names_sep=None, names_pattern=None, names_ptypes=None, names_repair='check_unique', values_to='value', values_drop_na=False, values_ptypes=None, values_transform=None)

Pivot data from wide to long format.

This function stacks columns of data, turning them into rows.
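
As a rough illustration of what "stacking" means here, the sketch below reshapes plain Python dictionaries by hand. It is not how `pivot_longer` is implemented; the helper `stack_wide_to_long` and its `id_cols` argument are hypothetical names used only for this example.

```python
def stack_wide_to_long(rows, id_cols, names_to="name", values_to="value"):
    """Turn each non-id column of each wide row into its own long row."""
    long_rows = []
    for row in rows:
        for col, val in row.items():
            if col in id_cols:
                continue
            # Carry the identifier columns over unchanged.
            long_row = {k: row[k] for k in id_cols}
            # The former column name becomes a value in `names_to`,
            # and the former cell becomes a value in `values_to`.
            long_row[names_to] = col
            long_row[values_to] = val
            long_rows.append(long_row)
    return long_rows


wide = [{"id": 1, "x": 5, "y": 7}, {"id": 2, "x": 6, "y": 8}]
long = stack_wide_to_long(
    wide, id_cols=["id"], names_to="variable", values_to="number"
)
```

Each wide row with two pivoted columns yields two long rows, so the two-row input becomes four rows.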

Parameters

Name	Type	Description	Default
`__data`		The input data.	required
`*cols`		Columns to pivot into longer format. This uses tidyselect (e.g. `_[_.some_col, _.another_col]`).	`()`
`names_to`	Union[str, Tuple[str, …]]	A string, or a sequence of strings, naming the new column or columns to create from the information stored in the column names selected by `*cols`.	`'name'`
`names_prefix`	Optional[str]	A regular expression to strip from the start of each column name selected by `*cols`.	`None`
`names_sep`	Optional[str]	If `names_to` is a list of name parts, this is the separator the column names are split on. This is the same as the `sep` argument of the `separate()` function.	`None`
`names_pattern`	Optional[str]	If `names_to` is a list of name parts, this is a regular expression whose groups extract the parts. This is the same as the `regex` argument of the `extract()` function.	`None`
`names_ptypes`	Optional[Tuple]	Not implemented.	`None`
`values_ptypes`	Optional[Tuple]	Not implemented.	`None`
`names_transform`		TODO	required
`names_repair`	str	Strategy for fixing invalid column names. “minimal” leaves them as is. “check_unique” raises an error if there are duplicate names. “unique” de-duplicates names by appending “___{position}” to them.	`'check_unique'`
`values_to`	str	A string specifying the name of the column created to hold the stacked values of the selected `*cols`. If names_to is a list with the entry “.value”, then this argument is ignored.	`'value'`
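
The three name-parsing options (`names_prefix`, `names_sep`, `names_pattern`) can be pictured with the standard `re` module. This is only a sketch of the behaviour described in the table, not the library's code: the helper names are hypothetical, treating `names_sep` as a regular expression is an assumption, and `height_2020` is a made-up column name.

```python
import re


def strip_prefix(col, names_prefix):
    # re.match only matches at the start of the string, so the prefix
    # is removed only when the column name begins with it.
    m = re.match(names_prefix, col)
    return col[m.end():] if m else col


def split_by_sep(col, names_sep):
    # Assumption: the separator is interpreted as a regular expression.
    return tuple(re.split(names_sep, col))


def extract_by_pattern(col, names_pattern):
    # Each capture group supplies one entry of names_to.
    m = re.search(names_pattern, col)
    return m.groups() if m else None


week = strip_prefix("wk1", "wk")
parts = split_by_sep("height_2020", "_")
fields = extract_by_pattern("a_x1", "(.*)_(.)(.*)")
```

Here `week` is `"1"`, `parts` is `("height", "2020")`, and `fields` is `("a", "x", "1")`, one piece for each entry in `names_to`.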

Examples

>>> import pandas as pd
>>> from siuba import _
>>> from siuba.experimental.pivot import pivot_longer

>>> df = pd.DataFrame({"id": [1, 2], "x": [5, 6], "y": [7, 8]})
>>> pivot_longer(df, ~_.id, names_to="variable", values_to="number")
   id variable  number
0   1        x       5
0   1        y       7
1   2        x       6
1   2        y       8

>>> weeks = pd.DataFrame({"id": [1], "year": [2020], "wk1": [5], "wk2": [6]})
>>> pivot_longer(weeks, _.startswith("wk"), names_to="week", names_prefix="wk")
   id  year week  value
0   1  2020    1      5
0   1  2020    2      6

>>> df2 = pd.DataFrame({"id": [1], "a_x1": [2], "b_x2": [3], "a_y1": [4]})
>>> names = ["condition", "group", "number"]
>>> pat = "(.*)_(.)(.*)"
>>> pivot_longer(df2, _["a_x1":"a_y1"], names_to = names, names_pattern = pat)
   id condition group number  value
0   1         a     x      1      2
0   1         b     x      2      3
0   1         a     y      1      4

>>> wide = pd.DataFrame({
...    "x1": [1, 11], "x2": [2, 22], "y1": [3, 33], "y2": [4, 44]
... })
>>> pivot_longer(wide, _[:], names_to = [".value", "set"], names_pattern = "(.)(.)")
  set   x   y
0   1   1   3
0   2   2   4
1   1  11  33
1   2  22  44