semi_join

semi_join(left, right=None, on=None, *args, by=None)

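Return the rows of the left table that have a match in the right table, i.e. every left row that would be kept in an inner join. Only the left table's columns are returned, and each matching row appears once, even if it matches several rows on the right.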

Parameters

Name Type Description Default
left The left-hand table. required
right The right-hand table. None
on How to match them. By default, it matches on all columns with the same name across the two tables. None

Examples

>>> import pandas as pd
>>> from siuba import _, semi_join, anti_join
>>> df1 = pd.DataFrame({"id": [1, 2, 3], "x": ["a", "b", "c"]})
>>> df2 = pd.DataFrame({"id": [2, 3, 3], "y": ["l", "m", "n"]})
>>> df1 >> semi_join(_, df2)
   id  x
1   2  b
2   3  c
>>> df1 >> anti_join(_, df2)
   id  x
0   1  a
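
For comparison, the same two results can be sketched in plain pandas. This is only an illustration of what the joins compute for a single key column, not how siuba implements them; the names `has_match`, `semi`, and `anti` are introduced here for the example.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "x": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 3], "y": ["l", "m", "n"]})

# True for each row of df1 whose id appears anywhere in df2.
# (Assumption: a single key column, "id".)
has_match = df1["id"].isin(df2["id"])

# Semi join: keep matching left rows. Each left row is kept at most once,
# even though id 3 appears twice in df2.
semi = df1[has_match]

# Anti join: keep the left rows with no match.
anti = df1[~has_match]

print(semi)
print(anti)
```

Because this filters `df1` rather than merging, no columns from `df2` are added and the original row labels of `df1` are preserved.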

Generally, it’s a good idea to explicitly specify the on argument.

>>> df1 >> anti_join(_, df2, on="id")
   id  x
0   1  a
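
One plausible reason for that advice (an interpretation, not something stated above): when on is omitted, the match uses every column name the two tables share, so an unrelated column that happens to have the same name quietly becomes part of the key. The sketch below reproduces that effect with plain pandas rather than siuba; `df3`, `shared`, `by_shared`, and `by_id` are hypothetical names for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "x": ["a", "b", "c"]})
# Hypothetical right-hand table that also happens to have a column named "x".
df3 = pd.DataFrame({"id": [2, 3], "x": ["b", "z"]})

# Columns present in both tables: what a match on "all columns with the
# same name" would use.
shared = [col for col in df1.columns if col in df3.columns]

# Matching on every shared column: both id and x must agree.
# Note that merge() renumbers the rows, unlike the filter below.
by_shared = df1.merge(df3[shared].drop_duplicates(), on=shared, how="inner")

# Matching on "id" only, the pandas analogue of an explicit on="id".
by_id = df1[df1["id"].isin(df3["id"])]

print(shared)
print(by_shared)
print(by_id)
```

Here the match on all shared columns keeps only the row with id 2, while the match on "id" alone keeps ids 2 and 3, so being explicit about the key avoids depending on which column names happen to overlap.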