Code used by zoomerjoin to perform dplyr-style joins. Users who want to extend
zoomerjoin's functionality can write their own joining functions and pass them
to fuzzy_join_core.
Arguments
- a, b
The two dataframes to join.
- by
A named vector indicating which columns to join on. Format should be the same as dplyr: by = c("column_name_in_df_a" = "column_name_in_df_b"), but two columns must be specified in each dataset (x column and y column). Specifications made with dplyr::join_by() are also accepted.
- join_func
The joining function responsible for performing the join.
- mode
The dplyr-style type of join to perform.
- block_by
A named vector indicating which columns to 'block' (perform exact joining) on. Format should be the same as dplyr: by = c("column_name_in_df_a" = "column_name_in_df_b"), but two columns must be specified in each dataset (x column and y column). Specifications made with dplyr::join_by() are also accepted.
- similarity_column
An optional character vector. If provided, the returned data frame will contain a column with this name giving the similarity between the two fields. The extra column will not be present when anti-joining.
- ...
Other parameters to be passed to the joining function
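
The by and block_by formats described above can be illustrated with a minimal sketch. All column names here (lon_a, lat_a, state_a, and so on) are hypothetical, and the sketch only builds the specifications; it does not call fuzzy_join_core, whose join_func contract is not described on this page.

```r
# Named-vector form: names refer to columns in `a`, values to columns in `b`.
# Two columns per dataset (an x column and a y column), as described above.
by_vec <- c("lon_a" = "lon_b", "lat_a" = "lat_b")

cols_a <- names(by_vec)   # columns taken from `a`
cols_b <- unname(by_vec)  # columns taken from `b`

# Blocking uses the same named-vector format (hypothetical column names).
block_vec <- c("state_a" = "state_b")

# Equivalent specification with dplyr::join_by(), if dplyr (>= 1.1.0)
# is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  by_spec <- dplyr::join_by(lon_a == lon_b, lat_a == lat_b)
  print(by_spec)
}
```

Either by_vec or by_spec would then be supplied as the by argument, and block_vec as block_by.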