Spatial Anti Join Using KD-Trees

Usage

kd_anti_join(a, b, by = NULL, threshold = 1)

Arguments

a

the first dataframe you wish to join.

b

the second dataframe you wish to join.

by

a named vector indicating which columns to join on. The format is the same as in dplyr: by = c("column_name_in_df_a" = "column_name_in_df_b"), except that two columns must be specified for each dataset (an x column and a y column).

threshold

the distance threshold below which two observations are considered a match.

Value

a tibble fuzzily joined on the basis of the variables in by. The function tries to adhere to the same standards as the dplyr joins and uses the same logical joining patterns (e.g. an inner join keeps only observations that appear in both datasets).
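
To make the joining logic concrete, here is a minimal brute-force sketch in base R of what an anti join under a distance threshold could look like. It is not the package's implementation: it does not use a KD-tree, the helper name brute_force_anti_join and its arguments by_a and by_b are hypothetical, and it assumes Euclidean distance, a strict "below the threshold" comparison, and dplyr-style anti-join semantics (keep the rows of a that have no match in b).

```r
# Hypothetical helper for illustration only; not part of the package.
# Assumes Euclidean distance and a strict "<" comparison with the threshold.
brute_force_anti_join <- function(a, b, by_a, by_b = by_a, threshold = 1) {
  xa <- as.matrix(a[, by_a, drop = FALSE])
  xb <- as.matrix(b[, by_b, drop = FALSE])

  # For each row of a, check whether any row of b lies within the threshold.
  has_match <- vapply(seq_len(nrow(xa)), function(i) {
    d <- sqrt(rowSums(sweep(xb, 2, xa[i, ])^2))
    any(d < threshold)
  }, logical(1))

  # Keep only the rows of a that have no match in b.
  a[!has_match, , drop = FALSE]
}

pts_a <- data.frame(x = c(0, 0.5, 1), y = c(0, 0.5, 1), id = 1:3)
pts_b <- data.frame(x = c(0, 1) + 1e-7, y = c(0, 1) + 1e-7)

# Only the middle point of pts_a has no counterpart in pts_b.
unmatched <- brute_force_anti_join(pts_a, pts_b, by_a = c("x", "y"),
                                   threshold = 5e-5)
```

The sketch compares every row of a with every row of b, so its cost grows with the product of the two row counts. A KD-tree based search, as named in the title, is a way to avoid that exhaustive comparison.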

Examples

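# Build n points along the diagonal of the unit square
n <- 10

X_1 <- matrix(c(seq(0, 1, 1 / (n - 1)), seq(0, 1, 1 / (n - 1))), nrow = n)
# Shift every coordinate slightly so each point in X_2 sits next to its counterpart in X_1
X_2 <- X_1 + .0000001

X_1 <- as.data.frame(X_1)
X_2 <- as.data.frame(X_2)

# Add an identifier column to each data frame
X_1$id_1 <- 1:n
X_2$id_2 <- 1:n

# Every point has a match within the threshold, so no rows are returned
kd_anti_join(X_1, X_2, by = c("V1", "V2"), threshold = .00005)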
#> [1] V1.x V2.x id_1 V1.y V2.y id_2
#> <0 rows> (or 0-length row.names)