disttools provides a fast, intuitive way to retrieve distances from large ‘dist’ objects. This functionality is implemented in the get_dists function, whose main use cases are outlined below.
After installing the package, it can be loaded by executing:
# Load the package.
library(disttools)
Below, some example data is randomly generated.
# Create some data to play with.
set.seed(123456789)
mat <- matrix(rnorm(10), ncol = 2)
A ‘dist’ object can be generated for these points by executing:
# Generate a 'dist' object.
mat_dists <- dist(mat)
Below, a set of point pairs is specified for distance retrieval.
# Specify index pairs of interest.
indices <- matrix(c(1, 2, 3, 2, 4, 4), ncol = 2, byrow = TRUE)
The function get_dists can be used to access the distance between pairs of points. This can be accomplished via two methods. First, a matrix of index pairs can be passed to the function along with the ‘dist’ object itself.
# Retrieve distances using the matrix-based method.
get_dists(mat_dists, indices)
Second, two vectors corresponding to the columns of the index matrix can be passed to the function along with the ‘dist’ object.
# Create vectors i and j from the above data.
i <- indices[, 1]
j <- indices[, 2]
# Retrieve distances using the paired vectors method.
get_dists(mat_dists, i, j)
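For readers who want to see what such a lookup involves, the following base-R sketch retrieves the same pairs in two ways: by expanding the ‘dist’ object with as.matrix, and by computing positions in the underlying vector with the index formula given in the documentation for dist. It is an illustration only and is not necessarily how get_dists works internally. The helper dist_position is hypothetical, and returning 0 for a pair such as (4, 4) is an assumption made for this sketch.

```r
# Same example data as in the walkthrough above.
set.seed(123456789)
mat <- matrix(rnorm(10), ncol = 2)
mat_dists <- dist(mat)
indices <- matrix(c(1, 2, 3, 2, 4, 4), ncol = 2, byrow = TRUE)

# Approach 1: expand to a full n x n matrix and index it with the pair matrix.
# Simple, but the full matrix needs n^2 cells, which is costly for large n.
via_matrix <- as.matrix(mat_dists)[indices]

# Approach 2: compute positions in the lower-triangle vector directly.
# dist_position() is a hypothetical helper, not part of disttools.
dist_position <- function(i, j, n) {
  lo <- pmin(i, j)  # smaller index of each pair
  hi <- pmax(i, j)  # larger index of each pair
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

n <- attr(mat_dists, "Size")
i <- indices[, 1]
j <- indices[, 2]

# Assumption for this sketch: a point's distance to itself is 0.
via_vector <- numeric(length(i))
off_diag <- i != j
via_vector[off_diag] <-
  as.numeric(mat_dists)[dist_position(i[off_diag], j[off_diag], n)]

# Both approaches should agree.
all.equal(via_matrix, via_vector)
```

The second approach never builds the full matrix, which is the kind of saving that matters when the ‘dist’ object is large.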
Sometimes, the distances for all combinations of a set of points are desired. This information can be easily extracted by executing the following:
# Create a matrix of unique index pairs.
index_pairs <- combn(1:3, 2) # Generate the combinations.
index_pairs <- t(index_pairs) # Transpose to put the data in tall format.
# Retrieve the distances as above.
get_dists(mat_dists, index_pairs)
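The same pattern should carry over to a non-contiguous subset of points. The sketch below builds the pair matrix for a hypothetical subset, pts, and looks the distances up with base R so that it runs without the package. The resulting index_pairs matrix has the same tall, two-column shape as the one passed to get_dists above.

```r
# Same example data as in the walkthrough above.
set.seed(123456789)
mat <- matrix(rnorm(10), ncol = 2)
mat_dists <- dist(mat)

# Hypothetical subset of points of interest (not from the original example).
pts <- c(1, 3, 5)

# All unique pairs within the subset, one pair per row.
index_pairs <- t(combn(pts, 2))
index_pairs

# Base-R lookup of the corresponding distances, used here as a stand-in.
subset_dists <- as.matrix(mat_dists)[index_pairs]
subset_dists
```

Note that combn treats a single positive number as shorthand for a sequence (combn(5, 2) behaves like combn(1:5, 2)), so a subset of length one needs special handling.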
It is often desirable to create a matrix or data.frame with two columns identifying the indices being compared and a third column giving the distance between them. For convenience, the argument return_indices can be set to TRUE, in which case a three-column matrix is returned. This matrix can be converted into a data.frame using the function as.data.frame.
# Retrieve distances together with their index pairs.
get_dists(mat_dists, indices, return_indices = TRUE)
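As a rough illustration of the conversion step, the sketch below assembles a comparable three-column matrix in base R and passes it to as.data.frame. The base-R construction is a stand-in so the example runs without the package, and the column names i, j, and dist are assumptions chosen for this sketch; the names produced by get_dists may differ.

```r
# Same example data as in the walkthrough above.
set.seed(123456789)
mat <- matrix(rnorm(10), ncol = 2)
mat_dists <- dist(mat)
indices <- matrix(c(1, 2, 3, 2, 4, 4), ncol = 2, byrow = TRUE)

# Stand-in for a three-column result: two index columns plus the distance.
# Column names are assumptions made for illustration.
res <- cbind(i = indices[, 1],
             j = indices[, 2],
             dist = as.matrix(mat_dists)[indices])

# Convert to a data.frame so the columns can be used by name.
res_df <- as.data.frame(res)

# Example: order the pairs from nearest to farthest.
res_df[order(res_df$dist), ]
```

Once the result is a data.frame, it can be filtered, sorted, or merged with other per-pair data in the usual way.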
The views expressed are those of the author(s) and do not reflect the official policy of the Department of the Army, the Department of Defense or the U.S. Government.