% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/distances.R
\name{get_distances}
\alias{get_distances}
\title{Get Distances}
\usage{
get_distances(df, distance_measures)
}
\arguments{
\item{df}{A data frame of cluster fill rates created with
\code{\link{get_cluster_fill_rates}}, with an added column that contains a writer ID.}

\item{distance_measures}{A vector of distance measures. Use 'abs' to
calculate the absolute difference, 'man' for the Manhattan distance, 'euc'
for the Euclidean distance, 'max' for the maximum absolute distance, and
'cos' for the cosine distance. The vector can be a single distance, or any
combination of these five distance measures.}
}
\value{
A data frame of distances
}
\description{
Calculate distances between all pairs of cluster fill rates in a data
frame using one or more distance measures. The available distance measures
are absolute distance, Manhattan distance, Euclidean distance, maximum distance,
and cosine distance.
}
\details{
The absolute distance between two n-length vectors of cluster fill rates, a
and b, is a vector of the same length as a and b. It can be calculated as
\code{abs(a - b)}, where the subtraction is performed element-wise and the
absolute value is then taken of each element. More specifically, element i of
the vector is \eqn{|a_i - b_i|} for \eqn{i=1,2,...,n}.

The Manhattan distance between two n-length vectors of cluster fill rates, a and b, is
\eqn{\sum_{i=1}^n |a_i - b_i|}. In other words, it is the sum of the elements of the
absolute distance vector.

The Euclidean distance between two n-length vectors of cluster fill rates, a and b, is
\eqn{\sqrt{\sum_{i=1}^n (a_i - b_i)^2}}. In other words, it is the square root of the sum of
the squared elements of the absolute distance vector.

The maximum distance between two n-length vectors of cluster fill rates, a and b, is
\eqn{\max_{1 \leq i \leq n}{\{|a_i - b_i|\}}}. In other words, it is the largest element of the
absolute distance vector.
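
As a hypothetical worked example (the values are chosen for illustration and
are not taken from package data), let \eqn{a = (0.2, 0.5, 0.3)} and
\eqn{b = (0.4, 0.4, 0.2)}. The absolute distance is the vector
\eqn{(0.2, 0.1, 0.1)}, the Manhattan distance is \eqn{0.2 + 0.1 + 0.1 = 0.4},
the Euclidean distance is \eqn{\sqrt{0.04 + 0.01 + 0.01} = \sqrt{0.06}}, which
is approximately 0.245, and the maximum distance is \eqn{0.2}.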

The cosine distance between two n-length vectors of cluster fill rates, a and b, is
\eqn{\sum_{i=1}^n (a_i - b_i)^2 / (\sqrt{\sum_{i=1}^n a_i^2}\sqrt{\sum_{i=1}^n b_i^2})}.
}
\examples{

rates <- test[1:3, ]
# calculate maximum and Euclidean distances between the first 3 documents in test.
distances <- get_distances(df = rates, distance_measures = c("max", "euc"))

# calculate Manhattan distances between all documents in test.
distances <- get_distances(df = test, distance_measures = c("man"))

}
