The fitur package includes several tools for visually inspecting how well a distribution fits a dataset. To start, fictional empirical data is generated below. Typically this would come from a real-world dataset such as the time it takes to serve a customer at a bank, the length of stay in an emergency department, or customer arrivals to a queue.
Below is a histogram showing the shape of the distribution, with the y-axis set to show probability density.
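As a minimal sketch of the two steps above, the following base R code generates a stand-in dataset and draws a density-scaled histogram. The seed, sample size, Weibull parameters, and number of breaks are illustrative assumptions, not values taken from the package.

```r
# Hypothetical stand-in for the empirical data: the seed, sample size, and
# Weibull parameters are illustrative assumptions.
set.seed(42)
x <- rweibull(1000, shape = 5, scale = 1)

# freq = FALSE puts probability density (not counts) on the y-axis,
# so the bar areas sum to 1.
h <- hist(x, breaks = 30, freq = FALSE,
          main = "Empirical data", xlab = "x")
```

Scaling the histogram to density is what allows fitted density curves to be drawn on the same axes later.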
Three distributions have been chosen below to test against the dataset. Using the fit_univariate function, each distribution is fit to the data, producing a fitted object. The first item in each fitted object is the probability density function. Each fit is overplotted onto the histogram to see which distribution fits best.
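The overplotting step can be sketched in base R without relying on the exact fit_univariate signature. In the sketch below, the three candidate distributions (normal, gamma, lognormal), the simple moment-based parameter estimates, and the list layout with a density function `d` as the first item are all assumptions made for illustration; they stand in for the fitted objects the package returns.

```r
# Hypothetical data, as in the earlier steps (illustrative parameters).
set.seed(42)
x <- rweibull(1000, shape = 5, scale = 1)

# Hand-rolled estimates standing in for fit_univariate(); these are
# method-of-moments / closed-form estimates, not the package's fitting method.
m <- mean(x)
v <- var(x)
ml <- mean(log(x))
sl <- sd(log(x))

# Each "fitted object" is a list whose first item is the density function.
fits <- list(
  normal    = list(d = function(z) dnorm(z, mean = m, sd = sqrt(v))),
  gamma     = list(d = function(z) dgamma(z, shape = m^2 / v, rate = m / v)),
  lognormal = list(d = function(z) dlnorm(z, meanlog = ml, sdlog = sl))
)

# Density-scaled histogram with each candidate density drawn on top.
hist(x, breaks = 30, freq = FALSE, main = "Candidate fits", xlab = "x")
grid <- seq(min(x), max(x), length.out = 200)
cols <- c("red", "blue", "darkgreen")
for (i in seq_along(fits)) {
  lines(grid, fits[[i]]$d(grid), col = cols[i], lwd = 2)
}
legend("topleft", legend = names(fits), col = cols, lwd = 2)
```

The curve that tracks the histogram bars most closely is the best visual candidate, though the histogram alone rarely separates similar fits, which is what motivates the Q-Q and P-P plots.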
The next plot is the quantile-quantile (Q-Q) plot. The plot_qq function takes a numeric vector x of the empirical data and sorts it. A range of probabilities is computed and then used to compute comparable quantiles using the q distribution function from the fitted objects. A good fit closely follows the line y = 0 + 1*x (intercept 0, slope 1). Note: the Q-Q plot tends to be more sensitive around the “tails” of the distributions.
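The Q-Q construction described above can be sketched by hand in base R. This is not the plot_qq implementation: the use of `ppoints` for the probabilities and the normal quantile function `q_fit` are assumptions chosen to illustrate the idea.

```r
# Hypothetical data (illustrative parameters).
set.seed(42)
x <- rweibull(1000, shape = 5, scale = 1)

# Hypothetical fitted quantile function: a normal with the sample mean and sd.
mu <- mean(x)
sigma <- sd(x)
q_fit <- function(p) qnorm(p, mean = mu, sd = sigma)

# Sort the data, then pair each order statistic with the theoretical
# quantile at a matching probability strictly inside (0, 1).
x_sorted <- sort(x)
probs <- ppoints(length(x_sorted))
theoretical <- q_fit(probs)

plot(theoretical, x_sorted,
     xlab = "Theoretical quantiles", ylab = "Empirical quantiles",
     main = "Q-Q plot (sketch)")
abline(a = 0, b = 1, col = "red")  # reference line y = 0 + 1*x
```

Because both axes are in the units of the data, departures from the reference line are largest where quantiles are spread out, which is typically in the tails.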
The percentile-percentile (P-P) plot rescales the input data to the interval (0, 1] and then calculates the theoretical percentiles to compare. The plot_pp function takes the same inputs as plot_qq, but it rescales x and then computes the percentiles using the p distribution function of the fitted object. A good fit matches the line y = 0 + 1*x (intercept 0, slope 1). Note: the P-P plot tends to be more sensitive in the middle of the distribution.
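A comparable hand-rolled sketch of the P-P construction follows. The rescaling shown here (rank divided by sample size, which lands in (0, 1]) and the normal distribution function `p_fit` are assumptions for illustration; plot_pp may compute its percentiles differently.

```r
# Hypothetical data (illustrative parameters).
set.seed(42)
x <- rweibull(1000, shape = 5, scale = 1)

# Hypothetical fitted distribution function: a normal with the sample mean and sd.
mu <- mean(x)
sigma <- sd(x)
p_fit <- function(z) pnorm(z, mean = mu, sd = sigma)

# Empirical percentiles: rank / n for the sorted data, which lies in (0, 1].
x_sorted <- sort(x)
empirical <- seq_along(x_sorted) / length(x_sorted)

# Theoretical percentiles: the fitted CDF evaluated at the same data points.
theoretical <- p_fit(x_sorted)

plot(theoretical, empirical,
     xlab = "Theoretical percentiles", ylab = "Empirical percentiles",
     main = "P-P plot (sketch)")
abline(a = 0, b = 1, col = "red")  # reference line y = 0 + 1*x
```

Both axes are probabilities pinned near 0 and 1 at the extremes, so discrepancies show up mainly in the middle of the distribution.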