Title: | Reduced-Rank Regression |
---|---|
Description: | Reduced-rank regression, diagnostics and graphics. |
Authors: | Chris Addy [aut, cre] |
Maintainer: | Chris Addy <[email protected]> |
License: | GPL-3 |
Version: | 1.1.0.9000 |
Built: | 2024-11-04 03:49:13 UTC |
Source: | https://github.com/chrisaddy/rrr |
All Pairs Plots
allpairs_plot(x, y, type = "pca", rank, k = 0)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
scatterplot matrix.
data(pendigits)
digits_features <- pendigits[, -35:-36]
digits_class <- pendigits[, 35]
allpairs_plot(digits_features, digits_class, type = "pca", rank = 3)
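The `k` argument is described only as a small constant added to the diagonal of the covariance matrices. The base R sketch below illustrates why that can help, under the assumption that the constant is applied as `cov(x) + k * I`; the simulated data and variable names are hypothetical and not taken from the package.

```r
# Illustration only: how a small diagonal constant improves conditioning.
set.seed(42)
x1 <- rnorm(50)
x <- cbind(x1 = x1, x2 = 2 * x1, x3 = rnorm(50))  # x2 is an exact multiple of x1

s_xx <- cov(x)
k <- 0.001
s_xx_k <- s_xx + k * diag(ncol(x))  # assumed form of the adjustment

# Reciprocal condition numbers: values near zero mean inversion is unreliable
rcond_before <- rcond(s_xx)
rcond_after <- rcond(s_xx_k)

# The adjusted matrix can be inverted without numerical trouble
s_inv <- solve(s_xx_k)
```

Larger values of `k` make inversion more stable but move the estimate further from the unadjusted covariance, so a small value such as the `0.001` or `0.0001` used in the examples is a reasonable starting point.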
COMBO-17 galaxy photometric catalogue, 216, 219, 235
COMBO17
A data frame with 3462 observations on 65 numeric variables.
A. Izenman (2008). Modern Multivariate Statistical Techniques. Springer.
Wolf, C., Meisenheimer, K., Kleinheinrich, M., Borch, A., Dye, S., Gray, M., Wisotzki, L., Bell, E.F., Rix, H.-W., Cimatti, A., Hasinger, G., and Szokoly, G. (2004). A catalogue of the Chandra Deep Field South with multi-colour classification and photometric redshifts from COMBO-17. Astronomy & Astrophysics. https://arxiv.org/pdf/astro-ph/0403666.pdf
Pairwise Plots
pairwise_plot(x, y, type = "pca", pair_x = 1, pair_y = 2, rank = "full", k = 0, interactive = FALSE, point_size = 2.5)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
pair_x | variable to be plotted on the x-axis |
pair_y | variable to be plotted on the y-axis |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
interactive | logical. If TRUE, creates an interactive plotly graphic; if FALSE, the default, creates a static ggplot2 graphic. |
point_size | size of points in scatter plot. |
ggplot2 object if interactive = FALSE; plotly object if interactive = TRUE.
Izenman, A.J. (2008) Modern Multivariate Statistical Techniques. Springer.
data(pendigits)
digits_features <- pendigits[, 1:34]
digits_class <- pendigits[, 35]
pairwise_plot(digits_features, digits_class, type = "pca", pair_x = 1, pair_y = 3)

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
pairwise_plot(galaxy_x, galaxy_y, type = "cva")

data(iris)
iris_x <- iris[, 1:4]
iris_y <- iris[5]
pairwise_plot(iris_x, iris_y, type = "lda")
pen-based handwritten digit recognition, 211, 234, 274, 348, 391, 631
pendigits
a data frame with 10992 observations on 36 unnamed variables
http://archive.ics.uci.edu/ml/datasets.html
A. Izenman (2008) Modern Multivariate Statistical Techniques. Springer.
rank_trace is a plot used to determine the effective dimensionality, i.e., the rank of the coefficient matrix, of the reduced-rank regression equation.
rank_trace(x, y, type = "identity", k = 0, plot = TRUE, interactive = FALSE)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
plot | if FALSE, returns data frame of rank trace coordinates. |
interactive | if TRUE, creates an interactive plotly graphic. |
plot of rank trace coordinates if plot = TRUE, the default, or data frame of rank trace coordinates if plot = FALSE.
Izenman, A.J. (2008) Modern Multivariate Statistical Techniques. Springer.
data(tobacco)
tobacco_x <- tobacco[, 4:9]
tobacco_y <- tobacco[, 1:3]
gamma <- diag(1, dim(tobacco_y)[2])
rank_trace(tobacco_x, tobacco_y)
rank_trace(tobacco_x, tobacco_y, plot = FALSE)
rank_trace(tobacco_x, tobacco_y, type = "cva")

data(pendigits)
digits_features <- pendigits[, -35:-36]
rank_trace(digits_features, digits_features, type = "pca")

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
rank_trace(galaxy_x, galaxy_y, type = "cva")
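To make the idea of rank trace coordinates concrete, the following base R sketch computes them for the identity-weighted case, following the definition in Izenman (2008): for each candidate rank, the relative change in the coefficient matrix and in the residual covariance matrix, both measured against the full-rank fit. The helper `rank_trace_sketch` and the simulated data are hypothetical; this is not the package's implementation and its output format may differ from `rank_trace(..., plot = FALSE)`.

```r
# Sketch of rank trace coordinates (identity weight matrix, no k adjustment).
rank_trace_sketch <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  s_xx <- cov(x)
  s_yy <- cov(y)
  s_yx <- cov(y, x)
  c_full <- s_yx %*% solve(s_xx)  # full-rank coefficient matrix
  v <- eigen(c_full %*% t(s_yx), symmetric = TRUE)$vectors
  frob <- function(m) sqrt(sum(m^2))  # Frobenius norm
  # Residual covariance of y - cm %*% x for a given coefficient matrix cm
  resid_cov <- function(cm) {
    s_yy - cm %*% t(s_yx) - s_yx %*% t(cm) + cm %*% s_xx %*% t(cm)
  }
  see_full <- resid_cov(c_full)
  ranks <- 0:min(ncol(x), ncol(y))
  coords <- t(sapply(ranks, function(t_rank) {
    vt <- v[, seq_len(t_rank), drop = FALSE]
    c_t <- vt %*% t(vt) %*% c_full  # rank-t coefficient matrix
    c(dC = frob(c_full - c_t) / frob(c_full),
      dEE = frob(see_full - resid_cov(c_t)) / frob(see_full - s_yy))
  }))
  data.frame(rank = ranks, coords)
}

# Simulated data whose true coefficient matrix has rank 1
set.seed(1)
x <- matrix(rnorm(200 * 4), ncol = 4)
b <- outer(rnorm(4), rnorm(3))
y <- x %*% b + matrix(rnorm(200 * 3, sd = 0.1), ncol = 3)
rt <- rank_trace_sketch(x, y)
```

Rank 0 maps to the point (1, 1) and the full rank to (0, 0); the effective dimensionality is usually read off as the smallest rank at which both coordinates are close to zero.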
residuals calculates the regression residuals for reduced-rank regression and canonical variate analysis.
residuals(x, y, type = "identity", rank = "full", k = 0, plot = TRUE)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
plot | if FALSE, returns data frame of residuals. |
scatterplot matrix of residuals if plot = TRUE, the default, or a data frame of residuals if plot = FALSE.
Izenman, A.J. (2008) Modern Multivariate Statistical Techniques. Springer.
data(tobacco)
tobacco_x <- tobacco[, 4:9]
tobacco_y <- tobacco[, 1:3]
tobacco_rrr <- rrr(tobacco_x, tobacco_y, rank = 1)
residuals(tobacco_x, tobacco_y, rank = 1, plot = FALSE)
residuals(tobacco_x, tobacco_y, rank = 1)

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
residuals(galaxy_x, galaxy_y, type = "cva", rank = 2, k = 0.001)
rrr fits a reduced-rank regression model.
rrr(x, y, type = "identity", rank = "full", k = 0)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
list containing estimates of coefficients and means, and eigenvalue-based diagnostics.
Izenman, A.J. (2008) Modern Multivariate Statistical Techniques. Springer.
data(tobacco)
tobacco_x <- tobacco[, 4:9]
tobacco_y <- tobacco[, 1:3]
rrr(tobacco_x, tobacco_y, rank = 1)

data(pendigits)
digits_features <- pendigits[, -35:-36]
rrr(digits_features, digits_features, type = "pca", rank = 3)

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
rrr(galaxy_x, galaxy_y, type = "cva", rank = 2)

data(iris)
iris_x <- iris[, 1:4]
iris_y <- iris[5]
rrr(iris_x, iris_y, type = "lda")
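For readers who want to see what a reduced-rank fit involves, here is a minimal base R sketch of the identity-weighted estimator described in Izenman (2008): the full-rank least-squares coefficient matrix is projected onto the leading eigenvectors of S_yx S_xx^{-1} S_xy. The helper `rrr_sketch`, its return names, and the simulated data are hypothetical; the package's `rrr()` may differ in details such as how `k` is applied and what the returned list contains.

```r
# Sketch of reduced-rank regression with an identity weight matrix.
rrr_sketch <- function(x, y, rank, k = 0) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  s_xx <- cov(x) + k * diag(ncol(x))  # assumed form of the k adjustment
  s_yx <- cov(y, x)
  # Full-rank coefficient matrix: one row per response, one column per predictor
  c_full <- s_yx %*% solve(s_xx)
  # Leading eigenvectors of S_yx S_xx^{-1} S_xy
  v <- eigen(c_full %*% t(s_yx), symmetric = TRUE)$vectors[, seq_len(rank), drop = FALSE]
  c_reduced <- v %*% t(v) %*% c_full
  mu <- colMeans(y) - c_reduced %*% colMeans(x)
  list(mean = mu, C = c_reduced)
}

set.seed(1)
x <- matrix(rnorm(200 * 4), ncol = 4)
b <- matrix(rnorm(4 * 3), ncol = 3)
y <- x %*% b + matrix(rnorm(200 * 3, sd = 0.1), ncol = 3)

fit1 <- rrr_sketch(x, y, rank = 1)  # rank-1 coefficient matrix
fit3 <- rrr_sketch(x, y, rank = 3)  # full rank: matches ordinary least squares
```

With `rank` equal to the number of responses and `k = 0`, the projection is the identity, so the sketch reproduces the slopes from `lm(y ~ x)`; lower ranks constrain the coefficient matrix to a lower-dimensional subspace.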
Compute Latent Variable Scores
scores(x, y, type = "pca", rank = "full", k = 0)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
Izenman, A.J. (2008) Modern Multivariate Statistical Techniques. Springer.
data(pendigits)
digits_features <- pendigits[, -35:-36]
scores(digits_features, digits_features, type = "pca", rank = 3)

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
scores(galaxy_x, galaxy_y, type = "cva", rank = 4)

data(iris)
iris_x <- iris[, 1:4]
iris_y <- iris[5]
scores(iris_x, iris_y, type = "lda")
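The PCA examples pass the same matrix as both `x` and `y`, which reflects principal component analysis being a special case of reduced-rank regression. As a rough illustration of what latent variable scores are in that case, the base R sketch below projects centered data onto the leading eigenvectors of its covariance matrix. The helper `pca_scores_sketch` is hypothetical, and whether the package centers the data or returns scores in this exact form is an assumption.

```r
# Sketch of principal component scores via an eigendecomposition.
pca_scores_sketch <- function(x, rank) {
  x <- as.matrix(x)
  v <- eigen(cov(x), symmetric = TRUE)$vectors[, seq_len(rank), drop = FALSE]
  # Center the columns, then project onto the leading eigenvectors
  scale(x, center = TRUE, scale = FALSE) %*% v
}

x <- as.matrix(iris[, 1:4])
sc <- pca_scores_sketch(x, rank = 2)

# Reference scores from base R; individual columns may differ in sign
ref <- prcomp(x)$x[, 1:2]
```

Because eigenvectors are only defined up to sign, comparisons with `prcomp()` are best made on absolute values or after aligning signs column by column.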
Create three-dimensional, interactive plotly graphics for exploration and diagnostics.
threewise_plot(x, y, type = "pca", pair_x = 1, pair_y = 2, pair_z = 3, rank = "full", k = 0, point_size = 2.5)
x | data frame or matrix of predictor variables |
y | data frame or matrix of response variables |
type | type of reduced-rank regression model to fit. |
pair_x | variable to be plotted on the x-axis |
pair_y | variable to be plotted on the y-axis |
pair_z | variable to be plotted on the z-axis |
rank | rank of coefficient matrix. |
k | small constant added to diagonal of covariance matrices to make inversion easier. |
point_size | size of points in scatter plot. |
three-dimensional plot. If type = "pca", returns a plot of three principal component scores (by default the first three) against each other. If type = "cva", returns a three-dimensional plot of residuals. If type = "lda", returns a three-dimensional plot of three linear discriminant scores against each other.
## Not run:
data(pendigits)
digits_features <- pendigits[, -35:-36]
digits_class <- pendigits[, 35]  # class labels, defined as in the other examples
threewise_plot(digits_features, digits_class, type = "pca", k = 0.0001)

library(dplyr)
data(COMBO17)
galaxy <- as_tibble(COMBO17)  # as_data_frame() is deprecated
galaxy <- select(galaxy, -starts_with("e."), -Nr, -UFS:-IFD)
galaxy <- na.omit(galaxy)
galaxy_x <- select(galaxy, -Rmag:-chi2red)
galaxy_y <- select(galaxy, Rmag:chi2red)
threewise_plot(galaxy_x, galaxy_y, type = "cva")

data(iris)
iris_x <- iris[, 1:4]
iris_y <- iris[5]
threewise_plot(iris_x, iris_y, type = "lda")
## End(Not run)
chemical composition of tobacco, 183, 187
tobacco
a data frame with 25 observations on the following 9 variables.
‘Y1.BurnRate’ a numeric vector
‘Y2.PercentSugar’ a numeric vector
‘Y3.PercentNicotine’ a numeric vector
‘X1.PercentNitrogen’ a numeric vector
‘X2.PercentChlorine’ a numeric vector
‘X3.PercentPotassium’ a numeric vector
‘X4.PercentPhosphorus’ a numeric vector
‘X5.PercentCalcium’ a numeric vector
‘X6.PercentMagnesium’ a numeric vector
A. Izenman (2008). Modern Multivariate Statistical Techniques. Springer.
Anderson, R.L. and Bancroft, T.A. (1952). Statistical Theory in Research. New York: McGraw-Hill.