| Title: | Linear Programming Discriminant Analysis |
|---|---|
| Description: | Classification method obtained through linear programming. It is advantageous with respect to the classical developments when the distribution of the variables involved is unknown or when the number of variables is much greater than the number of individuals. Mathematical details behind the method are published in Nueda, et al. (2022) "LPDA: A new classification method based on linear programming". <doi:10.1371/journal.pone.0270403>. |
| Authors: | María José Nueda [aut, cre] |
| Maintainer: | María José Nueda <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.2.3 |
| Built: | 2026-05-12 08:59:35 UTC |
| Source: | https://github.com/cran/lpda |
This function computes a discriminating hyperplane for two groups, using either the original data (calling lpda.fit) or principal components (calling lpda.pca).
lpda(data, group, scale = FALSE, pca = FALSE, PC = 2, Variability = NULL, f1 = NULL, f2 = NULL)

## S3 method for class 'lpda'
print(x, ...)
data |
Matrix containing data. Individuals in rows and variables in columns |
group |
Vector with the variable group |
scale |
Logical indicating whether the data should be standardized. When pca=TRUE the data are always scaled. |
pca |
Logical indicating if Principal Components Analysis is required |
PC |
Number of Principal Components (PC) for PCA. By default it is 2. If the number of PCs is not fixed in advance, it can be determined by choosing the desired proportion of explained variability (Variability parameter). |
Variability |
Parameter for Principal Components (PC) selection. This is the minimum desired proportion of variability explained by the PCs. The analysis is always done with a minimum of 2 PCs. If it is NULL, the PCA is computed with the PC parameter. |
f1 |
Vector with weights for individuals of the first group. If NULL they are equally weighted. |
f2 |
Vector with weights for individuals of the second group. If NULL they are equally weighted. |
x |
An object of class " |
... |
Other arguments passed. |
lpda returns an object of class "lpda".
coef |
Hyperplane coefficients |
data |
Input data matrix |
group |
Input group vector |
scale |
Input scale argument |
pca |
Input pca argument |
loadings |
Principal Components loadings. Shown when pca = TRUE |
scores |
Principal Components scores. Shown when pca = TRUE |
var.exp |
A matrix containing the explained variance for each component and the cumulative variance. Shown when pca = TRUE |
PCs |
Number of Principal Components in the analysis. Shown when pca = TRUE |
The functions predict and plot can be used to obtain the predicted classes and a plot in two dimensions with the distances to the computed hyperplane for the two classes.
Maria Jose Nueda, [email protected]
Nueda MJ, Gandía C, Molina MD (2022) LPDA: A new classification method based on linear programming. PLoS ONE 17(7): e0270403. <https://doi.org/10.1371/journal.pone.0270403>
######### palmdates example in lpda package:
data(palmdates)
group = as.factor( c(rep("Spanish",11),rep("Foreign",10)) )

# with concentration data:
model = lpda(data = palmdates$conc, group = group )
summary(model)
predict(model)
plot(model, main = "Palmdates example")

model.pca = lpda(data = palmdates$conc, group = group, pca=TRUE, PC = 2)
plot(model.pca, PCscores = TRUE, main = "Palmdates example")

# with spectra data
model.pca = lpda(data = palmdates$spectra, group = group, pca=TRUE, Variability = 0.9)
model.pca$PCs # 4 PCs to explain 90% of the variability
plot(model.pca, PCscores = TRUE, main = "Spectra palmdates")
This function applies the lpda methodology to classify individuals into two or more groups, either with the original data (by applying lpda through the third dimension) or by applying lpda to the parafac scores.
lpda.3D(data, group, scale = FALSE, pfac = FALSE, nfac = 2, nstart = 10, seed=2, f1 = NULL, f2 = NULL)

## S3 method for class 'lpda.3D'
print(x, ...)
data |
Array containing data. Individuals in the first mode, variables in the second mode, and time or a similar factor in the third mode. |
group |
Vector with the variable group. |
scale |
Logical indicating whether the data should be standardized. |
pfac |
Logical indicating if Parafac Analysis is required |
nfac |
Number of factors for Parafac Analysis. |
nstart |
Number of random starts for multiway analysis. |
seed |
A single value used as seed to reproduce the same results in multiway methods. |
f1 |
Vector with weights for individuals of the first group. If NULL they are equally weighted. |
f2 |
Vector with weights for individuals of the second group. If NULL they are equally weighted. |
x |
An object of class " |
... |
Other arguments passed. |
lpda.3D returns an object of class "lpda.3D".
MOD |
When |
data |
Input array data |
group |
Input group vector |
pfac |
Input pfac argument |
The functions predict and plot can be used to obtain the predicted classes and a plot in two dimensions with the distances to the computed hyperplane for the two classes.
Maria Jose Nueda, [email protected]
Nueda MJ, Gandía C, Molina MD (2022) LPDA: A new classification method based on linear programming. PLoS ONE 17(7): e0270403. <https://doi.org/10.1371/journal.pone.0270403>
### RNAseq is a 3-dimensional array
data(RNAseq)
group = as.factor(rep(c("G1","G2"), each = 10))

## Strategy 1
model3D = lpda.3D(RNAseq, group)
summary(model3D)
predict(model3D)
plot(model3D, mfrow=c(2,2))

## Strategy 2: with parafac
model3Ds2 = lpda.3D(RNAseq, group, pfac=TRUE, nfac=2)
model3Ds2$MOD$mod.pfac$Rsq
predict(model3Ds2)
summary(model3Ds2)
plot(model3Ds2, pfacscores=FALSE, main="Parafac Model", mfrow=c(1,1))
plot(model3Ds2, pfacscores=TRUE, cex=1.5, main="Parafac components")
legend("bottomright", levels(group), col=c(2,3), pch=20)
lpda.fit computes the discriminating hyperplane for two groups, giving as a result the coefficients of the hyperplane.
lpda.fit(data, group, f1 = NULL, f2 = NULL)
data |
Matrix containing data. Individuals in rows and variables in columns |
group |
Vector with the variable group |
f1 |
Vector with weights for individuals of the first group |
f2 |
Vector with weights for individuals of the second group |
coef |
Hyperplane coefficients |
Maria Jose Nueda, [email protected]
Nueda MJ, Gandía C, Molina MD (2022) LPDA: A new classification method based on linear programming. PLoS ONE 17(7): e0270403. <https://doi.org/10.1371/journal.pone.0270403>
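As a rough illustration of how a discriminating hyperplane is used once its coefficients are known, the sketch below evaluates a few individuals against a hyperplane and assigns each one to a side. The weight vector `w`, the intercept `b` and the group labels are made-up values for illustration; how lpda.fit actually lays out its `coef` vector is not stated here, so this should not be read as the package's internal representation.

```r
# Hypothetical hyperplane w'x + b = 0; these numbers are invented for illustration.
w <- c(1, -1)
b <- 0.5

# Three individuals (rows) measured on two variables (columns).
X <- rbind(c(2, 0), c(0, 2), c(1, 1))

value <- drop(X %*% w + b)        # evaluation of each individual on the hyperplane
dist <- value / sqrt(sum(w^2))    # signed distance to the hyperplane
side <- ifelse(value >= 0, "group 1", "group 2")
```

The sign of `value` decides the predicted side, and `dist` is the kind of quantity a distance plot would display.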
lpda.pca computes the discriminating hyperplane for two groups with Principal Components (PC)
lpda.pca(data, group, PC = 2, Variability = NULL)
data |
Matrix containing data. Individuals in rows and variables in columns |
group |
Vector with the variable group |
PC |
Number of Principal Components (PC) for PCA. By default it is 2. If the number of PCs is not fixed in advance, it can be determined by choosing the desired proportion of explained variability (Variability parameter). |
Variability |
Parameter for Principal Components (PC) selection. This is the minimum desired proportion of variability explained by the PCs. The analysis is always done with a minimum of 2 PCs. If it is NULL, the PCA is computed with the PC parameter. |
loadings |
Principal Components loadings. |
scores |
Principal Components scores. |
var.exp |
A matrix containing the explained variance for each component and the cumulative variance. |
PCs |
Number of Principal Components in the analysis. |
Maria Jose Nueda, [email protected]
Nueda MJ, Gandía C, Molina MD (2022) LPDA: A new classification method based on linear programming. PLoS ONE 17(7): e0270403. <https://doi.org/10.1371/journal.pone.0270403>
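The selection rule described for the Variability parameter (smallest number of PCs reaching the requested proportion of explained variability, with a floor of 2) can be sketched in base R. The helper `n_pcs_for` is a hypothetical name and uses `prcomp` on scaled data as a stand-in; it is not the code used inside lpda.pca.

```r
# Hypothetical helper, not a function of the lpda package.
n_pcs_for <- function(X, variability, min_pcs = 2) {
  pc <- prcomp(X, center = TRUE, scale. = TRUE)
  cum <- cumsum(pc$sdev^2) / sum(pc$sdev^2)   # cumulative explained proportion
  k <- which(cum >= variability)[1]           # first PC count reaching the target
  max(k, min_pcs)                             # never fewer than min_pcs components
}

set.seed(3)
X <- matrix(rnorm(20 * 8), nrow = 20)         # 20 individuals, 8 variables
k90 <- n_pcs_for(X, 0.9)                      # PCs needed for 90% of the variability
k10 <- n_pcs_for(X, 0.1)                      # a tiny target still returns 2
```

The floor of 2 matters for low targets: a single component would already explain 10% here, yet two are kept so that a two-dimensional score plot remains possible.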
lpdaCV evaluates the classification error rate with a cross-validation procedure.
lpdaCV(data, group, scale = FALSE, pca = FALSE, PC = 2, Variability = NULL, CV = "ktest", ntest = 10, R = 10, f1 = NULL, f2 = NULL)

## S3 method for class 'lpdaCV'
print(x, ...)
data |
Matrix containing data. Individuals in rows and variables in columns |
group |
Vector with the variable group |
scale |
Logical indicating whether the data should be standardized. |
pca |
Logical indicating if a reduction of dimension is required |
PC |
Number of Principal Components (PC) for PCA. By default it is 2. If the number of PCs is not fixed in advance, it can be determined by choosing the desired proportion of explained variability (Variability parameter) or by choosing the maximum number of errors allowed in the training set (Error.max). |
Variability |
Parameter for Principal Components (PC) selection. This is the desired proportion of variability explained by the PCs. |
CV |
Cross-validation mode: "loo" (leave one out) or "ktest" (leaves k individuals in the test set). |
ntest |
Number of samples to evaluate in the test-set. |
R |
Number of times that the error is evaluated. |
f1 |
Vector with weights for individuals of the first group. If NULL they are equally weighted. |
f2 |
Vector with weights for individuals of the second group. If NULL they are equally weighted. |
x |
An object of class " |
... |
Other arguments passed. |
lpdaCV returns the classification prediction error rate.
Maria Jose Nueda, [email protected]
### RNAseq is a 3-dimensional array
data(RNAseq)
data = RNAseq[,,3]
group = as.factor(rep(c("G1","G2"), each = 10))
lpdaCV(data, group, pca = TRUE, CV = "ktest", ntest = 2)
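To make the "ktest" idea concrete, here is a minimal base-R sketch of repeated hold-out validation: in each of R repetitions, ntest individuals are left out, a classifier is fitted on the rest, and the test error is recorded. A nearest-centroid classifier is used purely as a stand-in so the example runs without the package; `fit_centroids`, `predict_centroids` and `ktest_cv` are hypothetical names, and the way lpdaCV draws its test sets may differ.

```r
# Stand-in classifier (nearest centroid); lpda itself is NOT used here.
fit_centroids <- function(X, g) {
  t(sapply(levels(g), function(l) colMeans(X[g == l, , drop = FALSE])))
}
predict_centroids <- function(cent, Xnew) {
  # Squared Euclidean distance of every new individual to every centroid.
  d <- apply(cent, 1, function(m) rowSums(sweep(Xnew, 2, m)^2))
  d <- matrix(d, nrow = nrow(Xnew), dimnames = list(NULL, rownames(cent)))
  factor(colnames(d)[max.col(-d, ties.method = "first")],
         levels = rownames(cent))
}

ktest_cv <- function(X, g, ntest = 2, R = 10) {
  errors <- numeric(R)
  for (r in seq_len(R)) {
    test_id <- sample(nrow(X), ntest)            # hold out ntest individuals
    cent <- fit_centroids(X[-test_id, , drop = FALSE], g[-test_id])
    pred <- predict_centroids(cent, X[test_id, , drop = FALSE])
    errors[r] <- mean(pred != g[test_id])        # test error of this repetition
  }
  mean(errors)                                   # average over the R repetitions
}

# Simulated two-group data, well separated on purpose.
set.seed(2)
X <- rbind(matrix(rnorm(50, mean = 0), ncol = 5),
           matrix(rnorm(50, mean = 3), ncol = 5))
g <- factor(rep(c("G1", "G2"), each = 10))
cv_error <- ktest_cv(X, g, ntest = 2, R = 10)
```

Averaging over several random splits gives a more stable estimate than a single split, at the cost of refitting the model R times.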
lpdaCV.3D evaluates the classification error rate with a cross-validation procedure.
lpdaCV.3D(data, group, scale = FALSE, pfac = FALSE, nfac = 2, nstart = 10, seed=2, CV = "ktest", ntest = 10, R = 10, f1 = NULL, f2 = NULL)
data |
Array containing data. Individuals in the first mode, variables in the second mode, and time or a similar factor in the third mode. |
group |
Vector with the variable group. |
scale |
Logical indicating whether the data should be standardized. |
pfac |
Logical indicating if Parafac Analysis is required |
nfac |
Number of factors for Parafac Analysis. By default it is 2. |
nstart |
Number of random starts for multiway analysis. |
seed |
A single value used as seed to reproduce the same results in multiway methods. If NULL the start will be random. |
CV |
Cross-validation mode: "loo" (leave one out) or "ktest" (leaves k individuals in the test set). |
ntest |
Number of samples to evaluate in the test-set. |
R |
Number of times that the error is evaluated. |
f1 |
Vector with weights for individuals of the first group. If NULL they are equally weighted. |
f2 |
Vector with weights for individuals of the second group. If NULL they are equally weighted. |
lpdaCV.3D returns the classification prediction error rate.
Maria Jose Nueda, [email protected]
### RNAseq is a 3-dimensional array
data(RNAseq)
group = as.factor(rep(c("G1","G2"), each = 10))
lpdaCV.3D(RNAseq, group , CV = "ktest", R=5, ntest=5, pfac=TRUE, nfac=c(2,10))
A data set with scores of 21 dates on spectrometry and concentration measurements of the substances that best define the quality of the dates: fibre, sorbitol, fructose, glucose and myo-inositol. The first 11 dates are Spanish (from Elche, Alicante) and the last 10 are from other countries, mainly Arabian.
palmdates
A data frame with 2 elements:
a data frame with 5 columns: fibre, sorbitol, fructose, glucose and myo-inositol.
a data frame with 2050 columns.
Maria Jose Nueda, [email protected]
Abdrabo, S.S., Gras, L., Grindlay, G. and Mora, J. (2021) Evaluation of Fourier Transform-Raman Spectroscopy for palm dates characterization. Journal of food composition and analysis. Submitted.
Computes a Principal Component Analysis when p>n and when p<=n.
PCA(X)
X |
Matrix or data.frame with variables in columns and observations in rows. |
eigen |
An eigen class object with eigenvalues and eigenvectors of the analysis. |
var.exp |
A matrix containing the explained variance for each component and the cumulative variance. |
scores |
Scores of the PCA analysis. |
loadings |
Loadings of the PCA analysis. |
Maria Jose Nueda, [email protected]
## Simulate data matrix with 500 variables and 10 observations
datasim = matrix(sample(0:100, 5000, replace = TRUE), nrow = 10)
## PCA
myPCA = PCA(datasim)
## Extracting the variance explained by each principal component
myPCA$var.exp
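When p>n, one common way to obtain principal components cheaply is to eigen-decompose the small n x n cross-product matrix instead of the p x p covariance matrix and then map the eigenvectors back to loadings. The base-R sketch below shows that general technique on simulated data; it is an assumption that PCA() works along these lines, and the object names are illustrative only.

```r
# Base-R sketch of PCA for a wide matrix (p > n); not the package's code.
set.seed(1)
X <- matrix(rnorm(10 * 500), nrow = 10)        # 10 observations, 500 variables
Xc <- scale(X, center = TRUE, scale = FALSE)   # column-centre the data
n <- nrow(Xc)

# Eigen-decompose the n x n matrix Xc Xc' / (n - 1).
eg <- eigen(tcrossprod(Xc) / (n - 1), symmetric = TRUE)
keep <- eg$values > 1e-8 * eg$values[1]        # centring leaves at most n - 1 components
lambda <- eg$values[keep]
U <- eg$vectors[, keep, drop = FALSE]

# Map back to unit-length loadings (p x k) and compute scores (n x k).
loadings <- crossprod(Xc, U) %*%
  diag(1 / sqrt((n - 1) * lambda), nrow = length(lambda))
scores <- Xc %*% loadings

# Explained and cumulative proportion of variance per component.
var.exp <- cbind(prop = lambda / sum(lambda),
                 cum = cumsum(lambda) / sum(lambda))
```

The non-zero eigenvalues of the small matrix coincide with those of the covariance matrix, so the explained variances are the same while the decomposition involves a 10 x 10 matrix rather than a 500 x 500 one.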
plot.lpda is applied to an lpda class object. It shows a two-dimensional plot of the distance of each individual to the computed hyperplane, coloring each case by its real class.
## S3 method for class 'lpda'
plot(x, PCscores = FALSE, main = NULL, xlab = NULL, ylab = NULL, col= NULL, pch = NULL, lty = NULL, legend.pos = "topright",...)
x |
Object of class inheriting from "lpda" |
PCscores |
Logical to show the first 2 PCscores. Only possible when PCA is applied. |
main |
An optional title for the plot. |
xlab |
An optional title for x-axis. |
ylab |
An optional title for y-axis. |
col |
An optional vector with colours for the groups. |
pch |
An integer specifying the symbol to be used in plotting points. When NULL, pch=20. |
lty |
The line type. If it is not specified, lty = 2 for the distances to the hyperplane and lty = 1 for the PCs plot. |
legend.pos |
The position for the legend. By default it is topright. NULL when no legend is required. |
... |
Other arguments passed. |
Two dimensional plot representing the distances to the computed hyperplane of each individual colored with the real class.
Maria Jose Nueda, [email protected]
plot.lpda.3D is applied to an lpda.3D class object. It shows a two-dimensional plot of the distance of each individual to the computed hyperplane, coloring each case by its real class.
## S3 method for class 'lpda.3D'
plot(x, pfacscores = FALSE, main = NULL, legend.pos = "topright", ...)
x |
Object of class inheriting from "lpda.3D" |
pfacscores |
Logical to show the first 2 parafac scores. Only possible when parafac is applied. |
main |
An optional title for the plot. |
legend.pos |
The position for the legend. By default it is topright. NULL when no legend is required. |
... |
Other arguments passed. |
Two dimensional plot representing the distances to the computed hyperplane of each individual colored with the real class.
Maria Jose Nueda, [email protected]
Predict method for lpda classification
## S3 method for class 'lpda'
predict(object, datatest = object$data,...)

## S3 method for class 'predict.lpda'
print(x, ...)
object |
Object of class inheriting from "lpda" |
datatest |
Optional data whose class is to be predicted. If omitted, the original data is used. |
x |
An object of class " |
... |
Other arguments passed. |
fitted |
Predicted class |
eval |
Evaluation of each individual in the fitted model |
Maria Jose Nueda, [email protected]
Predict method for lpda.3D classification
## S3 method for class 'lpda.3D'
predict(object, datatest = NULL,...)

## S3 method for class 'predict.lpda.3D'
print(x, ...)
object |
Object of class inheriting from "lpda.3D" |
datatest |
Optional data whose class is to be predicted. If omitted, the original data is used. |
x |
An object of class " |
... |
Other arguments passed. |
fitted |
Predicted class |
eval |
Evaluation of each individual in all the fitted models |
Maria Jose Nueda, [email protected]
A simulated RNA-Seq dataset example.
RNAseq
An array with 20 samples (1st dimension), 600 genes (2nd dimension) and 4 time-points (3rd dimension).
This dataset is a simulated RNA-Seq example. The counts were simulated from a Negative Binomial distribution and transformed to RPKM (reads per kilobase per million mapped reads). It contains the expression of 600 genes measured in 20 samples at 4 time-points. The first 10 samples belong to the first group and the remaining samples to the second one.
Maria Jose Nueda, [email protected]
stand centers and scales a data matrix.
stand(X)
X |
a data matrix with individuals in rows and variables in columns |
Scaled data matrix
stand2 centers and scales a data matrix with the parameters of another one.
stand2(X, X2)
X |
the data matrix from which the means and standard deviations are computed |
X2 |
the data matrix to center and scale |
Scaled X2 data matrix
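The behaviour described for stand and stand2 can be approximated with base R's `scale()`. The sketch below is only an illustration: `stand_sketch` and `stand2_sketch` are hypothetical names, and the use of the sample standard deviation (divisor n - 1) is an assumption rather than something stated above.

```r
# Base-R approximations; these are not the package functions themselves.
stand_sketch <- function(X) {
  scale(X, center = TRUE, scale = TRUE)
}

stand2_sketch <- function(X, X2) {
  mu <- colMeans(X)                    # column means of the reference matrix
  s <- apply(X, 2, sd)                 # column standard deviations of the reference
  scale(X2, center = mu, scale = s)    # X2 is scaled with the statistics of X
}

set.seed(1)
train <- matrix(rnorm(40, mean = 5, sd = 2), nrow = 10)
test <- matrix(rnorm(12, mean = 5, sd = 2), nrow = 3)

train_s <- stand_sketch(train)
test_s <- stand2_sketch(train, test)
```

Scaling new data with the training statistics keeps test individuals on the same scale as the model, which is why a two-argument variant is useful for prediction and cross-validation.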
summary method for class "lpda"
## S3 method for class 'lpda'
summary(object, datatest = object$data, grouptest=object$group,...)

## S3 method for class 'summary.lpda'
print(x, ...)
object |
Object of class inheriting from "lpda" |
datatest |
Optional data whose class is predicted and compared with the real class in the confusion matrix. If omitted, the original data is used. |
grouptest |
Real groups of the individuals in datatest. When datatest is specified, grouptest must also be specified, and vice versa. |
x |
An object of class " |
... |
Other arguments passed. |
Confusion.Matrix |
Table of confusion. Predicted classes in rows and real classes in columns, giving the hit (in the diagonal) and misclassification counts (out of the diagonal) |
Maria Jose Nueda, [email protected]
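The layout of the confusion table (predicted classes in rows, real classes in columns) can be reproduced with base R's `table()`. The labels below are invented for illustration and do not come from any lpda model.

```r
# Hypothetical labels for illustration only.
real <- factor(c("A", "A", "A", "B", "B", "B"))
predicted <- factor(c("A", "A", "B", "B", "B", "A"), levels = levels(real))

# Predicted classes in rows, real classes in columns.
confusion <- table(Predicted = predicted, Real = real)

hits <- sum(diag(confusion))               # correctly classified individuals
error_rate <- 1 - hits / sum(confusion)    # share of off-diagonal counts
```

Fixing the factor levels of both vectors keeps the table square even when a class is never predicted, so the diagonal always holds the hits.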
summary method for class "lpda.3D"
## S3 method for class 'lpda.3D'
summary(object, datatest = NULL, grouptest=NULL,...)

## S3 method for class 'summary.lpda.3D'
print(x, ...)
object |
Object of class inheriting from "lpda.3D" |
datatest |
Optional data whose class is predicted and compared with the real class in the confusion matrix. If omitted, the original data is used. |
grouptest |
Real groups of the individuals in datatest. When datatest is specified, grouptest must also be specified, and vice versa. |
x |
An object of class " |
... |
Other arguments passed. |
Confusion.Matrix |
Table of confusion. Predicted classes in rows and real classes in columns, giving the hit (in the diagonal) and misclassification counts (out of the diagonal) |
Maria Jose Nueda, [email protected]