
Implements the Parity Regression (PR) methodology for multiple linear regression. Instead of minimizing the aggregate prediction error in the dependent variable, PR distributes the total prediction error evenly across all parameters. This approach ensures stability in the presence of high multicollinearity and is particularly suitable for data affected by substantial noise, such as time series data experiencing structural changes and evolving trends.

Usage

savvyPR(
  x,
  y,
  method = c("budget", "target"),
  val = NULL,
  lambda_val = NULL,
  use_feature_selection = FALSE,
  standardize = FALSE,
  intercept = TRUE,
  exclude = NULL
)

Arguments

x

A matrix of predictors with rows as observations and columns as variables. Must not contain NA values, and should not include an intercept column of ones.

y

A numeric vector of the response variable; it must have the same number of observations as x. Must not contain NA values.

method

Character string specifying the parameterization method to use: "budget" (default) or "target".

val

Numeric tuning parameter. If method = "budget", this represents c (a value between 0 and 1/p, where p is the number of predictors). If method = "target", this represents t (a target risk parameter > 0).

lambda_val

Optional; a numeric value specifying the regularization strength. If NULL, lambda is determined via cross-validation.

use_feature_selection

Logical; if TRUE, applies Lasso to perform feature selection before model estimation. Defaults to FALSE.

standardize

Logical; if TRUE, scales and centers the predictor variables before fitting the model. Defaults to FALSE.

intercept

Logical; if TRUE, includes an intercept in the model; otherwise no intercept term is estimated. Defaults to TRUE.

exclude

Optional; a vector of column indices specifying predictors to be excluded from the analysis.

Value

Returns an S3 object of class "savvyPR" containing the following components:

call

The matched call to the function.

coefficients

A numeric vector of estimated coefficients. If the tuning parameter is zero, coefficients are obtained from Ridge regression or OLS. Otherwise, they are obtained from the parity regression model.

method

The optimization method used ("budget" or "target").

fit

The fitted object returned by glmnet when the tuning parameter is zero.

orp_fit

The fitted object returned by the risk parity optimizer (optimizeRiskParityBudget or optimizeRiskParityTarget) when the tuning parameter is non-zero.

lambda

The regularization parameter lambda used in the model.

intercept

A logical value indicating whether an intercept is included in the model.

model

A data frame containing the response variable y and the covariates x used in the model.

Details

Parity Regression Model Estimation

The PR methodology imposes an equal risk contribution constraint on each predictor variable within a specific search cone, leading to a robust solution. The solution is defined in the context of a penalized regression in which the penalty term is a function of both the regularization parameter (lambda_val) and the proportional contribution parameter (val). The function uses the nleqslv package to solve the resulting non-linear systems of equations.
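To illustrate the solver interface that savvyPR relies on, the toy system below solves two equations in two unknowns with nleqslv. This is only a sketch of the package's usage pattern; the actual first-order conditions of the PR objective are considerably more involved.

```r
library(nleqslv)

# Toy nonlinear system: a circle of radius sqrt(2) intersected with the
# hyperbola x * y = 1, which has the solution (1, 1).
f <- function(z) {
  c(z[1]^2 + z[2]^2 - 2,
    z[1] * z[2] - 1)
}

sol <- nleqslv(c(2, 0.5), f)
sol$x       # approximately c(1, 1)
sol$termcd  # 1 indicates convergence on the function values
```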

The function supports two parameterization methods:

  • Budget: Uses val (acting as c) to set a strict budget constraint on the risk contributions.

  • Target: Uses val (acting as t) to set a risk target for the response variable relative to the predictors.

The function can handle different practical scenarios:

  • While the PR theorem is not specifically designed for variable selection, the package makes this available as an optional preprocessing step. If use_feature_selection is TRUE, Lasso regression is performed to select features by zeroing out non-contributive predictors before applying the PR model.

  • It checks the matrix rank of predictors and applies Ridge regression as a fallback to ordinary least squares if the matrix is not full rank, ensuring computational stability.
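The rank-check fallback described above can be sketched as follows; this mirrors the assumed logic rather than the package's exact internals, with an arbitrary placeholder lambda.

```r
library(glmnet)

set.seed(1)
x <- matrix(rnorm(50 * 3), 50, 3)
x <- cbind(x, x[, 1] + x[, 2])  # fourth column is collinear with the first two
y <- rnorm(50)

if (qr(x)$rank < ncol(x)) {
  # Design is rank-deficient: Ridge (alpha = 0) remains well-defined
  fit <- glmnet(x, y, alpha = 0, lambda = 0.1)
} else {
  fit <- lm.fit(x, y)  # ordinary least squares when the design is full rank
}
```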

For the budget method, the PR methodology optimizes an objective function that includes a variance term and a penalization term: $$1/2 * RRSS(x, x_{p+1}; \lambda) - \widetilde{\mu} ( c \sum_{k=0}^p \log(\delta_k x_k) + (1 - (p+1)c) \log(x_{p+1}) )$$

For the target method, the methodology optimizes a related objective function defined by the target parameter \(t\): $$1/2 * RRSS(x, x_{p+1}; \lambda) - \widetilde{\mu} ( \sum_{k=0}^p \log(\delta_k x_k) + t \log(x_{p+1}) )$$

In both formulas, \(x\) represents the parameters (with \(x_{p+1}\) as an auxiliary parameter set to 1), \(\lambda\) is the regularization parameter, \(p\) is the number of predictors, and \(\widetilde{\mu}\) is a constant with respect to \(\lambda\). The resulting model provides estimates of the regression coefficients that are equitable across all predictors in terms of contribution to the model's predictive power.
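The budget objective above can be written out directly; the sketch below assumes RRSS is the ridge-penalized residual sum of squares and treats delta, mu-tilde, and c as user-supplied placeholders, so it should be read as an illustration of the formula's structure, not as the package's implementation.

```r
# Hedged sketch of the budget objective, under the ASSUMPTION that
# RRSS(b; lambda) = ||y - X b||^2 + lambda * ||b||^2.
# Valid only for b > 0, i.e. inside the search cone where the logs exist.
budget_obj <- function(b, X, y, lambda, c_val, mu, delta) {
  p <- ncol(X)
  x_aux <- 1  # auxiliary parameter x_{p+1}, fixed to 1
  rrss <- sum((y - X %*% b)^2) + lambda * sum(b^2)
  penalty <- c_val * sum(log(delta * b)) + (1 - (p + 1) * c_val) * log(x_aux)
  0.5 * rrss - mu * penalty
}
```

With x_{p+1} fixed at 1 its log term vanishes, so only the weighted sum over the predictor logs contributes to the penalty.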

References

Asimit, V., Chen, Z., Ichim, B., & Millossovich, P. (2026). Parity Regression Estimation. Retrieved from https://openaccess.city.ac.uk/id/eprint/37017/

The optimization technique employed follows the algorithm described by: F. Spinu (2013). An Algorithm for Computing Risk Parity Weights. SSRN Preprint. doi:10.2139/ssrn.2297383

Author

Ziwei Chen, Vali Asimit and Pietro Millossovich
Maintainer: Ziwei Chen <ziwei.chen.3@citystgeorges.ac.uk>

Examples

library(glmnet)
#> Loading required package: Matrix
#> Loaded glmnet 4.1-8
library(nleqslv)

# Generate synthetic data
set.seed(123)
n <- 100 # Number of observations
p <- 10  # Number of variables
x <- matrix(rnorm(n * p), n, p)
beta <- matrix(rnorm(p), p, 1)
y <- x %*% beta + rnorm(n, sd = 0.5) # Linear combination with noise

# Example 1: Run PR estimation using the "budget" method (acting as c)
result_budget <- savvyPR(x, y, method = "budget", val = 0.05, intercept = TRUE)
print(result_budget$coefficients)
#>  [1]  0.04997937 -1.00500454 -1.10097610  0.12403465 -0.14222097 -2.62920376
#>  [7]  1.07610919  0.29923597  2.36165735  0.68467176 -0.44474216

# Example 2: Run PR estimation using the "target" method (acting as t)
result_target <- savvyPR(x, y, method = "target", val = 1, intercept = TRUE)
print(result_target$coefficients)
#>  [1]  0.04468683 -1.01944158 -1.11908908  0.17409082 -0.18240697 -2.63781293
#>  [7]  1.08697719  0.33274074  2.37947640  0.69904692 -0.46125508

# Example 3: Run PR estimation with feature selection
result_fs <- savvyPR(x, y, method = "budget", val = 0.05, use_feature_selection = TRUE)
print(result_fs$coefficients)
#>  [1]  0.04997937 -1.00500454 -1.10097610  0.12403465 -0.14222097 -2.62920376
#>  [7]  1.07610919  0.29923597  2.36165735  0.68467176 -0.44474216

# Inspect the risk parity portfolio object for more details
if (!is.null(result_fs$orp_fit)) {
  print("Risk parity portfolio details:")
  print(result_fs$orp_fit)
}
#> [1] "Risk parity portfolio details:"
#> $weights
#>  [1] 0.09247496 0.10130573 0.01141298 0.01308639 0.24192478 0.09901761
#>  [7] 0.02753404 0.21730664 0.06299971 0.04092271 0.09201447
#> 
#> $relativeRiskContrib
#>  [1] 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.50
#> 
#> $objFunction
#> [1] 2.608521
#> 
#> $isFeasible
#> [1] TRUE
#> 
#> $message
#> [1] "Optimization successful"
#> 

# Example 4: Run PR estimation excluding some predictors
result_exclude <- savvyPR(x, y, method = "budget", val = 0.05, exclude = c(1, 2))
print("Coefficients with first two predictors excluded:")
#> [1] "Coefficients with first two predictors excluded:"
print(result_exclude$coefficients)
#> [1]  0.004400731  0.409631245 -0.322373524 -2.327305565  1.014530237
#> [6]  0.448410716  2.381883669  0.865958692 -0.489070522