# Case Study: Support Vector Machines Based on Tail Risk Measures

Instructions for optimization with PSG Run-File, PSG MATLAB Toolbox, PSG MATLAB Subroutines, and PSG R.

Problem 1a: Nu-SVM

——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–
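As a rough illustration of what a CVaR function evaluates on a set of scenarios, the sketch below computes the CVaR of equally probable losses as the average of the worst (1 - alpha) share of scenarios. This is plain Python, not PSG code: the function name `cvar` and the convention that larger values are worse are assumptions made for this example, and PSG's own `cvar_risk` may differ in details.

```python
def cvar(losses, alpha):
    """Average of the worst (1 - alpha) share of equally probable losses."""
    if not losses:
        raise ValueError("at least one scenario is required")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)   # worst (largest) losses first
    tail = (1.0 - alpha) * len(ordered)      # tail size in scenarios, may be fractional
    total, remaining = 0.0, tail
    for loss in ordered:
        weight = min(1.0, remaining)         # the last tail scenario may count partially
        if weight <= 0.0:
            break
        total += weight * loss
        remaining -= weight
    return total / tail


sample = [1.0, 2.0, 3.0, 4.0]
print(cvar(sample, 0.5))   # mean of the two largest losses
print(cvar(sample, 0.0))   # plain mean of all losses
```

In a classification setting the scenario losses would depend on the classifier parameters, so the optimizer minimizes this quantity rather than just evaluating it.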
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | 0.0 | 0.03 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | 0.0 | 0.04 |

Problem 1a’: Cross Validation for Nu-SVM
2-fold cross-validation
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Cycle statement, Data, Solution) | 25 | 150 | -0.00249 | 0.02 |
| Dataset2 | 25 | 150 | -0.00175 | 0.04 |

Cross-validation shows model performance on two pairs of in-sample and out-of-sample datasets.

For the first pair of datasets (file “solution_problem_1.txt”):

 In-sample CVaR = cvar_risk(0.5,cutout(1,2,matrix_prior_scenarios)) = -9.9e-003
 Out-of-sample CVaR = cvar_risk(0.5,takein(1,2,matrix_prior_scenarios)) = 1.85e-002

For the second pair of datasets (file “solution_problem_2.txt”):

 In-sample CVaR = cvar_risk(0.5,cutout(2,2,matrix_prior_scenarios)) = -7.02e-003
 Out-of-sample CVaR = cvar_risk(0.5,takein(2,2,matrix_prior_scenarios)) = 9.15e-003

The in-sample and out-of-sample CVaRs differ significantly (negative in-sample, positive out-of-sample in both pairs), i.e., there is significant over-fitting of the model.
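The in-sample and out-of-sample expressions above use `cutout(k, 2, ...)` and `takein(k, 2, ...)` on the same scenario matrix. A minimal Python sketch of that split is shown here, assuming that the rows are divided into contiguous blocks; this partition rule is an assumption for illustration, and the helper functions only mimic the names from the text rather than reproduce PSG's implementation.

```python
def fold_bounds(k, n_folds, n_rows):
    """Half-open row range [start, stop) of fold k (1-based) among contiguous blocks."""
    if not 1 <= k <= n_folds:
        raise ValueError("k must be between 1 and n_folds")
    start = (k - 1) * n_rows // n_folds
    stop = k * n_rows // n_folds
    return start, stop


def takein(k, n_folds, rows):
    """Keep only fold k (used above for the out-of-sample evaluation)."""
    start, stop = fold_bounds(k, n_folds, len(rows))
    return rows[start:stop]


def cutout(k, n_folds, rows):
    """Drop fold k and keep the rest (used above for the in-sample fit)."""
    start, stop = fold_bounds(k, n_folds, len(rows))
    return rows[:start] + rows[stop:]


rows = ["s1", "s2", "s3", "s4", "s5", "s6"]   # stand-ins for scenario rows
for k in (1, 2):
    print(k, "in-sample:", cutout(k, 2, rows), "out-of-sample:", takein(k, 2, rows))
```

Each scenario row lands in exactly one of the two parts for a given `k`, so the out-of-sample CVaR is always computed on rows the model did not see during fitting.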

Problem 1b: Nu-SVM with VaR Measure
——————————————————————–
Var_risk = Value-at-Risk specified by a matrix of scenarios
——————————————————————–
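For comparison with CVaR, VaR at confidence level alpha is a quantile of the loss distribution. The sketch below takes the smallest scenario loss whose cumulative probability reaches alpha, assuming equally probable scenarios; the function name `var` and the quantile convention are assumptions for this example, and PSG's `var_risk` may treat ties or interpolation differently.

```python
import math


def var(losses, alpha):
    """Smallest loss level c with P(loss <= c) >= alpha, equally probable scenarios."""
    if not losses:
        raise ValueError("at least one scenario is required")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(losses)                           # ascending losses
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]


sample = [4.0, 1.0, 3.0, 2.0]
print(var(sample, 0.5))    # second smallest loss
print(var(sample, 0.75))   # third smallest loss
```

Unlike CVaR, VaR ignores how large the losses beyond the quantile are, and as a function of the decision variables it is not convex in general.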
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -707.78409 | 0.04 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -707.238 | 0.03 |

Problem 2a: Extended Nu-SVM
Minimize cvar_risk
Subject to
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
——————————————————————–

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 25 | 1,000 | 0.056529 | 0.40 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 2b: Extended Nu-SVM with VaR Measure
Minimize var_risk
Subject to
——————————————————————–
Var_risk = Value-at-Risk specified by a matrix of scenarios
——————————————————————–

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 25 | 1,000 | -1053.893608 | 5.08 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 3a: Robust Nu-SVM
——————————————————————–
Max_cvar_risk = Maximum of Conditional Value-at-Risk functions specified by a set of matrices of scenarios
——————————————————————–
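The robust variant replaces a single CVaR by the maximum of CVaRs over several scenario sets. The sketch below illustrates this with CVaR computed from the Rockafellar-Uryasev minimization formula, assuming equally probable scenarios within each set; the names `cvar_ru` and `max_cvar` are hypothetical and are not PSG functions.

```python
def cvar_ru(losses, alpha):
    """CVaR as min over c of  c + sum(max(loss - c, 0)) / ((1 - alpha) * m)."""
    if not losses or not 0.0 <= alpha < 1.0:
        raise ValueError("need scenarios and alpha in [0, 1)")
    scale = 1.0 / ((1.0 - alpha) * len(losses))

    def objective(c):
        return c + scale * sum(max(loss - c, 0.0) for loss in losses)

    # The objective is convex and piecewise linear in c with kinks at the
    # scenario losses, so a minimizer can be found among those losses.
    return min(objective(c) for c in losses)


def max_cvar(loss_sets, alpha):
    """Worst case (largest) CVaR across several scenario sets."""
    return max(cvar_ru(losses, alpha) for losses in loss_sets)


set_a = [1.0, 2.0, 3.0, 4.0]
set_b = [0.0, 0.0, 10.0, 10.0]
print(cvar_ru(set_a, 0.5), cvar_ru(set_b, 0.5))
print(max_cvar([set_a, set_b], 0.5))   # driven by the heavier-tailed set_b
```

Scanning every scenario loss as a candidate threshold is quadratic in the number of scenarios, which is acceptable for a sketch but not how a solver would handle 1,000 scenarios.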

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 25 | 134 | 0.0 | 0.08 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 3b: Robust Nu-SVM with VaR Measure
——————————————————————–
Max_var_risk = Maximum of Value-at-Risk functions specified by a set of matrices of scenarios
——————————————————————–

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 25 | 134 | -1,434.071147 | 0.55 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 3c: Regularized Weighted Difference of CVaRs
——————————————————————–
cvar_difference = weighted difference of two CVaR functions specified by a set of matrices of scenarios
——————————————————————–
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 7 | 230 | -0.0010565 | 0.02 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 7 | 230 | -0.0010565 | 0.01 |

Problem 4a (Primal): Nu-SVM with CVaR in Objective and L_Infinity Norm in Constraint
Minimize cvar_risk
Subject to
L_Infinity_Norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
L_infinity_norm = L_infinity norm
——————————————————————–
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.318631 | 0.04 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.318631 | 0.02 |

Problem 4b (Dual): Nu-SVM with L1 Norm in Objective and Envelope Constraint
Maximize -L1_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
L1_norm = L1 norm
Envelope Constraint = CVaR envelope set of constraints
——————————————————————–
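Problem 4a constrains the L_infinity norm while its dual, Problem 4b, has the L1 norm in the objective; this pairing reflects the fact that the L1 norm is the dual norm of the L_infinity norm, and the two problems report the same objective value (-0.318631). The sketch below checks the dual-norm relation numerically on a small hypothetical vector `u`; all names and numbers in it are illustrative and unrelated to the case-study data.

```python
def l1_norm(v):
    return sum(abs(x) for x in v)


def linf_norm(v):
    return max(abs(x) for x in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


u = [3.0, -4.0, 0.5]

# w = sign(u) lies in the unit L_infinity ball and attains the supremum of w.u
w_star = [1.0 if x > 0 else -1.0 if x < 0 else 0.0 for x in u]

# A few other points of the unit L_infinity ball, none of which does better
candidates = [w_star, [1.0, 1.0, 1.0], [0.5, -1.0, 0.0], [-1.0, 1.0, -1.0]]
best = max(dot(w, u) for w in candidates if linf_norm(w) <= 1.0)

print(best, l1_norm(u))   # the two values coincide
```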

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 1024 | 1,000 | -0.318631 | 0.01 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 5a (Primal): Nu-SVM with CVaR in Objective and Deltoidal (Mixture of L1 and L_Infinity) Norm in Constraint
Minimize cvar_risk
Subject to
Deltoidal_norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
Deltoidal_norm = Mixture of L1 and L_infinity norm
——————————————————————–
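To make "mixture of L1 and L_infinity norm" concrete, the sketch below evaluates a convex combination of the two norms. The mixing weight `tau` and the choice of which norm it multiplies are assumptions for illustration; the source does not state the parametrization used in the PSG problem.

```python
def deltoidal_norm(w, tau):
    """(1 - tau) * L1 norm + tau * L_infinity norm, with tau in [0, 1] (assumed form)."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    l1 = sum(abs(x) for x in w)
    linf = max(abs(x) for x in w)
    return (1.0 - tau) * l1 + tau * linf


w = [3.0, -4.0, 0.5]
print(deltoidal_norm(w, 0.0))   # pure L1 norm
print(deltoidal_norm(w, 0.5))   # halfway between the two norms
print(deltoidal_norm(w, 1.0))   # pure L_infinity norm
```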
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.119276 | 0.04 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.119276 | 0.04 |

Problem 5b (Dual): Nu-SVM with Dual Deltoidal Norm in Objective and Envelope Constraint
Maximize -Dual_Deltoidal_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
Dual_deltoidal_norm = Norm dual to mixture of L1 and L_infinity norms
Envelope constraint = CVaR envelope set of constraints
——————————————————————–

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 1025 | 1,000 | -0.119278 | 0.28 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

Problem 6a (Primal): Nu-SVM with CVaR in Objective and CVaR Norm in Constraint
Minimize cvar_risk
Subject to
CVaR_Norm <=1
——————————————————————–
Cvar_risk = Conditional Value-at-Risk specified by a matrix of scenarios
CVaR_norm = Conditional Value-at-Risk specified on the components of a point
——————————————————————–
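One way to read "CVaR specified on the components of a point" is to treat the absolute values of the vector's components as equally probable scenarios. The sketch below does that and scales the result by n(1 - alpha), so that the value equals the sum of the largest n(1 - alpha) absolute components. The scaling and the name `cvar_norm` are assumptions for this example; PSG's `CVaR_norm` may use a different scaling.

```python
import math


def cvar_norm(w, alpha):
    """Sum of the n * (1 - alpha) largest |w_i| (fractional counts allowed)."""
    if not w or not 0.0 <= alpha < 1.0:
        raise ValueError("need a non-empty vector and alpha in [0, 1)")
    magnitudes = sorted((abs(x) for x in w), reverse=True)
    k = (1.0 - alpha) * len(magnitudes)      # number of components in the tail
    full = math.floor(k)
    total = sum(magnitudes[:full])           # components counted in full
    if full < len(magnitudes):
        total += (k - full) * magnitudes[full]   # partially counted component
    return total


w = [3.0, -4.0, 0.5, 1.0]
print(cvar_norm(w, 0.0))    # all components: the L1 norm
print(cvar_norm(w, 0.5))    # two largest components
print(cvar_norm(w, 0.75))   # single largest component: the L_infinity norm
```

Under this scaling the family interpolates between the L1 norm (alpha = 0) and the L_infinity norm (alpha = 1 - 1/n).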
Data and solution in Run-File Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 2.66GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Problem Statement, Data, Solution) | 25 | 1,000 | -0.074699 | 0.02 |

Data and solution in MATLAB Environment

| Problem Datasets | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.50GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset1 (Matlab code, Data, Solution) | 25 | 1,000 | -0.074699 | 0.02 |

Problem 6b (Dual): Nu-SVM with Dual CVaR Norm in Objective and Envelope Constraint
Maximize -Dual_CVaR_Norm
Subject to
Linear = 0,
Envelope Constraint
——————————————————————–
Dual_CVaR_norm = maximum of L1 and L_Infinity norms
Envelope constraint = CVaR envelope set of constraints
——————————————————————–

| | # of Variables | # of Scenarios | Objective Value | Solving Time, PC 3.14GHz (sec) |
| --- | --- | --- | --- | --- |
| Dataset | 1025 | 1,000 | -0.073899 | 0.08 |

Environments: Run-File (Problem Statement, Data, Solution); Matlab Toolbox (Data); Matlab Subroutines (Matlab Code, Data); R (R Code, Data)

CASE STUDY SUMMARY

This case study illustrates the application of the CVaR methodology to the Support Vector Machine (SVM) classification problem.
Given training data $(x_1, y_1), \dots, (x_m, y_m)$, where $x_i$ are features and $y_i$ are class labels, the basic idea of SVM is to find an optimal separating hyperplane (in the feature space) that maximizes the margin between the two classes. Cortes and Vapnik (1995) proposed solving the SVM classification problem using quadratic programming. An alternative formulation, known as nu-SVM, was suggested by Scholkopf et al. (2000). Takeda and Sugiyama (2008) proposed using the CVaR risk measure in classification and formulated the SVM learning problem as a CVaR minimization problem. Wang (2009) proposed a robust nu-Support Vector Machine based on worst-case CVaR minimization.
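The link between nu-SVM and CVaR minimization can be sketched in a few lines. The notation below is standard but is not taken from the source: $x_i$ are feature vectors, $y_i \in \{-1, +1\}$ class labels, $w$ and $b$ the hyperplane parameters, $m$ the number of training points, and $L_i(w,b) = -y_i(w^\top x_i + b)$ the loss of point $i$. The constraint $\rho \ge 0$ of nu-SVM is set aside in this sketch.

```latex
\begin{aligned}
&\text{nu-SVM:}\quad
\min_{w,\,b,\,\rho,\,\xi}\ \tfrac12\|w\|^2 - \nu\rho + \frac{1}{m}\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad y_i(w^\top x_i + b) \ge \rho - \xi_i,\quad \xi_i \ge 0 .\\[4pt]
&\text{At the optimum } \xi_i = \big[\rho - y_i(w^\top x_i + b)\big]^{+} = \big[L_i(w,b) - c\big]^{+}
\text{ with } c = -\rho, \text{ so}\\[4pt]
&-\nu\rho + \frac{1}{m}\sum_{i=1}^{m}\xi_i
= \nu\Big( c + \frac{1}{\nu m}\sum_{i=1}^{m}\big[L_i(w,b) - c\big]^{+} \Big).\\[4pt]
&\text{By the Rockafellar-Uryasev formula, }
\min_{c}\Big( c + \frac{1}{\nu m}\sum_{i=1}^{m}\big[L_i(w,b) - c\big]^{+} \Big)
= \mathrm{CVaR}_{1-\nu}\big(L(w,b)\big),\\[4pt]
&\text{hence}\quad
\min_{w,\,b}\ \tfrac12\|w\|^2 + \nu\,\mathrm{CVaR}_{1-\nu}\big(L(w,b)\big).
\end{aligned}
```

In this form the quadratic term plays the role of the regularizer, and the classification error enters only through the CVaR of the losses at confidence level $1-\nu$.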
Tsyurmasto and Uryasev (2012) proposed Support Vector Machines based on Value-at-Risk (VaR) measures. They obtained new SVM classifiers based on the VaR risk measure for the following CVaR-based SVMs: Nu-SVM, Extended Nu-SVM, and Robust Nu-SVM.
This case study contains the following problem formulations: 1) regularized CVaR, 2) regularized VaR, 3) CVaR minimization with unity constraint, 4) VaR minimization with unity constraint, 5) regularized robust CVaR minimization, 6) regularized robust VaR minimization. Problems 1, 2, 5, and 6 include an additional quadratic regularization term.

References

• Tsyurmasto, P., Uryasev, S. (2012): Support Vector Machine Based on Value-at-Risk Measure. Working Paper.
• Cortes, C. and V. Vapnik (1995): Support-vector networks, Machine Learning 20, 273-297.
• Scholkopf, B., Smola, A., Williamson, R., and P. Bartlett (2000): New support vector algorithms, Neural Computation 12, 1207-1245.
• Takeda A. and M. Sugiyama (2008): Nu-support vector machine as conditional value-at-risk minimization, in Proceedings 25th International Conference on Machine Learning, Morgan Kaufmann, Montreal, Quebec, Canada, 1056-1063.
• Tsyurmasto, P., Uryasev, S. (2012): Advanced Risk Measures in Estimation and Classification. Conference Proceedings, Vilnius, Lithuania, July, 2012.
• Wang, Y. (2009): Robust nu-Support Vector Machine Based on Worst-case Conditional Value-at-Risk Minimization. College of Finance, Zhejiang Gongshang University, Hangzhou 3100018, Zhejiang, China. Optimization Methods and