Counseling Center

University of Maryland

College Park, Maryland

Using Ridge Regression with Non-cognitive

Variables by Race in Admissions

 

Terence J. Tracey and William E. Sedlacek

Research Report # 1-83

A version of this paper was presented at the annual meeting of the American Educational Research Association, Montreal, April, 1983.

 

Terence J. Tracey is now an Assistant Professor in Educational Psychology at the University of Illinois, Champaign/Urbana. William E. Sedlacek is Assistant Director of the Counseling Center at the University of Maryland.

 


 


 

Summary

 

The relative predictive validity of two regression methods, ordinary least squares (OLS) and ridge regression, was compared using SAT scores and the eight non-cognitive measures posited by Tracey and Sedlacek (in press) as predictor variables. It was hypothesized that, given the high degree of multicollinearity of the predictor set, ridge regression would show less shrinkage on cross-validation than would the OLS method. Sub-samples, a different one for each race, were drawn and prediction equations were generated for each regression method. These equations were then applied to the entire racial sample and the shrinkage was examined. The results demonstrated that the two methods were equally valid.

 



Ridge regression was developed as an alternative to ordinary least squares (OLS) regression for cases where there is a high degree of multicollinearity among the predictor variables. The prediction of academic success is an area where the predictor variables overlap considerably, and it should thus be an excellent area in which to apply ridge regression instead of the usual OLS regression. Obtaining valid prediction equations in this area is often difficult because the high degree of multicollinearity tends to produce very different prediction equations from year to year. Thus, the process of validating these equations typically involves collecting very large samples over many years. Ridge regression has been found to be most effective in exactly these cases. It should yield more stable equations with highly multicollinear data and thus should be more valid with smaller samples than those typically required by least squares methods. Therefore, it would be quite valuable to colleges and universities to use ridge regression if it resulted in stable, valid prediction equations based on much smaller samples. The purpose of this paper is to describe a study conducted to examine the effectiveness of ridge regression relative to OLS regression when applied to admissions data.

 

When least squares regression is used on highly interrelated data (high in multicollinearity), the resulting beta weights tend to be very unstable, and when the prediction equations are cross-validated, the shrinkage in prediction tends to be great (Darlington, 1978; Faden, 1978). Ridge regression was developed for exactly these situations of high multicollinearity (Darlington, 1978; Dempster, Schatzoff, & Wermuth, 1977; Hoerl & Kennard, 1970; Price, 1977). Overall, the two procedures are identical except that in ridge regression a small constant is added to the main diagonal of the variance-covariance matrix prior to the determination of the regression equation. Adding this constant creates a "ridge" on the main diagonal, hence the name. The ridge is an artificial means of decreasing the relative amount of collinearity in the data. The specific constant (delta) that is added to the matrix is determined iteratively, by selecting the delta value that results in the lowest total mean square error for the prediction equation.
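The mechanics described above can be illustrated with a minimal sketch. It assumes standardized variables, so that the variance-covariance matrix of the predictors is their correlation matrix. The function name ridge_weights, the synthetic data, and the fixed delta of 0.5 are hypothetical choices for illustration and are not taken from the study. The iterative search for delta is not shown.

```python
import numpy as np

def ridge_weights(X, y, delta):
    """Standardized regression weights with `delta` added to the main diagonal
    of the predictor correlation matrix (delta = 0 gives the OLS weights)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the predictors
    zy = (y - y.mean()) / y.std()              # standardize the criterion
    n = len(y)
    R = Z.T @ Z / n                            # predictor correlation matrix
    r = Z.T @ zy / n                           # predictor-criterion correlations
    ridge = delta * np.eye(R.shape[0])         # the "ridge" on the main diagonal
    return np.linalg.solve(R + ridge, r)

# Synthetic, highly collinear predictors (illustrative only, not study data)
rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 1))
X = shared + 0.3 * rng.normal(size=(200, 4))
y = X.sum(axis=1) + rng.normal(size=200)

b_ols = ridge_weights(X, y, delta=0.0)
b_ridge = ridge_weights(X, y, delta=0.5)
print("OLS weights:  ", np.round(b_ols, 2))
print("ridge weights:", np.round(b_ridge, 2))
```

With any positive delta the weight vector is shorter than the OLS weight vector, which is the source of both the added stability and the bias of the ridge estimates.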

 

Tracey, Sedlacek, and Miars (1983) compared ridge regression and least squares regression in predicting freshman year cumulative grade point average (GPA) based on SAT scores and high school GPA. They found that ridge regression resulted in cross-validated correlations similar to those found by using ordinary least squares regression. The failure of ridge regression to yield less shrinkage than OLS regression was postulated to have been due to the relatively low ratio of the number of predictors (p) to the number of subjects in the sample (n) used in the study. Faden (1978) found that the key condition under which ridge regression proved superior to OLS regression was a high p/n ratio. It was therefore decided to re-examine the efficacy of ridge regression with admissions data by using data with more than three predictors.

 

Sedlacek and Brooks (1976) postulated non-cognitive variables that are predictive of minority student academic success. Tracey and Sedlacek (in press) developed a brief questionnaire, the Non-Cognitive Questionnaire (NCQ), to assess these variables and found eight non-cognitive factors to be highly predictive of grades and enrollment status for both whites and blacks above and beyond using SAT scores alone. But, it was also found that these variables shared a high degree of variance with the SAT scores, so there was a fairly high degree of multicollinearity. It was felt that these ten variables (SATV, SATM, and the eight non-cognitive factors) would be an ideal application of ridge regression.

 

The purpose of this study was to examine the efficacy of ridge regression over ordinary least squares regression as applied to admissions data (SAT & NCQ scores). Because it has been demonstrated that separate regression equations for each race are desirable in selecting students (Farver, Sedlacek, & Brooks, 1975), it was decided to use separate equations for each race. This examination of ridge and OLS regressions was conducted using separate race (black and white) equations in addition to a general equation without regard to race. Also, the criterion of interest in this study was cumulative grade point average (GPA) after three semesters rather than the more typical, short-term lengths of one or two semesters. It was hypothesized that using ridge regression would result in lower cross-validated shrinkage of prediction than would using OLS regression in each of the analyses conducted (white, black, and general).

Method

Sample

A random sample of 825 incoming freshmen attending summer orientation at a large eastern university was administered the Non-Cognitive Questionnaire. The sample consisted of 571 white students, 176 black students, and 78 students of other (mostly Asian-American) or unknown racial backgrounds.

 

 

Measures

The Non-Cognitive Questionnaire (NCQ) consisted of 22 Likert-type items and three open-ended items, all relating to expectations of college. The individual items were designed to measure the non-cognitive variables postulated to be related to minority student academic success (Sedlacek & Brooks, 1976). The eight factors (as developed by Tracey & Sedlacek, in press) were: leadership, preference for long-range goals over short-range goals, self-confidence, realistic self-appraisal, an understanding of and ability to deal with racism, demonstrated community service, having strong support for college plans, and academic familiarity. There is evidence for the content and predictive validity of these factors as measured by the NCQ (Tracey & Sedlacek, in press).

 


Analyses


As ridge regression has been demonstrated to be most effective relative to OLS in cases where the ratio of the number of predictor variables (p) to the sample size (n) is large (Faden, 1978), small sub-samples were drawn from the original sample to generate the prediction equations. Sub-samples of 50 were drawn from the whole sample and from the sample of white students. Given that the sample of black students was considerably smaller, a sub-sample of 30 was drawn from it. Each random sub-sample was used to generate OLS and ridge regression prediction equations, using SAT Verbal, SAT Mathematical, and the eight non-cognitive factors from the NCQ as predictors, and three semester GPA as the criterion. The resulting prediction equations were then used to generate predicted three semester GPAs for the larger sample. These predicted GPAs were then correlated with the actual three semester GPAs. This cross-validation of the prediction equations would yield information on the differences in shrinkage between the OLS and ridge methods. It was expected that the ridge regression technique would result in higher correlations between the predicted and the actual grades in the cross-validation step than would be found for the ordinary least squares regression technique.
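The procedure just described can be sketched as follows. This is an illustration on synthetic data, not a reconstruction of the study's analysis: the generated predictors and criterion, the helper names fit and validity, and the random seed are all assumptions. The ridge constant of 0.8 is borrowed from Table 1 purely as a plausible value. As in the study, the equations are derived on a sub-sample of 50 and then applied to the full sample, which includes that sub-sample.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the admissions data: 10 correlated predictors
# and one criterion (hypothetical values, not the study's data).
n_full, p = 825, 10
common = rng.normal(size=(n_full, 1))
X = 0.7 * common + 0.7 * rng.normal(size=(n_full, p))
gpa = 0.3 * X[:, :3].sum(axis=1) + rng.normal(size=n_full)

def fit(X, y, delta=0.0):
    """Intercept and weights; delta is added to the main diagonal of the
    predictor variance-covariance matrix (delta = 0 is OLS)."""
    xm, ym = X.mean(axis=0), y.mean()
    Xc, yc = X - xm, y - ym
    S = Xc.T @ Xc / len(y)                  # predictor variance-covariance matrix
    s = Xc.T @ yc / len(y)                  # predictor-criterion covariances
    b = np.linalg.solve(S + delta * np.eye(X.shape[1]), s)
    return ym - xm @ b, b

def validity(X, y, intercept, b):
    """Correlation between predicted and actual criterion scores."""
    return np.corrcoef(intercept + X @ b, y)[0, 1]

sub = rng.choice(n_full, size=50, replace=False)    # derivation sub-sample
results = {}
for name, delta in [("OLS", 0.0), ("ridge", 0.8)]:
    a, b = fit(X[sub], gpa[sub], delta)
    r_sub = validity(X[sub], gpa[sub], a, b)        # fit in the sub-sample
    r_full = validity(X, gpa, a, b)                 # cross-validation on full sample
    results[name] = (r_sub, r_full, r_sub - r_full)
    print(f"{name:5s} sub-sample R = {r_sub:.2f}  full-sample R = {r_full:.2f}  "
          f"shrinkage = {r_sub - r_full:.2f}")
```

Within the derivation sub-sample the OLS equation can never correlate less with the criterion than the ridge equation does, so any advantage of ridge regression has to appear in the full-sample correlations.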

 

 

Results

A summary of the correlations of the predictive equations with three semester GPA is presented in Table 1.

 


 


Insert Table 1 about here

 

As evidenced by Table 1, fairly high levels of prediction were attained in each of the sub-samples. But, in each sample, the shrinkage in prediction of each equation when applied to the full sample was substantial. This shrinkage was particularly dramatic for the black sample, which had the most predictive equation based on the sub-sample and a non-significant cross-validated prediction. The hypothesized result that ridge regression would yield less shrinkage in prediction than OLS regression was not demonstrated. In each case, the application of ridge regression yielded shrinkage results similar to the application of OLS.
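The size of the shrinkage can be read directly from Table 1. The short sketch below assumes that shrinkage is taken as the simple difference between the sub-sample R and the cross-validated R. The report does not state a formula, so this definition is an assumption made for illustration. The coefficients themselves are those reported in Table 1.

```python
# (sub-sample R, cross-validated R) as reported in Table 1
table1 = {
    "Whites":       {"OLS": (0.63, 0.39), "ridge": (0.62, 0.37)},
    "Blacks":       {"OLS": (0.67, 0.03), "ridge": (0.66, 0.06)},
    "Whole sample": {"OLS": (0.61, 0.37), "ridge": (0.60, 0.36)},
}

# Shrinkage assumed here to be the drop in R from sub-sample to full sample
shrinkage = {
    sample: {m: round(r_sub - r_cv, 2) for m, (r_sub, r_cv) in methods.items()}
    for sample, methods in table1.items()
}

for sample, s in shrinkage.items():
    print(f"{sample:13s} OLS {s['OLS']:.2f}   ridge {s['ridge']:.2f}")
```

On this definition the two methods differ by at most .04 in shrinkage within any sample, which is the sense in which their results are described as similar.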

 

Discussion

 

The results of this study mirrored those found earlier by Tracey, Sedlacek, and Miars (1983). Applying ridge regression to admissions data did not yield predictive equations that improved on those derived from ordinary least squares regression. This study was designed to represent the conditions under which ridge regression has been found superior to OLS regression in Monte Carlo studies. These conditions are: a) a high degree of multicollinearity, which was present in the predictors used, b) small samples on which the equations are based, and c) a relatively large p/n ratio (Darlington, 1978; Dempster, Schatzoff, & Wermuth, 1977; Faden, 1978). The failure of ridge regression to yield less shrinkage than OLS regression is perplexing. One possible reason for this lack of expected results is the relative lack of sophistication of the measures used. The NCQ factors may not have the well-developed psychometric properties typically associated with measures like SAT scores, although Tracey and Sedlacek (in press) provided very good reliability and validity data. In the Monte Carlo studies examining ridge regression, there was little concern about the error variance associated with the predictors, given that the population parameters were known. In real applications, the population parameters are unknown and the error variance associated with the predictors takes on more importance. Thus, less well-developed measures such as the NCQ may introduce a good deal of error variance unassociated with multicollinearity, and the very precise procedures of ridge regression may not be able to reduce this. Perhaps ridge regression is most useful only in the ideal conditions used in Monte Carlo studies, and not in actual applications where typical social science variables tend to be less precise. Rozeboom (1979) demonstrated that ridge regression may enhance prediction if the conditions are right, but if they are not, decreased accuracy would result. He went on to state that how to diagnose when the conditions are right remains obscure. In application to admissions data, ridge regression does not significantly enhance or detract from the prediction possible from OLS and thus may have limited utility in applied admissions situations.

 



References

 

Darlington, R. B. (1978). Reduced variance regression. Psychological Bulletin, 85, 1238-1255.

 

Dempster, A. P., Schatzoff, M., & Wermuth, N. (1977). A simulation study of alternatives to ordinary least squares. Journal of the American Statistical Association, 72, 77-91.

 

Faden, V. B. (1978). Shrinkage in ridge regression and ordinary least squares multiple regression estimators. Unpublished doctoral dissertation, University of Maryland.

 

Farver, A. S., Sedlacek, W. E., & Brooks, Jr., G. C. (1975). Longitudinal predictions of university grades for blacks and whites. Measurement and Evaluation in Guidance, 7, 243-250.

 

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12, 69-82.

 

Price, B. (1977). Ridge regression: Applications to nonexperimental data. Psychological Bulletin, 84, 759-766.

 

Rozeboom, W. W. (1979). Ridge regression: Bonanza or beguilement? Psychological Bulletin, 86, 242-249.

 

Sedlacek, W. E., & Brooks, Jr., G. C. (1976). Racism in American education: A model for change. Chicago: Nelson-Hall.

 

Tracey, T. J., & Sedlacek, W. E. (in press). Noncognitive variables in predicting academic success by race. Measurement and Evaluation in Guidance.

 


 


Tracey, T. J., Sedlacek, W. E., & Miars, R. D. (1983). Applying ridge regression to admissions data by race and sex. College and University, 58, 313-318.

 


 

Table 1: Summary of Multiple Coefficients Using Ordinary Least Squares (OLS) and Ridge Regression Equations to Predict College Cumulative Average After Three Semesters

                    Sub-sample Equations             Cross Validation2
Sample            n   OLS R   RIDGE R   RIDGE1       n   OLS R   RIDGE R
Whites           50    0.63      0.62     0.80     571    0.39      0.37
Blacks           30    0.67      0.66     0.35     176    0.03      0.06
Whole Sample     50    0.61      0.60     0.80     825    0.37      0.36

1 - Refers to the constant that is added to the main diagonal of the variance-covariance matrix, creating the "ridge."

2 - Cross validation performed on full sample.