Weight Estimation Analysis With IBM SPSS

Weight estimation

Put simply, weight estimation is applied in regression analysis to test a range of weight transformations and identify the one that gives the best fit for the model. The weights correct for heteroscedasticity in the original data and so produce a better-fitting model.

Weight Estimation model

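The regression model has the form:

$$y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_p x_{ip} + e_i$$

where

p = the number of independent variables in the model.

y_i = the observed value of the dependent variable for the ith case.

x_ij = the observed value of the jth independent variable for the ith case.

b_0 = the intercept, that is, the value of the dependent variable when all independent variables are zero.

b_j = the regression coefficient of the jth independent variable.

e_i = the error component for the ith case.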

In the ordinary least-squares (OLS) model, the error component e_i is assumed to have a normal distribution with a mean of 0 and a constant variance of σ². In the weighted least-squares (WLS) model, the variance of the dependent variable is related to the value of a predictor, so the error term is assumed to have a normal distribution with a mean of 0 and a variance that changes from case to case with the weight: σ²/w_i, where w_i is the weight of the ith case.
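The two error assumptions can be written side by side, together with the quantity each method minimises. This is a sketch rather than the exact SPSS formulation: w_i (the weight of the ith case), z_i (the weight source variable) and δ (the power) are notation introduced here for illustration.

```latex
\begin{aligned}
\text{OLS:}\quad & e_i \sim N\!\left(0,\ \sigma^2\right), \quad
  \text{minimise } \sum_{i=1}^{n} \Bigl(y_i - b_0 - \sum_{j=1}^{p} b_j x_{ij}\Bigr)^2 \\
\text{WLS:}\quad & e_i \sim N\!\left(0,\ \sigma^2 / w_i\right), \quad
  \text{minimise } \sum_{i=1}^{n} w_i \Bigl(y_i - b_0 - \sum_{j=1}^{p} b_j x_{ij}\Bigr)^2,
  \quad w_i = \frac{1}{z_i^{\delta}}
\end{aligned}
```

With δ = 0 every weight equals 1 and WLS reduces to OLS. Larger values of δ down-weight the cases with large z_i, where the errors are assumed to vary more.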

Data

To run a weight estimation analysis in SPSS, the dependent and independent variables must both be numeric. Categorical variables such as religion or region of residence need to be recoded into numeric binary (dummy) variables or other kinds of contrast variables. The weight variable should be quantitative and should be related to the variability in the dependent variable. We are going to use the sample data set weight_estimation_mall_data.
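The recoding step can be illustrated outside SPSS. The following Python sketch turns a categorical variable into 0/1 dummy variables; the helper name `dummy_code`, the `is_` prefix and the sample values are hypothetical and only stand in for a variable such as the indoor/outdoor mall indicator.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator list per category, leaving out the reference category."""
    levels = sorted(set(values) - {reference})
    return {f"is_{level}": [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical values for an indoor/outdoor mall variable.
mall_type = ["indoor", "outdoor", "indoor", "indoor", "outdoor"]
dummies = dummy_code(mall_type, reference="outdoor")
print(dummies)  # {'is_indoor': [1, 0, 1, 1, 0]}
```

A variable with k categories yields k - 1 dummy variables, because the reference category is represented by all dummies being 0.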

Make sure that the Measure of every variable is set to Scale in Variable View.


Figure 1: Importing downloaded mall data as CSV.

Identifying the need for WLS

To begin a linear regression analysis, choose Analyse >>> Regression >>> Linear from the menus, as shown in Figure 2.

Figure 2: Linear regression analysis in SPSS

Then select Adjusted Cost of Construction as the dependent variable, and Square Footage, Years of Experience of Architect and in_or_out door mall as the independent variables. Next, click Plots and select *ZPRED as the x variable and *ZRESID as the y variable. *ZPRED holds the standardised predicted values of the dependent variable and *ZRESID holds the standardised residuals of the model. Finally, click Save and select Standardised in the Residuals area so that the standardised residuals are also saved as a new variable, as shown in Figure 3.

Figure 3: Plotting standardised predicted values and residuals

Table 1: Ordinary least-squares (OLS) model summary

The OLS model gives an R² of 0.662, which is not too bad. However, it can be improved, and that is what we do next. Table 2 shows that the architect's years of experience is not a significant predictor, as its p-value is greater than 0.05, while the other variables are significant.

Table 2: The significance values for the estimated variables

Figure 4: Standardised residuals plotted against standardised predicted values

The plot shows heteroscedasticity: the data points are not scattered evenly, and the spread of the standardised residuals increases as the standardised predicted values increase. Weighted least squares is needed to account for the underlying cause of the heteroscedasticity.
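The visual check can be backed by a rough numeric one: fit OLS, order the cases by predicted value, and compare the spread of the residuals in the lower and upper halves. The Python sketch below does this for a single predictor. The data are synthetic and deterministic, not the mall data, and the names `ols_fit` and `spread_ratio` are made up for this illustration.

```python
import statistics


def ols_fit(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Synthetic data: the size of the error grows in proportion to x.
x = [10 + i for i in range(40)]
y = [2.0 + 3.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi)
     for i, xi in enumerate(x)]

intercept, slope = ols_fit(x, y)
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Order cases by predicted value, then compare residual spread in each half.
ordered = sorted(zip(predicted, residuals))
half = len(ordered) // 2
low_spread = statistics.pstdev([r for _, r in ordered[:half]])
high_spread = statistics.pstdev([r for _, r in ordered[half:]])
spread_ratio = high_spread / low_spread
print(round(spread_ratio, 2))
```

A ratio well above 1 means the residuals fan out as the predicted values grow, which is the pattern described above. A ratio close to 1 would suggest roughly constant variance.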

Heteroscedasticity Test

Before running the Weight Estimation procedure, we must identify a variable linked to the source of the heteroscedasticity. Based on the developers' prior experience, we suspect that the size of the mall is that variable. To check, choose Graphs >>> Chart Builder from the menus.

Figure 5: Plotting a graph to identify the variable behind the heteroscedasticity

Open the Scatter/Dot gallery and drag the Simple Scatter plot to the graph area, then select Standardized Residual as the y variable and Square Footage as the x variable.

Figure 6: Setting up the scatter plot used to check for heteroscedasticity

Figure 7: A plot to identify the source variable

Figure 7 shows the same trend as the plot of residuals against predicted values in Figure 4: the spread of the residuals increases with square footage. Square footage should therefore be acceptable as a source variable for creating weights to account for the heteroscedasticity.
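When more than one variable is a candidate, one simple way to compare them is to correlate the absolute residuals with each candidate. The Python sketch below assumes this correlation check as a stand-in for reading the scatter plots. The numbers are invented for illustration and are not taken from the mall data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical absolute residuals and two candidate source variables.
abs_residual = [1.0, 2.2, 2.9, 4.1, 5.2, 5.8]
square_footage = [10, 20, 30, 40, 50, 60]
years_experience = [5, 12, 3, 9, 7, 11]

corr_sqft = pearson(square_footage, abs_residual)
corr_exp = pearson(years_experience, abs_residual)
print(round(corr_sqft, 2), round(corr_exp, 2))
```

The candidate whose values track the size of the residuals most closely is the more plausible source variable, provided the choice also makes sense for the subject matter.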

Weight Estimation Analysis

To run the Weight Estimation procedure in SPSS, choose Analyse >>> Regression >>> Weight Estimation from the menus, then follow the steps below.

Figure 8: Setting up the weighted least-squares regression

  • Select Adjusted Cost of Construction as the dependent variable.
  • Select Square Footage through Years of Experience of the Architect as the independent variables.
  • Select Square Footage as the weight variable.
  • The distribution of residuals widens with square footage, indicating a positive power value. Set the power range to search from 0 through 5 in steps of 0.5.
  • Finally, click Options and select "Save the best weight as a new variable". Click Continue, then click OK in the Weight Estimation dialogue box.

A weighted least-squares model is fitted for each of the specified power values, and the procedure selects the power that produces the highest log-likelihood, which is highlighted in the output (-205.072).
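The search over power values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS implementation: it uses a single predictor, synthetic data whose error size is proportional to the source variable, weights of the form 1 / source^power, and a Gaussian log-likelihood. The function names `wls_fit` and `log_likelihood` are hypothetical.

```python
import math


def wls_fit(x, y, w):
    """Weighted least squares for one predictor; returns (intercept, slope)."""
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def log_likelihood(x, y, source, power):
    """Gaussian log-likelihood of a WLS fit with weights 1 / source**power."""
    w = [1.0 / s ** power for s in source]
    intercept, slope = wls_fit(x, y, w)
    n = len(y)
    weighted_sse = sum(wi * (yi - intercept - slope * xi) ** 2
                       for wi, xi, yi in zip(w, x, y))
    sigma2 = weighted_sse / n  # maximum-likelihood estimate of the error variance
    return (-0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1.0)
            + 0.5 * sum(math.log(wi) for wi in w))


# Synthetic data: the error size is proportional to square footage,
# so the error variance is proportional to square footage squared.
square_footage = [10 + i for i in range(40)]
cost = [2.0 + 3.0 * s + (0.5 * s if i % 2 == 0 else -0.5 * s)
        for i, s in enumerate(square_footage)]

powers = [0.5 * k for k in range(11)]  # 0 through 5 in steps of 0.5
scores = {p: log_likelihood(square_footage, cost, square_footage, p) for p in powers}
best_power = max(scores, key=scores.get)
print(best_power, round(scores[best_power], 3))
```

Because the synthetic error variance grows with the square of the source variable, the search should settle on a power of 2. With real data the best power depends on how the spread of the residuals actually changes with the source variable.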

The remainder of the Weight Estimation output can be ignored. The weights saved by the procedure should now be used to run a linear regression.

Corrected Linear Regression analysis (WLS)

Based on the Weight Estimation output, return to the Linear Regression dialogue to run a linear regression using the calculated weights.

Figure 9: Regression analysis using the estimated weights

Select Weight for cost as the WLS Weight variable. Then click Save, as shown in Figure 10.

  • Select Unstandardized in the Predicted Values and Residuals groups.
  • Deselect Standardised in the Residuals group.
  • Click Continue, then click OK in the Linear Regression dialogue box.

Figure 10: Saving unstandardised predicted values and residuals

The R² is now 0.746, which is better than the value of 0.662 obtained from the OLS model.

Moreover, the parameter estimates make more sense now: the coefficient for the architect's experience is now significant. This suggests that WLS can improve on an OLS model when the data are heteroscedastic.

Calculating Regression with shopping mall Estimated Weight

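To create the weighted predicted values, choose Transform >>> Compute Variable from the menus, as shown in Figures 11 and 12. Type wgtpred as the target variable. By analogy with the weighted residuals computed next, the numeric expression would be pre_1*sqrt(wgt_1), assuming the unstandardised predicted values were saved as pre_1.

Figure 11: Computing estimated weight

Figure 12: Creating a calculated column for the predicted value.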

To compute the weighted residuals, click the Dialog Recall tool and select Compute Variable. Type wgtresid as the target variable and res_1*sqrt(wgt_1) as the numeric expression, then click OK.

Figure 13: Creating a calculated column for the residual value.
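The Compute Variable step is a case-by-case multiplication by the square root of the weight. A small Python sketch with invented numbers shows the arithmetic. The list names mirror the SPSS variable names used above, and treating pre_1 as the saved predicted values is an assumption.

```python
import math

# Hypothetical saved values for three cases.
pre_1 = [100.0, 200.0, 300.0]  # unstandardised predicted values (assumed name)
res_1 = [4.0, -8.0, 10.0]      # unstandardised residuals
wgt_1 = [0.25, 0.0625, 0.04]   # weights saved by Weight Estimation

# Multiply each value by the square root of its case weight.
wgtpred = [p * math.sqrt(w) for p, w in zip(pre_1, wgt_1)]
wgtresid = [r * math.sqrt(w) for r, w in zip(res_1, wgt_1)]

print(wgtpred)
print(wgtresid)
```

In this toy example the raw residuals grow from 4 to 10 in absolute size, while the weighted residuals all have an absolute size of about 2. That equalising effect is what the transformation is meant to achieve.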

Corrected Residual Plot

To produce the residuals plot, click the Dialog Recall tool, select Chart Builder, and then click Reset to clear prior selections.

Select the Scatter/Dot gallery and choose Simple Scatter. Select wgtresid as the y variable and wgtpred as the x variable.

There is no pattern in this plot! The heteroscedasticity in the data has been corrected by weighted least squares.

Figure 14: A plot of the weighted residuals against the weighted predicted values.
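The same conclusion can be checked numerically. The Python sketch below fits a WLS model, scales the residuals and predicted values by the square root of the weights, and compares the residual spread in the two halves of the ordered cases. It assumes a single predictor, synthetic data and a power of 2, and the names `wls_fit` and `spread_ratio` are hypothetical.

```python
import statistics


def wls_fit(x, y, w):
    """Weighted least squares for one predictor; returns (intercept, slope)."""
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Synthetic data: the error size is proportional to square footage.
square_footage = [10 + i for i in range(40)]
cost = [2.0 + 3.0 * s + (0.5 * s if i % 2 == 0 else -0.5 * s)
        for i, s in enumerate(square_footage)]
weights = [1.0 / s ** 2 for s in square_footage]  # power of 2 assumed

intercept, slope = wls_fit(square_footage, cost, weights)

# Build (weighted predicted, weighted residual) pairs, as in wgtpred and wgtresid.
rows = []
for s, c, w in zip(square_footage, cost, weights):
    predicted = intercept + slope * s
    rows.append((predicted * w ** 0.5, (c - predicted) * w ** 0.5))

rows.sort()
half = len(rows) // 2
low_spread = statistics.pstdev([r for _, r in rows[:half]])
high_spread = statistics.pstdev([r for _, r in rows[half:]])
spread_ratio = high_spread / low_spread
print(round(spread_ratio, 2))
```

A ratio close to 1 indicates that the weighted residuals have roughly the same spread across the range of weighted predicted values, which is what a patternless plot shows.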

 

Finally, this is how the Data View will look in SPSS after the analysis. Six (6) columns were generated automatically.

Figure 15: Data View for the shopping mall weight estimation

Conclusion

In this blog, we have shown how OLS and WLS relate to and complement each other. Using various techniques, including the Weight Estimation procedure, we diagnosed the need for a weighted least-squares model, identified an appropriate source variable for the weights, generated the weights, and applied them to the model. Transformations of the variables can sometimes be used in place of, or in addition to, WLS to tackle the heteroscedasticity problem. Although WLS can be performed with the response variable as the weight variable, the resulting model is not a predictive tool.

Choosing a weight variable can be difficult when multiple variables show a relationship to the spread of the residuals. Choose the one that has the strongest relationship or the one that makes the most "sense".

