A smoothed 3D Look at Wages

In this exercise, we fit a smooth, nonparametric (LOESS) surface showing how Wages (ttwgs28c) relate jointly to Age (eage26c) and Years of Schooling (yrsch18c).

All the commands below are stored in the file R:\psych\spida\courses\eda\pont3dw.sas. You can include them directly into SAS by typing in the Editor window:

%include spidaeda(pont3dw);
Alternatively, you can copy and paste the statements from this window into the SAS Editor window.

Preparing the data

We'll work with the revised-Ontario subset, using the square root of ttwgs28c. The step below is not necessary if you have already created the PONTARIO data set in your SAS session.
data pontario;
   set slid.pontario;
   sqrtwage = sqrt(ttwgs28c);
   label sqrtwage = 'sqrt(Total Wages and Salaries)'
      eage26c = 'Age in 1994';
run;

Fitting a smoothed 3D surface

The wages data are irregularly spaced. The following statements create a SAS data set containing a regular grid of points that will be used in the SCORE statement with the LOESS procedure:
data predgrid;
   do yrsch18c = 0 to 20 by 1;
      do eage26c = 18 to 70 by 1;
         if eage26c > (yrsch18c+5) then output;
      end;
   end;
run;
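Before fitting, it can help to confirm that the grid looks as intended. The IF condition keeps only ages greater than years of schooling plus 5, presumably because younger ages cannot occur at that level of schooling. The step below is a minimal sketch of such a check (it is not part of pont3dw.sas); it reports the number of grid points and the age range at each schooling level.

```sas
/* Sanity check on the prediction grid (illustrative only).        */
/* For each schooling level: number of grid points, min and max age */
proc means data=predgrid n min max maxdec=0;
   class yrsch18c;
   var eage26c;
run;
```

Schooling levels 0 through 12 should each keep all 53 ages (18 to 70), while at 20 years of schooling the youngest age kept is 26. The whole grid should contain 1077 points.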
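Fit the LOESS model and save the results in two output data sets: the fitted values on the grid go to ScoreOut, and the fit statistics (including residuals) for the observed data go to StatOut. Try running the statements below using different values for the SMOOTH= parameter.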
proc loess data=pontario;
   ods Output ScoreResults=ScoreOut
              OutputStatistics=StatOut;
   model sqrtwage = eage26c yrsch18c /
      smooth=0.4
      degree=2          /* fit local quadratics in Age, YrSch */
      scale=sd          /* scale independents using trimmed SD */
      residual          /* output residuals in the ODS */
      ;
   id pupid26c;
   score data=predgrid;
run;
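Rather than rerunning the step once per value, several smoothing parameters can be given in a single SMOOTH= list. The sketch below is one way to compare them. The data set name ScoreAll and the values 0.2, 0.4 and 0.8 are arbitrary choices for illustration, and it assumes that the ScoreResults table identifies each fit with a variable named SmoothingParameter (run PROC CONTENTS on ScoreAll to confirm this in your SAS release).

```sas
/* Fit the same model with three smoothing parameters (illustrative values) */
proc loess data=pontario;
   ods output ScoreResults=ScoreAll;   /* ScoreAll is a hypothetical name */
   model sqrtwage = eage26c yrsch18c /
      smooth=0.2 0.4 0.8     /* one fit per value in the list */
      degree=2
      scale=sd
      ;
   score data=predgrid;
run;

/* Compare the range of the fitted surface across the three fits.   */
/* Assumes ScoreAll contains the variable SmoothingParameter.       */
proc means data=ScoreAll min mean max maxdec=1;
   class SmoothingParameter;
   var p_sqrtwage;
run;
```

Smaller values of SMOOTH= use fewer neighbouring points in each local fit, so the fitted surface follows the data more closely and is usually bumpier. Larger values give a smoother surface. To plot one of these surfaces, subset ScoreAll with a WHERE statement on SmoothingParameter.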
Plot the surface. You may wish to try changing the TILT= and ROTATE= parameters to get a better view. (Interactive graphics, such as those provided by PROC INSIGHT, make 3D visualization much easier and more intuitive.)
proc g3d data=ScoreOut;
   format p_sqrtwage 4.0 eage26c yrsch18c 4.0;
   plot yrsch18c * eage26c = p_sqrtwage /
      caxis=black ctop=blue yticknum=5
      tilt=70 rotate=60 ;
   label p_sqrtwage = 'rtWage'
      eage26c = 'Age'
      yrsch18c = 'YrsSch';
run;
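One way to explore viewpoints without editing the step repeatedly is to give ROTATE= a list of angles, which, as far as I know, makes PROC G3D draw one plot per angle. The sketch below assumes the ScoreOut data set from the LOESS step above; the angles are arbitrary.

```sas
/* Draw the same surface from three viewing angles (illustrative values) */
proc g3d data=ScoreOut;
   format p_sqrtwage 4.0 eage26c yrsch18c 4.0;
   plot yrsch18c * eage26c = p_sqrtwage /
      caxis=black ctop=blue yticknum=5
      tilt=70
      rotate=30 to 90 by 30    /* views at 30, 60 and 90 degrees */
      ;
   label p_sqrtwage = 'rtWage'
      eage26c = 'Age'
      yrsch18c = 'YrsSch';
run;
```

TILT= can be varied in the same way. Stepping through the resulting plots gives a rough substitute for interactive rotation.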