    proc print;

This automatically uses the data set from the previous proc or data step. The form
    proc print data=a;

explicitly uses the data set a rather than the most recently created one. Procedures usually produce printed output (in myfile.lst), but they do not create a new data set or add to an existing one unless this is made explicit with an output phrase. For instance,
    proc means;
      . . .
      output out = newname . . . ;

This explicitly creates the data set newname. Each proc has its own sub-phrases (the first ". . ." above) and its own set of variables that can be added to the new data set. The general form of the output phrase is:
    output out=d1 a=a1 b=b1 c=c1;

with out= being the keyword for the data set name d1, and a=, b= and c= standing for any number of optional keywords for new variables. The names after the equals signs (a1, b1 and c1, respectively) are up to you; they are the names of the variables that you can use later. Now to specifics. Here I give some phrases which may be useful. Others can be found in the SAS/STAT book. For the output phrase, I indicate some keywords.
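As a concrete instance of the general output phrase, here is a minimal sketch using proc means. The data set name demo, the variable x and the new names mx, sx and nx are made up for illustration; mean=, std= and n= are statistic keywords of proc means.

```sas
/* Hypothetical data set "demo" with a single variable x */
data demo;
  input x @@;
  datalines;
2 4 6 8
;

/* noprint suppresses the listing; only the data set d1 is created */
proc means data=demo noprint;
  var x;
  output out=d1 mean=mx std=sx n=nx;  /* new variables mx, sx, nx */

/* d1 holds one observation: mx, sx, nx plus the automatic
   variables _TYPE_ and _FREQ_ that proc means adds */
proc print data=d1;
run;
```

Printing d1 afterwards is an easy way to check which variables an output phrase actually created.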
    proc sort; by trt;
    proc means; by trt;

will first sort the data by treatment trt and then run the means procedure separately for each treatment group. This is much cleaner than running SAS three times, each time retaining only the treatment group under study. However, it does produce a lot more output! The by phrase is available in all procedures, BUT you MUST sort before you use it. You can sort by several things at once:
    proc sort; by sex trt;
    proc means; by sex trt;

Sorting is cheap. It is a good idea to always run proc sort before using by with other procedures, even if you think you did it earlier in your program.
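The by phrase combines naturally with the output phrase: the new data set then gets one observation per by group. The following is only a sketch under assumed names (the data set trial, the response y and the output set cell are hypothetical).

```sas
/* Hypothetical data: two sexes, two treatments, response y */
data trial;
  input sex $ trt $ y;
  datalines;
F A 10
M B 14
F B 12
M A 11
F A 13
M B 15
;

/* sort first, using the same by variables in the same order */
proc sort data=trial; by sex trt;

/* one set of summaries per sex*trt group, without printout */
proc means data=trial noprint; by sex trt;
  var y;
  output out=cell mean=my n=ny;

/* cell has one observation for each sex*trt combination in the data */
proc print data=cell;
run;
```

Naming the data set explicitly with data= on each proc avoids surprises about which set is the most recently created one.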
    proc univariate;                     /* detailed univariate summaries */
      var x y;                           /* for variables x and y */
      output out=b mean=mx std=sx;       /* create set b with mean and SD for x only */
    proc means;                          /* means, SDs, min, max for each variable */
      var x y;                           /* for variables x and y */
      output out=b mean=mx my std=sx sy; /* output means and SD for x and y */
    proc means noprint;                  /* useful form if you do not want printout */
      var x y;
      output out=b mean=mx my std=sx sy; /* output means and SD for x,y */
    proc univariate plot normal;         /* histogram type summaries */
      var x;

The plot option to proc univariate produces a stem-and-leaf plot, a box plot, and a normal probability plot. The normal option tests whether x follows a normal distribution.
    proc plot;                           /* scatter plot */
      plot y*x;                          /* plot y vertical and x horizontal */
      plot y*x='*';                      /* use "*" as plotting symbol */
      plot y*z=trt;                      /* use value of trt as plotting symbol */
      plot y*x='*' y*z / overlay;        /* overlay two plots on same page */

Here is a way to construct Interaction Plots. It gives you a plot of the average values of y for each period and trt.
    proc sort; by period trt;
    proc means noprint; by period trt;
      var y;
      output out=means mean=my;
    proc plot;
      plot my*period=trt;

The noprint option used in proc means is available for many procedures. Sometimes it can be very handy in shortening output. You can also do plots by another variable.
Here is a fancier way to construct Interaction Plots and some Diagnostic Plots, which allows you to use the full value of statistical modelling. The basic idea is to fit the desired model, save the least squares means (lsmeans) as a data set, and plot from that. Diagnostic information is saved with the output phrase.
    proc glm;
      class a b;
      model y = a | b;
      lsmeans a*b / out=lsm;
      output out=diag p=py r=ry;
    proc plot data=lsm;                  /* Interaction Plot */
      plot lsmean*a=b;                   /* cell mean v. a by b */
      plot lsmean*b=a;                   /* cell mean v. b by a */
    proc plot data=diag;                 /* Diagnostic Plots */
      plot y*py py*py='*' / overlay;     /* observed v. predicted */
      plot ry*py;                        /* residual v. predicted */

Here is a way to keep uniform axes for separate plots by location:
    proc plot uniform; by location;
      plot y*x;

You can set several plot features:
    plot y*x / vaxis=10 to 100 by 5;     /* vertical axis ticks */
    plot y*x / haxis=10 to 20 by 2;      /* horizontal axis ticks */
    plot y*x / vzero hzero;              /* include origin on plot */
    plot ry*py / vref=0;                 /* horizontal reference line at ry=0 */
    plot y*x py*x='*' / overlay;         /* overlay two plots */
    goptions device=tek4014;
    data biplot;
      set results;
      if _type_ = 'SCORE' then do; text = substr(_name_,4,5); end;
      if _type_ = 'CORR' then do; text = substr(_name_,1,3); end;
      x = prin1; y = prin2; z = prin3;
      xsys = '2'; ysys = '2'; zsys = '2';
      size = 1;
      label z = 'Dimension 3';
      label y = 'Dimension 2';
      label x = 'Dimension 1';
      keep x y z text xsys ysys zsys size;
    proc gplot;
      title3 'BiPlot of Stimuli and Chemicals';
      symbol1 v=none;
      plot y*x=1 / annotate=biplot frame href=0 vref=0;

The second example creates a 3-dimensional plot. The third example creates a 3-D plot with a PostScript device driver. The filename and gaccess= settings point the graphics output at the file hatps.dat (it is not an input data file), though I am not sure whether the output is only saved there or also sent directly to one's default print device.
    goptions device=tek4014;
    proc g3d;
      plot y*x=z / tilt=45 rotate=45;

    filename gsasfile 'hatps.dat';
    goptions device=ps hsize=8.5 vsize=8.5 gaccess=gsasfile;
    proc g3d;
      plot y*x=z / tilt=45 rotate=45;

Last modified: Sun Feb 25 19:17:58 1996 by Brian Yandell Wed Feb 1 10:39:28 1995 by Stat Www (statwww@stat.wisc.edu)