To combine datasets vertically in SAS, the easiest way is to "set" the datasets in a SAS data step.

I am stacking 5 datasets into 1 (Time periods 1-5 = Final dataset). Each of these 5 datasets was the result of a merge of 5 datasets (e.g. physical activity, nutrition, personal health, etc.). There are 500 variables and 2000 observations. I used PROC IMPORT for each of 25 Excel worksheets (5 files with 5 tabs each). I will never do this again!

I am seeing truncated values in at least 1 character variable, an open text box, in the final dataset only. No truncation occurs in the 5 merged datasets; it only occurs after they are joined into one. Not surprisingly, according to PROC CONTENTS, the length of this variable is different in each of the 5 merged datasets (Time period 1, 2, 3, 4, 5). There are only a handful of character variables that I would need to be worried about here.

Can I assign length, or other attributes, to certain variables in the DATA step before the SET statement for the final dataset? In a perfect world, I would go back and read in the data for each worksheet and set the attributes. Is there a quicker way to solve this problem?

SAS has an extremely nasty habit, when reading from Excel or external databases, of permanently attaching formats to character variables. And if you use PROC IMPORT to let SAS guess at how to format your data from Excel or text files, you can end up with the variable in one file being of length 10 and in another file of length 20.

The first problem is how to make sure that when you stack the datasets each variable is assigned the proper length, so that no values are truncated. The data step takes a variable's length from the first place it sees the variable, which here is the first dataset listed on the SET statement. So you can just define the variables BEFORE the MERGE or SET statement.

The second problem is those annoying permanently attached formats. Say the first dataset has COMMENT defined with a length of $200 and no format attached, and the second has it defined with a length of $100 and the $100. format attached. If you set them together you get the right length, but the wrong format is attached, so when you print the value it is truncated. You fix this by adding a FORMAT statement that lists the variables but does not specify a format. This will remove the format from those variables.

If you stack the datasets with PROC SQL instead, note that SQL UNION has a habit of removing duplicates. You can avoid the elimination of duplicates by adding the ALL keyword.

Another "habit" of the UNION operator, which is very unfamiliar if you come from data step programming, is to align columns by position, not by name. Again, there is a keyword, CORRESPONDING, to change this behavior to the more familiar alignment by variable name. So, you may want to try "union all corresponding" if you encounter unwanted effects with "union" alone. However, the CORRESPONDING keyword has a (maybe again unwanted) side effect: non-matching columns are dropped. (So, in this case the UNION CORRESPONDING operator operates like an "intersection operator" on the sets of variable names. Maybe the data step is somewhat easier to use.)

Edit: I forgot to mention that PROC SQL in general has a habit of delivering observations in (what can seem like) random order.
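As a rough illustration of the data step fix and the PROC SQL alternative discussed here, the sketch below stacks the datasets while setting the length up front and clearing attached formats. The dataset names (work.time1 to work.time5, work.final, work.final_sql) and the variable name COMMENT are hypothetical, and $200 is only a placeholder for whatever PROC CONTENTS reports as the longest length in your data.

```sas
/* Hypothetical names: work.time1-work.time5 and COMMENT are assumptions. */
/* $200 is a placeholder for the longest length shown by PROC CONTENTS.   */
data work.final;
    /* Declared before SET, so this length is used instead of the one     */
    /* found in the first dataset that is read.                           */
    length comment $200;
    set work.time1 work.time2 work.time3 work.time4 work.time5;
    /* A FORMAT statement with no format name removes attached formats.   */
    /* Placed after SET so that _CHARACTER_ covers the incoming variables. */
    format _character_;
run;

/* Check the resulting length and format attributes. */
proc contents data=work.final;
run;

/* PROC SQL variant for two of the datasets: ALL keeps duplicate rows,    */
/* CORR aligns columns by name and drops columns not present in both.     */
proc sql;
    create table work.final_sql as
    select * from work.time1
    union all corr
    select * from work.time2;
quit;
```

If only a handful of character variables are affected, listing them by name on the LENGTH and FORMAT statements (rather than using _CHARACTER_) keeps any formats you do want on the other variables.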