Title: Question: SAS data processing
Author: shiyiming  Time: 2013-7-10 10:09  Title: Question: SAS data processing
Suppose I have a test data set like this:
ID SEQ VALUE
10 1 A
10 2 A
10 3 B
10 4 STOP
10 5 A
10 6 B
10 7 C
10 8 STOP
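Here SEQ is a sequence number, sorted in ascending order as shown above, and VALUE is the corresponding value. A, B, C and STOP are arbitrary strings with no particular meaning. I would like to get the following data set: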
ID SEQ VALUE CUM A B C
10 4 STOP A->B 2 1 0
10 8 STOP A->B->C 1 1 1
Many thanks in advance for any pointers!

Author: shiyiming  Time: 2013-7-10 12:03  Title: Re: Question: SAS data processing
[code:10gpdkn8]data raw;
input ID SEQ VALUE $;
datalines;
10 1 A
10 2 A
10 3 B
10 4 STOP
10 5 A
10 6 B
10 7 C
10 8 STOP
;
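/* Collect the distinct non-STOP values; they become the count variables. */
proc sql noprint;
  select distinct value into :varlist separated by ' '
    from raw where upcase(value) ne 'STOP' order by value;
quit;

data out(drop=char);
  do until(last.id);
    set raw;
    by id;
    length cum $100;
    retain cum;
    array var &varlist;
    /* Append each non-STOP value to the running string for this segment. */
    if upcase(value) ne 'STOP' then cum=cats(cum,value);
    else do;
      do over var;
        char=vname(var);
        /* Count how often this value occurs in the current segment. */
        var=count(cum,strip(char));
        /* Collapse each run of the value into one "value->".            */
        /* The '+' applies only to the last character of the value, so   */
        /* this assumes single-character values.                         */
        cum=prxchange(cats('s/',char,'+/',char,'->/'),-1,strip(cum));
      end;
      /* Drop the trailing arrow, write the STOP row, start a new segment. */
      cum=prxchange(cats('s/->$//'),1,strip(cum));
      output;
      call missing(cum);
    end;
  end;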
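run;[/code:10gpdkn8]

Author: shiyiming  Time: 2013-7-15 11:20  Title: Re: Question: SAS data processing
Thank you very much, HOPEWELL!
However, your code seems to have a problem. For example, if I change the test data set to the following: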
data raw;
input ID SEQ VALUE$ OBS $;
datalines;
10 1 AB XX
10 2 BA XX
10 3 BB XX
10 4 BB XX
10 4 STOP YY
101 5 B XX
101 6 A XX
101 8 STOP YY
101 5 AAA XX
101 6 AAB XX
101 7 AAB XX
101 8 STOP YY
;
run;
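What I want, for ID=10 for example, is the path AB->BA->BB, but running your code gives A->B->A->B instead. I suspect the problem is in the call cum=prxchange(cats('s/',char,'+/',char,'->/'),-1,strip(cum)); but I don't know how to fix it. Thanks again for your help!

Author: shiyiming  Time: 2013-7-15 16:51  Title: Re: Question: SAS data processing
[code:2xish7a9]proc sql noprint;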
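  select distinct value into :varlist separated by ' '
    from raw where upcase(value) ne 'STOP' order by value;
quit;

data out(drop=char regular x y);
  do until(last.id);
    set raw;
    by id;
    length cum $100;
    retain cum;
    array var &varlist;
    /* Keep the values blank-separated so each one stays a whole word. */
    if upcase(value) ne 'STOP' then cum=catx(' ',cum,value);
    else do;
      do over var;
        char=vname(var);
        /* \b makes the value match only as a whole word. The last       */
        /* argument of CALL PRXCHANGE receives the number of             */
        /* replacements, which is used as the count for this value.      */
        regular=prxparse(cats('s/(\b',char,'\b)+/',char,'/'));
        call prxchange(regular,-1,strip(cum),cum,x,y,var);
        /* Release the compiled pattern; a new one is parsed each pass. */
        call prxfree(regular);
        /* Collapse consecutive repeats of the value into one "value->". */
        cum=prxchange(cats('s/(\b',char,'\b\s?)+/',char,'->/'),-1,cum);
      end;
      /* Drop the trailing arrow, write the STOP row, start a new segment. */
      cum=prxchange(cats('s/->$//'),1,strip(cum));
      output;
      call missing(cum);
    end;
  end;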
run;[/code:2xish7a9]
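
A side note for readers comparing the two replies (a minimal sketch, not part of the thread): the two substitutions can be run in isolation on the ID=10 values to see where A->B->A->B comes from. The variable names glued, spaced and name are made up for this illustration, and the loop only visits the value names relevant to ID=10 (A, AB, B, BA, BB) rather than the full &varlist.

```sas
data _null_;
  length glued spaced $100 name $8;

  /* First reply: CATS joins AB, BA, BB, BB without a delimiter. */
  glued = 'ABBABBBB';
  /* A is the first variable name in alphabetical order, then B. */
  glued = prxchange('s/A+/A->/', -1, strip(glued));
  glued = prxchange('s/B+/B->/', -1, strip(glued));
  /* By now no AB, BA or BB substring is left to match. */
  glued = prxchange('s/->$//', 1, strip(glued));
  put glued=;

  /* Second reply: CATX keeps a blank between the values. */
  spaced = 'AB BA BB BB';
  do name = 'A', 'AB', 'B', 'BA', 'BB';
    /* \b on both sides: A and B alone no longer match inside AB, BA, BB. */
    spaced = prxchange(cats('s/(\b', name, '\b\s?)+/', name, '->/'),
                       -1, strip(spaced));
  end;
  spaced = prxchange('s/->$//', 1, strip(spaced));
  put spaced=;
run;
```

Run as is, the log should show glued=A->B->A->B and spaced=AB->BA->BB. In the first case the single letters A and B are matched inside the glued string before the longer names are ever tried; in the second, the blanks plus \b let each value match only as a whole word.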