标题: 请高手过招 [打印本页] 作者: shiyiming 时间: 2012-2-16 09:23 标题: 请高手过招 假设一个数据集有两个变量:child 和 parent
DATA sample;
INPUT child $ parent $;
DATALINES;
a b
c b
d c
e g
f e
;
run;
例如:
a 的 parent 是 b,
c 的 parent 是 b,
d 的 parent 是 c,
e 的 parent 是 g,
f 的 parent 是 e,
等等。。。
现在需要创建第三个变量highest_parent, 使得最终的数据集成为:
DATA sample;
INPUT child $ parent $ highest_parent $;
DATALINES;
a b b
c b b
d c b
e g g
f e g
;
run;
a 的 parent 是 b,highest_parent 是 b
c 的 parent 是 b,highest_parent 是 b
d 的 parent 是 c,但 c 的 parent 是 b, 所以highest_parent 是 b,
相似的
e 的 parent 是 g,highest_parent 是 g
f 的 parent 是 e,highest_parent 是 g
等等。。。
如何用SAS实现?作者: shiyiming 时间: 2012-2-16 13:59 标题: Re: 请高手过招 [code:3a6kqgig]
proc sql noprint;
create table sample2 as
select s1.child,s1.parent,coalesce(s2.hp,s1.parent) as hp
from sample as s1 left join sample(rename=(parent=hp)) as s2
on s1.parent=s2.child
order by 1;
quit;
[/code:3a6kqgig]作者: shiyiming 时间: 2012-2-16 16:40 标题: Re: 请高手过招 过招似乎是比武大擂台的意思?作者: shiyiming 时间: 2012-2-16 16:43 标题: Re: 请高手过招 1. Use breadth first algorithm until you find the deepest.
2. you don't have to rename at first inorder to avoid name conflict and data overwite in sql. this method is only used in data step.
[quote="zhangzachary":13uv3f5j][code:13uv3f5j]
proc sql noprint;
create table sample2 as
select s1.child,s1.parent,coalesce(s2.hp,s1.parent) as hp
from sample as s1 left join sample(rename=(parent=hp)) as s2
on s1.parent=s2.child
order by 1;
quit;
[/code:13uv3f5j][/quote:13uv3f5j]作者: shiyiming 时间: 2012-2-16 17:21 标题: Re: 请高手过招 [quote:cv0ke04g]e 的 parent 是 c,
...
e 的 parent 是 g, [/quote:cv0ke04g]
e的parent是g还是b?作者: shiyiming 时间: 2012-2-16 22:40 标题: Re: 请高手过招 [quote="jeozu":39plzahb]1. Use breadth first algorithm until you find the deepest.
2. you don't have to rename at first inorder to avoid name conflict and data overwite in sql. this method is only used in data step.
[quote="zhangzachary":39plzahb][code:39plzahb]
proc sql noprint;
create table sample2 as
select s1.child,s1.parent,coalesce(s2.hp,s1.parent) as hp
from sample as s1 left join sample(rename=(parent=hp)) as s2
on s1.parent=s2.child
order by 1;
quit;
[/code:39plzahb][/quote:39plzahb][/quote:39plzahb]
sorry for your confusion. corrections have been made. actually the sample data set contains a hierarchy structure作者: shiyiming 时间: 2012-2-17 03:21 标题: Re: 请高手过招 [quote="zhangzachary":2ic1wsjg][code:2ic1wsjg]
proc sql noprint;
create table sample2 as
select s1.child,s1.parent,coalesce(s2.hp,s1.parent) as hp
from sample as s1 left join sample(rename=(parent=hp)) as s2
on s1.parent=s2.child
order by 1;
quit;
[/code:2ic1wsjg][/quote:2ic1wsjg]
After you add two more data lines, for example,
DATA sample;
INPUT child $ parent $;
DATALINES;
a b
c b
d c
e g
f e
[b:2ic1wsjg]t d
s t[/b:2ic1wsjg];
run;
it doesn't work.
For the last row, s->t , t->d, d->c, c->b, so the highest_parent for s is b.作者: shiyiming 时间: 2012-2-17 10:04 标题: Re: 请高手过招 [code:kmi4n4ym]data regular;
set sample;
rename child=c parent=highest_parent;
run;
data out(drop=rc c);
length c highest_parent $8;
if _n_=1 then do;
declare hash h(dataset:'regular');
rc=h.defineKey('c');
rc=h.defineData('highest_parent');
rc=h.defineDone();
call missing(c,highest_parent);
end;
set sample;
highest_parent=parent;
do while(h.find(key:highest_parent)=0);
end;
run;[/code:kmi4n4ym]作者: shiyiming 时间: 2012-2-17 13:47 标题: Re: 请高手过招 [quote="jerrya00":3dy4xkum][quote="zhangzachary":3dy4xkum][code:3dy4xkum]
proc sql noprint;
create table sample2 as
select s1.child,s1.parent,coalesce(s2.hp,s1.parent) as hp
from sample as s1 left join sample(rename=(parent=hp)) as s2
on s1.parent=s2.child
order by 1;
quit;
[/code:3dy4xkum][/quote:3dy4xkum]
After you add two more data lines, for example,
DATA sample;
INPUT child $ parent $;
DATALINES;
a b
c b
d c
e g
f e
[b:3dy4xkum]t d
s t[/b:3dy4xkum];
run;
it doesn't work.
For the last row, s->t , t->d, d->c, c->b, so the highest_parent for s is b.[/quote:3dy4xkum]
好吧~ 的确没有考虑到位~ 以后写程序的确要了解透问题,考虑全面才行啊~
hopewell大大的 do while 用得很精彩,学习了!~作者: shiyiming 时间: 2012-2-17 22:57 标题: Re: 请高手过招 pf! Using hash searching is absolutely right solution to sovle this sort of problem!
[quote="hopewell":znd11mcn][code:znd11mcn]data regular;
set sample;
rename child=c parent=highest_parent;
run;
data out(drop=rc c);
length c highest_parent $8;
if _n_=1 then do;
declare hash h(dataset:'regular');
rc=h.defineKey('c');
rc=h.defineData('highest_parent');
rc=h.defineDone();
call missing(c,highest_parent);
end;
set sample;
highest_parent=parent;
do while(h.find(key:highest_parent)=0);
end;
run;[/code:znd11mcn][/quote:znd11mcn]作者: shiyiming 时间: 2012-2-18 22:53 标题: Re: 请高手过招 深度优先~~跟Oracle的start with ... connect by相似的思路。
精妙的hash,学习了。
[quote="hopewell":2rw35q2y][code:2rw35q2y]data regular;
set sample;
rename child=c parent=highest_parent;
run;
data out(drop=rc c);
length c highest_parent $8;
if _n_=1 then do;
declare hash h(dataset:'regular');
rc=h.defineKey('c');
rc=h.defineData('highest_parent');
rc=h.defineDone();
call missing(c,highest_parent);
end;
set sample;
highest_parent=parent;
do while(h.find(key:highest_parent)=0);
end;
run;[/code:2rw35q2y][/quote:2rw35q2y]