SAS中文论坛 (SAS Chinese Forum)

Do loop vs. vectorization in SAS/IML

Posted on 2012-1-8 07:01:52

From Dapangmao's blog on sas-analysis

<div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-NDi3PKGCoJI/TwjCTKRfDVI/AAAAAAAAA5M/mZfd87FOJV0/s1600/SGPlot4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="http://2.bp.blogspot.com/-NDi3PKGCoJI/TwjCTKRfDVI/AAAAAAAAA5M/mZfd87FOJV0/s400/SGPlot4.png" width="400" /></a></div><br />
Vectorization is an important skill in many matrix languages. From <a href="http://www.amazon.com/Statistical-Programming-SAS-IML-Software/dp/1607646633/ref=sr_1_1?ie=UTF8&amp;qid=1325973905&amp;sr=8-1">Rick Wicklin’s book about SAS/IML</a> and his recent <a href="http://blogs.sas.com/content/iml/2011/10/10/sasiml-tip-sheets/">cheat sheet</a>, I learned about a few vector-wise functions added since SAS 9.22. To compare the computational efficiency of the traditional do-loop style with the vectorized style, I designed a simple test in SAS/IML: square a sequence of numbers (from 1 to 10000) and measure the time used. <br />
<br />
Two modules were written, one for each coding style. Each module was run 100 times, and the elapsed time of each run was recorded with SAS/IML’s time() function. <br />
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code>
proc iml;
   start module1; * Build the first module;
      result1 = j(10000, 1, 1); * Preallocate memory to the testing vector;
      do i = 1 to 10000;  * Use a do-loop to square the sequence;
         result1[i] = i**2;
      end;
      store result1; * Save the result to the storage catalog so it can be loaded later;
   finish;   
   t1 = j(100, 1, 1); * Run the first test;
   do m = 1 to 100;
      t0 = time(); * Set a timer;
         call module1;
      t1[m] =  time() - t0;
   end;
   store t1;
quit;

proc iml;
   start module2; * Build the second module;
      result2 = (1:10000)##2; * Vectorise the sequence;
      store result2; * Save the result to the storage catalog so it can be loaded later;
   finish;   
   t2 = j(100, 1, 1); * Run the second test;
   do m = 1 to 100;
      t0 = time(); * Set a timer;
         call module2;
      t2[m] =  time() - t0;
   end;
   store t2;
quit;

proc iml;
   load result1 result2; * Validate the results;
   print result1 result2;
quit;
</code></pre><br />
The timings were then passed to Base SAS and visualized as a box plot with the SG procedures. In this experiment the vectorized method wins: vectorization appears to be much faster than the do loop in SAS/IML. My conclusions are therefore: (1) avoid do loops where possible; (2) use the vector-wise functions and operators in SAS/IML; (3) always test the speed of modules and functions with SAS/IML’s time() function. <br />
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code>
proc iml;
   load t1 t2;
   t = t1||t2;
   create _1 from t;
      append from t;
   close _1;
   print t;
quit;

data _2;
   set _1;
   length test $25.;
   test = "do_loop"; time = col1; output;
   test = "vectorization"; time = col2; output;
   keep test time;
run;

proc sgplot data = _2;
   vbox time / category = test;
   yaxis grid;
run;
</code></pre>
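<br />
One caveat about the test above: each module calls store inside the timed region, so the recorded time includes writing the result to the storage catalog as well as the arithmetic itself. A minimal sketch of a variant that times only the computation, runs both styles in a single PROC IML session, and checks that the two results agree might look like the following. The names n, reps, elapsed, r1, r2, maxDiff and avg are illustrative assumptions and do not come from the original post. The benchmark has not been re-run here, so no timings are claimed. <br />
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code>
proc iml;
   n = 10000; * Length of the sequence, as in the test above;
   reps = 100; * Number of timed repetitions;
   elapsed = j(reps, 2, .); * Column 1 holds do-loop times and column 2 holds vectorized times;
   do m = 1 to reps;
      t0 = time(); * Time the do-loop version;
      r1 = j(n, 1, .);
      do i = 1 to n;
         r1[i] = i**2;
      end;
      elapsed[m, 1] = time() - t0;
      t0 = time(); * Time the vectorized version;
      r2 = t(1:n)##2; * Transpose so that r2 is a column vector like r1;
      elapsed[m, 2] = time() - t0;
   end;
   maxDiff = max(abs(r1 - r2)); * Should be 0 if both styles give the same squares;
   avg = elapsed[:, ]; * Column means of the elapsed times in seconds;
   print maxDiff, avg[colname = {"do_loop" "vectorization"}];
quit;
</code></pre><br />
Keeping both measurements in one matrix also means the timings can be written out with a single create/append step, as in the export code above. <br />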