Posted on 2012-3-2 09:05:45
Rolling regressions for backtesting
From Dapangmao's blog on sas-analysis 
 
<div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-i51me4f9zlg/T0-0nNYCgLI/AAAAAAAAA8Q/syRBqR_ATkU/s1600/SGPlot6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="http://4.bp.blogspot.com/-i51me4f9zlg/T0-0nNYCgLI/AAAAAAAAA8Q/syRBqR_ATkU/s400/SGPlot6.png" width="400" /></a></div><br /> 
<br /> 
Financial markets generate huge time series, often with millions of records. <a href="http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cts=1330647448658&ved=0CC0QFjAB&url=http%3A%2F%2Fwww.springer.com%2Fcda%2Fcontent%2Fdocument%2Fcda_downloaddocument%2F9780387279657-c1.pdf%3FSGWID%3D0-0-45-169676-p59330694&ei=lBFQT4W2AZGnsAKR06icDg&usg=AFQjCNHuLVAqKa47aNk-uZpjcwfXYqUd2w&sig2=2KDJPlOCwKrFhFmiO1lX-g">Running regressions to obtain the coefficients</a> of the variables of interest over a rolling window is computationally expensive. In SAS, <a href="http://www.nesug.org/proceedings/nesug07/sa/sa04.pdf">a macro built around a linear-model procedure</a> such as PROC REG is not an efficient option: invoking PROC REG thousands of times can easily bring a SAS session to a standstill.<br /> 
<br /> 
A better way is to go back to basics and re-implement the OLS algebra directly: transpose, multiply, and invert the vectors and matrices. This can be done in PROC IML, with DATA step arrays, or in PROC FCMP. <a href="http://www.sasanalysis.com/2011/07/using-sasiml-for-high-performance-var.html">For such attempts PROC IML</a> is really powerful but requires an additional license. <a href="http://www.sas-programming.com/2011/08/rolling-analysis-of-time-series.html">DATA step arrays</a> demand considerable data manipulation skill, since they are not designed for matrix operations. PROC FCMP, a part of Base SAS, looks like a portable solution for SAS 9.1 or later. To test this method, I simulated a two-asset portfolio with 100,000 records and, using a rolling window of 1,000 observations, ran 99,001 regressions in just 10 seconds on an old laptop. Overall, the speed is quite satisfactory.<br /> 
<br /> 
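For reference, each window computes the ordinary least squares estimate from the normal equations. The notation below is an illustrative sketch: the symbols n, w, X_j and y_j do not appear in the original post and are introduced here only to name the number of records, the window size, and the design matrix and response vector of the window that starts at record j.<br /> 
<br /> 
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code> 
\[
X_j = \begin{bmatrix} 1 & \text{asset1}_{j} \\ \vdots & \vdots \\ 1 & \text{asset1}_{j+w-1} \end{bmatrix},
\qquad
y_j = \begin{bmatrix} \text{asset2}_{j} \\ \vdots \\ \text{asset2}_{j+w-1} \end{bmatrix}
\]
\[
\hat{\beta}_j = \begin{bmatrix} \hat{\beta}_{0,j} \\ \hat{\beta}_{1,j} \end{bmatrix}
= \left( X_j^{\top} X_j \right)^{-1} X_j^{\top} y_j,
\qquad j = 1, \dots, n - w + 1
\]
</code></pre> 
With n = 100,000 and w = 1,000 this gives 100,000 - 1,000 + 1 = 99,001 windows, which matches the number of regressions quoted above. In the PROC FCMP step of this post, the array xone holds the transpose of X_j, z1 holds X'X, z2 holds its inverse, and z3 holds the two estimated coefficients.<br /> 
<br /> 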
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code> 
/* 1 -- Simulate a two-asset portfolio */ 
data simuds; 
   _beta0 = 15; _beta1 = 2; _mse = 5; 
   do minute = 1 to 1e5; 
     asset1 = ranuni(1234)*10 + 20; 
     asset2 = _beta0 + _beta1*asset1 + _mse*rannor(3421); 
     output; 
   end; 
   drop _:; format asset: dollar8.2; 
run; 
 
/* 2 -- Decide length of rolling window */ 
proc sql noprint; 
   select count(*) into: nobs 
   from simuds 
;quit; 
%let wsize = 1000; 
%let nloop = %eval(&nobs - &wsize + 1); 
%put &nloop; 
 
/* 3 -- Manipulate matrices */ 
proc fcmp; 
   /* Allocate spaces for matrices */ 
   array input[&nobs, 2] / nosym; 
   array y[&wsize] / nosym; 
   array xone[2, &wsize] / nosym; 
   array xonet[&wsize, 2] / nosym; 
   array z1[2, 2] / nosym; 
   array z2[2, 2] / nosym; 
   array z3[2] / nosym; 
   array result[&nloop, 3] / nosym; 
 
   /* Input simulation dataset */ 
   rc1 = read_array('simuds', input, 'asset1', 'asset2'); 
 
   /* Calculate OLS regression coefficients */ 
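   do j = 1 to &nloop; 
      /* Build X' (2 x wsize) and y for the window starting at row j */ 
      do i = 1 to &wsize;    
         xone[2, i] = input[i+j-1, 1];   /* regressor: asset1 */ 
         xone[1, i] = 1;                 /* intercept term */ 
         y[i] = input[i+j-1, 2];         /* response: asset2 */ 
      end;    
      call transpose(xone, xonet);       /* xonet = X */ 
      call mult(xone, xonet, z1);        /* z1 = X'X */ 
      call inv(z1, z2);                  /* z2 = inv(X'X) */ 
      call mult(z2, xone, xone);         /* xone is overwritten with inv(X'X)X' */ 
      call mult(xone, y, z3);            /* z3 = inv(X'X)X'y = (beta0, beta1) */ 
      result[j, 1] = z3[1];              /* intercept */ 
      result[j, 2] = z3[2];              /* slope */ 
      result[j, 3] = j;                  /* index of the window's first record */ 
   end; 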
 
   /* Output resulting matrix as dataset */ 
   rc2 = write_array('result', result, 'beta0', 'beta1', 'start_time'); 
   if rc1 + rc2 > 0 then put 'ERROR: I/O error'; 
   else put 'NOTE: I/O was successful'; 
quit; 
 
/* 4 -- Visualize result */ 
proc sgplot data = result; 
   needle x = start_time y = beta1; 
   refline 2 / axis = y; 
   yaxis min = 1.8; 
run; 
</code></pre>
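<br /> 
As a sanity check, the coefficients of the first window can be compared with what PROC REG reports for the same records. This is only a sketch added for illustration: the dataset name check is a hypothetical choice, and the code assumes that the steps above have already created simuds, result, and the macro variable &wsize.<br /> 
<br /> 
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code> 
/* Fit the first window with PROC REG; OUTEST= stores the estimates */ 
proc reg data = simuds(obs = &wsize) outest = check noprint; 
   model asset2 = asset1; 
run; 
quit; 

/* Rolling-regression estimates for the window starting at record 1 */ 
proc print data = result(obs = 1); 
   var start_time beta0 beta1; 
run; 

/* PROC REG estimates: Intercept and asset1 should agree with beta0 and beta1 */ 
proc print data = check; 
   var Intercept asset1; 
run; 
</code></pre> 
If the two sets of estimates agree up to floating-point rounding, the matrix steps in PROC FCMP are most likely doing what is intended.<br /> 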