|
Posted on 2012-3-2 09:05:45
|
Rolling regressions for backtesting
From Dapangmao's blog on sas-analysis
<div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-i51me4f9zlg/T0-0nNYCgLI/AAAAAAAAA8Q/syRBqR_ATkU/s1600/SGPlot6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="300" src="http://4.bp.blogspot.com/-i51me4f9zlg/T0-0nNYCgLI/AAAAAAAAA8Q/syRBqR_ATkU/s400/SGPlot6.png" width="400" /></a></div><br />
<br />
Markets routinely generate huge time series with millions of records. <a href="http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cts=1330647448658&ved=0CC0QFjAB&url=http%3A%2F%2Fwww.springer.com%2Fcda%2Fcontent%2Fdocument%2Fcda_downloaddocument%2F9780387279657-c1.pdf%3FSGWID%3D0-0-45-169676-p59330694&ei=lBFQT4W2AZGnsAKR06icDg&usg=AFQjCNHuLVAqKa47aNk-uZpjcwfXYqUd2w&sig2=2KDJPlOCwKrFhFmiO1lX-g">Running regressions to obtain the coefficients</a> of the variables of interest over a rolling window is computationally expensive. In SAS, <a href="http://www.nesug.org/proceedings/nesug07/sa/sa04.pdf">a macro built around a linear-model procedure</a> such as PROC REG is not an efficient option: launching PROC REG thousands of times would easily bring any SAS system to a crawl.<br />
<br />
A better way is to go back to basics and re-implement the OLS clichés: invert, transpose, and multiply the vectors and matrices. We can do this in PROC IML, with DATA step arrays, or in PROC FCMP. <a href="http://www.sasanalysis.com/2011/07/using-sasiml-for-high-performance-var.html">For such attempts PROC IML</a> is really powerful but requires an extra license. <a href="http://www.sas-programming.com/2011/08/rolling-analysis-of-time-series.html">DATA step arrays</a> demand considerable data-manipulation skill, since the DATA step is not designed for matrix operations. PROC FCMP, which ships with Base SAS, seems like a portable solution for SAS 9.1 or later. To test this method, I simulated a two-asset portfolio with 100k records and, with a rolling window of length 1,000, ran 99,001 regressions in just 10 seconds on an old laptop. Overall, the speed is quite satisfying.<br />
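<br />
As a reminder of what is being re-implemented (a sketch in standard textbook notation, not taken from the original post): let X be the window's design matrix (a column of ones and a column of asset1 values), y the window's asset2 values, and n the window length. The first formula is the general OLS estimate; the other two are what it reduces to with a single regressor, with all sums taken over the window. In the code in this post, xone holds the transpose of X, z1 holds X'X, z2 its inverse, and z3 the two coefficients.<br />
<pre><code>
\[ \hat{\beta} = (X^{\top} X)^{-1} X^{\top} y \]

\[ \hat{\beta}_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2} \]

\[ \hat{\beta}_0 = \frac{\sum y_i - \hat{\beta}_1 \sum x_i}{n} \]
</code></pre>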
<br />
<pre style="background-color: #ebebeb; border: 1px dashed rgb(153, 153, 153); color: #000001; font-size: 14px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;"><code>
/* 1 -- Simulate a two-asset portfolio: asset2 = 15 + 2*asset1 + noise */
data simuds;
   _beta0 = 15; _beta1 = 2; _mse = 5;
   do minute = 1 to 1e5;
      asset1 = ranuni(1234)*10 + 20;
      asset2 = _beta0 + _beta1*asset1 + _mse*rannor(3421);
      output;
   end;
   drop _:; format asset: dollar8.2;
run;

/* 2 -- Decide length of rolling window */
proc sql noprint;
   select count(*) into :nobs
   from simuds;
quit;
%let wsize = 1000;
/* Number of windows = number of regressions to run */
%let nloop = %eval(&nobs - &wsize + 1);
%put &nloop;

/* 3 -- Manipulate matrices */
proc fcmp;
   /* Allocate space for matrices */
   array input[&nobs, 2] / nosym;    /* col 1 = asset1 (x), col 2 = asset2 (y) */
   array y[&wsize] / nosym;          /* response for the current window */
   array xone[2, &wsize] / nosym;    /* X': row 1 = ones, row 2 = x */
   array xonet[&wsize, 2] / nosym;   /* X */
   array z1[2, 2] / nosym;           /* X'X */
   array z2[2, 2] / nosym;           /* inverse of X'X */
   array z3[2] / nosym;              /* coefficients: intercept, slope */
   array result[&nloop, 3] / nosym;
   /* Input simulation dataset */
   rc1 = read_array('simuds', input, 'asset1', 'asset2');
   /* Calculate OLS regression coefficients, one window per iteration */
   do j = 1 to &nloop;
      /* Fill the window that starts at observation j */
      do i = 1 to &wsize;
         xone[2, i] = input[i+j-1, 1];
         xone[1, i] = 1;
         y[i] = input[i+j-1, 2];
      end;
      call transpose(xone, xonet);
      call mult(xone, xonet, z1);
      call inv(z1, z2);
      /* xone is overwritten with inv(X'X) * X'; it is refilled next iteration */
      call mult(z2, xone, xone);
      call mult(xone, y, z3);
      result[j, 1] = z3[1];
      result[j, 2] = z3[2];
      result[j, 3] = j;
   end;
   /* Output resulting matrix as dataset */
   rc2 = write_array('result', result, 'beta0', 'beta1', 'start_time');
   if rc1 + rc2 > 0 then put 'ERROR: I/O error';
   else put 'NOTE: I/O was successful';
quit;

/* 4 -- Visualize result: slope per window against the true value of 2 */
proc sgplot data = result;
   needle x = start_time y = beta1;
   refline 2 / axis = y;
   yaxis min = 1.8;
run;
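
/* 5 -- Optional cross-check. This step is a sketch added for
   illustration and is not part of the original post. It refits only
   the first window with PROC REG so the estimates can be compared
   with the first row of RESULT; the two should agree up to rounding.
   The dataset name CHECK is hypothetical. */
proc reg data = simuds(obs = &wsize) outest = check noprint;
   model asset2 = asset1;
run; quit;
/* In the OUTEST= dataset, Intercept corresponds to beta0 and asset1 to beta1 */
proc print data = check; var intercept asset1; run;
proc print data = result(obs = 1); var beta0 beta1; run;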
</code></pre> |
|