Replicate Brogaard Stock Volatility Decomposition

news2025/1/13 10:34:23

Introduction

In my previous post I discussed the potential of the stock volatility decomposition method introduced by Brogaard et al. (2022, RFS). In this post I show how to implement the method so that you can use it in your own research.

For readers short on time, the code for this variance decomposition method can be accessed via this link.

Replicating Brogaard et al. (2022, RFS) requires some manipulation of the VAR estimation output, so I took some time to work through the theory and estimation of the reduced-form VAR coefficients, impulse response functions (IRFs), structural IRFs, orthogonalized IRFs, and variance decomposition, and summarized what I learned in a three-post series about VAR.

In the first post, I explain the basic logic of the VAR model using the simplest 2-variable, 1-lag VAR. In the second post, I show how to use the var and svar commands to estimate the VAR model in Stata. In the third post, I dig deeper: I lay out the theoretical definitions and formulas of the major VAR outputs and compute them manually in Stata to open up the black box of VAR estimation.

In this post, I focus only on the paper-specific ideas. Readers who need more background on VAR estimation can refer to my three-post series about VAR.

Data and Sample

The sample used by Brogaard et al. (2022, RFS) consists of all common stocks listed on the NYSE, AMEX, and NASDAQ spanning from 1960 to 2015. Estimation of the VAR model requires daily data on stock returns, market returns, and dollar-signed stock trading volumes.

The reduced-form VAR model below is estimated at the stock-year level, that is, separately for each stock in each year.
$$
\begin{aligned}
r_{m,t} &= a_0^* + \sum_{l=1}^{5} a_{1,l}^* r_{m,t-l} + \sum_{l=1}^{5} a_{2,l}^* x_{t-l} + \sum_{l=1}^{5} a_{3,l}^* r_{t-l} + e_{r_m,t} \\
x_t &= b_0^* + \sum_{l=1}^{5} b_{1,l}^* r_{m,t-l} + \sum_{l=1}^{5} b_{2,l}^* x_{t-l} + \sum_{l=1}^{5} b_{3,l}^* r_{t-l} + e_{x,t} \\
r_t &= c_0^* + \sum_{l=1}^{5} c_{1,l}^* r_{m,t-l} + \sum_{l=1}^{5} c_{2,l}^* x_{t-l} + \sum_{l=1}^{5} c_{3,l}^* r_{t-l} + e_{r,t}
\end{aligned} \tag{1}
$$
where

  • $r_{m,t}$ is the market return; the corresponding structural innovation $\varepsilon_{r_m,t}$ represents innovations in market-wide information
  • $x_t$ is the signed dollar volume of trading in the given stock; the corresponding structural innovation $\varepsilon_{x,t}$ represents innovations in firm-specific private information
  • $r_t$ is the stock return; the corresponding structural innovation $\varepsilon_{r,t}$ represents innovations in firm-specific public information
  • the authors assume that $\{\varepsilon_{r_m,t}, \varepsilon_{x,t}, \varepsilon_{r,t}\}$ are contemporaneously uncorrelated
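
To get a feel for Equation (1) before touching real data, here is a minimal sketch that estimates a 3-variable, 5-lag reduced-form VAR on simulated series. The data-generating process, the seed, and the name toy_Sigma_e are my own assumptions for illustration and have nothing to do with CRSP. Run it in a fresh Stata session, because it clears the data in memory.

```stata
* simulate 250 "trading days" for one hypothetical stock-year
clear
set seed 12345
set obs 250
gen t = _n
tsset t

gen double rm = rnormal(0, 100)                    // market return (bp)
gen double x  = rnormal(0, 500)                    // signed dollar volume
gen double r  = 0.8*rm + 0.05*x + rnormal(0, 150)  // stock return (bp)

* reduced-form VAR with 5 lags, as in Equation (1)
var rm x r, lags(1/5)

* variance/covariance matrix of the reduced-form residuals
matrix toy_Sigma_e = e(Sigma)
matrix list toy_Sigma_e
```

Each equation has 16 coefficients (5 lags of 3 variables plus a constant), and the first 5 days are lost to the lags.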

Download Data

To better serve my own research purpose, in this post I implement the stock-year level variance decomposition for all common stocks listed on the NYSE, AMEX, and NASDAQ from 2005 to 2021.

The SAS code is as follows. I first log in to the WRDS server from SAS. Then I download the daily stock price, trading volume, return, and market return from CRSP, which covers the stocks listed on the NYSE, AMEX, and NASDAQ. For ease of importing into Stata, I convert all the downloaded SAS datasets into csv format. Because the daily CRSP data are very large, I run all of the above steps year by year.

libname home "C:\Users\xu-m\Documents\testVAR\rawdata2005-2021sas";

/* log in WRDS server */
%let wrds = wrds.wharton.upenn.edu 4016;
options comamid=TCP remote=WRDS;
signon username=xxx pwd=xxx;
run;

/* run "signoff;" only after all the downloads below have finished */

/* download daily stock price, trading volume, return, and market return from CRSP*/
rsubmit;
%macro downyear;
	 %do year = 2005 %to 2021;
	 	%let firstdate = %sysfunc(mdy(1,1,&year));
		%let lastdate = %sysfunc(mdy(12,31,&year));
		%put &year;

		proc sql;
			create table sampleforvar&year as select distinct 
			cusip, date, ret, prc, vol, numtrd, shrout, hsiccd
			from crsp.dsf
			where date ge &firstdate and date le &lastdate;
		quit;

		proc sql;
			create table sampleforvar&year as select a.*, b.*
			from sampleforvar&year a, crsp.dsi b
			where a.date=b.date;
		quit;
		proc download data=sampleforvar&year out=home.sampleforvar&year; run;
	%end;
%mend;
%downyear;
endrsubmit;

/* transfer the downloaded sas dataset into csv format */
%macro expyear;
	 %do year = 2005 %to 2021;
	 	%let outfile = %sysfunc(cat(C:\Users\xu-m\Documents\testVAR\rawdata2005-2021sas\sampleforvar, &year, .csv));
		%let outfile = "%sysfunc(tranwrd(&outfile,%str( ),%str(%" %")))";
		proc export data=home.sampleforvar&year outfile=&outfile dbms=csv replace; run;
%end;
%mend;
%expyear;

The output of this step is as follows. The raw data for each year are stored in a csv file named sampleforvar + year.


Figure 1: Sample List

Clean Data

We have two tasks in this step.

  1. generate the 3 variables rm, x, and r for VAR estimation
  2. set time series, which is the prerequisite for using the svar command

Following Brogaard et al. (2022, RFS), the 3 variables for VAR estimation are constructed as follows.

  1. I use Equal-Weighted Return (ewretd in CRSP) in basis points as market return rm
  2. I use the daily Holding Period Return (ret in CRSP) in basis points as stock return r
  3. I use the daily signed dollar volume in $ thousands as stock order flow x
    • The daily signed dollar volume is defined as the product of daily stock price (prc in CRSP), trading volume (vol in CRSP), and the sign of the stocks’ daily return
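
The definitions above can be checked on a tiny made-up example. The three rows below are hypothetical numbers, not CRSP data, and x_alt is a name I introduce only to contrast two ways of signing the volume: the rule used in the cleaning code of this post, cond(ret>0,1,-1), treats a zero-return day as a sell, while sign(ret) sets the order flow to zero on such a day. Run it in a fresh session, because it clears the data in memory.

```stata
* toy data: three trading days for one hypothetical stock
clear
input double(ret prc vol)
 0.0125  20  1000
-0.0040  19  2500
 0       19   800
end

gen double r     = ret*10000                      // return in basis points
gen double x     = vol*prc*cond(ret>0,1,-1)/1000  // rule used in this post
gen double x_alt = vol*prc*sign(ret)/1000         // alternative: 0 on flat days
list
```

The two rules only differ on the third day, where the return is exactly zero.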

To mitigate the impacts of outliers, I winsorized all the above variables at the 5% and 95% levels.
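
winsor2 is a user-written command, so it has to be installed before use (for example with ssc install winsor2). For readers who prefer built-in commands, the sketch below shows what winsorizing at the 5th and 95th percentiles amounts to. The variable v, the scalars p_lo and p_hi, and the data are hypothetical, and the cut-offs may differ slightly from those of winsor2 if it uses another percentile convention.

```stata
* toy variable taking the values 1, 2, ..., 100
clear
set obs 100
gen double v = _n

* 5th and 95th percentiles
_pctile v, percentiles(5 95)
scalar p_lo = r(r1)
scalar p_hi = r(r2)

* pull values outside [p_lo, p_hi] back to the cut-offs
gen double v_w = min(max(v, p_lo), p_hi) if !missing(v)
summarize v v_w
```

Unlike trimming, winsorizing keeps all 100 observations and only replaces the extreme values.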

I set the index of trading days within a given stock-year to tag the time series.

The Stata code is as follows. Note that I use cusip and year as identifiers. To make it convenient to loop over stocks, I generate a unique number cusipcode for each stock.

The output of this step is a yearly dataset named sampledata + year that is ready for the VAR estimation at the stock-year level.

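* import data
cd C:\Users\xu-m\Documents\testVAR\rawdata2005-2021sas

cap program drop cleanforbrogaard
program define cleanforbrogaard
	* `1' is the year, e.g. cleanforbrogaard 2020
	import delimited "sampleforvar`1'.csv",clear
	* force: non-numeric entries in ret become missing
	destring ret, force replace
	drop if ret==.|prc<0|vol<0
	* stock return in basis points
	g r = ret*10000
	* sign of the daily return (zero-return days get -1)
	g prcsign = cond(ret>0,1,-1)
	* signed dollar volume in $ thousands
	g x=vol*prc*prcsign/1000
	* market return in basis points
	g rm = ewretd*10000
	
	* winsor2 is user-written: ssc install winsor2
	winsor2 rm x r, cuts(5 95) replace
	
	* index the trading days within each stock
	encode cusip, g(cusipcode)
	sort cusipcode date
	by cusipcode: g index=_n
	xtset cusipcode index
	
	g year = floor(date[1]/10000)
	
	keep cusip cusipcode year index rm x r
	save sampledata`1',replace
end

forvalues j = 2005/2021{
	cleanforbrogaard `j'
}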

Extract Estimation Unit and Set Global Variables

As the VAR estimation is implemented at the stock-year level, we first need to extract the sample for each stock and year using the identifiers cusip and year. All the subsequent steps operate on a single stock-year dataset like the following.


      +----------------------------------------------------------------------+
     |    cusip         r           x          rm   cusipc~e   index   year |
     |----------------------------------------------------------------------|
  1. | 00032Q10    266.24    637.6158        54.7          1       1   2020 |
  2. | 00032Q10     -1.56   -121.6986      -30.57          1       2   2020 |
  3. | 00032Q10    123.44    190.1839       45.48          1       3   2020 |
  4. | 00032Q10    233.06    243.7433        -.41          1       4   2020 |
  5. | 00032Q10   -599.08    -1396.12    9.139999          1       5   2020 |
     |----------------------------------------------------------------------|
  6. | 00032Q10    251.22    234.7973       29.41          1       6   2020 |
  7. | 00032Q10   -245.06   -288.1359      -12.47          1       7   2020 |
  8. | 00032Q10   -178.28   -149.2384       46.14          1       8   2020 |
  9. | 00032Q10     -16.5    -76.8955        29.2          1       9   2020 |
 10. | 00032Q10    201.65    135.6328       31.06          1      10   2020 |
     |----------------------------------------------------------------------|
 ...
     |----------------------------------------------------------------------|
246. | 00032Q10       400    4548.106      -18.55          1     246   2020 |
247. | 00032Q10    -96.15   -1940.581       38.66          1     247   2020 |
248. | 00032Q10    291.26    4240.543      108.42          1     248   2020 |
249. | 00032Q10    -94.34   -937.4673       -2.77          1     249   2020 |
250. | 00032Q10   -285.71   -2000.298        13.5          1     250   2020 |
     |----------------------------------------------------------------------|
251. | 00032Q10   -588.24   -2000.485      -84.83          1     251   2020 |
252. | 00032Q10     312.5    1369.296      101.55          1     252   2020 |
253. | 00032Q10   -101.01    -862.252      -10.58          1     253   2020 |
     +----------------------------------------------------------------------+

In this step, I also set two global variables that are used repeatedly in the subsequent procedures.

  1. names: the variable names in the VAR system
  2. rownum: the number of observations in the stock-year dataset

The code for this step is as follows.

* load dataset unit
use sampledata2020, clear
qui keep if cusipcode == 1
list, nolabel
* set global variables
global names "rm x r"
global rownum = _N

Implement Brogaard Decomposition

So far, we've collected all the necessary variables and prepared the data for the Brogaard decomposition. Before we start estimating, I would like to give the big picture of how the decomposition is implemented for a single stock-year dataset.

The tasks we're going to work through are as follows.

  1. Estimate the reduced-form VAR in Equation (1), saving the residuals $e_t$ and the variance/covariance matrix of the residuals $\Sigma_e$

  2. Estimate the matrix $B$, which specifies the contemporaneous effects among the variables in the VAR system

  3. Estimate the structural shocks $\varepsilon_t$ and their variance/covariance matrix $\Sigma_\varepsilon$

  4. Estimate the 15-step cumulative structural IRFs $\theta_{r_m}$, $\theta_x$, $\theta_r$, which represent the (permanent) cumulative return responses to unit shocks of the corresponding structural-model innovations

  5. Combine the estimated variances of the structural innovations from step 3 with the long-run responses from step 4 to get the variance component of each information source using the following formula.
     $$
     \begin{aligned}
     \text{MktInfo} &= \theta_{r_m}^2 \sigma_{\varepsilon_{r_m}}^2 \\
     \text{PrivateInfo} &= \theta_x^2 \sigma_{\varepsilon_x}^2 \\
     \text{PublicInfo} &= \theta_r^2 \sigma_{\varepsilon_r}^2
     \end{aligned}
     $$

  6. Estimate the contemporaneous noise term with the following equation
     $$
     \Delta_s = r_t - a_0 - \theta_{r_m}\varepsilon_{r_m,t} - \theta_x\varepsilon_{x,t} - \theta_r\varepsilon_{r,t}
     $$
     As we're only interested in the variance of $\Delta_s$, which is by construction the variance from noise, we can ignore the constant term $a_0$ and use the variance of $\Delta_s^*$ to represent the noise variance instead, where
     $$
     \sigma_s^2 = Var(\Delta_s) = Var(\Delta_s^*), \qquad
     \Delta_s^* = r_t - \theta_{r_m}\varepsilon_{r_m,t} - \theta_x\varepsilon_{x,t} - \theta_r\varepsilon_{r,t}
     $$

  7. Get the variance shares by normalizing these variance components
     $$
     \begin{aligned}
     \text{MktInfoShare} &= \theta_{r_m}^2 \sigma_{\varepsilon_{r_m}}^2 / (\sigma_w^2 + \sigma_s^2) \\
     \text{PrivateInfoShare} &= \theta_x^2 \sigma_{\varepsilon_x}^2 / (\sigma_w^2 + \sigma_s^2) \\
     \text{PublicInfoShare} &= \theta_r^2 \sigma_{\varepsilon_r}^2 / (\sigma_w^2 + \sigma_s^2) \\
     \text{NoiseShare} &= \sigma_s^2 / (\sigma_w^2 + \sigma_s^2)
     \end{aligned}
     $$
     where $\sigma_w^2 = \text{MktInfo} + \text{PrivateInfo} + \text{PublicInfo}$ is the information-driven variance and $\sigma_s^2$ is the noise variance.

Estimate VAR Coefficients, Matrix $B$, $\varepsilon_t$, $\Sigma_e$, and $\Sigma_\varepsilon$

I set the lag order to 5 to stay consistent with Brogaard et al. (2022, RFS), and I use the svar command to estimate the VAR model, imposing a Cholesky-type restriction on the contemporaneous matrix $B$ as in the paper.

Readers can find details about the matrix $B$ in Dig into Estimation of VAR Coefficients, IRFs, and Variance Decomposition in Stata, and see why the svar command can estimate it directly in Estimations of VAR, IRFs, and Variance Decomposition in Stata.

The codes for estimating the reduced-form model are as follows.

* estimate B and coefficients of VAR
matrix A1 = (1,0,0 \ .,1,0 \ .,.,1)
matrix B1 = (.,0,0 \ 0,.,0 \ 0,0,.)
qui svar rm x r, lags(1/5) aeq(A1) beq(B1)

The svar command stores the matrix $B$ as e(A), the coefficient matrix as e(b_var), and the variance/covariance matrix of the residuals $\Sigma_e$ as e(Sigma). So I don't estimate them again but take them directly from the svar results.
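
To make the aeq() and beq() patterns concrete, the block below writes out the restriction they encode, in this post's notation ($e_t$ for the reduced-form residuals, $\varepsilon_t$ for the structural shocks). This is my reading of the pattern matrices A1 and B1 above rather than a formula quoted from the paper. The symbols $b_{21}$, $b_{31}$, and $b_{32}$ are labels I introduce for the free entries, which are the cells marked with a missing value (.) in A1.

```latex
B = \begin{pmatrix} 1 & 0 & 0 \\ b_{21} & 1 & 0 \\ b_{31} & b_{32} & 1 \end{pmatrix},
\qquad
B \begin{pmatrix} e_{r_m,t} \\ e_{x,t} \\ e_{r,t} \end{pmatrix}
= \begin{pmatrix} \varepsilon_{r_m,t} \\ \varepsilon_{x,t} \\ \varepsilon_{r,t} \end{pmatrix},
\qquad
\Sigma_\varepsilon = \begin{pmatrix}
\sigma_{\varepsilon_{r_m}}^2 & 0 & 0 \\
0 & \sigma_{\varepsilon_x}^2 & 0 \\
0 & 0 & \sigma_{\varepsilon_r}^2
\end{pmatrix}
```

With the ordering rm, x, r, the first row says the market residual is itself the market-wide shock, the second row removes the contemporaneous market component from the order-flow residual, and the third row removes both from the return residual. The diagonal pattern in B1 is what keeps the three structural shocks contemporaneously uncorrelated.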

* store parameter matrix
mat B = e(A)
mat coef = e(b_var)
mat sigma_hat = e(Sigma)

I adjust the degrees of freedom of the residual variance/covariance matrix produced by the svar command, sigma_hat, from $n$ to $n-1$ and name the adjusted matrix sigma_e. It shouldn't make much difference if readers skip this step.

By definition,
$$
\varepsilon_t = B e_t
$$
That also implies
$$
\Sigma_\varepsilon = B \Sigma_e B^\top
$$
With the above formulas, we can calculate the structural shocks $\varepsilon_t$ and their variance/covariance matrix $\Sigma_\varepsilon$ as follows. I store the structural shocks in a matrix named epsilons and the variance/covariance matrix of the structural shocks in a matrix named sigma_epsilon.

** get residuals e_t
foreach var of varlist rm x r{
	qui cap predict e_`var', resi equation(`var')
}

** get epsilons
mkmat e_rm e_x e_r, matrix(resi)
mat epsilons = (B*resi')'

* get variance-covariance matrix of residuals and epsilons
mat sigma_e = sigma_hat*_N/(_N-1)
mat sigma_epsilon = B*sigma_e*B'
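
The identity used in the last line is easy to verify by hand on a 2-variable toy case. The numbers and the matrix names toyB, toySigma_e, and toySigma_eps below are made up for illustration; toyB is chosen so that it exactly removes the correlation between the two residuals.

```stata
* toy contemporaneous matrix and residual covariance matrix
matrix toyB       = (1, 0 \ -0.5, 1)
matrix toySigma_e = (4, 2 \ 2, 5)

* covariance matrix of the structural shocks
matrix toySigma_eps = toyB*toySigma_e*toyB'
matrix list toySigma_eps
```

The result is diagonal with 4 in both diagonal cells: the off-diagonal covariance of 2 between the residuals disappears once the second residual is purged of half of the first one.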

Estimate 15-step cumulative structural IRFs $\theta_{r_m}$, $\theta_x$, and $\theta_r$

While the svar command can easily produce the results for IRFs, Orthogonalized IRFs, and Orthogonalized Structural IRFs automatically, what we need are the un-orthogonalized Structural IRFs.

Procedures for estimating $\theta$

These un-orthogonalized structural IRFs can be calculated quickly with the following procedure (please see more details in Dig into Estimation of VAR Coefficients, IRFs, and Variance Decomposition in Stata).

  1. Calculate the IRFs $\Phi_i$ ($i = 1, 2, \dots, 15$) with the following formula, where $k$ is the number of variables in the VAR system, $A_j$ is the lag-$j$ coefficient matrix of the reduced-form VAR, and $A_j = 0$ for $j > 5$
     $$
     \Phi_0 = I_k, \qquad \Phi_i = \sum_{j=1}^{i} \Phi_{i-j} A_j \tag{2}
     $$

  2. Post-multiply each IRF $\Phi_i$ by $B^{-1}$ to get the (un-orthogonalized) structural IRF $\Lambda_i$ for each step $i = 0, 1, 2, \dots, 15$
     $$
     \Lambda_i = \Phi_i B^{-1} \tag{3}
     $$

  3. Sum the (un-orthogonalized) structural IRFs $\Lambda_i$ over steps 0 to 15 to obtain the 15-step cumulative structural IRFs.

  4. As we are only interested in the 15-step cumulative structural IRFs on the stock return, which is the dependent variable of the third equation in the VAR system, $\theta_{r_m}$, $\theta_x$, and $\theta_r$ lie in the third row of the 15-step cumulative structural IRF matrix.
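
The four steps above can be sketched on a toy system that is small enough to check by hand: 2 variables, 2 lags, and a 3-step horizon instead of 3 variables, 5 lags, and 15 steps. All numbers and the toy* matrix names are assumptions for illustration.

```stata
* toy lag matrices and contemporaneous matrix
matrix toyA1   = (0.5, 0 \ 0, 0.5)
matrix toyA2   = (0.25, 0 \ 0.25, 0)
matrix toyB    = (1, 0 \ -0.5, 1)
matrix toyBinv = inv(toyB)

* step 0: Phi_0 = I, Lambda_0 = Phi_0 * inv(B)
matrix toyirf0  = I(2)
matrix toycsirf = toyirf0*toyBinv

forvalues i = 1/3 {
	* Equation (2): Phi_i = sum_j Phi_{i-j} * A_j, with A_j = 0 beyond lag 2
	matrix toyirf`i' = J(2, 2, 0)
	forvalues j = 1/2 {
		if `i' >= `j' {
			local k = `i' - `j'
			matrix toyirf`i' = toyirf`i' + toyirf`k'*toyA`j'
		}
	}
	* Equation (3), accumulated over the steps
	matrix toycsirf = toycsirf + toyirf`i'*toyBinv
}

* the last row holds the cumulative responses of the last variable
matrix toytheta = toycsirf[2, 1..2]
matrix list toytheta
```

Working through the recursion by hand gives a cumulative response of 1.4375 to the first shock and 1.875 to the second, which is what the listing shows.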

Obtain coefficient matrix $A_j$ ($j = 1, 2, \dots, 5$)

To implement the above procedure, the first thing we need is the set of reduced-form coefficient matrices $A_j$ ($j = 1, 2, \dots, 5$). As we have already obtained all the reduced-form coefficients with the svar command and stored them in the matrix coef in the last step, we don't have to compute them again but just need to reshape coef into the shape we need.

The coefficient matrix we currently have, coef, is a $1 \times 48$ matrix. I first reshape it into a $3 \times 16$ matrix named newcoef, where each row contains the coefficients of one equation in the VAR system. Within each row, the coefficients are ordered by a fixed rule: the coefficients on the first variable rm at lags 1 to 5, the coefficients on the second variable x at lags 1 to 5, the coefficients on the third variable r at lags 1 to 5, and the constant of the corresponding equation. That implies that the coefficients for the same lag can always be found every 5 columns.

With the above observations, I generate the matrices $A_1$ to $A_5$ with the following code.

* reshape coefficient matrix
cap mat drop newcoef
forvalues i = 1/3{
	mat temp= coef[1..1, 1+16*(`i'-1)..16*`i']
	mat newcoef = nullmat(newcoef) \ temp
}

* generate a1 to a5
forvalues i = 1/5{
	mat A`i' = (newcoef[1..3,`i'], newcoef[1..3,`i'+5], newcoef[1..3,`i'+10])
	mat rownames A`i' = $names
	mat colnames A`i' = $names
	}
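
The same reshaping logic is easier to follow on a small labelled example. Below, a hypothetical coefficient vector for a 2-variable, 2-lag VAR uses two-digit values so that each cell can be traced: in the first equation the coefficients on variable 1 at lags 1 and 2 are 11 and 12, those on variable 2 are 13 and 14, and the constant is 1. The names toycoef, toynew, toyL1, and toyL2 are assumptions for illustration.

```stata
* layout per equation: (v1 lag1, v1 lag2, v2 lag1, v2 lag2, constant)
matrix toycoef = (11, 12, 13, 14, 1, 21, 22, 23, 24, 2)

* stack the two equations: 2 rows of 5 coefficients
cap mat drop toynew
forvalues i = 1/2 {
	matrix temp = toycoef[1..1, 1+5*(`i'-1)..5*`i']
	matrix toynew = nullmat(toynew) \ temp
}

* the same lag is found every 2 columns (the number of lags)
forvalues l = 1/2 {
	matrix toyL`l' = (toynew[1..2, `l'], toynew[1..2, `l'+2])
}
matrix list toyL1
matrix list toyL2
```

toyL1 collects the lag-1 coefficients (11, 13 in the first row and 21, 23 in the second), and toyL2 collects the lag-2 coefficients.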

I list the lag-3 coefficient matrix $A_3$ as an example of the desired format of the coefficient matrices $A_1$ to $A_5$. The $(i,k)$-th element of $A_j$ is the coefficient on the $j$-th lag of variable $k$ in the equation with variable $i$ as the dependent variable. Note that the listing below carries the labels inflation, unrate, and ffr rather than rm, x, and r; only the $3 \times 3$ layout matters here.

. mat list A3

A3[3,3]

            inflation      unrate         ffr
inflation  -.06574644   .00181085  -.00500138
   unrate   1.4581185   .04263687   -1.835178
      ffr  -.01217184  -.00032878  -.06017122

Calculate IRFs and cumulative un-orthogonalized Structural IRFs

At this stage, we've made clear the formulas for calculating the IRFs and the un-orthogonalized structural IRFs (see Equations (2) and (3)) and obtained all the necessary ingredients (the coefficient matrices $A_j$ and the matrix $B$) for the calculations.

The code for calculating the IRFs and the cumulative un-orthogonalized structural IRFs is as follows. I sum the un-orthogonalized structural IRF matrices step by step to get the 15-step cumulative un-orthogonalized structural IRFs and name the result csirf.

* calculate IRFs and cumulative un-orthogonalized Structural IRFs
mat irf0 = I(3)
mat sirf0 = irf0*inv(B)
mat csirf = sirf0
forvalues i=1/15{
	mat irf`i' = J(3,3,0)
	forvalues j = 1/5{
		if `i' >= `j'{
			local temp = `i'-`j'
			mat temp2 = irf`temp'*A`j'
			mat irf`i' = irf`i'+ temp2
		}
	}
	mat sirf`i' = irf`i'*inv(B)
	mat csirf = csirf + sirf`i'
}
mat rownames csirf = $names

Extract $\theta$

The 15-step cumulative un-orthogonalized structural IRF matrix csirf is as follows. The $(i,j)$-th element of this matrix represents the permanent (cumulative) impact of a one-unit structural shock $\varepsilon_{j,t}$ on the $i$-th equation in the VAR system.

. mat list csirf

csirf[3,3]
            rm           x           r
rm   1.0590505   .00356754   .00036924
 x   11.662946   .84352094  -1.0125034
 r   .94265444   .01400394   .76133195

By definition, the elements in the third row of the matrix csirf are $\theta_{r_m}$, $\theta_x$, and $\theta_r$, respectively.

Thus, we can extract the thetas from the matrix csirf and save them into a new matrix named theta.

* extract thetas
mat theta = csirf[3..3, 1..3]

Calculate noise term

As discussed in the road map, the noise variance is given by the following formula.
$$
\sigma_s^2 = Var(\Delta_s) = Var(\Delta_s^*), \qquad
\Delta_s^* = r_t - \theta_{r_m}\varepsilon_{r_m,t} - \theta_x\varepsilon_{x,t} - \theta_r\varepsilon_{r,t}
$$
Intuitively, we first need to calculate $\Delta_s^*$ by subtracting from the contemporaneous stock return $r_t$ the structural shocks $\varepsilon_t$ weighted by their permanent impacts on stock returns $\theta$.

As we've saved the structural shocks $\varepsilon_t$ in a matrix named epsilons and the permanent impacts of the structural shocks on stock returns $\theta$ in a matrix named theta, the contemporaneous noise term $\Delta_s^*$ can be calculated with the following code, where I save the noise term into a matrix named delta_s.

To produce the variance of the noise term more conveniently, I save the noise term matrix delta_s into a new column named delta_s in the dataset.

* calculate noise
mkmat r, matrix(r)
mat delta_s = r - (theta*epsilons')'
mat colnames delta_s = "delta_s"
svmat delta_s, names(col)
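
A three-day toy example shows what these four lines do. The numbers and the toy* names are made up, and I use the equivalent form epsilons*theta' instead of (theta*epsilons')'. Run it in a fresh session, because it clears the data in memory.

```stata
clear
* toy inputs: 2 structural shocks observed over 3 days
matrix toytheta = (1, 0.5)
matrix toyeps   = (2, 4 \ -1, 2 \ 0, -2)
matrix toyret   = (5 \ 1 \ -3)

* noise = return minus the theta-weighted shocks
matrix toydelta = toyret - toyeps*toytheta'
matrix colnames toydelta = toydelta

* move the noise series into the dataset and take its variance
svmat toydelta, names(col)
quietly summarize toydelta
display "noise variance = " r(Var)
```

The noise series is (1, 1, -2), which has mean zero and a sample variance of 3.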

Calculate the variance from each component

By now, we've collected all the ingredients needed to compute the variance contributions of the four components defined by Brogaard et al. (2022, RFS).

First, we calculate the variance contributions of the three types of information with the following formula.
$$
\begin{aligned}
\text{MktInfo} &= \theta_{r_m}^2 \sigma_{\varepsilon_{r_m}}^2 \\
\text{PrivateInfo} &= \theta_x^2 \sigma_{\varepsilon_x}^2 \\
\text{PublicInfo} &= \theta_r^2 \sigma_{\varepsilon_r}^2
\end{aligned}
$$
As we've saved the thetas in the matrix theta and the variance/covariance matrix of the structural shocks $\varepsilon_t$ in the matrix sigma_epsilon, we can calculate the information-based variances as follows.

** calculate information part variance
mat var_epsilon = vecdiag(sigma_epsilon)
mat brogaard = J(1,3,0)
forvalues i = 1/3{
	mat brogaard[1,`i']=theta[1, `i']^2*var_epsilon[1, `i']
}
mat brogaard = (brogaard\theta\var_epsilon)
mat rownames brogaard = varpart theta var_epsilon
mat colnames brogaard = $names

Note that I put all the variance components along with the related parameters $\theta$ and $\sigma_\varepsilon^2$ into a new matrix named brogaard. This matrix looks as follows.

. mat list brogaard

brogaard[3,3]
                    rm          x          r
    varpart    17851.6  7821.7685  56649.107
      theta  .94265444  .01400394  .76133195
var_epsilon  20089.638   39884558   97733.84

Second, I calculate the variance contribution of noise, which is proxied by the variance of $\Delta_s^*$ calculated above. I also add the noise variance to the result matrix brogaard.

** calculate noise part variance
mat brogaard = (brogaard, J(3,1,0))' 
qui sum delta_s
mat brogaard[4,1] = r(sd)^2
mat rownames brogaard = $names "s"

After this step, we’ve figured out the variance contribution from each component defined by the Brogaard paper and saved them into the result matrix brogaard.

The final result matrix brogaard is as follows.

. mat list brogaard

brogaard[4,3]
        varpart        theta  var_epsilon
rm      17851.6    .94265444    20089.638
 x    7821.7685    .01400394     39884558
 r    56649.107    .76133195     97733.84
 s    20275.294            0            0

Calculate variance contribution

To calculate the variance contributions more conveniently, I save the result matrix brogaard into the dataset. I use the following formula to calculate the variance contribution of each component and save the percentages into a new variable named varpct.
$$
\begin{aligned}
\text{MktInfoShare} &= \theta_{r_m}^2 \sigma_{\varepsilon_{r_m}}^2 / (\sigma_w^2 + \sigma_s^2) \\
\text{PrivateInfoShare} &= \theta_x^2 \sigma_{\varepsilon_x}^2 / (\sigma_w^2 + \sigma_s^2) \\
\text{PublicInfoShare} &= \theta_r^2 \sigma_{\varepsilon_r}^2 / (\sigma_w^2 + \sigma_s^2) \\
\text{NoiseShare} &= \sigma_s^2 / (\sigma_w^2 + \sigma_s^2)
\end{aligned}
$$
Here $\sigma_w^2$ is the sum of the three information components and $\sigma_s^2$ is the noise variance, so the denominator is the sum of all four variance components.

keep cusip year
qui keep in 1/4
qui svmat brogaard, names(col)
qui g rownames = ""
local index = 1
foreach name in $names "s"{
	qui replace rownames = "`name'" if _n == `index'
	local index = `index' + 1
}
egen fullvar = total(varpart)
qui g varpct = varpart/fullvar*100
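
As a cross-check, the shares can be recomputed from the four varpart values shown in the result matrix brogaard above (stock 00032Q10 in 2020). The sketch below types them in by hand and uses only the variable names from the code above. Run it in a fresh session, because it clears the data in memory.

```stata
clear
input str2 rownames double varpart
"rm" 17851.6
"x"  7821.7685
"r"  56649.107
"s"  20275.294
end

* total variance = information components + noise
egen double fullvar = total(varpart)
gen double varpct = varpart/fullvar*100
list rownames varpart varpct
```

The four components sum to 102597.7695, so the shares come out at roughly 17.40 (market information), 7.62 (private information), 55.21 (public information), and 19.76 (noise) percent.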

Pack codes

Remember that the Brogaard decomposition is implemented at the stock-year level. That means we need to run the above code on the daily observations of each stock in each year, which requires packing the code into a program.

Two issues are worth noting in the packing procedure.

  1. I require at least 50 observations for the estimation of the VAR model
    • otherwise, the VAR estimation doesn't converge or lacks validity because there are too few degrees of freedom
  2. I require that the estimation of the VAR model converges
    • otherwise, it's not possible to get the residuals, which are the prerequisite for the subsequent calculations

cap program drop loopb
program define loopb
	* `1' is the stock id cusipcode, `2' is the year, e.g. loopb 1 2020
	use sampledata`2', clear
	qui keep if cusipcode == `1'
	
	if _N >= 50{
	* set global variables
	global names "rm x r"
	global rownum = _N

	* estimate B and coefficients of VAR
	matrix A1 = (1,0,0 \ .,1,0 \ .,.,1)
	matrix B1 = (.,0,0 \ 0,.,0 \ 0,0,.)
	qui svar rm x r, lags(1/5) aeq(A1) beq(B1)
	mat B = e(A)
	mat coef = e(b_var)
	mat sigma_hat = e(Sigma)

	* get coefficient matrix of var a1-a5
	* reshape coefficient matrix
	cap mat drop newcoef
	forvalues i = 1/3{
		mat temp= coef[1..1, 1+16*(`i'-1)..16*`i']
		mat newcoef = nullmat(newcoef) \ temp
	}

	* generate a1 to a5
	forvalues i = 1/5{
		mat A`i' = (newcoef[1..3,`i'], newcoef[1..3,`i'+5], newcoef[1..3,`i'+10])
		mat rownames A`i' = $names
		mat colnames A`i' = $names
		}

	* get 15-step coefficients of VMA (irf), structural irf (sirf), and cumulative sirf
	mat irf0 = I(3)
	mat sirf0 = irf0*inv(B)
	mat csirf = sirf0
	forvalues i=1/15{
		mat irf`i' = J(3,3,0)
		forvalues j = 1/5{
			if `i' >= `j'{
				local temp = `i'-`j'
				mat temp2 = irf`temp'*A`j'
				mat irf`i' = irf`i'+ temp2
			}
		}
		mat sirf`i' = irf`i'*inv(B)
		mat csirf = csirf + sirf`i'
	}

	* extract thetas
	mat theta = csirf[3..3, 1..3]

	* get residuals, epsilons, and their variance-covariance matrix
	** get residuals
	foreach var of varlist rm x r{
		qui cap predict e_`var', resi equation(`var')
	}
	
	capture confirm variable e_rm
	if (_rc == 0){
			** get epsilons
		mkmat e_rm e_x e_r, matrix(resi)
		mat epsilons = (B*resi')'

		** get variance-covariance matrix of residuals and epsilons
		mat sigma_e = sigma_hat*_N/(_N-1)
		mat sigma_epsilon = B*sigma_e*B'

		* calculate noise
		mkmat r, matrix(r)
		mat delta_s = r - (theta*epsilons')'
		mat colnames delta_s = "delta_s"
		svmat delta_s, names(col)

		* calculate variance decomposition of each part
		** calculate information part variance
		mat var_epsilon = vecdiag(sigma_epsilon)
		mat brogaard = J(1,3,0)
		forvalues i = 1/3{
			mat brogaard[1,`i']=theta[1, `i']^2*var_epsilon[1, `i']
		}
		mat brogaard = (brogaard\theta\var_epsilon)
		mat rownames brogaard = varpart theta var_epsilon

		** calculate noise part variance
		mat brogaard = (brogaard, J(3,1,0))' 
		qui sum delta_s
		mat brogaard[4,1] = r(sd)^2
		mat rownames brogaard = $names "s"
		*mat list brogaard

		** save the variance decomposition results 
		keep cusip year
		qui keep in 1/4
		qui svmat brogaard, names(col)
		qui g rownames = ""
		local index = 1
		foreach name in $names "s"{
			qui replace rownames = "`name'" if _n == `index'
			local index = `index' + 1
		}
		egen fullvar = total(varpart)
		qui g varpct = varpart/fullvar*100

		* save final results, one file per stock-year
		local cusip = cusip[1]
		local savename = "C:\Users\xu-m\Documents\testVAR\resvardecompose\brogaard_`cusip'_`2'.dta"
		qui save "`savename'", replace
	}
	}
end 
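
Because loopb stops with an error whenever svar fails for a stock-year, the loop that calls it needs to absorb such errors and move on. The sketch below shows one way to do that with capture. fakeloopb is a hypothetical stand-in I define here so that the example runs without the data; it pretends that stock 2 fails to converge in every year. In a real run it would be replaced by loopb, and the inner loop would run from 1 to the largest cusipcode in the given year.

```stata
cap program drop fakeloopb
program define fakeloopb
	args stock year
	* stand-in for loopb: stock 2 "fails to converge"
	if `stock' == 2 {
		error 430
	}
end

local nfail = 0
forvalues y = 2020/2021 {
	forvalues s = 1/3 {
		* capture swallows the error; _rc holds the return code
		capture fakeloopb `s' `y'
		if _rc != 0 {
			local nfail = `nfail' + 1
		}
	}
}
display "failed stock-years: `nfail'"
```

Two of the six stock-years fail here (stock 2 in each of the two years), and the loop still visits all the others.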

Loop over sample

I run the packed code over the stocks in each year and collect the results for the different years. Then I reshape the dataset into a panel. The code is as follows.

cd C:\Users\xu-m\Documents\testVAR\resvardecompose

* collect results for each year
cap program drop collectbroggardbyyear
program define collectbroggardbyyear
	* `1' is the year, e.g. collectbroggardbyyear 2020
	clear
	set obs 0
	save brogaard`1', replace emptyok

	local ff : dir . files "*_`1'.dta"
	local yearnum : word count `ff'
	local index = 1
	foreach f of local ff {
		append using "`f'"
		di "`index' of `yearnum' in year `1'"
		local index = `index' + 1
		}
	save brogaard`1', replace 
end

* collect all years
clear 
set obs 0
save ../brogaard, replace emptyok
forvalues year = 2005/2021{
	collectbroggardbyyear `year'

	use ../brogaard, clear
	append using brogaard`year'
	save ../brogaard, replace 
}

* reshape into panel data
cd ../

use brogaard, clear
replace rownames = "_"+rownames
reshape wide varpart theta var_epsilon fullvar varpct, i(cusip year) j(rownames) string
rename varpct_rm mktinfo
rename varpct_x privteinfo
rename varpct_r publicinfo
rename varpct_s noise
keep cusip year *info noise
save panelbrogaard, replace

The final outcome is as follows.

. list in 1/20

     +-------------------------------------------------------------+
     |    cusip   year   public~o    mktinfo      noise   privte~o |
     |-------------------------------------------------------------|
  1. | 00030710   2014   .4612017   17.60379   58.85497   23.08004 |
  2. | 00030710   2015   43.65164   10.60914   12.57214   33.16709 |
  3. | 00030710   2016   41.93059   13.32436   6.182855    38.5622 |
  4. | 00030710   2017   54.59381   1.191425   14.96389   29.25087 |
  5. | 00030710   2018     43.838   25.09339   8.907529   22.16108 |
     |-------------------------------------------------------------|
  6. | 00030710   2019   30.02275   25.98764   12.31215   31.67746 |
  7. | 00032Q10   2018   18.90079   23.99627   36.99344   20.10949 |
  8. | 00032Q10   2019   59.49644   5.478648   11.91035   23.11457 |
  9. | 00032Q10   2020   55.21476    17.3996   19.76192   7.623722 |
 10. | 00032Q10   2021   51.75403   25.38084   12.44535   10.41978 |
     |-------------------------------------------------------------|
 11. | 00036020   2005   40.70956   7.924543   26.06547   25.30043 |
 12. | 00036020   2006   38.53111   21.68062   6.496971   33.29131 |
 13. | 00036020   2007   32.78075   4.495751   27.55989    35.1636 |
 14. | 00036020   2008   21.36486   47.52046   16.44107    14.6736 |
 15. | 00036020   2009   34.37528   31.25706   17.96114   16.40653 |
     |-------------------------------------------------------------|
 16. | 00036020   2010   39.55574   34.02036   8.709065   17.71484 |
 17. | 00036020   2011   21.50586   17.19429   46.27998   15.01987 |
 18. | 00036020   2012    52.9045   11.12552   11.89774   24.07223 |
 19. | 00036020   2013   12.02968   14.62235   16.48395   56.86403 |
 20. | 00036020   2014   31.94392   42.19098   8.426245   17.43885 |
     +-------------------------------------------------------------+
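
Before using the panel, a quick sanity check may be worthwhile. In the rows listed above, the four components add up to 100 up to rounding, so the sketch below assumes the same should hold for every stock-year. The variable `total` and the tolerance of 0.01 are my own illustrative choices and are not part of the original code.

```stata
* Sanity check (a sketch, not part of the original replication code)
use panelbrogaard, clear

* Sum the four variance shares within each stock-year
egen double total = rowtotal(mktinfo privteinfo publicinfo noise)
summarize total

* Count stock-years whose shares deviate from 100 by more than an assumed tolerance
count if abs(total - 100) > 0.01

* Confirm that cusip and year jointly identify each observation
isid cusip year
```

If `count` returns a nonzero number, or `isid` exits with an error, it is probably worth revisiting the stock-year VAR estimates before merging this panel with other data.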

Conclusion

In this blog, I replicated the stock volatility decomposition method of Brogaard et al. (2022, RFS). Given the potential of this information-based decomposition, which I discussed in my earlier blog "Theory for the information-based decomposition of stock price", I hope this walkthrough helps readers apply the method in their own research.
