Holy grail or holy fail?

So, continuing fun with principal components, I decided to look at measures of GLD volatility. Based on some work by our friend Biosci, I looked at the standard deviation of day-to-day changes over time. But over how much time? If we limit ourselves to popular Fib numbers, what's the most meaningful time interval over which to look at standard deviation of %change in price?

Well, that's the fun part of PCA. I just throw 'em all in and let PCA sort 'em out (by extracting the most important info).
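For anyone who wants to play along, here is a minimal sketch of the "throw 'em all in" step. It is not the exact code behind the charts: the price series is a synthetic stand-in (an assumption, not GLD data), and the helper names `rolling_std` and `pca` are made up for illustration. It builds the 89, 144, and 233 day standard deviations of daily %change and runs a plain covariance-based PCA on them.

```python
import numpy as np

def rolling_std(x, window):
    """Sample standard deviation of x over each trailing window (NaN until the window fills)."""
    out = np.full(len(x), np.nan)
    for i in range(window - 1, len(x)):
        out[i] = x[i - window + 1 : i + 1].std(ddof=1)
    return out

def pca(X):
    """PCA via eigendecomposition of the covariance matrix of column-centered X."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))  # ascending order
    order = np.argsort(eigvals)[::-1]                            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                                        # each date's PC1, PC2, ...
    return scores, eigvecs, eigvals / eigvals.sum()

# Synthetic stand-in for daily closes (assumption: NOT real GLD prices).
rng = np.random.default_rng(0)
n = 1500
vol = 0.01 * (1 + 0.5 * np.sin(np.linspace(0, 12, n)))  # slowly changing volatility
prices = 100 * np.cumprod(1 + rng.normal(0, vol))

pct_change = np.diff(prices) / prices[:-1]
windows = (89, 144, 233)
features = np.column_stack([rolling_std(pct_change, w) for w in windows])
features = features[~np.isnan(features).any(axis=1)]  # keep dates where all windows are full

scores, loadings, explained = pca(features)
print("share of variance per PC:", np.round(explained, 3))
print("loadings (rows = 89/144/233 day stdev, columns = PCs):")
print(np.round(loadings, 3))
```

The columns of `loadings` show how much each standard deviation contributes to each PC, and `explained` shows how much information is lost by dropping the last axis.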

To allow visualization, let's limit ourselves for now to three dimensions (axes = 233 day, 144 day, and 89 day standard deviation). We can first graph all our data points, with the colors of the points representing the date (using a rainbow gradient, where blue is oldest and red is most recent).

Now our three axes, representing each date's 233, 144, and 89 day standard deviation, create a 3D space that's mostly empty. Intuitively, we can reduce the above graph to 2 dimensions without losing much info (think of a "plane of best fit" going across the above graph diagonally):

With PCA, it's up to you to interpret what each new axis represents (in this case, some aspect of volatility extracted from the 3 standard deviation measures we began with). If we imagine pulling every point in the above graph down onto the horizontal axis, we get the most important relative measure of volatility for each date. If we pull every point to the left, onto the vertical axis, we get the second most important relative measure of volatility for each date. Importantly, the second most important measure of volatility is uncorrelated with the first:

Uncorrelated (by definition), but as we see below when we order the PCs by date, not independent!

PC1 appears to predict what's going to happen to PC2 six months in advance. This general pattern doesn't change if I increase dimensionality from 3 (used here for visualization) to many more by including lots of other Fibonacci standard deviations, or even throwing in other measures of volatility, like the average magnitude of day-over-day %changes (and not the standard deviation thereof).

I'm not sure what PC2 is measuring, but if we consider PC1 a "fundamental" measure of volatility, once it goes negative, it appears to be good for the gold price (and signal an end to a correction) whenever PC2 follows it into negative territory.

As an aside, also note that PC1 is at an all-time low. Has a bubble ever ended with such low volatility?


Anonymous said...

This looks like an explanation of Science Offer Spock's three dimensional chess game.

I'll get my coat.

Anonymous said...

Science Officer Spock. Sorry. How do you edit comments?

Louis Cypher said...

Unfortunately we can't edit comments. Although that could be a lot of fun when people like brother John show up here.

Biosci said...


You might try a singular value decomposition (SVD) instead of a PCA on your matrix. They're mathematically equivalent -- both eigenvalue decompositions of the covariance matrix -- but the SVD returns orthonormal vectors as PCs instead of the scaled versions it looks like you're seeing here. Also the right-vector space can tell you exactly which data components (i.e. which DMAs or their stdevs) contribute to which principal component.

One catch is that some implementations (matlab, I'm looking at you) treat the data differently: PCA will automatically mean-center each column (variable) but SVD will not; you have to do it manually.
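A quick NumPy sketch of that point, on made-up random data (the matrix below is an assumption for illustration, not the volatility data from the post): once the columns are centered by hand, the SVD and the eigendecomposition of the covariance matrix give the same variances and the same axes up to sign. Skip the centering and the SVD answer changes.

```python
import numpy as np

# Made-up correlated data with a non-zero mean (illustration only).
rng = np.random.default_rng(2)
mix = np.array([[1.0, 0.8, 0.6],
                [0.0, 0.5, 0.4],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mix + 5.0

Xc = X - X.mean(axis=0)  # manual mean-centering of each column

# Route 1: eigendecomposition of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # largest first

# Route 2: SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = s**2 / (len(Xc) - 1)  # singular values -> variances

same_variances = np.allclose(eigvals, svd_vals)
same_axes = np.allclose(np.abs(eigvecs), np.abs(Vt.T))  # equal up to sign

# Forgetting to center: the first singular vector mostly chases the mean.
_, s_raw, _ = np.linalg.svd(X, full_matrices=False)
uncentered_differs = not np.allclose(s_raw**2 / (len(X) - 1), eigvals)

print(same_variances, same_axes, uncentered_differs)
```

The rows of `Vt` are the right singular vectors, i.e. the weights telling you which input columns feed each principal component.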

Details offline if you're interested.

Anonymous said...


An excellent article on Bull Market Thinking (dot) com about the unforeseen problems of Bitcoin and, unfortunately, why Silver will never regain its status as "real money".


hiptwist said...

I don't get why PC1 is uncorrelated to PC2:

PC1 is a composite of st89 and st144, PC2 a composite of st233 and st89. As st89 contains information also used in st144 and st233, and st144 contains additional information used in st233, they should be somehow related (correlated?!).

It's like a complicated version of examining 3 MAs with different lookbacks for the same underlying data.

Biosci said...


You have almost answered your own question. The two sets may not be Pearson correlated, but if you phase shift one component they will probably line up. Technically this is a cross-correlation. If you're playing with the data, I'm curious as to whether this gives a signal...
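A rough sketch of that phase-shift check, for anyone playing with the data. The function names `lagged_corr` and `best_lag` are made up here, and the two series are toy white noise with a built-in 126 day lead (an assumption chosen so the peak is unambiguous). Real PC series are strongly autocorrelated, so expect a much broader peak.

```python
import numpy as np

def lagged_corr(a, b, lag):
    """Pearson correlation between a[t] and b[t + lag]; lag > 0 means a leads b."""
    if lag > 0:
        a, b = a[:-lag], b[lag:]
    elif lag < 0:
        a, b = a[-lag:], b[:lag]
    return np.corrcoef(a, b)[0, 1]

def best_lag(a, b, max_lag):
    """Scan lags in [-max_lag, max_lag] and return the one with the largest |correlation|."""
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = np.array([lagged_corr(a, b, int(k)) for k in lags])
    return int(lags[np.argmax(np.abs(corrs))]), lags, corrs

# Toy data (assumption): b repeats a about six months (126 trading days) later, plus noise.
rng = np.random.default_rng(1)
shift, n = 126, 2000
base = rng.normal(size=n + shift)
a = base[shift:]                             # a[t] = base[t + shift]
b = base[:-shift] + 0.1 * rng.normal(size=n)  # b[t + shift] is a noisy copy of a[t]

lag, lags, corrs = best_lag(a, b, max_lag=250)
print("zero-lag correlation:", round(lagged_corr(a, b, 0), 3))
print("lag with strongest correlation:", lag)
```

With real PC1 and PC2 in place of `a` and `b`, the zero-lag correlation is zero by construction, so any signal has to show up at a non-zero lag.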

GM Jenkins said...

Ah just saw this question - thanks Biosci.
Independent implies uncorrelated, but uncorrelated does not imply independent -- the textbook example is if you have two variables x and y, with x symmetric around zero and y=x^2, then obviously x and y are not independent -- but they *are* uncorrelated because there's no linear relationship (think of trying to draw a line of best fit through points that make a parabola - it would be flat).
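The parabola example takes a few lines to check numerically. This is just a sketch with evenly spaced x values (my choice for illustration); note the flat line of best fit only appears when x is symmetric around zero.

```python
import numpy as np

# x symmetric around zero: y is fully determined by x, yet the correlation vanishes.
x = np.linspace(-1, 1, 201)
y = x**2
r = np.corrcoef(x, y)[0, 1]

# Same relationship, but x only on the positive side: now a line fits quite well.
x_pos = np.linspace(0, 2, 201)
r_pos = np.corrcoef(x_pos, x_pos**2)[0, 1]

print("symmetric x:", round(abs(r), 6))
print("positive x only:", round(r_pos, 3))
```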

Incidentally, screwtape readers may be interested to learn that we are currently in a bidding war for the services of Biosci. Just waiting for Jeanne d'Arc to give the ok-- she may be holding out on account of her suspicion that we're packing our staff, FDR-style, with gold bulls. Luckily he hasn't left much of a paper trail, so I think the prospects are good... Stay tuned.

Btw, hiptwist, you're right that there's some composite action going on - from the PC loadings, it appears that PC1 captures essentially what the 144 day is measuring, while the second PC captures whatever difference there is between the 233 and 89 (independent of what they share with the 144).

Anonymous said...

Nowt to do with me, GM. I sent an invite to Biosci ages ago, at your request. I think it timed out...

That's the trouble with gold bulls. They're just not quick enough off the mark... ;-)

Let me know if you want me to send another invitation to post...

Biosci said...

Am I a gold bull? I've made far more money shorting it. Though still not very much money at all, which is one reason I keep this pesky day job, which interferes mightily with my blogging time.