Wednesday, August 31, 2011
Prayer to action
Based on the triplet found in Reinhold Niebuhr's Serenity Prayer:
Prayer to action
God, grant me the clarity to see,
the boldness to walk,
and the discipline to stay the right path.
Tuesday, August 30, 2011
4 stages of competence
Condensed, from Wikipedia here:
The Four Stages
Unconscious Incompetence
You don't know you suck.
Conscious Incompetence
You know you suck, but also where you suck. (I.e., where to improve).
Conscious Competence
You've got something going, but it takes an aneurism and a half to get things done right.
Unconscious Competence
Your awesomeness is practically in autodrive.
Monday, August 29, 2011
Ggplot2 Panels
I've been trying for some time now to use R to plot each column of my data in its own panel. The columns all share the same time frame, of course, so I wanted the panels lined up on a common X-axis.
ggplot2 has a "facet" grammar you can use to plot different aspects of your data in a meaningful way -- i.e., different parts of an observation. The trick is to contort your data with the "melt" function, which replaces the melted columns with two new columns.
The first, "variable", holds the name of one of the melted columns.
The second, "value", holds the value that corresponds to said variable.
I tried taking a peek at the data length -- the number of rows should increase by a factor equal to the number of columns you are melting, e.g. if I'm melting 3 variables:
>>> melted.data = melt(factor.data, measure.vars=c("variable_one", "variable_two", "variable_three"))
>>> length(factor.data$val)
85
>>> length(melted.data$val)
340
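The data length is multiplied by the number of melted columns (the jump from 85 to 340 above is a factor of 4, which is what melting four columns gives, not three). But of course.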
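To make the row-count arithmetic concrete, here is a minimal sketch of the melt idea in plain Python rather than R. It is not the reshape implementation; the `melt` helper and the sample rows are made up for illustration.

```python
def melt(rows, measure_vars):
    """Stack the measured columns: one long row per (original row, measured column)."""
    melted = []
    for var in measure_vars:
        for row in rows:
            # Keep the identifier columns (everything that is not being melted).
            long_row = {k: v for k, v in row.items() if k not in measure_vars}
            long_row["variable"] = var       # name of the melted column
            long_row["value"] = row[var]     # value that column held
            melted.append(long_row)
    return melted


# Hypothetical wide data: one row per date, one column per factor.
wide_rows = [
    {"date": "2011-08-01", "val": 0.1, "momentum": 0.2, "size": -0.3},
    {"date": "2011-08-02", "val": 0.0, "momentum": 0.1, "size": -0.2},
]

long_rows = melt(wide_rows, ["val", "momentum", "size"])
print(len(wide_rows), len(long_rows))
```

Melting three columns triples the number of rows, and each long row carries exactly the "variable" and "value" pair that the facet plot needs.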
Here's the code I'm using to format, melt, plot, and save the data to pdf:
library(ggplot2)
library(reshape2)  # melt() lives here (or in the older reshape package)

# Read a comma-separated file that has a header row.
process.open <- function(filename) {
  return(read.csv(file=filename, header=TRUE, sep=",", dec="."))
}

filename <- "z1_output_quint.csv"
output <- "z1_output_quint_panel.pdf"
factor.data <- process.open(filename)
factor.data$date <- as.Date(as.character(factor.data$date), format="%Y%m%d")

# Rename "value" to "val"; melt() creates its own "value" column.
factor.data$val <- factor.data$value
factor.data$value <- NULL

vars <- c("val", "momentum", "size")
vars_2 <- c("dividend_yield", "profitability", "growth")
vars_3 <- c("earnings_variability", "trading_activity", "volatility", "leverage")

# One pdf page per group of factors, one panel per factor.
# print() makes each plot render even when the script is run with source().
pdf(output)
data <- melt(factor.data, measure.vars=vars)
print(ggplot(data, aes(date)) + geom_hline(yintercept=0) +
  geom_line(aes(y=value)) +
  facet_grid(variable ~ .))
data <- melt(factor.data, measure.vars=vars_2)
print(ggplot(data, aes(date)) + geom_hline(yintercept=0) +
  geom_line(aes(y=value)) +
  facet_grid(variable ~ .))
data <- melt(factor.data, measure.vars=vars_3)
print(ggplot(data, aes(date)) + geom_hline(yintercept=0) +
  geom_line(aes(y=value)) +
  facet_grid(variable ~ .))
dev.off()
Sunday, August 28, 2011
Extend vs Append
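Could be useful: append adds its argument as a single element (here, a nested list), while extend adds each item of its argument.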
>>> x = [1,2,3]
>>> y = [4,5,6]
>>> x.append(y)
>>> x
[1, 2, 3, [4, 5, 6]]
>>> x.remove(y)
>>> x
[1, 2, 3]
>>> x.extend(y)
>>> x
[1, 2, 3, 4, 5, 6]
>>>
Friday, August 26, 2011
Words from a Quant
The Director of Quantitative Research at my firm offered wonderful advice on what it takes to understand macro trends in finance:
1) A strong (formal) economic background
2) Great market experience
3) Patience and intelligence
One thing I asked him about was physical fitness and rest. He said that because of the high level of concentrated study, adequate sleep and physical training (he did weightlifting) were necessary. I will take these words to heart when I go to study for my MSCS next year.
As an added bonus, he said that being the opposite of "scatter-brained" was one of the important qualities of a Ph.D. candidate. I think it is definitely something for a short-attention-span person like me to think about.
Thursday, August 25, 2011
A stitch in time
Saves nine.
Was chasing down a bug in my factor model for a week. Went through a lot of stuff.
I decided to look at one day that seemed out of whack. Surely, if this simple calculation was wrong, it must be the data. Otherwise, I would have to keep looking in my own code.
Surprise surprise, the period I was looking at did not match up with my expectations. When I inspected the FactSet code with the resident quant, I instantly saw the problem...
I was using
P_TOTAL_RETURNC(0/0/-2,0/0/-1) rather than P_TOTAL_RETURNC(0/-2/0,0/-1/0)
I was grabbing yearly returns, instead of monthly!!!
Lesson: Be more thorough, use methods to narrow in on the problem.
Wednesday, August 24, 2011
Unit Tests and Python floats
I realized my downfall with using Python's easy dynamic typing: unit tests.
Since I use ints to simplify things, I forget that special "." that makes 1./3 different from 1/3 in Python 2 (hint: the difference is a third).
Which really means: remember to use "." anytime you don't specifically mean an integer!
import unittest

import process  # module under test; provides calculateWeights


class TestNormalizeFunction(unittest.TestCase):
    def setUp(self):
        ## Setup for weight test
        self.dateList2 = [1,1,1,2,2,2]
        self.countryList2 = ["us","us","mx","us","us","mx"]
        self.list2 = [1.,3.,3.,1.,3.,3.] ## <--- NOT THE SAME AS [1,3,3,1,3,3]!

    def test_weights(self):
        ## Each value is weighted within its (date, country) group
        self.assertEqual(process.calculateWeights(
            self.dateList2, self.countryList2, self.list2),
            [self.list2[0] / sum(self.list2[:2]),
             self.list2[1] / sum(self.list2[:2]),
             1.,
             self.list2[3] / sum(self.list2[3:5]),
             self.list2[4] / sum(self.list2[3:5]),
             1.])


if __name__ == "__main__":
    unittest.main()
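For reference, here is a minimal sketch of the pitfall and one way to sidestep it. The `calculate_weights` function below is a hypothetical stand-in, not my actual `process.calculateWeights`; it assumes the weights are each value's share of its (date, country) group, which is what the test above expects. Note that `//` is used to reproduce what Python 2's `1/3` does with ints; in Python 3, `/` always returns a float.

```python
from collections import defaultdict

# Integer (floor) division throws away the fraction; float division keeps it.
truncated = 1 // 3   # what Python 2 computes for 1/3 with two ints
exact = 1. / 3       # the "." forces float division


def calculate_weights(dates, countries, values):
    """Hypothetical stand-in: each value's share of its (date, country) group total."""
    totals = defaultdict(float)  # float totals, so int inputs cannot truncate
    for d, c, v in zip(dates, countries, values):
        totals[(d, c)] += v
    return [v / totals[(d, c)] for d, c, v in zip(dates, countries, values)]


# Deliberately passing ints here: the float accumulator keeps the result correct.
weights = calculate_weights(
    [1, 1, 1, 2, 2, 2],
    ["us", "us", "mx", "us", "us", "mx"],
    [1, 3, 3, 1, 3, 3],
)
print(truncated, exact, weights)
```

Accumulating the group totals as floats means the function gives the same answer whether the caller passes `[1,3,3]` or `[1.,3.,3.]`, so the test no longer depends on remembering the ".".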
Monday, August 22, 2011
The Basel Multiplier
Chebyshev's Inequality says that, for any distribution with mean $\mu$ and finite standard deviation $\sigma$:
$P(|X - \mu| \geq k\sigma) \leq 1/k^2$
Suppose we wanted the highest VaR at the 99 percent confidence level.
That means only 1 percent of samples taken should have losses that exceed this level. Without making any assumptions about the distribution, Chebyshev's inequality gives $1/k^2 = 0.01$, so k = 10 at the 1% tail end.
Granted, that's pretty big, but if we assume the distribution is symmetric, then only half of that probability sits in the loss tail, so we can halve the right-hand side of the inequality: $1/(2k^2) = 0.01$, and the new k becomes about 7.
If a financial institution had instead assumed a normal distribution, the k value for which VaR is not exceeded with 99% confidence is about 2.33. Dividing our earlier k (=7) by this k value, we get ~3. Thus the correction multiplier is about 3.
Example taken from Jorion
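The arithmetic above can be checked with a short sketch. The variable names are my own; the only inputs are the 1% tail probability and the standard normal quantile from Python's `statistics` module.

```python
import math
from statistics import NormalDist

tail_prob = 0.01  # at 99% confidence, 1% of outcomes may exceed VaR

# Chebyshev, no distributional assumption: 1/k^2 = tail_prob
k_general = math.sqrt(1 / tail_prob)

# Symmetric distribution: only half the bound sits in the loss tail, 1/(2k^2) = tail_prob
k_symmetric = math.sqrt(1 / (2 * tail_prob))

# Normal distribution: one-tailed 99% quantile of the standard normal
k_normal = NormalDist().inv_cdf(1 - tail_prob)

multiplier = k_symmetric / k_normal
print(round(k_general, 2), round(k_symmetric, 2), round(k_normal, 2), round(multiplier, 2))
```

The ratio of the symmetric Chebyshev k to the normal k comes out just above 3, which matches the multiplier.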
Tail events
There are two ways to measure VaR (Value at Risk):
- Finding quantiles using empirical data
- Matching a parametric distribution to data
ETL (Expected Tail Loss) averages the losses beyond the VaR threshold, which gives a better perspective on tail risk outcomes.
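A rough sketch of the three measures, assuming a plain list of periodic returns. The function names, the made-up return series, and the choice of a normal distribution for the parametric case are all my own assumptions, and quantile conventions vary between textbooks.

```python
import statistics


def tail_count(n, confidence):
    """Number of worst observations that make up the tail (at least one)."""
    return max(1, round((1 - confidence) * n))


def historical_var(returns, confidence=0.95):
    """Empirical VaR: loss of the least bad observation inside the tail."""
    ordered = sorted(returns)
    return -ordered[tail_count(len(ordered), confidence) - 1]


def parametric_var(returns, confidence=0.95):
    """Parametric VaR, assuming returns are normally distributed."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    z = statistics.NormalDist().inv_cdf(confidence)
    return z * sigma - mu


def expected_tail_loss(returns, confidence=0.95):
    """ETL: average loss across all observations in the tail."""
    ordered = sorted(returns)
    tail = ordered[:tail_count(len(ordered), confidence)]
    return -statistics.mean(tail)


# Made-up data: 100 evenly spaced returns from -5.0% to +4.9%.
returns = [(i - 50) / 1000 for i in range(100)]

hist = historical_var(returns)
param = parametric_var(returns)
etl = expected_tail_loss(returns)
print(hist, param, etl)
```

Because ETL averages everything at or beyond the cutoff, it is never smaller than the empirical VaR computed from the same tail.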
There are still drawbacks to using empirical data.
"The most powerful statistical techniques cannot make short histories reveal once-in-a-lifetime events." (Jorion, Chapter 5)
For this, we move on to Stress Testing.