In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import edhec_risk_kit_206 as erk

Retrieve the monthly returns for 49 industries from 1960 to 2018

In [2]:
df = erk.get_ind_returns(n_inds=49)['1960':]

Annualized returns

Annualizing returns allows us to compare returns over different lengths of time. If you are annualizing a return for an investment that lasted less than a year, you can think of the annualized return as the total return you would have earned had that rate of return continued for a full year.

$r_a = (1 + r_{\text{total}})^{P/n} - 1$

where $P$ is the number of periods per year and $n$ is the number of periods over which $r_{\text{total}}$ was earned.

If we have a set of 25 monthly returns, $P=12$ and $n=25$

If we have a set of 3 quarterly returns, $P=4$ and $n=3$

If we have an investment that returned 9% over 2 years, $P=1$ and $n=2$

For example, if you have an investment that returned 0.5% over one month, the annualized return would be equal to $(1 + 0.005)^{12} - 1 = 6.17$%.

If you have an investment that returned 1% over two months, the annualized return would be equal to $(1 + 0.01)^{12/2} - 1 = 6.15$%.

If you have an investment that returned 15% over 3 years, the annualized return would be equal to $(1 + 0.15)^{1/3} - 1 = 4.77$%.

If you have an investment that returned 15% over 28 months, the annualized return would be equal to $(1 + 0.15)^{12/28} - 1 = 6.17$%.
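The four worked examples above can be checked numerically. The following is a minimal sketch using only plain Python; the helper name `annualized_return` is hypothetical and is not part of `edhec_risk_kit_206`.

```python
def annualized_return(total_return, periods_per_year, n_periods):
    """Annualize a total return that was earned over n_periods."""
    return (1 + total_return) ** (periods_per_year / n_periods) - 1

# (total return, periods per year P, number of periods n)
one_month = annualized_return(0.005, 12, 1)
two_months = annualized_return(0.01, 12, 2)
three_years = annualized_return(0.15, 1, 3)
twenty_eight_months = annualized_return(0.15, 12, 28)

print(f"0.5% over 1 month:   {one_month:.2%}")            # 6.17%
print(f"1% over 2 months:    {two_months:.2%}")           # 6.15%
print(f"15% over 3 years:    {three_years:.2%}")          # 4.77%
print(f"15% over 28 months:  {twenty_eight_months:.2%}")  # 6.17%
```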

def annualize_rets(r, periods_per_year):
    """
    Annualizes a set of returns
    """
    compounded_growth = (1+r).prod()
    n_periods = r.shape[0]
    return compounded_growth**(periods_per_year/n_periods)-1

Consider a portfolio holding 100% Beer industry stocks.

Calculate the annualized return.

In [3]:
erk.annualize_rets(df['Beer'], periods_per_year=12)
Out[3]:
0.11920634382896678

Annualized volatility

$\sigma_a = \sigma * \sqrt{P}$

Where $\sigma$ is the standard deviation of the returns and $P$ is the number of periods per year.

$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N-1}}$

Here $N$ is the number of observed returns $x_i$ and $\mu$ is their mean.

def annualize_vol(r, periods_per_year):
    """
    Annualizes the vol of a set of returns
    """
    return r.std()*(periods_per_year**0.5)
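As a rough illustration of the two formulas above, the sketch below computes the sample standard deviation by hand (with the $N-1$ denominator, which is also the default of pandas `.std()`) and scales it by $\sqrt{P}$. The six monthly returns are made-up values chosen only for the example.

```python
import math
import statistics

# Hypothetical monthly returns, for illustration only
monthly_returns = [0.02, -0.01, 0.03, -0.02, 0.01, 0.00]
periods_per_year = 12

n = len(monthly_returns)
mean = sum(monthly_returns) / n

# Sample standard deviation: squared deviations divided by N - 1
squared_devs = sum((x - mean) ** 2 for x in monthly_returns)
sample_std = math.sqrt(squared_devs / (n - 1))

# The standard library's stdev uses the same N - 1 convention
library_std = statistics.stdev(monthly_returns)

# Annualize by multiplying by the square root of periods per year
annualized_vol = sample_std * math.sqrt(periods_per_year)

print(f"monthly std:    {sample_std:.4f}")
print(f"annualized vol: {annualized_vol:.4f}")
```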

Calculate the annualized volatility of the returns

In [4]:
erk.annualize_vol(df['Beer'], periods_per_year=12)
Out[4]:
0.17519063591781509

Drawdown

Drawdown at a given time is the difference between the portfolio's historical maximum value up to that time and its current value, expressed as a fraction of that maximum.

$\text{Drawdown at time T} = \frac{(\max \limits_{t\in [0,T]} X(t) ) - X(T)}{\max\limits_{t\in [0,T]} X(t)}$

The drawdown function below reports this quantity with a negative sign, so a 20% fall from the peak appears as -0.2.

def drawdown(return_series: pd.Series):
    """Takes a time series of asset returns.
       returns a DataFrame with columns for
       the wealth index, 
       the previous peaks, and 
       the percentage drawdown
    """
    wealth_index = 1000*(1+return_series).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdowns = (wealth_index - previous_peaks)/previous_peaks
    return pd.DataFrame({"Wealth": wealth_index, 
                         "Previous Peak": previous_peaks, 
                         "Drawdown": drawdowns})
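To make the three columns concrete, here is a small sketch that repeats the same steps (wealth index, running peak, percentage drawdown) on four made-up returns, using plain Python instead of pandas.

```python
from itertools import accumulate

# Hypothetical period returns, for illustration only
returns = [0.10, -0.20, 0.05, 0.30]

# Wealth index starting from 1000, compounding each return
wealth = list(accumulate(returns, lambda w, r: w * (1 + r), initial=1000.0))[1:]

# Running maximum of the wealth index (the "previous peak")
peaks = list(accumulate(wealth, max))

# Drawdown as a negative fraction of the previous peak
drawdowns = [(w - p) / p for w, p in zip(wealth, peaks)]

# Maximum drawdown reported as a positive loss
max_drawdown = -min(drawdowns)

for w, p, d in zip(wealth, peaks, drawdowns):
    print(f"wealth={w:8.2f}  peak={p:8.2f}  drawdown={d:6.2%}")
print(f"max drawdown: {max_drawdown:.2%}")
```

The wealth index falls from 1100 to 880 in the second period, so the deepest drawdown is 20%, and it returns to zero once a new peak is reached in the last period.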

Maximum drawdown

The maximum drawdown is the largest peak-to-trough loss over the course of an investment.

In [5]:
erk.drawdown(df['Beer'])['Drawdown'].min()*-1
Out[5]:
0.5855330578251526

In which month did the maximum drawdown occur?

In [6]:
erk.drawdown(df['Beer'])['Drawdown'].idxmin()
Out[6]:
Period('1974-12', 'M')

Semideviation

Semideviation is the standard deviation among the subset of negative asset returns.

def semideviation(r):
    """
    Returns the semideviation aka negative semideviation of r
    r must be a Series or a DataFrame, else raises a TypeError
    """
    if isinstance(r, pd.Series):
        is_negative = r < 0
        return r[is_negative].std(ddof=0)
    elif isinstance(r, pd.DataFrame):
        return r.aggregate(semideviation)
    else:
        raise TypeError("Expected r to be a Series or DataFrame")
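A tiny worked example may help here. The sketch below keeps only the negative returns from a made-up sample and takes their population standard deviation, mirroring the `ddof=0` choice in the function above.

```python
import statistics

# Hypothetical returns, for illustration only
returns = [0.02, -0.01, 0.03, -0.03, 0.01]

# Keep only the negative returns
negative_returns = [r for r in returns if r < 0]

# Population standard deviation (divides by N, like ddof=0 in pandas)
semidev = statistics.pstdev(negative_returns)

print(negative_returns)
print(f"semideviation: {semidev:.4f}")
```

The two negative returns, -0.01 and -0.03, each sit 0.01 away from their mean of -0.02, so the semideviation is 0.01.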

What was the semideviation of the return distribution?

In [7]:
erk.semideviation(df['Beer'])
Out[7]:
0.03417051008448179

Conditional Value at Risk (CVaR)

The conditional value at risk (CVaR) is the expected loss given that the return is at or below the negative of the value at risk (VaR).

$\text{CVaR} = -\text{E}(R|R \leq - \text{VaR})$

$= \frac{-\int_{-\infty}^{-\text{VaR}}x * f_R(x)dx}{F_R(-\text{VaR})}$

def cvar_historic(r, level=5):
    """
    Computes the Conditional VaR of Series or DataFrame
    """
    if isinstance(r, pd.Series):
        is_beyond = r <= -var_historic(r, level=level)
        return -r[is_beyond].mean()
    elif isinstance(r, pd.DataFrame):
        return r.aggregate(cvar_historic, level=level)
    else:
        raise TypeError("Expected r to be a Series or DataFrame")
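The function above relies on `var_historic`, which is not shown in this notebook. The sketch below therefore assumes that historic VaR is the negative of the empirical `level`-th percentile of returns; that assumption, the `percentile` helper, and the sample data are illustrative and may differ from the library's implementation.

```python
import math

def percentile(values, q):
    """Empirical q-th percentile, interpolating linearly between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)

# Hypothetical returns, for illustration only
returns = [-0.08, -0.05, -0.03, -0.01, 0.00, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05]
level = 20  # look at the worst 20% of outcomes

# Assumed definition: VaR is the negated level-th percentile
historic_var = -percentile(returns, level)

# CVaR: average loss over the returns at or below -VaR
tail = [r for r in returns if r <= -historic_var]
historic_cvar = -sum(tail) / len(tail)

print(f"VaR  ({level}%): {historic_var:.4f}")
print(f"CVaR ({level}%): {historic_cvar:.4f}")
```

CVaR is never smaller than VaR at the same level, because it averages the losses in the tail rather than reporting only the tail's boundary.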
In [8]:
erk.cvar_historic(df['Beer'], level=5)
Out[8]:
0.10638611111111114

Sharpe Ratio

The Sharpe ratio for an investment is its excess return divided by the standard deviation of its returns.

Sharpe ratio = $\frac{r_a - r_f}{\sigma_a}$

def sharpe_ratio(r, riskfree_rate, periods_per_year):
    """
    Computes the annualized sharpe ratio of a set of returns
    """
    # convert the annual riskfree rate to per period
    rf_per_period = (1+riskfree_rate)**(1/periods_per_year)-1
    excess_ret = r - rf_per_period
    ann_ex_ret = annualize_rets(excess_ret, periods_per_year)
    ann_vol = annualize_vol(r, periods_per_year)
    return ann_ex_ret/ann_vol
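Note that the function above annualizes the per-period excess returns rather than subtracting the annual risk-free rate from the annualized return. The following sketch walks through the same steps on twelve made-up monthly returns using only the standard library.

```python
import math
import statistics

# Hypothetical monthly returns, for illustration only
monthly_returns = [0.02, -0.01, 0.03, -0.02, 0.01, 0.00,
                   0.02, 0.01, -0.01, 0.03, 0.00, 0.01]
annual_riskfree_rate = 0.02
periods_per_year = 12

# Convert the annual risk-free rate to a per-period rate
rf_per_period = (1 + annual_riskfree_rate) ** (1 / periods_per_year) - 1

# Per-period excess returns over the risk-free rate
excess = [r - rf_per_period for r in monthly_returns]

# Annualize the compounded excess return
growth = math.prod(1 + x for x in excess)
ann_excess_return = growth ** (periods_per_year / len(excess)) - 1

# Annualize the volatility of the raw returns
ann_vol = statistics.stdev(monthly_returns) * math.sqrt(periods_per_year)

sharpe = ann_excess_return / ann_vol
print(f"Sharpe ratio: {sharpe:.3f}")
```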
In [9]:
erk.sharpe_ratio(df['Beer'], riskfree_rate=0.02, periods_per_year=12)
Out[9]:
0.5559722618022646

Skewness

Skewness measures the asymmetry of a probability distribution. It is the third standardized moment of a random variable.

$\text{Skewness}[X] = \text{E}[(\frac{X-\mu}{\sigma})^3] = \frac{\text{E}[(X - \mu)^3]}{(\text{E}[(X - \mu)^2])^{3/2}}$

Normal distributions have a skewness of 0.

Example of negative and positive skew:

[Figure: negative and positive skew]

def skewness(r):
    """
    Alternative to scipy.stats.skew()
    Computes the skewness of the supplied Series or DataFrame
    Returns a float or a Series
    """
    demeaned_r = r - r.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = r.std(ddof=0)
    exp = (demeaned_r**3).mean()
    return exp/sigma_r**3
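To see the sign convention in action, the sketch below applies the same formula to two made-up samples in plain Python: a symmetric one and one with a single large negative outlier. The function name `sample_skewness` is hypothetical.

```python
import statistics

def sample_skewness(values):
    """Third standardized moment, using the population standard deviation."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    third_moment = statistics.fmean([(x - mean) ** 3 for x in values])
    return third_moment / sigma ** 3

symmetric = [-2, -1, 0, 1, 2]
left_tailed = [-4, 1, 1, 1, 1]  # one large loss, many small gains

skew_symmetric = sample_skewness(symmetric)
skew_left_tailed = sample_skewness(left_tailed)

print(skew_symmetric)    # 0.0
print(skew_left_tailed)  # -1.5
```

A long left tail (rare large losses) gives a negative skewness, which is why negatively skewed return distributions are usually considered riskier than the volatility alone suggests.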

What is the skewness of this return distribution?

In [10]:
erk.skewness(df['Beer'])
Out[10]:
-0.010471512193777247

Which industry had the most negatively skewed return distribution?

In [11]:
print(erk.skewness(df).idxmin())
Meals

Kurtosis

Kurtosis is a measure of how heavy the tails of a probability distribution are. It is the fourth standardized moment of a random variable.

$\text{Kurtosis}[X] = \text{E}[(\frac{X-\mu}{\sigma})^4] = \frac{\text{E}[(X - \mu)^4]}{(\text{E}[(X - \mu)^2])^{2}}$

Normal distributions have a kurtosis of 3.

def kurtosis(r):
    """
    Alternative to scipy.stats.kurtosis()
    Computes the kurtosis of the supplied Series or DataFrame
    Returns a float or a Series
    """
    demeaned_r = r - r.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = r.std(ddof=0)
    exp = (demeaned_r**4).mean()
    return exp/sigma_r**4
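As a quick illustration of how kurtosis responds to extreme observations, the sketch below evaluates the same formula on two made-up samples. The function name `sample_kurtosis` is hypothetical, and the samples are far too small to say anything about real return data.

```python
import statistics

def sample_kurtosis(values):
    """Fourth standardized moment, using the population standard deviation."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    fourth_moment = statistics.fmean([(x - mean) ** 4 for x in values])
    return fourth_moment / sigma ** 4

evenly_spread = [-2, -1, 0, 1, 2]
occasional_extremes = [-3, 0, 0, 0, 0, 0, 0, 3]  # mostly quiet, two big moves

kurt_even = sample_kurtosis(evenly_spread)
kurt_extremes = sample_kurtosis(occasional_extremes)

print(f"{kurt_even:.2f}")      # 1.70, below the normal value of 3
print(f"{kurt_extremes:.2f}")  # 4.00, above the normal value of 3
```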

What is the kurtosis of this return distribution?

In [12]:
erk.kurtosis(df['Beer'])
Out[12]:
5.446251021883774

Which industry had a returns distribution with the largest kurtosis?

In [13]:
print(erk.kurtosis(df).idxmax())
RlEst

Test for normality

Test whether the skewness and kurtosis of a data sample are consistent with a normal distribution.

Calculate the Jarque-Bera test statistic, $JB = \frac{n}{6}(S^2 + \frac{1}{4}(K-3)^2)$, where $n$ is the number of observations, $S$ is the sample skewness and $K$ is the sample kurtosis, and choose a significance level to test at.

import scipy.stats

def is_normal(r, level=0.01):
    """
    Applies the Jarque-Bera test to determine if a Series is normal or not
    Test is applied at the 1% level by default
    Returns True if the hypothesis of normality is not rejected, False otherwise
    """
    if isinstance(r, pd.DataFrame):
        # pass the significance level through to each column
        return r.aggregate(is_normal, level=level)
    else:
        statistic, p_value = scipy.stats.jarque_bera(r)
        return p_value > level
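The JB formula can also be evaluated by hand. Under the null hypothesis the statistic is asymptotically chi-squared with 2 degrees of freedom, whose survival function is $e^{-x/2}$, so an approximate p-value needs no extra library. The sketch below uses hypothetical inputs and a hypothetical helper name; it illustrates the formula and is not a replacement for `scipy.stats.jarque_bera`.

```python
import math

def jarque_bera_from_moments(n, skew, kurt):
    """Return the JB statistic and its asymptotic chi-squared(2) p-value."""
    jb = n / 6 * (skew ** 2 + 0.25 * (kurt - 3) ** 2)
    p_value = math.exp(-jb / 2)  # survival function of chi-squared with 2 dof
    return jb, p_value

# Same skewness (0) and kurtosis (4), two hypothetical sample sizes
jb_small, p_small = jarque_bera_from_moments(n=8, skew=0.0, kurt=4.0)
jb_large, p_large = jarque_bera_from_moments(n=800, skew=0.0, kurt=4.0)

level = 0.01
print(f"n=8:   JB={jb_small:.3f}, p={p_small:.3f}, normal not rejected: {p_small > level}")
print(f"n=800: JB={jb_large:.3f}, p={p_large:.2e}, normal not rejected: {p_large > level}")
```

With the same excess kurtosis, the small sample does not reject normality while the large one does: the statistic grows linearly with $n$, so the test gains power as more observations become available.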

Do these monthly returns follow the normal distribution?

In [14]:
erk.is_normal(df['Beer'], level=0.01)
Out[14]:
False

Monthly parametric Gaussian VaR

from scipy.stats import norm

def var_gaussian(r, level=5, modified=False):
    """
    Returns the Parametric Gaussian VaR of a Series or DataFrame
    If "modified" is True, then the modified VaR is returned,
    using the Cornish-Fisher modification
    """
    # compute the Z score assuming it was Gaussian
    z = norm.ppf(level/100)
    if modified:
        # modify the Z score based on observed skewness and kurtosis
        s = skewness(r)
        k = kurtosis(r)
        z = (z +
                (z**2 - 1)*s/6 +
                (z**3 -3*z)*(k-3)/24 -
                (2*z**3 - 5*z)*(s**2)/36
            )
    return -(r.mean() + z*r.std(ddof=0))
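To show the effect of the Cornish-Fisher adjustment in isolation, the sketch below applies the same expansion to the 5% Gaussian quantile. The mean, volatility, skewness and kurtosis values are hypothetical, and the standard library's `NormalDist` stands in for `norm.ppf`.

```python
from statistics import NormalDist

def cornish_fisher_z(z, skew, kurt):
    """Adjust a Gaussian z-score for skewness and kurtosis."""
    return (z
            + (z ** 2 - 1) * skew / 6
            + (z ** 3 - 3 * z) * (kurt - 3) / 24
            - (2 * z ** 3 - 5 * z) * (skew ** 2) / 36)

# 5% quantile of the standard normal distribution
z = NormalDist().inv_cdf(0.05)

# With skewness 0 and kurtosis 3 the adjustment vanishes
z_normal = cornish_fisher_z(z, skew=0.0, kurt=3.0)

# Hypothetical negatively skewed, fat-tailed distribution
z_fat = cornish_fisher_z(z, skew=-0.5, kurt=5.0)

# Hypothetical monthly mean and volatility
mean, sigma = 0.01, 0.05
var_gauss = -(mean + z * sigma)
var_cf = -(mean + z_fat * sigma)

print(f"z={z:.4f}, adjusted z={z_fat:.4f}")
print(f"Gaussian VaR={var_gauss:.4f}, Cornish-Fisher VaR={var_cf:.4f}")
```

For these inputs the adjusted z-score is more negative than the Gaussian one, so the modified VaR is larger, in the same direction as the difference between the two Beer results reported below.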
In [15]:
erk.var_gaussian(df['Beer'], level=1)
Out[15]:
0.10686670592419344

Monthly parametric Gaussian VaR with Cornish-Fisher adjustment

In [16]:
erk.var_gaussian(df['Beer'], level=1, modified=True)
Out[16]:
0.13615629641279908

Compare returns between industries

Which industries had the highest and lowest annual returns?

In [17]:
ind_highest_ret = erk.annualize_rets(df, periods_per_year=12).idxmax()
ind_lowest_ret = erk.annualize_rets(df, periods_per_year=12).idxmin()
print('Industry with highest annualized return: {0}'.format(ind_highest_ret))
print('Industry with lowest annualized return: {0}'.format(ind_lowest_ret))
Industry with highest annualized return: Smoke
Industry with lowest annualized return: Softw
In [18]:
erk.annualize_rets(df, periods_per_year=12).sort_values(ascending=False).plot.bar()
plt.show()

Display summary stats of all the industries

In [19]:
erk.summary_stats(df)
Out[19]:
| | Annualized Return | Annualized Vol | Skewness | Kurtosis | Cornish-Fisher VaR (5%) | Historic CVaR (5%) | Sharpe Ratio | Max Drawdown |
|---|---|---|---|---|---|---|---|---|
| Agric | 0.083174 | 0.227038 | 0.023787 | 4.363072 | 0.096661 | 0.132867 | 0.227664 | -0.670659 |
| Food | 0.119149 | 0.152597 | 0.115850 | 4.983611 | 0.058799 | 0.088451 | 0.568450 | -0.431249 |
| Soda | 0.113355 | 0.216873 | 0.128246 | 7.327176 | 0.083630 | NaN | 0.382608 | -0.641812 |
| Beer | 0.119206 | 0.175191 | -0.010472 | 5.446251 | 0.070081 | 0.106386 | 0.495398 | -0.585533 |
| Smoke | 0.142585 | 0.210531 | -0.103832 | 5.398053 | 0.085722 | 0.125156 | 0.520244 | -0.598755 |
| Toys | 0.057655 | 0.253771 | -0.113638 | 4.188872 | 0.113619 | 0.152744 | 0.105735 | -0.736080 |
| Fun | 0.131444 | 0.263804 | -0.207285 | 5.760808 | 0.112079 | 0.168928 | 0.373933 | -0.801774 |
| Books | 0.089811 | 0.203934 | -0.020507 | 4.931956 | 0.085892 | 0.122044 | 0.285198 | -0.774217 |
| Hshld | 0.101864 | 0.162928 | -0.292402 | 4.694151 | 0.070307 | 0.101906 | 0.429105 | -0.574849 |
| Clths | 0.107625 | 0.219375 | -0.068181 | 5.607062 | 0.091421 | 0.130592 | 0.344121 | -0.762776 |
| Hlth | 0.069990 | 0.281037 | -0.047717 | 5.680109 | 0.119981 | NaN | 0.155862 | -0.918911 |
| MedEq | 0.127348 | 0.184678 | -0.275416 | 4.160521 | 0.079012 | 0.111506 | 0.512834 | -0.492337 |
| Drugs | 0.116519 | 0.174184 | 0.127077 | 5.479912 | 0.067826 | 0.100181 | 0.483245 | -0.481990 |
| Chems | 0.092195 | 0.189459 | -0.120480 | 5.265948 | 0.080380 | 0.111681 | 0.319275 | -0.583971 |
| Rubbr | 0.107496 | 0.207486 | -0.199289 | 5.539226 | 0.088381 | 0.129586 | 0.363263 | -0.648013 |
| Txtls | 0.091117 | 0.244169 | 0.392389 | 12.301449 | 0.084808 | 0.152728 | 0.243310 | -0.779751 |
| BldMt | 0.093464 | 0.209022 | -0.049712 | 6.996385 | 0.085866 | 0.129797 | 0.295249 | -0.665138 |
| Cnstr | 0.082722 | 0.247685 | -0.070153 | 3.855379 | 0.108504 | 0.142983 | 0.206857 | -0.711003 |
| Steel | 0.045484 | 0.251254 | -0.162875 | 5.230505 | 0.112896 | 0.156942 | 0.059654 | -0.758017 |
| FabPr | 0.055124 | 0.249383 | -0.098083 | 4.233008 | 0.111157 | NaN | 0.104979 | -0.690458 |
| Mach | 0.093557 | 0.211150 | -0.369449 | 5.430184 | 0.094078 | 0.126862 | 0.292690 | -0.633257 |
| ElcEq | 0.108011 | 0.214048 | -0.185237 | 4.583170 | 0.092302 | 0.124167 | 0.354455 | -0.593946 |
| Autos | 0.071905 | 0.228836 | 0.230954 | 9.082331 | 0.088097 | 0.136975 | 0.177937 | -0.779595 |
| Aero | 0.118798 | 0.228049 | -0.276653 | 4.674337 | 0.099480 | 0.136942 | 0.378695 | -0.752222 |
| Ships | 0.093981 | 0.244914 | -0.008068 | 4.690134 | 0.103950 | 0.143703 | 0.253947 | -0.675484 |
| Guns | 0.113285 | 0.226587 | -0.150289 | 5.003679 | 0.095932 | NaN | 0.365874 | -0.616820 |
| Gold | 0.049120 | 0.361926 | 0.746524 | 7.875271 | 0.128664 | NaN | 0.055956 | -0.774343 |
| Mines | 0.094162 | 0.253065 | -0.273353 | 4.901837 | 0.112611 | 0.150928 | 0.246433 | -0.668831 |
| Coal | 0.070662 | 0.350241 | 0.172015 | 5.175953 | 0.145936 | 0.212617 | 0.112539 | -0.973579 |
| Oil | 0.104085 | 0.186097 | 0.032862 | 4.250474 | 0.076726 | 0.110556 | 0.387243 | -0.492375 |
| Util | 0.096788 | 0.137050 | -0.119541 | 4.113629 | 0.056965 | 0.079622 | 0.474155 | -0.423764 |
| Telcm | 0.095363 | 0.159542 | -0.182987 | 4.157287 | 0.068313 | 0.098486 | 0.398556 | -0.717741 |
| PerSv | 0.059159 | 0.235580 | -0.153172 | 4.435327 | 0.105610 | 0.145814 | 0.120165 | -0.934697 |
| BusSv | 0.100435 | 0.193429 | -0.365137 | 5.353391 | 0.085200 | 0.119292 | 0.354170 | -0.664484 |
| Hardw | 0.092986 | 0.241226 | -0.197878 | 4.711917 | 0.106036 | 0.148506 | 0.253822 | -0.841621 |
| Softw | 0.038944 | 0.391038 | 0.824403 | 8.089024 | 0.136404 | NaN | 0.028968 | -0.994419 |
| Chips | 0.085819 | 0.252321 | -0.350978 | 4.651943 | 0.114807 | 0.160097 | 0.214979 | -0.844654 |
| LabEq | 0.102067 | 0.239016 | -0.192208 | 4.204502 | 0.104928 | 0.143597 | 0.293155 | -0.688579 |
| Paper | 0.099032 | 0.190371 | 0.105606 | 5.107522 | 0.076935 | 0.108422 | 0.352703 | -0.586680 |
| Boxes | 0.096132 | 0.192433 | -0.367518 | 4.949687 | 0.085546 | 0.122003 | 0.334239 | -0.584344 |
| Trans | 0.098112 | 0.198207 | -0.230970 | 4.219214 | 0.086864 | 0.118936 | 0.334213 | -0.570126 |
| Whlsl | 0.102002 | 0.192453 | -0.321769 | 5.195884 | 0.084144 | 0.118522 | 0.363894 | -0.626924 |
| Rtail | 0.115013 | 0.184761 | -0.186114 | 5.130936 | 0.077623 | 0.108631 | 0.447615 | -0.574016 |
| Meals | 0.113795 | 0.208563 | -0.441333 | 5.503921 | 0.092388 | 0.133078 | 0.390778 | -0.722291 |
| Banks | 0.098853 | 0.204399 | -0.271116 | 4.984177 | 0.089447 | 0.126464 | 0.327602 | -0.746194 |
| Insur | 0.106165 | 0.195484 | -0.029171 | 5.157407 | 0.080736 | 0.117072 | 0.378981 | -0.681065 |
| RlEst | 0.043870 | 0.268589 | 0.826531 | 14.366781 | 0.083934 | 0.164833 | 0.049927 | -0.879534 |
| Fin | 0.111502 | 0.209095 | -0.387515 | 4.350789 | 0.093354 | 0.130994 | 0.379114 | -0.686448 |
| Other | 0.050744 | 0.237110 | -0.430036 | 4.662264 | 0.111809 | 0.162894 | 0.084835 | -0.925342 |