Wednesday, September 12, 2018

Two Tweets and a Comment: Spending in Retirement

The inspirations for this week’s post are two tweets and a reader comment, which could be the title of a movie about retirement planning if anyone were ever desperate enough to film one.

Retirement planner and researcher Larry Frank[1] tweeted a link to a Wall Street Journal article by Dan Ariely, a professor of psychology and behavioral economics. The article, entitled “How Much Money Will You Really Spend in Retirement? Probably a Lot More than you Think,”[2] suggests that the conventional wisdom that we will need to replace 70% to 80% of our pre-retirement income may be vastly optimistic and the real number could be as high as 130%. That will require workers to save twice as much as they expect, according to Ariely.

Before you throw up your hands and give up on ever saving enough, let me explain that these two numbers, 70% and 130%, don’t measure the same thing.

The leader in estimating “replacement ratios”, the income needed for the first year of retirement as a percent of the income needed to buy the same standard of living as the year before retirement, is AON Consulting.[3] AON doesn’t calculate a single replacement ratio but notes, for example, that it is higher for lower-income households than higher-income households. Over time, “conventional wisdom” settled on about 70% for a replacement ratio no matter what your circumstances, which is obviously a poor rule of thumb, however widely accepted.

Beware the Ides of March and rules of thumb.

For my two cents, from some unrelated research I'm doing using Health and Retirement Study data from 1992 to 2014, I find that about 550 one-person, retired households experienced a median replacement ratio of about 107% and about 850 two-person households experienced a median replacement ratio of about 112%. I don't yet know how long those increases continued. As I mentioned, replacement ratios are about the first year of retirement. Furthermore, these are medians, so your mileage may vary.

To be perfectly clear, I'm not a fan of replacement ratios as a planning device.

Ariely’s calculations are the results of an experiment in which people were asked what they hope to do after they retire. Of course, many would hope to travel the world, eat all their meals in fancy restaurants, take the grandchildren to Disney World annually or retire to a golf resort. That will cost a bit more than simply staying home from the office, living in the same place and doing the same things as before without the commute, which is closer to the AON calculations.

The important points I learned from the Ariely column were more behavioral than economic. Here's one. I’ll bet if you ask most workers whether retirement will cost more or less than pre-retirement, most would answer, “Less, of course!” Ariely shows that really depends on what you plan to do after retirement and where you plan to do it.

The WSJ column provides a link[4] to Ariely's spreadsheet to calculate replacement costs based on your own retirement dreams. If you calculate that replacement ratio and then compare it to the AON Consulting replacement ratios specific to your financial circumstances, you may find numbers that differ significantly from 70%. Both numbers may help your planning by providing a range of estimated spending and they might also provide a warning flag that your expectations of what you can afford in retirement may be overly optimistic.

I found the behavioral aspects of the column more compelling than the economic perspective. First, replacement ratios compare costs for the first year of retirement to the year before. Hopefully, your retirement will last longer than a year and it is unlikely that if you decide to travel the world at age 65, for example, you will still be flying at 85. (Airline statistics show that retirees tend to stop traveling internationally in their 70s.)

Even if the retirement you envision requires a 130% replacement ratio, that increase won’t last forever and probably won’t require doubling your pre-retirement savings target, though it will increase it. If an early-retirement spending increase were to actually be sustained for your entire retirement then your savings needs might double but I doubt that it will.

Ariely states that in retirement "Every day becomes just like the weekend. And on the weekend, we have all kinds of time and opportunities to spend money. We shop, travel, buy tickets for events and eat out." As a retiree of 13 years, I don't know any retirees who would agree that retirement is like that, at least not more so than when we worked, and I will repeat my assertion that we need more researchers with retirement experience (a personal peeve).

My second inspiration was a tweet from a financial planner who didn’t understand why estimating retirement spending is difficult. He suggested basing it on the past four months of current expenses. Calculating current spending is indeed relatively simple and estimating spending for the first few years of retirement isn’t a stretch; the challenge is estimating spending 10, 20 or 30 years into the future.

Will your retirement spending go up or down after you retire? I think the best research on this question comes from David Blanchett[5] and Sudipto Banerjee[6]. Blanchett concludes that a household’s spending trajectory is a function of the ratio of retirement savings to the desired standard of living or said differently, a function of whether the retired household has saved appropriately for the desired standard of living, under-saved, or over-saved.

Blanchett found that households with appropriate savings tend to see a 1.5% to 2% annual reduction in the cost of retirement (spending), though it isn’t a smooth decline. He found that households that “over-save” tend to realize they can spend more after a few years and do. At the other extreme, households that haven’t saved enough tend to notice their savings are declining too fast and reduce spending.
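To get a feel for what a 1.5% to 2% annual reduction adds up to, here is a rough sketch. It assumes a perfectly smooth decline, which Blanchett's data do not show, and the function name and the $50,000 starting figure are made up for illustration.

```python
def real_spending(first_year, annual_decline, years_in):
    """Inflation-adjusted spending after `years_in` years of steady decline."""
    return first_year * (1 - annual_decline) ** years_in


# Hypothetical household spending $50,000 in its first year of retirement.
for decline in (0.015, 0.02):
    path = [round(real_spending(50_000, decline, y)) for y in (0, 10, 20, 30)]
    print(f"{decline:.1%} decline: {path}")
```

The point of the sketch is only that small annual declines compound into a large difference over two or three decades.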

Some have interpreted Blanchett’s findings to suggest that spending declines for the "first half" of retirement and increases for the second half. That’s really only true if you live to 100 or so. Most households won’t and their spending trajectory will look a lot like Banerjee’s chart, which is to say that spending will tend to decline throughout retirement and even large end-of-life costs will likely be smaller on an inflation-adjusted basis than first-year spending.

Which direction your spending will head is unknowable. It’s important to understand that these projections are made for the population of retirees and there is no way of knowing if your household's unique retirement spending will be like any of these averages. Your retirement spending will be determined not only by your wealth and income but also by how much life decides to charge you and for how long.

My final inspiration was a reader asking how much money she will need to spend annually throughout retirement. You can see my response in the comments section at The Critical Factors of Portfolio Ruin Aren't Predictable but there is one inescapable reality — no one can predict how much wealth and income an individual household will have or how much it will need with any accuracy for more than a few years.

To summarize this information about retirement spending, I would say we have some good research on population averages but they can’t predict the future of a single household. Ariely tells us that the retirement we want might be more expensive than the one we can afford and perhaps more expensive than our pre-retirement standard of living. Blanchett and Banerjee tell us that retirees who have saved enough and those who have saved too little tend to experience spending declines throughout retirement. The airlines tell us that we become less adventurous in our 70s.

No one can tell you how much your household will need to spend or be able to spend for more than a few future years. The only realistic solution is to plan for the long term but adjust often.

Retirement finance has no cruise control.


[1] You can follow Larry Frank on Twitter at @LarryFrankSr and you can follow me at @Retirement_Cafe.

[2] How Much Money Will You Really Spend in Retirement? Probably a Lot More than you Think, Wall Street Journal.

(I frequently have problems with the WSJ paywall but you should be able to read this by clicking "sign in" if you don't subscribe. If not, I found that I could read it by Googling "How Much Money Will You Really Spend in Retirement? Probably a Lot More Than You Think" and clicking the link on the Google search page.)

[3] AON Consulting Replacement Ratio study, AON Consulting.

[4] Retirement Spending spreadsheet, Dan Ariely.

[5] The True Cost of Retirement, David Blanchett.

[6] Expenditure Patterns of Older Americans, 2001-2009, Sudipto Banerjee.

Friday, August 31, 2018

Probability of Ruin in Pictures

William Bengen calculated sustainable withdrawal rates (SWR) using historical S&P500 market returns since 1928 leading to the “4% Rule.”[1] More recently, Robert Shiller published stock market  returns data back to 1871 using the S&P Composite Index[2]. In this post, I’ll explore the “probability of ruin” using the more extensive Shiller data.

Probability of ruin is typically used in retirement planning to estimate the probability that a retiree will outlive her portfolio based on some set of assumptions such as a fixed planning horizon (often 30 years), market return expectations and a constant-dollar spending strategy.  Bengen studied rolling 10-, 20- and 30-year retirements using historical S&P500 market returns and a constant-dollar spending strategy[3].

He found that, assuming a fixed 30-year retirement and annual withdrawals of 4% of the retiree’s portfolio value at retirement, a portfolio would have been depleted in less than 30 years in about 5% of the rolling periods, with the worst-case historical scenario being someone retiring for 30 years beginning in 1966. Hence, the “4% Rule.”

The following chart shows the terminal portfolio value (TPV) after 30 years for a retiree spending $42,000 (4.2%)  annually from an initial portfolio valued at $1M for 110 overlapping thirty-year periods from 1872 to 1982. (Shiller’s data ends in 2012 so the last 30-year period began in 1982.) The red bars indicate years of retirement that funded less than 30 years.

Six of the 110 periods (5.5%, the historical “probability of ruin”) were depleted in fewer than 30 years. TPV charts typically and reasonably assume a retiree’s portfolio can’t drop below zero but I continued withdrawals for the full 30 years to show the extent to which they failed. Another way to read this is that the deeper the red column, the sooner the portfolio was depleted.
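The calculation behind a chart like this can be sketched in a few lines. This is a simplified illustration, not the code used for the figures: it assumes inflation-adjusted returns, a withdrawal taken at the start of each year, and a balance that is allowed to go negative. The function names and the sample returns are hypothetical, not Shiller data.

```python
def terminal_value(returns, start=1_000_000, spend=42_000):
    """Balance after withdrawing `spend` at the start of each year.

    The balance is allowed to go negative so the size of a shortfall
    stays visible instead of being clipped at zero.
    """
    balance = start
    for r in returns:
        balance = (balance - spend) * (1 + r)
    return balance


def probability_of_ruin(periods, start=1_000_000, spend=42_000):
    """Share of periods whose terminal value falls below zero."""
    failures = sum(1 for p in periods if terminal_value(p, start, spend) < 0)
    return failures / len(periods)


# Hypothetical 3-year periods with heavy spending, for illustration only.
periods = [
    [0.10, 0.05, 0.08],
    [-0.30, -0.20, 0.02],
    [0.02, 0.03, 0.01],
]
print(probability_of_ruin(periods, start=100_000, spend=30_000))
```

With real data, `periods` would hold one 30-year slice of returns for each starting year.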

Take a longing glance at those tall columns, the ones with really large terminal portfolio values. Then, compare them to the little stubby blue guys. Both are probability of ruin “successes”.

Probability of ruin assumes that you’ll be happy simply not retiring in one of those red years. You’re either in the 5% of scenarios that start a losing period or the 95% of winners and so as long as your bar turns out blue, you’re good, right?

Not really. Wouldn’t you be at least a little happier with a tall blue bar than a short, stubby blue bar, even though both avoid portfolio depletion? I would. Probability of ruin assumes that you’ll be just as happy successfully funding retirement and leaving a hundred bucks to your heirs as you would be leaving them a million. And, that you’d be as dissatisfied with a portfolio that funds 29 years as with one that only funds 15. 

I wouldn’t. If a planner said, “Hey, great news! Your retirement is funded 95% of the time”, my response would be, “That sounds great but how well does it turn out when it is completely funded and how badly when it isn’t?”

Sequence risk affects all outcomes, sometimes positively and sometimes negatively. Probability of ruin flags only the worst outcomes. Probability of ruin is sort of an upside-down “tip of the iceberg” in that most of the information is hidden from view by condensing all that information into a single data point, the percentage of failures.

(For a better iceberg effect, turn your phone upside down while you view the chart below. If you’re reading this on an iMac or PC, probably better to just use your imagination.)

In Figure 2 below, I increased spending from 4.2% of initial portfolio value to 4.75% which, of course, creates more red bars indicating more depleted portfolios.

Note that the red bars appear in four distinct clusters in both Figures 1 and 2. A “5% probability of ruin” might suggest that ruin appears sporadically about every 20 years (5% of periods). It does not, although that is how sequence risk is most often (incorrectly) modeled.

When I increase spending to 5.5%, the result is even more red bars, as expected, but they’re still all within those four clusters. Ruin isn’t a uniformly-distributed event. Probability of ruin is quite high in certain periods of economic distress but relatively low any other time. 

Here's an analogy. Kentucky averages about 12 snowfall days per year but we don’t predict snowfall in July. It’s more likely to snow in winter in Kentucky and high sequence risk is more likely to deplete a portfolio when spending starts in an "economic winter". Many models of sequence risk predict snow in July.

Unless you retired just prior to the Panic of 1910, the Great Depression, a bad 1937 bear market (squeezed between two really good market years, by the way) or during the inflationary 1965 to 1975 period, the 4% Rule would not have depleted your portfolio. Unfortunately, these periods are not predictable. The jury is still out on the 2000s.

In the next chart, Figure 3, the y-axis scale changes from $M to $K so we can better see the near misses. I arbitrarily set the definition of success in this test to include TPVs greater than $150,000 and the definition of failures to include TPVs worse than -$150,000. My reasoning is that given the margin of error in a 30-year retirement plan these scenarios might have gone either way IRL (in real life, as Millennials say). This is arbitrary but so is drawing the failure line at precisely zero dollars and this definition factors in more of the uncertainty of the analysis.

Note the number of portfolios that barely avoided depletion (3) and the number that very nearly avoided depletion (2). If we omit these five scenarios from the calculation because they are too close to call, the probability of ruin becomes 3.8% instead of 5.5%. That’s more than a 30% change in the estimate of ruin and represents a big change in sustainable spending.
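The arithmetic is short enough to check directly. The variable names below are mine; the counts come from the paragraph above (six failures in 110 periods, two of them near misses, plus three successes that were close calls).

```python
periods = 110
failures = 6
near_failures = 2   # depleted, but by less than $150K
near_successes = 3  # survived, but with less than $150K left

ruin_all = failures / periods
ruin_clear = (failures - near_failures) / (periods - near_failures - near_successes)
relative_change = (ruin_all - ruin_clear) / ruin_all

print(f"{ruin_all:.1%} -> {ruin_clear:.1%}, a {relative_change:.0%} relative change")
```

Dropping the five close calls leaves four clear failures in 105 periods, which is where the 3.8% comes from.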

I'm not advocating ignoring these data but simply viewing them in three categories instead of two: probably succeeded, probably failed and too-close-to-call, based on our degree of confidence in the outcomes.

When you have only a few failures, a few close calls make a large difference in probability of ruin.  Portfolios that come up just a little short probably aren’t losers and a small bequest left to heirs is probably too close to call a winner, as well. Thinking we can predict a 30-year retirement much more accurately than plus or minus a few years is overconfidence.

Why do I question “near misses”? Because they probably would have funded most of the 30 years. Only 6% of men and 13% of women aged 65 live another 30 years and all of those who died sooner would have successfully funded their retirements in these scenarios. 

The following chart, Figure 4, brings bear markets (the yellow bars) into the picture. 

Retirees are often told that retiring into a bear market is deadly, but bear markets don’t appear to be particularly highly correlated with failing portfolio periods. Robert Shiller doesn’t even consider the 1960’s and 1970’s to be bear markets because they were so gradual[4]. Paint those bars blue and the correlation of bear markets to portfolio ruin is even less obvious.

If portfolio depletion isn’t necessarily caused by bear markets, what does cause it? The EarlyRetirementNow blog found that the sustainable withdrawal rate is nearly completely explained by portfolio returns for the first five and first ten years of 30-year periods.[5] This explains SWR but not ruin; portfolio depletion is completely explained by sequence risk.

Nonetheless, a chart of SWRs is informative. Figure 5 shows the SWRs that would have depleted a portfolio in precisely 30 years from 1872 to 1982.

This is the view of the iceberg below the surface. Sustainable withdrawal rates that deplete portfolios in precisely 30 years are unpredictable and vary widely from 3.8% to 12.6% historically. 
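For a known sequence of returns, the withdrawal rate that empties the portfolio in exactly the last year can be computed directly, because the terminal value is a linear function of the annual withdrawal. The sketch below assumes real returns and a constant withdrawal at the start of each year; the function name and the three sample returns are illustrative only.

```python
def breakeven_withdrawal_rate(returns):
    """Fraction of the initial balance that, withdrawn at the start of
    every year, leaves exactly zero after the final year's return."""
    tail_growth = 1.0  # growth from year t through the final year
    tail_sum = 0.0     # sum of those tail growth factors over all years t
    for r in reversed(returns):
        tail_growth *= 1 + r
        tail_sum += tail_growth
    # After the loop, tail_growth is the growth factor for the whole period.
    return tail_growth / tail_sum


# With zero real returns, 30 years of spending allows 1/30 per year.
print(breakeven_withdrawal_rate([0.0] * 30))

# Check on a short hypothetical sequence: the balance should end at zero.
sample = [0.10, -0.07, 0.12]
rate = breakeven_withdrawal_rate(sample)
balance = 1.0
for r in sample:
    balance = (balance - rate) * (1 + r)
print(abs(balance) < 1e-12)
```

Of course, this only works looking backward. It needs all 30 returns, which is exactly what a new retiree doesn't have.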

Figure 5 above provides a visual explanation of the “4% Rule” probabilist school of retirement finance. That approach recommends spending the amount that would only fail in no more than 5% of retirement periods. Using the Shiller data, that amount of spending would be about 4.2% of initial portfolio value.

There are two potential risks with this strategy. The obvious one is that you might fall into the unlucky 5% (one in twenty) and outlive your savings but an equally important concern is that you would almost always underspend. All of the blue bars above the red line represent underspending. You would have spent 4.2% if you retired in 1950, planning to live 30 years, for example, when you could have spent 11.8%. Of course, you couldn’t have known that in 1950.

Some planners have suggested that sequence risk goes away after 10 years. Alas, it does not. The following chart shows the value of portfolios at the end of the first 10 years for historical data.

The smallest TPV after 10 years was $340,000 (retirement in 1973) and the largest was $3.8M (1949). Surely the latter has less sequence risk ten years into retirement.

If both scenarios are assumed to complete the remaining 20 years of a 30-year retirement and both continue to spend the $42,000 they calculated as sustainable back in year one, the larger portfolio would have survived all rolling 20-year historical periods with continued annual spending of 1.1% (42,000 / 3,800,000), while the smaller portfolio would have failed nearly all of those periods with 12.4% annual spending (42,000 / 340,000). 

Sequence risk might appear to go away after 10 years from the perspective of the start of a 30-year period but after 10 years much will have changed. Sequence risk will change accordingly and become greater or smaller. We can’t know which.

As I mentioned above, the EarlyRetirementNow blog found that the returns for the first 5 years of a 30-year retirement best explain the sustainable withdrawal rate.  Figure 7a shows 5-year annualized market growth rates with the same time period on the x-axis. The panel below, Figure 7b, shows 30-year TPV with portfolio failures in red in the top chart. Note how well very low growth rates for the next five years align with portfolio depletion.[6]

Portfolio failures are caused by poor market returns early in a series of returns. The low returns can result from a quick, precipitous shock like The Crash of October 1929, from a single terrible year of returns like 1937, or from a long, gradual sideways series of mediocre real returns like 1966 to 1975.

These growth rates are explanatory, not predictive. In these charts we are explaining the past, not predicting the future. We have no idea what the next five years of market returns will bring but we can see that low early returns — sequence risk — are not a good way to start.
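A 5-year annualized growth rate is a geometric average of the yearly returns. Here is a minimal sketch, assuming yearly returns expressed as decimals; the function name and the sample numbers are hypothetical.

```python
import math


def annualized_growth(returns):
    """Geometric average annual growth rate over the given yearly returns."""
    total_growth = math.prod(1 + r for r in returns)
    return total_growth ** (1 / len(returns)) - 1


# Hypothetical first five years of a retirement.
first_five = [0.12, -0.20, -0.10, 0.05, 0.15]
print(f"{annualized_growth(first_five):.2%}")
```

Note that the geometric average ignores the order of the five returns, so it explains only part of the story once withdrawals begin.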

To summarize, probability of ruin is an interesting rule of thumb with severe limitations. Sequence risk affects all portfolios from which the retiree periodically spends but probability of ruin only measures the extreme outcomes, those that result in premature portfolio depletion. It treats all failures alike and all success alike, ignoring the extent of the success or failure. The thin line separating success from failure is arbitrary. It hides the extent of success and the extent of failure.

Portfolio ruin isn’t sporadic and doesn’t uniformly occur once every 20 years or so as a 5% failure rate might imply. Most of the time, sequence risk is quite low but during major economic upheavals, it occurs in bouts.

Models of probability of ruin are not robust. They provide a significantly different answer every time they are run even when nothing changes except the Monte Carlo random number draw.
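The run-to-run variation is easy to reproduce with a toy simulation. This sketch is not any planner's model: it assumes independent, normally distributed real returns, and the mean, standard deviation, spending rate and trial count are placeholder assumptions.

```python
import random


def simulated_ruin(seed, trials=1000, years=30, spend_rate=0.04,
                   mean=0.05, stdev=0.18):
    """Share of simulated retirements that run out of money early.

    Spending is a fixed fraction of the initial balance, taken at the
    start of each year. All parameters are illustrative placeholders.
    """
    rng = random.Random(seed)
    failures = 0
    for _trial in range(trials):
        balance = 1.0
        for _year in range(years):
            balance -= spend_rate
            if balance <= 0:
                failures += 1
                break
            # Clamp so a draw below -100% cannot produce a negative balance.
            balance *= max(0.0, 1 + rng.gauss(mean, stdev))
    return failures / trials


# Same model, same inputs; only the random draw changes.
estimates = [simulated_ruin(seed) for seed in range(5)]
print([f"{e:.1%}" for e in estimates])
```

Increasing `trials` narrows the spread between runs but does nothing about the uncertainty in the assumed mean and volatility.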

Probability of ruin is based on some strange assumptions about human behavior, like assuming we will continue to spend the same amount when ruin becomes apparent or that we don’t care how much wealth we have as long as it’s more than zero. It’s also based on fewer than five unique sequences of 30-year historical returns, a truly small sample.

Put all this together and probability of ruin looks like a very poor metric by which to predict, model, or manage retirement finances. 


[2] Annual Data on US Stock Market, Robert J. Shiller.

[3] This analysis uses the S&P Composite Index, data from 1871 to 2012, and 100% equity allocation.

[5] The Ultimate Guide to Safe Withdrawal Rates – Part 15, Early Retirement Now blog.

[6] The market grew about 0% from 1927 to 1931 as shown in the bottom panel, for example, and portfolios with spending beginning in 1927 failed sooner than 30 years with 4.75% spending, as the top chart shows.

Saturday, August 11, 2018

The Critical Factors of Portfolio Ruin Aren't Predictable

Probability of ruin and sequence of returns risk are probably the most widely-discussed topics in all of retirement finance and perhaps the least understood.

Probability of ruin is not sequence of returns (SOR) risk. The sequence of portfolio returns we experience after retiring is one determinant of premature portfolio depletion (ruin) but so are life expectancy, the market returns, themselves, the volatility of those returns, the amount we choose to periodically spend and the value of our portfolio.

For a given sequence of returns, the probability of prematurely depleting our savings increases if we expect to live longer or spend more, start out with a smaller portfolio, receive worse average market returns or experience more volatility of those returns. As I will explain below, some of these factors have a significantly larger impact on expected terminal (end-of-retirement) wealth than others.

The fact that some of those key variables, our life expectancy and the size of our portfolio, invariably change as we age tells us that probability of ruin also changes as a result of aging. The amount we need to spend annually might also change over time, as might our expectations of future portfolio returns and these will also alter our updated estimate of probability of ruin.

But, the size of our portfolio and our life expectancy are certain to change as we age. They are critical factors of portfolio survival and I suspect nearly everyone would agree that he or she can't know how much money will be left in the retirement-funding portfolio in 10 or 20 years or whether he or she will live that long.

This should dispel the notion some have that a 95% probability of success at the beginning of retirement remains 95% throughout retirement. It probably changes the next year, perhaps meaningfully. That also means that spending 4% of initial portfolio value could become far riskier or far less risky as we age.

“Sequence risk” is introduced when we periodically spend from or invest in a volatile portfolio of stocks and bonds. If we plan to sell stocks every year for the next 30 years, we have no idea today what the selling price will be when those 30 times arrive. That uncertainty of future selling prices creates sequence risk.

Notice I said, “or invest in a volatile portfolio.” When we are accumulating a retirement portfolio with periodic stock purchases before retiring, we don’t know future purchase prices today, either, and that uncertainty also creates sequence risk.

The best way to see the cause of sequence risk is to look at what happens when it isn’t present. Any given thirty years of market returns, for example, will result in the same terminal portfolio value for a buy-and-hold strategy regardless of the order of those returns.

Imagine three years of portfolio returns of 10%, -7% and 12%. These equate to growth rates of 1.1, 0.93 and 1.12, respectively. Multiply those in any order and you get a three-year growth factor of 1.146.  One dollar invested returns $1.15 after three years. The sequence of the returns doesn’t matter.

When you add (save) or subtract (spend) numbers from each of those years, however, no matter where those numbers come from (constant-dollar spending, constant-percentage spending or whatever) the order of the sequence does matter. This is sequence risk. We see sequence risk when we periodically spend from or invest in a volatile portfolio. We see no sequence risk with a buy-and-hold portfolio, so the sequence risk comes from either periodic savings or periodic withdrawals.
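The three returns above (10%, -7% and 12%) make a convenient test. The sketch below tries all six orderings, once for buy-and-hold and once with a fixed withdrawal at the start of each year. The $100 starting balance and $10 withdrawal are arbitrary illustrative choices.

```python
from itertools import permutations

returns = [0.10, -0.07, 0.12]


def final_balance(order, start=100.0, spend=0.0):
    """Ending balance when `spend` is withdrawn at the start of each year."""
    balance = start
    for r in order:
        balance = (balance - spend) * (1 + r)
    return balance


# Round to ignore floating-point noise when comparing orderings.
buy_and_hold = {round(final_balance(p), 6) for p in permutations(returns)}
with_spending = {round(final_balance(p, spend=10.0), 6) for p in permutations(returns)}

print(len(buy_and_hold))   # distinct outcomes without withdrawals
print(len(with_spending))  # distinct outcomes with withdrawals
print(min(with_spending), max(with_spending))
```

Buy-and-hold lands on the same balance in every ordering, while the spending portfolio ends with a different balance for each one.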

This periodic spending, if too large, can result in depleting our portfolio after retirement, so we are exposed to both sequence risk and a “risk of ruin.” Losing 100% of a savings portfolio, however, is extremely unlikely and while we save for retirement we have sequence risk but almost zero probability of ruin.

So, probability of ruin and sequence risk aren’t the same thing. A poor sequence of returns combined with unsustainable spending can lead to ruin after retirement but a good sequence of returns decreases probability of ruin given the same average return.

The cost of sequence risk is lost compounding of returns. When we have a losing year with a buy-and-hold portfolio, we lose money. When we spend from a volatile portfolio we also lose money during that same losing-market year but our portfolio balance further loses the money we spend plus all potential future compounded gains on the amounts we sold.

Losses hurt more when we spend from a volatile investment portfolio than when we buy and hold. This is why it takes longer for a spending portfolio to recover from a bear market than it takes a buy-and-hold or accumulation portfolio.

(It is often noted that the market recovered fairly quickly after the Great Depression when dividends are considered. A buy-and-hold portfolio would have, too. An accumulation portfolio would have recovered even faster as cheap stocks were subsequently purchased. But, a retiree's spending portfolio would have recovered much more slowly, assuming the portfolio had survived, of course.)

Losses early in retirement hurt more than later losses because those earlier losses leave less capital to compound over time. As Michael Kitces has explained, good returns late in retirement aren't helpful if your portfolio doesn't survive long enough to see them.

The best possible sequence of your annual portfolio returns would result if those returns happened to materialize ordered from best annual return in the first year to worst return in the last. The opposite order would be the worst. That’s why we’re warned that significant portfolio losses early in retirement are the most severe.
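A brute-force check of the best and worst orderings, using five made-up returns and deliberately heavy spending so that one ordering survives and the other does not. It assumes constant-dollar withdrawals at the start of each year; all numbers are hypothetical.

```python
from itertools import permutations


def final_balance(order, start=100.0, spend=22.0):
    """Ending balance (possibly negative) after constant-dollar withdrawals."""
    balance = start
    for r in order:
        balance = (balance - spend) * (1 + r)
    return balance


returns = [-0.15, 0.25, 0.0, -0.30, 0.10]
best_first = sorted(returns, reverse=True)  # best return in year one
worst_first = sorted(returns)               # worst return in year one

outcomes = [final_balance(p) for p in permutations(returns)]
print(final_balance(best_first), final_balance(worst_first))
print(max(outcomes) == final_balance(best_first))
print(min(outcomes) == final_balance(worst_first))
```

The same five returns leave money on the table in one order and a shortfall in the other, with every other ordering falling somewhere in between.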

Of course, we have no control over the sequence of returns we receive nor can we predict the sequence.

Sequence risk never completely goes away. It is present in a 30-year retirement (and greater in the early years) and it is present in a 5-year retirement (and greater in the early years). Note that a 30-year retirement will eventually become a 5-year retirement if we live long enough.

The challenge of savings decumulation is to optimally spread one's portfolio over one's remaining lifetime but a healthy individual's lifetime is unpredictable. Will sequence risk be reduced when a 60-year-old reaches 85? That depends on how much longer the 85-year-old will live, how much of her wealth remains and how much she will spend. It requires a new calculation of safe spending based on these new variable values.

A reduced range of life expectancy reduces that component of risk compared to 25 years earlier. However, the amount of wealth we will have 25 years into the future is wildly uncertain. If the retiree's portfolio performs well, she may reach age 85 with reduced probability of ruin compared to age 60 because she has greater wealth and fewer years to spread it over. If her portfolio performs poorly, however, she may reach age 85 with fewer years to fund but far less wealth to fund them and, therefore, increased probability of ruin.

Many SWR analyses suggest that risk decreases because the safe withdrawal percentage increases as we age. Those analyses estimate a safe withdrawal rate when a retiree experiences a 30-year retirement beginning with initial savings of say, a million dollars, and an SWR for a 10-year retirement beginning with the same million dollars.

Risk then appears to decrease with age because the analysis assumes the retiree will have the same million dollars with 10 years remaining as he had with 30 years remaining.  But, in real life there is no guarantee that the retiree will still have a million dollars after 20 years.

An SWR model of historical market returns since 1928 with 4% spending produced a maximum TPV after 20 years of $10.8M and a minimum non-zero TPV of $106K. With continued 4% spending, the former scenario would clearly have a far lower probability of ruin than the latter after 20 years. Add the risk of future portfolio value back into the mix and sequence risk doesn't diminish.

Said differently, the percentage of your remaining portfolio that can be safely spent increases as you age because your life expectancy decreases. The problem is knowing "the percentage of what?" Spending 7% of $106K isn't better than spending 4% of $10.8M even though 7% is larger than 4%.

Probability of ruin doesn't always decline with time but it does change as our savings balance and our remaining life expectancy change. We need to recalculate periodically.

We can estimate a terminal portfolio value (TPV), say after 30 years, for a given sequence of returns and we can estimate how often that will deplete the portfolio in less than 30 years (probability of ruin). These are two different measures. TPV says, "you might have this much money left at the end of retirement", while probability of ruin tells us the likelihood that amount will be more than zero.

The EarlyRetirementNow blog[1] estimates the impact of sequence of returns on the sustainable withdrawal rate* and summarizes its findings: "Precisely what I mean by SRR matters more than average returns: 31% of the fit is explained by the average return, an additional 64% is explained by the sequence of returns!" 

However, the sequence of returns explains 100% of portfolio ruin. To illustrate, we can take a series of portfolio returns that result in premature portfolio depletion (ruin) and rearrange those exact same returns in a better way that avoids premature depletion. We simply swap some of the poor early returns with better late returns. As I explained above, doing so doesn't change the average portfolio return we would receive but it does increase the resulting terminal portfolio value. The difference between success and failure is the sequence, not the returns, themselves.

Focussing on portfolio ruin, however, can be misleading. Sequence risk can dramatically decrease consumption (standard of living) in retirement without resulting in portfolio depletion. (This happens when you end retirement with a small portfolio value that is greater than zero.)

As Jason Scott told me years ago, probability of ruin treats a scenario that successfully funds 29 years as a failure and a scenario that successfully funds 50 years of retirement as no better an outcome than one that funds 30 years. I would add that for a retiree who lives less than 30 years, all three scenarios are winners. It's important to also model life expectancy.

Readers often comment that variable-spending strategies eliminate sequence risk. They don't but they can lower the probability of portfolio depletion by not foolishly spending the same fixed amount annually when savings dwindle. Reducing the chances of depleting the portfolio, however, comes at the expense of lower spending.

Think of it this way: a poor sequence of returns reduces our wealth. We can ignore that reduced wealth and keep spending the same constant amount, risking portfolio depletion, or we can spend less (variably) when our savings are stressed. Either way, we have less wealth so variable spending didn't eliminate the consequences of a poor sequence of returns. It simply changed the impact of sequence risk from portfolio depletion to a lower standard of living.
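The trade-off can be sketched with two toy spending rules run against the same poor sequence. Everything here is an assumption for illustration: the made-up return series, the $1M portfolio, the 5% rates and the function names.

```python
def spend_constant_dollar(balance, start_balance):
    """Spend the same fixed amount (5% of the starting balance) while money remains."""
    return min(0.05 * start_balance, balance)

def spend_constant_percent(balance, start_balance):
    """Spend 5% of whatever the portfolio is worth this year."""
    return 0.05 * balance

def simulate(returns, rule, start=1_000_000):
    """Apply `rule` at the start of each year; return (annual spending, final balance)."""
    balance, spending = start, []
    for annual_return in returns:
        spend = rule(balance, start)
        spending.append(spend)
        balance = (balance - spend) * (1 + annual_return)
    return spending, balance

# Hypothetical poor sequence (not historical data).
poor_sequence = [-0.20, -0.15, -0.10] + [0.03] * 24 + [0.15, 0.20, 0.25]

for rule in (spend_constant_dollar, spend_constant_percent):
    spending, tpv = simulate(poor_sequence, rule)
    print(f"{rule.__name__}: lowest annual spending ${min(spending):,.0f}, "
          f"final balance ${tpv:,.0f}")
```

In this sketch the constant-dollar rule runs out of money, while the constant-percent rule never does but ends up spending far less than it started with. The poor sequence shows up either way.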

There is a problem with variable spending strategies, though I still consider them vastly superior to mindless constant-dollar strategies. There is no guarantee that the varying amount you can safely spend every year will maintain your standard of living.

If I am stranded on a desert island with a limited water supply, I can choose to drink decreasing amounts as the supply dwindles but at some point, I can't drink less and survive. Likewise, when variable "safe" spending drops below non-discretionary spending for a sustained period I still have to buy food and pay the mortgage even if that entails an "unsafe" level of portfolio spending. Variable spending isn't a flawless strategy but it seems more sound than the alternative.

I mentioned that the sequence of your future portfolio returns can’t be predicted but the risk can be mitigated. We can do this by spending less from the portfolio, for example, or by changing bond-equity allocations. Safety-first advocates moderate sequence risk by ensuring an acceptable income from assets not exposed to market risk, in case the portfolio fails.

To summarize some key characteristics of sequence risk:
  • The sequence of future returns is critical for the survivability of a spending portfolio — but unknowable.
  • Sequence risk and the "safe" amount we can spend vary throughout retirement. They can become much safer or much riskier. We need to modify the amount of portfolio withdrawals to compensate — if we can. 
  • Sequence risk can be helpful or harmful and it has different impacts (generally better) during the accumulation phase than after retirement.
  • Sequence risk can result in portfolio depletion (ruin) or lowered standard of living after retirement but probably not before.
  • The sequence of returns matters more than average returns. To avoid premature portfolio depletion you need a fortunate sequence of portfolio returns about twice as badly as you need really good returns.
  • Although we can't predict or control our sequence of future portfolio returns, the risk it introduces can be mitigated in various ways.
  • Sequence of returns explains most of sustainable withdrawal rate and all of portfolio ruin.
  • The portfolio returns over the first five and ten years of a 30-year retirement are much better predictors of a sustainable withdrawal rate than the mean return for 30 years.[1] You can experience good average returns for thirty years and see your portfolio fall to a poor sequence of those returns or experience mediocre average returns and be saved by a good sequence.
  • A terrible bear market isn't required to sink a retirement portfolio. To quote Michael Kitces, "a “merely mediocre” decade of returns can actually be worse than a short-term market crash..."[2] Retiring in the 1960's was a perfect example. Retiring around the beginning of the Great Depression offers a similar example of how a shorter period of dramatic losses can also result in portfolio failure.
  • Sequence risk never goes away but it can become quite small if your wealth is (or becomes) very large relative to your spending needs and remaining life expectancy — in other words, when your portfolio performs well throughout retirement. Sequence risk can become quite high under the opposite circumstances.
The key takeaways are that the sequence of the returns your retirement portfolio experiences is a major determinant of portfolio survival and is about twice as important as your mean portfolio return. The most important factor is how long you will be retired. And, neither of these is predictable for an individual household.

* EarlyRetirementNow's analysis calculates the safe withdrawal rate that would deplete the portfolio in exactly 30 years.


[1] The Ultimate Guide to Safe Withdrawal Rates – Part 15, Early Retirement Now blog.

[2] Understanding Sequence Of Return Risk – Safe Withdrawal Rates, Bear Market Crashes, And Bad Decades, Michael Kitces, Nerd's Eye View blog.

Thursday, July 12, 2018

Monte Carlo and Tales of Fat Tails

I recently read a white paper[1] claiming to show that Monte Carlo (MC) simulation "creates fat tails" and suggesting that constant-dollar withdrawals (the "4% Rule") are historically 100% safe.

Before you log onto E*TRADE for that stock-buying binge, let me explain how I come to a totally different conclusion.

The paper asserts that the reason Monte Carlo models produce different results than the historical data model is the absence of mean reversion in the paper's MC model or perhaps a general flaw in the Monte Carlo technique. The paper presents no statistical evidence, however, of either fat tails or mean reversion and I can't find any in the paper or in my own MC models.

Let's start with a definition of "fat tails." The term has multiple meanings[2] but in this context, it describes a sample that is more likely to include extreme draws than a normal distribution would predict. A few extreme draws from a normal distribution aren't evidence of fat tails; they are simply evidence of tails.

For example, it is possible (though improbable) to draw an annual market return of 80% from a normal distribution with a mean of 5% and a standard deviation of 12% because a normal distribution has tails that are infinite. A single draw, however, tells us nothing about the probability of extreme draws, which is the definition of fat tails. If our model were to produce many extreme draws – more than a normal distribution would predict – then we would have evidence of fat tails. There are also statistical measures that indicate fat tails, though the paper doesn't report any.[2]
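To put rough numbers on this, the sketch below computes how improbable an 80% draw is under the normal distribution described above, then applies one common fat-tail measure, sample excess kurtosis, to two samples. The "fat-tailed" sample is a made-up mixture of calm and volatile years that I am assuming purely for contrast; it is not a model of real markets.

```python
import random
from statistics import NormalDist, fmean

returns = NormalDist(mu=0.05, sigma=0.12)

# P(return >= 80%), computed from the mirror-image left tail for numerical accuracy.
p_extreme = returns.cdf(0.05 - (0.80 - 0.05))
print(f"P(annual return >= 80%) = {p_extreme:.2e}")

def excess_kurtosis(sample):
    """Sample excess kurtosis: near 0 for normal draws, positive for fat tails."""
    mean = fmean(sample)
    m2 = fmean((x - mean) ** 2 for x in sample)
    m4 = fmean((x - mean) ** 4 for x in sample)
    return m4 / m2 ** 2 - 3

rng = random.Random(42)
normal_sample = [rng.gauss(0.05, 0.12) for _ in range(100_000)]
# Hypothetical fat-tailed sample: 10% of years are three times as volatile.
fat_sample = [
    rng.gauss(0.05, 0.30 if rng.random() < 0.10 else 0.10)
    for _ in range(100_000)
]

print(f"excess kurtosis, normal draws:  {excess_kurtosis(normal_sample):.2f}")
print(f"excess kurtosis, mixture draws: {excess_kurtosis(fat_sample):.2f}")
```

A statistic like this, computed over many draws, is the kind of evidence that would support a fat-tails claim. A single large outcome is not.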

The major flaw in the analysis appears to be the use of a naive Monte Carlo model based solely on normally-distributed market returns. (I say "appears" because the paper reveals little about how the model was constructed but the results are telling). Portfolio survivability is too complex to be modeled by such a simple strategy and it is wrong to blame "Monte Carlo" for the results of a poorly constructed model that happens to use Monte Carlo.

David Blanchett and Wade Pfau wrote on this topic in 2014[3]:
"But this argument is like saying all cars are slow. There are no constraints to Monte Carlo simulation, only constraints users create in a model (or constraints that users are forced to deal with when using someone else's model). Non-normal asset-class returns and autocorrelations can be incorporated into Monte Carlo simulations, albeit with proper care. Like any model, you need quality inputs to get quality outputs."
There are no normal distributions in the real world, only samples that seem likely to have been drawn from a normal distribution. Historical annual market returns, as you can see in the following histogram, appear to be such draws.

The historical data model doesn't use this distribution to create sequences of returns, though. It uses rolling 30-year sequences of these returns, changing only the first and last of 30 years for each new sequence, which distorts the distribution significantly, as shown below. That red distribution doesn't look very normal, does it? Rolling sequences also reduce sequence risk, so we won't find as much as we might otherwise. MC-generated sequences of market returns will be independent and that is a primary reason that MC provides different results than the historical data model, not fat tails or mean reversion.
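The overlap between rolling sequences is easy to see with nothing but year labels. This sketch assumes a 1928 to 2017 span purely for illustration and uses no return data at all.

```python
from collections import Counter

# Placeholder year labels; the 1928-2017 span is an assumption for illustration.
years = list(range(1928, 2018))
window = 30

# Every rolling 30-year sequence the span can produce.
rolling = [years[i:i + window] for i in range(len(years) - window + 1)]

# Years shared by two consecutive sequences.
shared = len(set(rolling[0]) & set(rolling[1]))

# How many sequences each calendar year appears in.
appearances = Counter(year for seq in rolling for year in seq)

print(f"{len(rolling)} rolling sequences from {len(years)} years")
print(f"consecutive sequences share {shared} of {window} years")
print(f"1928 appears in {appearances[1928]} sequence(s), "
      f"1957 in {appearances[1957]}, 2017 in {appearances[2017]}")
```

Consecutive sequences are nearly identical, and years in the middle of the span are counted far more often than years at either end. Independent MC draws have neither property.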

While our only available sample of historical annual returns data seems likely to have been drawn from a normal distribution, not all draws from that normal distribution create a realistic market return sample. A draw from a normal distribution of annual market returns might legitimately represent a theoretical 120% annual market loss or gain but the former would be impossible for a real portfolio and the latter extremely unlikely.

These are not draws that should be used by an MC model of retirement portfolio returns, at least not when the goal is to measure tail risk. As Blanchett and Pfau note above, "There are no constraints to Monte Carlo simulation, only constraints users create in a model. . ." There is no constraint that says an MC model must use unrealistic scenarios simply because they are drawn from a normal distribution. This MC model is meant to model real-life capital markets, not a distribution that exists only in theory.

The sequence of market returns is critical to portfolio survivability. The historical data shows no strings of more than four consecutive annual market losses or more than 15 consecutive annual gains. This isn't predicted by a normal distribution in which the sequence of returns is purely random but it can be modeled with Monte Carlo. There appear to be market forces that constrain normally-distributed market return sequences and a model based solely on a normal distribution of market returns will not account for these market forces.
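One crude way to impose limits like these on an MC model is to measure the streaks in each generated sequence and redraw any sequence that violates them. The sketch below is my own assumption about how that could look, not a description of any particular model; the function names, the 5%/12% parameters and the rejection approach are all hypothetical.

```python
import random

def longest_run(returns, is_match):
    """Length of the longest consecutive run of years satisfying is_match."""
    best = current = 0
    for r in returns:
        current = current + 1 if is_match(r) else 0
        best = max(best, current)
    return best

def is_plausible(returns, max_loss_run=4, max_gain_run=15):
    """Hypothetical filter based on the streak limits quoted above."""
    return (
        min(returns) > -1.0  # a portfolio cannot lose more than 100%
        and longest_run(returns, lambda r: r < 0) <= max_loss_run
        and longest_run(returns, lambda r: r > 0) <= max_gain_run
    )

def draw_plausible_sequence(rng, years=30, mean=0.05, stdev=0.12):
    """Redraw independent normal returns until the sequence passes the filter."""
    while True:
        seq = [rng.gauss(mean, stdev) for _ in range(years)]
        if is_plausible(seq):
            return seq

rng = random.Random(2018)
seq = draw_plausible_sequence(rng)
print(f"longest losing streak: {longest_run(seq, lambda r: r < 0)} years")
print(f"longest winning streak: {longest_run(seq, lambda r: r > 0)} years")
```

Rejecting sequences changes the distribution the model samples from, so a filter like this is a modeling decision that should be stated alongside the results.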

Blanchett and Pfau note that autoregression can be incorporated into MC models. This is important for interest rates and inflation rates, which tend to be persistent. Mean reversion, or "long-term" memory of market returns, can also be modeled if one has a strong opinion regarding the existence of mean reversion in the stock market and a strong opinion of the lag time. The authors further note that a proper MC retirement model also incorporates random life expectancy rather than assuming fixed 30-year retirements.

In short, the things the paper complains about "Monte Carlo" not doing are all things an MC model can do but the researcher's model simply doesn't.

An MC model that limits market returns and sequences of returns to appropriately reflect empirical market performance will eliminate most of the anomalies cited in the white paper but it raises another concern: the paper's analysis appears to be a comparison of the historical data model results to a single MC simulation.

I refer to the reference to the (single) maximum "$26M" terminal portfolio value generated by the MC model and to a single probability of failure. MC models should provide a distribution of possible maximum TPVs and probabilities of ruin, not a single result, and that requires running the model many times.

Running the MC model once might produce a maximum TPV of $26M but a second run with different random market returns might produce a maximum TPV of $6M. We run the MC model many times to estimate how likely various TPVs and probabilities of ruin are. There is no single answer.

(To explain more simply, I have a basic MC probability of ruin model much like the one in the paper. I set it to run 1,000 thirty-year scenarios. The first time I ran this model it calculated a maximum terminal portfolio value of $6.8M. I ran the same model again with nothing changed except that it calculated a new set of random market returns for another 1,000 scenarios. The maximum TPV was $10.4M. The third time it produced $9.5M. The maximum TPV changes each time the random market returns are updated.

I automated the process and ran the MC model 1,000 times with 1,000 different random market returns each.  Maximum TPVs ranged from $4.7M to $41M but the most common maximum TPV was around $10M. This is why we don't stop after running the MC model once and estimating a maximum TPV (in this case) of $6.8M, or a single probability of ruin, for that matter.)
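A stripped-down version of that experiment can be sketched in a few lines. This is not the model described above; the 5% mean, 12% standard deviation, $1M portfolio, $45,000 withdrawal and the choice of 20 repetitions are all assumptions made to keep the example small.

```python
import random

def run_model(rng, scenarios=1_000, years=30, start=1_000_000,
              spend=45_000, mean=0.05, stdev=0.12):
    """One MC run: return (maximum TPV, probability of ruin) across all scenarios."""
    tpvs = []
    for _ in range(scenarios):
        balance = start
        for _ in range(years):
            balance -= spend
            if balance <= 0:
                balance = 0.0  # portfolio depleted; stop this scenario
                break
            # Clamp at zero: a portfolio cannot lose more than 100%.
            balance *= max(1 + rng.gauss(mean, stdev), 0.0)
        tpvs.append(balance)
    ruined = sum(1 for tpv in tpvs if tpv == 0.0)
    return max(tpvs), ruined / scenarios

rng = random.Random(2018)
results = [run_model(rng) for _ in range(20)]
max_tpvs = [max_tpv for max_tpv, _ in results]
ruin_probs = [p for _, p in results]

print(f"maximum TPV ranged from ${min(max_tpvs):,.0f} to ${max(max_tpvs):,.0f}")
print(f"probability of ruin ranged from {min(ruin_probs):.1%} to {max(ruin_probs):.1%}")
```

Each repetition uses fresh random returns, so both the maximum TPV and the probability of ruin come back as a range of values, never one fixed number.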

This extremely large, improbable terminal portfolio value is not a fault of Monte Carlo analysis but the result of a naive model of market returns and sequences of those returns that poorly approximates capital markets as we currently understand them. It is also a point estimate.

(As an aside, I'm not sure why we should be concerned about overly-optimistic TPVs in this context.  This is an analysis of portfolio survivability, which is a function of poorly-performing scenarios.)

Is a $26M terminal portfolio evidence of fat tails? Many portfolios that large over many MC simulations might be but a single result tells us nothing about whether it is more or less likely than a normal distribution would predict. Then there's the other issue – terminal portfolio values aren't normally distributed.

Following is a histogram of TPVs created by the historical data model and a log-normal distribution of those results in red.

The white paper notes that some MC-generated terminal portfolio values are larger than a normal distribution would predict. However, TPVs, as you can see in the chart above, are log-normally distributed, not normally-distributed, and should be expected to be larger than a normal distribution predicts. A log-normal distribution is the expected result of the product of n (30) annual normal distributions and a fat right tail is the expected probability density of a log-normal function. If TPVs were normally distributed, some would be less than zero.
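The right skew is easy to reproduce. This sketch compounds one dollar through 30 independent normal annual returns, ignoring withdrawals for simplicity, and compares the skewness of the results with the skewness of their logarithms. The 5% mean and 12% standard deviation are assumed values, and the function names are hypothetical.

```python
import math
import random
from statistics import fmean, median

rng = random.Random(7)

def growth_of_one_dollar(years=30, mean=0.05, stdev=0.12):
    """Compound one dollar through `years` independent normal annual returns."""
    value = 1.0
    for _ in range(years):
        value *= 1 + rng.gauss(mean, stdev)
    return value

def skewness(sample):
    """Sample skewness: 0 for a symmetric distribution, positive for a long right tail."""
    mu = fmean(sample)
    m2 = fmean((x - mu) ** 2 for x in sample)
    m3 = fmean((x - mu) ** 3 for x in sample)
    return m3 / m2 ** 1.5

values = [growth_of_one_dollar() for _ in range(10_000)]
logs = [math.log(v) for v in values]

print(f"mean {fmean(values):.2f} vs median {median(values):.2f}")
print(f"skewness of terminal values: {skewness(values):.2f}")
print(f"skewness of their logs:      {skewness(logs):.2f}")
```

The terminal values have a mean well above their median and a long right tail, while their logarithms are close to symmetric. That is the signature of an approximately log-normal result, produced here with nothing but normal annual returns.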

Is accepting unrealistic scenarios always a bad thing? This depends on the model's purpose. William Sharpe's RISMAT model[5], for instance, doesn't bother excluding them nor does the research I'm currently co-authoring. The same unrealistic scenarios are included in every strategy tested and filtering them out wouldn't change the comparisons. A small number of unrealistic scenarios is easy to deal with.

The paper in question, however, uses Monte Carlo analysis specifically to measure probability of ruin and this purpose is overly sensitive to unrealistic scenarios because they're the ones that generate results counted as portfolio failures (and large TPV). There will probably be only a relative handful of failed scenarios and adding in a few more failures from unrealistic scenarios can have a dramatic impact on the percent of failures (probability of ruin).  If you insist on trying to estimate tail risk this way, then you should use only realistic scenarios.

To my earlier point, the questionable validity of using MC models specifically to estimate tail risk doesn't disqualify all MC models of retirement finance. As Blanchett and Pfau say, not all cars are slow.

Back to the white paper's claims, no statistical evidence of fat tails or mean reversion is provided and I can find neither of these in these results. I certainly see no evidence of 100% success in the results. I mostly see evidence that a naive MC model provides strange results but I would have guessed that.

Joe Tomlinson wrote a follow-on post[4] to that Blanchett-Pfau piece in which he raised several important points. One is that the selection of metrics is critical when analyzing MC results. In fact, I would argue that estimating a probability of ruin metric is a poor use of MC models since low-probability events are unpredictable.

Tomlinson also makes the point that "The measures being applied by researchers may be more useful than those provided in financial-planning software packages, which provides an opportunity for software developers to introduce new measures to improve the usefulness of their products." So, perhaps an important finding of this paper can be gleaned from the phrase "Monte Carlo analysis (as typically implemented in financial planning software). . ."

If most MC models available to planners are indeed as naive as this white paper suggests and we are using those models to calculate probability of ruin (not my preferred use), then we really do have an MC problem. But it isn't fat tails or the lack of mean-reversion modeling.

So, do Monte Carlo models of retirement finance generate fat tails? I don't see evidence of that. Do they create unrealistic scenarios? Maybe, but that depends on the specific software you're using and its purpose, not on the Monte Carlo statistical tool.

Monte Carlo can be a powerful tool for retirement planning but only if used correctly and for the right application. Estimating tail risk is probably not a good application.


[1] Fat Tails In Monte Carlo Analysis vs Safe Withdrawal Rates. Nerd's Eye View blog.

[2] Fat Tail Distribution: Definition, Examples.

[3] The Power and Limitations of Monte Carlo Simulations, David Blanchett and Wade Pfau, Advisor Perspectives.

[4] The Key Problem with Monte Carlo Software - The Need for Better Performance Metrics, Joe Tomlinson.

[5] Retirement Income Scenario Matrices (RISMAT), William F. Sharpe.

Friday, May 18, 2018

Some Risks Can't be Modeled

My last few posts, The “Future” of Retirement Planning, The Limits of Simulation and Spending Rules and Simulation, have discussed different aspects of retirement planning, specifically, spending rules and Monte Carlo (MC) simulation.

Spending rules calculate a safe amount to spend in the current year. I highly recommend that you reapply your spending rule every year to take new information into account but if that's all you do then you have a one-year planning horizon.

If retirement were a game of combined chance and skill, like backgammon or poker (and it is, of course), then spending rules would identify our best current move. Simulation would tell us the probabilities that this move will ultimately win the game, like knowing the odds that your backgammon opponent will roll a 3 on his next turn (hint: they aren't good — I'd be willing to leave that stone uncovered).

A good poker player will know the odds of the deck and a good backgammon player will know the odds of the dice. They will become second nature. A good retirement planner will know the odds of possible retirement outcomes.

MC provides probability distributions for possible outcomes given a spending rule that would be repeated periodically over many lifetimes. In other words, it identifies financial risks of a retirement plan.

MC, however, only generates “normal” scenarios or those that would probably be drawn from a normal distribution. By design, MC creates most scenarios near the mean or “expected” outcome. The further from the mean, the less likely that a scenario will be created in an MC simulation.

The shortcoming of MC simulation is not that it will create unrealistic scenarios — quite the opposite — it won’t generate many highly unlikely outcomes. So, even after we test retirement plan risk with simulation we still don’t know much about the effects of low-probability catastrophic events.

Simulating such events, even if we could, wouldn’t be very rewarding. After the Great Recession, Nassim Taleb testified before Congress that improbable events are impossible to predict and called those who claim that they can forecast them “charlatans.” (He was referring to Value-at-Risk advocates.)

If Taleb is not to your taste, you can come to nearly the same conclusion by recognizing the huge confidence intervals inherent in our relatively small sample of historical market returns.  We simply can't be confident in predictions based on them.

After simulations, we still need a way to plan for the unknowable. Risk management generally proposes four strategies:
  1. Avoid the risk. You can avoid the risk of riding a motorcycle by not riding one.
  2. Mitigate the risk. Wear a helmet when you ride a bike.
  3. Insure the risk when insurance is available and affordable.
  4. Accept the risk when there is no realistic alternative. The risk of 30 years of consecutive market losses is a good one to accept. So is death by falling satellite.
Avoiding unforeseeable risks is clearly not an option. It's hard to steer around an obstacle you don't know is there. Mitigating these risks presents a similar challenge.

We can insure some retirement risks by buying annuities, umbrella liability and life insurance, for example, but insurers won't offer me long-term care insurance and premiums can be (or can become) unaffordable.

Regardless, at some point we must face the fact that our retirement plan can’t manage every risk by relying on good fortune in the stock market. (I say this knowing full well that many retirees in the “probabilist school” believe precisely that. I just don’t share their optimism, probably because I have spoken with 80-year-olds who lost that bet and must now get by on Social Security benefits alone.)

The best spending rules won’t eliminate these risks. After a long sequence of poor returns, they will simply reduce safe spending to a level that no longer supports the household’s standard of living. Nor will the best simulation software ferret them out and suggest fixes.

After selecting a spending rule and modeling outcomes with MC simulation we need to address low-probability catastrophic outcomes with insurance when we can. This is the theory behind floor-and-upside strategies — hedge your bet.

I consolidated a list of identified retirement risks in Retirement is Risky Business – Here's a List that should provide a starting point for your review. Low-probability catastrophic outcomes defy avoidance and mitigation but they’re worth contemplating and possibly worth insuring.

The key takeaway is that MC simulation can tell you a lot about fairly normal outcomes but very little about improbable, high-impact events, also known as "tail risk." Consequently, simulation is not the end of the retirement planning process. We have to evaluate tail risk by some process other than prediction and that means "seat of the pants."

It won't be a thorough process and according to Taleb, it can't be. That doesn't mean you shouldn't try. Having a floor of safe income, for example, can mitigate a lot of different, even unpredictable risks.

The best retirement plan will fail if Earth is hit by another dinosaur-ending asteroid but that's a risk we probably have to accept. There may be other low-probability, high-impact risks that we can mitigate, though, without being able to predict them with models.

Thanks, Mason Finance Group, for choosing The Retirement Café for your Best Retirement Blogs of 2018 list. Congrats to Ken Steiner's How Much Can I Afford to Spend in Retirement? blog, as well!

Monday, May 14, 2018

Spending Rules and Simulation

My recent post on Monte Carlo (MC) simulation, The Retirement Café: The “Future” of Retirement Planning, seems to have spawned a strange debate about whether a deterministic "spreadsheet" method of calculating safe current spending from a retirement portfolio is better or worse than using Monte Carlo simulation to estimate the probability of outcomes.

This debate is not unlike arguing whether a screwdriver is superior to a hammer: they do entirely different things, a good toolbox includes both, and even when you use both effectively, you're still probably going to need a saw. (I'll write more about the saw in my next post.)

I'm going to refer to the "method of calculating safe current spending from a retirement portfolio" as a spending rule because that is its purpose. The result is typically a single dollar amount as in, "applying the 4% Rule to a $1M portfolio when expecting a 30-year retirement estimates the retiree can spend $40,000 each year" or "the required minimum distribution (RMD) from your IRA this year is $38,517."

An interesting observation about these spending amounts is that if you recalculate this amount in subsequent years (as you should do with any spending rule but must do by law for IRA and other tax-deferred plans after age 70½), it is possible but extremely unlikely that you will ever calculate the same result again. That's because remaining life expectancy, current portfolio value, and other factors will change over time.

On the other hand, the purpose of Monte Carlo simulation is to estimate the probability of outcomes assuming a retiree were to retire many, many times. The probabilities can be used to identify potential problems and allow a planner to attempt to mitigate them.

The result of simulation is not a single number for a single year as calculated by spending rules but one or more probability distributions showing the odds of various outcomes, as in "there is a 5% chance that you will need to spend less than $20,000 under certain conditions."

While spending rules estimate safe current-year spending, MC simulation can provide additional insight into many questions, such as:
  • The 4% Rule says I can spend $40,000. What are the probabilities that I can safely spend $45,000 or $50,000?
  • How much could I spend if I wanted a 1% or 10% probability of ruin instead of 5%?
  • Which equity allocations most often appear in failed scenarios? (I often find that 40% to 60% equity allocations show up in fewer failures.)
  • What is the value of my safe "floor" over time? Are there gaps? Would annuities help?
  • How often does delaying Social Security benefits improve my outcomes?
The list can go on and on (this is the reason MC is used for academic retirement finance research) but it does not include "how much can I safely spend this year" beyond what the simulation's chosen spending rule can tell you before you even begin simulation.

Monte Carlo simulation and spending rules are different tools for different jobs.

Wade Pfau's book, How Much Can I Spend in Retirement?[2], provides an extensive inventory of spending rules, including:
  • 4% Rule (constant-dollar)
  • Constant percentage
  • Steiner ABB
  • RMD
  • Kitces’s Ratcheting, and others.
At present, I tend to favor Steiner's ABB[3] for determining current safe spending, an approach which Pfau generally refers to as "actuarial" or "PMT" methods. He notes that Moshe Milevsky also believes this is a great place to start[4].
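To give a flavor of the "PMT" family, here is a generic annuity-payment calculation recalculated each year. It is not Steiner's ABB spreadsheet, which considers more than this. The function name, the 3% assumed real return, the 30-year horizon and the three "actual" returns are all made up for illustration.

```python
def pmt_spending(balance, years_left, real_rate):
    """Level start-of-year payment that would exhaust `balance` in `years_left`
    years if the portfolio earned exactly `real_rate` every year."""
    if years_left <= 0:
        raise ValueError("years_left must be positive")
    if real_rate == 0:
        return balance / years_left
    # Present value of an annuity-due paying 1 per year for years_left years.
    annuity_due = (1 - (1 + real_rate) ** -years_left) / real_rate * (1 + real_rate)
    return balance / annuity_due

# Hypothetical inputs: $1M, 30-year horizon, 3% assumed real return.
assumed_rate = 0.03
actual_returns = [0.03, -0.12, 0.08]  # made-up results for the first three years
balance = 1_000_000

for year, actual in enumerate(actual_returns):
    spend = pmt_spending(balance, 30 - year, assumed_rate)
    print(f"year {year + 1}: balance ${balance:,.0f}, spend ${spend:,.0f}")
    balance = (balance - spend) * (1 + actual)
```

When the actual return matches the assumed return, next year's amount is unchanged. When it doesn't, the recalculated amount moves with the portfolio, which is why the same figure rarely comes up twice.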

MC isn't tied to any particular spending rule, although most free online programs model constant-dollar withdrawals, aka the 4% Rule. That's unfortunate in my opinion because I consider it a very poor spending strategy.

There is no "Monte Carlo spending rule."

MC models can implement any of the rules in Wade's book or any other. MC takes the spending rule of the modeler's choice and predicts what would happen if that rule were applied every year throughout many lifetimes.

It doesn't make sense, then, to ask how MC spending compares to these rules. MC uses these rules to calculate annual spending.

If I build a simulation model using the 4% Rule, spending for the first year of my model will look essentially the same as the 4% Rule. If I build it using ABB or a constant-percent spending rule, instead, the first year spending in my simulation model will look like those. Results only begin to differ in year two of the simulation as other modeled factors, like expected market returns, change.
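In code, this means the spending rule is just an input to the simulation. The sketch below passes two toy rules into the same simulator; the rule functions, the 5%/12% return assumptions and the $1M portfolio are hypothetical choices for illustration.

```python
import random

def four_percent_rule(balance, initial_balance):
    """Constant-dollar: 4% of the initial balance, in real terms, every year."""
    return 0.04 * initial_balance

def constant_percent(balance, initial_balance):
    """4% of the current balance each year."""
    return 0.04 * balance

def simulate_spending(rule, rng, scenarios=1_000, years=30,
                      start=1_000_000, mean=0.05, stdev=0.12):
    """Return one list of annual spending per scenario for the given rule."""
    paths = []
    for _ in range(scenarios):
        balance, path = start, []
        for _ in range(years):
            spend = min(rule(balance, start), balance)  # cannot spend more than we have
            path.append(spend)
            balance = (balance - spend) * max(1 + rng.gauss(mean, stdev), 0.0)
        paths.append(path)
    return paths

for rule in (four_percent_rule, constant_percent):
    paths = simulate_spending(rule, random.Random(1))
    year1 = [p[0] for p in paths]
    year2 = [p[1] for p in paths]
    print(f"{rule.__name__}: year 1 ${min(year1):,.0f} to ${max(year1):,.0f}, "
          f"year 2 ${min(year2):,.0f} to ${max(year2):,.0f}")
```

First-year spending is whatever the chosen rule says, in every scenario. The simulation only adds information from the second year onward, once the random returns start to matter.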

Sometimes when I write an MC model I make up my own rule. MC is a good way to compare spending rules.

Not all spreadsheet or "deterministic" models are the same. Ken Steiner's ABB model, for example, is quite different than spreadsheet models that simply subtract fixed spending each year and then grow the portfolio by the same expected market return growth factor. Unfortunately, there seem to be a lot more of the latter out there.

Not all MC models are the same, either, but like spreadsheet models, most of the free ones are pretty much the same. They often assume a fixed-length retirement, 4% Rule spending and probability-of-success as the evaluation criteria. More concerning, they usually predict outcomes for a portfolio, which is only a small piece of a retirement plan (see The Retirement Café: Three Degrees of Bad). A simulation for retirement planning should simulate retirement finances, not just a retirement savings portfolio.

The practical implications are somewhat complicated. Using a spending rule is relatively straightforward and accessible; good simulators not so much. Steiner offers a free spreadsheet for ABB at his website. Free RMD calculators are widely available. If you are unfamiliar with MC simulation, though, learning enough to use the tool effectively is a daunting task and probably not worth the effort.

Your best bets, in that case, are to find a knowledgeable planner who has access to good software or to pay for Laurence Kotlikoff's E$PlannerPLUS[1].

Screwdrivers are great for removing a screw but terrible at hammering a nail. Spending rules are intended to estimate a safe amount to spend from a savings portfolio in the current year but tell you nothing about the probabilities of lifetime outcomes from applying that rule repeatedly.

The smooth path of a spreadsheet projection (red in the chart below) is not a rational expectation for a real retirement, nor is the terminal portfolio value it estimates. The future projections are not a median or "expected" outcome. They're simply an assumption for estimating current spending.

I'm not even sure how to describe that TPV. We could say that it is the terminal portfolio value we would expect if the stock market returned the identical expected return every year with no sequence risk for a fixed lifetime but since none of those assumptions is realistic it is not a meaningful value.
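A small sketch makes the point concrete by comparing a fixed-return projection with the spread of simulated outcomes built from the same mean return. The $1M portfolio, $40,000 withdrawal, 5% mean and 12% standard deviation are assumptions chosen for illustration, and the function name is hypothetical.

```python
import random
from statistics import median

def terminal_value(returns, start=1_000_000, spend=40_000):
    """Withdraw at the start of each year, then apply that year's return."""
    balance = start
    for r in returns:
        balance = max(balance - spend, 0.0) * max(1 + r, 0.0)
    return balance

years, mean, stdev = 30, 0.05, 0.12  # hypothetical real-return assumptions

# "Spreadsheet" projection: the same expected return every single year.
spreadsheet_tpv = terminal_value([mean] * years)

# Simulated outcomes: same mean return, but with year-to-year variation.
rng = random.Random(11)
simulated = sorted(
    terminal_value([rng.gauss(mean, stdev) for _ in range(years)])
    for _ in range(5_000)
)

print(f"spreadsheet TPV:   ${spreadsheet_tpv:,.0f}")
print(f"simulated 5th pct: ${simulated[len(simulated) // 20]:,.0f}")
print(f"simulated median:  ${median(simulated):,.0f}")
print(f"simulated 95th pct: ${simulated[-(len(simulated) // 20)]:,.0f}")
```

Under these assumptions the spreadsheet figure sits above the simulated median and says nothing about the wide range of outcomes around it.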

MC simulations only estimate current safe spending by incorporating a spending rule but they're great for understanding the probabilities that determine the outcomes of the retirement finance "game."

When I suggested in a previous post that if you are planning retirement with a spreadsheet model you should test it with simulation, I wasn't suggesting that MC provides a better estimate of current safe spending — it doesn't do that, at all — but that you understand the probabilities of future outcomes. If you don't, then you probably don't have a good understanding of the risk in your plan.

I read a post complaining that these simulations are useless because the retiree can't know which of those tens of thousands of potential outcomes will be hers. That's like a poker player arguing that knowing the probabilities of the card deck is useless because a player can't predict which card will be drawn next.

If you recalculate a good spending rule every year (variable spending) you are unlikely to deplete your portfolio, although spending could become awfully tight should your portfolio fall on hard times.

(For example, there's an excellent post at Early Retirement Now[5] showing how Draconian spending cuts could become using the Guyton-Klinger Guardrails spending strategy. "Sustainable withdrawals" means you aren't likely to deplete your portfolio. It doesn't imply that the variable amount you will be able to safely spend will be enough to sustain you.)

On the other hand, simply recalculating spending annually means you're planning one year at a time. I prefer a plan with a longer horizon that is updated every year. Keep your eyes on the prize and make the necessary annual adjustments to get you there.

Spending rules estimate a safe amount to spend in the current year. Monte Carlo simulations estimate the probabilities of future outcomes one should expect when recalculating a chosen safe spending estimate every year over many lifetimes. But, MC only thoroughly covers reasonably-probable outcomes. A good retirement plan still needs a "saw" to cover the improbable.

More on that next time in Some Risks Can't be Modeled.


[1] ESPlannerPLUS | ESPlanner Inc.

[2] How Much Can I Spend in Retirement?: A Guide to Investment-Based Retirement Income Strategies eBook: Wade Pfau: Kindle Store

[3] Actuarial Approach – Using Basic Actuarial Principles to Accomplish Your Financial Goals.pdf, Ken Steiner.

[4] It’s Time to Retire Ruin (Probabilities), Milevsky, 2016.

[5] The Ultimate Guide to Safe Withdrawal Rates – Part 10: Debunking Guyton-Klinger Some More – Early Retirement Now

Monday, April 23, 2018

The Limits of Simulation

In a previous post, The “Future” of Retirement Planning, I explained that Monte Carlo simulation of retirement finances provides all the information available from a deterministic “spreadsheet” model and more. Among other advantages, it models sequence of returns risk.

Monte Carlo simulation, however, has its own limitations.

A reader commented on my previous post that Monte Carlo simulation “creates thousands of possible and impossible scenarios.”

The “impossible” part of that statement is wrong.

In fact, the opposite is true. Monte Carlo simulation’s biggest shortcoming is that most of the scenarios it produces will be the most likely and simplest scenarios while lots of possible but unlikely scenarios that could destroy a retirement will never be simulated.

Most retirement models, deterministic or stochastic, don't model the risks that are most likely to lead to bankruptcy, like spending shocks, divorce or a combination of inter-related risks.[1]

Any planning exercise begins with the basics and is then augmented by a bunch of “what-if’s.”

Planning a picnic? You’ll need a blanket, some food and some lemonade. But, then, what if it rains? What if the park is closed? What if there are too many ants or mosquitoes? What if the sun is too intense? What if someone gets a bee sting? A good plan will consider these possible bad outcomes and prepare for them.

Monte Carlo simulation is a great way to quickly generate a few hundred thousand retirement “what-if” scenarios. Analyzing them with statistics allows us to comprehend the big picture without looking at each scenario individually (an impractical task). They allow much greater in-depth analysis than the spreadsheet approach because they consider more of the key factors of retirement success and provide a lot more what-if’s but here are some of their limitations.

1. They model questionable assumptions.

Most Monte Carlo simulation models assume that market returns are normally distributed, even though we know they aren’t. We see far more — and far more severe — market crises than a normal distribution predicts. We’re either living in a very unlucky universe or we’re using an optimistic distribution for market returns. We use a normal distribution because it's the closest parametric distribution we have and that simplifies the math but we’re pretty sure the market has fatter tails than a normal distribution.
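A rough sketch shows how much the choice of distribution matters in the left tail. The Student-t distribution with 4 degrees of freedom used here is only a stand-in for "fatter tails than normal"; it is my assumption for illustration, not a claim about the true distribution of market returns.

```python
import math
import random
from statistics import NormalDist

def fat_tail_frequency(k=3.0, df=4, n_draws=200_000, seed=42):
    """Share of unit-variance Student-t draws that fall below -k standard deviations."""
    rng = random.Random(seed)
    scale = math.sqrt((df - 2) / df)           # rescale so the variance is 1
    hits = 0
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2, 2.0)   # chi-square with df degrees of freedom
        t = z / math.sqrt(chi2 / df)           # a Student-t draw
        if t * scale < -k:
            hits += 1
    return hits / n_draws

normal_prob = NormalDist().cdf(-3.0)           # exact left-tail probability for a normal
fat_freq = fat_tail_frequency()
print(f"Normal:    {normal_prob:.5f}")
print(f"Student-t: {fat_freq:.5f}")
```

Both distributions have the same standard deviation, yet the fat-tailed one produces "3-sigma" losses several times as often. A simulation built on the normal assumption will understate those years accordingly.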

We have a couple of hundred years of market return data but that isn’t enough to create anything near a reasonable confidence interval. That is to say, our (historic) sample size is way too small to make confident guesses of the mean market return and there is no convincing argument that the next thirty years of market returns will look like the last 30 or the last 130.

(These are also problems with deterministic models.)

Monte Carlo is one of the best planning tools we have but it has its limits.

In “Where is the Market Going? Uncertain Facts and Novel Theories”[2], John Cochrane notes that over the fifty years from 1947 to 1996 the excess return of stocks over T-bills was 8% but, assuming the annual returns are statistically independent, the standard confidence interval for the mean return ranged from 3% to 13%.

Let me say that in simpler terms. If you asked me the mean excess market return, then based on the sample data from that period I would guess it’s about 8%. But, if you then asked me how confident I am that 8% is the mean, I would say that I’m 95% confident (not totally) that it isn’t less than 3% or more than 13%. In other words, I’m not that confident. (This is also the reason that we can’t identify optimal asset allocations.)
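The arithmetic behind that interval is short enough to check. The sketch below assumes a standard 95% interval (z = 1.96) and backs out the standard deviation of annual excess returns implied by the 3%-to-13% range; that implied figure is my calculation from the numbers quoted here, not a number quoted from Cochrane.

```python
import math

n_years = 50
mean_excess = 0.08
half_width = 0.05          # (13% - 3%) / 2
z95 = 1.96                 # assumed: standard 95% normal interval

# half_width = z * stdev / sqrt(n)  =>  stdev = half_width * sqrt(n) / z
implied_stdev = half_width * math.sqrt(n_years) / z95

def confidence_interval(mean, stdev, n, z=z95):
    """Interval for the mean of n independent annual returns."""
    standard_error = stdev / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error

def years_needed(stdev, target_half_width, z=z95):
    """Years of independent data needed to shrink the interval to +/- target_half_width."""
    return math.ceil((z * stdev / target_half_width) ** 2)

low, high = confidence_interval(mean_excess, implied_stdev, n_years)
low200, high200 = confidence_interval(mean_excess, implied_stdev, 200)
print(f"Implied annual standard deviation: {implied_stdev:.1%}")
print(f"50 years of data:  {low:.1%} to {high:.1%}")
print(f"200 years of data: {low200:.1%} to {high200:.1%}")
print(f"Years needed for +/- 1%: {years_needed(implied_stdev, 0.01)}")
```

The interval narrows only with the square root of the sample size, so under these assumptions even 200 years of data would leave a range of about 5 percentage points, and pinning the mean down to within one point would take on the order of 1,250 years.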

Monte Carlo simulation addresses this uncertainty by generating scenarios with a fairly broad range of market returns. Some scenarios might have a return near 3%, for example, and others near 13%, though most would be closer to 8%. Contrast this with a spreadsheet model that assumes a single market return with no variance.

Some financial writers define market volatility as “risk” and they define “uncertainty” as not even knowing the underlying distribution. We don’t know the underlying distribution of market returns or if the underlying mean return changes over time.

The effect of all this uncertainty is that, while Monte Carlo simulations appear to generate accuracy to several decimal places, our sample size of 200 years or so of historical U.S. stock market returns is too small to inspire confidence. That doesn’t render simulation results irrelevant, however. Hurricane forecasts, as Larry Frank points out, are not very accurate but still very useful. It’s a good analogy for dynamic retirement planning.

When someone says, “My Monte Carlo planner says I have only a 5% probability of outliving my savings, so I’m good, right?”, my answer is, “Well, yes. . . assuming the market behaves much as it has in the past, that you invest your portfolio wisely and earn something near market averages, that you don’t experience a 3-sigma market crash early in retirement, that you experience no spending shocks and that you consider a 1-in-20 chance of outliving your savings ‘good’, then, yeah, you’re probably good.”

2. Simulations are only as good as the strategy they model.

The first thing we need to know is what the simulation models. Most model the probability of outliving a portfolio of stocks and bonds but, as I explained in Three Degrees of Bad[3], portfolio depletion can’t be equated with retirement failure. Portfolio depletion can even be part of the plan.

Some retirement financing strategies are simply flawed. I believe fixed-spending strategies and set-and-forget strategies are hopelessly flawed, for example. Monte Carlo simulation of a flawed strategy for an individual household’s retirement plan is pointless.

I find the concept of “retirement ruin” to be meaningless (retirees don’t stop living and spending when their portfolio is depleted) so I have little confidence in retirement models based on probability of ruin[4,5]. U.S. retirees would declare bankruptcy if their retirement failed, emerge with some protected assets, and live off Social Security benefits. Their retirement wouldn’t simply end, though their standard of living might dramatically decline. Instead, I model the probability of not meeting desired spending.

So, when I respond, “yeah, you’re probably good”, I add, “. . . and assuming your Monte Carlo simulation used a reasonable model.”

3. Spending shocks are difficult to model, so they seldom are.

Spending shocks can decimate a retirement plan but they are difficult to model. Shocks typically have a low probability of occurring but a potentially huge magnitude. These risks are usually better mitigated by insurance, when affordable insurance is available, than by relying on the low probability of their occurrence. Even if we model them, insurance (Social Security benefits, annuities and pensions) will usually be the answer.

4. Simulations probably won’t generate rare but potentially catastrophic scenarios.

Monte Carlo simulation works by generating many of the most probable scenarios and fewer and fewer of the less-probable scenarios. A simulation won’t thoroughly analyze very low-probability market returns (tail risk), for example, because it is unlikely to generate more than a few such scenarios.
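A back-of-the-envelope sketch shows how thin that coverage is. It assumes annual returns are drawn independently from a normal distribution, and the 10,000-scenario, 30-year setup is a hypothetical example rather than any particular planner's configuration.

```python
from statistics import NormalDist

def expected_tail_draws(n_draws, k_sigma):
    """Expected number of draws more than k_sigma standard deviations below the mean."""
    return n_draws * NormalDist().cdf(-k_sigma)

n_scenarios, years = 10_000, 30
for k in (2, 3, 4, 5):
    first_year = expected_tail_draws(n_scenarios, k)        # crash in year one
    any_year = expected_tail_draws(n_scenarios * years, k)  # crash in any simulated year
    print(f"{k}-sigma loss: {first_year:10.4f} first-year draws, "
          f"{any_year:10.4f} draws across all years")
```

Under these assumptions only about 13 or 14 of the 10,000 scenarios open with a 3-sigma loss, and a 5-sigma first year would almost never appear. A handful of scenarios is not enough to say much about what happens out there.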

A normal distribution “tails off” at both ends. The “skinny tails” show that outcomes far from the mean are highly unlikely, which means they are equally unlikely to be included in a Monte Carlo simulation. We refer to this as “tail risk.” You can see the skinny tails of a normal distribution in the diagram above.

Outcomes in the tails are improbable but, as I recently read somewhere, the left tail should be labeled “There be dragons.”

Unlikely outcomes to the right of the mean (the right tail) aren’t a problem; those outcomes are improbably good. It’s the left tail risk that’s a problem because the distribution tells us that outcomes there are improbable but it doesn’t tell us their magnitude.

The reality is that we can’t estimate tail risk for the market because we don’t know the distribution of market returns. We guess that it is “normal-ish” tail risk but we know that market crashes occur far more often than a normal distribution would predict. Monte Carlo simulation isn’t helpful in predicting very unlikely but catastrophic events but then, nothing is.

In Antifragile[7], Nassim Taleb says, "[Antifragility] provides a solution to what I have called the Black Swan problem — the impossibility (emphasis mine) of calculating the risks of consequential rare events and predicting their occurrence."

5. Complex scenarios are difficult to model, so they seldom are.

Complex scenarios are difficult to conceive, let alone model. Elder bankruptcy research by Deborah Thorne[6] showed that most of the worst-case retirement finance outcomes (those that end in bankruptcy) are not caused by a single factor, like spending too much on credit cards, but by a complex self-reinforcing cycle of interdependent risks. These numerous complex combinations of risk are unlikely to be modeled.

Here’s an example that would be difficult to anticipate and therefore difficult to generate with a simulation model.

A retiree borrows a reverse mortgage, feeling secure in the fact that it is non-recourse. He knows that his loan can’t be foreclosed unless he moves out of the home, which he doesn’t plan to do. His wife becomes ill and runs up huge medical bills. They spend home equity to pay bills, then run up credit card debt and eventually file for bankruptcy. They can no longer afford to live in the home and when they leave, repayment of the reverse mortgage will be triggered.

Monte Carlo simulations can generate hundreds of thousands of possible future scenarios but they won’t include complex, interdependent risks like this one. On the other hand, Monte Carlo simulations may surprise you by showing scenarios, for example, in which purchasing an annuity actually results in a greater legacy.

6. Simulation can’t predict your future.

I recently wrote about a blog post that suggested that Monte Carlo simulation has no value because a retiree can’t know which of the thousands of possible future paths her future will track. That is absolutely true (your individual path is unknowable) but the argument is irrelevant. That argument is based on the false premise that we run simulations in order to find that path. We run simulations to collect information on the range of many paths.

A retiree shouldn’t look at simulation results, regardless of the number of scenarios simulated, and assume his or her future is in there somewhere. More often than not it will be, but a good retirement plan doesn’t rely on that. A good retirement plan should also consider what happens when really bad, improbable things happen.

7. Many Monte Carlo models underestimate risk.

Many, and probably most, Monte Carlo models calculate risk of ruin simply by counting the percent of scenarios that end in ruin. Some scenarios that are counted as successes, however, may have been exposed to significantly greater risk than others. That 95% probability of success is probably best case.
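Here is a minimal sketch of that counting approach, extended to flag "successes" that barely survived. Everything in it is an assumption for illustration: fixed real spending, normally distributed real returns with a 5% mean and 12% standard deviation, and my own arbitrary definition of a near miss as finishing with less than five years of spending left.

```python
import random

def simulate_fixed_spending(start=1_000_000, spend=40_000, years=30,
                            mean=0.05, stdev=0.12, n_paths=5_000, seed=7):
    """Return (share of paths ruined, share that survived with under 5 years of spending)."""
    rng = random.Random(seed)
    ruined = 0
    near_miss = 0
    for _ in range(n_paths):
        balance = start
        for _ in range(years):
            balance -= spend                  # withdraw at the start of the year
            if balance <= 0:
                break                         # portfolio depleted
            balance *= 1 + rng.gauss(mean, stdev)
        if balance <= 0:
            ruined += 1
        elif balance < 5 * spend:
            near_miss += 1                    # counted as a "success" by simple tallies
    return ruined / n_paths, near_miss / n_paths

ruin_rate, near_miss_rate = simulate_fixed_spending()
print(f"Counted as ruin:          {ruin_rate:.1%}")
print(f"Counted as success, but finished nearly depleted: {near_miss_rate:.1%}")
```

A model that reports only the first number treats the near misses exactly like the scenarios that ended with millions. Calling `simulate_fixed_spending(spend=30_000)` and `simulate_fixed_spending(spend=50_000)` gives a quick way to compare the relative risk of two spending levels under the same assumptions.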

At this point, you may be asking yourself why I recommend a tool with so many shortcomings. One answer is that it has fewer shortcomings than the alternatives. It considers more factors and generates more information. We look at simulation results to get an overall view of the most probable outcomes and the perspective we gain is, like weather forecasts, imperfect but highly useful.

Understanding what will probably happen and what might happen in most scenarios is a great place to start.

The important takeaways are these. Monte Carlo simulation can be a powerful tool for retirement planning because it provides more information than other approaches. Ultimately, however, we must realize that, as Yogi is credited with saying, predictions are really hard — especially about the future. The results are more of a distribution of a ballpark estimate than a single answer but it's more useful to estimate a 40% chance of rain tomorrow than to maintain that we can't know for sure so it isn't worth considering. Monte Carlo simulation will not predict or protect your retirement from "consequential rare events."

The results are also better used to compare the relative risk of one strategy to another than to measure their absolute risk. If simulation tells you that 3% spending is half as risky as 5% spending, then you can be more confident that one is safer than the other than you can be that there is actually a 3% risk that the former will result in ruin.

If this is all a little too confusing, bear with me. You can learn to use the information provided by simulations without a complete understanding of Monte Carlo models. You probably couldn't build a GPS device, either, but you're probably confident using one. Perhaps, you just need to find a planner that will run one for you. Maybe you have a perfectly fine plan built with a different type of model or no model at all and you just need simulation to improve your confidence.

Next time I’ll discuss the relationship between Spending Rules and Simulation.


[1] The Retirement Café: Why Retirees Go Broke.

[2] Where is the Market Going? Uncertain Facts and Novel Theories, John H Cochrane.

[3] The Retirement Café: Three Degrees of Bad.

[4] The Retirement Café: Time to Retire the Probability of Ruin?.

[5] It’s Time to Retire Ruin (Probabilities), Moshe Milevsky, Financial Analysts Journal.

[6] The (Interconnected) Reasons Elder Americans File Consumer Bankruptcy, Deborah Thorne.

[7] Antifragile: Things That Gain from Disorder (Incerto), Nassim Taleb.