You run a Monte Carlo retirement calculator with 500 iterations and get a success rate of 87%. You run it again. This time it says 82%. Again: 89%. The plan did not change. The inputs did not change. So why does the answer keep moving?
Because 500 iterations is not enough.
Monte Carlo is a statistical sampling method. Each iteration is one randomly generated future for your portfolio. The success rate is the percentage of those futures where your money lasted. With too few samples, you are measuring noise, not signal.
The Convergence Problem
Think of it like polling before an election. If you ask 50 people how they will vote, your margin of error is huge. Ask 1,000 and the picture stabilizes. Ask 10,000 and you can be quite confident in the result.
Monte Carlo works the same way. Each iteration is one "respondent" drawn from the universe of possible market return sequences. The success rate is your poll result. And like any poll, the margin of error shrinks as you increase the sample size.
At 100 iterations, the standard deviation of your success rate estimate can be 3-5 percentage points. (Each iteration is an independent pass/fail trial, so the standard error of the estimated rate is sqrt(p(1-p)/n), which works out to about 3.6 points at n = 100 for a true rate of 85%.) That means an 85% result could easily be 80% or 90% in reality. The number is almost meaningless.
At 1,000 iterations, the standard deviation drops to about 1-1.5 percentage points. Much better. You can start to trust the ballpark.
At 10,000 iterations, it drops below 0.5 percentage points. The success rate you see is very close to what you would get with infinite iterations.
At 50,000 iterations, the estimate is rock solid. Running it again will give you a result within a fraction of a percent of the previous run.
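The convergence rates above follow directly from treating each iteration as a pass/fail trial. A minimal sketch (the 85% true success rate is an illustrative assumption, not a recommendation) reproduces the numbers:

```python
import numpy as np

# How the standard error of a Monte Carlo success-rate estimate shrinks
# with the iteration count n. Assumes an illustrative true success rate
# of 85%: each run of n iterations is then a binomial sample, so the
# standard error of the estimated rate is sqrt(p * (1 - p) / n).
rng = np.random.default_rng(0)
true_p = 0.85

se_pp = {}  # iteration count -> standard error in percentage points
for n in (100, 1_000, 10_000, 50_000):
    se_pp[n] = np.sqrt(true_p * (1 - true_p) / n) * 100
    print(f"n = {n:>6}: standard error = {se_pp[n]:.2f} pp")

# Empirical check at n = 1,000: repeat the whole simulation 2,000 times
# and look at the spread of the resulting success rates.
rates = rng.binomial(1_000, true_p, size=2_000) / 1_000 * 100
print(f"empirical spread at n = 1,000: {rates.std():.2f} pp")
```

Note that the error only shrinks with the square root of n: cutting the noise in half costs four times the iterations.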
Why Percentile Bands Need Even More Iterations
The success rate is an average measure. It converges relatively quickly. But the percentile bands showing the range of portfolio outcomes are harder to pin down, especially at the extremes.
The median (50th percentile) outcome converges almost as fast as the success rate. But the 5th percentile (your worst realistic scenario) and the 95th percentile (your best realistic scenario) require many more samples to stabilize. This is because extreme outcomes are rare by definition. You need a larger sample just to see enough of them for a reliable estimate.
If you care about tail risk (and you should, given sequence-of-returns risk and fat-tail distributions), then you need more iterations than someone who only looks at the success rate headline.
A rule of thumb: you need roughly 4x the iterations to get the same precision at the 5th/95th percentile as you get at the median. (The variance of a sample quantile scales with p(1-p) / f(q)^2, where f(q) is the probability density at that quantile; the density is thin in the tails, so tail quantiles are noisier at any given sample size.)
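The rule of thumb can be checked empirically. In this sketch, each simulated "run" draws n outcomes from a standard normal distribution (a stand-in for ending balances, purely for illustration), and we measure how much the median and 5th-percentile estimates wobble across repeated runs:

```python
import numpy as np

# Run-to-run spread of a percentile estimate: median vs. 5th percentile.
# Outcomes are standard normal draws (an illustrative assumption).
rng = np.random.default_rng(1)

def percentile_spread(n_iterations, q, n_repeats=2_000):
    """Std dev of the q-th percentile estimate across repeated runs."""
    runs = rng.standard_normal((n_repeats, n_iterations))
    return np.percentile(runs, q, axis=1).std()

median_spread = percentile_spread(1_000, 50)
tail_spread = percentile_spread(1_000, 5)
tail_spread_4x = percentile_spread(4_000, 5)

print(f"median: {median_spread:.4f}, 5th pct: {tail_spread:.4f}, "
      f"5th pct with 4x iterations: {tail_spread_4x:.4f}")
```

Under normal outcomes the 5th percentile comes out around 1.7x noisier than the median, so matching its precision takes roughly 3x the iterations; fat-tailed distributions push the factor higher, which is why 4x is a reasonable rule of thumb.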
The Fat-Tail Factor
The distribution you use for returns also affects how many iterations you need.
Under a normal distribution, extreme returns are rare, so the tails converge reasonably fast. Under a Student's t-distribution or other fat-tailed model, extreme returns are more common but also more variable in magnitude. The tails are "thicker" and take longer to converge.
If you are using fat-tailed distributions (which you should be for honest risk assessment), add more iterations. Where 1,000 might be adequate for Gaussian returns, you probably want 5,000-10,000 for fat-tailed ones to get stable tail estimates.
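The slower tail convergence is easy to demonstrate. Here the per-iteration outcomes are drawn from a normal distribution versus a Student's t with 3 degrees of freedom (an illustrative fat-tailed choice, not a calibrated return model), and we compare how much the 5th-percentile estimate varies between runs:

```python
import numpy as np

# Tail convergence under fat tails: run-to-run spread of the
# 5th-percentile estimate for normal vs. Student's t(3) outcomes.
rng = np.random.default_rng(2)
n_iterations, n_repeats = 1_000, 2_000

normal_tail = np.percentile(
    rng.standard_normal((n_repeats, n_iterations)), 5, axis=1).std()
fat_tail = np.percentile(
    rng.standard_t(df=3, size=(n_repeats, n_iterations)), 5, axis=1).std()

print(f"5th pct spread: normal {normal_tail:.3f}, t(3) {fat_tail:.3f}")
```

The tail spread roughly doubles under t(3) at the same iteration count, so matching the normal-case precision takes several times the iterations.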
Speed vs Accuracy Tradeoffs
More iterations take more compute time. On a modern browser, the difference between 1,000 and 50,000 iterations might be the difference between half a second and a few seconds. Not a big deal for a desktop user, but it matters for mobile devices or for simulators that run on older hardware.
Some calculators default to low iteration counts to keep the interface snappy. This is an understandable design choice, but it comes at the cost of reliability. A fast answer that bounces around by 5 percentage points every time you run it is worse than a slower answer you can trust.
The best approach is to let users choose. Offer a fast mode (1,000 iterations) for quick exploration and a high-precision mode (10,000-50,000) for final analysis. That way, you get responsiveness during the "what if" phase and confidence when it matters.
How to Tell If Your Simulator Uses Enough
There is a simple test: run the same scenario twice without changing anything. If the success rate changes by more than 1 percentage point, the iteration count is too low.
Better yet, run it five times and look at the spread. If the success rate bounces between 83% and 89%, you are looking at noise. If it stays between 86.2% and 86.8%, the estimate has converged.
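The repeat-run test can be sketched against a toy simulator. Here simulate_success_rate is a hypothetical stand-in: each path succeeds independently with a fixed probability, in place of a full return-sequence simulation:

```python
import numpy as np

# Repeat-run convergence check against a toy simulator. The success
# probability of 86% is an illustrative assumption.
rng = np.random.default_rng(3)

def simulate_success_rate(n_iterations, true_p=0.86):
    # Toy model: each path succeeds independently with probability
    # true_p. A real simulator would generate return sequences,
    # withdrawals, and balances instead.
    return (rng.random(n_iterations) < true_p).mean() * 100

def spread_over_runs(n_iterations, n_runs=5):
    """Max minus min success rate across repeated identical runs."""
    rates = [simulate_success_rate(n_iterations) for _ in range(n_runs)]
    return max(rates) - min(rates)

for n in (500, 10_000):
    print(f"{n:>6} iterations: spread across 5 runs = "
          f"{spread_over_runs(n):.1f} pp")
```

At 500 iterations the five runs typically disagree by a few percentage points; at 10,000 the spread is usually a fraction of a point.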
Most serious financial planning tools use at least 10,000 iterations. Academic research on Monte Carlo retirement planning typically uses 10,000-100,000. If your calculator uses fewer than 1,000, treat the results as directional at best.
What Matters More Than Iteration Count
Iteration count is necessary but not sufficient for a good simulation. A calculator running 50,000 iterations with bad assumptions will give you a very precise wrong answer.
The inputs matter more:
Return assumptions. Are the expected returns and volatilities reasonable? Are they based on historical data, forward-looking estimates, or arbitrary defaults? Garbage in, garbage out, no matter how many times you repeat it.
Distribution choice. Running 50,000 iterations with a normal distribution is better than 500, but you are still underestimating tail risk. The distribution shape matters as much as the sample size.
Spending strategy. A fixed withdrawal assumption ignores how real people behave. If your calculator does not let you model flexible spending, the iteration count is solving the wrong problem.
Correlation structure. If asset returns are modeled as independent, the simulation overstates diversification benefits. Cholesky-correlated returns give a more realistic picture, but only if the correlation matrix itself is reasonable.
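For reference, generating Cholesky-correlated returns takes only a few lines. The means, volatilities, and correlation below are illustrative assumptions, not calibrated estimates:

```python
import numpy as np

# Correlated returns for two assets via a Cholesky factor of the
# correlation matrix. All parameter values are illustrative.
rng = np.random.default_rng(4)

means = np.array([0.07, 0.03])   # e.g. stocks, bonds (annual return)
vols = np.array([0.16, 0.05])
corr = np.array([[1.0, 0.2],
                 [0.2, 1.0]])

L = np.linalg.cholesky(corr)             # lower-triangular factor
z = rng.standard_normal((50_000, 2))     # independent standard normals
returns = means + (z @ L.T) * vols       # correlated, scaled, shifted

est_corr = np.corrcoef(returns, rowvar=False)[0, 1]
print(f"target correlation 0.20, sample estimate {est_corr:.3f}")
```

For this normal special case, NumPy's Generator.multivariate_normal would do the same job directly; the explicit Cholesky factor is worth knowing because it can be reused across simulation steps or applied to non-normal draws.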
High iteration count makes a good model precise. It does not make a bad model good.