Doing Statistics Under Fat Tails

Research Project started in 2015 by Nassim Nicholas Taleb and colleagues

(so far Pasquale Cirillo, Raphael Douady and other members of the Real World Risk Institute)

Background: The technical papers below are part of a systematic effort to uncover the mismeasurement of statistical metrics under fat-tailedness and to propose corrections and alternative tools. Conventional statistics fails to cover fat tails; physicists who use power laws do not usually produce statistical estimators, leaving a large and consequential gap. It is not just changing the color of the dress (see discussion below).
The initial aim was to establish a network of Bourbaki-style collaborators working on the gap in a synchronized way and injecting rigor into policy-making and decision-making under fat tails.

Taleb, N.N., Introductory and Summary Paper: Extremes f. in Extremes: Darwin College Lecture Series, Cambridge University Press

Fontanari, Taleb, and Cirillo, "Gini estimation under infinite variance", f. Physica A

Taleb, N.N., "The law of large numbers under fat tails". This is the central idea; it shows where statistical inference is BS and explores more rigorous estimation of the mean of the sum of fat-tailed random variables. A presentation given at the MIT Big Data Luncheon is on YouTube.
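
To make the point concrete, here is a minimal simulation sketch, not taken from the paper. It assumes a plain Pareto law with minimum 1 and tail exponent 1.2 (our choice), and counts how often the sample mean of 1,000 observations falls below the true mean. The function names are hypothetical.

```python
import random

def pareto_sample(alpha, n, rng):
    """Draw n Pareto(alpha) variates with minimum 1 by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the power is always finite.
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def share_of_means_below_truth(alpha, n, trials, seed=0):
    """Fraction of sample means that fall below the true mean alpha / (alpha - 1)."""
    rng = random.Random(seed)
    true_mean = alpha / (alpha - 1.0)
    below = 0
    for _ in range(trials):
        xs = pareto_sample(alpha, n, rng)
        if sum(xs) / n < true_mean:
            below += 1
    return below / trials

if __name__ == "__main__":
    frac = share_of_means_below_truth(alpha=1.2, n=1000, trials=200)
    print(f"share of sample means below the true mean: {frac:.2f}")
```

Under a thin-tailed law this share would sit near one half; here it should come out well above that, because the rare large observations that carry the mean are usually missing from the sample.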

Taleb, N.N., "Stochastic Tail Exponent for Asymmetric Power Laws"

Taleb, N.N., "Election pricing as arbitrage: a martingale approach", f. Quantitative Finance

Taleb, N.N., "Preasymptotic behavior of subexponential and non-stable powerlaw sums"

Taleb, N.N., "The mathematical foundations of the precautionary principle" (in progress). Actually shows how the entire structure of probability in the social sciences is messed up.

The inequality papers (apply to all measures of concentration, not just inequality):

The next two papers apply the idea, showing the flaw in using "averages" and "sums" as estimators of inequality under fat tails instead of maximum likelihood methods applied to the tail exponent. Measurements of the changes in inequality we have recently witnessed (triggering active discussions) are based on unrigorous methods; seen from these metrics, changes in inequality can be either badly underestimated or severely exaggerated (particularly those concerning wealth, which has much fatter tails than income).

Taleb, N. N. and Douady, R., 2015, "On the super-additivity and estimation biases of quantile contributions", Physica A: Statistical Mechanics and its Applications. Proves that measures of concentration (e.g. "top 1% has 50% of wealth", "top 1% of wars killed 50% of people") are size dependent and aggregate poorly.
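
The size dependence can be seen in a small experiment. The sketch below is ours, not the paper's: it assumes a Pareto law with minimum 1 and tail exponent 1.1 (an illustrative choice), estimates the top 1% share from small and from large samples, and compares both with the exact share for that law, q^((alpha - 1)/alpha). The function names are hypothetical.

```python
import random

def pareto_sample(alpha, n, rng):
    """Draw n Pareto(alpha) variates with minimum 1 by inverse transform."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def top_share(xs, q):
    """Share of the total held by the top q fraction of observations."""
    k = max(1, int(round(q * len(xs))))
    ordered = sorted(xs, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

def mean_top_share(alpha, n, q, trials, rng):
    """Average estimated top-q share over repeated samples of size n."""
    total = 0.0
    for _ in range(trials):
        total += top_share(pareto_sample(alpha, n, rng), q)
    return total / trials

def true_top_share(alpha, q):
    """Exact top-q share for a Pareto law with tail exponent alpha > 1."""
    return q ** ((alpha - 1.0) / alpha)

if __name__ == "__main__":
    rng = random.Random(1)
    alpha, q = 1.1, 0.01
    print("n = 100  :", round(mean_top_share(alpha, 100, q, 300, rng), 3))
    print("n = 5000 :", round(mean_top_share(alpha, 5000, q, 40, rng), 3))
    print("exact    :", round(true_top_share(alpha, q), 3))
```

The estimated share should rise with the sample size while staying below the exact value, which is why concentration figures computed on units of different sizes (countries, periods) cannot be compared or averaged naively.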

Taleb, N.N., "How to (not) estimate Gini indices for fat tailed variables". The paper shows a severe but more tractable problem with the Gini and proposes efficient unbiased estimators, deriving their properties. Some have argued that "it is only a problem for fat tails", except that the Gini is a measure of fat-tailedness.
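
As an illustration of the maximum likelihood route, the sketch below compares the usual nonparametric Gini with a plug-in Gini obtained from the fitted tail exponent. It is a simplified stand-in, not the estimator derived in the paper: it assumes an exact Pareto law with known minimum 1, for which the Gini is 1/(2*alpha - 1), and the function names are hypothetical.

```python
import math
import random

def pareto_sample(alpha, n, rng):
    """Draw n Pareto(alpha) variates with minimum 1 by inverse transform."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def gini_nonparametric(xs):
    """Standard sample Gini computed from the ordered observations."""
    ordered = sorted(xs)
    n = len(ordered)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * sum(ordered))

def alpha_mle(xs, x_min=1.0):
    """Maximum likelihood tail exponent of a Pareto law with known minimum."""
    return len(xs) / sum(math.log(x / x_min) for x in xs)

def gini_from_alpha(alpha):
    """Gini index of a Pareto law with tail exponent alpha > 1."""
    return 1.0 / (2.0 * alpha - 1.0)

if __name__ == "__main__":
    rng = random.Random(7)
    xs = pareto_sample(1.5, 10000, rng)  # alpha = 1.5: finite mean, infinite variance
    print("true Gini          :", gini_from_alpha(1.5))
    print("nonparametric Gini :", round(gini_nonparametric(xs), 3))
    print("plug-in via MLE    :", round(gini_from_alpha(alpha_mle(xs)), 3))
```

The plug-in estimate inherits the good behavior of the tail exponent estimate; the nonparametric one depends on sums that converge slowly when the variance is infinite.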

Milanovic, B. and Taleb, N.N. Why the super-rich care more about inequality than growth. In Progress. Policymaking errors in not realizing that demand for assets arises from inequality, etc.

The dual distribution papers: techniques that help find the "true (or shadow) mean" as opposed to the sample mean.

Cirillo, P. and Taleb, N.N., 2016, "On the statistical properties and tail risk of violent conflicts" (Physica A). Yes, the thesis by the science writer S. Pinker on the "drop in violence" has no statistical basis. Under fat tails, sample means are unstable and underestimate true means. The paper proposes a method to use dual distributions, removing compact support to apply Extreme Value Theory, and transfer parameters to the primal. Also a novel robust approach to unreliable estimators.
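
The step of removing compact support can be sketched as follows. A variable bounded between a lower threshold L and an upper bound H (casualties cannot exceed the world population) is mapped smoothly onto [L, infinity), where tail methods apply, and results are mapped back afterwards. The log transformation below is one map with these properties; treat its exact form, the function names and the numerical values of L and H as assumptions of this sketch, and check the paper for the authors' own definition.

```python
import math

def to_dual(y, lower, upper):
    """Map y in [lower, upper) to the unbounded dual variable in [lower, inf)."""
    return lower - upper * math.log((upper - y) / (upper - lower))

def from_dual(z, lower, upper):
    """Inverse map: bring a dual value z back to the bounded primal scale."""
    return upper - (upper - lower) * math.exp((lower - z) / upper)

if __name__ == "__main__":
    L = 1.0e4   # hypothetical lower threshold
    H = 7.2e9   # hypothetical upper bound
    for y in (2.0e4, 1.0e6, 1.0e9, 5.0e9):
        z = to_dual(y, L, H)
        print(f"primal {y:.3e} -> dual {z:.3e} -> back {from_dual(z, L, H):.3e}")
```

For values far below the bound the dual is almost identical to the primal, so the bulk of the data is untouched; only the neighborhood of the bound is stretched out to infinity.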

Cirillo, P. and Taleb, N.N., 2016, "What are the odds of a third world war?", (Significance).

Cirillo, P. and Taleb, N.N., 2016, "Expected shortfall estimation for apparently infinite-mean models of operational risk", forthcoming, Quantitative Finance.

P-Value Problem

Taleb, N.N., 2016, The meta-distribution of p-values. P-values (although with compact support) are fat-tailed, with effects on p-hacking.
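
The dispersion of p-values across exact replications of the same experiment is easy to simulate. The setup below is our own illustration, not the paper's derivation: a one-sided z-test on 100 unit-variance Gaussian observations with a small true effect of 0.15, repeated 2,000 times. The function name is hypothetical.

```python
import random
from statistics import NormalDist

def replicate_p_values(effect, n, trials, seed=0):
    """Sorted one-sided z-test p-values from repeated runs of one experiment.

    Each run draws n unit-variance Gaussian observations with mean `effect`
    and tests the null hypothesis of zero mean.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(trials):
        mean = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        z = mean * n ** 0.5
        p_values.append(1.0 - std_normal.cdf(z))
    return sorted(p_values)

if __name__ == "__main__":
    ps = replicate_p_values(effect=0.15, n=100, trials=2000)
    print("5th percentile :", round(ps[100], 4))
    print("median         :", round(ps[1000], 4))
    print("95th percentile:", round(ps[1900], 4))
    print("share < 0.05   :", sum(p < 0.05 for p in ps) / len(ps))
```

The same underlying effect yields p-values spread over several orders of magnitude, so a researcher who can rerun or select experiments will find a "significant" one without much effort.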

Option Theory

Taleb, N.N., 2015, Unique Option Pricing Measure with neither Dynamic Hedging nor Complete Markets, European Financial Management. It proves, using measure theory, how a distribution with finite first moment can produce a risk-neutral option price, and why we can dispense with both the dynamic hedging and pricing kernel arguments, and hence price options under fat tails.

Formalization of the barbell strategy using information theory
We are clueless about downside probability, particularly under fat tails. We look at constructions with severe tail constraints and compatible with gambler's ruin (a generalization of Kelly's criterion).

Geman, D., Geman, H. and Taleb, N.N., 2015. "Tail risk constraints and maximum entropy". Entropy, 17, pp.1-14.

Dimensionality and Model Error

Taleb, N.N., "Model error and dimensionality". In progress

Undecidability: amply covered in Silent Risk (it is its theme), here is the formalization.

Douady, R. and Taleb, N.N., "Statistical Undecidability". Under what conditions on the metadistribution of the probability measure is a statistical formally decidable.

Power laws and stochastic tail exponents: mixtures of power laws.

TBA

Some BS detecting papers that precede the project

Taleb, N.N., 2014, (On The conflation of long volatility and fat tails), Quantitative Finance. A strange overactive smear-campaigner, Eric Falkenstein, extremely innocent of probability, kept spreading all manner of disinformation about my work. While it may have been ineffective in stopping the spread of my ideas, the strawmanship resulted in people mistaking the tails for the scale of the distribution. This paper is meant to correct that.

It is not changing the color of the dress

Many people know (well, sort of) what fat tails means, but only in a vague sense: they believe it is just another class of distributions besides the normal, to be treated as, simply, other distributions doing the same thing. Unfortunately things work differently:
The very definition of inference and confidence interval goes out of the window. More rigor is required. To work with fat tails one has to approach things differently, at a conceptual level. In fact one of us found contradictions in discussions: once it is stated that a distribution is fat-tailed, many statements taken for granted are no longer valid.

• The mean of the distribution will not correspond to the sample mean. In fact there is no fat-tailed distribution in which the mean can be properly estimated from the sample mean.
• Sharpe ratio, variance, beta and other common finance metrics are uninformative. Variance and standard deviations are not usable.
• Correlations (in the Pearson sense) usually do not exist, and when they do, provide little information (but there are other forms of dependence).
• Robust statistics is not robust at all.
• Maximum likelihood methods work for parameters (good news).
• The gap between disconfirmatory and confirmatory empiricism is wider than in common statistics.
• Principal components analysis is likely to produce false factors.
• The method of moments fails to work.
• There is no such thing as a "typical" large deviation: conditional on having a large move, such a move isn’t "typical".
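
Two of the points above, that moments are unusable while maximum likelihood works for parameters, can be checked side by side. The sketch below is an illustration under our own assumptions: a Pareto law with known minimum 1 and tail exponent 1.5, so the variance is infinite. It records, over repeated samples, the fitted tail exponent and the sample variance. The function names are hypothetical.

```python
import math
import random

def pareto_sample(alpha, n, rng):
    """Draw n Pareto(alpha) variates with minimum 1 by inverse transform."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def alpha_mle(xs, x_min=1.0):
    """Maximum likelihood tail exponent of a Pareto law with known minimum."""
    return len(xs) / sum(math.log(x / x_min) for x in xs)

def sample_variance(xs):
    """Plain second central moment of the sample."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def compare_estimators(alpha, n, trials, seed=0):
    """Fitted tail exponents and sample variances over repeated samples."""
    rng = random.Random(seed)
    alphas, variances = [], []
    for _ in range(trials):
        xs = pareto_sample(alpha, n, rng)
        alphas.append(alpha_mle(xs))
        variances.append(sample_variance(xs))
    return alphas, variances

if __name__ == "__main__":
    alphas, variances = compare_estimators(alpha=1.5, n=1000, trials=100)
    print("tail exponent range  :", round(min(alphas), 3), "to", round(max(alphas), 3))
    print("sample variance range:", round(min(variances), 1), "to", round(max(variances), 1))
```

The fitted exponent should stay within a narrow band around 1.5 from one sample to the next, while the sample variance jumps with whatever the largest observation happens to be.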

The Technical Incerto, Vol 1
The Statistical Consequences of Fat Tails [Full Text]

The Technical Incerto, Vol 2
Convexity, Risk, and Fragility [Full Text, in Progress]

The Fat Tails Project, 2015-2018, Collected Papers

N. N. Taleb's Home Page

Precautionary Principle Page

Real World Risk Institute

The fragility heuristic paper (with IMF, non technical)

A Mathematical Formulation of (Anti)Fragility (technical)

Skin In the Game Page