Extreme value distributions and Renormalization Group

In the classical theorems of extreme value theory the limits of suitably rescaled maxima of sequences of independent, identically distributed random variables are studied. So far, only affine rescalings have been considered. We show, however, that more general rescalings are natural and lead to new limit distributions, apart from the Gumbel, Weibull, and Fréchet families. The problem is approached using the language of Renormalization Group transformations in the space of probability densities. The limit distributions are fixed points of the transformation, and the study of the differential around them allows a local analysis of the domains of attraction and the computation of finite-size corrections.


Introduction
The basic problem of Extreme Value Theory (EVT) is the following (see Ref. [1] for a primer). Given a sequence of n independent, identically distributed (i.i.d.) random variables, we ask how the properly rescaled maxima of the sequence are distributed when n → ∞. Not surprisingly, EVT is of great importance for applications in the natural sciences [2,3,4,5], finance [6], and engineering [7], to name a few. In all these fields one often encounters problems possessing a threshold value for some quantity and wants to know the probability that it will be exceeded (catastrophic events are a good illustrative example). This question is similar in spirit to the one answered by the central limit theorem, which deals with the limits of rescaled sums of i.i.d. centered random variables. In both cases one tries to find out whether some kind of universality exists, so that the family of limit distributions is small and their domains of attraction are easy to describe.
The problems of EVT and the central limit theorem are naturally addressed in the framework of the Renormalization Group (RG), the deepest formalism used in modern Physics to understand how a system behaves under a change of the scale of observation. For a treatment of the central limit theorem and stable distributions in this setup see Refs. [8,9]. Only recently has EVT been tackled from the perspective of the Renormalization Group [10,11,12]. In the latter references, the main motivation was to advance the understanding of the convergence to the limit as the size of the sample increases. Herein, we employ the RG language to pose and solve a different fundamental problem concerning the admissible rescalings and limits of maxima of sequences of i.i.d. random variables. We describe it next.
Let ρ be a probability density on R and µ its distribution function. Then the distribution function for the maximum value of n i.i.d. random variables with probability density ρ is given by P_n(x) = µ(x)^n, and the corresponding probability density reads ρ_n(x) = n µ(x)^{n−1} ρ(x). In the limit of large n, P_n concentrates around the maximum of the support of ρ, so it is not surprising that in order to obtain a non-trivial limit we have to rescale the random variable. To the knowledge of the authors, the only rescaling thoroughly studied so far is the affine one: starting with Fréchet [13] and Fisher and Tippett [14], there has been an extensive literature on the possible limits of P_n(a_n x + b_n) and their domains of attraction. Actually, Fréchet only considered the case b_n = 0, while Fisher and Tippett gave the expression for the possible limit distributions in full generality. Finally, Gnedenko [15] completed the solution of the problem by rigorously describing the domains of attraction of the different limit distributions. But a natural question is: why admit only affine rescalings? At this point, nothing is better than quoting from [16]: "An interesting side issue is why this formulation was adopted at all with its affine normalization of M_n. Fisher and Tippett did not explain this, whereas Gnedenko offered only the analogy with stable distribution theory for sums, which seems to be begging the question. Perhaps the real explanation is that no one came up with an alternative formulation that lead to interesting results." The same explanation is still valid today.
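As a quick numerical illustration of the formula P_n(x) = µ(x)^n, the following sketch (with hypothetical helper names, not from the paper) compares the empirical distribution of the maximum of n uniform variables with µ(x)^n = x^n:

```python
import random

def empirical_max_cdf(n, x, trials=20000, seed=0):
    """Estimate P(max of n i.i.d. Uniform(0,1) samples <= x) by direct simulation."""
    rng = random.Random(seed)
    hits = sum(max(rng.random() for _ in range(n)) <= x for _ in range(trials))
    return hits / trials

# For Uniform(0,1) we have mu(x) = x, so P_n(x) = mu(x)^n = x^n.
n, x = 5, 0.9
print(empirical_max_cdf(n, x), x ** n)  # the two values should agree up to statistical error
```

The same check works for any density whose distribution function µ is known in closed form.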
In this work we try to fill this gap and explain that there are, indeed, interesting results beyond the affine rescaling.
The rest of the paper is organized as follows. In Section 2 we recast the problem of EVT into the RG formalism and show that when the restriction of affine rescalings is relaxed new interesting limit distributions (fixed points in the RG approach) appear, apart from the Gumbel, Weibull, and Fréchet families. In Section 3 the domains of attraction of the fixed points are studied. Section 4 is devoted to finite-size corrections, i.e. the modification of the limit distributions when the sample size, n, is large but finite. In Section 5 some illustrative numerical tests are given. Finally, the conclusions are presented in Section 6.

Renormalization Group transformation
Define the RG transformation (T_s µ)(x) = µ(g_s(x))^n, (2.1) where n = e^s and, for reasons that will become evident later, we use s = log n to parameterize it. Note that the definition of T_s requires the choice of a rescaling function g_s. To date, only the case in which g_s is an affine transformation has been studied. In terms of the probability density ρ, the transformation (2.1) reads ρ_s(x) = n µ(g_s(x))^{n−1} ρ(g_s(x)) g_s′(x). Considering that the support of P_n is equal to the support of ρ, it is natural to ask that g_s be a homeomorphism from the support of ρ onto itself; in this way T_s preserves the support of the probability density we start with. We remark that this condition is not imposed in the works focusing on the affine rescaling.
The transformation T_s can be extended to continuous s once the appropriate g_s is defined. A natural requirement for the transformation T_s is that it form a uniparametric group, i.e. T_{s_1} ∘ T_{s_2} = T_{s_1+s_2}.
Given the choice of parameterization, this holds provided that g_{s_1} ∘ g_{s_2} = g_{s_1+s_2}, (2.2) and from now on we assume that this is the case. If we take g_s differentiable with respect to s, condition (2.2) is equivalent to saying that g_s is a solution of the differential equation ∂g_s(x)/∂s = f(g_s(x)) with initial condition g_0(x) = x and generator f(x) = ∂g_s(x)/∂s |_{s=0}. We are actually interested in the possible extreme limiting distributions, i.e. in M = lim_{s→∞} T_s µ. This, together with the continuity of T_s, implies that M must be a fixed point of the renormalization group transformation, T_s M = M, or explicitly M(g_s(x))^{e^s} = M(x) for all s ≥ 0. Assume that this equation has a solution with probability density P(x) = dM(x)/dx, whose support is denoted by Σ. Several important consequences follow.
(i) For s > 0 and x in the interior of Σ, g_s(x) > x or, equivalently, f(x) > 0.
Proof: If x is in the interior of Σ, the distribution function satisfies M(x) ∈ (0, 1) and is monotonically increasing. We have n = e^s > 1 and therefore, from the fixed point equation, M(g_s(x)) = M(x)^{1/n} > M(x), which by monotonicity implies g_s(x) > x. Moreover, f(x) cannot vanish, for a zero of f is a stationary point of the flow, contradicting g_s(x) > x; hence f(x) > 0.
(ii) If x* is at the boundary of Σ then f(x*) = 0.
Proof: A simple consequence of the fact that g s is a differentiable, uniparametric group of homeomorphisms of Σ and therefore we must have g s (x * ) = x * .
(iii) f (x * ) = 0 if and only if x * is the maximum or the minimum of Σ.
(iv) f can have at most two zeros and the boundary of Σ at most two points. Therefore we have three possibilities: Σ is the real line (−∞, ∞) (Case 0), a semi-infinite line, which we can take to be (−∞, 0) or (0, ∞) (Cases 1− and 1+), or a compact interval, which we can take to be [0, 1] (Case 2). For each of the three cases mentioned in (iv) we shall take a group of maps that preserves Σ and study the associated RG flow.
Case 0: Σ = (−∞, ∞). In this case a natural and simple choice for the group of maps is the group of translations, i.e. g_s(x) = x + s/α, α > 0. The most general limiting distribution (or fixed point of the renormalization group) for this transformation is M(x) = exp(−λ e^{−αx}).
Case 1−: Σ = (−∞, 0). The simplest choice for g_s is, in this case, g_s(x) = e^{−s/α} x, α > 0, and the corresponding limiting distribution is M(x) = exp(−λ (−x)^α).
Case 1+: Σ = (0, ∞). The maps that preserve the semi-infinite line are g_s(x) = e^{s/α} x, α > 0, and the limiting distributions are M(x) = exp(−λ x^{−α}).
Case 2: Σ = [0, 1]. A simple choice for the uniparametric group of maps is g_s(x) = x^{e^{−s/α}}, α > 0, which leads to the following family of limiting distributions: M(x) = exp(−λ (−log x)^α).
Note that in all cases two free positive constants α and λ appear, whose role is easy to understand: α fixes the scale of the group parameter s, and λ can be changed into λe^s by the action of the group of maps, which transforms a fixed point of the RG into another one. The fixed points of Cases 0, 1− and 1+ are well known in the literature and comprise the so-called Gumbel, Weibull, and Fréchet distributions. While in Cases 0, 1+ and 1− the rescaling is affine, it is not so in Case 2. The limit distributions of Case 2 are new and are the subject of our analysis.
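As a consistency check of properties (i)–(iii), the generator f(x) = ∂_s g_s(x)|_{s=0} of the non-affine Case 2 maps g_s(x) = x^{e^{−s/α}} can be computed explicitly (a short verification, not part of the original derivation):

```latex
g_s(x) = x^{e^{-s/\alpha}}
\quad\Longrightarrow\quad
\frac{\partial g_s(x)}{\partial s}
  = -\frac{1}{\alpha}\, e^{-s/\alpha}\, (\log x)\, x^{e^{-s/\alpha}}
  = -\frac{1}{\alpha}\, g_s(x)\, \log g_s(x),
```

so f(x) = −(1/α) x log x, which is positive in the interior of [0, 1] and vanishes exactly at the two endpoints x = 0 and x = 1, as required.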

Limit distributions with compact support
Let us consider the RG transformation (2.1) for g_s(x) = x^{e^{−s/α}}, (3.1) where α is a positive real number. That is, we concentrate on Case 2 from Section 2. Note in passing that the case α = 1 contains some interesting distributions among the possible fixed points. When α = 1 the general fixed point of the transformation is M(x) = exp(−λ(−log x)) = x^λ. Therefore P(x) = M′(x) = λ x^{λ−1} and, if λ = 1, we get the uniform distribution. It is easy to determine the domain of attraction of a given fixed point when g_s is of the form (3.1). We have the following result:
Proposition. A given random variable supported in [0, 1] with cumulative probability distribution µ converges weakly (or in law), after successive applications of the transformation (2.1), to the fixed point M(x) = exp(−λ(−log x)^α) if and only if lim_{x→1−} (−log µ(x)) / (−log x)^α = λ. (3.2)
Proof: If (3.2) holds, then we can write T_s µ(x) = µ(x^{n^{−1/α}})^n = exp(n log µ(y)), with y = x^{n^{−1/α}}, and therefore n log µ(y) = (−log x)^α · log µ(y) / (−log y)^α, where n = e^s. Since y → 1− for every fixed x ∈ (0, 1), the large-s limit in the expression above yields lim_{s→∞} T_s µ(x) = exp(−λ(−log x)^α). To prove the converse, note that the convergence of T_s µ, taking logarithms and for x ≠ 0, 1, can be expressed as lim_{n→∞} n log µ(x^{n^{−1/α}}) = −λ(−log x)^α. But given that x ≠ 0 we have lim_{n→∞} x^{n^{−1/α}} = 1, and therefore (3.2) follows. We should insist that the novelty of these fixed points and attraction domains resides in the non-linear rescaling function g_s, which in turn is motivated by the (natural) requirement that the rescaling preserve the support of the initial random variable. If we had considered the standard affine rescaling, the fixed points and their domains of attraction would have corresponded to the Weibull distributions with exponent α.
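The criterion (3.2) is easy to test numerically. The sketch below (the sine-based example distribution and the helper names are ours, not from the paper) estimates λ for the hypothetical CDF µ(x) = sin(πx/2) with α = 2 and checks that T_s µ approaches the corresponding fixed point:

```python
import math

def mu(x):
    """Hypothetical example CDF on [0, 1] (ours, not from the paper): mu(x) = sin(pi x / 2)."""
    return math.sin(math.pi * x / 2)

def Ts(cdf, s, x, alpha):
    """RG transformation T_s mu(x) = mu(x^(n^(-1/alpha)))^n with n = e^s."""
    n = math.exp(s)
    return cdf(x ** (n ** (-1.0 / alpha))) ** n

alpha = 2.0
# Criterion (3.2): lambda = lim_{x -> 1^-} (-log mu(x)) / (-log x)^alpha.
eps = 1e-6
lam = -math.log(mu(1 - eps)) / (-math.log(1 - eps)) ** alpha
print(lam)  # close to pi^2 / 8 for this choice of mu

# Convergence to the fixed point M(x) = exp(-lam * (-log x)^alpha) as s grows.
x, s = 0.5, 12.0
M = math.exp(-lam * (-math.log(x)) ** alpha)
print(Ts(mu, s, x, alpha), M)
```

The difference between the two printed values shrinks like n^{−1/2}, in agreement with the eigenvalue analysis of the next section.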
In the next section we continue the study of the new fixed points with the analysis of the finite-size corrections.

Finite size corrections
To discuss the amplitude and shape of the finite-size corrections, i.e. the behavior of the extremal distributions when the number n of i.i.d. random variables is large but finite, we must study the neighborhood of the fixed points and the linear approximation of the RG transformation (2.1). For this we compute its differential at a probability distribution µ acting on a perturbation η: (DT_s)_µ η (x) = n µ(g_s(x))^{n−1} η(g_s(x)). (4.1) The stable and unstable directions of a fixed point µ are given by the eigenvalues and eigenvectors of (4.1) at µ; they determine the amplitude of the finite-size corrections and their shape. We focus on the case g_s(x) = x^{n^{−1/α}}, with n = e^s, and fixed point M(x) = e^{−λ(−log x)^α}. In order to solve the eigenvalue equation for (4.1) it is very useful to consider the ansatz η(x) = M(x) φ(x). In terms of it the eigenvalue equation reads n M(g_s(x))^n φ(g_s(x)) = ν M(x) φ(x), which, due to the fixed-point property M(g_s(x))^n = M(x), reduces to n φ(g_s(x)) = ν φ(x).

This is solved by φ_β(x) = (−log x)^β, with eigenvalue ν_β = n^{1−β/α}. A perturbation of the fixed point is unstable (or relevant, in the RG terminology) if the corresponding eigenvalue is greater than one, i.e. β < α, and it is stable (irrelevant) if β > α. The case β = α corresponds to a perturbation tangent to the line of fixed points and, therefore, to a purely marginal direction. Note that the above analysis is consistent with the domains of attraction determined in Section 3. The stable directions are precisely those that do not alter the limit in (3.2); the marginal ones induce an infinitesimal change in the limit, and therefore also in the fixed point to which the perturbed distribution tends; and, finally, an unstable perturbation makes the limit diverge, implying that the perturbed distribution does not converge under successive applications of the RG transformation.
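The eigenvalue relation n φ_β(g_s(x)) = ν_β φ_β(x), with φ_β(x) = (−log x)^β, can be verified directly (helper names are ours):

```python
import math

def phi(x, beta):
    """Candidate eigenfunction phi_beta(x) = (-log x)^beta of the linearized RG."""
    return (-math.log(x)) ** beta

alpha, beta, s, x = 2.0, 3.0, 1.3, 0.4
n = math.exp(s)
g = x ** (n ** (-1.0 / alpha))   # g_s(x) = x^(n^(-1/alpha))
nu = n ** (1 - beta / alpha)     # predicted eigenvalue nu_beta = n^(1 - beta/alpha)
print(n * phi(g, beta), nu * phi(x, beta))  # the two sides coincide
```

The identity holds exactly because −log g_s(x) = n^{−1/α} (−log x), so the eigenfunctions simply pick up the factor n^{−β/α} under the flow.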
To understand how the linear analysis above determines the finite-size corrections, consider the following situation. We start with a random variable whose cumulative distribution µ(x) is expanded in the eigenvectors obtained above, µ(x) = M(x) (1 + Σ_i c_i (−log x)^{β_i}), (4.2) where the terms in the sum are ordered according to their eigenvalues, so that ν_{β_i} > ν_{β_j} for i < j.
Assuming that all eigenvalues are smaller than one (β_i > α) or, in other words, that µ(x) belongs to the domain of attraction of M(x), one can show that T_s µ(x) ≈ M(x) (1 + Σ_i c_i n^{1−β_i/α} (−log x)^{β_i}). Hence, the largest eigenvalue determines the behavior with n (the size of the system) of the amplitude of the dominant correction, while its eigenfunction determines the shape of the correction. One can also study corrections of higher order and go beyond the linear approximation. In the next section we show how to accomplish this and compare our approximations with numerical simulations of the statistical models to test their reliability.

Numerical tests
In order to test our theoretical predictions we perform two numerical experiments in which we study the distribution of the maxima of n independent random variables with probability density ρ. The actual size of the systems is chosen so that the perturbative approach discussed above applies, and the experiment is repeated a number of times large enough to make the statistical error much smaller than the finite-size corrections. The numerical simulations are performed by generating n independent random variables and selecting their maximum. We divide the interval into 50 bins and, after repeating the experiment N times, we obtain the frequency with which the maximum belongs to a given bin. The frequency, properly normalized, is our numerical approximation to ρ_n = T_s ρ, with n = e^s.
Our first example for ρ is the tent distribution, whose probability density is given by ρ(x) = 4x for 0 ≤ x ≤ 1/2 and ρ(x) = 4(1 − x) for 1/2 ≤ x ≤ 1. Observe that the support of ρ is the interval [0, 1]. It is plotted, together with the density of its limiting distribution, in Fig. 1. The cumulative distribution function determined by ρ is µ(x) = 2x² for 0 ≤ x ≤ 1/2 and µ(x) = 1 − 2(1 − x)² for 1/2 ≤ x ≤ 1, which belongs to the domain of attraction of the fixed point with α = 2 and λ = 2. If we perform the expansion in (4.2), the most relevant (or rather the least irrelevant) eigenvalue in the expansion is ν_3 = n^{−1/2}, and it determines the behaviour with n of the amplitude of the finite-size corrections.
In order to quantify the corrections when the number of random variables is n, we use the L¹ norm of the difference of the probability densities. This norm is also called the total variation metric in the context of probability theory (see [17] and references therein). We expand ∆_n = ‖ρ_n − M′‖₁ in powers of n^{−1/2}, where ρ_n = T_s ρ with n = e^s; the first coefficient can be computed explicitly and, similarly, one can compute the others. In Fig. 2 we have plotted the finite-size corrections to the distribution of the maxima obtained numerically, scaled with √n, for different values of n. The second prediction that we test numerically is the shape of the corrections.
In this case we take a fixed (and large) value of n and plot the rescaled difference between the limiting distribution and the one obtained numerically for the maxima of n random variables distributed according to ρ. The finite-size corrections δ(x) := ρ_n(x) − M′(x) can be expanded as δ(x) = δ_{1/2}(x) n^{−1/2} + δ_1(x) n^{−1} + …, with coefficients that can be computed in closed form. In Fig. 3 the dots represent the points obtained in the numerical experiment for √n δ(x) corresponding to n = 3000. The error bars, slightly larger than the size of the dots, represent the statistical uncertainty due to the limited size of the sample. The dashed line is δ_{1/2}(x) as defined in (5.3), while the solid line includes the next correction, δ_{1/2}(x) + δ_1(x) n^{−1/2}. We see excellent agreement between the theoretical prediction and the numerical experiment, especially when the subleading correction is included.
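The sampling step of the tent experiment can be sketched as follows (a minimal version of the procedure described above; the sampler assumes the symmetric tent density ρ(x) = 4 min(x, 1 − x) on [0, 1], and the helper names are ours):

```python
import math
import random

def sample_tent(rng):
    """Inverse-CDF sampling of the tent density rho(x) = 4x (x < 1/2), 4(1-x) (x >= 1/2)."""
    u = rng.random()
    return math.sqrt(u / 2) if u < 0.5 else 1 - math.sqrt((1 - u) / 2)

def empirical_max_cdf(n, x, trials, rng):
    """Fraction of experiments in which the maximum of n tent samples is <= x."""
    hits = sum(max(sample_tent(rng) for _ in range(n)) <= x for _ in range(trials))
    return hits / trials

rng = random.Random(1)
n, x = 200, 0.95
mu = 1 - 2 * (1 - x) ** 2        # tent CDF near the upper endpoint
print(empirical_max_cdf(n, x, 20000, rng), mu ** n)  # agree up to statistical error
```

Binning the sampled maxima then gives the numerical approximation to ρ_n = T_s ρ used in the figures.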
The second example is a random variable supported in [0, 1] with a different probability density ρ and cumulative distribution function µ. It converges under the action of the RG with α = 1 to M(x) = x²; the limiting probability density is M′(x) = 2x. We can expand again as in (4.2), and we find that the most relevant perturbation has eigenvalue ν_2 = n^{−1}, which determines the leading behavior with n of the amplitude of the finite-size corrections; keeping also the first subleading terms gives the expansion of ∆_n whose validity we check in Fig. 4. We see that, within the statistical errors due to the limited size of the sample, the finite-size corrections agree with the theoretical predictions. The shape of the corrections in this case is δ(x) = δ_1(x) n^{−1} + …, with the different contributions computable as before. In Fig. 5 we show the numerical values of the shape correction and compare them with the analytical prediction in (5.5). We again see a remarkable agreement between the numerical experiment and the theoretical prediction.
In the previous examples we have tested the accuracy of the finite-size analysis carried out in Section 4. The size of the system and of the sample have been chosen so that the computational time is reasonable and the errors are small enough not to spoil the predictive power. Within this range we have found complete agreement between the theoretical predictions and the numerical results.

Conclusions
By employing Renormalization Group techniques we have studied the limit distribution of the appropriately rescaled maximum of a sequence of n independent, identically distributed random variables when n → ∞. Obviously, the rescaling is needed to obtain non-trivial limits. Until now, only affine rescalings had been considered (perhaps in analogy with the treatment of stable distributions), without further justification. However, when studying limits of sequences of maxima of independent, identically distributed random variables, it seems natural to impose that the rescaling preserve the support of the original random variable. We have recast the problem of finding such limit distributions in the language of the Renormalization Group, and explained how the condition of support preservation naturally arises and what its implications are. The main result of this paper is to show that when non-affine rescalings are allowed, new interesting limit distributions are obtained that do not belong to the well-known Gumbel, Weibull, or Fréchet families. In this formulation the limit distributions are fixed points of the Renormalization Group transformation. After the identification and discussion of the fixed points, we have studied the differential of the transformation around them (with emphasis on the new ones) in order to understand the domains of attraction and the corrections due to large but finite n, the so-called finite-size corrections.
An interesting technical aspect of the approach adopted herein is the concrete form of the definition of the Renormalization Group transformation. We define it as a uniparametric group of transformations that is fixed once and for all, differing from other works in this line, where the transformation can be adapted at every step. This fact has some consequences, especially concerning the domains of attraction of the fixed points. Indeed, within our approach the determination of the domain of attraction is simpler.
Finally, one may wonder how the classical results on the domains of attraction are modified when one restricts to transformations that preserve the support of the original random variable.