Mathematical Foundations of Randomness

Abhijit Dasgupta , in Philosophy of Statistics, 2011

2.2 Lebesgue measure over the unit interval

The problem of Lebesgue measure over the unit interval [0, 1] = {x ∈ ℝ: 0 ≤ x ≤ 1} is essentially a geometric one: Given a subset E ⊆ [0, 1], we want to assign it a number μ(E) which represents its "size" or "length". For a very simple subset such as an interval, the measure is simply its length: If J ⊆ [0, 1] is an interval with endpoints a ≤ b (i.e., J is one of the intervals (a, b), [a, b), (a, b], or [a, b]), the measure of J, denoted by μ(J), is defined as the length of the interval, μ(J) = b − a.

The next step is to define the length of any open subset of [0, 1]. A subset G of [0, 1] is said to be open if it is a union of open intervals. A standard fact about this linear continuum is that any open set can be expressed uniquely as a disjoint union of (possibly countably infinitely many) intervals. This allows us to naturally and uniquely define the measure μ(G) of an open set G ⊆ [0, 1] to be the sum (possibly as an infinite series) of the lengths of all the constituent disjoint intervals.
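For the finite case, this definition can be sketched in a few lines of Python: a finite family of possibly overlapping open intervals is merged into its disjoint components, and the lengths are summed. The function names `disjoint_components` and `measure_of_union` are illustrative and do not come from the source; an open set with infinitely many components would require summing a series instead.

```python
from fractions import Fraction

def disjoint_components(intervals):
    """Merge open intervals (a, b) into their disjoint connected components."""
    components = []
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        # Open intervals that merely touch, such as (0, 1/2) and (1/2, 1),
        # stay separate: the shared endpoint does not belong to the union.
        if components and a < components[-1][1]:
            last_a, last_b = components[-1]
            components[-1] = (last_a, max(last_b, b))
        else:
            components.append((a, b))
    return components

def measure_of_union(intervals):
    """Sum of the lengths of the disjoint components."""
    return sum((b - a for a, b in disjoint_components(intervals)), Fraction(0))

G = [(Fraction(0), Fraction(1, 3)),
     (Fraction(1, 4), Fraction(1, 2)),
     (Fraction(3, 4), Fraction(1))]
print(disjoint_components(G))
print(measure_of_union(G))  # 3/4
```

Exact rational arithmetic is used so that the sum of lengths is not blurred by rounding.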

A key idea here is that a set of "small measure" can be covered by an open set of "small measure": A set E is said to be measure-zero if E can be covered by open sets of arbitrarily small measure, i.e., for any ε > 0 there is an open set G containing E with μ(G) < ε. Slightly more constructively, E has measure zero if there is an infinite sequence of open sets G_1, G_2, G_3, … with each G_n covering E and with μ(G_n) < 1/n.
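A standard illustration, not taken from the source, is the set of rationals in [0, 1]: covering the n-th rational of an enumeration by an open interval of length ε/2^(n+2) gives a cover of total length below ε. The sketch below does this for the first 50 rationals only; the helper names and the particular enumeration are assumptions made for the example.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], ordered by denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def open_cover(points, eps):
    """Cover the n-th point by an open interval of length eps / 2**(n + 2)."""
    cover = []
    for n, x in enumerate(points):
        half = eps / 2 ** (n + 3)
        cover.append((x - half, x + half))
    return cover

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
cover = open_cover(points, eps)
total = sum(b - a for a, b in cover)
print(total < eps)  # True
```

Because the lengths form a geometric series with sum ε/2, the bound holds no matter how many points are covered, which is why the same construction works for the full countable set.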

A subset E ⊆ [0, 1] is defined to be (Lebesgue) measurable if for any ε > 0 there is an open set G containing E and an open set H containing the difference G ∖ E with μ(H) < ε. Thus a measurable set is one which can be approximated from outside by open sets arbitrarily closely. If E is measurable, it can be shown that the measure of the open set G above approaches a unique limit as ε → 0, and we denote this limit by μ(E). This defines the Lebesgue measure for every measurable subset of [0, 1]. If E ⊆ F ⊆ [0, 1] are measurable sets, then we have 0 ≤ μ(E) ≤ μ(F) ≤ 1.

The class of measurable sets forms a vast collection. If E ⊆ [0, 1] is measurable, so is its complement [0, 1] ∖ E, with μ([0, 1] ∖ E) = 1 − μ(E). If 〈E_n〉 is a sequence of measurable sets, then their union ∪_n E_n and intersection ∩_n E_n are also measurable. If the sequence 〈E_n〉 consists of pairwise disjoint measurable sets, then the measure of their union is the sum of the measures of the individual sets: μ(∪_n E_n) = Σ_n μ(E_n).

URL:

https://www.sciencedirect.com/science/article/pii/B9780444518620500216

Parametric Analysis of Time Signals and Spectra from Perspectives of Quantum Physics and Chemistry

Dževad Belkić , in Advances in Quantum Chemistry, 2011

11.2 The Passage from the Stieltjes to the Cauchy integral via the Dirac-Lebesgue Measure for Discretization of Inner Products

Hereafter, the generalized Lebesgue measure σ_0(z) from Eq. (137) is specified as:

$$\sigma_0(z) = \sum_{k=1}^{K} d_k\,\vartheta(z-u_k), \qquad \vartheta(z-u_k) = \begin{cases} 1, & z \in K \\ 0, & z \notin K \end{cases} \tag{138}$$

where d_k is the residue (15) and ϑ(z − u_k) is the generalized real-valued Heaviside function of the complex argument z − u_k, which is a generalization of the real case (138). Using the formula for the derivative dϑ(z)/dz = δ(z), where δ(z) is the complex-valued Dirac function, we have the following expression for the measure in ℒ_M:

$$d\sigma_0(z) = \rho_0(z)\,dz, \qquad \rho_0(z) = \sum_{k=1}^{K} d_k\,\delta(z-u_k). \tag{139}$$

The generalized Dirac function of a complex variable from Eq. (139) belongs to the class of the so-called ultradistributions [2]. In the present context, δ(z − u_k) has the same operational property as the usual Dirac function with a real argument, except that contour integrals are involved, viz:

$$\oint_C dz\, f(z)\,\delta(z-u_k) = f(u_k), \tag{140}$$

where the function f(z) of complex variable z is analytic throughout the same contour C as in Eq. (137). More generally, if f(z) is regular within and on the contour C and g(z) has K simple zeros {v_k}_{k=1}^{K} within C, we can apply the Cauchy residue theorem to write:

$$\oint_C dz\, f(z)\,\delta(g(z)) = \sum_{k=1}^{K} \frac{f(v_k)}{g'(v_k)}, \tag{141}$$

where g′(z) = (d/dz)g(z) and g(v_k) = 0 (1 ≤ k ≤ K). The weight function ρ_0(z) from Eq. (139) is reminiscent of the so-called "complex impulse train function" in signal processing [2]. If one formally sets the eigenvalues {ω_k} of the operator Ω̂ to be equal to the Fourier grid points {ω̃_k}, one would equate the residues {d_k} with the complex Fourier amplitudes {F_k}. Both {ω̃_k} and {F_k} are given in Eq. (33). In such cases, the special impulse train function ∑_k F_k δ(ω − ω̃_k) would represent the Fourier stick spectrum with jumps or heights |F_k| at the grid points ω = ω̃_k and would be zero elsewhere. In Eq. (139), the quantities {|d_k|} also have the meaning of heights or jumps in the spectrum (51) constructed from the peak parameters {u_k, d_k}. The discrete counterpart of the symmetric inner product (137) for the pair {f(u), g(u)} ∈ ℒ_M is given by:

$$(f(u)\,|\,g(u)) = (g(u)\,|\,f(u)) = \sum_{k=1}^{K} d_k\, f(u_k)\, g(u_k), \tag{142}$$

where the residue dk = (Φ0 k )2 from the definition (15) is a complex-valued weight function.
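The discrete inner product (142) is a weighted sum without complex conjugation, which is why it is symmetric in f and g. A minimal Python sketch follows; the nodes u_k and residues d_k are hypothetical values chosen only for illustration, and the function name `discrete_inner_product` is not from the source.

```python
import cmath

def discrete_inner_product(f, g, nodes, residues):
    """Symmetric (non-conjugated) inner product: sum_k d_k * f(u_k) * g(u_k)."""
    return sum(d * f(u) * g(u) for u, d in zip(nodes, residues))

# Hypothetical peak parameters {u_k, d_k}, for illustration only.
nodes = [0.9 + 0.1j, 0.5 - 0.3j, -0.2 + 0.7j]
residues = [1.0 + 2.0j, 0.5 - 1.0j, -0.3 + 0.1j]

f = lambda u: u
g = lambda u: u * u + 1.0

fg = discrete_inner_product(f, g, nodes, residues)
gf = discrete_inner_product(g, f, nodes, residues)
print(fg, cmath.isclose(fg, gf))
```

With f = g = 1 the sum reduces to the total of the residues, which gives a quick sanity check on any set of peak parameters.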

URL:

https://www.sciencedirect.com/science/article/pii/B9780123860132000048

Set Theory

Marion Scheepers , in Encyclopedia of Physical Science and Technology (Third Edition), 2003

III.B Lebesgue Measurability

A set X of real numbers is said to have (Lebesgue) measure zero if for each positive real ϵ there is a sequence (I_n : n < ∞) of intervals such that X is covered by their union and the sum of their lengths is less than ϵ. Measure zero is a notion of smallness but does not mean small cardinality. One has the following Galilean paradox: The Cantor middle-thirds set of real numbers has cardinality equal to that of the real line, but has Lebesgue measure zero.
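The Cantor example can be checked directly: after n middle-third removals the set is covered by 2^n intervals of total length (2/3)^n, which falls below any given ϵ once n is large enough. The following Python sketch, with an illustrative function name `cantor_stage`, builds these covers with exact rational arithmetic.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals remaining after n middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))   # keep the left third
            next_intervals.append((b - third, b))   # keep the right third
        intervals = next_intervals
    return intervals

for n in (1, 5, 10):
    stage = cantor_stage(n)
    total = sum(b - a for a, b in stage)
    print(n, len(stage), total, float(total))
```

Each stage covers the Cantor set, so the shrinking totals witness measure zero even though the set itself is uncountable.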

A set S of real numbers is Lebesgue measurable if there is a Borel set B and a measure zero set N such that S = (B ∖ N) ∪ (N ∖ B). Thus, a set is Lebesgue measurable if it is only "slightly" different from some Borel set: The set of points where it is different is of Lebesgue measure zero.

There is a function, μ, the Lebesgue measure, defined on the set of all Lebesgue-measurable sets such that:

1.

μ(X)   =   0 if, and only if, X has Lebesgue measure zero.

2.

If (X_n : n < ∞) is a sequence of pairwise disjoint Lebesgue measurable sets, then μ(∪_{n<∞} X_n) = ∑_{n<∞} μ(X_n).

A set is Lebesgue measurable if, and only if, it is in the domain of μ. The second property in the list is called "countable additivity."

In each model of (ZFC) set theory there are Lebesgue nonmeasurable sets. The earliest examples were considered pathological. Thus: Can the Lebesgue measure be extended to a countably additive measure ν such that each set of real numbers is measurable with respect to ν? This is the measure problem. Are projective sets of real numbers Lebesgue measurable?

III.B.1 The Measure Problem

Banach and Kuratowski proved: In any model of set theory in which CH holds there are countably many sets of real numbers such that no countably additive extension of the Lebesgue measure measures all these sets. Thus, ZFC  +   CH implies a negative answer to the measure problem.

What are the properties of a model of set theory in which the Lebesgue measure can be extended to a countably additive measure ν which measures all sets of reals? Restricting such a ν to subsets of [0,1] gives a real-valued measure on [0,1]. Using a one-to-one function this measure can be transferred to any set of cardinality 2^{ℵ_0}. Thus, the measure problem may be considered for cardinals. For the least cardinal κ which permits a real-valued measure we have κ ≤ 2^{ℵ_0} and (by the Banach-Kuratowski result) κ > ℵ_1.

Ulam proved that if there is any set at all carrying a real-valued measure then the least κ such that there is a real-valued measure on κ is either no larger than 2^{ℵ_0}, or else carries a two-valued measure. Moreover, if κ is the least cardinal carrying a real-valued measure, then for any family of fewer than κ subsets of κ, each of measure zero, the union of the family has measure zero: This property of the measure is called κ-completeness. Ulam showed that no successor cardinal could be real-valued measurable and that a real-valued measurable cardinal could not be singular. Thus in any model of set theory where the measure problem has a positive solution there would be an uncountable cardinal number κ ≤ 2^{ℵ_0} which is not a successor cardinal, and not a singular cardinal. Such a cardinal is said to be weakly inaccessible.

An uncountable cardinal κ is said to be measurable if it carries a two-valued κ-complete measure. Ulam showed that if κ is a measurable cardinal then for each cardinal λ < κ one has 2^λ < κ. Cardinal numbers having the latter property are called strongly inaccessible. The concepts of weakly and strongly inaccessible cardinals made their debut in Hausdorff's works, predating Ulam's paper by more than 20 years.

a. Inaccessibility properties of cardinals, and ZFC + I

If a model of set theory contains a weakly inaccessible cardinal κ, then in L of that model, κ is strongly inaccessible. Moreover, if some model of set theory contains a strongly inaccessible cardinal κ then the set V_κ in that model is itself a model of set theory. Thus, if ZFC proved that there is a strongly inaccessible cardinal, then it would prove that there is a set which is a model of set theory. But then ZFC would prove its own consistency, violating Gödel's Second Incompleteness Theorem.

Let I denote the statement "there is a strongly inaccessible cardinal" and let ZFC + I be the extension of ZFC which includes I as an axiom. Since ZFC + I proves the consistency of ZFC it transcends ZFC in axiomatic strength. Ironically, ZFC + I also proves the consistency of ZFC + ¬I: If κ is the least strongly inaccessible cardinal in a model of set theory, then V_κ is a model of set theory which contains no strongly inaccessible cardinals.

b. Measurability properties of cardinals, and ZFC + M

Let M denote the statement "there is a measurable cardinal," and let ZFC + M be the extension of ZFC which includes M as an axiom. By Ulam's results, ZFC + M proves the consistency of ZFC + I. But ZFC + M is much stronger than ZFC + I. In fact, if in a model of set theory κ is a measurable cardinal, then in that model there are κ many strongly inaccessible cardinals below κ. Also, ZFC + M proves the consistency of ZFC + ¬M: If κ is the least measurable cardinal in a model of set theory, then V_κ is a model of set theory which contains no measurable cardinals.

Solovay showed: If there is a model of set theory in which ZFC   +   M holds, then there is a model of set theory in which the measure problem has a positive solution. The latter model is obtained by extending the former by means of forcing. He also showed: If there is a model of set theory in which the measure problem has a positive solution, then there is a model of set theory in which ZFC   +   M holds. The latter is obtained from the former by considering an appropriate inner model. In technical language, "a positive solution of the measure problem is equiconsistent with the existence of measurable cardinals."

Let V be a model of set theory and let L be the constructible universe of V. Let κ be an uncountable measurable cardinal in V, and let ν in V be a two-valued κ-complete measure. Scott showed by 1961 that ν cannot be in L. In 1964 Rowbottom gave another dramatic example of how different V and L are: the real numbers which are in L form a countable set in V (most real numbers of V are not in L). Kunen showed that V contains a canonical inner model which satisfies ZFC, contains ν, and in which ν witnesses that κ is a measurable cardinal. It is customary that L ν denote this canonical inner model. (This is analogous to the situation where for a polynomial with rational coefficients the field of complex numbers contains a canonical subfield which contains roots of that polynomial: the corresponding Galois extension of the rationals.)

It is surprising that properties of cardinal numbers much larger than 2^{ℵ_0} should have any influence on the real numbers. Which properties of the real line are affected by "large" cardinals? Could large cardinals in conjunction with ZFC, for example, decide CH?

III.B.2 Lebesgue Measurability of Projective Sets of Reals

Lusin showed that Σ^1_1 sets are Lebesgue measurable, and thus so are Π^1_1 sets. But the Σ^1_2 and Π^1_2 sets posed problems. Gödel showed that in L there is a set of real numbers which is both Σ^1_2 and Π^1_2, and yet not Lebesgue measurable. Is there in every model of set theory a set of real numbers which is both Σ^1_2 and Π^1_2, and yet not Lebesgue measurable?

URL:

https://www.sciencedirect.com/science/article/pii/B0122274105006840

Mathematical preliminaries

Liansheng Tan , in A Generalized Framework of Linear Multivariable Control, 2017

2.7.6 Inverse Laplace transform

Two integrable functions have the same Laplace transform only if they differ on a set of Lebesgue measure zero. This means that, on the range of the transform, there is an inverse transform. In fact, the Laplace transform is a one-to-one mapping not only on integrable functions but on many other function spaces as well, although there is usually no easy characterization of the range. Typical function spaces in which this is true include the spaces of bounded continuous functions, the space L^∞(0, ∞), or more generally tempered functions (i.e., functions of at worst polynomial growth) on (0, ∞). The Laplace transform is also defined and injective for suitable spaces of tempered distributions.

In these cases, the image of the Laplace transform lives in a space of analytic functions in the region of convergence. The inverse Laplace transform is given by the following complex integral, which is known by various names (the Bromwich integral, the Fourier-Mellin integral, and Mellin's inverse formula):

$$f(t) = \mathcal{L}^{-1}\{F\}(t) = \frac{1}{2\pi i}\lim_{T\to\infty}\int_{\gamma-iT}^{\gamma+iT} e^{st}F(s)\,ds,$$

where γ is a real number so that the contour path of integration is in the region of convergence of F(s). An alternative formula for the inverse Laplace transform is given by Post's inversion formula. In practice, it is typically more convenient to decompose a Laplace transform into the known transforms of functions obtained from a table, and construct the inverse by inspection.
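As a rough numerical sketch of the Bromwich integral, one can parametrize the contour as s = γ + iω, truncate to |ω| ≤ T, and apply the trapezoidal rule. The function name `bromwich_inverse`, the defaults for γ, T and the number of nodes, and the test transform F(s) = 1/(s + 1)², whose inverse is t·e^(−t), are all choices made for this example rather than anything prescribed by the source. Plain truncation is only reasonable when F decays fast enough along the contour.

```python
import cmath
import math

def bromwich_inverse(F, t, gamma=1.0, T=500.0, n=50_000):
    """Trapezoidal approximation of (1 / 2*pi*i) * integral of e^{st} F(s) ds
    along s = gamma + i*omega, truncated to -T <= omega <= T."""
    h = 2.0 * T / n
    total = 0j
    for j in range(n + 1):
        s = complex(gamma, -T + j * h)
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(s * t) * F(s)
    # ds = i d(omega), so the factor 1/(2*pi*i) becomes 1/(2*pi).
    return (total * h / (2.0 * math.pi)).real

F = lambda s: 1.0 / (s + 1.0) ** 2
approx = bromwich_inverse(F, 1.0)
print(approx, math.exp(-1.0))  # both close to 0.3679
```

For slowly decaying transforms such as 1/s the truncation error is much larger, which is one reason table lookup is usually preferred in practice.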

URL:

https://www.sciencedirect.com/science/article/pii/B9780081019467000020

Convex Functions, Partial Orderings, and Statistical Applications

In Mathematics in Science and Engineering, 1992

12.32 Theorem

Suppose that either (i) X = ℝ, Θ ⊂ ℝ is an interval and μ is Lebesgue measure or (ii) X is the set of all integers, Θ is an interval or an interval of integers, and μ is the counting measure. If g: Θ × X → [0, ∞) is totally positive of order two (TP_2) and satisfies the semigroup property

$$g(\theta_1 + \theta_2, x) = \int_X g(\theta_1, t)\, g(\theta_2, x - t)\, d\nu(t)$$

for some measure ν on X, and ϕ(x) is Schur-convex for x ∈ X^n, then the function ψ defined for θ = (θ_1, …, θ_n) ∈ Θ × ··· × Θ by

$$\psi(\theta) = \int \prod_{i=1}^{n} g(\theta_i, x_i)\, \phi(x) \prod_{i=1}^{n} d\mu(x_i)$$

is a Schur-convex function of θ.

URL:

https://www.sciencedirect.com/science/article/pii/S0076539208628246

Two-Dimensional Euler System and the Vortex Patches Problem

Jean-Yves Chemin , in Handbook of Mathematical Fluid Dynamics, 2005

8.1 The case when the vorticity is bounded

Let us first recall the following classical theorem of Leray on the two-dimensional incompressible Navier–Stokes system.

Theorem 8.1. Let υ_0 be in the space E_m for some real m. A unique solution υ_ν of (NS_ν) exists in the space

$$C(\mathbb{R}^+; E_m) \cap L^2_{\mathrm{loc}}(\mathbb{R}^+; H^1).$$

There is a fundamental theorem, proved by Delort in [45] about weak convergence for such families (υ ν ).

Theorem 8.2. Assume that υ_0 belongs to E_m and that the singular part (with respect to the Lebesgue measure) of ω_0 is a positive measure. Then there exist a sequence (ν_n)_{n∈ℕ} converging to 0 and a solution υ of the Euler system (E) belonging to the space L^∞_loc(ℝ^+; E_m) such that

$$\lim_{n\to\infty} \upsilon_{\nu_n} = \upsilon \quad \text{weakly in the space } L^\infty_{\mathrm{loc}}(\mathbb{R}^+; E_m).$$

When the initial vorticity is bounded, the convergence is of course much stronger. It is described by the following theorem.

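Theorem 8.3. Let υ_0 be in E_m. Let us denote by (υ_ν)_{ν∈ℝ^+} the family of solutions of (NS_ν) associated with the initial datum υ_0 and by υ the solution of (E) associated with υ_0. Moreover, assume that ω_0 ∈ L^∞ ∩ L^2. Then

$$\lim_{\nu\to 0} \upsilon_\nu = \upsilon \quad \text{in the space } L^\infty_{\mathrm{loc}}(\mathbb{R}^+; E_m).$$

More precisely, we have the following estimate. Let T be a positive number; if

$$\nu \le \frac{e^{\,2 - 2\exp\left(C\|\omega_0\|_{L^\infty\cap L^2}T\right)}}{4T}$$

then we have

$$\|\upsilon_\nu - \upsilon\|^2_{L^\infty([0,T];L^2)} \le (4\nu T)^{\exp\left(-C\|\omega_0\|_{L^2\cap L^\infty}T\right)}\, \|\omega_0\|_{L^2\cap L^\infty}\, e^{\,2-2\exp\left(-C\|\omega_0\|_{L^2\cap L^\infty}T\right)}.$$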

Proof. The simplest idea is the right one for proving Theorem 8.3. Let us set w_ν = υ_ν − υ. Taking the difference between (NS_ν) and (E), we get

$$\partial_t w_\nu + \upsilon_\nu\cdot\nabla w_\nu = -\nabla\tilde{p}_\nu + \nu\Delta\upsilon_\nu - w_\nu\cdot\nabla\upsilon.$$

So we have to estimate the L^2 norm of w_ν. The classical energy estimate in L^2 says that

$$\frac{d}{dt}\|w_\nu(t)\|_{L^2}^2 = 2\nu\int \Delta\upsilon_\nu(t,x)\cdot w_\nu(t,x)\,dx - 2\int (w_\nu\cdot\nabla\upsilon)(t,x)\cdot w_\nu(t,x)\,dx. \tag{90}$$

Let us examine now the estimate about the family (υ ν ). In the context of the two-dimensional Navier–Stokes system, the conservation of vorticity (V) turns out to be

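$$\partial_t \omega_\nu + \upsilon_\nu\cdot\nabla\omega_\nu - \nu\Delta\omega_\nu = 0. \tag{V$_\nu$}$$

So, by the usual energy estimate in L^2, we have

$$\frac{d}{dt}\|\omega_\nu(t)\|_{L^2}^2 + 2\nu\|\nabla\omega_\nu\|_{L^2}^2 = 0.$$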

In particular, we infer that, for any ν ≥ 0,

$$\|\omega_\nu(t)\|_{L^2} \le \|\omega_0\|_{L^2}. \tag{91}$$

Then, thanks to the maximum principle, we have that

$$\|\omega_\nu(t)\|_{L^\infty} \le \|\omega_0\|_{L^\infty}. \tag{92}$$

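To estimate the first term on the right of (90), we write that

$$\left|\int_{\mathbb{R}^2}\Delta\upsilon_\nu(t,x)\cdot w_\nu(t,x)\,dx\right| \le \|\nabla\upsilon_\nu(t)\|_{L^2}\,\|\nabla w_\nu(t)\|_{L^2} \le \|\omega_\nu(t)\|_{L^2}\bigl(\|\omega_\nu(t)\|_{L^2} + \|\omega(t)\|_{L^2}\bigr) \le 2\|\omega_0\|_{L^2}^2. \tag{93}$$

In order to prove Theorem 8.3, we have to estimate the term

$$J(t) \overset{\mathrm{def}}{=} \int_{\mathbb{R}^2} |\nabla\upsilon(t,x)|\,|w_\nu(t,x)|^2\,dx$$

exactly as in the proof of Yudovich's theorem in Section 2.3. Thus we get

$$\frac{d}{dt}\|w_\nu\|_{L^2}^2 \le C a\,\|\omega_0\|_{L^\infty\cap L^2}\,\|w_\nu(t)\|_{L^2}^2 + C a\,\|\omega_0\|_{L^\infty\cap L^2}^{1+2/a}\,\|w_\nu(t)\|_{L^2}^{2(1-1/a)} + 4\nu\|\omega_0\|_{L^2}^2. \tag{94}$$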

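Let us define the function δ_ν(t) by

$$\delta_\nu(t) = \frac{\|w_\nu(t)\|_{L^2}^2}{\|\omega_0\|_{L^\infty\cap L^2}^2} + \delta,$$

where δ belongs to ]0, 1[. The inequality (94) can be written in the following way:

$$\frac{d}{dt}\delta_\nu(t) \le C a\,\|\omega_0\|_{L^\infty\cap L^2}\bigl(\delta_\nu(t) + \delta_\nu(t)^{1-1/a}\bigr) + 4\nu.$$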

Now, we use the fact that this inequality is valid for any a greater than 2. The following sequence of computations will be valid as long as δ_ν(t) ≤ 1. If so, we have that

$$\delta_\nu(t) \le \delta_\nu(t)^{1-1/a}.$$

Thus the above inequality becomes

$$\frac{d}{dt}\delta_\nu(t) \le C a\,\|\omega_0\|_{L^\infty\cap L^2}\,\delta_\nu(t)^{1-1/a} + 4\nu.$$

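Now choosing a = 2 − log δ_ν(t), we get that

$$\delta_\nu'(t) \le 4\nu + C\bigl(2-\log\delta_\nu(t)\bigr)\|\omega_0\|_{L^\infty\cap L^2}\bigl(1+\delta_\nu(t)^{-1/\log\delta_\nu(t)}\bigr)\delta_\nu(t).$$

It is obvious that δ_ν(t)^{−1/log δ_ν(t)} ≤ e^{−1}. Then it turns out that

$$\delta_\nu'(t) \le 4\nu + C\|\omega_0\|_{L^\infty\cap L^2}\bigl(2-\log\delta_\nu(t)\bigr)\delta_\nu(t). \tag{95}$$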

Let us define the function μ by μ(r) = r(2 − log r). By integration, we get, as long as δ_ν(t) is smaller than 1,

$$\delta_\nu(t) \le 4\nu t + C\|\omega_0\|_{L^\infty\cap L^2}\int_0^t \mu\bigl(\delta_\nu(\tau)\bigr)\,d\tau.$$

Osgood's lemma (Lemma 2.3) allows us to conclude the proof.
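To see where the shape of the final bound comes from, consider the extremal case ρ′ = Kμ(ρ), ρ(0) = c with μ(r) = r(2 − log r). Separating variables gives log(2 − log ρ(t)) = log(2 − log c) − Kt, hence ρ(t) = c^{exp(−Kt)} e^{2 − 2exp(−Kt)}, which has the same form as the estimate of Theorem 8.3 when c plays the role of 4νT and K the role of C‖ω_0‖. This closed form is a side computation, not a quotation from the source; the Python sketch below merely checks it against a Runge–Kutta integration, with c and K chosen arbitrarily.

```python
import math

def mu(r):
    """Osgood modulus from the text: mu(r) = r * (2 - log r), for 0 < r <= 1."""
    return r * (2.0 - math.log(r))

def closed_form(c, K, t):
    """Candidate solution of rho' = K * mu(rho), rho(0) = c."""
    decay = math.exp(-K * t)
    return c ** decay * math.exp(2.0 - 2.0 * decay)

def rk4(c, K, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration of rho' = K * mu(rho)."""
    h, rho = t_end / steps, c
    for _ in range(steps):
        k1 = K * mu(rho)
        k2 = K * mu(rho + 0.5 * h * k1)
        k3 = K * mu(rho + 0.5 * h * k2)
        k4 = K * mu(rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho

c, K, t_end = 1e-3, 1.0, 1.0  # arbitrary illustrative values
print(rk4(c, K, t_end), closed_form(c, K, t_end))
```

The exponent exp(−Kt) on c explains why the convergence rate in ν degrades as T grows.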

Let us remark that if the solution of the Euler system (E) belongs to the space L^1_loc(ℝ; Lip) (which is of course the case for initial data of vortex patch type), we have very easily, for any t, the better estimate

$$\|\upsilon_\nu(t) - \upsilon(t)\|_{L^2}^2 \le 4\nu\|\omega_0\|_{L^2}^2 \int_0^t \exp\left(2\int_\tau^t \|\nabla\upsilon(\tau')\|_{L^\infty}\,d\tau'\right) d\tau. \tag{96}$$

To prove this inequality, we simply observe that

$$J(t) \overset{\mathrm{def}}{=} \int_{\mathbb{R}^2} |\nabla\upsilon(t,x)|\,|w_\nu(t,x)|^2\,dx \le \|\nabla\upsilon(t,\cdot)\|_{L^\infty}\,\|w_\nu(t,\cdot)\|_{L^2}^2,$$

and then use Gronwall's lemma.

In fact, as we shall see in the next section, much more can be done in this case considering the description of the convergence of the vorticity ω ν to ω.

URL:

https://www.sciencedirect.com/science/article/pii/S1874579205800059

Wavelet Zoom

Stéphane Mallat , in A Wavelet Tour of Signal Processing (Third Edition), 2009

Self-Similar Functions

Let f be a continuous function with a compact support S. We say that f is self-similar if there exist disjoint subsets S_1, …, S_k such that the graph of f restricted to each S_i is an affine transformation of f. This means that there exist a scale l_i > 1, a translation r_i, a weight p_i, and a constant c_i such that

$$\forall t \in S_i, \quad f(t) = c_i + p_i\, f\bigl(l_i(t - r_i)\bigr). \tag{6.73}$$

Outside these subsets, we suppose that f is constant. Generalizations of this definition can also be used [128].

If a function is self-similar then its wavelet transform is also self-similar. Let g be an affine transformation of f:

$$g(t) = p\, f\bigl(l(t - r)\bigr) + c. \tag{6.74}$$

Its wavelet transform is

$$Wg(u, s) = \int_{-\infty}^{+\infty} g(t)\,\frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t - u}{s}\right) dt.$$

With the change of variable t′ = l(t − r), since ψ has a zero average, the affine relation (6.74) implies

$$Wg(u, s) = \frac{p}{\sqrt{l}}\, Wf\bigl(l(u - r),\, s l\bigr).$$

Suppose that ψ has a compact support included in [−K, K]. The affine invariance (6.73) of f over S_i = [a_i, b_i] produces an affine invariance for all wavelets having a support included in S_i. For any s < (b_i − a_i)/K and any u ∈ [a_i + Ks, b_i − Ks],

$$Wf(u, s) = \frac{p_i}{\sqrt{l_i}}\, Wf\bigl(l_i(u - r_i),\, s l_i\bigr).$$

The wavelet transform's self-similarity implies that the positions and values of its modulus maxima are also self-similar. This can be used to recover unknown affine-invariance properties with a voting procedure based on wavelet modulus maxima [310].

EXAMPLE 6.9

A Cantor measure is constructed over a Cantor set. Let dμ_0(x) = dx be the uniform Lebesgue measure on [0, 1]. As in the Cantor set construction, this measure is subdivided into three uniform measures over [0, 1/3], [1/3, 2/3], and [2/3, 1] with integrals equal to p_1, 0, and p_2, respectively. We impose p_1 + p_2 = 1 to obtain a total measure dμ_1 on [0, 1] with an integral equal to 1. This operation is iteratively repeated by dividing each uniform measure of integral p over [a, a + l] into three equal parts where the integrals are p_1 p, 0, and p_2 p, respectively, over [a, a + l/3], [a + l/3, a + 2l/3], and [a + 2l/3, a + l]. This is illustrated in Figure 6.17. After each subdivision, the resulting measure dμ_n has a unit integral. In the limit, we obtain a Cantor measure dμ_∞ of unit integral with a support that is the triadic Cantor set.

FIGURE 6.17. Two subdivisions of the uniform measure on [0, 1] with left and right weights p 1 and p 2. The Cantor measure is the limit of an infinite number of these subdivisions.

EXAMPLE 6.10

A devil's staircase is the integral of a Cantor measure:

$$f(t) = \int_0^t d\mu_\infty(x). \tag{6.75}$$

It is a continuous function that increases from 0 to 1 on [0, 1]. The recursive construction of the Cantor measure implies that f is self-similar:

$$f(t) = \begin{cases} p_1\, f(3t) & \text{if } t \in [0, 1/3] \\ p_1 & \text{if } t \in [1/3, 2/3] \\ p_1 + p_2\, f(3t - 2) & \text{if } t \in [2/3, 1]. \end{cases}$$

Figure 6.18 displays the devil's staircase obtained with p_1 = p_2 = 0.5. The wavelet transform in (b) is calculated with a wavelet that is the first derivative of a Gaussian. The self-similarity of f yields a wavelet transform and modulus maxima that are self-similar. The subdivision of each interval in three parts appears through the multiplication by 2 of the maxima lines when the scale is multiplied by 3. This Cantor construction is generalized with different interval subdivisions and weight allocations beginning from the same Lebesgue measure dμ_0 on [0, 1] [5].
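The self-similarity relation for the devil's staircase translates directly into a recursive evaluation. The sketch below is one possible way to compute f numerically; the function name `devils_staircase`, the truncation depth, and the value returned at truncation are choices made for this example.

```python
def devils_staircase(t, p1=0.5, depth=40):
    """Approximate f(t) for the Cantor measure with weights p1 and p2 = 1 - p1,
    using the three-branch self-similarity relation."""
    p2 = 1.0 - p1
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    if depth == 0:
        # Truncation: the true value lies in [0, 1]; the error in the final
        # result is at most max(p1, p2) ** depth for the initial depth.
        return p1
    if t < 1.0 / 3.0:
        return p1 * devils_staircase(3.0 * t, p1, depth - 1)
    if t <= 2.0 / 3.0:
        return p1  # flat part over the removed middle third
    return p1 + p2 * devils_staircase(3.0 * t - 2.0, p1, depth - 1)

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, devils_staircase(t))
```

With equal weights this reproduces the classical Cantor function, for which f(1/4) = 1/3 and f(3/4) = 2/3.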

FIGURE 6.18. Devil's staircase calculated from a Cantor measure with equal weights p_1 = p_2 = 0.5. (a) Wavelet transform Wf(u, s) computed with ψ = −θ′ where θ is Gaussian. (b) Wavelet transform modulus maxima.

URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000100

Measure and Integration

G. de Barra , in Encyclopedia of Physical Science and Technology (Third Edition), 2003

XI The Radon–Nikodým Property for Banach Spaces

In this section we consider measures taking their values in a Banach space, that is, a normed space which is complete (the examples of L^p, L^∞ were considered in Section IV). This is more general than the finite-dimensional case considered in Section X. We consider first the question of integrating functions taking values in a Banach space with respect to a real measure. If we then set m(A) = ∫_A f dμ, we see that such a theory of integration gives rise to a Banach space–valued measure. Which measures are formed in this way depends on the Banach space in question, since the existence of such a function f implies that a version of the Radon–Nikodým theorem holds. The most useful version of this integral is the Bochner integral. First, we describe the details of such measures. Let Y be a Banach space with norm ∥·∥. We will suppose that we have a space X with measurable sets I. Then m: I → Y is a vector measure if m(∅) = 0 and whenever {A_i} is a countable family of disjoint sets of I then m(∪_{i=1}^∞ A_i) = ∑_{i=1}^∞ m(A_i), where this sum is norm convergent, that is, the sequence of vectors ∑_{i=1}^n m(A_i) converges in the normed space (Y, ∥·∥) as n → ∞. We will suppose that the space X is equipped with a finite measure μ; for some purposes it is more convenient to assume, as we will, that μ(X) = 1, that is, μ is a probability measure. The average range of m is the set in Y given by AR(m) = {m(A)/μ(A): A ∈ I, μ(A) > 0}. As in the scalar case we say the vector-valued measure m is absolutely continuous with respect to μ (m ≪ μ) if μ(A) = 0 implies m(A) = 0 (zero vector).

Then to set up a theory of integration we consider simple functions as before of the form f = ∑_{i=1}^n x_i χ_{A_i} with x_i ∈ Y, A_i ∈ I. Then for any set A ∈ I, we can define for such an f, ∫_A f dμ = ∑_{i=1}^n x_i μ(A ∩ A_i). Then generally a function f: X → Y is Bochner integrable if there is a sequence {f_n} of simple functions with lim_{n→∞} f_n(x) = f(x) a.e. (μ) and lim_{n→∞} ∫∥f(x) − f_n(x)∥ dμ = 0. This provides an unambiguous definition if we set ∫ f dμ = lim ∫ f_n dμ, and we then write f ∈ L^1_Y(X, I, μ), and we say ∫ f dμ is the Bochner integral of f. Regarding measurability, a Y-valued function is said to be strongly measurable if it is the limit a.e. of simple functions. A weaker requirement is that each scalar-valued function F∘f, where F is a continuous linear functional on Y, should be measurable in the usual sense. For functions f with ∥f∥ integrable and which satisfy a condition on the range (always true for integrable functions), the definitions are equivalent. Much of the theory extends; for example, we have a dominated convergence theorem: if ∥f_n(x)∥ ≤ g(x) a.e. (μ), where g is integrable, and lim f_n(x) = f(x) a.e., then f is integrable, ∫ f dμ = lim ∫ f_n dμ and lim ∫∥f_n − f∥ dμ = 0 (i.e., f_n converges to f in the mean).
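The integral of a simple function is just a finite weighted sum of vectors. As a small sketch, the Python code below takes X = [0, 1) with Lebesgue measure and uses ℝ² as a finite-dimensional stand-in for the Banach space Y, with the sets A_i and A restricted to half-open intervals. These restrictions and the helper names are assumptions made so that the example stays exact and short.

```python
from fractions import Fraction

def overlap_length(a, b):
    """Lebesgue measure of the intersection of half-open intervals [a0, a1) and [b0, b1)."""
    return max(Fraction(0), min(a[1], b[1]) - max(a[0], b[0]))

def integrate_simple(terms, A):
    """Return sum_i x_i * mu(A ∩ A_i) for f = sum_i x_i * chi_{A_i}, with x_i in R^2."""
    total = [Fraction(0), Fraction(0)]
    for x, A_i in terms:
        weight = overlap_length(A, A_i)
        total = [s + weight * c for s, c in zip(total, x)]
    return tuple(total)

half = Fraction(1, 2)
# f = (1, 0) on [0, 1/2) and (0, 2) on [1/2, 1)
f = [((1, 0), (Fraction(0), half)), ((0, 2), (half, Fraction(1)))]
A = (Fraction(1, 4), Fraction(3, 4))
print(integrate_simple(f, A))  # (Fraction(1, 4), Fraction(1, 2))
```

The map A ↦ `integrate_simple(f, A)` is then a vector measure of the form m(A) = ∫_A f dμ in this toy setting.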

The Radon–Nikodým property turns out to be very important when considering such vector-valued measures and functions; whether it holds depends on the geometry of Y. Let K be a closed bounded convex set in the Banach space Y. Then K has the Radon–Nikodým property (RNP) for {X, I, μ} if for any Y-valued measure m which is absolutely continuous with respect to μ and whose average range AR(m) lies in K there exists a function f ∈ L^1_Y(X, I, μ) such that m(A) = ∫_A f dμ for each A ∈ I. More generally, if E is a closed convex (possibly unbounded) set of Y (e.g., E = Y), then E has the RNP for {X, I, μ} if each closed bounded subset K of E has the RNP for {X, I, μ}. Finally, K has the RNP if it has the RNP for each {X, I, μ} for μ a probability measure. An example of a space without the RNP is at hand.

EXAMPLE 1.

[Bourgin]: Let X = [0, 1], I = Lebesgue measurable sets, μ = Lebesgue measure. Let Y = L^1[0, 1] and define m by m(A) = χ_A for each A ∈ I. Then m ≪ μ and AR(m) lies in the closed unit ball of Y, but Y does not have the RNP. For if it had, there would exist a function f ∈ L^1_Y(μ) with m(A) = ∫_A f dμ for each A ∈ I. So for each x, f(x) is a real-valued measurable function taking the value f(x)(s) at s ∈ [0, 1]. Let I be an interval in [0, 1]; then for all A ∈ I

$$\int_A \int \chi_I(s)\, f(t)(s)\, ds\, dt = \int \chi_I(s) \left(\int_A f(t)\, dt\right)\!(s)\, ds = \int \chi_I(s)\, m(A)(s)\, ds = \int \chi_I(s)\, \chi_A(s)\, ds = \int_A \chi_I(t)\, dt.$$

So

$$\int_I f(t)(s)\, ds = \int \chi_I(s)\, f(t)(s)\, ds = \chi_I(t)$$

outside a set of measure zero, and for such t not in I, ∫_I f(t)(s) ds = 0. Allowing I to vary over all subintervals I_r of [0, 1] with rational end-points we get an exceptional set of measure zero. Any set on which the function f(t) is positive can be approximated by an interval, and choosing a subinterval I_r of this interval with t ∉ I_r we see that f(t) must vanish a.e. for almost all t. This contradicts the fact that ∫_A f dμ = χ_A ≠ 0 for A a set of positive measure.

The RNP is closely related to the convergence of martingales, which we now describe. Let {I_n} be a sequence of sub-σ-algebras of I, with I_n ⊂ I_m whenever n < m. Suppose f_n ∈ L^1_Y(X, I, μ) for each n. Suppose also that the functions f_n are strongly I_n-measurable for each n and that ∫_A f_n dμ = ∫_A f_m dμ for each A in I_n provided n < m. Then the sequence {f_n, I_n} is a Y-valued martingale. A closed bounded convex set K in Y has the martingale convergence property (MCP) for {X, I, μ} if whenever {f_n, I_n} is a martingale such that ∪_{n=1}^∞ I_n generates I and f_n ∈ L^1_K(X, I_n, μ) for each n then there exists f ∈ L^1_K(X, I, μ) such that lim_{n→∞} ∥f_n(x) − f(x)∥ = 0 a.e. (μ). The corresponding statements for closed convex sets and the definition of "Y has MCP" follow exactly as for RNP. In fact a closed bounded convex set has the RNP if, and only if, it has the MCP. To see that a closed convex set K has the MCP provided it has the RNP we let m_n(A) = ∫_A f_n dμ where μ is a probability measure and {f_n, I_n} form a martingale and A ∈ I_n. Since μ is a probability measure AR(m_n) lies in K. Then {m_n(A)} converges for each A in ∪ I_n. By a limiting argument we get lim m_n(A) = m(A), which is a measure by the Vitali-Hahn-Saks Theorem. By construction m_n ≪ μ and in the limit m ≪ μ and AR(m) ⊂ K. So by the RNP there exists f ∈ L^1(X, I, μ) such that ∫_A f dμ = m(A) for each A in I. By the martingale property ∫_A f dμ = ∫_A f_n dμ for each A ∈ I_n and by a theorem of Lévy we deduce lim_{n→∞} ∥f_n(x) − f(x)∥ = 0 a.e. (μ). This theorem of Lévy uses conditional expectations whose existence depends on the classical Radon–Nikodým result given in Section VII.

A third property of Banach spaces which turns out to be related is that of dentability. A bounded set D in Y is s-dentable if for each ε > 0 there exists x_ε in D with x_ε ∉ s-co(D ∖ U_ε(x_ε)), where U_ε(y) denotes the ball of radius ε and center y and

$$\text{s-co}(B) = \left\{ \sum_{i=1}^{\infty} \alpha_i x_i : x_i \in B,\ \alpha_i \ge 0,\ \sum_{i=1}^{\infty} \alpha_i = 1 \right\}.$$

Replacing s-co by closed convex hull, we get the stronger definition of D dentable. Drawing a diagram we see that D dentable implies that D is in some sense rotund for some part of its boundary. Dentability can be defined equivalently in terms of slices. For a bounded set D in Y, the slice s(D, f, α) = {x: x ∈ D, f(x) > sup_{y∈D} f(y) − α} where α > 0, and f is a continuous linear functional on Y. Then it is an easy consequence of the Hahn–Banach theorem that D is dentable if, and only if, it has slices of arbitrarily small diameter. Now it follows easily from the definitions that if D is dentable then it is s-dentable. But the converse is not true, as the following example shows.

EXAMPLE 2.

Let Y = C[0, 1], the Banach space of functions continuous on [0, 1] with the norm ∥f∥ = max |f(x)|, and let D be the unit ball U_1(0) of Y. Then with f(x) ≡ 1 as x_ε, for any ε > 0, we see that D is s-dentable. However, D is not dentable, for if f ∈ D and for any fixed n we choose functions f_1^n, …, f_n^n with f_i^n(t) = f(t) for t ∉ [(i − 1)/n, i/n] and |f_i^n(t) − f(t)| > 1/2 for some t ∈ ((i − 1)/n, i/n), for each i, then ∥f_i^n − f∥ > 1/2 but ∥∑_{i=1}^n (1/n) f_i^n − f∥ ≤ 2/n. So f lies in the closed convex hull of D ∖ U_{1/2}(f). So D is not dentable. However, it can be shown by a geometrical argument that if K is a closed convex set in Y with nonempty interior (int K), then if K is not dentable, int K is not s-dentable. See Davis and Phelps for details. So every bounded set of Y is dentable if, and only if, every bounded set is s-dentable. This turns out to be the important property. Indeed, if a closed bounded convex set K has every subset s-dentable then K has the RNP.

To outline the proof: consider the special case when m is of the form

$$ m(A) = \sum_{i=1}^{n} x_i \, \mu(A \cap B_i), $$

with {B_i} being a partition of X into sets of positive measure and with x_i ∈ K. Then for μ(A) > 0 and A ⊆ B_i we have

$$ \frac{m(A)}{\mu(A)} = x_i = \frac{m(B_i)}{\mu(B_i)}. $$

Set f = ∑_{i=1}^n x_i χ_{B_i}; then for each A in I we have

$$ \int_A f \, d\mu = \sum_{i=1}^{n} x_i \, \mu(A \cap B_i) = m(A). $$
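The special case above can be verified exactly on a small finite measure space. The sketch below is only an illustration under assumptions made here: X has five points, μ is given by point masses, the vectors x_i live in ℝ², and all names (mu, blocks, xs, mu_of, integral_f) are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical finite measure space: X = {0, ..., 4} with point masses mu.
mu = {0: F(1, 4), 1: F(1, 4), 2: F(1, 6), 3: F(1, 6), 4: F(1, 6)}
blocks = [frozenset({0, 1}), frozenset({2, 3, 4})]   # the partition {B_i}
xs = [(F(1), F(0)), (F(0), F(2))]                     # the vectors x_i, here in R^2

def mu_of(A):
    """mu(A) for a finite set A."""
    return sum((mu[p] for p in A), F(0))

def m(A):
    """m(A) = sum_i x_i * mu(A ∩ B_i)."""
    total = [F(0), F(0)]
    for x, B in zip(xs, blocks):
        w = mu_of(A & B)
        total = [t + c * w for t, c in zip(total, x)]
    return tuple(total)

def f(p):
    """f = sum_i x_i * chi_{B_i}, evaluated at the point p."""
    for x, B in zip(xs, blocks):
        if p in B:
            return x
    raise ValueError("point outside the partition")

def integral_f(A):
    """Integral of f over A with respect to mu (a finite sum here)."""
    total = [F(0), F(0)]
    for p in A:
        total = [t + c * mu[p] for t, c in zip(total, f(p))]
    return tuple(total)

points = sorted(mu)
subsets = [frozenset(s) for r in range(len(points) + 1)
           for s in combinations(points, r)]

# The integral of f over A equals m(A) for every measurable A.
derivative_ok = all(integral_f(A) == m(A) for A in subsets)

# For nonempty A inside B_i, the average m(A)/mu(A) equals x_i.
ratio_ok = all(
    tuple(c / mu_of(A) for c in m(A)) == x
    for x, B in zip(xs, blocks)
    for A in subsets if A and A <= B
)
print(derivative_ok, ratio_ok)
```

Exact rational arithmetic is used so that both identities can be tested with equality rather than a tolerance.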

So f = dm/dμ and K has the RNP. In general m will not be of this simple form, but we can approximate it by a sequence of such measures, obtaining a convergent sequence of such derivatives f_n. So we need to know that we can partition X into sets B_i such that for subsets B of B_i with μ(B) > 0, and for a suitable x_i in K, we have ∥m(B)/μ(B) − x_i∥ < ε. Then we use a sequence of such partitions with ε's tending to zero to obtain the approximation. So we let E = {m(B)/μ(B): μ(B) > 0, B ⊆ B_i}, a subset of K as AR(m) ⊆ K. So E is s-dentable, and we can find x_i ∉ s-co(E \ U_ε(x_i)) with x_i ∈ E, x_i = m(B)/μ(B), say. Suppose we can find sets C in B_i with ∥m(C)/μ(C) − x_i∥ ≥ ε. Maximizing such a family of disjoint sets C we could write B_i = ∪_{j ≥ 1} C_j and then

$$ x_i = \frac{m(B)}{\mu(B)} = \sum_{j=1}^{\infty} \frac{\mu(C_j)}{\mu(B)} \, x_j $$

with

$$ x_j = \frac{m(C_j)}{\mu(C_j)} \in E \quad \text{and} \quad \sum_{j=1}^{\infty} \frac{\mu(C_j)}{\mu(B)} = 1, $$

contradicting the s-dentability of E.

It can be shown fairly easily that if K has the MCP, then its subsets are s-dentable (Bourgin). Suppose not, so there is a subset D of K which is not s-dentable. Then there exists a positive ε such that for each x ∈ D, x ∈ s-co(D \ U_ε(x)), and a nonconvergent martingale can be constructed inductively, using the interval [0, 1) with μ = Lebesgue measure. At the first stage choose x_0 ∈ D, define f_0 = x_0, a constant function, and let I_0 = {[0, 1), ∅}, the minimal σ-algebra. By the property of D there exist {t_j: 0 < t_j < 1, Σ t_j = 1} and points y_j of D with ∥x_0 − y_j∥ ≥ ε and Σ t_j y_j = x_0. Partition [0, 1) into half-open intervals B_j, one for each t_j, with μ(B_j) = t_j. Let I_1 be the σ-algebra generated by {B_j}, and f_1 = Σ_{j ≥ 1} y_j χ_{B_j}. At the next stage of the induction we partition each B_j to get a larger σ-algebra and a function f_2. By the non-s-dentable property of D, the martingale so constructed will not converge.
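A concrete instance of this kind of tree can be written down in L^1[0, 1]. The sketch below uses the standard dyadic family x_{n,k} = 2^n χ_{[k/2^n, (k+1)/2^n)}, which is not taken from the source and is offered only as an illustration; functions are stored as values on 2^N dyadic cells, and the names node, l1_norm and check are hypothetical. Each node is the average of its two children (weights t_j = 1/2), and each child lies at L^1 distance 1 from its parent, so the increments never shrink.

```python
from fractions import Fraction as F

N = 6  # finest dyadic level; a function is a list of 2**N cell values

def node(n, k):
    """x_{n,k} = 2^n times the indicator of [k/2^n, (k+1)/2^n)."""
    width = 2 ** (N - n)  # number of finest cells covered by the interval
    return [F(2 ** n) if k * width <= c < (k + 1) * width else F(0)
            for c in range(2 ** N)]

def l1_norm(v):
    """L^1[0, 1] norm of a function that is constant on the finest cells."""
    return sum(abs(a) for a in v) / 2 ** N

def check(n, k):
    """Return (parent is the mean of its children, distance parent-to-child)."""
    parent = node(n, k)
    left, right = node(n + 1, 2 * k), node(n + 1, 2 * k + 1)
    mean = [(a + b) / 2 for a, b in zip(left, right)]
    gap = l1_norm([a - b for a, b in zip(parent, left)])
    return mean == parent, gap

results = [check(n, k) for n in range(N) for k in range(2 ** n)]
averaging_ok = all(ok for ok, _ in results)
gaps = {gap for _, gap in results}
unit_norm = all(l1_norm(node(n, k)) == 1
                for n in range(N + 1) for k in range(2 ** n))
print(averaging_ok, gaps, unit_norm)
```

Setting f_n equal to x_{n,k} on the interval [k/2^n, (k+1)/2^n) gives a martingale in the unit ball of L^1[0, 1] whose successive differences all have norm 1.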

This result establishes the equivalence of the properties RNP, MCP, dentability of subsets and s-dentability of subsets for a Banach space Y. Such spaces, to some extent, have the nice properties of finite-dimensional spaces. Among the various other properties of such spaces is the fact that they possess the Krein–Milman property (KMP) referred to in Section IX; that is, for any closed bounded convex set K in Y, K is the closed convex hull of its extreme points. Indeed, we saw earlier that the unit ball U_1[0] of the space C[0, 1] is not dentable: U_1[0] has just two extreme points, the constant functions +1 and −1, so C[0, 1] does not possess the KMP. To prove the KMP it is sufficient to show that every bounded convex set has an extreme point. For dentable sets this can be done using the fact that they have slices of arbitrarily small diameter: a nested decreasing sequence of such slices is found, the intersection of which yields the desired extreme point. For further details, related results, and references see Bourgin (1983) and Phelps (1988). We have seen that the spaces C[0, 1] and L^1[0, 1] do not have the RNP. However, the spaces L^p of Section IV with 1 < p < ∞ have the RNP, as have all reflexive spaces. It is not known whether the KMP implies the RNP.

Another important property of finite-dimensional spaces is that convex functions are differentiable almost everywhere. To see how this extends, we say that the real-valued function ϕ on a Banach space Y is Fréchet differentiable at y if there exists a continuous linear functional ϕ′(y) on Y such that for every ε > 0 there exists δ > 0 such that ∣ϕ(y + x) − ϕ(y) − ϕ′(y)(x)∣ ≤ ε∥x∥ whenever ∥x∥ ≤ δ. The space Y is said to be an Asplund space if every continuous convex function ϕ on a nonempty open set D in Y is Fréchet differentiable at each point y of some dense subset E of D, where E is a G_δ set, that is, E = ∩_{i ≥ 1} G_i where the sets G_i are dense open subsets of D.
The space Y is an Asplund space if, and only if, the Banach space Y* of continuous linear functionals on Y has the RNP.
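The definition of Fréchet differentiability can be tried out in the simplest case Y = ℝ with the convex function ϕ(y) = ∣y∣. The sketch below is an illustration under that assumption; the names remainder_ratio, smooth and kink are hypothetical. It evaluates the quotient ∣ϕ(y + x) − ϕ(y) − c·x∣ / ∣x∣ for a candidate derivative c: at y = 1 with c = 1 the quotient is essentially zero, while at y = 0 no slope c brings it below 1.

```python
def remainder_ratio(phi, slope, y, x):
    """|phi(y + x) - phi(y) - slope * x| / |x| for a nonzero increment x."""
    return abs(phi(y + x) - phi(y) - slope * x) / abs(x)

phi = abs
steps = [10.0 ** (-k) for k in range(1, 8)]      # increments shrinking to 0

# At y = 1 the candidate derivative is 1; the quotient should be tiny.
smooth = max(remainder_ratio(phi, 1.0, 1.0, s * h)
             for h in steps for s in (1.0, -1.0))

# At y = 0, try many candidate slopes c in [-2, 2]; for each, take the worse
# of the two directions +h and -h, then keep the best candidate overall.
candidates = [-2.0 + 0.01 * j for j in range(401)]
kink = min(
    max(remainder_ratio(phi, c, 0.0, s * h)
        for h in steps for s in (1.0, -1.0))
    for c in candidates
)
print(smooth, kink)
```

The set where ∣y∣ is differentiable is ℝ \ {0}, which is open and dense, in line with the dense G_δ requirement in the definition of an Asplund space.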

URL:

https://www.sciencedirect.com/science/article/pii/B0122274105004130

Convex Functions, Partial Orderings, and Statistical Applications

In Mathematics in Science and Engineering, 1992

6.38 Theorem

Let λ be a regular Borel measure such that ∫_0^1 |dλ(x)| < ∞, and let dx denote Lebesgue measure. Then ∫_0^1 f(x) dλ(x) ≥ ∫_0^a f(x) dx holds for all f ∈ M_0 iff

$$ \int_t^1 d\lambda(x) \ge 0 \quad \text{for every } t \in [0, 1] $$

and

$$ a \le \min_{0 \le t \le 1} \Big\{ t + \int_t^1 d\lambda(x) \Big\}. $$

Therefore

$$ a = \min_{0 \le t \le 1} \Big\{ t + \int_t^1 d\lambda(x) \Big\} $$

is the best possible choice.
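The two conditions of the theorem are easy to evaluate for a concrete measure. The sketch below assumes, purely for illustration, that λ has the signed density w(x) = 3x² − 1/2 on [0, 1], so that ∫_t^1 dλ(x) = (1 − t³) − (1 − t)/2; the names tail, grid and steps_ok are hypothetical, and the minimum is taken over a finite grid, which only approximates the minimum over [0, 1] in general. The class M_0 is not defined in this excerpt; the last check assumes that indicator functions of intervals [s, 1] may serve as test functions.

```python
from fractions import Fraction as F

def tail(t):
    """Integral of the assumed density w(x) = 3x^2 - 1/2 over [t, 1]."""
    return (1 - t ** 3) - (1 - t) / 2

grid = [F(k, 200) for k in range(201)]           # t = 0, 1/200, ..., 1

# First condition: the tail integral is nonnegative for every t.
tails_nonnegative = all(tail(t) >= 0 for t in grid)

# Second condition: the largest admissible a (minimum taken over the grid).
a = min(t + tail(t) for t in grid)

# For f = indicator of [s, 1] (an assumed test function), the left side is
# tail(s) and the right side is the length of [s, 1] ∩ [0, a].
steps_ok = all(tail(s) >= max(F(0), a - s) for s in grid)
print(tails_nonnegative, a, steps_ok)
```

For this density, t + ∫_t^1 dλ(x) = 1/2 + 3t/2 − t³, which is smallest at t = 0, so the grid minimum coincides with the true minimum.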

URL:

https://www.sciencedirect.com/science/article/pii/S0076539208628180

Convex Functions, Partial Orderings, and Statistical Applications

In Mathematics in Science and Engineering, 1992

13.19 Definition

Let f(x): ℝ^n → [0, ∞) be a probability density function such that P is absolutely continuous w.r.t. Lebesgue measure. f is said to be log-concave if

$$ f(\alpha x_1 + (1 - \alpha) x_2) \ge (f(x_1))^{\alpha} (f(x_2))^{1 - \alpha} \tag{13.19} $$

holds for all α ∈ [0, 1] and all x_1, x_2 ∈ ℝ^n. When f(x) > 0 for all x ∈ ℝ^n, then (13.19) is equivalent to

$$ \log f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha \log f(x_1) + (1 - \alpha) \log f(x_2). \tag{13.20} $$
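Inequality (13.19) can be tested numerically on simple densities. The sketch below takes n = 1 and compares the standard normal density, which is log-concave, with an equal mixture of N(−3, 1) and N(3, 1), which is not; both densities and the names normal_pdf, mixture_pdf and violates_13_19 are choices made here for illustration, and a grid check can only find counterexamples, not prove log-concavity.

```python
import math

def normal_pdf(x, mean=0.0):
    """Density of the normal distribution with the given mean and variance 1."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def mixture_pdf(x):
    """Equal mixture of N(-3, 1) and N(3, 1): a bimodal density."""
    return 0.5 * normal_pdf(x, -3.0) + 0.5 * normal_pdf(x, 3.0)

def violates_13_19(f, x1, x2, alpha, rel_tol=1e-9):
    """True if f(a*x1 + (1-a)*x2) < f(x1)^a * f(x2)^(1-a), beyond rounding."""
    lhs = f(alpha * x1 + (1.0 - alpha) * x2)
    rhs = f(x1) ** alpha * f(x2) ** (1.0 - alpha)
    return lhs < rhs * (1.0 - rel_tol)

xs = [-4.0 + 0.5 * k for k in range(17)]          # grid on [-4, 4]
alphas = [0.0, 0.25, 0.5, 0.75, 1.0]

normal_ok = not any(violates_13_19(normal_pdf, x1, x2, a)
                    for x1 in xs for x2 in xs for a in alphas)
mixture_fails = violates_13_19(mixture_pdf, -3.0, 3.0, 0.5)
print(normal_ok, mixture_fails)
```

The mixture fails at the midpoint between its two modes, where the density dips far below the geometric mean of its values at the modes.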

The main theorem of Prékopa (1971) states:

URL:

https://www.sciencedirect.com/science/article/pii/S0076539208628258