Law of large numbers. Limit theorems. Methods of statistical data analysis

02.11.2023

Law of Large Numbers

The practice of studying random phenomena shows that although the results of individual observations, even those carried out under the same conditions, may differ greatly, at the same time, the average results for a sufficiently large number of observations are stable and weakly depend on the results of individual observations. The theoretical basis for this remarkable property of random phenomena is the law of large numbers. The general meaning of the law of large numbers is that the combined action of a large number of random factors leads to a result that is almost independent of chance.

Central limit theorem

Lyapunov's theorem explains the wide prevalence of the normal distribution law and the mechanism of its formation. The theorem allows us to state that whenever a random variable is formed by the addition of a large number of independent random variables whose variances are small compared with the variance of the sum, the distribution law of this random variable turns out to be practically normal. And since random variables are always generated by a very large number of causes, and most often none of them has a variance comparable to the variance of the random variable itself, most random variables encountered in practice obey the normal distribution law.

Let us dwell in more detail on the content of the theorems of each of these groups.

In practical research, it is very important to know in what cases it is possible to guarantee that the probability of an event will be either sufficiently small or as close to one as desired.

The law of large numbers is understood as a set of propositions stating that, with probability arbitrarily close to one (or to zero), an event will occur that depends on a very large, indefinitely increasing number of random events, each of which has only a small influence on it.

More precisely, the law of large numbers is understood as a set of propositions stating that, with probability arbitrarily close to unity, the deviation of the arithmetic mean of a sufficiently large number of random variables from a constant value (the arithmetic mean of their mathematical expectations) will not exceed a given arbitrarily small number.

Individual, isolated phenomena that we observe in nature and in social life often appear as random (for example, a registered death, the gender of a child born, air temperature, etc.) due to the fact that such phenomena are influenced by many factors not related to the essence of the emergence or development of a phenomenon. It is impossible to predict their total effect on an observed phenomenon, and they manifest themselves differently in individual phenomena. Based on the results of one phenomenon, nothing can be said about the patterns inherent in many such phenomena.

However, it has long been noted that the arithmetic average of the numerical characteristics of certain attributes (relative frequencies of occurrence of an event, measurement results, etc.) is subject to only very slight fluctuations when the experiment is repeated a large number of times. In the average, the pattern inherent in the essence of the phenomena manifests itself; in it, the influence of the individual factors that made the results of single observations random cancels out. Theoretically, this behavior of the average can be explained using the law of large numbers. If certain very general conditions on the random variables are met, the stability of the arithmetic mean becomes an almost certain event. These conditions constitute the most important content of the law of large numbers.

The first example of the operation of this principle is the convergence of the frequency of occurrence of a random event to its probability as the number of trials increases, a fact established in Bernoulli's theorem (after the Swiss mathematician Jacob Bernoulli, 1654-1705). Bernoulli's theorem is one of the simplest forms of the law of large numbers and is often used in practice. For example, the frequency of occurrence of some attribute of a respondent in a sample is taken as an estimate of the corresponding probability.

The outstanding French mathematician Simeon Denis Poisson (1781-1840) generalized this theorem and extended it to the case when the probability of the event may change from trial to trial independently of the results of the previous trials. He was also the first to use the term "law of large numbers."

The great Russian mathematician Pafnuty Lvovich Chebyshev (1821-1894) proved that the law of large numbers operates in phenomena with any variation and also extends to the regularity of averages.

Further generalizations of the theorems of the law of large numbers are associated with the names of A. A. Markov, S. N. Bernstein, A. Ya. Khinchin and A. N. Kolmogorov.

The general modern formulation of the problem, the formulation of the law of large numbers, the development of ideas and methods for proving theorems related to this law belong to Russian scientists P. L. Chebyshev, A. A. Markov and A. M. Lyapunov.

CHEBYSHEV'S INEQUALITY

Let us first consider the auxiliary theorems: Chebyshev's lemma and inequality, with the help of which the law of large numbers in Chebyshev form can be easily proven.

Lemma (Chebyshev).

If a random variable X takes no negative values, then the probability that it takes a value exceeding a positive number A is at most the fraction whose numerator is the mathematical expectation of the random variable and whose denominator is the number A:

P(X > A) ≤ M(X) / A.

Proof. Let the distribution law of the random variable X be known:

P(X = x_i) = p_i (i = 1, 2, ..., n), where the values x_i are listed in ascending order.

With respect to the number A, the values of the random variable split into two groups: the first k values do not exceed A, and the remaining values x_(k+1), ..., x_n are greater than A.

Since all x_i ≥ 0, every term of the sum

M(X) = x_1 p_1 + x_2 p_2 + ... + x_n p_n

is non-negative. Therefore, discarding the first k terms of this expression, we obtain the inequality

M(X) ≥ x_(k+1) p_(k+1) + ... + x_n p_n.

Because x_i > A for i = k+1, ..., n and

p_(k+1) + ... + p_n = P(X > A),

we get

M(X) ≥ A (p_(k+1) + ... + p_n) = A · P(X > A), that is, P(X > A) ≤ M(X) / A.

Q.E.D.

Random variables can have different distributions with the same mathematical expectations. However, for them Chebyshev’s lemma will give the same estimate of the probability of one or another test result. This drawback of the lemma is related to its generality: it is impossible to achieve a better estimate for all random variables at once.

Chebyshev's inequality.

The probability that the deviation of a random variable X from its mathematical expectation exceeds a positive number ε in absolute value is at most the fraction whose numerator is the variance of the random variable and whose denominator is the square of ε:

P(|X - M(X)| > ε) ≤ D(X) / ε².

Proof. Since (X - M(X))² is a random variable that takes no negative values, we apply the inequality from Chebyshev's lemma to it with A = ε²:

P((X - M(X))² > ε²) ≤ M((X - M(X))²) / ε² = D(X) / ε².

It remains to note that the events (X - M(X))² > ε² and |X - M(X)| > ε coincide.
Q.E.D.

Consequence. Because the events |X - M(X)| > ε and |X - M(X)| ≤ ε are opposite,

P(|X - M(X)| ≤ ε) = 1 - P(|X - M(X)| > ε),

so that

P(|X - M(X)| ≤ ε) ≥ 1 - D(X) / ε²,

which is another form of Chebyshev's inequality.

Let us accept without proof the fact that Chebyshev’s lemma and inequality are also true for continuous random variables.

Chebyshev's inequality underlies the qualitative and quantitative statements of the law of large numbers. It gives an upper bound on the probability that the deviation of the value of a random variable from its mathematical expectation exceeds a certain specified number. It is remarkable that Chebyshev's inequality gives an estimate of the probability of an event for a random variable whose distribution is unknown, when only its mathematical expectation and variance are known.
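As a rough numerical illustration (a sketch added here, not part of the original exposition), the bound can be checked by simulation; the exponential distribution and the thresholds below are arbitrary choices.

```python
import numpy as np

# Minimal sketch: compare the observed tail probability P(|X - M(X)| > eps)
# with Chebyshev's upper bound D(X)/eps^2 for an arbitrary test distribution.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # example: M(X) = 2, D(X) = 4

mean, var = 2.0, 4.0
for eps in (1.0, 2.0, 4.0):
    tail = np.mean(np.abs(x - mean) > eps)      # observed probability of a large deviation
    bound = min(var / eps**2, 1.0)              # Chebyshev's bound (capped at 1)
    print(f"eps={eps}: observed {tail:.4f} <= bound {bound:.4f}")
```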

Theorem. (Law of large numbers in Chebyshev form)

If the variances of the independent random variables X_1, X_2, ..., X_n are bounded by one and the same constant C, and their number n is sufficiently large, then the probability that the deviation of the arithmetic mean of these random variables from the arithmetic mean of their mathematical expectations does not exceed in absolute value a given positive number ε, however small, is arbitrarily close to unity:

P( |(X_1 + ... + X_n)/n - (M(X_1) + ... + M(X_n))/n| ≤ ε ) → 1 as n → ∞.

We accept the theorem without proof.

Corollary 1. If independent random variables have the same mathematical expectation a, their variances are bounded by the same constant C, and their number is large enough, then however small the given positive number ε, the probability that the deviation of the arithmetic mean of these random variables from a does not exceed ε in absolute value is arbitrarily close to unity.

This theorem justifies the practice of taking the arithmetic mean of the results of a sufficiently large number of measurements made under the same conditions as an approximate value of an unknown quantity. Indeed, the measurement results are random, since they are influenced by many random factors. The absence of systematic errors means that the mathematical expectations of the individual measurement results are the same and equal to the true value of the quantity being measured. Consequently, by the law of large numbers, the arithmetic mean of a sufficiently large number of measurements will differ practically as little as desired from the true value of the desired quantity.

(Recall that errors are called systematic if they distort the measurement result in the same direction according to a more or less definite law. These include errors that appear as a result of imperfect instruments (instrumental errors), errors due to the personal characteristics of the observer (personal errors), etc.)

Corollary 2 . (Bernoulli's theorem.)

If the probability p of the occurrence of event A in each of n independent trials is constant and the number of trials is sufficiently large, then the probability that the frequency m/n of occurrence of the event differs as little as desired from p is arbitrarily close to unity:

P( |m/n - p| ≤ ε ) → 1 as n → ∞.

Bernoulli's theorem states that if the probability of an event is the same in all trials, then as the number of trials increases, the frequency of the event tends to the probability of the event and ceases to be random.
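A small simulation sketch of this statement (the probability p and the numbers of trials are arbitrary illustrative values):

```python
import numpy as np

# Sketch of Bernoulli's theorem: the relative frequency m/n of an event with
# constant probability p in independent trials approaches p as n grows.
rng = np.random.default_rng(1)
p = 0.5
for n in (100, 10_000, 1_000_000):
    m = rng.binomial(n, p)               # number of occurrences in n trials
    print(n, abs(m / n - p))             # deviation of the frequency from p
```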

In practice it is relatively rare to encounter experiments in which the probability of the occurrence of an event is the same in every trial; more often it differs from trial to trial. Poisson's theorem applies to a trial scheme of this type:

Corollary 3 . (Poisson's theorem.)

If the probability p_i of the occurrence of an event in the i-th trial does not change when the results of the previous trials become known, and the number of trials n is sufficiently large, then the probability that the frequency m/n of occurrence of the event differs arbitrarily little from the arithmetic mean of the probabilities (p_1 + ... + p_n)/n is arbitrarily close to unity:

P( |m/n - (p_1 + ... + p_n)/n| ≤ ε ) → 1 as n → ∞.

Poisson's theorem states that the frequency of an event in a series of independent trials tends to the arithmetic mean of its probabilities and ceases to be random.
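The same kind of sketch for Poisson's scheme, where the probability is allowed to change from trial to trial (the probabilities here are drawn uniformly from an arbitrary range):

```python
import numpy as np

# Sketch of Poisson's theorem: with trial-dependent probabilities p_k, the
# frequency m/n approaches the arithmetic mean (p_1 + ... + p_n)/n.
rng = np.random.default_rng(2)
for n in (100, 10_000, 1_000_000):
    p = rng.uniform(0.2, 0.8, size=n)    # arbitrary probabilities p_1, ..., p_n
    m = int((rng.random(n) < p).sum())   # occurrences in n independent trials
    print(n, abs(m / n - p.mean()))      # deviation from the mean probability
```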

In conclusion, we note that none of the theorems considered gives either an exact or even an approximate value of the desired probability, but only its lower or upper limit is indicated. Therefore, if it is necessary to establish the exact or at least approximate value of the probabilities of the corresponding events, the possibilities of these theorems are very limited.

Approximate values of the desired probabilities for large n can only be obtained using limit theorems. In them, either additional restrictions are imposed on the random variables (as is the case, for example, in Lyapunov's theorem), or random variables of a specific type are considered (for example, in the de Moivre-Laplace integral theorem).

The theoretical significance of Chebyshev's theorem, which is a very general formulation of the law of large numbers, is great. However, if we apply it to the question of whether the law of large numbers can be applied to a particular sequence of independent random variables, then, when the answer is affirmative, the theorem will often require many more random variables than are actually necessary for the law of large numbers to take effect. This disadvantage of Chebyshev's theorem is explained by its general nature. It is therefore desirable to have theorems that indicate the lower (or upper) bound of the desired probability more accurately. They can be obtained by imposing on the random variables some additional restrictions, which are usually satisfied for random variables encountered in practice.

NOTES ON THE CONTENT OF THE LAW OF LARGE NUMBERS

If the number of random variables is large enough and they satisfy some very general conditions, then no matter how they are distributed, it is almost certain that their arithmetic mean deviates as little as desired from a constant value - the arithmetic mean of their mathematical expectations, i.e. is an almost constant value. This is the content of the theorems related to the law of large numbers. Consequently, the law of large numbers is one of the expressions of the dialectical connection between chance and necessity.

One can give many examples of the emergence of new qualitative states as manifestations of the law of large numbers, primarily among physical phenomena. Let's consider one of them.

According to modern concepts, gases consist of individual particles, molecules, that are in chaotic motion, and it is impossible to say exactly where a given molecule will be at a given moment and at what speed it will move. However, observations show that the total effect of the molecules, for example the pressure of a gas on the wall of a vessel, manifests itself with amazing constancy. It is determined by the number of impacts and the force of each of them. Although both the first and the second are a matter of chance, instruments do not detect fluctuations in gas pressure under normal conditions. This is explained by the fact that, owing to the huge number of molecules even in the smallest volumes, a noticeable change in pressure is practically impossible. Consequently, the physical law stating the constancy of gas pressure is a manifestation of the law of large numbers.

The constancy of pressure and some other characteristics of gas at one time served as a compelling argument against the molecular theory of the structure of matter. Subsequently, they learned to isolate a relatively small number of molecules, ensuring that the influence of individual molecules still remained, and thus the law of large numbers could not manifest itself to a sufficient extent. Then it was possible to observe fluctuations in gas pressure, confirming the hypothesis about the molecular structure of the substance.

The law of large numbers underlies various types of insurance (insurance of human life for all possible periods, property, livestock, crops, etc.).

When planning the range of consumer goods, the population's demand for them is taken into account. This demand reveals the effect of the law of large numbers.

The sampling method widely used in statistics finds its scientific basis in the law of large numbers. For example, the quality of wheat brought from a collective farm to a procurement point is judged by the quality of the grains that happen to be caught in a small measure. There is not much grain in the measure compared with the entire batch, but in any case the measure is chosen so that there are enough grains in it for the law of large numbers to manifest itself with an accuracy that satisfies the need. We have the right to take the corresponding indicators in the sample as indicators of the contamination, moisture content and average grain weight of the entire batch of incoming grain.

Further efforts of scientists to deepen the content of the law of large numbers were aimed at obtaining the most general conditions for the applicability of this law to a sequence of random variables. There have been no fundamental successes in this direction for a long time. After P. L. Chebyshev and A. A. Markov, only in 1926 did the Soviet academician A. N. Kolmogorov manage to obtain the conditions necessary and sufficient for the law of large numbers to be applicable to a sequence of independent random variables. In 1928, the Soviet scientist A. Ya. Khinchin showed that a sufficient condition for the applicability of the law of large numbers to a sequence of independent identically distributed random variables is the existence of their mathematical expectation.

For practice it is extremely important to clarify fully the question of the applicability of the law of large numbers to dependent random variables, since phenomena in nature and society are mutually dependent and mutually determine each other. Much work has been devoted to clarifying the restrictions that need to be imposed on dependent random variables so that the law of large numbers can be applied to them; the most important results belong to the outstanding Russian scientist A. A. Markov and to the prominent Soviet scientists S. N. Bernstein and A. Ya. Khinchin.

The main result of these works is that the law of large numbers can be applied to dependent random variables provided that a strong dependence exists only between random variables with close indices, while the dependence between random variables with distant indices is sufficiently weak. Examples of random variables of this type are the numerical characteristics of climate. The weather of each day is noticeably influenced by the weather of the preceding days, and this influence noticeably weakens as the days move apart. Consequently, the long-term average temperature, pressure and other characteristics of the climate of a given area should, in accordance with the law of large numbers, be practically close to their mathematical expectations. The latter are objective characteristics of the climate of the area.

In order to experimentally test the law of large numbers, the following experiments were carried out at different times.

1. Buffon's experiment. A coin was tossed 4040 times; heads appeared 2048 times. The frequency of heads turned out to be 2048/4040 ≈ 0.5069.

2. Pearson's experiment. A coin was tossed 12,000 and 24,000 times. The frequency of heads in the first case turned out to be 0.5016, in the second 0.5005.

3. Westergaard's experiment. From an urn containing equal numbers of white and black balls, 5011 white and 4989 black balls were obtained in 10,000 draws (each drawn ball being returned to the urn). The frequency of white balls was 5011/10,000 = 0.5011, and the frequency of black balls was 0.4989.

4. V. I. Romanovsky's experiment. Four coins were tossed 20,160 times. The counts and relative frequencies of the various combinations of heads and tails were distributed as follows:

Combination of heads and tails    Count    Empirical frequency    Theoretical frequency

4 and 0                            1181     0.05858                0.0625
3 and 1                            4909     0.24350                0.2500
2 and 2                            7583     0.37614                0.3750
1 and 3                            5085     0.25224                0.2500
0 and 4                            1402     0.06954                0.0625

Total                             20160     1.00000                1.0000

The results of experimental tests of the law of large numbers convince us that experimental frequencies are very close to probabilities.
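The theoretical frequencies in Romanovsky's table follow from the binomial distribution for four fair coins; a short sketch of that computation:

```python
from math import comb

# Theoretical probability of k heads (and 4 - k tails) in a toss of four fair
# coins: P(k) = C(4, k) / 2^4. This reproduces the 0.0625 / 0.2500 / 0.3750 column.
for heads in range(4, -1, -1):
    tails = 4 - heads
    print(f"{heads} and {tails}: {comb(4, heads) / 2**4}")
```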

CENTRAL LIMIT THEOREM

It is not difficult to prove that the sum of any finite number of independent normally distributed random variables is also normally distributed.

If independent random variables are not normally distributed, then, under some rather mild restrictions imposed on them, their sum will still be approximately normally distributed.

This problem was posed and solved mainly by Russian scientists P. L. Chebyshev and his students A. A. Markov and A. M. Lyapunov.

Theorem (Lyapunov).

If the independent random variables X_1, X_2, ..., X_n have finite mathematical expectations and finite variances, their number n is sufficiently large, and as n increases without bound

(b_1 + b_2 + ... + b_n) / (D(X_1) + D(X_2) + ... + D(X_n))^(3/2) → 0,

where b_i are the absolute central moments of the third order, then their sum X_1 + X_2 + ... + X_n has, to a sufficient degree of accuracy, a normal distribution with mathematical expectation M(X_1) + ... + M(X_n) and variance D(X_1) + ... + D(X_n).

(In fact, we present not Lyapunov’s theorem, but one of its corollaries, since this corollary is quite sufficient for practical applications. Therefore, the condition, which is called Lyapunov’s condition, is a stronger requirement than is necessary to prove Lyapunov’s theorem itself.)

The meaning of the condition is that the effect of each term (random variable) is small compared to the total effect of all of them. Many random phenomena occurring in nature and in social life proceed precisely according to this pattern. In this regard, Lyapunov's theorem is of exceptionally great importance, and the normal distribution law is one of the basic laws in probability theory.

Suppose, for example, that a certain quantity is measured. Various deviations of the observed values from its true value (mathematical expectation) arise as a result of the influence of a very large number of factors, each of which generates a small error. Then the total measurement error is a random variable which, by Lyapunov's theorem, should be distributed according to the normal law.
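A simulation sketch of this mechanism (the number of elementary errors, their uniform distribution and their range are assumed purely for illustration): summing many small independent errors produces a total error whose quantiles are close to those of the corresponding normal law.

```python
import numpy as np

# Sketch: a total error formed as the sum of many small independent errors
# (uniform on [-0.01, 0.01]) is close to normal with variance n * (0.02)^2 / 12.
rng = np.random.default_rng(3)
n_terms, n_samples = 200, 100_000
total_error = rng.uniform(-0.01, 0.01, size=(n_samples, n_terms)).sum(axis=1)

sigma = np.sqrt(n_terms * 0.02**2 / 12)          # standard deviation of the sum
for level, z in ((0.8413, 1.0), (0.9772, 2.0)):  # normal CDF values at 1 and 2 sigma
    print(np.quantile(total_error, level), z * sigma)
```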

When firing a gun, projectiles are scattered over a certain area under the influence of a very large number of random causes. Random impacts on the projectile trajectory can be considered independent. Each cause produces only a slight change in the trajectory compared with the total change under the influence of all the causes. Therefore, we should expect that the deviation of the point where the projectile bursts from the target will be a random variable distributed according to the normal law.

According to Lyapunov's theorem, we can expect, for example, that adult male height is a random variable distributed according to the normal law. This hypothesis, like those considered in the previous two examples, agrees well with observations. To confirm this, the distribution by height of 1000 adult male workers was compared with the corresponding theoretical numbers of men, i.e. the numbers of men who should fall in these height groups on the assumption that the height of men is distributed according to the normal law.

Height, cm: 143-146, 146-149, 149-152, 152-155, 155-158, 158-161, 161-164, 164-167, 167-170, 170-173, 173-176, 176-179, 179-182, 182-185, 185-188; for each interval the table gave the number of men according to the experimental data and the theoretical prediction based on the normal law.

It would be difficult to expect a more accurate agreement between the experimental data and the theoretical data.

One can easily prove as a consequence of Lyapunov’s theorem a proposition that will be necessary in the future to justify the sampling method.

Proposition.

The sum of a sufficiently large number of identically distributed random variables possessing absolute central moments of the third order is distributed approximately according to the normal law.
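In the usual notation (which the text above does not write out), this proposition can be recorded as convergence in distribution of the normalized sum; here μ and σ² denote the common mathematical expectation and variance of the identically distributed terms:

\[
\frac{X_1 + X_2 + \dots + X_n - n\mu}{\sigma\sqrt{n}}
\;\xrightarrow[\,n\to\infty\,]{}\; N(0,1),
\qquad \mu = M(X_i),\quad \sigma^2 = D(X_i).
\]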

The limit theorems of probability theory, the de Moivre-Laplace theorems, explain the nature of the stability of the frequency of occurrence of an event. This nature lies in the fact that the limiting distribution of the number of occurrences of an event, as the number of trials increases without bound (if the probability of the event is the same in all trials), is the normal distribution.

System of random variables.

The random variables considered above were one-dimensional, i.e. determined by a single number. There are, however, random variables that are determined by two, three, etc. numbers. Such random variables are called two-dimensional, three-dimensional, etc.

Depending on the type of random variables included in the system, systems can be discrete, continuous or mixed if the system includes different types of random variables.

Let's take a closer look at systems of two random variables.

Definition. The distribution law of a system of random variables is a relation that establishes a connection between the regions of possible values of the system of random variables and the probabilities of the system appearing in these regions.

Example. Two balls are drawn from an urn containing 2 white and 3 black balls. Let X be the number of white balls drawn, and let the random variable Y be defined as follows:


Let's create a distribution table for the system of random variables:

The probability P(X = 0) that no white balls are drawn (and, therefore, two black balls are drawn) is

P(X = 0) = C(3,2) / C(5,2) = 3/10 = 0.3.

The probability P(X = 1) that one white ball (and, therefore, one black ball) is drawn is

P(X = 1) = C(2,1) C(3,1) / C(5,2) = 6/10 = 0.6.

The probability P(X = 2) that two white balls (and, therefore, no black balls) are drawn is

P(X = 2) = C(2,2) / C(5,2) = 1/10 = 0.1.

Combining these probabilities with the values of Y, we obtain the distribution series of the two-dimensional random variable (X, Y).
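The marginal probabilities of X obtained above can be checked by direct enumeration of the equally likely pairs of balls; a small sketch:

```python
from itertools import combinations
from collections import Counter

# Sketch: distribution of X, the number of white balls among two balls drawn
# from an urn with 2 white and 3 black balls, by enumerating all C(5, 2) pairs.
balls = ["white"] * 2 + ["black"] * 3
counts = Counter(sum(b == "white" for b in pair)
                 for pair in combinations(balls, 2))
total = sum(counts.values())                     # 10 equally likely outcomes
for k in sorted(counts):
    print(f"P(X = {k}) = {counts[k]}/{total} = {counts[k] / total}")
```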

Definition. The distribution function of a system of two random variables is the function of two arguments F(x, y) equal to the probability of the joint fulfillment of the two inequalities X < x and Y < y:

F(x, y) = P(X < x, Y < y).


Let us note the following properties of the distribution function of a system of two random variables:

1) 0 ≤ F(x, y) ≤ 1;

2) the distribution function is a non-decreasing function of each argument;

3) F(-∞, y) = F(x, -∞) = F(-∞, -∞) = 0;

4) F(x, +∞) = F_1(x) is the distribution function of the component X, F(+∞, y) = F_2(y) is the distribution function of the component Y, and F(+∞, +∞) = 1;

5) the probability that the random point (X, Y) falls in an arbitrary rectangle with sides parallel to the coordinate axes is calculated by the formula

P(x_1 ≤ X < x_2, y_1 ≤ Y < y_2) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1).

Distribution density of a system of two random variables.

Definition. The joint distribution density of the probabilities of a two-dimensional random variable (X, Y) is the second mixed partial derivative of the distribution function:

f(x, y) = ∂²F(x, y) / (∂x ∂y).

If the distribution density is known, then the distribution function can be found using the formula

F(x, y) = ∫∫ f(u, v) du dv, the integration being taken over u < x, v < y.

The two-dimensional distribution density is non-negative, and the double integral of the two-dimensional density over the whole plane is equal to one.

From the known joint distribution density one can find the distribution density of each of the components of the two-dimensional random variable:

f_1(x) = ∫ f(x, y) dy;   f_2(y) = ∫ f(x, y) dx,

the integrals being taken from -∞ to +∞.
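A numerical sketch of the last two formulas, using an assumed joint density f(x, y) = x + y on the unit square (any valid joint density could be substituted); the computed marginal is compared with the analytic answer f_1(x) = x + 1/2.

```python
import numpy as np

# Sketch: compute a marginal density by numerically integrating the joint
# density f(x, y) = x + y over y on [0, 1] (midpoint rule).
def f(x, y):
    return x + y

dy = 1e-4
y = np.arange(0.0, 1.0, dy) + dy / 2            # midpoints of the integration cells
for x in (0.25, 0.5, 0.75):
    f1 = float(np.sum(f(x, y)) * dy)            # f_1(x) = integral of f(x, y) dy
    print(x, round(f1, 4), x + 0.5)             # compare with the analytic marginal
```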

Conditional laws of distribution.

As shown above, knowing the joint distribution law, you can easily find the distribution laws of each random variable included in the system.

However, in practice, the inverse problem is often faced - using the known laws of distribution of random variables, find their joint distribution law.

In the general case, this problem is unsolvable, because the distribution law of a random variable does not say anything about the relationship of this variable with other random variables.

In addition, if the random variables are dependent on each other, then the joint distribution law cannot be expressed through the distribution laws of the components, because it must also reflect the connections between the components.

All this leads to the need to consider conditional distribution laws.

Definition. The distribution of one random variable included in the system, found under the condition that another random variable has taken a certain value, is called conditional distribution law.

The conditional distribution law can be specified by both the distribution function and the distribution density.

The conditional distribution densities are calculated using the formulas

f(x | y) = f(x, y) / f_2(y),   f(y | x) = f(x, y) / f_1(x).

The conditional distribution density has all the properties of the distribution density of one random variable.

Conditional mathematical expectation.

Definition. The conditional mathematical expectation of a discrete random variable Y at X = x (where x is a certain possible value of X) is the sum of the products of all possible values of Y and their conditional probabilities:

M(Y | X = x) = Σ y_j p(y_j | x).

For continuous random variables:

M(Y | X = x) = ∫ y f(y | x) dy,

where f(y | x) is the conditional density of the random variable Y at X = x.

The conditional mathematical expectation M(Y | x) = f(x) is a function of x and is called the regression function of Y on X.

Example. Find the conditional mathematical expectation of the component Y at X = x_1 = 1 for the discrete two-dimensional random variable given by the table:

Y \ X      x_1 = 1    x_2 = 3    x_3 = 4    x_4 = 8

y_1 = 3      0.15       0.06       0.25       0.04

y_2 = 6      0.30       0.10       0.03       0.07
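A worked solution of this example (the computation itself is not written out in the text above, but it follows directly from the table):

\[
P(X = 1) = 0.15 + 0.30 = 0.45,\qquad
p(y_1 \mid x_1) = \frac{0.15}{0.45} = \frac{1}{3},\qquad
p(y_2 \mid x_1) = \frac{0.30}{0.45} = \frac{2}{3},
\]
\[
M(Y \mid X = 1) = 3\cdot\frac{1}{3} + 6\cdot\frac{2}{3} = 5.
\]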

The conditional variance and conditional moments of a system of random variables are determined similarly.

Dependent and independent random variables.

Definition. Random variables are called independent, if the distribution law of one of them does not depend on the value of the other random variable.

The concept of dependence of random variables is very important in probability theory.

Conditional distributions of independent random variables are equal to their unconditional distributions.

Let us determine the necessary and sufficient conditions for the independence of random variables.

Theorem. In order for the random variables X and Y to be independent, it is necessary and sufficient that the distribution function of the system (X, Y) be equal to the product of the distribution functions of the components:

F(x, y) = F_1(x) F_2(y).

A similar theorem can be formulated for the distribution density:

Theorem. In order for the random variables X and Y to be independent, it is necessary and sufficient that the joint distribution density of the system (X, Y) be equal to the product of the distribution densities of the components:

f(x, y) = f_1(x) f_2(y).

The correlation moment of the random variables X and Y is the mathematical expectation of the product of their deviations:

μ_xy = M[(X - m_x)(Y - m_y)],   where m_x = M(X), m_y = M(Y).

The following formulas are used in practice.

For discrete random variables:

μ_xy = Σ_i Σ_j (x_i - m_x)(y_j - m_y) p_ij.

For continuous random variables:

μ_xy = ∫∫ (x - m_x)(y - m_y) f(x, y) dx dy.

The correlation moment serves to characterize the relationship between random variables. If random variables are independent, then their correlation moment is equal to zero.

The correlation moment has a dimension equal to the product of the dimensions of the random variables X and Y. This is a disadvantage of this numerical characteristic, because with different units of measurement different correlation moments are obtained, which makes it difficult to compare the correlation moments of different random variables.

In order to eliminate this drawback, another characteristic is used - the correlation coefficient.

Definition. The correlation coefficient r_xy of random variables X and Y is the ratio of the correlation moment to the product of the standard deviations of these quantities:

r_xy = μ_xy / (σ_x σ_y).

The correlation coefficient is a dimensionless quantity. For independent random variables, the correlation coefficient is zero.
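A short simulation sketch of these definitions (the linear relationship Y = 2X + noise is an arbitrary illustrative construction): it computes the sample correlation moment and the dimensionless correlation coefficient.

```python
import numpy as np

# Sketch: sample correlation moment (covariance) and correlation coefficient
# for two dependent random variables.
rng = np.random.default_rng(4)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(scale=0.5, size=100_000)

mu_xy = np.mean((x - x.mean()) * (y - y.mean()))   # correlation moment
r_xy = mu_xy / (x.std() * y.std())                 # correlation coefficient, |r_xy| <= 1
print(mu_xy, r_xy)
```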

Property: The absolute value of the correlation moment of two random variables X and Y does not exceed the geometric mean of their variances.

Property: The absolute value of the correlation coefficient does not exceed one.

Random variables are called correlated, if their correlation moment is different from zero, and uncorrelated, if their correlation moment is zero.

If random variables are independent, then they are uncorrelated, but from uncorrelatedness one cannot conclude that they are independent.

If two quantities are dependent, then they can be either correlated or uncorrelated.

Often, from a given distribution density of a system of random variables, one can determine the dependence or independence of these variables.

Along with the correlation coefficient, the degree of dependence of random variables can also be characterized by another quantity, called the covariance coefficient.

Example. A distribution density of a system of random variables (X, Y) is given from which it can be seen that X and Y are independent. Of course, they will then also be uncorrelated.

Linear regression.

Consider a two-dimensional random variable ( X, Y), where X and Y are dependent random variables.

Let us approximately represent one random variable as a function of another. An exact match is not possible. We will assume that this function is linear.

To determine this function g(X) = aX + b, it remains only to find the constant values a and b.

Definition. A function g(X) is called the best approximation of the random variable Y in the sense of the method of least squares if the mathematical expectation

M[(Y - g(X))²]

takes the smallest possible value; the function g(x) is then called the mean square regression of Y on X.

Theorem. The linear mean square regression of Y on X is calculated by the formula

g(x) = m_y + r (σ_y / σ_x)(x - m_x),

where m_x = M(X), m_y = M(Y), σ_x and σ_y are the standard deviations of X and Y, and r = r_xy is their correlation coefficient.

The quantity ε² = σ_y²(1 - r²) is called the residual variance of the random variable Y relative to the random variable X. It characterizes the magnitude of the error made when the random variable Y is replaced by the linear function g(X) = aX + b.

It is clear that if r = ±1, then the residual variance is zero; the error is therefore zero, and the random variable Y is represented exactly by a linear function of the random variable X.

The mean square regression line of X on Y is determined similarly by the formula

x = m_x + r (σ_x / σ_y)(y - m_y).

If X and Y both have linear regression functions with respect to each other, then the quantities X and Y are said to be connected by a linear correlation dependence.
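A simulation sketch of the regression formula (the synthetic data Y = 1.5X - 4 + noise is an assumed example): the slope and intercept obtained from r, σ_x, σ_y and the means coincide, up to sampling error, with an ordinary least-squares fit.

```python
import numpy as np

# Sketch: linear mean square regression y = m_y + r*(sigma_y/sigma_x)*(x - m_x)
# compared with a least-squares fit on synthetic data.
rng = np.random.default_rng(5)
x = rng.normal(loc=3.0, scale=2.0, size=50_000)
y = 1.5 * x - 4.0 + rng.normal(scale=1.0, size=50_000)

r = np.corrcoef(x, y)[0, 1]
slope = r * y.std() / x.std()                    # coefficient from the regression formula
intercept = y.mean() - slope * x.mean()
ls_slope, ls_intercept = np.polyfit(x, y, 1)     # ordinary least squares for comparison
print(slope, intercept)
print(ls_slope, ls_intercept)
```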

Theorem. If a two-dimensional random variable ( X, Y) is normally distributed, then X and Y are connected by a linear correlation.

E.G. Nikiforova


The law of large numbers in probability theory states that the empirical mean (arithmetic mean) of a sufficiently large finite sample from a fixed distribution is close to the theoretical mean (mathematical expectation) of that distribution. Depending on the type of convergence, a distinction is made between the weak law of large numbers, when convergence is in probability, and the strong law of large numbers, when convergence is almost sure.

There is always a finite number of trials such that, with any probability given in advance that is less than 1, the relative frequency of occurrence of some event will differ from its probability by as little as desired.

The general meaning of the law of large numbers: the joint action of a large number of identical and independent random factors leads to a result that, in the limit, does not depend on chance.

Methods for estimating probability based on finite sample analysis are based on this property. A clear example is the forecast of election results based on a survey of a sample of voters.


Weak law of large numbers

The weak law of large numbers is also called Bernoulli's theorem, after Jacob Bernoulli, whose proof of it was published in 1713.

Let there be an infinite sequence of identically distributed and uncorrelated random variables $X_1, X_2, \ldots$, i.e. their covariance $\mathrm{cov}(X_i, X_j) = 0$ for all $i \neq j$. Let $\mathbb{E} X_i = \mu$ for all $i$. Denote by $\bar{X}_n$ the sample mean of the first $n$ terms:

$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$.

Then $\bar{X}_n \xrightarrow{\mathbb{P}} \mu$.

That is, for any positive $\varepsilon$

$\lim_{n\to\infty} \Pr\left(\,|\bar{X}_n - \mu| < \varepsilon\,\right) = 1$.

Strong law of large numbers

Let there be an infinite sequence of independent identically distributed random variables $\{X_i\}_{i=1}^{\infty}$ defined on one probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Let $\mathbb{E} X_i = \mu$ for all $i \in \mathbb{N}$. Denote by $\bar{X}_n$ the sample mean of the first $n$ terms:

$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \quad n \in \mathbb{N}$.

Then $\bar{X}_n \to \mu$ almost surely:

$\Pr\left(\lim_{n\to\infty} \bar{X}_n = \mu\right) = 1$.

Like any mathematical law, the law of large numbers can be applied to the real world only under certain assumptions, which can be satisfied only with some degree of accuracy. For example, the conditions of successive trials often cannot be maintained indefinitely and with absolute accuracy. In addition, the law of large numbers speaks only of the improbability of a significant deviation of the average value from the mathematical expectation.


What is the secret of successful salespeople? If you observe the best salespeople in any company, you will notice that they have one thing in common. Each of them meets with more people and makes more presentations than less successful salespeople. These people understand that sales is a numbers game and the more people they tell about their products or services, the more deals they will close - that's all. They understand that if they communicate not only with those few who will definitely say yes to them, but also with those whose interest in their offer is not so great, then the law of averages will work in their favor.


Your income will depend on the number of sales, but at the same time, it will be directly proportional to the number of presentations you make. Once you understand and practice the law of averages, the anxiety associated with starting a new business or working in a new field will begin to decrease. As a result, a sense of control and confidence in your ability to earn money will begin to grow. If you just make presentations and hone your skills in the process, deals will come.

Instead of thinking about the number of deals, think better about the number of presentations. There is no point in waking up in the morning or coming home in the evening and wondering who will buy your product. Instead, it's best to plan how many calls you need to make each day. And then, no matter what - make all these calls! This approach will make your work easier - because it is a simple and specific goal. If you know that you have a specific and achievable goal, it will be easier for you to make the planned number of calls. If you hear “yes” a couple of times during this process, so much the better!

And if “no,” then in the evening you will feel that you honestly did everything you could, and you will not be tormented by thoughts of how much money you earned or how many companions you acquired in a day.

Let's say in your company or business, the average salesperson closes one deal per four presentations. Now imagine that you are drawing cards from a deck. Each card of the three suits - spades, diamonds and clubs - is a presentation in which you professionally present a product, service or opportunity. You do it as well as you can, but you still don't close the deal. And each heart card is a deal that allows you to get money or acquire a new companion.

In such a situation, wouldn't you want to draw as many cards from the deck as possible? Let's say you are offered to draw as many cards as you want, while paying you or offering you a new companion each time you draw a heart card. You will start drawing cards enthusiastically, barely noticing what suit the card you just pulled out is.

You know that in a deck of fifty-two cards there are thirteen hearts. And in two decks there are twenty-six heart cards, and so on. Will you be disappointed when you draw spades, diamonds or clubs? Of course not! You will only think that each such “miss” brings you closer to what? To the heart card!

But you know what? You have already been given such an offer. You are in a unique position to earn as much as you want and draw as many hearts as you want to draw in your life. And if you simply “draw cards” conscientiously, improve your skills and endure a little spades, diamonds and clubs, you will become an excellent salesman and achieve success.

One of the things that makes sales so fun is that every time you shuffle the deck, the cards are shuffled differently. Sometimes all the hearts end up at the beginning of the deck, and after a lucky streak (when it seems to us that we will never lose!) a long row of cards of a different suit awaits us. And other times, to get to the first heart, you will have to go through an endless number of spades, clubs and diamonds. And sometimes cards of different suits appear strictly in order. But in any case, in every deck of fifty-two cards, in some order, there are always thirteen hearts. Just pull out cards until you find them.




LAW OF LARGE NUMBERS

a general principle, by virtue of which the combination of random factors leads, under certain very general conditions, to a result almost independent of chance. The convergence of the frequency of occurrence of a random event with its probability as the number of trials increases (first noticed, apparently, in gambling) can serve as the first example of the operation of this principle.

At the turn of the 17th and 18th centuries J. Bernoulli proved a theorem stating that, in a sequence of independent trials in each of which a certain event A occurs with the same probability p, the relation

P( |μ_n / n - p| > ε ) → 0 as n → ∞    (1)

holds for any ε > 0; here μ_n is the number of occurrences of the event in the first n trials and μ_n / n is its frequency of occurrence. This Bernoulli theorem was extended by S. Poisson to the case of a sequence of independent trials in which the probability of occurrence of the event A may depend on the number of the trial. Let this probability for the k-th trial be p_k and let

p̄_n = (p_1 + p_2 + ... + p_n) / n.

Then Poisson's theorem states that

P( |μ_n / n - p̄_n| > ε ) → 0 as n → ∞    (2)

for any ε > 0. The first rigorous proof of this theorem was given by P. L. Chebyshev (1846), whose method is completely different from Poisson's and is based on certain extremal considerations; Poisson derived (2) from an approximate formula for the indicated probability, based on the use of Gauss's law, which at that time had not yet been rigorously substantiated. The term "law of large numbers" is first encountered in Poisson's work; he used it to name his generalization of Bernoulli's theorem.

A natural further generalization of the theorems of Bernoulli and Poisson arises if we notice that the random variable μ_n can be represented as a sum

μ_n = X_1 + X_2 + ... + X_n

of independent random variables, where X_k = 1 if A occurs in the k-th trial and X_k = 0 otherwise. In this case the mathematical expectation of μ_n / n (which coincides with the arithmetic mean of the mathematical expectations of the X_k) is equal to p in the Bernoulli case and to p̄_n in the Poisson case. In other words, in both cases one considers the deviation of the arithmetic mean of the X_k from the arithmetic mean of their mathematical expectations.

In P. L. Chebyshev's work "On mean values" (1867) it was established that, for independent random variables X_1, X_2, ..., the relation

P( |(X_1 + ... + X_n)/n - (E X_1 + ... + E X_n)/n| > ε ) → 0 as n → ∞    (3)

(for any ε > 0) is true under very general assumptions. Chebyshev assumed that the mathematical expectations E X_k² are all bounded by the same constant, although it is clear from his proof that the requirement that the variances be bounded, D X_k ≤ C, is sufficient, or even the requirement

(D X_1 + ... + D X_n) / n² → 0 as n → ∞.

Thus Chebyshev showed the possibility of a broad generalization of Bernoulli's theorem. A. A. Markov noted the possibility of further generalizations and suggested applying the name "law of large numbers" to the entire set of generalizations of Bernoulli's theorem [and in particular to (3)]. Chebyshev's method is based on a precise derivation of the general properties of mathematical expectations and on the use of the so-called Chebyshev inequality [for the probability in (3) it gives an estimate of the form

(D X_1 + ... + D X_n) / (n² ε²);

this bound can be replaced by a more accurate one, of course under more substantial restrictions, see the Bernstein inequality]. Subsequent proofs of various forms of the law of large numbers are to one degree or another developments of Chebyshev's method. Applying an appropriate "truncation" of the random variables (replacing them by auxiliary variables X'_k, namely X'_k = X_k if |X_k| ≤ L_k and X'_k = 0 otherwise, where the L_k are certain constants), Markov extended the law of large numbers to cases where the variances of the terms do not exist. For example, he showed that (3) holds if, for certain constants δ > 0 and C, the inequality E|X_k|^(1+δ) ≤ C holds for all k.

The practice of studying random phenomena shows that although the results of individual observations, even those carried out under the same conditions, may differ greatly, at the same time, the average results for a sufficiently large number of observations are stable and weakly depend on the results of individual observations.

The theoretical basis for this remarkable property of random phenomena is the law of large numbers. The name "law of large numbers" combines a group of theorems that establish the stability of the average results of a large number of random phenomena and explain the reason for this stability.

The simplest form of the law of large numbers, and historically the first theorem of this section, is Bernoulli's theorem, which states that if the probability of an event is the same in all trials, then as the number of trials increases, the frequency of the event tends to the probability of the event and ceases to be random.

Poisson's theorem states that the frequency of an event in a series of independent trials tends to the arithmetic mean of its probabilities and ceases to be random.

The limit theorems of probability theory, the de Moivre-Laplace theorems, explain the nature of the stability of the frequency of occurrence of an event. This nature lies in the fact that the limiting distribution of the number of occurrences of an event with an unlimited increase in the number of trials (if the probability of the event is the same in all trials) is the normal distribution.

The central limit theorem explains the wide prevalence of the normal distribution law. The theorem states that whenever a random variable is formed as a result of the addition of a large number of independent random variables with finite variances, the distribution law of this random variable turns out to be practically normal.

The theorem given below entitled " Law of Large Numbers" states that under certain, fairly general conditions, with an increase in the number of random variables, their arithmetic mean tends to the arithmetic mean of mathematical expectations and ceases to be random.

Lyapunov's theorem explains the wide prevalence of the normal distribution law and explains the mechanism of its formation. The theorem allows us to state that whenever a random variable is formed as a result of the addition of a large number of independent random variables whose variances are small compared with the variance of the sum, the distribution law of this random variable turns out to be practically normal.

And since random variables are always generated by a very large number of causes, and most often none of them has a variance comparable to the variance of the random variable itself, most random variables encountered in practice obey the normal distribution law. The qualitative and quantitative statements of the law of large numbers are based on the Chebyshev inequality. It determines an upper bound on the probability that the deviation of the value of a random variable from its mathematical expectation is greater than a certain specified number.

It is remarkable that Chebyshev's inequality gives an estimate of the probability of an event for a random variable whose distribution is unknown, when only its mathematical expectation and variance are known.

Chebyshev's inequality. If a random variable X has variance, then for any ε > 0 the following inequality holds:

P( |X - M(X)| ≥ ε ) ≤ D(X) / ε²,

where M(X) and D(X) are the mathematical expectation and variance of the random variable X.

Bernoulli's theorem. Let m_n be the number of successes in n Bernoulli trials and p the probability of success in an individual trial. Then for any ε > 0

P( |m_n / n - p| < ε ) → 1 as n → ∞.