What does the three sigma rule describe? Eggheads in the markets: three sigma criterion in trading

21.09.2019

Let us find the probability that a normally distributed random variable takes a value from the interval (a − 3σ, a + 3σ):

$$p(a - 3\sigma < X < a + 3\sigma) = \Phi(3) - \Phi(-3) = 2\Phi(3) = 2 \cdot 0.49865 = 0.9973,$$

where Φ is the Laplace function.

Therefore, the probability that the value of the random variable falls outside this interval is 0.0027, that is, 0.27%, and can be considered negligible. Thus, in practice it can be assumed that all possible values of a normally distributed random variable lie in the interval (a − 3σ, a + 3σ).

This result allows us to formulate the three sigma rule: if a random variable is normally distributed, then the absolute value of its deviation from x = a practically never exceeds 3σ.
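The figure 0.9973 can be checked numerically. The sketch below is only an illustration: it relies on the identity P(|X − a| < kσ) = erf(k/√2) for a normal variable, and the function name `prob_within_k_sigma` is made up for this example.

```python
import math


def prob_within_k_sigma(k: float) -> float:
    """Probability that a normal variable deviates from its mean by less than k sigma."""
    # For a normal distribution this probability equals erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


p_inside = prob_within_k_sigma(3)
print(f"inside 3 sigma:  {p_inside:.4f}")
print(f"outside 3 sigma: {1 - p_inside:.4f}")
```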

16.7. Exponential distribution.

Definition. A probability distribution of a continuous random variable X is called exponential if it is described by the density

$$f(x) = \begin{cases} 0, & x < 0, \\ \lambda e^{-\lambda x}, & x \ge 0. \end{cases}$$

Unlike the normal distribution, the exponential law is determined by only one parameter, λ. This is its advantage, since the distribution parameters are usually not known in advance and have to be estimated approximately. Clearly, it is easier to estimate one parameter than several.

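Let us find the distribution function of the exponential law. For x ≥ 0,

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{0}^{x} \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}.$$

Hence,

$$F(x) = \begin{cases} 0, & x < 0, \\ 1 - e^{-\lambda x}, & x \ge 0. \end{cases}$$

Now we can find the probability that an exponentially distributed random variable falls into the interval (a, b), where 0 ≤ a < b:

$$p(a < X < b) = F(b) - F(a) = e^{-\lambda a} - e^{-\lambda b}.$$

Values of the function $e^{-x}$ can be found from tables.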
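Instead of tables, the interval probability can be evaluated directly. This is a minimal sketch; the parameter values λ = 2, a = 0.5, b = 1 are hypothetical and chosen only to illustrate the call.

```python
import math


def exp_cdf(x: float, lam: float) -> float:
    """Distribution function F(x) of the exponential law with parameter lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exp_interval_prob(a: float, b: float, lam: float) -> float:
    """Probability that the variable falls into (a, b): F(b) - F(a)."""
    return exp_cdf(b, lam) - exp_cdf(a, lam)


# Hypothetical parameters, for illustration only.
print(exp_interval_prob(0.5, 1.0, 2.0))
```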

16.8. Reliability function.

Let an element (that is, some device) start working at time $t_0 = 0$ and be required to work for a period of time t. Let T denote a continuous random variable, the failure-free operation time of the element; then the function F(t) = p(T < t) gives the probability of failure within time t. Therefore, the probability of failure-free operation during the same time is equal to

R(t) = p(T > t) = 1 – F(t).

This function is called the reliability function.

16.9. The exponential law of reliability.

Often the duration of failure-free operation of an element has an exponential distribution, that is,

$$F(t) = 1 - e^{-\lambda t}.$$

Therefore, the reliability function in this case has the form:

$$R(t) = 1 - F(t) = 1 - (1 - e^{-\lambda t}) = e^{-\lambda t}.$$

Definition. The exponential law of reliability is the reliability function defined by the equality

$$R(t) = e^{-\lambda t},$$

where λ is the failure rate.

Example. Let the failure-free operation time of an element be distributed according to an exponential law with distribution density $f(t) = 0.1\,e^{-0.1 t}$ for t ≥ 0. Find the probability that the element will operate without failure for 10 hours.

Solution. Since λ = 0.1, $R(10) = e^{-0.1 \cdot 10} = e^{-1} \approx 0.368$.
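The same calculation can be reproduced in a few lines. The function name `reliability` is an illustrative choice, not part of the original text.

```python
import math


def reliability(t: float, failure_rate: float) -> float:
    """Exponential law of reliability: R(t) = exp(-failure_rate * t)."""
    return math.exp(-failure_rate * t)


r10 = reliability(10, 0.1)
print(f"R(10) = {r10:.3f}")  # prints "R(10) = 0.368"
```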

16.10. Expected value.

Definition. The mathematical expectation of a discrete random variable is the sum of the products of its possible values and their corresponding probabilities:

$$M(X) = x_1 p_1 + x_2 p_2 + \ldots + x_n p_n.$$

If the number of possible values of the random variable is infinite, then

$$M(X) = \sum_{i=1}^{\infty} x_i p_i,$$

if the resulting series converges absolutely.

Note 1. The mathematical expectation is sometimes called the weighted average, since it is approximately equal to the arithmetic mean of the observed values of the random variable over a large number of experiments.

Note 2. From the definition of the mathematical expectation it follows that its value is no less than the smallest possible value of the random variable and no more than the largest.

Note 3. The mathematical expectation of a discrete random variable is a non-random (constant) quantity. We will see later that the same is true for continuous random variables.

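Example 1. Let us find the mathematical expectation of the random variable X, the number of standard parts among three selected from a batch of 10 parts that includes 2 defective ones. Let us construct the distribution series for X. From the conditions of the problem it follows that X can take the values 1, 2, 3. Then

$$p(X=1) = \frac{C_8^1 C_2^2}{C_{10}^3} = \frac{1}{15}, \quad p(X=2) = \frac{C_8^2 C_2^1}{C_{10}^3} = \frac{7}{15}, \quad p(X=3) = \frac{C_8^3}{C_{10}^3} = \frac{7}{15},$$

$$M(X) = 1 \cdot \frac{1}{15} + 2 \cdot \frac{7}{15} + 3 \cdot \frac{7}{15} = 2.4.$$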
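The distribution series for this example can be checked by counting combinations. The sketch below assumes sampling without replacement (a hypergeometric scheme), which is how the problem statement reads; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def standard_parts_distribution(total=10, defective=2, drawn=3):
    """Distribution of the number of standard (non-defective) parts among the drawn ones."""
    good = total - defective
    dist = {}
    for k in range(drawn + 1):
        bad = drawn - k
        if bad > defective or k > good:
            continue  # this combination is impossible
        dist[k] = Fraction(comb(good, k) * comb(defective, bad), comb(total, drawn))
    return dist


dist = standard_parts_distribution()
expectation = sum(k * p for k, p in dist.items())
print(dist)
print(float(expectation))
```

Exact fractions are used so that the probabilities can be compared with the hand calculation without rounding.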

Example 2. Determine the mathematical expectation of the random variable X, the number of coin tosses up to the first appearance of heads. This quantity can take infinitely many values (the set of possible values is the set of natural numbers). Its distribution series has the form:

$X$:  1     2          …   n          …
$p$:  0.5   $(0.5)^2$  …   $(0.5)^n$  …

Then

$$M(X) = \sum_{n=1}^{\infty} n\,(0.5)^n = \sum_{k=1}^{\infty} \sum_{n=k}^{\infty} (0.5)^n = \sum_{k=1}^{\infty} (0.5)^{k-1} = 2$$

(during the calculation, the formula for the sum of an infinitely decreasing geometric progression was used twice: $S = \frac{b_1}{1 - q}$, where $|q| < 1$).
Properties of mathematical expectation.

1) The mathematical expectation of a constant is equal to the constant itself:

M(C) = C.

Proof. If we consider C as a discrete random variable taking only one value C with probability p = 1, then M(C) = C·1 = C.

2) A constant factor can be taken outside the mathematical expectation sign:

M(CX) = C·M(X).

Proof. If the random variable X is given by the distribution series

$X$:  $x_1$  $x_2$  …  $x_n$
$p$:  $p_1$  $p_2$  …  $p_n$

then the distribution series for CX has the form:

$CX$:  $Cx_1$  $Cx_2$  …  $Cx_n$
$p$:   $p_1$   $p_2$   …  $p_n$

Then $M(CX) = Cx_1 p_1 + Cx_2 p_2 + \ldots + Cx_n p_n = C(x_1 p_1 + x_2 p_2 + \ldots + x_n p_n) = C \cdot M(X)$.

Definition. Two random variables are called independent if the distribution law of one of them does not depend on what values the other has taken. Otherwise the random variables are called dependent.

Definition. The product of independent random variables X and Y is the random variable XY whose possible values are the products of every possible value of X with every possible value of Y, and whose corresponding probabilities are the products of the probabilities of the factors.

3) The mathematical expectation of the product of two independent random variables is equal to the product of their mathematical expectations:

M(XY) = M(X)·M(Y).

Proof. To simplify the calculations, we restrict ourselves to the case when X and Y take only two possible values each:

$X$:  $x_1$  $x_2$
$p$:  $p_1$  $p_2$

$Y$:  $y_1$  $y_2$
$g$:  $g_1$  $g_2$

Then the distribution series for XY looks like this:

$XY$:  $x_1 y_1$  $x_2 y_1$  $x_1 y_2$  $x_2 y_2$
$p$:   $p_1 g_1$  $p_2 g_1$  $p_1 g_2$  $p_2 g_2$

Hence, $M(XY) = x_1 y_1 \cdot p_1 g_1 + x_2 y_1 \cdot p_2 g_1 + x_1 y_2 \cdot p_1 g_2 + x_2 y_2 \cdot p_2 g_2 = y_1 g_1 (x_1 p_1 + x_2 p_2) + y_2 g_2 (x_1 p_1 + x_2 p_2) = (y_1 g_1 + y_2 g_2)(x_1 p_1 + x_2 p_2) = M(X) \cdot M(Y)$.

Note 1. You can similarly prove this property for a larger number of possible values ​​of the factors.

Note 2. Property 3 is true for the product of any number of independent random variables, which is proven by mathematical induction.

Definition. The sum of random variables X and Y is the random variable X + Y whose possible values are the sums of each possible value of X with each possible value of Y; the probabilities of such sums are equal to the products of the probabilities of the terms (for dependent random variables, the product of the probability of one term by the conditional probability of the second).

4) The mathematical expectation of the sum of two random variables (dependent or independent) is equal to the sum of the mathematical expectations of the terms:

M (X + Y) = M (X) + M (Y).

Proof.

Let us again consider the random variables defined by the distribution series given in the proof of property 3. Then the possible values of X + Y are $x_1 + y_1$, $x_1 + y_2$, $x_2 + y_1$, $x_2 + y_2$. Let us denote their probabilities as $p_{11}$, $p_{12}$, $p_{21}$ and $p_{22}$, respectively. We find

$M(X + Y) = (x_1 + y_1) p_{11} + (x_1 + y_2) p_{12} + (x_2 + y_1) p_{21} + (x_2 + y_2) p_{22} =$

$= x_1 (p_{11} + p_{12}) + x_2 (p_{21} + p_{22}) + y_1 (p_{11} + p_{21}) + y_2 (p_{12} + p_{22}).$

Let us prove that $p_{11} + p_{12} = p_1$. Indeed, the event that X + Y takes the value $x_1 + y_1$ or $x_1 + y_2$, whose probability is $p_{11} + p_{12}$, coincides with the event that $X = x_1$ (its probability is $p_1$). It is proved in a similar way that $p_{21} + p_{22} = p_2$, $p_{11} + p_{21} = g_1$, $p_{12} + p_{22} = g_2$. Hence,

$M(X + Y) = x_1 p_1 + x_2 p_2 + y_1 g_1 + y_2 g_2 = M(X) + M(Y).$

Comment. From property 4 it follows that the mathematical expectation of the sum of any number of random variables is equal to the sum of the mathematical expectations of the terms.

Example. Find the mathematical expectation of the sum of the numbers of points obtained when throwing five dice.

Let us find the mathematical expectation of the number of points rolled when throwing one die:

$$M(X_1) = (1 + 2 + 3 + 4 + 5 + 6) \cdot \frac{1}{6} = \frac{7}{2}.$$

The mathematical expectation of the number of points rolled on any other die is the same. Therefore, by property 4,

$$M(X) = 5 \cdot \frac{7}{2} = \frac{35}{2} = 17.5.$$
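The dice example can be verified both through property 4 and by brute force over all 6⁵ outcomes. This is an illustrative sketch; the helper name `expectation` is an assumption of this example.

```python
from fractions import Fraction
from itertools import product


def expectation(distribution):
    """M(X) for a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in distribution.items())


die = {face: Fraction(1, 6) for face in range(1, 7)}
m_one = expectation(die)   # expectation for one die
m_five = 5 * m_one         # property 4: expectations add up

# Direct enumeration of all equally likely outcomes of five dice.
outcomes = list(product(range(1, 7), repeat=5))
m_brute = Fraction(sum(sum(roll) for roll in outcomes), len(outcomes))

print(m_one, m_five, m_brute)  # prints "7/2 35/2 35/2"
```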

The standard deviation is used when calculating the standard error of the arithmetic mean, when constructing confidence intervals, when statistically testing hypotheses, and when measuring a linear relationship between random variables.

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad s = \sqrt{\frac{n}{n-1}\,\sigma^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2},$$

where s is the standard (the estimate of the standard deviation of the random variable X about its mathematical expectation, based on the unbiased estimate of its variance); $\sigma^2$ is the variance; $x_i$ is the i-th element of the sample; $\bar{x}$ is the arithmetic mean of the sample; n is the sample size.

Note that the standard s (with n − 1 in the denominator) differs from the root of the variance σ (with n in the denominator). For a small sample, the variance estimate with n in the denominator is somewhat biased; as the sample size grows without bound, the difference between the two values disappears. A sample is only a part of the general population, which comprises absolutely all possible results; a result that is not included in the general population cannot be obtained in principle. For a coin toss, the general population is: tails, edge, heads, whereas the heads-tails pair is just a sample. For the general population, the mathematical expectation coincides with the true value of the estimated parameter, but for a sample this is not guaranteed: the sample mean generally deviates from the true value. The distinction lies in what the deviation is measured from. The variance with n in the denominator measures the mean squared deviation from the sample mean, whether or not that mean is accurate, while the standard estimates the deviation from the true value. Since the sum of squared deviations is smallest when taken about the sample mean, deviations from the true value are on average larger, and the denominator n − 1 compensates for this.
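The difference between the two denominators is easy to see on a small sample. The sketch below uses Python's `statistics` module, where `pstdev` divides by n and `stdev` divides by n − 1; the sample values are hypothetical.

```python
import math
import statistics

# Hypothetical measurements, for illustration only.
sample = [4.1, 3.9, 4.3, 4.0, 3.7]
n = len(sample)

sigma_n = statistics.pstdev(sample)  # root of the variance, denominator n
s = statistics.stdev(sample)         # the "standard", denominator n - 1
ratio = s / sigma_n                  # should equal sqrt(n / (n - 1))

print(f"sigma (n)   = {sigma_n:.4f}")
print(f"s (n - 1)   = {s:.4f}")
print(f"s / sigma   = {ratio:.4f}, sqrt(n/(n-1)) = {math.sqrt(n / (n - 1)):.4f}")
```

As n grows, the ratio sqrt(n/(n − 1)) tends to 1, which is why the two estimates coincide for large samples.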

The 3 sigma rule (3σ): almost all values of a normally distributed random variable lie in the interval $(\bar{x} - 3\sigma;\ \bar{x} + 3\sigma)$. More strictly, with a probability of approximately 99.73% the value of a normally distributed random variable lies in the specified interval, provided that the value $\bar{x}$ is the true one and was not obtained by processing a sample. If the true value is unknown, then s should be used instead of σ. Thus, the rule of three sigma turns into the rule of three s.


Wikimedia Foundation.

2010.


1. The rule of three sigma states that almost all the results that make up a normally distributed sample lie within $\bar{x} \pm 3\sigma$. This rule can be used to solve the following important problems:

1) Assessing the normality of the distribution of sample data. If the results lie approximately within $\bar{x} \pm 3\sigma$, and results occur more often near the arithmetic mean and less often to the right and left of it, then we can assume that the results are normally distributed.

2) Identifying erroneous results. If individual results deviate from the arithmetic mean by much more than 3σ, the correctness of the obtained values must be checked. Such outlying results often arise from a malfunction of the device or from errors in measurements and calculations.

3) Estimating the value of σ. If the range of variation $R = X_{max} - X_{min}$ is divided by 6, we get a rough approximation of σ.
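Tasks 2) and 3) above can be sketched in a few lines. The data are hypothetical, and the choice of `statistics.pstdev` (denominator n) for σ is an assumption of this example; with a sample estimate one would use s instead.

```python
import statistics


def three_sigma_outliers(results):
    """Return the results that deviate from the arithmetic mean by more than 3 sigma."""
    mean = statistics.mean(results)
    sigma = statistics.pstdev(results)
    return [x for x in results if abs(x - mean) > 3 * sigma]


def sigma_from_range(results):
    """Rough estimate of sigma: range of variation divided by 6."""
    return (max(results) - min(results)) / 6


# Hypothetical measurements with one suspicious value.
results = [10.1, 9.9, 10.0, 10.2, 9.8] * 4 + [25.0]
print(three_sigma_outliers(results))
```

Note that in a very small sample a single value cannot exceed 3σ once it is itself included in the calculation of σ, so the check is only meaningful for samples of more than about ten results.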

2. The Shapiro-Wilk W test is designed to test the hypothesis that the population is normally distributed when the sample size is small (n ≤ 50). The procedure is as follows: a null hypothesis of a normal population distribution is put forward. The observed value of the Shapiro-Wilk criterion, $W_{obs}$, is calculated and compared with the critical value $W_{crit}$, which is found from the table of critical points of the Shapiro-Wilk criterion depending on the sample size and significance level. If $W_{obs} \ge W_{crit}$, the null hypothesis of a normal distribution of the results is accepted; if $W_{obs} < W_{crit}$, it is rejected.

1. What is the rule of three sigma?

2. Practical application of the three sigma rule.

3. What criterion is used to check the normality of the population distribution with a small sample size?

4. Describe the procedure for testing the normality of a distribution.


LECTURE 7.

Subject: Relationship between measurement results. Methods for calculating correlation coefficients.

Questions to consider:

1. Types of relationships.

2. The main tasks of correlation analysis.

3. Correlation coefficient and its properties.

4. Methods for calculating correlation coefficients.

1. In sports research, a relationship is often found between the studied indicators. Its appearance varies. For example, the determination of acceleration based on known speed data in biomechanics, Fechner’s law in psychology, Hill’s law in physiology and others characterize the so-called functional dependence, or relationship in which each value of one indicator corresponds to a strictly defined value of another.

Another type of relationship includes, for example, the dependence of weight on body length. One body length value can correspond to several weight values ​​and vice versa. In such cases, when one value of one indicator corresponds to several values ​​of another, the relationship is called statistical.

Much attention is paid to the study of the statistical relationship between various indicators in sports research, since this makes it possible to reveal some patterns and subsequently describe them both verbally and mathematically for the purpose of using them in the practical work of a coach and teacher.

Among the statistical relationships, the most important are correlational. Correlation is that the average value of one indicator changes depending on the value of another.

2. The statistical method used to study relationships is called correlation analysis. Its main task is to determine the form, closeness and direction of the relationship between the indicators being studied. Correlation analysis allows you to explore only statistical relationships. It is widely used in test theory to assess their reliability and information content. Different measurement scales require different types of correlation analysis.

Relationship analysis begins with a graphical representation of the measurement results in a rectangular coordinate system. A graph is constructed with X results on the abscissa axis, and Y results on the ordinate axis. Thus, each pair of results in a rectangular coordinate system will be displayed as a point. The resulting set of points is outlined by a closed curve.

This graphical representation is called a scatter diagram or correlation field. Visual analysis of the graph makes it possible to identify the form of the dependence (or at least to make an assumption about it). If the shape of the correlation field is close to an ellipse, the relationship is called a linear dependence or a linear form of relationship.

However, in practice, another form of relationship can be found. The dependence experimentally obtained for tennis serves is characteristic of nonlinear forms of relationship, or nonlinear dependence.

Thus, visual analysis of the correlation field allows us to identify the form of the statistical dependence: linear or nonlinear. This is essential for the next step in the analysis, the selection and calculation of the appropriate correlation coefficient.

3. If measurements are made on a ratio or interval scale and a linear form of relationship is observed, the Bravais-Pearson correlation coefficient is used to quantify the strength of the relationship. It is denoted by the letter r and calculated by the formula:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n\,\sigma_x \sigma_y},$$

where $\bar{x}$ and $\bar{y}$ are the arithmetic means of the indicators x and y; $\sigma_x$ and $\sigma_y$ are the standard deviations; n is the number of measurements (subjects).

Its properties:

1) Values of r can vary from –1 to 1.

2) When r = –1 or r = 1, the relationship is functional, negative and positive, respectively.

3) When r = 0, there is no linear relationship, but a relationship of a different form may be observed.

4) When r < 0 the relationship is negative; when r > 0 it is positive.

To assess the closeness of the relationship in correlation analysis, the absolute value of the correlation coefficient is used. The absolute value of any correlation coefficient lies in the range from 0 to 1. The value of this coefficient is interpreted as follows:

correlation coefficient is 1.00 (functional relationship, since each value of one indicator corresponds to only one value of the other indicator);

correlation coefficient is 0.99–0.7 (strong statistical relationship);

correlation coefficient is 0.69–0.5 (average statistical relationship);

correlation coefficient is 0.49–0.2 (weak statistical relationship);

correlation coefficient is 0.19–0.01 (very weak statistical relationship);

correlation coefficient is 0.00 (no correlation).
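A minimal sketch of the Bravais-Pearson coefficient and of the interpretation scale above. Two assumptions are made here: the standard deviations use n in the denominator, and the coefficient is rounded to two decimals before it is matched against the scale, since the scale itself is given to two decimals. The function names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Bravais-Pearson coefficient: sum of deviation products over n * sigma_x * sigma_y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined when one of the indicators is constant")
    return cov_sum / (n * sigma_x * sigma_y)


def strength(r):
    """Verbal interpretation of |r| according to the scale in the text."""
    a = round(abs(r), 2)
    if a == 1.0:
        return "functional"
    if a >= 0.7:
        return "strong"
    if a >= 0.5:
        return "average"
    if a >= 0.2:
        return "weak"
    if a >= 0.01:
        return "very weak"
    return "none"


# Hypothetical data: body length (cm) and weight (kg) of five subjects.
length = [160, 165, 170, 175, 180]
weight = [55, 63, 64, 72, 74]
r = pearson_r(length, weight)
print(f"r = {r:.2f} ({strength(r)})")
```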

4. Before starting the mechanical procedure for calculating the correlation coefficient, it is necessary to answer some questions:

1) On what scale is the indicator being studied measured?

2) How many measurements of this indicator have been performed?

The answers to these questions determine which correlation coefficient will be calculated.

In particular, when measurements are made on an interval or ratio scale, the Bravais-Pearson correlation coefficient is calculated to assess the strength of the relationship; on the rank (ordinal) scale, the Spearman rank correlation coefficient is calculated; and on the nominal scale, when the characteristic of interest takes one of two alternative values, the tetrachoric contingency coefficient is used.

Spearman's rank correlation coefficient is calculated using the formula:

$$\rho = 1 - \frac{6 \sum d^2}{n (n^2 - 1)},$$

where $d = d_x - d_y$ is the difference in ranks of a given pair of indicators X and Y; n is the sample size.
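The rank formula can be sketched as follows. Tied values receive the average of their ranks, which is a common convention but an assumption of this example; the simple formula is exact only when there are no ties. All names are illustrative.

```python
def ranks(values):
    """Ranks starting from 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print(spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # prints "-1.0"
```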

The tetrachoric contingency coefficient is applied when the indicators are measured on a nominal scale (i.e. they are assigned numbers, but one cannot be said to be greater than the other) and vary alternatively (gender male/female, completion or failure of a task, etc.; in other words, there are two states: 0 and 1).

It is designated T4 and is calculated by the formula:

$$T_4 = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}},$$

where A is the number of subjects (attempts) for which both indicators X and Y equal 1; B is the number of cases with X = 0 and Y = 1; C is the number of cases with X = 1 and Y = 0; D is the number of cases with both equal to 0; n is the sample size.
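A sketch of the calculation, assuming the coefficient takes the usual fourfold-table form (AD − BC) divided by the square root of the product of the four marginal sums. That form is an assumption of this example, as are the function name and the counts.

```python
import math


def tetrachoric_t4(a: int, b: int, c: int, d: int) -> float:
    """Contingency coefficient for a 2x2 table of counts.

    a: X=1, Y=1;  b: X=0, Y=1;  c: X=1, Y=0;  d: X=0, Y=0.
    Assumed form: (a*d - b*c) / sqrt((a+b)(c+d)(a+c)(b+d)).
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("the coefficient is undefined when a marginal sum is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts for 40 subjects.
print(tetrachoric_t4(15, 5, 5, 15))
```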

Test questions for self-test:

1. Functional relationship. Definition and examples.

2. Statistical relationship. Definition and examples. Correlation relationship.

3. Main tasks of correlation analysis.

4. Correlation field. Construction order, image analysis.

6. Bravais-Pearson correlation coefficient and its properties.

7. Rules for choosing the relationship coefficient.


LECTURE 8.

Subject: Statistical hypotheses and reliability of statistical characteristics. Testing statistical hypotheses.

From this article you will learn:

    What is a confidence interval?

    What is the essence of the 3 sigma rule?

    How can you apply this knowledge in practice?

Nowadays, an overabundance of information about a large assortment of products, sales channels, employees, lines of business and so on makes it hard to single out the main things that deserve attention and management effort first. Determining a confidence interval and analysing the actual values that go beyond its boundaries is a technique that helps you spot the situations that influence changing trends. You will be able to develop positive factors and reduce the influence of negative ones. This technique is used in many well-known global companies.

There are so-called "alerts", which inform managers that the latest value in a certain area has gone beyond the confidence interval. What does this mean? It is a signal that some unusual event has occurred that may change the existing trend in that area. It is a signal to look into the situation and understand what influenced it.

For example, consider several situations. We calculated the sales forecast with forecast limits for 100 product items for 2011 by month, and compared it with actual sales in March:

  1. "Sunflower oil" broke through the upper limit of the forecast and fell outside the confidence interval.
  2. "Dry yeast" fell below the lower limit of the forecast.
  3. "Oatmeal porridge" broke through the upper limit.

For the other products, actual sales were within the given forecast limits, that is, within expectations. So, we identified 3 products that went beyond the limits and began to figure out what caused this:

  1. For "Sunflower oil", we entered a new distribution network, which gave us additional sales volume and pushed sales above the upper limit. For this product, it is worth recalculating the forecast until the end of the year, taking into account the sales forecast for this network.
  2. For "Dry yeast", the truck got stuck at customs and the product was out of stock for 5 days, which reduced sales and pushed them below the lower limit. It may be worthwhile to find out what caused this and try not to repeat the situation.
  3. For "Oatmeal porridge", a sales promotion was launched, which gave a significant increase in sales and pushed them beyond the forecast.

We identified 3 factors that pushed sales beyond the forecast limits. In real life there can be many more. To increase the accuracy of forecasting and planning, it is worth singling out the factors that can push actual sales beyond the forecast and building forecasts and plans for them separately, and then taking their impact on the main sales forecast into account. You can also regularly assess the impact of these factors and change the situation for the better by reducing the influence of negative factors and increasing the influence of positive ones.

With a confidence interval we can:

  1. Single out the areas that deserve attention, because events have occurred in them that may change the trend.
  2. Identify the factors that really influence the change in the situation.
  3. Make an informed decision (for example, about purchasing, planning, etc.).

Now let's look at what a confidence interval is and how to calculate it in Excel using an example.

What is a confidence interval?

A confidence interval is the pair of forecast limits (upper and lower) within which the actual values will fall with a given probability (set by sigma).

That is, we calculate the forecast, which is our main reference point, but we understand that the actual values are unlikely to match the forecast exactly. The question arises: within what limits can the actual values fall if the current trend continues? Calculating the confidence interval, i.e. the upper and lower limits of the forecast, answers this question.

What is the given sigma probability?

When calculating the confidence interval, we can set the probability that actual values fall within the given forecast limits. How? We set the value of sigma. Assuming the deviations are normally distributed, if sigma equals:

    3 sigma, then the probability of the next actual value falling into the confidence interval is about 99.7%, i.e. odds of roughly 300 to 1, or a 0.3% probability of going beyond the limits.

    2 sigma, then the probability of the next value falling within the limits is ≈ 95.5%, i.e. the odds are about 20 to 1, or there is a 4.5% chance of going beyond the limits.

    1 sigma, then the probability is ≈ 68.3%, i.e. the odds are approximately 2 to 1, or there is a 31.7% chance that the next value will fall outside the confidence interval.

We have thus formulated the 3 sigma rule, which says that the probability of the next random value falling into the confidence interval with a given value of three sigma is 99.7%.

The great Russian mathematician Chebyshev proved an inequality from which it follows that, for any distribution with a finite variance, the probability of going beyond the forecast limits with a given value of three sigma is at most 1/9 ≈ 11%. That is, the probability of falling within the 3-sigma confidence interval is at least about 89%, whereas an attempt to set the forecast and its limits "by eye" is fraught with much more significant errors.
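The two kinds of guarantee can be compared side by side. The sketch below assumes a normal distribution for the first column and uses Chebyshev's inequality, P(|X − μ| < kσ) ≥ 1 − 1/k², for the second; the function names are illustrative.

```python
import math


def normal_coverage(k: float) -> float:
    """Probability of staying within k sigma, assuming a normal distribution."""
    return math.erf(k / math.sqrt(2))


def chebyshev_lower_bound(k: float) -> float:
    """Guaranteed minimum probability of staying within k sigma for any distribution."""
    return max(0.0, 1.0 - 1.0 / k ** 2)


for k in (1, 2, 3):
    print(f"{k} sigma: normal {normal_coverage(k):.3%}, "
          f"Chebyshev at least {chebyshev_lower_bound(k):.3%}")
```

The Chebyshev bound is much weaker, but it holds without any assumption about the shape of the distribution.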

How to calculate a confidence interval yourself in Excel?

Let's look at the calculation of the confidence interval in Excel (i.e., the upper and lower limits of the forecast) using an example. We have a time series - sales by month for 5 years. See attached file.

To calculate the forecast limits, we calculate:

  1. Sales forecast().
  2. Sigma - the standard deviation of the forecast model from the actual values.
  3. Three sigma.
  4. Confidence interval.

1. Sales forecast.

=(RC[-14]-RC[-1])^2, where RC[-14] is the time series data, RC[-1] is the model value, and ^2 squares the difference.


3. For each month, sum the squared deviations from stage 8, Sum((Xi-Ximod)^2), i.e. add up all the Januaries, all the Februaries, and so on across the years.

To do this, use the formula =SUMIF()

SUMIF(array with period numbers inside the cycle (for months, from 1 to 12); reference to the period number in the cycle; reference to the array with the squared differences between the source data and the model values for the period)


4. Calculate the standard deviation for each period in the cycle from 1 to 12 (stage 10 in the attached file).

To do this, divide the value calculated at stage 9 by the number of periods in this cycle minus 1 and take the square root: =SQRT(Sum((Xi-Ximod)^2)/(n-1))

In Excel: =SQRT(R8/(COUNTIF($O$8:$O$67,O8)-1)), where R8 is the reference to Sum((Xi-Ximod)^2), $O$8:$O$67 is the reference to the array with cycle numbers, and O8 is the reference to the specific cycle number being counted in the array.

The Excel function COUNTIF gives the number n.


Having calculated the standard deviation of the actual data from the forecast model, we obtained the sigma value for each month - stage 10 in the attached file .
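The spreadsheet steps above (squared deviation, SUMIF by period, COUNTIF, square root) can be mirrored in a short function. This is a sketch under the assumption that each observation carries its period number within the cycle (1 to 12 for months); the names and data are hypothetical and do not come from the attached file.

```python
import math
from collections import defaultdict


def sigma_by_period(period_numbers, actual, model):
    """Standard deviation of actual values from the model, separately for each period."""
    squared_sum = defaultdict(float)  # plays the role of SUMIF
    count = defaultdict(int)          # plays the role of COUNTIF
    for period, x, x_mod in zip(period_numbers, actual, model):
        squared_sum[period] += (x - x_mod) ** 2
        count[period] += 1
    # A period needs at least two observations for the n - 1 denominator.
    return {
        period: math.sqrt(squared_sum[period] / (count[period] - 1))
        for period in squared_sum
        if count[period] > 1
    }


# Hypothetical two-period cycle observed twice.
sigmas = sigma_by_period([1, 2, 1, 2], [10, 20, 14, 20], [12, 20, 12, 22])
print(sigmas)
```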

3. Let's calculate 3 sigma.

At stage 11 we set the number of sigmas, in our example "3" (stage 11 in the attached file).

Other sigma values convenient in practice:

1.64 sigma - 10% chance of going beyond the limits (1 chance in 10);

1.96 sigma - 5% chance of going beyond the limits (1 chance in 20);

2.58 sigma - 1% chance of going beyond the limits (1 chance in 100).

To calculate three sigma, we multiply the "sigma" value for each month by "3".

4. Determine the confidence interval.

  1. Upper forecast limit - sales forecast taking into account growth and seasonality plus 3 sigma;
  2. Lower forecast limit - sales forecast taking into account growth and seasonality minus 3 sigma.

For convenience when calculating the confidence interval over a long period (see attached file), we use the Excel formula =Y8+VLOOKUP(W8,$U$8:$V$19,2,0), where

Y8 is the sales forecast;

W8 is the number of the month for which we take the 3-sigma value.

That is, upper forecast limit = "sales forecast" + "3 sigma" (in the example, VLOOKUP(month number; table with 3-sigma values; column from which we extract the sigma value for the row matching the month number; 0)).

Lower forecast limit = "sales forecast" minus "3 sigma".
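The limit calculation and the "alert" check can be sketched as follows. A dictionary lookup stands in for VLOOKUP; all names and numbers are hypothetical.

```python
def forecast_limits(forecast, months, sigma_by_month, k=3.0):
    """Return (lower, upper) limits: forecast minus/plus k sigma of the matching month."""
    limits = []
    for value, month in zip(forecast, months):
        margin = k * sigma_by_month[month]  # lookup by month number, like VLOOKUP
        limits.append((value - margin, value + margin))
    return limits


def alerts(actual, limits):
    """Indices of periods where the actual value falls outside the limits."""
    return [
        i for i, (x, (low, high)) in enumerate(zip(actual, limits))
        if x < low or x > high
    ]


# Hypothetical forecast for two months and per-month sigma values.
limits = forecast_limits([100.0, 120.0], [1, 2], {1: 5.0, 2: 10.0})
print(limits)
print(alerts([116.0, 100.0], limits))
```

The parameter `k` can be set to another convenient multiplier, such as 1.96, to obtain narrower limits.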

So, we calculated the confidence interval in Excel.

Now we have a forecast and a range with boundaries within which the actual values ​​will fall with a given sigma probability.

In this article we looked at what sigma and the three-sigma rule are, how to determine a confidence interval, and how you can use this technique in practice.

We wish you accurate forecasts and success!


Brief theory

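A probability distribution of a continuous random variable X is called normal if its density has the form:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - a)^2}{2\sigma^2}},$$

where a is the mathematical expectation and σ is the standard deviation.

The probability that X takes a value belonging to the interval (α, β):

$$P(\alpha < X < \beta) = \Phi\left(\frac{\beta - a}{\sigma}\right) - \Phi\left(\frac{\alpha - a}{\sigma}\right),$$

where Φ(x) is the Laplace function:

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-t^2/2}\,dt.$$

The probability that the absolute value of the deviation is less than a positive number δ:

$$P(|X - a| < \delta) = 2\Phi\left(\frac{\delta}{\sigma}\right).$$

In particular, when a = 0 the equality holds:

$$P(|X| < \delta) = 2\Phi\left(\frac{\delta}{\sigma}\right).$$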

When solving problems that practice poses, one has to deal with various distributions of continuous random variables.


Example of problem solution

A part is made on a machine. Its length X is a random variable distributed according to the normal law with parameters a = …, σ = …. Find the probability that the length of the part will be between 22 and 24.2 cm. What deviation of the length of the part from a can be guaranteed with a probability of 0.92; 0.98? Within what limits, symmetrical with respect to a, will almost all dimensions of the parts lie?

Solution:

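The probability that a random variable distributed according to a normal law will be in the interval (α, β):

$$P(\alpha < X < \beta) = \Phi\left(\frac{\beta - a}{\sigma}\right) - \Phi\left(\frac{\alpha - a}{\sigma}\right).$$

We get:

The probability that a random variable distributed according to a normal law will deviate from the average by no more than δ:

$$P(|X - a| < \delta) = 2\Phi\left(\frac{\delta}{\sigma}\right).$$

By condition: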
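The three questions of this problem can be answered with `statistics.NormalDist` from the Python standard library. The parameter values of the original problem are not given here, so a = 23 and σ = 1 below are purely hypothetical placeholders, and the printed numbers do not reproduce the original solution.

```python
from statistics import NormalDist


def interval_probability(a, sigma, low, high):
    """P(low < X < high) for a normal X with mean a and standard deviation sigma."""
    dist = NormalDist(mu=a, sigma=sigma)
    return dist.cdf(high) - dist.cdf(low)


def guaranteed_deviation(sigma, probability):
    """Delta such that P(|X - a| < delta) equals the given probability."""
    # P(|X - a| < delta) = p  <=>  delta = sigma * z, where z is the (1 + p) / 2 quantile.
    return sigma * NormalDist().inv_cdf((1 + probability) / 2)


# Hypothetical parameters: the real values are missing from the problem statement.
a, sigma = 23.0, 1.0
print(interval_probability(a, sigma, 22.0, 24.2))
print(guaranteed_deviation(sigma, 0.92), guaranteed_deviation(sigma, 0.98))
print((a - 3 * sigma, a + 3 * sigma))  # limits for "almost all" parts by the 3 sigma rule
```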
