Statistical significance. Statistical confidence level

22.09.2019

Statistical reliability is essential in the calculation practice of physical culture and sports. As noted earlier, multiple samples can be drawn from the same general population:

If they are selected correctly, their mean values differ from those of the general population only slightly, by the size of the representativeness error at the accepted confidence level;

If they are selected from different populations, the difference between them turns out to be significant. Statistics is largely about comparing samples;

If they differ only insignificantly, i.e., they actually belong to the same general population, the difference between them is called statistically unreliable (insignificant).

A statistically reliable (significant) sample difference is one in which the samples differ significantly and fundamentally, that is, they belong to different general populations.

In physical culture and sports, assessing the statistical significance of sample differences means solving a range of practical problems. For example, the introduction of new teaching methods, programs, sets of exercises, tests and control exercises involves experimental testing, which should show that the test group is fundamentally different from the control group. For this purpose special statistical methods are used, called statistical significance criteria (tests), which detect the presence or absence of a statistically significant difference between samples.

All criteria are divided into two groups: parametric and non-parametric. Parametric criteria require the data to follow the normal distribution law, which means that the main parameters of the normal law, the arithmetic mean and the standard deviation s, must be determined. Parametric criteria are the most accurate when this requirement is met. Non-parametric criteria are based on rank (ordinal) differences between sample elements.

Here are the main criteria of statistical significance used in the practice of physical culture and sports: Student's test and Fisher's test.

Student's t test is named after the English scientist W. S. Gosset (Student was his pseudonym), who developed this method. Student's t test is parametric and is used to compare the absolute indicators of samples. The samples may differ in size.

Student's t test is defined like this.

1. Find the Student t statistic using the following formula:

$$t = \frac{|\bar{x} - \bar{y}|}{\sqrt{m_1^2 + m_2^2}}$$

where $\bar{x}$, $\bar{y}$ are the arithmetic means of the compared samples; $m_1$, $m_2$ are the errors of representativeness computed from the indicators of the compared samples.

2. Practice in physical culture and sports has shown that for sports work it is enough to accept a confidence level of P = 0.95.

For the confidence level P = 0.95 (α = 0.05) and the number of degrees of freedom

k = n_1 + n_2 - 2, we use the table in Appendix 4 to find the critical (boundary) value of the criterion, t_gr.

3. Based on the properties of the normal distribution law, Student's criterion compares t with t_gr.

We draw conclusions:

if t > t_gr, then the difference between the compared samples is statistically significant;

if t < t_gr, then the difference is statistically insignificant.
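The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the statistic is taken in the usual form t = |mean_x - mean_y| / sqrt(m_x² + m_y²), the representativeness error is assumed to be m = s / sqrt(n) (the text's own formulas for m are not reproduced here), the data are hypothetical, and the critical value is typed in from a standard two-sided Student table rather than computed.

```python
import math
from statistics import mean, stdev


def representativeness_error(sample):
    """Error of the mean, assumed here to be m = s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))


def student_t(sample_x, sample_y):
    """t = |mean_x - mean_y| / sqrt(m_x^2 + m_y^2) for two samples."""
    m_x = representativeness_error(sample_x)
    m_y = representativeness_error(sample_y)
    return abs(mean(sample_x) - mean(sample_y)) / math.sqrt(m_x**2 + m_y**2)


def is_significant(t, t_gr):
    """The difference is significant when t exceeds the boundary value."""
    return t > t_gr


# Hypothetical heart-rate readings (bpm), not taken from the text.
before = [70, 72, 68, 71, 69, 73]
after = [80, 83, 79, 82, 81, 84]

t = student_t(before, after)
k = len(before) + len(after) - 2  # degrees of freedom
t_gr = 2.23  # tabulated value for P = 0.95 and k = 10 (two-sided)

print(round(t, 2), k, is_significant(t, t_gr))  # 10.18 10 True
```

The critical value stays a plain parameter so that the sketch mirrors the table lookup described in step 2.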

For researchers in physical culture and sports, assessing statistical significance is the first step in solving a specific problem: deciding whether the samples being compared differ fundamentally or not. The next step is to evaluate this difference from a pedagogical point of view, which is determined by the conditions of the task.

Let's consider the application of the Student test using a specific example.

Example 2.14. The heart rate (bpm) of a group of 18 subjects was measured before (x_i) and after (y_i) a warm-up.

Assess the effectiveness of the warm-up based on heart rate. Initial data and calculations are presented in Tables 2.30 and 2.31.

Table 2.30

Processing heart rate indicators before warming up


The errors of representativeness for both groups coincided, since the sample sizes are equal (the same group is studied under different conditions) and the standard deviations were s_x = s_y = 3 beats/min. Let's move on to calculating Student's test:

We set the confidence level: P = 0.95.

The number of degrees of freedom is k = n_1 + n_2 - 2 = 18 + 18 - 2 = 34. From the table in Appendix 4 we find t_gr = 2.02.

Statistical inference. Since t = 11.62 and the boundary value t_gr = 2.02, we have 11.62 > 2.02, i.e. t > t_gr; therefore the difference between the samples is statistically significant.

Pedagogical conclusion. It was found that in terms of heart rate the difference between the state of the group before and after warm-up is statistically significant, i.e. significant, fundamental. So, based on the heart rate indicator, we can conclude that the warm-up is effective.

Fisher's criterion is parametric. It is used to compare the variances (dispersion) of samples. In the practice of physical culture and sports this usually means comparing the stability of sports performance or the stability of functional and technical indicators. The samples can be of different sizes.

The Fisher criterion is defined in the following sequence.

1. Find the Fisher criterion F using the formula

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2$, $s_2^2$ are the variances of the compared samples.

The conditions of the Fisher criterion stipulate that the larger variance is placed in the numerator of the formula, so the number F is never less than one.

We set the confidence level, P = 0.95, and determine the number of degrees of freedom for both samples: k_1 = n_1 - 1, k_2 = n_2 - 1.

Using the table in Appendix 4, we find the critical (boundary) value of the criterion, F_gr.

Comparing F with F_gr allows us to formulate conclusions:

if F > F_gr, then the difference between the samples is statistically significant;

if F < F_gr, then the difference between the samples is statistically insignificant.
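The Fisher procedure can be sketched in the same way. This is a minimal illustration, assuming that F is the ratio of the larger sample variance to the smaller one, as the text stipulates. The data are hypothetical and the boundary value is typed in from a standard F table for P = 0.95.

```python
from statistics import variance


def fisher_f(sample_x, sample_y):
    """Return (F, k_larger, k_smaller): the variance ratio with the larger
    variance in the numerator, plus the matching degrees of freedom."""
    var_x, var_y = variance(sample_x), variance(sample_y)
    if var_x >= var_y:
        return var_x / var_y, len(sample_x) - 1, len(sample_y) - 1
    return var_y / var_x, len(sample_y) - 1, len(sample_x) - 1


def variances_differ(f, f_gr):
    """The difference in dispersion is significant when F exceeds F_gr."""
    return f > f_gr


# Hypothetical take-off times (s), not taken from the text.
group_a = [0.30, 0.32, 0.28, 0.31, 0.29]
group_b = [0.30, 0.36, 0.24, 0.33, 0.27]

f, k_larger, k_smaller = fisher_f(group_a, group_b)
f_gr = 6.39  # tabulated value for P = 0.95 with 4 and 4 degrees of freedom

print(round(f, 2), k_larger, k_smaller, variances_differ(f, f_gr))
```

Returning the degrees of freedom in the same order as the variances makes it harder to look up the wrong row and column of the table.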

Let's give a specific example.

Example 2.15. Let's analyze two groups of handball players: x_i (n_1 = 16 people) and y_i (n_2 = 18 people). The take-off time (s) when throwing the ball into the goal was measured for both groups.

Are the take-off indicators of the two groups of the same type?

Initial data and basic calculations are presented in Tables 2.32 and 2.33.

Table 2.32

Processing of take-off indicators of the first group of handball players


Let us define the Fisher criterion:





According to the data presented in the table of Appendix 6, we find F_gr: F_gr = 2.4.

Note that in the table of Appendix 6 the listing of the numbers of degrees of freedom of both the larger and the smaller variance becomes coarser as the numbers grow. Thus, the number of degrees of freedom of the larger variance runs in this order: 8, 9, 10, 11, 12, 14, 16, 20, 24, etc., and that of the smaller one: 28, 29, 30, 40, 50, etc.

This is explained by the fact that as the sample size increases, the differences in the F-test values decrease, so it is possible to use tabular values that are close to the original data. So, in example 2.15 the value k = 17 is absent from the table and we can take the value closest to it, k = 16, from which we obtain F_gr = 2.4.

Statistical inference. Since Fisher's criterion F = 2.5 > F_gr = 2.4, the samples are statistically distinguishable.

Pedagogical conclusion. The values of the take-off time (s) when throwing the ball into the goal differ significantly between the handball players of the two groups. These groups should be considered different.

Further research should reveal the reason for this difference.

Example 2.20 (on the statistical reliability of the sample). Has the football player's skill improved if the time (s) from the signal to kicking the ball was x_i at the beginning of the training and y_i at the end?

Initial data and basic calculations are given in Tables 2.40 and 2.41.

Table 2.40

Processing time indicators from giving a signal to hitting the ball at the beginning of training


Let us determine the difference between groups of indicators using the Student’s criterion:

With confidence level P = 0.95 and degrees of freedom k = n_1 + n_2 - 2 = 22 + 22 - 2 = 42, using the table in Appendix 4 we find t_gr = 2.02. Since t = 8.3 > t_gr = 2.02, the difference is statistically significant.

Let us determine the difference between groups of indicators using Fisher’s criterion:


According to the table in Appendix 2, with confidence level P = 0.95 and degrees of freedom k = 22 - 1 = 21, the value F_gr = 2.1. Since F = 1.53 < F_gr = 2.1, the difference in the dispersion of the initial data is statistically insignificant.

Statistical inference. In terms of the arithmetic mean, the difference between the groups of indicators is statistically significant. In terms of dispersion (variance), the difference between the groups of indicators is statistically insignificant.

Pedagogical conclusion. The football player's skill has improved significantly, but attention should be paid to the stability of his results.

Preparing for work

Before conducting this laboratory work in the discipline “Sports Metrology”, all students in the study group must form work teams of 3-4 students each, which will jointly complete the work assignments of all laboratory work.

In preparation for the work, read the relevant sections of the recommended literature (see section 6 of these methodological instructions) and the lecture notes. Study sections 1 and 2 for this laboratory work, as well as the work assignment for it (section 4).

Prepare a report form on standard A4 sheets of writing paper and fill it with the materials necessary for the work.

The report must contain:

Title page indicating the department (UC and TR), study group, last name, first name and patronymic of the student, number and title of the laboratory work, date of its completion, as well as the last name, academic degree, academic title and position of the teacher accepting the work;

Goal of the work;

Formulas with numerical values explaining the intermediate and final results of the calculations;

Tables of measured and calculated values;

Graphic material required by the assignment;

Brief conclusions based on the results of each stage of the work assignment and on the work performed as a whole.

All graphs and tables are to be drawn carefully using drawing tools. Conventional graphic and letter designations must comply with GOST standards. The report may be prepared on a computer.

Work assignment

Before carrying out any measurements, each member of the team must study the rules of the sports game Darts given in Appendix 7, which are needed for the following stages of research.

Stage I of research: “Study of the results of hitting the target of the sports game Darts by each member of the team for compliance with the normal distribution law according to Pearson's χ² criterion and the three-sigma criterion”

1. Measure (test) your personal speed and coordination of actions by throwing darts 30-40 times at the circular target of the sports game Darts.

2. Arrange the results of the measurements (tests) x_i (in points) as a variation series and enter them into Table 4.1, do all the necessary calculations, fill out the necessary tables and draw appropriate conclusions about whether the obtained empirical distribution complies with the normal distribution law, by analogy with the similar calculations, tables and conclusions of example 2.12, given in section 2 of these guidelines on pages 7-10.

Table 4.1

Correspondence of the speed and coordination of the subjects’ actions to the normal distribution law

No. rounded
Total
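The three-sigma criterion named in Stage I can be sketched as follows. This is a rough illustration that assumes the usual reading of the rule: for normally distributed data practically all values should lie within three standard deviations of the mean, so values outside that band are flagged. The scores are hypothetical, and the χ² part of the stage is not covered.

```python
from statistics import mean, stdev


def three_sigma_outliers(values):
    """Return the values lying outside mean ± 3·s."""
    centre = mean(values)
    s = stdev(values)
    low, high = centre - 3 * s, centre + 3 * s
    return [v for v in values if v < low or v > high]


# Hypothetical points for 30 dart throws; the last value is a deliberate outlier.
scores = [18, 22, 25, 20, 19, 24, 21, 23, 20, 22,
          26, 17, 21, 20, 23, 19, 22, 24, 18, 21,
          20, 25, 22, 19, 23, 21, 20, 24, 22, 60]

print(three_sigma_outliers(scores))  # [60]
```

With very small samples no value can fall outside the band at all, which is one reason the assignment asks for 30-40 throws.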

Stage II of research

“Assessment of the average indicators of the general population of hits on the target of the sports game Darts of all students of the study group based on the results of measurements of the members of one team”

Assess the average indicators of speed and coordination of actions of all students in the study group (according to the list of the study group in the class register) based on the results of hitting the Darts target by all team members, obtained at the first stage of research of this laboratory work.

1. Record the results of measurements of speed and coordination of actions when throwing darts at the circular target of the sports game Darts for all members of your team (2-4 people), who represent a sample of measurement results from the general population (the measurement results of all students in the study group, for example 15 people), entering them in the second and third columns of Table 4.2.

Table 4.2

Processing of the indicators of speed and coordination of actions of the team members

No.
Total

In Table 4.2, each entered value should be understood as the average score of a member of your team (see the calculation results in Table 4.1) obtained at the first stage of research. Note that each row of Table 4.2 usually holds the average of the measurement results obtained by one team member at the first stage, since it is very unlikely that the averages of different team members coincide. As a rule, then, the frequency column of Table 4.2 equals 1 in every row, and the “Total” line of that column holds the number of members of your team.

2. Perform all the calculations needed to fill out Table 4.2, as well as the other calculations and conclusions, by analogy with example 2.13 given in section 2 of this methodological development on pages 13-14. Keep in mind that the representativeness error m must be calculated using formula 2.4 given on page 13 of this methodological development, since the sample is small and the number of elements of the general population N is known: it equals the number of students in the study group according to the class register.

Stage III of research

Evaluation of the effectiveness of the warm-up according to the indicator “Speed and coordination of actions” by each team member using the Student's t-test

Each member of the team evaluates the effectiveness of the warm-up for throwing darts at the target of the sports game Darts, performed at the first stage of research of this laboratory work, according to the indicator “Speed and coordination of actions”, using Student's criterion, a parametric criterion of statistical significance that assumes the data follow the normal distribution law.

… Total

2. Calculate the mean, variance and standard deviation (RMS) of the measurement results of the indicator “Speed and coordination of actions” obtained during the warm-up and given in Table 4.3 (see the similar calculations given immediately after Table 2.30 of example 2.14 on page 16 of this methodological development).

3. Each member of the work team measures (tests) their personal speed and coordination of actions after warming up,

… Total

5. Calculate the mean, variance and standard deviation (RMS) of the measurement results of the indicator “Speed and coordination of actions” after the warm-up, given in Table 4.4, and write down the overall measurement result (see the similar calculations given immediately after Table 2.31 of example 2.14 on page 17 of this methodological development).

6. Perform all the necessary calculations and draw conclusions by analogy with example 2.14 given in section 2 of this methodological development on pages 16-17. Keep in mind that the representativeness error m must be calculated using formula 2.1 given on page 12 of this methodological development, since the number of elements in the general population N is unknown.

Stage IV of research

Assessment of the uniformity (stability) of the indicators “Speed and coordination of actions” of two team members using the Fisher criterion

Assess the uniformity (stability) of the indicators “Speed and coordination of actions” of two team members using the Fisher criterion, based on the measurement results obtained at the third stage of research in this laboratory work.

To do this, proceed as follows.

Using the data from Tables 4.3 and 4.4, the variances calculated from these tables at the third stage of research, and the methodology for calculating and applying the Fisher criterion to assess the uniformity (stability) of sports indicators given in example 2.15 on pages 18-19 of this methodological development, draw the appropriate statistical and pedagogical conclusions.

Stage V of research

Assessment of groups of indicators “Speed and coordination of actions” of one team member before and after warm-up

STATISTICAL RELIABILITY

- English credibility/validity, statistical; German Validität, statistische. Consistency, objectivity and lack of ambiguity in a statistical test or in any set of measurements. Statistical reliability can be tested by repeating the same test (or questionnaire) on the same subject to see whether the same results are obtained, or by comparing different parts of a test that are supposed to measure the same object.

Antinazi. Encyclopedia of Sociology, 2009


The level of significance in statistics is an important indicator that reflects the degree of confidence in the accuracy and truth of the obtained (predicted) data. The concept is widely used in various fields: from conducting sociological research to statistical testing of scientific hypotheses.

Definition

The level of statistical significance (or statistically significant result) shows the probability of the occurrence of the studied indicators by chance. The overall statistical significance of a phenomenon is expressed by the p-value coefficient (p-level). In any experiment or observation, there is a possibility that the data obtained were due to sampling errors. This is especially true for sociology.

That is, a statistically significant value is one for which the probability of obtaining it, or a more extreme value, by chance is small. “Extreme” here means the degree to which the test statistic deviates from the null hypothesis (the hypothesis that is tested for consistency with the obtained sample data). In scientific practice, the significance level is chosen before data collection and is usually 0.05 (5%). For systems where exact values are extremely important, it can be 0.01 (1%) or less.

Background

The concept of significance level was introduced by the British statistician and geneticist Ronald Fisher in 1925, when he was developing a technique for testing statistical hypotheses. When analyzing any process, certain phenomena occur with a certain probability. Difficulties arise when working with small (or not obvious) probabilities that fall under the concept of “measurement error.”

When working with statistical data that are not specific enough to test, scientists face the problem of the null hypothesis, which “prevents” operating with small quantities. For such cases Fisher proposed a probability of 5% (0.05) as a convenient cut-off for rejecting the null hypothesis in calculations.

Introduction of fixed significance levels

In 1933 the scientists Jerzy Neyman and Egon Pearson recommended in their works that a certain level of significance be set in advance (before data collection). Examples of the use of these rules are clearly visible during elections. Let's say there are two candidates, one of whom is very popular and the other little known. It seems obvious that the first candidate will win the election, and the chances of the second tend to zero. They tend to zero, but are not equal to it: there is always the possibility of force majeure, sensational information or unexpected decisions that could change the predicted election results.

Neyman and Pearson agreed that Fisher's significance level of 0.05 (denoted by α) was most appropriate. However, Fisher himself in 1956 opposed fixing this value. He believed that the level of α should be set according to specific circumstances. For example, particle physics uses thresholds far stricter than 0.05.

p-level value

The term p-value was first used by Brownlee in 1960. The p-level (p-value) is an indicator that is inversely related to the reliability of the results. A higher p-value corresponds to a lower level of confidence in the relationship between variables found in the sample.

This value reflects the likelihood of errors associated with the interpretation of the results. Let's assume p-level = 0.05 (1/20). This means that if the relationship between the variables is in fact absent, then in repeated similar experiments one can expect, on average, the same or a stronger relationship in every twentieth study purely as a random feature of the sample. The p-level is often seen as an “acceptable margin” for the error rate.

Note that the p-value may not reflect the real relationship between variables; it only shows how the data behave under the test's assumptions. In particular, the conclusions of the analysis depend on the chosen significance level: a result may be judged significant at 0.05 but not at 0.01.

Testing statistical hypotheses

The level of statistical significance is especially important when testing hypotheses. For example, in a two-sided test the rejection region is divided equally between the two tails of the sampling distribution.

Suppose, when monitoring a certain process (phenomenon), it turns out that new statistical information indicates small changes relative to previous values. At the same time, the discrepancies in the results are small, not obvious, but important for the study. The specialist is faced with a dilemma: are changes really occurring or are these sampling errors (measurement inaccuracy)?

In this case, the null hypothesis is either retained or rejected (everything is attributed to error, or the change in the system is recognized as a fact). The decision is based on comparing the p-value with the significance level (α). If p-value < α, the null hypothesis is rejected. The smaller the p-value, the more significant the test statistic.
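The decision rule can be sketched in a few lines. This is a minimal illustration, assuming the test statistic z follows the standard normal distribution under the null hypothesis and that the test is two-sided. The function names are made up for the example.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal statistic: P(|Z| >= |z|)."""
    return math.erfc(abs(z) / math.sqrt(2))


def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is below alpha."""
    return p_value < alpha


for z in (1.0, 1.96, 2.5):
    p = two_sided_p_value(z)
    print(z, round(p, 4), reject_null(p))
```

Fixing `alpha` before looking at the data, as the text recommends, corresponds to passing it as a parameter rather than tuning it after seeing `p`.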

Values used

The level of significance depends on the material being analyzed. In practice, the following fixed values are used:

  • α = 0.1 (or 10%);
  • α = 0.05 (or 5%);
  • α = 0.01 (or 1%);
  • α = 0.001 (or 0.1%).

The more accuracy the calculations require, the lower the α that is used. Naturally, statistical forecasts in physics, chemistry, pharmaceuticals, and genetics require greater accuracy than in political science and sociology.

Significance thresholds in specific areas

In high-precision fields such as particle physics and manufacturing, statistical significance is often expressed as a multiple of the standard deviation (denoted by sigma, σ) of a normal (Gaussian) probability distribution. σ is a statistical indicator that measures the dispersion of the values of a quantity around its mathematical expectation. It is used to describe the probability of events.

The required number of σ varies greatly with the field of knowledge. For example, the discovery of the Higgs boson required a threshold of five sigma (5σ), which corresponds to a p-value of about 1 in 3.5 million. In genome studies, the significance level can be 5 × 10⁻⁸, which is not uncommon for this area.
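The link between a sigma threshold and a p-value can be checked numerically. The sketch below assumes a one-sided tail of the standard normal distribution, which is the convention that reproduces the "1 in 3.5 million" figure. The function name is illustrative.

```python
import math


def sigma_to_p_value(n_sigma):
    """One-sided tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


p_five_sigma = sigma_to_p_value(5)
print(p_five_sigma)             # about 2.9e-07
print(round(1 / p_five_sigma))  # roughly 3.5 million
```

`math.erfc` is used instead of `1 - cdf` because it keeps precision in the far tail, where the probabilities are tiny.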

Efficiency

It must be taken into account that α and the p-value are not exact characteristics. Whatever the level of significance in the statistics of the phenomenon under study, it is not an unconditional basis for accepting a hypothesis. For example, the smaller the value of α, the stronger the evidence needed to call a result significant. However, a smaller α also increases the risk of missing a real effect, which reduces the statistical power of the study.

Researchers who focus solely on statistically significant results may reach erroneous conclusions. At the same time, it is difficult to double-check their work, since they apply assumptions (which in fact are the α and p-values). Therefore, it is always recommended, along with calculating statistical significance, to determine another indicator - the magnitude of the statistical effect. Effect size is a quantitative measure of the strength of an effect.
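As one concrete example of an effect-size measure, the sketch below computes Cohen's d, the difference of two sample means divided by their pooled standard deviation. The text does not name a specific measure, so the choice of Cohen's d is an assumption made for illustration, and the data are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(sample_x, sample_y):
    """Cohen's d: mean difference in units of the pooled standard deviation."""
    n_x, n_y = len(sample_x), len(sample_y)
    pooled_var = ((n_x - 1) * variance(sample_x)
                  + (n_y - 1) * variance(sample_y)) / (n_x + n_y - 2)
    return (mean(sample_x) - mean(sample_y)) / math.sqrt(pooled_var)


# Hypothetical scores for two small groups.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

Unlike the p-value, d does not shrink or grow with the sample size, so it describes how large the difference is rather than how surely it exists.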

PAID FEATURE. The statistical significance feature is only available on some plans.

You can find out whether there are statistically significant differences in the answers that different groups of respondents gave to survey questions. To use the statistical significance feature in SurveyMonkey, you must:

  • Enable the statistical significance feature when adding a comparison rule to a question in your survey. Select groups of respondents to compare to sort survey results into groups for visual comparison.
  • Examine the data tables for your survey questions to determine if there are statistically significant differences in the responses received from different groups of respondents.

View statistical significance

By following the steps below, you can create a survey that displays statistical significance.

1. Add closed-ended questions to your survey

In order to show statistical significance when analyzing results, you will need to apply a comparison rule to any question in your survey.

You can apply the comparison rule and calculate statistical significance in responses if you use one of the following types of questions in your survey design:

It is necessary to make sure that the proposed answer options can be divided into complete groups. The response options you select for comparison when you create a comparison rule will be used to organize the data into crosstabs throughout the survey.

2. Collect answers

Once you've completed your survey, create a collector to send it out. There are several ways.

You must receive at least 30 responses for each response option you plan to use in your comparison rule to activate and view statistical significance.

Survey example

You want to find out whether men are significantly more satisfied with your products than women.

  1. Add two multiple choice questions to your survey:
    What is your gender? (male, female)
    Are you satisfied or dissatisfied with our product? (satisfied, dissatisfied)
  2. Make sure that at least 30 respondents select “male” for the gender question AND at least 30 respondents select “female” as their gender.
  3. Add a comparison rule to the question "What is your gender?" and select both answer options as your groups.
  4. Use the data table below the question chart "Are you satisfied or dissatisfied with our product?" to see if any response options show a statistically significant difference

What is a statistically significant difference?

A statistically significant difference means that statistical analysis has established that there are significant differences between the answers of one group of respondents and the answers of another group. Statistical significance means that the numbers obtained differ by more than chance alone would explain. Such knowledge will greatly help you in data analysis. However, it is you who determines the importance of the results obtained: you decide how to interpret the survey results and what actions should be taken based on them.

For example, you receive more complaints from female customers than from male customers. How can we determine whether such a difference is real and whether action needs to be taken regarding it? One great way to test your observations is to conduct a survey that will show you whether male customers are significantly more satisfied with your product. Using a statistical formula, our statistical significance function will give you the ability to determine whether your product is actually much more appealing to men than to women. This will allow you to take action based on facts rather than guesswork.

Statistically significant difference

If your results are highlighted in the data table, it means that the two groups of respondents are significantly different from each other. The term “significant” does not mean that the resulting numbers have any particular importance or significance, only that there is a statistical difference between them.

No statistically significant difference

If your results are not highlighted in the corresponding data table, this means that although there may be a difference in the two figures being compared, there is no statistical difference between them.

Responses without statistically significant differences demonstrate that there is no significant difference between the two items being compared given the sample size you use, but this does not necessarily mean that they are not significant. Perhaps by increasing the sample size, you will be able to identify a statistically significant difference.

Sample size

If you have a very small sample size, only very large differences between the two groups will be significant. If you have a very large sample size, both small and large differences will be counted as significant.

However, if two numbers are statistically different, this does not mean that the difference between the results has any practical meaning to you. You will have to decide for yourself which differences are meaningful for your survey.
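The effect of sample size can be made concrete with a rough calculation. The sketch below estimates the smallest difference between two group shares that would count as significant at the 95% level. It assumes equal group sizes, both shares close to a common value `share`, and the normal approximation; the function name is illustrative.

```python
import math


def min_detectable_difference(n_per_group, share=0.5, z_crit=1.96):
    """Approximate smallest significant difference between two proportions
    when each group has n_per_group respondents."""
    standard_error = math.sqrt(2 * share * (1 - share) / n_per_group)
    return z_crit * standard_error


for n in (30, 100, 1000, 10000):
    print(n, round(min_detectable_difference(n), 3))
```

With 30 respondents per group only differences of roughly 25 percentage points register, while with 1,000 per group a difference of about 4 points is already enough.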

Calculating Statistical Significance

We calculate statistical significance using a standard 95% confidence level. If an answer option is shown as statistically significant, it means that there is less than a 5% probability that the difference between the two groups arose by chance alone or through sampling error (often shown as p < 0.05).

To calculate statistically significant differences between groups, we use the following formulas:

Parameter - Description

a1 - The percentage of participants from the first group who answered the question in a certain way, multiplied by the sample size of this group.
b1 - The percentage of participants from the second group who answered the question in a certain way, multiplied by the sample size of this group.
Pooled sample proportion (p) - The combined proportion of both groups.
Standard error (SE) - An indicator of how much your proportion differs from the actual proportion. A lower value means the proportion is close to the actual one; a higher value means it differs significantly from the actual one.
Test statistic (t) - The number of standard deviations by which a given value differs from the mean.
Statistical significance - If the absolute value of the test statistic is greater than 1.96* standard deviations from the mean, the difference is considered statistically significant.

*1.96 is the value used for the 95% confidence level because 95% of the area under the Student's t-distribution, which for large samples is practically the normal distribution, lies within 1.96 standard deviations of the mean.
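The origin of the 1.96 threshold can be verified with the standard library. The sketch assumes the large-sample case, in which the standard normal distribution stands in for the t-distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# A two-sided 95% level leaves 2.5% in each tail, so the cut-off
# is the 97.5th percentile.
z_crit = standard_normal.inv_cdf(0.975)

# Share of the distribution lying within ±1.96 standard deviations.
coverage = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(round(z_crit, 2), round(coverage, 3))  # 1.96 0.95
```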

Calculation example

Continuing with the example used above, let's find out whether the percentage of men who say they are satisfied with your product is significantly higher than the percentage of women.

Let's say 1,000 men and 1,000 women took part in your survey, and the result of the survey was that 70% of men and 65% of women say that they are satisfied with your product. Is the 70% level significantly higher than the 65% level?

Substitute the following data from the survey into the given formulas:

  • p1 (% of men satisfied with the product) = 0.7
  • p2 (% of women satisfied with the product) = 0.65
  • n1 (number of men surveyed) = 1000
  • n2 (number of women interviewed) = 1000

Since the absolute value of the test statistic is greater than 1.96, it means that the difference between men and women is significant. Compared to women, men are more likely to be satisfied with your product.
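The intermediate numbers of this example can be reproduced with a short script. It assumes the standard pooled two-proportion form of the quantities described in the parameter table: p = (a1 + b1) / (n1 + n2), SE = sqrt(p(1 - p)(1/n1 + 1/n2)) and t = (p1 - p2) / SE. The exact formulas are not shown in the text, so treat this as a sketch rather than the tool's actual implementation.

```python
import math


def two_proportion_statistic(p1, n1, p2, n2):
    """Return (pooled proportion, standard error, test statistic)."""
    a1 = p1 * n1  # respondents in group 1 giving the answer
    b1 = p2 * n2  # respondents in group 2 giving the answer
    pooled = (a1 + b1) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, standard_error, (p1 - p2) / standard_error


pooled, se, t = two_proportion_statistic(p1=0.7, n1=1000, p2=0.65, n2=1000)
print(round(pooled, 3), round(se, 4), round(t, 2))  # 0.675 0.0209 2.39
print(abs(t) > 1.96)                                # True
```

Under these assumptions the statistic is about 2.39, which exceeds 1.96 and agrees with the conclusion stated above.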

Hiding statistical significance

How to hide statistical significance for all questions

  1. Click the down arrow to the right of the comparison rule in the left sidebar.
  2. Select Edit rule.
  3. Turn off Show statistical significance using the toggle.
  4. Click Apply.

To hide statistical significance for one question, you need to:

  1. Click the Tune button above the chart for this question.
  2. Open the Display options tab.
  3. Uncheck the box next to Statistical significance.
  4. Click Save.

The display option is automatically enabled when statistical significance display is enabled. If you clear this display option, the statistical significance display will also be disabled.
