Basic terms and concepts of medical statistics. Reliability and statistical significance

21.09.2019

Let's consider a typical example of the application of statistical methods in medicine. The creators of the drug suggest that it increases diuresis in proportion to the dose taken. To test this hypothesis, they give five volunteers different doses of the drug.

Based on the observation results, a graph of diuresis versus dose is constructed (Fig. 1.2A). The dependence is visible to the naked eye. The researchers congratulate each other on the discovery, and the world on the new diuretic.

In fact, the data only allow us to reliably state that a dose-dependent diuresis was observed in these five volunteers. The fact that this dependence will manifest itself in all people who take the drug is no more than an assumption.
It cannot be said that this assumption is groundless; otherwise, why carry out experiments?

But the drug went on sale. More and more people take it in the hope of increasing their urine output. So what do we see? We see Figure 1.2B, which indicates the absence of any connection between the dose of the drug and diuresis. Black circles indicate data from the original study. Statistics has methods that allow us to estimate the likelihood of obtaining such an “unrepresentative”, and indeed misleading, sample. It turns out that in the absence of a connection between diuresis and the dose of the drug, the resulting “dependence” would be observed in approximately 5 out of 1000 experiments. So, in this case, the researchers were simply unlucky. Even the most advanced statistical methods would not have saved them from this mistake.
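The idea of estimating how often an "unrepresentative" sample arises can be illustrated with a small simulation. This is only a sketch under stated assumptions: the dose levels, the normally distributed diuresis values, and the correlation threshold of 0.97 are hypothetical and are not taken from Fig. 1.2A.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def chance_of_spurious_trend(r_observed, n_subjects=5, n_experiments=20000, seed=1):
    """Fraction of simulated experiments in which dose and diuresis are
    unrelated, yet the sample correlation is at least as strong as r_observed."""
    rng = random.Random(seed)
    doses = list(range(1, n_subjects + 1))  # hypothetical dose levels
    hits = 0
    for _ in range(n_experiments):
        # Diuresis is drawn independently of the dose: no real effect exists.
        diuresis = [rng.gauss(0.0, 1.0) for _ in doses]
        if abs(pearson_r(doses, diuresis)) >= r_observed:
            hits += 1
    return hits / n_experiments


# 0.97 is an assumed strength of the observed "dependence", for illustration only.
print(chance_of_spurious_trend(0.97))
```

The printed fraction plays the role of the "5 out of 1000" figure in the text: it is small, but not zero, so an occasional misleading sample is unavoidable.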

We gave this fictitious, but by no means unrealistic, example not to point out the uselessness of statistics. It illustrates something else: the probabilistic nature of its conclusions. As a result of applying a statistical method, we do not obtain the ultimate truth, but only an estimate of the probability of a particular assumption. In addition, every statistical method is based on its own mathematical model, and its results are correct to the extent that this model corresponds to reality.

In the tables of results of statistical calculations in coursework, diploma and master's theses in psychology, the indicator “p” is always present.

For example, in accordance with the research objectives, differences in the level of meaningfulness of life between teenage boys and girls were calculated.

Indicator | Average value, boys (20 people) | Average value, girls (5 people) | Mann-Whitney U test | Statistical significance level (p)
Goals | 28.9 | 35.2 | 17.5 | 0.027*
Process | 30.1 | 32.0 | 38.5 | 0.435
Result | 25.2 | 29.0 | 29.5 | 0.164
Locus of control - "I" | 20.3 | 23.6 | (not given) | 0.067
Locus of control - "Life" | 30.4 | 33.8 | 27.5 | 0.126
Meaningful life | 98.9 | 111.2 | (not given) | 0.103

* - differences are statistically significant (p ≤ 0.05)

The right column shows the value of “p”, and it is by this value that one can determine whether the differences in the meaningfulness of life between boys and girls are significant or not. The rule is simple:

  • If the level of statistical significance “p” is less than or equal to 0.05, then we conclude that the differences are significant. In the table above, the differences between boys and girls are significant for the “Goals” indicator (meaningfulness of life in the future). For girls, this indicator is statistically significantly higher than for boys.
  • If the level of statistical significance “p” is greater than 0.05, then we conclude that the differences are not significant. In the table above, the differences between boys and girls are not significant for any indicator except the first.
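The rule above can be sketched in a few lines. The p-values are copied from the table; the function name and the dictionary layout are illustrative choices, not part of any statistical package.

```python
ALPHA = 0.05  # conventional threshold used in the text

# p-values copied from the table of results
p_values = {
    "Goals": 0.027,
    "Process": 0.435,
    "Result": 0.164,
    'Locus of control - "I"': 0.067,
    'Locus of control - "Life"': 0.126,
    "Meaningful life": 0.103,
}


def significant_indicators(p_by_indicator, alpha=ALPHA):
    """Return the indicators whose p-value is less than or equal to alpha."""
    return [name for name, p in p_by_indicator.items() if p <= alpha]


print(significant_indicators(p_values))  # ['Goals']
```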

Where does the level of statistical significance “p” come from?

The level of statistical significance is calculated by a statistical program together with the statistical criterion itself. In these programs, you can also set a critical threshold for the level of statistical significance, and the program will highlight the corresponding indicators.

For example, in the STATISTICA program, when calculating correlations, you can set the “p” limit, for example, 0.05, and all statistically significant relationships will be highlighted in red.

If the statistical criterion is calculated manually, then the significance level “p” is determined by comparing the value of the resulting criterion with the critical value.

What does the level of statistical significance “p” show?

Every statistical conclusion carries some uncertainty, and “p” measures it. The significance level is written as a decimal, for example, 0.023 or 0.965. If we multiply this number by 100, we get the p indicator as a percentage: 2.3% and 96.5%. These percentages show how likely it would be to obtain a relationship at least as strong as the observed one (for example, between aggression and anxiety) purely by chance, if no such relationship actually existed. Strictly speaking, this is not the probability that our assumption is wrong, although it is often loosely described that way.

Suppose a correlation coefficient of 0.58 between aggression and anxiety was obtained at a statistical significance level of 0.05. What exactly does this mean?

The correlation we identified means that in our sample the following pattern is observed: the higher the aggressiveness, the higher the anxiety. A correlation describes a tendency, not a rule for every pair: if we take two teenagers, and one has higher anxiety than the other, then that teenager is more likely, but not certain, to have higher aggressiveness as well. The significance level of 0.05 means that if there were no relationship between aggression and anxiety at all, a correlation this strong would still appear by chance in about 1 out of 20 samples of this size. It does not mean that 1 out of 20 individual predictions within the group will be wrong.
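One way to see what a significance level for a correlation measures is a permutation test: shuffle one variable many times and count how often chance alone produces a correlation at least as strong as the observed one. The scores below are hypothetical, and the permutation approach is a swapped-in illustration, not the method a package such as STATISTICA necessarily uses.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_permutations=10000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            count += 1
    # +1 in numerator and denominator counts the observed pairing itself
    return (count + 1) / (n_permutations + 1)


# Hypothetical questionnaire scores for ten teenagers
anxiety = [12, 15, 9, 20, 17, 11, 14, 22, 8, 16]
aggression = [10, 14, 11, 18, 13, 9, 15, 19, 7, 12]

r = pearson_r(anxiety, aggression)
p = permutation_p_value(anxiety, aggression)
print(round(r, 2), p)
```

A small p here says that shuffled (unrelated) data rarely reproduce such a strong correlation; it says nothing about how often an individual prediction fails.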

Which level of statistical significance is better: 0.01 or 0.05?

The lower the level of statistical significance, the less likely it is that the result arose by chance alone. Therefore, a result at p=0.01 is more convincing than one at p=0.05.

In psychological research, two acceptable levels of statistical significance of results are accepted:

p=0.01 - high reliability of the result of a comparative analysis or an analysis of relationships;

p=0.05 - sufficient accuracy.


STATISTICAL RELIABILITY

- English: credibility/validity, statistical; German: Validität, statistische. Consistency, objectivity and lack of ambiguity in a statistical test or in any set of measurements. Statistical reliability can be tested by repeating the same test (or questionnaire) on the same subject to see whether the same results are obtained, or by comparing different parts of a test that are supposed to measure the same object.

Antinazi. Encyclopedia of Sociology, 2009

In any scientific or practical experimental (survey) situation, researchers study not all people (the general population), but only a certain sample. For example, even if we are studying a relatively small group of people, such as those suffering from a particular disease, it is still very unlikely that we have the resources or the need to test every patient. Instead, it is common to test a sample from the population because it is more convenient and less time-consuming. If so, how do we know that the results obtained from the sample are representative of the entire group? Or, to use professional terminology, can we be sure that our research correctly describes the entire population from which our sample was drawn?

To answer this question, it is necessary to determine the statistical significance of the test results. Statistical significance (significance level, abbreviated Sig.), or the p-level of significance (p-level), can be understood, as a first approximation, as the probability that a given result correctly represents the population from which the sample was drawn. Note that this is only a probability: it is impossible to say with absolute certainty that a given study correctly describes the entire population. At best, based on the level of significance, one can only conclude that this is very likely. Thus, the next question inevitably arises: what must the level of significance be before a given result can be considered a correct characterization of the population?

For example, at what probability value are you willing to say that such chances are enough to take a risk? What if the odds are 10 out of 100 or 50 out of 100? What if this probability is higher? What about odds like 90 out of 100, 95 out of 100, or 98 out of 100? For a situation involving risk, this choice is quite problematic, because it depends on the personal characteristics of the person.

In psychology, it is traditionally believed that a chance of 95 or more out of 100 means that the probability of the results being correct is high enough for them to be generalized to the entire population. This figure was established in the course of scientific and practical work; there is no law according to which it should be chosen as a guideline (and indeed, other sciences sometimes choose other values of the significance level).

In psychology, this probability is handled in a somewhat unusual way. Instead of the probability that the sample represents the population, one reports the probability that the sample does not represent the population. In other words, it is the probability that the observed relationship or difference is random and not a property of the population; more precisely, it is the probability of obtaining a relationship or difference at least as large as the observed one if none exists in the population. So, instead of saying there is a 95 in 100 chance that the results of a study are correct, psychologists say that there is a 5 in 100 chance that the results are wrong (just as a 40 in 100 chance that the results are correct means a 60 in 100 chance that they are incorrect). The probability value is sometimes expressed as a percentage, but more often it is written as a decimal fraction. For example, 10 chances out of 100 are written as 0.1; 5 out of 100 as 0.05; 1 out of 100 as 0.01. In this notation, the limit value is 0.05. For a result to be considered correct, its significance level must be below this number (remember, this is the probability that the result describes the population incorrectly). To settle the terminology, let us add that the “probability of the result being incorrect” (more correctly called the significance level) is usually denoted by the Latin letter p. Descriptions of experimental results usually include a summary statement such as “the results were significant at the level p < 0.05” (i.e., less than 5%).

Thus, the significance level (p) indicates the likelihood that the results do not represent the population. Traditionally in psychology, the results are considered to reflect the overall picture reliably if the value of p is less than 0.05 (i.e., 5%). However, this is only a probabilistic statement, and by no means an unconditional guarantee. In some cases this conclusion may be incorrect. In fact, we can estimate how often this might happen by looking at the significance level. At a significance level of 0.05, when no real effect exists, about 5 out of 100 studies will still produce a “significant” result. At first glance this does not seem very common, but 5 chances out of 100 is the same as 1 out of 20. In other words, in such a situation one out of every 20 studies will report an effect that is not there. Such odds do not seem particularly favorable, and researchers should beware of committing errors of the first kind (Type I errors). This is the name for the error that occurs when researchers think they have discovered a real result, but in fact there is none. The opposite error, in which researchers believe that they have not found a result when in fact there is one, is called an error of the second kind (Type II error).
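The "1 out of 20" figure can be checked with a simulation in which no real difference exists, so every "significant" result is an error of the first kind. This is a sketch under stated assumptions: both groups are drawn from the same normal distribution with a known standard deviation of 1, and a simple z-test stands in for whatever criterion a real study would use.

```python
import random
from statistics import NormalDist


def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided p-value of a z-test for equal means (sigma assumed known)."""
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    se = sigma * (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha=0.05, n_studies=4000, n_per_group=30, seed=42):
    """Share of simulated studies that report p < alpha although both
    groups come from the same population (no true difference)."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(n_studies):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(group_a, group_b) < alpha:
            false_alarms += 1
    return false_alarms / n_studies


rate = false_positive_rate()
print(rate)  # expected to be close to 0.05, i.e. about 1 study in 20
```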

These errors arise because it is impossible to rule out that the statistical analysis leads to a wrong conclusion. The probability of error depends on the level of statistical significance of the results. We have already noted that for a result to be considered correct, the significance level must be below 0.05. Of course, some results reach a lower level, and it is not uncommon to find values as low as 0.001 (a value of 0.001 indicates that the results have a 1 in 1000 chance of being wrong). The smaller the value of p, the stronger our confidence in the correctness of the results.

Table 7.2 shows the traditional interpretation of significance levels with regard to the possibility of statistical inference and the justification of a decision about the presence of a relationship (differences).

Table 7.2

Traditional interpretation of significance levels used in psychology

Based on the experience of practical research, it is recommended that, in order to avoid errors of the first and second kinds as far as possible, decisions about the presence of differences (relationships) in important conclusions be made with reference to the p-level of significance.

A statistical test (Statistical Test) is a tool for determining the level of statistical significance. It is a decision rule that ensures that a true hypothesis is accepted and a false hypothesis is rejected with high probability.

The term “statistical criterion” also denotes the method of calculating a certain number and the number itself. All criteria are used with one main goal: to determine the significance level of the data they analyze (i.e., the likelihood that the data reflect a true effect that correctly represents the population from which the sample is drawn).

Some tests can only be used for normally distributed data (and if the trait is measured on an interval scale) - these tests are usually called parametric. Using other criteria, you can analyze data with almost any distribution law - they are called nonparametric.

Parametric criteria are criteria that include distribution parameters in the calculation formula, i.e. means and variances (Student's t-test, Fisher's F-test, etc.).

Nonparametric criteria are criteria that do not include distribution parameters in the calculation formula and are based on operating with frequencies or ranks (Rosenbaum's Q test, the Mann-Whitney U test, etc.).
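To show what "operating with ranks" means in practice, here is a minimal sketch of the Mann-Whitney U statistic computed by direct pair counting, which is equivalent to the usual rank-sum formula. The scores are hypothetical, and the function is an illustration rather than a replacement for a statistical package.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): for each group, the number of cross-group pairs
    in which its value is the larger one, with ties counted as one half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


boys = [25, 28, 30, 27, 31, 29]   # hypothetical scores
girls = [33, 36, 30, 38]          # hypothetical scores

u_boys, u_girls = mann_whitney_u(boys, girls)
print(u_boys, u_girls)        # 1.5 22.5
print(min(u_boys, u_girls))   # 1.5, the value conventionally compared with the table
```

Only the ordering of the values matters here; means and variances never enter the calculation, which is why the criterion works for almost any distribution.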

For example, when we say that the significance of the differences was determined by the Student's t-test, we mean that the Student's t-test method was used to calculate the empirical value, which is then compared with the tabulated (critical) value.

By the ratio of the empirical (calculated by us) and critical (tabulated) values of the criterion, we can judge whether our hypothesis is confirmed or refuted. In most cases, for the differences to be recognized as significant, the empirical value of the criterion must exceed the critical value, although there are criteria (for example, the Mann-Whitney test or the sign test) for which the opposite rule applies.

In some cases, the calculation formula for the criterion includes the number of observations in the sample under study, denoted n. Using a special table, we determine what level of statistical significance of differences a given empirical value corresponds to. In most cases, the same empirical value of the criterion may be significant or insignificant depending on the number of observations in the sample (n) or on the so-called number of degrees of freedom, which is denoted ν or df (sometimes d).

Knowing n or the number of degrees of freedom, we can use special tables (the main ones are given in Appendix 5) to determine the critical values of the criterion and compare the obtained empirical value with them. This is usually written as follows: “for n = 22 the critical value of the criterion is t_St = 2.07” or “for ν (d) = 2 the critical value of Student's test is 4.30”, etc.
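The comparison of an empirical value with a critical one, including the reversed rule for criteria such as Mann-Whitney, can be sketched as follows. The function name, the `reject_when` parameter, and all empirical values are hypothetical; only the critical value 4.30 comes from the text. Treating a value exactly equal to the critical one as significant is also an assumption.

```python
def differences_are_significant(empirical, critical, reject_when="greater"):
    """Compare an empirical criterion value with the tabulated critical value.

    reject_when="greater": significant if empirical >= critical (e.g. Student's t).
    reject_when="smaller": significant if empirical <= critical (e.g. Mann-Whitney U).
    """
    if reject_when == "greater":
        return empirical >= critical
    if reject_when == "smaller":
        return empirical <= critical
    raise ValueError("reject_when must be 'greater' or 'smaller'")


# Student's t-test, 2 degrees of freedom: critical value 4.30 (from the text);
# 5.10 is a hypothetical empirical value.
print(differences_are_significant(5.10, 4.30))                      # True

# Mann-Whitney U: both numbers are hypothetical.
print(differences_are_significant(12, 20, reject_when="smaller"))   # True
```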

Typically, preference is still given to parametric criteria, and we adhere to this position. They are considered to be more reliable and can provide more information and deeper analysis. As for the complexity of mathematical calculations, when using computer programs this difficulty disappears (but some others appear, however, quite surmountable).

In this textbook we do not consider in detail the problem of statistical hypotheses (null, H0, and alternative, H1) and of the statistical decisions made, since psychology students study this separately in the discipline “Mathematical methods in psychology”. In addition, it should be noted that when preparing a research report (coursework, thesis, or publication), statistical hypotheses and statistical decisions are, as a rule, not given. Usually, when describing the results, one indicates the criterion and provides the necessary descriptive statistics (means, standard deviations, correlation coefficients, etc.), the empirical values of the criteria, the degrees of freedom, and, necessarily, the p-level of significance. Then a substantive conclusion is formulated regarding the hypothesis being tested, indicating (usually in the form of an inequality) the level of significance achieved or not achieved.

PAID FEATURE. The statistical significance feature is only available in some plans. Check whether it is included in your plan.

You can find out whether there are statistically significant differences in the answers that different groups of respondents give to survey questions. To use the statistical significance feature in SurveyMonkey, you must:

  • Enable the statistical significance feature when adding a comparison rule to a question in your survey. Select the groups of respondents to compare in order to sort survey results into groups for visual comparison.
  • Examine the data tables for your survey questions to determine whether there are statistically significant differences in the responses received from different groups of respondents.

View statistical significance

By following the steps below, you can create a survey that displays statistical significance.

1. Add closed-ended questions to your survey

In order to show statistical significance when analyzing results, you will need to apply a comparison rule to any question in your survey.

You can apply the comparison rule and calculate statistical significance in responses if you use one of the following types of questions in your survey design:

It is necessary to make sure that the proposed answer options can be divided into complete groups. The response options you select for comparison when you create a comparison rule will be used to organize the data into crosstabs throughout the survey.

2. Collect answers

Once you've completed your survey, create a collector to send it out. There are several ways.

You must receive at least 30 responses for each response option you plan to use in your comparison rule to activate and view statistical significance.

Survey example

You want to find out whether men are significantly more satisfied with your products than women.

  1. Add two multiple choice questions to your survey:
    What is your gender? (male, female)
    Are you satisfied or dissatisfied with our product? (satisfied, dissatisfied)
  2. Make sure that at least 30 respondents select “male” for the gender question AND at least 30 respondents select “female” as their gender.
  3. Add a comparison rule to the question "What is your gender?" and select both answer options as your groups.
  4. Use the data table below the question chart "Are you satisfied or dissatisfied with our product?" to see if any response options show a statistically significant difference

What is a statistically significant difference?

A statistically significant difference means that statistical analysis has determined that there are significant differences between the responses of one group of respondents and the responses of another group. Statistical significance means that the numbers obtained are significantly different. Such knowledge will greatly help you in data analysis. However, you determine the importance of the results obtained. It is you who decide how to interpret the survey results and what actions should be taken based on them.

For example, you receive more complaints from female customers than from male customers. How can we determine whether such a difference is real and whether action needs to be taken regarding it? One great way to test your observations is to conduct a survey that will show you whether male customers are significantly more satisfied with your product. Using a statistical formula, our statistical significance function will give you the ability to determine whether your product is actually much more appealing to men than to women. This will allow you to take action based on facts rather than guesswork.

Statistically significant difference

If your results are highlighted in the data table, it means that the two groups of respondents are significantly different from each other. The term “significant” does not mean that the resulting numbers have any particular importance or significance, only that there is a statistical difference between them.

No statistically significant difference

If your results are not highlighted in the corresponding data table, this means that although there may be a difference in the two figures being compared, there is no statistical difference between them.

Responses without statistically significant differences demonstrate that there is no significant difference between the two items being compared given the sample size you use, but this does not necessarily mean that they are not significant. Perhaps by increasing the sample size, you will be able to identify a statistically significant difference.

Sample size

If you have a very small sample size, only very large differences between the two groups will be significant. If you have a very large sample size, both small and large differences will be counted as significant.

However, if two numbers are statistically different, this does not mean that the difference between the results has any practical meaning to you. You will have to decide for yourself which differences are meaningful for your survey.
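The effect of sample size can be sketched numerically. The example below assumes a pooled two-proportion test with the 1.96 threshold for a 95% confidence level; the percentages (70% vs 65%) and the group sizes are illustrative, and the function name is hypothetical.

```python
from math import sqrt


def two_proportion_statistic(p1, n1, p2, n2):
    """Test statistic for the difference between two proportions,
    using the pooled proportion to estimate the standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# The same 5-point difference with small and large groups
for n in (100, 1000):
    stat = two_proportion_statistic(0.70, n, 0.65, n)
    print(n, round(stat, 2), abs(stat) > 1.96)
# 100 0.75 False
# 1000 2.39 True
```

With 100 respondents per group the 5-point gap is not significant; with 1,000 per group the same gap is.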

Calculating Statistical Significance

We calculate statistical significance using a standard 95% confidence level. If an answer option is shown as statistically significant, it means that there is less than a 5% probability that the difference between the two groups arose by chance alone or due to sampling error (often shown as p < 0.05).

To calculate statistically significant differences between groups, we use the following formulas:

Parameter | Description
a1 | The percentage of participants from the first group who answered the question in a certain way, multiplied by the sample size of this group.
b1 | The percentage of participants from the second group who answered the question in a certain way, multiplied by the sample size of this group.
Pooled sample proportion (p) | The combined proportion from both groups.
Standard error (SE) | An indicator of how much your proportion may differ from the actual proportion. A lower value means the proportion is close to the actual one; a higher value means it may differ from it significantly.
Test statistic (t) | The number of standard deviations by which a given value differs from the mean.
Statistical significance | If the absolute value of the test statistic is greater than 1.96*, the difference is considered statistically significant.

*1.96 is the value used for the 95% confidence level because, for large samples, the Student's t-distribution is close to the normal distribution, in which 95% of the values lie within 1.96 standard deviations of the mean.
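The formulas themselves are not reproduced in this copy of the text. A standard pooled two-proportion test that matches the parameter descriptions above would read as follows; this is a reconstruction under that assumption, with p1 and p2 denoting the proportions in the two groups and n1 and n2 the group sizes.

```latex
a_1 = p_1 n_1, \qquad b_1 = p_2 n_2

p = \frac{a_1 + b_1}{n_1 + n_2}

SE = \sqrt{p\,(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}

t = \frac{p_1 - p_2}{SE}, \qquad \text{significant if } |t| > 1.96
```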

Calculation example

Continuing with the example used above, let's find out whether the percentage of men who say they are satisfied with your product is significantly higher than the percentage of women.

Let's say 1,000 men and 1,000 women took part in your survey, and the result of the survey was that 70% of men and 65% of women say that they are satisfied with your product. Is the 70% level significantly higher than the 65% level?

Substitute the following data from the survey into the given formulas:

  • p1 (% of men satisfied with the product) = 0.7
  • p2 (% of women satisfied with the product) = 0.65
  • n1 (number of men surveyed) = 1000
  • n2 (number of women interviewed) = 1000

Since the absolute value of the test statistic is greater than 1.96, it means that the difference between men and women is significant. Compared to women, men are more likely to be satisfied with your product.
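The intermediate steps of this calculation can be sketched as follows, assuming the usual pooled two-proportion test (the variable names are illustrative; the input numbers are those of the example).

```python
from math import sqrt

p1, n1 = 0.70, 1000   # share and number of men
p2, n2 = 0.65, 1000   # share and number of women

a1 = p1 * n1          # men who are satisfied (about 700)
b1 = p2 * n2          # women who are satisfied (about 650)

pooled = (a1 + b1) / (n1 + n2)                            # pooled sample proportion
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))      # standard error
t = (p1 - p2) / se                                        # test statistic

print(round(pooled, 3), round(se, 4), round(t, 2))  # 0.675 0.0209 2.39
print(abs(t) > 1.96)                                # True
```

The test statistic of about 2.39 exceeds 1.96, which agrees with the conclusion stated above.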

Hiding statistical significance

How to hide statistical significance for all questions

  1. Click the down arrow to the right of the comparison rule in the left sidebar.
  2. Select Edit rule.
  3. Turn off the Show statistical significance toggle.
  4. Click Apply.

To hide statistical significance for one question, you need to:

  1. Click the Tune button above the chart for this question.
  2. Open the Display options tab.
  3. Uncheck the box next to Statistical significance.
  4. Click Save.

The display option is automatically enabled when statistical significance display is enabled. If you clear this display option, the statistical significance display will also be disabled.
