t-Test, Chi-Square, ANOVA, Regression, Correlation... (2024)

This tutorial is about z-standardization (z-transformation). We will discuss what the z-score is, how z-standardization works, and what the standard normal distribution is. In addition, we discuss the z-score table and what it is used for.

What is z-standardization?

Z-standardization is a statistical procedure used to make data points from different datasets comparable. In this procedure, each data point is converted into a z-score. A z-score indicates how many standard deviations a data point is from the mean of the dataset.

Example of z-standardization

Suppose you are a doctor and want to examine the blood pressure of your patients. For this purpose, you measure the blood pressure of a sample of 40 patients. From the measured data, you can calculate the mean, i.e., the average blood pressure of the 40 patients.

Now one of the patients asks you how high his blood pressure is compared to the others. You tell him that his blood pressure is 10 mmHg above average. Now the question arises whether 10 mmHg is a lot or a little.

If the other patients cluster very closely around the mean, then 10 mmHg is a lot in relation to the spread, but if the other patients spread very widely around the mean, then 10 mmHg might not be that much.

The standard deviation tells us how much the data is spread out. If the data are close to the mean, we have a small standard deviation; if they are widely spread, we have a large standard deviation.

Let's say we get a standard deviation of 20 mmHg for our data. This means that, on average, the patients deviate by 20 mmHg from the mean.

The z-score now tells us how far a person is from the mean in units of standard deviation. So, a person who deviates one standard deviation from the mean has a z-score of 1. A person who is twice as far from the mean has a z-score of 2. And a person who is three standard deviations from the mean has a z-score of 3.

Accordingly, a person who deviates by minus one standard deviation has a z-score of -1, a person who deviates by minus two standard deviations has a z-score of -2, and a person who deviates by minus three standard deviations has a z-score of -3.

And if a person has exactly the value of the mean, then they deviate by zero standard deviations from the mean and receive a score of zero.

Thus, the z-score indicates how many standard deviations a measurement is from the mean. As mentioned, the standard deviation is just a measure of the dispersion of the patients' blood pressure around the mean.

In short, the z-score helps us understand how exceptional or normal a particular measurement is compared to the overall average.

Calculating the z-score

How do we calculate the z-score? We want to convert the original data, in our case the blood pressure, into z-scores, i.e., perform a z-standardization.

The formula for z-standardization is z = (x − μ) / σ. Here, z is the z-score we want to calculate, x is the observed value, in our case the blood pressure of the person in question, μ is the mean, in our case the mean of all 40 patients, and σ is the standard deviation, i.e., the standard deviation of our 40 patients.

Caution: μ and σ are actually the mean and standard deviation of the population, but in our case we only have a sample. However, under certain conditions, which we will discuss later, we can estimate the mean and standard deviation using the sample.

Let's assume that the 40 patients in our example have a mean of 130 and a standard deviation of 20. If we plug in both values, we get z = (x − 130) / 20.

Now we can use the blood pressure of each individual patient for x and calculate the z value. Let's just do this for the first patient. Let's say this patient has a blood pressure of 97, then we simply enter 97 for x and get a z-value of -1.65.

This person therefore deviates from the mean by -1.65 standard deviations. We can now do this for all patients.
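This calculation can be sketched in a few lines of Python; the function and variable names here are illustrative, not part of the original material:

```python
def z_score(x, mean, sd):
    """Convert a raw value x into a z-score: (x - mean) / sd."""
    return (x - mean) / sd

# Mean and standard deviation from the blood-pressure example
mean_bp = 130
sd_bp = 20

# Patient with a blood pressure of 97
print(z_score(97, mean_bp, sd_bp))  # → -1.65
```

The same function can then be applied to every patient's value to standardize the whole dataset.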

Regardless of the unit of the initial data, we now have an overview in which we can see how far a person deviates from the mean in units of the standard deviation.

Now, of course, we only have a sample that comes from a specific population. But if the data is normally distributed and the sample size is greater than 30, then we can use the z-value to say what percentage of patients have a blood pressure lower than 110, for example, and what percentage have a blood pressure higher than 110.

But how does this work? If the initial data is normally distributed, we obtain a so-called standard normal distribution through z-standardization.

The standard normal distribution is a specific type of normal distribution with a mean value of 0 and a standard deviation of 1.

The special feature is that any normal distribution, regardless of its mean or standard deviation, can be converted into a standard normal distribution.
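We can check this property numerically. The following sketch uses a small made-up dataset (not the actual 40 patients) and Python's standard library:

```python
import statistics

# Illustrative data, constructed so that the mean is 130 (not the real sample)
data = [97, 110, 123, 130, 137, 150, 163]

mu = statistics.mean(data)
sigma = statistics.pstdev(data)  # population standard deviation

z = [(x - mu) / sigma for x in data]

# After z-standardization, the mean is 0 and the standard deviation is 1
print(round(statistics.mean(z), 10))    # → 0.0
print(round(statistics.pstdev(z), 10))  # → 1.0
```

Whatever the original mean and standard deviation, the standardized values always have mean 0 and standard deviation 1.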

Since we now have a standardized distribution, all we really need is a table that, for as many z-values as possible, tells us what percentage of the values lie below each z-value.

You can find such a table in almost every statistics book or here: Table of the z-distribution. Now, of course, the question is how to read this table.

If, for example, we have a z-value of -2, then we can read a value of 0.0228 from this table.

This means that 2.28% of the values are smaller than a z-value of -2. As the sum is always 100% or 1, 97.72% of the values are greater.

And with a z-value of zero, we are exactly in the middle and get a value of 0.5. Therefore, 50% of the values are smaller than a z-value of 0 and 50% are greater. Since the normal distribution is symmetrical, we can also read off the probabilities for positive z-values from the table.

If we have a z-value of 1, we only need to look up -1. However, we must note that in this case the table gives us the percentage of values that are greater than the z-value. So with a z-value of 1, 15.87% of the values are larger and 84.13% of the values are smaller.
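Instead of a printed table, Python's standard library can evaluate the standard normal CDF directly, which lets us verify the table values quoted above:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

print(round(std_normal.cdf(-2), 4))     # → 0.0228  (2.28% of values below z = -2)
print(round(std_normal.cdf(0), 4))      # → 0.5     (50% below z = 0)
# By symmetry, P(Z > 1) equals P(Z < -1):
print(round(1 - std_normal.cdf(1), 4))  # → 0.1587
print(round(std_normal.cdf(-1), 4))     # → 0.1587
```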

But what if, for example, we want to look up a z-value of -1.81 in the table? We need the other columns for this: we find the value for -1.81 in the row for -1.8 and the column for 0.01.

Now let's look at the example about blood pressure again. For example, if we want to know what percentage of patients have a blood pressure below 123, we can use z-standardization to convert a blood pressure of 123 into a z-value, in this case we get a z-value of -0.35.

Now we can take the table with the z-distributions and search for a z-value of -0.35. Here we have a value of 0.3632. This means that 36.32 percent of the values are smaller than a z-value of -0.35 and 63.68 percent are larger.
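The whole blood-pressure lookup can be reproduced in one short sketch (mean 130 and standard deviation 20, as in the example):

```python
from statistics import NormalDist

mean_bp, sd_bp = 130, 20

z = (123 - mean_bp) / sd_bp        # z-score for a blood pressure of 123
share_below = NormalDist().cdf(z)  # standard normal CDF

print(round(z, 2))            # → -0.35
print(round(share_below, 4))  # → 0.3632
```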

Compare different data sets with the z-score

However, there is another important application for z-standardization. The z-standardization can help to make values measured in different ways comparable. Here is an example.

Suppose we have two classes, class A and class B, who have written a different test in mathematics.

The tests are designed differently, have a different level of difficulty and a different maximum score.

In order to be able to compare the performance of the pupils in the two classes fairly, we can apply the z-standardization.

The average score or mean score for class A was 70 points with a standard deviation of 10 points. The average score for the test in class B was 140 points with a standard deviation of 20 points.

We now want to compare the performance of Max from class A, who scored 80 points, with the performance of Emma from class B, who scored 160 points.

To do this, we simply calculate the z-value of Max and Emma. We enter 80 once for x and get a z-value of 1. Then we enter 160 for x and also get a z-value of 1.

The z-values of Max and Emma are therefore the same. This means that both students performed equally well in terms of average performance and dispersion in their respective classes. Both are exactly one standard deviation above the mean of their class.
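The Max-and-Emma comparison in code; the numbers match the example above:

```python
def z_score(x, mean, sd):
    """z-score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

z_max = z_score(80, 70, 10)     # Max, class A: mean 70, sd 10
z_emma = z_score(160, 140, 20)  # Emma, class B: mean 140, sd 20

print(z_max, z_emma)  # → 1.0 1.0
```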

Assumptions

But what about the assumptions? Can we simply calculate a z-standardization and use the table of the standard normal distribution?

The z-standardization itself, i.e. the conversion of the data points into z-values using this formula, is essentially not subject to any strict conditions. It can be carried out independently of the data distribution.

However, if we use the resulting z-values in the context of the standard normal distribution for statistical analyses (e.g. for hypothesis tests or confidence intervals), certain assumptions must be met.

The z-distribution assumes that the underlying population is normally distributed and that the mean (μ) and standard deviation (σ) of the population are known.

However, as you never have the entire population in practice and the mean value and standard deviation are usually not known, this requirement is of course often not met. Fortunately, however, there is an alternative assumption.

Although the z-distribution is defined for normally distributed populations, the central limit theorem can be applied to large samples. This theorem states that the distribution of the sample mean approaches a normal distribution as the sample size grows; a common rule of thumb is a sample size greater than 30. Therefore, if the sample is larger than 30, the standard normal distribution can be used as an approximation, and the mean and standard deviation can be estimated from the sample.

When the mean and standard deviation are estimated from the sample, s is usually written instead of σ for the standard deviation, and x̄ (x-bar) instead of μ for the mean.
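Estimating both quantities from a sample is straightforward with the standard library; note that `statistics.stdev` already uses the n − 1 denominator of the sample standard deviation (the data here are illustrative):

```python
import statistics

# Illustrative sample, not the actual 40 patients
sample = [97, 110, 123, 130, 137, 150, 163]

x_bar = statistics.mean(sample)  # sample mean, estimates μ
s = statistics.stdev(sample)     # sample sd (n - 1 denominator), estimates σ

print(x_bar)        # → 130
print(round(s, 2))  # → 22.64
```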

The z-standardization should not be confused with the z-test or the t-test, which are hypothesis tests rather than transformations.

FAQs

When to use chi-square test vs t-test vs ANOVA? ›

While t-tests and ANOVA primarily deal with continuous dependent variables, Chi-Square tests come into play when there is a categorical dependent variable, often in the context of logistic regression.

Is ANOVA a correlation or regression? ›

Thus, ANOVA can be considered a special case of linear regression in which all predictors are categorical. What distinguishes linear regression from ANOVA in practice is the way results are reported in common statistical software.

What is the use of t-test, ANOVA, correlation, and regression in research? ›

The Student's t-test is used to compare the means between two groups, whereas ANOVA is used to compare the means among three or more groups. An ANOVA first yields a single overall P value. A significant P value in the ANOVA indicates that for at least one pair of groups the mean difference is statistically significant.

Is chi-square test used for correlation? ›

Both correlations and chi-square tests can test for relationships between two variables. However, a correlation is used when you have two quantitative variables and a chi-square test of independence is used when you have two categorical variables.

What is the difference between t-test and regression? ›

The main difference is that t-tests and ANOVAs involve the use of categorical predictors, while linear regression involves the use of continuous predictors. When we start to recognise whether our data is categorical or continuous, selecting the correct statistical analysis becomes a lot more intuitive.

What is the difference between an ANOVA and a t-test? ›

ANOVA compares means among three or more groups, whereas t-tests solely compare means between two groups. ANOVA encompasses an analysis of between-group and within-group variation, whereas t-tests focus solely on within-group variation.

When should we use regression instead of ANOVA? ›

If you're interested in predicting an outcome or understanding the relationship between variables, regression is your go-to method. But if your focus is on comparing means and determining whether differences are significant, ANOVA is the tool of choice.

When to use t test vs correlation? ›

The correlation statistic can be used for continuous variables or binary variables or a combination of continuous and binary variables. In contrast, t-tests examine whether there are significant differences between two group means.

What is the difference between linear regression and t test ANOVA? ›

If the categorical predictor has only 2 levels such as sex (male, female), then the simple regression analysis is equivalent to an independent t test. If the single categorical variable has more than 2 levels, then the simple linear regression is equivalent to 1-way analysis of variance (ANOVA).

What is the difference between chi-square test and t-test? ›

The t-test and the chi-square test are two different statistical tests used for different types of data. The t-test is used to compare the means of two groups and is suitable for continuous numerical data. On the other hand, the chi-square test is used to examine the association between two categorical variables.

When to use a chi-square test? ›

A chi-square test is used to help determine if observed results are in line with expected results and to rule out that observations are due to chance. A chi-square test is appropriate for this when the data being analyzed is from a random sample, and when the variable in question is a categorical variable.

Is chi-square a regression analysis? ›

It turns out that the 2 X 2 contingency analysis with chi-square is really just a special case of logistic regression, and this is analogous to the relationship between ANOVA and regression. With chi-square contingency analysis, the independent variable is dichotomous and the dependent variable is dichotomous.

What is the difference between ANOVA and chi-square? ›

In summary, ANOVA is used to compare means across multiple groups with continuous dependent variables and categorical independent variables. On the other hand, Chi-Square tests assess the association or independence between categorical variables.

What is the difference between ANOVA and correlation? ›

Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means and their associated procedures (such as "variation" among and between groups). A correlation is a single number that describes the degree of relationship between two variables.

What statistical test to use for correlation? ›

The correlation coefficient is a statistic measuring the strength of the linear correlation. Usually, there are two ways: the Pearson correlation coefficient and the Spearman correlation coefficient.

What is the difference between chi-square test and t-test and F-test? ›

Both the t-test and the z-test are usually used for continuous populations, and the chi-square test is used for categorical data. The F- test is used for comparing more than two means.

When should you use an ANOVA test? ›

Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or categories).

When you should use a dependent t-test vs independent t-test vs ANOVA? ›

If your data are independent, for example, an independent samples t-test or an ANOVA without repeated measures is calculated. If your data are dependent, a t-test for dependent samples or an ANOVA with repeated measures is calculated.
