 

P Values and Statistical Significance

 
 

A question about the statistical significance of p-values was recently posted to the iSixSigma discussion forum:

"I'm having a problem understanding the concept of the P-Value as it relates to correlation. If my understanding is correct, when evaluating a correlation with a low P-Value, then we say it's significant. But what does this really mean? What does the value represent?"




It's a question that I'm sure we've all faced when using our statistical software programs. How do we determine whether two samples are similar or different? Is the alternative hypothesis valid or not? Below are answers from the many statistical experts who frequent our community. If you have a question or would like to make an additional comment, just press the 'Post A Reply' button.


"Hypothesis tests are typically used in the [Six Sigma] Analyze Phase to identify the critical x's (inputs) of a process. Generally, these critical x's are assumed to exist when we reject the null hypothesis. The significance level (alpha) used in these hypothesis tests is often set at 95% (i.e. p-value = 0.05 threshold). When a hypothesis test is performed, the p-value represents the probability of getting a sample as extreme (or worse) assuming the null hypothesis is true. We therefore reject the null hypothesis when the p-value is less than the significance level we established"

Posted by: Tab 
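
Tab's description of the p-value as the probability of a result at least as extreme, assuming the null hypothesis is true, can be illustrated with a permutation test. The following is a minimal Python sketch, not a method mentioned in the thread: the function name, the sample data, and the choice of a permutation test (rather than, say, a t-test) are all assumptions made for illustration.

```python
import random

def permutation_p_value(a, b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle the pooled data and count how often a random split gives a
    difference at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        new_a = pooled[:len(a)]
        new_b = pooled[len(a):]
        diff = abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed split itself, so p is never exactly 0.
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical measurements from two process settings (illustrative only).
setting_1 = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
setting_2 = [11.0, 11.3, 10.9, 11.2, 11.1, 10.8]

alpha = 0.05  # significance level chosen before looking at the data
p = permutation_p_value(setting_1, setting_2)
print("p-value:", p)
print("reject H0" if p < alpha else "fail to reject H0")
```

The decision rule in the last line is the one Tab describes: reject the null hypothesis only when the p-value falls below the significance level chosen in advance.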


"The basic notion of p value is this: What is the probability (p) that the association seen in the data (in this case the correlation) would have been seen by chance (i.e. if in fact there is no relationship between the variables)? Or more accurately what is the probability that you will again get this value for extent of correlation if in fact there is no correlation?

"If the p-value is 0.05, the common understanding is that the observed relationship would be expected in 5% of measurements if there was no correlation. You would expect 1 out of 20 samples to show a correlation when in fact the variables were not correlated."

Posted by: Don Nanneman 
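
Don's "1 out of 20" statement can be checked by simulation. The sketch below is only an illustration under stated assumptions: it draws many samples of 20 pairs from two independent normal variables, applies the t0 statistic quoted later in this thread, and counts how often the result exceeds 2.101, the standard two-sided 5% critical value of the t-distribution with 18 degrees of freedom. The function names, sample size, and trial count are hypothetical choices.

```python
import math
import random

N = 20               # pairs per sample (assumed for illustration)
T_CRIT_DF18 = 2.101  # two-sided 5% critical value of t with N - 2 = 18 df

def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def false_positive_rate(n_trials=4000, seed=1):
    """Fraction of truly uncorrelated samples that test as 'significant'."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # x and y are generated independently, so the true correlation is 0.
        x = [rng.gauss(0, 1) for _ in range(N)]
        y = [rng.gauss(0, 1) for _ in range(N)]
        r = pearson_r(x, y)
        t0 = r * math.sqrt(N - 2) / math.sqrt(1 - r * r)
        if abs(t0) > T_CRIT_DF18:
            rejections += 1
    return rejections / n_trials

print("share of uncorrelated samples with p < 0.05:", false_positive_rate())
```

The printed share should land close to 0.05, which is the sense in which about one sample in twenty shows a "significant" correlation when none exists.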


"I always struggled with p- values until I heard this helper. First, the null hyothesis is always no difference (i.e. for normality -- the sample is no different than the normal curve, for 2 sample t -- the two samples are no different, etc.). The alternate is always there is a difference.

"Then remember this saying: If the p is high (>.05) the null will fly -- if the p is low (<.05) the null must GO!"

Posted by: Cindy 

Another reader added to Cindy's thoughts:

"Just for completeness, there are other null hypotheses that may be tested for. For example, we may investigate the null hypothesis that data may have come from a Normal distribution (via Anderson-Darling, Ryan-Joiner test, etc.) or that the process mean is equal to some specified value, e.g. H0: mu = 265g, etc.

"Your rhyme is still valid, though one should always consider the sample size selected - would we be able to detect a difference, say, of a certain size if it were truly present? That's why we have power and sample size computational functionality as well, of course."

Posted by: Kmb 
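
Kmb's point about power and sample size can be made concrete with a small calculation. This is a rough sketch, assuming a two-sided one-sample z-test with a known standard deviation, which is simpler than what most statistical packages compute. The 265g target comes from the post above; the 2g shift, the 5g standard deviation, and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test of H0: mu = mu0.

    delta is the true shift (mu - mu0), sigma the known standard deviation,
    n the sample size, and alpha the significance level.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# H0: mu = 265g. Suppose the process mean has truly shifted by 2g
# and sigma is 5g (both values assumed for illustration).
for n in (10, 30, 100):
    print(n, round(z_test_power(delta=2.0, sigma=5.0, n=n), 3))
```

With a small sample the test will usually fail to flag a real 2g shift, so a high p-value there says little. That is why sample size has to be considered alongside the rhyme.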


"In statistics, you can virtually never get data from an entire population so you have to take samples. A P-value is just an indication that there is a high chance that the factor or data is significant although there are never any black or whites. How sensitively you want to analyze the data (or how thoroughly) is where you set your P-values at for significance. I'm not sure of your exact situation, but it sounds like, the higher you set your P-value, the more thorough you are trying to be, particularly during a DOE when you are trying to measure interaction effects. A P-value of .05 may show no significance. However a P-value of .20 may show significance. Basically, it's an arbitrary, but well thought out, level to detect significance of your data. The higher, the more thorough.

"P-values in normality tests are something a little different. If they are above .05, that means chances are good your data is normal. If it is less than .05, chances are good it's not and you should look for other options. Again, this is just an inference made from the sample of the population you took. But you better believe, it's usually dead on."

Posted by: Mark Grubert 


"The p-value is simply the actual level of confidence provided by the model in question (regression, hypothesis test, F test, difference of means, etc.). For example, if you decide to set you confidence level at 0.05, which means that you are willing to allow for a 5% change of erring in your final analysis (i.e. finding a significant association when one does not really exists), and the p-value is 0.0035, then your model passes the test."

Posted by: J. Angulo 


"I'm guessing that you're discussing the p-value related to the correlation coefficient. Of course, the p-value represents the probability of incorrectly rejecting the null hypothesis. If the p-value is less than some significance level, alpha, (typically practitioners use an alpha of 0.05) then we say that the result is statistically significant (at the 5% level) - i.e. the probability of incorrectly rejecting the null hypothesis is less than 5%.

"For the test I think you're alluding to, it would indicate that we would reject the null hypothesis that rho (the true correlation coeff) is equal to zero, hence there may be some evidence to suggest that a linear relation is present. Don't ignore a scatter diagram though, of course!"

Posted by: Kmb 

Another reader continued the explanation:

"KMB is exactly right, but maybe a further discussion will help. Imagine that there is a universe of points from the process you are studying. You take a sample of those to see if you can prove or disprove correlation. The truth that you are assuming is that there is no or null correlation.

"Now you develop some test or way to mathematically relate the sample to some statistic, in this case rho. You then compare it to some reference distribution. The probability (p) that you selected the sample in such a way that you got a sample that shows there is some correlation, i.e. that rho is not zero, when in fact it is zero, the truth you assumed; is the p value. In other words it is the probability that your sample indicates that the state of nature in the universe is different than the truth you assumed when your assumption was the correct one. In this case it is the probability that the rho or correlation is zero when your sample indicates that it is not zero.

"Usually if we have a one in twenty (0.05) chance of making the wrong decision, we are satisfied that there is a difference , i.e. statistical significence, and we reject the null hypothesis that there is no difference. You can set this level based on your need to be right. In drug testing work for instance, a p= 0.01 is often used since the consequences of being wrong are much more severe than being wrong about a knob for a radio."

Posted by: Dave Strouse 

And yet another reader continued the explanation:

"What KMB and Dave are saying is...If the p-value for a correlation coefficient test is less than 0.05, it indicates that the correlation coefficient IS significantly different from zero (either positive or negative) at the alpha = 0.05 level. This means that there is some significant amount of linear relationship between your two variables of interest. This test uses a test statistic t0=[r*SQRT(n-2)]/SQRT(1-r^2). It has been proven that IF the true correlation coefficient is equal to zero, then t0 will follow the t-distribution with n-2 degrees of freedom.

"Extreme values of this t0 statistic are indicative that t0 does not actually follow a t-distribution with n-2 df, thus it indicates that the true correlation coefficient is NOT equal to zero. Extreme values of t0 are characterized by small p-values. Thus small p-values indicate that the true correlation coefficient is NOT equal to zero."

Posted by: Dave Strouse 
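
The test statistic quoted above, t0=[r*SQRT(n-2)]/SQRT(1-r^2), can be worked through in a few lines. The sketch below uses only the Python standard library and obtains the two-sided p-value by numerically integrating the t-distribution density with Simpson's rule. That integration is a stand-in chosen for illustration, not how a statistics package does it, and the function names and example data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t0 = r * sqrt(n - 2) / sqrt(1 - r^2); undefined when |r| = 1."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t0, df, steps=2000):
    """P(|T| >= |t0|) for T ~ t(df), via Simpson's rule on [0, |t0|]."""
    a = abs(t0)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t0|
    return max(0.0, 1 - 2 * central)

def correlation_test(x, y):
    """Return (r, t0, p) for the null hypothesis rho = 0."""
    n = len(x)
    r = pearson_r(x, y)
    t0 = t_statistic(r, n)
    return r, t0, two_sided_p_value(t0, n - 2)

# Hypothetical paired measurements (illustrative only).
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
r, t0, p = correlation_test(x, y)
print("r =", round(r, 3), " t0 =", round(t0, 3), " p =", round(p, 4))
```

A large |t0| leaves little area in the tails of the t-distribution with n-2 degrees of freedom, which is why extreme values of t0 go hand in hand with small p-values.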

 
