Determine # & Size of Bins for Histogram

 
 
Message: 34726
Posted by: Mike
Posted on: Tuesday, 21st October 2003

Hi, I have several thousand rows of data which I need to group into appropriately sized bins for a histogram. The problem is in determining the correct number and size of the bins. For example, if I had a bunch of test scores and didn't have pre-determined bins (A, B, C, D, F) that were already sized (>= 90, 80-89, 70-79, 60-69, < 60), how would I figure out the number and size of the bins? Thanks!


Message: 34732
Posted by: Gabriel
Posted on: Tuesday, 21st October 2003

I learned the following steps some time ago and have used them ever since. They have no theoretical basis as far as I know, but they seem to work fine:

1) Number of bins (first trial): n1=sqrt(N-1)-1, where N is the number of individuals.

2) Bin size (first trial): s1=(max-min)/n1, where max and min are the highest and lowest values in the data.

3) Bin size (definitive): s=Round UP s1 to the precision of the data. For example, if s1=0.32 mm and the data is in 0.1 mm format, then s=0.4 mm.

4) The lower limit of the first bin will be min-(1/2 of the precision). The upper limit of the first bin will be the lower limit + s, and this will also be the lower limit of the second bin. Add another s to get the upper limit of the second bin, which will also be the lower limit of the third bin, and so on.

As said, this is a guideline. If you don't like the result, you can increase or reduce the bin size, but always keep the size a multiple of the precision as said in point 3) (if not, some bins will contain more possible results than others, and the bars of those bins will be artificially higher). Also, always keep the limits of the bins "between" possible readings as said in point 4); if not, the bins will be "unbalanced". For example, a bin "larger than 10, up to 12" has its center at 11, but if the resolution is 1 the possible results are 11 and 12, which have their center at 11.5. A bin (10.5; 12.5) has its center at 11.5, which matches the center of the possible results. By the way, you then don't have to worry about whether it is "larger" or "larger or equal" than 10.5 and "lower" or "lower or equal" than 12.5, because you will never have a data point "equal" to 10.5 or 12.5 anyway.
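
For anyone who wants to try the four steps in code, here is a rough Python sketch. The function name `gabriel_bins` and the `precision` argument are made up for illustration, and the small rounding guard in step 3 (to avoid floating-point noise pushing the result up by one extra step) is an assumption, not part of the recipe itself.

```python
import math

def gabriel_bins(data, precision):
    """Return (bin_size, bin_edges) following the four steps in the post."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 data points")
    lo, hi = min(data), max(data)

    n1 = math.sqrt(n - 1) - 1          # step 1: first-trial number of bins
    s1 = (hi - lo) / n1                # step 2: first-trial bin size

    # step 3: round s1 UP to a whole multiple of the data precision
    # (the inner round() is an assumed guard against float noise)
    steps = max(1, math.ceil(round(s1 / precision, 9)))
    s = steps * precision

    # step 4: start half a precision step below the minimum, then add s
    # repeatedly until the maximum is covered
    start = lo - precision / 2
    edges = [start]
    while edges[-1] <= hi:
        edges.append(start + len(edges) * s)
    return s, edges
```

For example, `gabriel_bins(list(range(10, 60)), 1)` gives a bin size of 9 and edges 9.5, 18.5, 27.5, 36.5, 45.5, 54.5, 63.5, so every edge falls between possible readings.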


Message: 34733
Posted by: Heebeegeebee BB
Posted on: Tuesday, 21st October 2003

Check this link out:

http://www.sytsma.com/tqmtools/hist.html


Message: 34909
Posted by: Mike
Posted on: Thursday, 23rd October 2003

Thanks for the replies Gabriel & Heebeegeebee!

The following two are from published studies:

1) bin width = 3.49*σ*N^(-1/3)
2) bin width = 2*(IQR)*N^(-1/3)

where σ = standard deviation; IQR = 75th pctl - 25th pctl; N = number of samples; and the number of bins would be based on dividing the dataset range by the bin width.
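
As a rough sketch, the two width formulas above could be computed in Python like this. The function names are hypothetical, the spread term in formula 1 is taken to be the sample standard deviation, and the quartiles come from `statistics.quantiles` with its default method; other quartile conventions will give slightly different widths. Rounding the number of bins up is also an assumption.

```python
import math
import statistics

def width_formula_1(data):
    """bin width = 3.49 * (sample standard deviation) * N^(-1/3)"""
    return 3.49 * statistics.stdev(data) * len(data) ** (-1 / 3)

def width_formula_2(data):
    """bin width = 2 * IQR * N^(-1/3)"""
    q1, _, q3 = statistics.quantiles(data, n=4)   # 25th, 50th, 75th pctl
    return 2 * (q3 - q1) * len(data) ** (-1 / 3)

def number_of_bins(data, width):
    """Dataset range divided by the bin width, rounded up (assumed)."""
    return math.ceil((max(data) - min(data)) / width)
```

Usage would be along the lines of `number_of_bins(values, width_formula_2(values))`, where `values` is the list of readings.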

This one is a rule of thumb I found on the Internet:

3) number of bins = 1+3.3*ln(N), where the bin width would be the dataset range divided by the number of bins
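
A minimal Python sketch of rule 3 follows. It uses the natural log exactly as written in the post; note that the widely cited form of this rule (Sturges' rule) is usually written with a base-10 log, which gives far fewer bins, so the log function is left as a parameter. The function names and the choice to round up are assumptions for illustration.

```python
import math

def rule3_bins(n, log=math.log):
    """number of bins = 1 + 3.3 * log(N), rounded up (rounding is assumed)."""
    return math.ceil(1 + 3.3 * log(n))

def rule3_width(data, log=math.log):
    """Bin width = dataset range divided by the number of bins."""
    return (max(data) - min(data)) / rule3_bins(len(data), log)
```

For N = 1000 this gives 24 bins with the natural log and 11 bins with `math.log10`.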

4) I've also tried Excel's built-in data analysis tools.

5) Gabriel's method

Here is what I get with the test data I'm reviewing (I've left out some small % of some bins so it won't total 100%):
1) bin width = 888; number of bins = 338; 97% of items in one bin, 1% in next bin, then 1%
2) bin width = 17; number of bins = 17564; 20% of items in one bin, 13% in next bin, then 11%,6%,5%,5%,3%,3%,3%,2%,2%
3) bin width = 9606; number of bins = 31; 99% of items in one bin, 1% in next bin
4) bin width = 3093; number of bins = 97; 99% of items in one bin, 1% in next bin
5) bin width = 3158, that's as far as I took it

All of these give way too many bins, because most of the data is clustered below a certain value and the range between the lowest and highest numbers is quite large.
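
The effect can be reproduced with a small made-up example. The data below is hypothetical (98 small values plus two extreme ones) and is not the test data discussed above; `bin_shares` is an illustrative helper that splits the full range into equal-width bins and reports the share of items in each.

```python
def bin_shares(data, n_bins):
    """Share of items falling in each of n_bins equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("all values are identical")
    counts = [0] * n_bins
    for x in data:
        # the maximum value belongs to the last bin
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return [c / len(data) for c in counts]

# Hypothetical skewed data: most values are small, two are very large
data = list(range(1, 99)) + [5000, 10000]
shares = bin_shares(data, 10)
print(shares)
```

Here 98% of the items land in the first bin, because the two outliers stretch the range and therefore the bin width.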


Message: 101928
Posted by: hide
Posted on: Sunday, 1st October 2006

This page provides a method for selecting the histogram bin size (or number of bins) for your data.

http://www.ton.scphys.kyoto-u.ac.jp/~hideaki/res/histogram.html

Best,


 
