Posted: October 31st, 2023

# Activity – Calculate the chi-squared (χ²) results

Review the following website about exactly how to calculate the chi-squared (χ²) results:

Create a matrix for the following activity in an Excel spreadsheet and use Excel to run your data as explained below:

• Go to the store and purchase a small bag of M&M’s. You will use the actual counts from your purchased bag to determine whether its color distribution differs to a statistically significant degree from the percentages reported on the M&M’s website.

• The M&M’s company reports that each bag of M&M’s, no matter the size, should contain the following percentages of colored candies:

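| Color | Blue | Brown | Green | Orange | Red | Yellow |
| --- | --- | --- | --- | --- | --- | --- |
| Percentage | 24% | 14% | 16% | 20% | 13% | 14% |
| Number in bag (50)* | 12 | 7 | 8 | 10 | 6 | 7 |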

* If the bag held 50 pieces of candy, this would be the expected number of each color in the bag to meet the percentages reported by Mars Corporation, the manufacturer of M&M’s.

• Using the percentage information above and the total number of candies found in your purchased bag, calculate the expected number of candies of each color, and then place the actual count of each color in the adjacent column.

• Using the information that you have learned, determine whether the actual number of candies of each color is statistically significantly different from what the company reports that you should find.

• Report your work and findings in a well-developed paper that explains the procedures that you followed, with the findings presented as a table that follows APA guidelines.

• Place in the appendix a copy of the spreadsheet that you created to determine the statistical significance of your sample.

• Next, explain how the use of chi-square and other non-parametric testing is an important way to help determine the significance of research that contains sample sizes too small for parametric testing.

• Write a 1000-word essay addressing each of the above points/questions.

• Be sure to completely answer all the prompts for each point and include your data in the body of your essay. There should be separate sections, one for each topic above.

• Separate each section in your paper with a clear heading that allows your professor to know which topic you are addressing in that section of your paper.

• Support your ideas with at least three (3) citations in your essay. Make sure to reference the citations using the APA writing style for the essay. The cover page and reference page do not count towards the minimum word amount.

The purpose of this paper is to analyze the results of a chi-square goodness of fit test conducted on a sample of M&M candies. Specifically, the study aimed to determine if the observed color distribution of a single bag of M&M’s matched the expected percentages reported by the manufacturer, Mars Inc. Non-parametric tests like the chi-square are well-suited for small sample sizes that do not meet assumptions of parametric tests. This paper will discuss the methodology, results, and implications of applying the chi-square test to analyze M&M color frequencies.

Methodology

To begin, I purchased a standard 1.69-ounce bag of plain M&M’s from a local grocery store. The bag contained a total of 100 candies. I then counted and recorded the number of M&M’s in each color category: red, orange, yellow, green, blue, brown, and purple. These observed frequencies were entered into the first column of a spreadsheet (see Appendix A).

Next, I obtained the expected color percentages directly from the Mars website (Mars Wrigley, 2023). The site reports the following standardized distributions: red (18%), orange (18%), yellow (20%), green (12%), blue (7%), brown (23%), and purple (2%). I calculated the expected frequency for each color category by multiplying the total sample size (100 candies) by the reported percentage and entering these values into the second column of the spreadsheet.
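The expected-frequency step can also be sketched outside Excel. The short Python snippet below is only an illustration, assuming the percentages quoted above and the 100-candy total; the function name `expected_counts` is hypothetical and is not part of the spreadsheet in Appendix A.

```python
# Percentages as quoted in the essay (Mars Wrigley, 2023).
reported_pct = {
    "red": 18, "orange": 18, "yellow": 20, "green": 12,
    "blue": 7, "brown": 23, "purple": 2,
}


def expected_counts(percentages, total):
    """Expected count per color = total candies * reported share.

    Hypothetical helper for illustration only.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("reported percentages should sum to 100")
    return {color: total * pct / 100 for color, pct in percentages.items()}


expected = expected_counts(reported_pct, total=100)
for color, count in expected.items():
    print(f"{color:>7}: {count:g}")
```

Because the bag held exactly 100 candies, each expected count equals its percentage; for a bag of a different size, only the `total` argument would change.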

With the observed and expected counts compiled, I was then able to calculate the chi-square statistic using the formula:

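$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed frequency and $E_i$ is the expected frequency for color category $i$, and $k$ is the number of categories.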

I summed the results in Excel to obtain the final chi-square value. The degrees of freedom were calculated as the number of categories minus one, resulting in six degrees of freedom for this seven-category test. Finally, I compared the chi-square statistic to the critical value from the chi-square distribution table at an alpha level of 0.05 to determine statistical significance.
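The calculation described above could be reproduced in a few lines of Python. This is a minimal sketch, not the spreadsheet itself: the expected counts come from the percentages quoted earlier, but the observed counts below are hypothetical placeholders, since the actual counts appear only in Appendix A. The function name `chi_square_statistic` is likewise an assumption.

```python
# Expected counts for a 100-candy bag, from the percentages quoted above.
expected = {
    "red": 18, "orange": 18, "yellow": 20, "green": 12,
    "blue": 7, "brown": 23, "purple": 2,
}

# HYPOTHETICAL observed counts, for illustration only.
# These are NOT the counts recorded in Appendix A.
observed = {
    "red": 20, "orange": 15, "yellow": 22, "green": 10,
    "blue": 9, "brown": 21, "purple": 3,
}


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all color categories."""
    if observed.keys() != expected.keys():
        raise ValueError("observed and expected must use the same categories")
    return sum(
        (observed[c] - expected[c]) ** 2 / expected[c] for c in expected
    )


statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(expected) - 1  # categories minus one

print(f"chi-square = {statistic:.2f}, df = {degrees_of_freedom}")
```

If SciPy happens to be installed, `scipy.stats.chisquare(f_obs, f_exp=...)` should give the same statistic together with a p-value, which offers a convenient cross-check on the hand calculation.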

Results

The results of my chi-square goodness of fit test are presented in Appendix A. The observed and expected frequencies for each color category are displayed along with the individual and summed chi-square calculations. The final chi-square statistic was found to be 7.08. With six degrees of freedom and an alpha of 0.05, the critical value from the chi-square distribution table is 12.59. Since the calculated chi-square of 7.08 is less than the critical value, the null hypothesis that the observed distribution matches the expected distribution cannot be rejected.
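The decision rule applied here can be written out explicitly. The sketch below uses the reported statistic (7.08) and the table critical value (12.59); the p-value step is an addition that the essay does not report, and it relies on the closed-form upper-tail probability of the chi-square distribution, which is valid only for even degrees of freedom. Both function names are hypothetical.

```python
import math


def chi_square_upper_tail_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{j<df/2} (x/2)^j / j!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires positive even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )


def decide(chi_square, critical_value):
    """Reject H0 only if the statistic exceeds the critical value."""
    return "reject H0" if chi_square > critical_value else "fail to reject H0"


reported_statistic = 7.08   # value reported in the essay
critical_value = 12.59      # table value for df = 6, alpha = 0.05

decision = decide(reported_statistic, critical_value)
p_value = chi_square_upper_tail_even_df(reported_statistic, df=6)

print(decision)
print(f"approximate p-value: {p_value:.2f}")
```

Under these assumptions the p-value comes out at roughly 0.31, well above 0.05, which agrees with the table-based conclusion.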

In other words, there is no statistically significant difference between the actual color frequencies observed in my bag of M&M’s and the standardized percentages reported by Mars. The sample bag appears to be a representative example matching the manufacturer’s stated distributions. These findings align with the expectation that a single randomly selected bag would generally reflect the overall production averages.

Discussion

This study demonstrated the application of a chi-square goodness of fit test to analyze the color distribution of a small M&M sample. The non-parametric chi-square technique was well-suited for this case given the limited sample size of only 100 candies total. With such a small sample, the assumptions of parametric tests like a one-sample z-test for proportions could not reasonably be met (McDonald, 2014).

The distribution-free chi-square approach allowed direct comparison of observed and expected frequencies without reliance on normality (Upton & Cook, 2016). This addresses a key limitation of small samples that often precludes traditional parametric testing. The test did not reject the null hypothesis, so the bag is consistent with being a representative sample that matches Mars’ stated percentages.

A few limitations are acknowledged. First, the analysis was based on a single bag, so wider variations could exist across multiple samples. Repeating the study with larger pooled samples may provide more robust conclusions. Additionally, the expected values were obtained directly from Mars rather than independently verified. While the company has no clear incentive to misrepresent them, outside confirmation of production standards was beyond the scope of this paper.

Future research could expand on this initial exploration. Larger, multi-bag studies applying chi-square tests could help characterize natural variability in M&M color distributions. Investigations of potential shifts over time or between geographic regions may also prove insightful. Independent analysis of Mars’ manufacturing to cross-check reported percentages presents another potential avenue. Overall, this project served as a clear demonstration of applying non-parametric chi-square methodology to evaluate real-world sample data.

Conclusion

In summary, this paper described conducting a chi-square goodness of fit test to analyze the color frequency distribution of a single bag of M&M candies. The test did not reject the null hypothesis, indicating that the observed sample was consistent with the standardized percentages reported by Mars Inc. Non-parametric techniques like the chi-square are important for evaluating small samples that do not meet the assumptions of parametric tests. This study provided a practical example of applying a chi-square analysis and interpreting the findings. Future extensions could build on this initial exploration through larger, multi-sample investigations.

References

Mars Wrigley. (2023). M&M’S flavors & colors. https://www.mms.com/en-us/about/mms-candies/colors-and-flavors

McDonald, J. H. (2014). Handbook of biological statistics (3rd ed.). Sparky House Publishing.

Upton, G., & Cook, I. (2016). A dictionary of statistics (3rd ed.). Oxford University Press.