It offers a large data set whose values span many orders of magnitude. The data set consists of the causes of death reported on death certificates. The co-morbidity is the secondary, or contributing, cause of death reported on the death certificate.
All of the Centers for Disease Control and Prevention (CDC) tables offer the option to export the data set to CSV format. In this video, I will show how to export the data to CSV format, save it to a file, and then remove any non-data fields that would distort the Benford's analysis, while maintaining the integrity of the data.
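The column-removal step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names ("Year", "Year Code", "Cause of death Code", "Deaths") are hypothetical stand-ins for whatever headers the exported file actually contains, and the two sample rows are made-up values, not CDC data.

```python
import csv
import io

# Hypothetical names of the columns to drop (dates and codes);
# the headers in a real export may differ.
DROP_COLUMNS = {"Year", "Year Code", "Cause of death Code"}


def drop_columns(rows, drop):
    """Return copies of the rows without the columns named in `drop`."""
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Toy stand-in for an exported CSV file (made-up values, not CDC data).
sample_csv = io.StringIO(
    "Year,Year Code,Cause of death,Cause of death Code,Deaths\n"
    "2015,2015,Example cause A,X00,1234\n"
    "2015,2015,Example cause B,X01,87\n"
)

rows = list(csv.DictReader(sample_csv))
cleaned = drop_columns(rows, DROP_COLUMNS)
print(cleaned)
```

To run this against a saved export, replace `sample_csv` with an opened file, for example `open("export.csv", newline="")`, where the file name is again just a placeholder.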
After removing the extraneous columns (those holding integer values for dates and CDC codes), we can test the data set against Benford's distribution on the first three digits and interpret the findings.
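As a rough sketch of what such a test compares, Benford's law gives the expected share of values whose leading digits form the number n as log10(1 + 1/n). The code below assumes that "first three digits" means the leading one, two, or three digits taken together (k = 1, 2, 3); the function names are made up for this illustration, and the input is assumed to be a list of positive integers.

```python
import math
from collections import Counter


def leading_digits(value, k):
    """Return the first k digits of a positive integer, or None if it has fewer than k digits."""
    digits = str(abs(int(value))).lstrip("0")
    if len(digits) < k:
        return None
    return int(digits[:k])


def benford_expected(k):
    """Expected Benford share for each possible block of k leading digits."""
    return {n: math.log10(1 + 1 / n) for n in range(10 ** (k - 1), 10 ** k)}


def observed_frequencies(values, k):
    """Observed share of each block of k leading digits, skipping values that are too short."""
    leads = [d for d in (leading_digits(v, k) for v in values) if d is not None]
    counts = Counter(leads)
    total = len(leads)
    return {n: c / total for n, c in counts.items()}


# Toy input (made-up numbers) just to show the calls.
values = [1234, 187, 1020, 45, 3310, 129]
expected = benford_expected(1)
observed = observed_frequencies(values, 1)
for digit in sorted(observed):
    print(digit, round(observed[digit], 3), round(expected[digit], 3))
```

A real analysis would use far more values than this toy list and would add a goodness-of-fit measure, which is left out here to keep the sketch short.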