On further analysis, the Blue Water Quality Board is making a change to the data it collects
on fracking. The measures collected will now be “normalized” to be integer values in the
inclusive range [-1000, +1000].

They will still be collecting data from five (5) different sites just as in the previous
programming assignment.

On each line of the data file, there will always be exactly 5 numerical measures, with the
first coming from a water sample at site 1, the second coming from a water sample at site 2,
the third coming from site 3, the fourth coming from site 4 and the fifth coming from site 5.
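
The line format described above could be read with a loop like the following. This is only a sketch: the function name read_measures, the constants NUM_SITES and MAX_MEASURES, and the data[site][row] layout are illustrative choices, not part of the assignment. It stops at the first line that does not yield five integers, or once 500 lines (the stated per-site maximum) have been stored.

```c
#include <stdio.h>

#define NUM_SITES 5        /* five sites per line, per the assignment */
#define MAX_MEASURES 500   /* at most 500 values per site */

/* Reads lines of five integers from fp into data[site][row].
   Returns the number of complete lines stored. */
int read_measures(FILE *fp, int data[NUM_SITES][MAX_MEASURES])
{
    int n = 0;
    int v[NUM_SITES];

    /* fscanf returns the number of items converted; anything other
       than 5 means end of file or a malformed line. */
    while (n < MAX_MEASURES &&
           fscanf(fp, "%d %d %d %d %d",
                  &v[0], &v[1], &v[2], &v[3], &v[4]) == NUM_SITES) {
        for (int s = 0; s < NUM_SITES; s++)
            data[s][n] = v[s];
        n++;
    }
    return n;
}
```

Storing the values per site (rather than per line) keeps each site's measures contiguous, which is convenient for sorting them later.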

You are being asked to develop a C program that can read in the data from the input file
the State Water Quality Board supplies. Your program is to determine and print out the
following data regarding the contamination measures for each site:

1. The minimum contamination measure
2. The maximum contamination measure
3. The range of the contamination measures (i.e., the difference between the
   maximum and the minimum measure)
4. The mean (arithmetic average) of the contamination measures
5. The standard deviation of the contamination measures
6. The median
7. The mode

There will be at most 500 values collected for each site.

The median is defined as the middle-most value in the set of data for a given site once
that data is sorted from low to high. For example, if there were 13 data points, the 7th data
point in the sorted list would be the median. If a data set has an even number of values (for
example 12), then the median would be the sum of the 6th and 7th elements divided by 2.
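
The median rule above translates to zero-based array indices as follows: for an odd count n the middle element is v[n/2], and for an even count the two middle elements are v[n/2 - 1] and v[n/2]. A minimal sketch, using the standard library's qsort and the hypothetical names cmp_int and site_median:

```c
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids overflow of x - y */
}

/* Sorts v in place (low to high) and returns its median.
   Requires n >= 1. */
double site_median(int *v, int n)
{
    qsort(v, (size_t)n, sizeof v[0], cmp_int);
    if (n % 2 == 1)
        return v[n / 2];                      /* e.g. 7th of 13 */
    return (v[n / 2 - 1] + v[n / 2]) / 2.0;   /* e.g. 6th and 7th of 12 */
}
```

Dividing by 2.0 rather than 2 keeps the fractional part when the two middle values have an odd sum.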

The mode is defined as that data value that occurs most frequently in the data set for a
given site. Again, this is easier to determine if the data is sorted. For example, suppose we
had a data set with the following values:

-100 -100 -88 -75 -20 0 0 0 0 0 10 20 20 20 199 199

The mode for that data set would be 0 since 0 occurs 5 times. The question then is what to do
when two or more distinct values are tied for the most occurrences. For example, suppose the
data set was:

-100 -100 -88 -75 -20 0 0 0 0 0 10 20 20 20 20 20 199 199

Here both 0 and 20 occur five times. In that case, report the largest of the tied values (20)
as the mode. In the “worst case,” where every value in a data set is unique, report “None” for
the mode since there is no mode. Also report the number of occurrences of the mode value in
the data set (report 0 for the “worst case”).
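
One way to apply these rules is a single scan over the sorted data that measures each run of equal values. The sketch below assumes the array is already sorted from low to high; ModeResult and find_mode are hypothetical names. Because later runs hold larger values, accepting a run that is at least as long as the best one so far resolves ties in favor of the largest value.

```c
typedef struct {
    int has_mode;   /* 0 means every value is unique: report "None" */
    int value;      /* the mode, meaningful only if has_mode != 0 */
    int count;      /* occurrences of the mode, 0 if there is none */
} ModeResult;

/* Finds the mode of an array sorted in ascending order. */
ModeResult find_mode(const int *sorted, int n)
{
    ModeResult r = { 0, 0, 0 };
    int i = 0;

    while (i < n) {
        int j = i;
        while (j < n && sorted[j] == sorted[i])
            j++;                  /* j ends one past the current run */
        int run = j - i;
        /* A run of length 1 never counts as a mode; ">=" lets a
           later (larger) value win a tie. */
        if (run > 1 && run >= r.count) {
            r.has_mode = 1;
            r.value = sorted[i];
            r.count = run;
        }
        i = j;
    }
    return r;
}
```

For the second data set above, this scan first records 0 with a count of 5 and then replaces it with 20, which also has a count of 5.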


Here is a small sample data set for the assignment and an indication of what is to be printed.

A Small Sample Input Set

 -50    28    11   999     9
-100    27     4   -22   -87
 -50    27     3   -17   -87
  25    27     6   -22   -87
 -50    27     5   998   -87
  25     0     9     2   -87
 -50   -99     8   -10   -87
  25   -99     7   998   -87
   0   -99    10    18     9
-100   -99     2     2   -87
 -50   -99     1   998   -87

Sample Output

Number Measures Processed = 11

Site                1          2          3          4          5
Minimum          -100        -99          1        -22        -87
Maximum            25         28         11        999          9
Range             125        127         10       1021         96
Mean         -34.0909   -32.6364     6.0000   358.5455   -69.5455
Std. Dev.     46.4660    64.0223     3.3166   507.3060    38.8339
Median       -50.0000     0.0000     6.0000     2.0000   -87.0000
Mode              -50        -99       None        998        -87
Occur               5          5          0          3          9

Note that the Mean, Standard Deviation and Median are still to be reported to 4 decimal places.
Everything else is now reported as integer values.
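
The formatting requirement maps onto printf-style conversions: %.4f for the three real-valued rows and %d for the integer rows, with the text "None" substituted when a site has no mode. The sketch below builds rows into a buffer with snprintf; the function names, the column widths (10 for the label, 12 per site) and the convention that a count of 0 means "no mode" are assumptions made for illustration, since the assignment does not fix an exact layout.

```c
#include <stdio.h>

#define NUM_SITES 5

/* Formats one row of real values to 4 decimal places. */
int format_real_row(char *buf, size_t size, const char *label,
                    const double v[NUM_SITES])
{
    return snprintf(buf, size, "%-10s%12.4f%12.4f%12.4f%12.4f%12.4f",
                    label, v[0], v[1], v[2], v[3], v[4]);
}

/* Formats the Mode row; a site whose count is 0 prints "None". */
int format_mode_row(char *buf, size_t size,
                    const int value[NUM_SITES], const int count[NUM_SITES])
{
    int len = snprintf(buf, size, "%-10s", "Mode");

    for (int s = 0; s < NUM_SITES && len >= 0 && (size_t)len < size; s++) {
        if (count[s] == 0)
            len += snprintf(buf + len, size - (size_t)len, "%12s", "None");
        else
            len += snprintf(buf + len, size - (size_t)len, "%12d", value[s]);
    }
    return len;
}
```

Each finished buffer can then be written with puts or fputs; the integer rows (Minimum, Maximum, Range, Occur) would follow the same pattern using %12d.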
