Code for finding the frequency of chars in a text file

Hi, I'm working on a bit of code that's supposed to populate an array with the frequency of each letter in a text file. It's segfaulting in the while loop and I'm not sure why. Can anyone help?

  fstream bucky;
  char c[26] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};
  int f[26] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
  string file_input = argv[1];
  bucky.open( file_input );
  if( !bucky.is_open() )
  {
    cout << "Cannot open " << file_input << " for input" << endl;
    exit( EXIT_FAILURE );
  }
  cout << "debug1" << endl;

  char ch;
  string cur;
  while( bucky.good() ){
    bucky >> cur;
    for(int i=0; i<cur.length(); i++){
      ch = cur[i];
      f[ ch - 97] = f[ ch - 97] + 1;
    }
  }
Any chance your file has something other than a lower-case letter?
It's a lot easier, for ASCII/1-byte text, to just make f[256] and then increment f[ch] for each character (cast ch to unsigned char first, since a plain char may be negative). The subtraction that squeezes everything down to 26 letters is probably the issue: as soon as you hit anything outside a-z, the index is out of bounds and it could crash.

You can zero the whole array in one go:
int f[256] = {0}; // all zero; this shortcut only works for zero, because the elements after the first are zero-initialized
Now this works perfectly well:
f['a']++;
or
f['A']++;
or even
f['?']++;
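
Here is a minimal sketch of the 256-entry table idea as a complete function. The name countBytes is hypothetical (it is not from the original post), and the sketch assumes a 1-byte encoding such as ASCII. It reads one character at a time and converts each one to unsigned char before indexing, so a negative char value cannot produce a negative index.

```cpp
#include <array>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file

// Count how often each byte value occurs in a stream.
// countBytes is a hypothetical helper name, not from the original post.
std::array<int, 256> countBytes(std::istream& in) {
    std::array<int, 256> freq{};  // value-initialized: every counter starts at 0
    char ch;
    while (in.get(ch)) {          // the loop ends as soon as a read fails
        // plain char may be signed, so convert before using it as an index
        ++freq[static_cast<unsigned char>(ch)];
    }
    return freq;
}
```

You could pass it the opened file stream, or a std::istringstream for a quick check. Unlike reading word by word with >>, get() also counts spaces and newlines.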
Line 19 (f[ ch - 97] = f[ ch - 97] + 1;) is of course dangerous. If a character falls outside 'a' to 'z', the index is out of range; that is undefined behaviour and can crash.

You can filter those characters out:
    for(int i=0; i<cur.length(); i++){
      int idx = cur[i] - 97;
      if((idx < 0) || (idx >= 26))
        cout << "Invalid character: " << cur[i];
      else
        ++f[idx];
    }
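
The same range check can be written with 'a' instead of the magic number 97. The sketch below is one possible variant, not the poster's code: letterIndex and countLetters are hypothetical names, and folding upper case into lower case with std::tolower is an added assumption about what you want to count. Like the original, it assumes a character set where 'a' to 'z' are contiguous.

```cpp
#include <array>
#include <cctype>
#include <string>

// Map a character to 0..25 for 'a'..'z' (upper case is folded to lower case),
// or return -1 for anything else. letterIndex is a hypothetical helper name.
int letterIndex(char ch) {
    // std::tolower expects a value representable as unsigned char
    int lower = std::tolower(static_cast<unsigned char>(ch));
    if (lower < 'a' || lower > 'z') return -1;
    return lower - 'a';
}

// Add the letters of one word to a 26-entry table, skipping everything else.
void countLetters(const std::string& word, std::array<int, 26>& freq) {
    for (char ch : word) {
        int idx = letterIndex(ch);
        if (idx >= 0) ++freq[idx];  // only valid indices ever reach the array
    }
}
```

Keeping the check in one small function means the array access never sees an unchecked index.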

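Another thing: currently you can process the last word twice, because the stream only goes bad when the read on line 16 (bucky >> cur;) fails, not at the check on line 15 (while( bucky.good() ){). After the failed read, cur still holds the previous word. So merge lines 15 and 16 into a single line: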

while( bucky >> cur ) {
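
To see the difference between the two loop forms, here is a small sketch that counts how many times each loop body runs. The function names are hypothetical, and it reads from a string through std::istringstream instead of a file so it is easy to try.

```cpp
#include <sstream>
#include <string>

// Loop body runs whenever the stream was good BEFORE the read,
// even if the read itself then fails.
int iterationsTestingGood(const std::string& text) {
    std::istringstream in(text);
    std::string cur;
    int n = 0;
    while (in.good()) {
        in >> cur;  // may fail at end of input, leaving cur unchanged
        ++n;
    }
    return n;
}

// Loop body runs only when the read itself succeeded.
int iterationsTestingRead(const std::string& text) {
    std::istringstream in(text);
    std::string cur;
    int n = 0;
    while (in >> cur) {
        ++n;
    }
    return n;
}
```

For the input "abc def\n" the first function returns 3 and the second returns 2: the trailing newline keeps the stream good after "def" is read, so the good() loop runs its body once more on a failed read. Without trailing whitespace both return 2, which is why the bug only shows up with some files.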