Ascii Code

I'm having a bit of difficulty understanding how and why this example works. It converts all ASCII letters from lowercase to uppercase using this while loop...

void capitalize(char text[]) {
   int i=0;
   while (text[i] != '\0') {
      if (text[i] >= 'a' &&  text[i] <= 'z') {
         text[i] += 'A'-'a';
      }
      i++;
   }
}


How is it that by subtracting the two, it actually works?
C and C++ treat characters as small integers. That is, on a system that uses ASCII, the character literal 'a' has the numerical value 0x61 (or 97 if you don't like hex).

This can be easily demonstrated:

if('a' == 97)
{
    cout << "This will print, because 'a'==97";
}


It just so happens that in ASCII, all lowercase letters are arranged in numerical order: 'a'==97, 'b'==98, and so on. The same goes for uppercase letters: 'A'==65, 'B'==66, and so on.
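
If you want the compiler to confirm those numbers, here is a minimal sketch. It assumes your compiler uses an ASCII-compatible character set (the C++ standard itself does not promise that letters are contiguous), and `letter_index` is just a name made up for this illustration.

```cpp
// Assumption: ASCII-compatible execution character set.
// These checks fail at compile time on a system where that does not hold.
static_assert('a' == 97 && 'z' == 122, "lowercase letters occupy 97..122 in ASCII");
static_assert('A' == 65 && 'Z' == 90, "uppercase letters occupy 65..90 in ASCII");
static_assert('a' - 'A' == 32, "each lowercase letter sits 32 above its uppercase twin");

// Hypothetical helper: position of a lowercase letter in the alphabet,
// 0 for 'a', 1 for 'b', ..., 25 for 'z'.
constexpr int letter_index(char c) {
    return c - 'a';
}
```

So `letter_index('f')` is 5, because 'f' is 102 and 'a' is 97.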


With that in mind:

char somechar = 'f';  // or any value between 'a' and 'z'

somechar -= 'a';  // <- this makes it a zero base.
   // i.e., where we once had 97 for a, we now have 0 for a
   // and if we had 98 for b, we now have 1 for b
   // etc

somechar += 'A';  // <- this takes the zero base and moves it to uppercase-base
   // so where we had 0 for A, we now have 65 for A
   // and where we had 1 for B, we now have 66 for B
   // etc.


// Of course, doing this:
//    somechar -= 'a';
//    somechar += 'A';

// is mathematically the same as doing this:
//    somechar += 'A'-'a';
// (shown as comments so the conversion above is not applied a second time)

It might be easier to understand if you start by writing it as:
text[i] = (text[i] - 'a') + 'A';
Since you already know that text[i] is a lowercase letter, text[i] - 'a' is a value between 0 and 25 representing the letters a to z. This takes advantage of the fact that letters are encoded contiguously: 'a'+1 == 'b', 'b'+1 == 'c', etc.

Now to get an upper case letter just take that 0-25 value, and add it to 'A'.
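
The two steps above can be sketched for a single character like this. It is only an illustration, again assuming ASCII; `to_upper_ascii` is a hypothetical name, not a standard function.

```cpp
// Hypothetical helper, assuming an ASCII-compatible character set.
// Step 1: (c - 'a') turns a lowercase letter into an index 0..25.
// Step 2: adding that index to 'A' lands on the matching uppercase letter.
// Anything that is not a lowercase letter is returned unchanged.
constexpr char to_upper_ascii(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>('A' + (c - 'a')) : c;
}

static_assert(to_upper_ascii('a') == 'A', "index 0 maps to 'A'");
static_assert(to_upper_ascii('f') == 'F', "'f' is index 5, and 'A' + 5 is 'F'");
static_assert(to_upper_ascii('z') == 'Z', "index 25 maps to 'Z'");
static_assert(to_upper_ascii('7') == '7', "non-letters are left alone");
```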

Once you understand the algorithm, the rest is just algebra:
text[i] = (text[i] - 'a') + 'A'; is the same as
text[i] = text[i] + ('A' - 'a'); which can be rewritten as:
text[i] += 'A' - 'a';
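
If you don't trust the algebra, you can check it by brute force. The sketch below compares both forms against `std::toupper` from `<cctype>` for every letter from a to z. The function names are made up for this example, and the two arithmetic forms assume ASCII; `std::toupper` is the portable choice in real code.

```cpp
#include <cctype>

// Form 1: zero-base the letter, then re-base it on 'A'.
constexpr char via_index(char c) {
    return static_cast<char>((c - 'a') + 'A');
}

// Form 2: add the fixed offset 'A' - 'a' (which is -32 in ASCII).
constexpr char via_offset(char c) {
    return static_cast<char>(c + ('A' - 'a'));
}

// Returns true if both forms match std::toupper for every letter a..z.
bool forms_agree() {
    for (char c = 'a'; c <= 'z'; ++c) {
        // std::toupper expects a value representable as unsigned char.
        const char expected =
            static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        if (via_index(c) != expected || via_offset(c) != expected) {
            return false;
        }
    }
    return true;
}
```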