Stuck on a problem

> ' ' isn't a magic number.
> If you argue that ' ' won't always equal the numerical value
> between lower and upper case letters in the ASCII table
it is a magic number
and my argument is not that ' ' may not be 32 or that 'a'-'A' may not be 32,
but that that is not ingrained, that's a piece of trivia that you'll forget in a week
it is not obvious

suppose that you want to convert ['a'..'j'] to ['0'..'9']; you may do c - '1', but that's horrendous
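A quick sketch of that (assuming ASCII, where 'a' - '0' happens to equal 49, i.e. '1'):

#include <cstdio>

int main()
{
    char c = 'd';                       // some letter in ['a'..'j']

    char clear   = c - 'a' + '0';       // index of the letter, re-based at '0': obvious at a glance
    char cryptic = c - '1';             // same result on an ASCII system, but only by coincidence

    printf("%c %c\n", clear, cryptic);  // prints: 3 3
}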

> how would assuming that the lower case value of a letter will always be
> higher than the upper case one in ('a' - 'A') be any different?
because it still works even if that assumption doesn't hold
c - ('a'-'A') is equivalent to c - 'a' + 'A'
now read the second snip, c-'a' is the distance to 'a', the index of the letter
then you add that distance to 'A' to find the one with an equivalent index
it does not matter if 'a'<'A'
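Spelled out as code (a minimal sketch; the variable names are mine):

#include <cstdio>

int main()
{
    char c = 'g';

    int  index = c - 'a';       // distance to 'a': 0 for 'a', 1 for 'b', ...
    char upper = 'A' + index;   // the letter at the same index in the upper case run

    printf("%c -> %c\n", c, upper);   // prints: g -> G
}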


> Or that each letter will have the same value apart from it's counterpart as every other letter does?
let's say that that's the case, and so c - 'a' + 'A' is wrong;
you can tell at a glance that it assumes that supposition.
The character set is as much a property of the target platform as the size of the integer types is.

I'm not really understanding this. C++ mandates the use of ASCII, so how can character sets be a property of the platform?

Because the problem statement explicitly allows making that assumption.

Yes, but that's dodging the point, which is that C++ is running on the ASCII system. If character sets suddenly became dependent on the machine, or on the "implementation" (I'm not sure what you mean by that), then most code depending on them would break and wouldn't even resemble portable code.

This character set is sufficient to implement OP's problem

Well, yes, but who exactly is going to implement this? Would that be a hard-coded character set? Is there another way to make C++ just recognize this new character set without hard coding it?

The assumptions you're making are too strong.

Are they though? The problem says "Assuming the ASCII value for ‘a’ is greater than ‘A’", which clearly states using the ASCII system. If you have different values or "character set", it's no longer ASCII but something made up or a new system.

Learning how to read a requirements spec is a basic engineering skill that you'd better acquire.

I wouldn't consider this comparable. Professor assignments are always vaguely worded because they fear they'll otherwise give away how to code certain things. When programming my assignments, most of the time isn't even spent coding, but trying to decode broken, vague English.

If a requirement spec sheet like this showed up in my email, I'd probably ask for more information about it, since it seems like it's trying to allude to something but not actually saying what - very similar to all coding assignments I've ever gotten.

If a spec says something like "the program may assume that integers are always 32 bits long"

There's no ambiguity there. It wouldn't make sense for me to screw up programming under this assumption, even if my comprehension skills were questionable - I'd just be highly incompetent.

The difference between this and the assignment details is that this pretend spec sheet gives a clear restriction. If you were to restrict yourself only to the assumptions given in this assignment, assuming there's an actual solution to the problem would also be too "high" an assumption. After all, it doesn't say that the difference between 'a' and 'A' will be the same as the difference between every other character's lower and upper case versions.

I don't see how my very minor assumption that the language standard for character sets is going to be used, especially when it's referenced in the problem, is going out on a limb.
it is a magic number

Sure, let's call it a magic space.

and my argument is not that ' ' may not be 32 or that 'a'-'A' may not be 32,
but that that is not ingrained

I must be failing to understand something, because ASCII is the language standard. ASCII hasn't changed in this century. If the standard for the language changed to something like Unicode, none of the code here would work at all.

suppose that you want to convert ['a'..'j'] to ['0'..'9']; you may do c - '1', but that's horrendous

Is that the only reason for not doing 'a' - ' ' ?

c - ('a'-'A') is equivalent to c - 'a' + 'A'

In which universe is this?


I must be missing something major here. Other than the leftover EBCDIC system and the small scale adoption of some Unicode in strings, how can you use standard C++ with something other than ASCII? 'a' is equal to 97, there's no way around this if C++ is going by the ASCII system. Otherwise, it wouldn't be ASCII anymore. I can't see how this would be not "ingrained".
Sure, let's call it a magic space.

You do realize ALL characters are really numbers, right? A char variable contains a number.

#include <cstdio>

int main()
{
   char ch = 'A';

   printf("Character: %c\n\n", ch);

   printf("Number: %i\n", ch);
}

Character: A

Number: 65

Same variable, same value, interpreted differently for output. (It was easier to illustrate this using printf() than std::cout)
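For completeness, a std::cout version would only need a cast to get the numeric interpretation (a minimal sketch):

#include <iostream>

int main()
{
   char ch = 'A';

   std::cout << "Character: " << ch << "\n\n";
   std::cout << "Number: " << static_cast<int>(ch) << "\n";   // the cast forces numeric output
}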
You do realize ALL characters are really numbers, right? A char variable contains a number.

Yes, which makes using 32 or ' ' no more magical than ('a' - 'A'). The only argument I'm finding is that ('a' - 'A') won't "always" equal 32 somehow while also following the ASCII system.
C++ mandates the use of ASCII
Cite the part of the standard that makes that specification.

Yes, but that's dodging the point, which is that C++ is running on the ASCII system.
This is incorrect.

Well, yes, but who exactly is going to implement this?
For example, the program might be running as embedded software in a billboard, where the display matrix directly reads memory and translates numeric values to characters. In that case, the character set would be implemented in the hardware.

The problem says "Assuming the ASCII value for ‘a’ is greater than ‘A’"
I interpret the text as those being OP's words, not part of the problem.

If a requirement spec sheet like this showed up in my email, I'd probably ask for more information about it, since it seems like it's trying to allude to something but not actually saying what
If someone wrote a spec like that, it's quite possible it was done deliberately, because they want to support the most common case but also leave the door open to variation in the future.

The difference between this and the assignment details is that this pretend spec sheet gives a clear restriction. If you were to restrict yourself only to the assumptions given in this assignment, assuming there's an actual solution to the problem would also be too "high" an assumption. After all, it doesn't say that the difference between 'a' and 'A' will be the same as the difference between every other character's lower and upper case versions.
Fine. If we interpret the problem as being underspecified, then there's no solution. Otherwise, c - ('a' - 'A') is a solution. c - ' ' is a solution to a different problem.

> c - ('a'-'A') is equivalent to c - 'a' + 'A'

In which universe is this?
It's just basic algebra.
x - (y - z) = x - y + z
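If there's any doubt, the compiler can confirm it at compile time (a quick check; 'm' is just an arbitrary letter):

static_assert('m' - ('a' - 'A') == 'm' - 'a' + 'A', "same value either way");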
Is that the only reason for not doing 'a' - ' ' ?


No. 'a' - ' ' is not intuitively obvious to anybody except those who happen to have seen this obscure fact and remembered it. Let's say you wrote code for my company:

char myToUpper(char lowerCaseLetter)
{
    // Convert to upper case
    char upperCaseLetter = lowerCaseLetter - ' ';

    return upperCaseLetter;
}


Even assuming that the code is written for an ASCII environment, if you brought this to me for a code inspection (after explaining why toupper could not be used), I would reject the code because it is not clear. There is nothing about ' ' that suggests the distance between lower and upper case letters other than the knowledge that its ASCII value is arbitrarily equal to that distance. Anybody in my department would require that something like one of the following be used:

char myToUpper(char lowerCaseLetter)
{
    char upperCaseLetter;

    // Subtract the upper case to lower case offset to get the upper case letter.
    upperCaseLetter = lowerCaseLetter - ('a' - 'A');
 
    // or ...

    // Offset constant for all letters.
    int lowerToUpperOffset = 'a' - 'A';
    upperCaseLetter = lowerCaseLetter - lowerToUpperOffset;


    // or ...

    // The offset from upper case to lower case letters in ASCII
    int asciiLowerCaseOffset = 32;
    upperCaseLetter = lowerCaseLetter - asciiLowerCaseOffset;

    return upperCaseLetter;
}


In all cases the intent of the offset value is clear, and the comments reinforce that clarity. In no case is the magic character ' ' found.
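For what it's worth, any of those bodies passes the same quick test (a usage sketch, using the first version; it still assumes a contiguous 'a'..'z' as in ASCII):

#include <cstdio>

char myToUpper(char lowerCaseLetter)
{
    // Subtract the upper case to lower case offset to get the upper case letter.
    return lowerCaseLetter - ('a' - 'A');
}

int main()
{
    for (char c = 'a'; c <= 'z'; ++c)
        printf("%c", myToUpper(c));   // prints ABCDEFGHIJKLMNOPQRSTUVWXYZ on an ASCII system
    printf("\n");
}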
Cite the part of the standard that makes that specification.

I suppose there isn't, huh? I do recall Duthomhas stating something along the lines of C++ practically mandating the use of ASCII. But really, there isn't any reason to assume ASCII isn't what's being used for this assignment and won't be what's used in C++ today and tomorrow. The only other system that would make any sense to go to would be Unicode, which would leave all code here non-functional.

For example, the program might be running as embedded software in a billboard, where the display matrix directly reads memory and translates numeric values to characters. In that case, the character set would be implemented in the hardware.

That seems like a notable detail that wouldn't be vaguely worded in a real-world situation. But this is a basic C++ assignment. Either way, this argument won't go anywhere. Using ('a' - 'A') is a better solution since it fits with what the professor was talking about. My arguments here weren't about that, but rather whether or not c - ' ' would even work if character sets changed, as if ('a' - 'A') would survive a character set change.

https://stackoverflow.com/questions/29381067/does-c-and-c-guarantee-the-ascii-of-a-f-and-a-f-characters

It's just basic algebra.

My bad, I was assuming parentheses around 'a' + 'A'

No. 'a' - ' ' is not intuitively obvious to anybody

That's a fine position to take and I won't argue against that. This is a valid view; my argument was more about the validity of the logic.
The only other system that would make any sense to go to would be Unicode, which would leave all code here non-functional.
ASCII is a subset of Unicode, so...

That seems like a notable detail that wouldn't be vaguely worded in a real-world situation.
If you write your programs carefully, they can run anywhere, including such arcane hardware.
ASCII is a subset of Unicode, so...

The end is near.

If you write your programs carefully, they can run anywhere, including such arcane hardware.

How in that situation? If the character system is too different, it can't possibly be portable code.
For example, a program that simply sent the string literal "Hello World!" to stdout would print that string in a normal x86 environment, and would display that string on the billboard in the hypothetical platform I mentioned earlier. The source code merely specifies that the program needs to display a string of characters. Figuring out the string of bytes that translate to those characters is the compiler's job.
Now, if instead of writing std::cout << "Hello World!"; you write std::cout << (char)0x48 << (char)0x65 << (char)0x6C << (char)0x6C /*...*/;, yes that won't work if the character set is too different. But whose fault is that?
On the other hand, if you read the contents of a file and send them verbatim to stdout, whether that works will depend entirely on the file's encoding, unless you perform a conversion step.
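As a sketch of that last case (the file name is just an example), a verbatim pass-through looks like this; no character set conversion happens anywhere, so the output is only meaningful if the file's encoding matches what the terminal expects:

#include <fstream>
#include <iostream>

int main()
{
    std::ifstream in("input.txt", std::ios::binary);   // hypothetical file
    std::cout << in.rdbuf();                           // bytes pass through untouched
}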