Help cleaning up a verbose loop (loop to remove additional white space from an array)

Hello, I am trying to create a program that allows a phrase to be stored, but the program will clean it up and remove any additional or unwanted spaces. I have a logic-loop that seems to work, however its kinda unwieldy and I am hoping someone can give me some advice on where maybe its redundant, or where I could save time/space. Here's what I have.

The main part that I would like some assistance with is the whitespace while loop.

I am not yet into concepts like strings, pointers, etc. (I've seen several advice posts where people recommend using points, or strings, and functions like remove/erase, or something else.).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
#include <iostream>
#include <cctype>
#include <cstring>

using namespace std;

const int SIZE = 130;

int main ()
{
    char initPhrase[SIZE]; //Character array for initial phrase.
    char newPhrase[SIZE]; //Character array for edited phrase.
    int temp = 0; //Temporary counter in white-space loop.
    int c = 0; //Initial phrase position counter.
    int d = 0; //New phrase position counter.

    cout << "Enter a phrase to be stored (130 characters max): " << endl;
    cin.get(initPhrase, SIZE, '\n');
    cin.ignore(100, '\n');

    //Loop to remove excess whitespace.
    while (initPhrase[c] != '\0') //Beginning on position c, if not \0.
        { //Open while loop.
        if (initPhrase[c] == ' ') //Check condition if position c is blank.
            { //Open if loop.
            temp = c+1; //Set temporary counter to 1 place after pos c.
            if (initPhrase[temp] != '\0') //If location at temp is not \0
                { //Open if loop.
                //Run the while loop while current position is empty
                //and the following space is not the terminus
                while (initPhrase[temp]== ' ' && initPhrase[temp] != '\0')
                { //Open while loop
                //If the temp position (c+1) is also a space (initiated when
                //c was also a space, meaning 2 spaces),
                if (initPhrase[temp] == ' ')
                    {c++;} //Increment c position
                temp++; //Increment temp position.
                } //Close while loop 2.
            } //Close if loop 2.
        }//Close if loop 1.
    newPhrase[d] = initPhrase[c]; //Copy character into newPhrase.
    ++c; //Increment initPhrase position.
    ++d; //Increment newPhrase position.
    } //Close while loop 1.
    newPhrase[d] = '\0'; //Ensure ending at '\0' to remove trash.

    if (islower(initPhrase[0])) //Capitalization if-check.
    {newPhrase[0] = toupper(initPhrase[0]);}

    cout << "Cleaning up phrase........" << endl;
    cout << "Auto-capitalizing start of phrase." << endl;
    cout << "Removing excessive spaces between words." << endl;
    cout << "New phrase:" << endl;
    cout << newPhrase << endl;

    return 0;

}
Last edited on
your code isnt that bad given the tools you are willing to use.
I would not mess with it: it will be much cleaner with alternate techniques. Pointers would really help here as all those indices that you keep track of become innate and are no longer cluttering up the code.

the only ways I see to make this code a lot smaller without using pointers is going to make it much harder to read and follow, and that for no significant gains.

changes you might consider: just to-upper the first letter, don't check it to see if it needs it, that is wasting time (this isnt obvious, but the check costs more cpu time than doing the toupper when not needed here).



Here is my trim, using the stuff you didn't want to use. And its not perfect, it was written in haste for not c++ coders and was moved over from ancient code, avoiding pointers and iterators and such to make it simple. You don't want the top condition for keeping multiple internal spaces, so you can see, the whole thing is just a few lines ignoring that first block. You can also see that its more or less like yours, all things considered. I use 2 loops to get the ends cleaned up, and you did it with special conditions in your code. Loops have a condition built into them, so the difference is minor, but mine only does the bare minimum checks and work needed to trim the ends.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
void trim(string &s, char c, bool keepinternal = true)
{  //database-like trim. keepinternal true means keep one copy of multiples inside 
//while false means to remove all copies.  ALL leading and trailing (even multiples)
//are removed regardless. 
  size_t dx; size_t i; size_t j; 
  if(keepinternal == false)
  { //completely eliminates the character c from the string. 
    //c++ remove on array-like containers cannot resize it. Erase must also be called to 
	//repair the container: web search c++ remove erase idiom
  	s.erase(std::remove(s.begin(), s.end(), c), s.end());
	return;
  }  
  //trim leading and trailing and internal
  dx = 0;   i = s.length()-1;   
  while(s[dx] == c) {dx++;} //find the first letter that is not c
  while(s[i] == c) {i--;} //find the last letter that is not c  
  string temp = s;  s = ""; //copy s into temp and set s to empty string
  for(j = dx; j<i+1; j++)
  { //for all the letters between the first not c up to the last not c, copy back into s
    //unless there are 2+ in a row, then only get the first one. 
	if(temp[j]!=c || (temp[j]==c && temp[j+1]!=c))//keeps 1 copy per group of 1 or more cs internally
	s += temp[j];                           //that is 3 spaces in a row become 1 space, etc. 
  }
}

Last edited on
Thank you very much for the advice! I didn't think about the upper case thing.
I will incorporate it and see what happens.
again, its not obvious. Often if you time a few billion runs of your code, you can see that an if statement that should prevent work costs more than the work it prevented, and then you take it back out and just do the work regardless.
examples are things like
if(pointer != nullptr) pointer = nullptr; //this costs more than just saying pointer=nullptr in many instances, if the CPU branch prediction is already busy on other branches or it can't reliably predict

its another topic for another day and the difference is really, really small unless you work on a LOT of items at a time (which I do).

there is probably a way to do this efficiently in-place (my line 17) which would be a much bigger gain. If you wanted to tune yours or play with it, avoiding any unnecessary copying may be something to look at, but there again, the computer is REALLY good at copying stuff. Shuffling the data inside the same string multiple times is worse than a one time copy, of course.
Last edited on
Line 35 is redundant. The while loop guarantees that initPhrase[temp] == ' '

You can do this easily by keeping track of whether the previous character was a space.
I like to use "src" and "dst" for the indexes into the thing I'm reading and the thing I'm writing
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <iostream>
#include <cctype>
#include <cstring>

using namespace std;

const int SIZE = 130;

int main ()
{
    char initPhrase[SIZE]; //Character array for initial phrase.
    char newPhrase[SIZE]; //Character array for edited phrase.

    int src=0;			// read index into initPhrase
    int dst=0;			// write index into newPhrase
    bool wasSpace = false;	// true means the last character was a space
    
    cout << "Enter a phrase to be stored (130 characters max): " << endl;
    cin.get(initPhrase, SIZE, '\n');
    cin.ignore(100, '\n');

    //Loop to remove excess whitespace.
    for (src = 0; initPhrase[src]; ++src) { // for each input char
	if (wasSpace && initPhrase[src] == ' ') {
	    // skip it
	} else {
	    newPhrase[dst++] = initPhrase[src];
	}
	wasSpace = (initPhrase[src] == ' ');
    }
    newPhrase[dst] = '\0';

    if (islower(newPhrase[0])) //Capitalization if-check.
    {newPhrase[0] = toupper(newPhrase[0]);}

    cout << "Cleaning up phrase........" << endl;
    cout << "Auto-capitalizing start of phrase." << endl;
    cout << "Removing excessive spaces between words." << endl;
    cout << "New phrase:" << endl;
    cout << newPhrase << endl;

    return 0;
}

Awesome, I will plug that in and mess around with it, definitely looks more condensed.
It worked great. Thanks again.
Topic archived. No new replies allowed.