Removing extra spaces

I am supposed to write a program that receives a string and generates a new one with no extra spaces at the beginning, in the middle, or at the end of the string.
I tried two approaches, and neither worked.
On my first try I almost got it, but I couldn't remove the first space. If there were, for example, 3 spaces at the beginning of the string, I could successfully remove 2, but one did not disappear.
After some days trying to figure out what was wrong with my code, I looked at the problem a different way and wrote new code using different logic. I found my second approach more rational, but it didn't work either, and the results are worse than with my first code.
I'll show both versions I wrote, and I hope someone can help me. Thank you!
PS: In my second code I didn't write anything about dots and commas as I did in the first one, but if my second code works, I'll add that later.

My first code:
#include <stdio.h>
#include <stdlib.h>

int main ()
{
    int i=0, j=0;
    char string[225], stringWithNoSpaces[225];
    puts("Type a string");
    gets(string);
    while(string[i]!='\0')
    {
        if(string[j]==32)
        {
            if(j==0)
            {
                j++;
                if(string[j]!=32)
                {
                    if((string[j]==46)||(string[j]==44))
                    {
                        j++;
                    }
                }
                else
                {
                    while(string[j]==32)
                    {
                        j++;
                    }
                }
            }
            else
            {
                if(string[j+1]!=32)
                {
                    if((string[j+1]==46)||(string[j+1]==44))
                    {
                        j++;
                    }
                }
                else
                {
                    while(string[j]==32)
                    {
                        j++;
                    }
                }
            }
        }
        stringWithNoSpaces[i]=string[j];
        j++;
        i++;
    }
    puts("The string with no extra spaces is:");
    puts(stringWithNoSpaces);
    system("pause");
    return 0;
}


My second code:
#include <stdio.h>
#include <stdlib.h>

int main ()
{
    int i=0, j=0;
    char string[225], StringWithNoSpaces[225];
    puts("Type a string");
    gets(string);
    while(string[i]!='\0')
    {
        if(string[i]==32)
        {
            if((i!=0)&&((string[i-1]!=32)&&(string[i+1]!=32)))
            {
                string[i]=StringWithNoSpaces[j];
                j++;
            }
        }
        else
        {
            StringWithNoSpaces[j]=string[i];
            j++;
        }
        i++;
    }
    StringWithNoSpaces[j]='\0';
    puts("New string:");
    puts(StringWithNoSpaces);
    system("pause");
    return 0;
}
Are you trying to make all spaces into single spaces?

 
if(string[i]==32 && string[i + 1] == 32)


So we check whether there is a space with another space directly after it, and only then do we delete the space. If you do this, you will have to write separate code to chop all the spaces from the beginning and end of the string.
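A minimal sketch of that test in use, assuming the result goes to a separate array. The function name collapse_spaces is made up for illustration. It only shrinks runs of spaces, so one leading and one trailing space can survive, which is why the beginning and end need separate handling.

```c
#include <stddef.h>

/* Hypothetical helper: copies `in` to `out`, skipping every space that is
   directly followed by another space, so each run shrinks to its last space.
   `out` must have room for at least strlen(in) + 1 characters. */
void collapse_spaces(const char in[], char out[])
{
    size_t i = 0, j = 0;

    while (in[i] != '\0') {
        /* in[i + 1] is safe to read: in[i] is not the terminator here */
        if (!(in[i] == ' ' && in[i + 1] == ' ')) {
            out[j++] = in[i];
        }
        i++;
    }
    out[j] = '\0';
}
```

With the input "  hello   world  " this leaves " hello world ", with one space still at each end.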

Another thing. Are you supposed to be catching tabs?
"extra spaces" and "NoSpaces" are not the same thing. An example of how a related function is described: http://qt-project.org/doc/qt-5.1/qtcore/qstring.html#simplified

Are you forced to use plain C?
Let me see if I got it. I need to check the position in the string. If it is at the beginning, I won't copy any space to the second string. The same will happen if it is at the end. And if it is somewhere in the middle of the string, I use the code you showed, right?
The question I received only mentions spaces, so I think I am not supposed to catch tabs.
Yea, I know "extra spaces" and "no spaces" aren't the same thing, but I found "stringWithNoExtraSpaces" too long a name for a variable, which is why I named it just "NoSpaces" instead of "NoExtraSpaces".
I saw the link you sent me, but I can't use this function because I have to write my whole code using only the functions my teacher taught me (it is a question from my college).
I need to use plain C.
You cannot use the Qt function, but you can look at its description:
Returns a string that has whitespace removed from the start and the end, and that has each sequence of internal whitespace replaced with a single space.


For that, I would be inclined to jump to the end first and write null over the trailing spaces. Then start from the beginning and advance the "src" iterator until it no longer points to a space. We now have a pointer to the first non-space character, and we know that there is no trailing space either.

That leaves the reduction of the internal spaces. Overall, we would copy each character from "src" to "dst". However, we could have a flag. Every non-space sets the flag on. Space sets it off. If it is already off and the character is space, then no copy. Only the first space gets copied.
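A minimal sketch of the three steps described above. It makes two assumptions that are not in the description: it uses array indices instead of pointers, and it moves an "end" index back over the trailing spaces rather than overwriting them with nulls, so the input is left unmodified. The names simplify_spaces and lastWasNonSpace are made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: writes a simplified copy of `in` to `out`.
   `out` must have room for at least strlen(in) + 1 characters. */
void simplify_spaces(const char in[], char out[])
{
    size_t end = strlen(in);
    size_t src = 0, dst = 0;
    int lastWasNonSpace = 0;   /* the flag: on after a non-space, off after a space */

    /* Step 1: move `end` back past any trailing spaces. */
    while (end > 0 && in[end - 1] == ' ')
        end--;

    /* Step 2: advance `src` past any leading spaces. */
    while (src < end && in[src] == ' ')
        src++;

    /* Step 3: copy, letting only the first space of each run through. */
    while (src < end) {
        if (in[src] != ' ') {
            out[dst++] = in[src];
            lastWasNonSpace = 1;
        } else if (lastWasNonSpace) {
            out[dst++] = ' ';
            lastWasNonSpace = 0;
        }
        src++;
    }
    out[dst] = '\0';
}
```

Because the character just before "end" is never a space, the copy loop cannot emit a trailing space.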


I won't comment on gets() and puts(). C has multiple functions for this, and some are safer than others. See the reference documentation.
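For context: gets() cannot be told how large the buffer is, and it was removed from the language in C11. A common replacement is fgets(), which takes the buffer size but keeps the newline. The sketch below shows one way to wrap it. The helper names strip_newline and read_line are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cuts the string at the first newline, if there is one. */
void strip_newline(char line[])
{
    line[strcspn(line, "\n")] = '\0';
}

/* Hypothetical helper: reads one line of at most size - 1 characters.
   Returns 1 on success and 0 on end of file or read error. */
int read_line(char buf[], int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    strip_newline(buf);
    return 1;
}
```

In the programs in this thread, gets(string) could then become read_line(string, sizeof string, stdin).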


What I find odd is that you use '\0' but 32. I would probably do the opposite: 0 and ' '.
From the way you described it, I understood that you are modifying the first variable, right? What I need to do is generate a new string (stringWithNoSpaces, in this case) without modifying the variable string. So what I am thinking is that I should copy string to stringWithNoSpaces and then modify the copy. Did I get it?
I think a big issue is in your variable name.

"no extra spaces" doesn't mean remove all spaces.

If I understand you correctly, you wish to remove all leading and trailing spaces, and replace all internal contiguous sequences of spaces with a single space?

 
#include <string>
#include <boost/algorithm/string.hpp>

using std::string;

// Trim whitespace from both ends, then collapse each run of spaces into one.
string trim_and_reduce_spaces( const string& s )
  {
  string result = boost::trim_copy( s );
  while (result.find( "  " ) != result.npos)
    boost::replace_all( result, "  ", " " );
  return result;
  }

Hope this helps.
Yea, that's it. But unfortunately I can't use any library besides stdio.h, stdlib.h, conio.h and cstring :/ Anyway, I appreciate your help. Thank you a lot!
Whether you operate in place or not is up to you.

http://www.cplusplus.com/reference/cstring/strlen/
char string[225], simplifiedString[225];
char * src = string;
char * dst = simplifiedString;
char * end = string + strlen(string);
// where does the end point now?
while ( (end != src) && (' ' == *(end-1)) )
{
  --end;
}
// where does the end point now?

// more code

while ( (end != src) && ... )
{
  // more code, for example a copy of a character
  *(dst++) = *(src++);
}

Oops! I missed that your original post was C and not C++.

I'm off to eat dinner right now, but there is a fairly simple solution in C I'll get to you when I'm done.
I have made some modifications to my first code (I gave up on my second approach).
Here it is:

#include <stdio.h>
#include <stdlib.h>

int main ()
{
    int i=0, j=0;
    char string[225], stringWithNoExtraSpaces[225];
    puts("Type a string");
    gets(string);
    while(string[i]!='\0')
    {
        if(string[j]==32)
        {

            if(string[j+1]!=32)
            {
                if((string[j+1]==46)||(string[j+1]==44))
                {
                    j++;
                }
            }
            else
            {
                while(string[j+1]==32)
                {
                    j++;
                }
            }
        }
        stringWithNoExtraSpaces[i]=string[j];
        j++;
        i++;
    }
    while(stringWithNoExtraSpaces[0]==32)
    {
        for(i=0;i<100;i++)
        {
            stringWithNoExtraSpaces[i]=stringWithNoExtraSpaces[i+1];
        }
    }
    puts("New string: ");
    puts(stringWithNoExtraSpaces);
    system("pause");
    return 0;
}


I solved my problem with spaces at the beginning of the string, but I don't know where to put '\0' to mark the end of the string.
Here you go.

#include <stdio.h>
#include <string.h>

char* trim_and_reduce_spaces( char* s )
  {
  if (s)
    {
    char* from = s + strspn( s, " " );
    char* to   = s;

    do if ((*to = *from++) == ' ') 
      from += strspn( from, " " );
    while (*to++);
    
    while (*--to == ' ') 
      *to = '\0';
    }
  return s;
  }

int main()
  {
  char  s[ 1000 ];
  char* p;
  
  for (;;)
    {
    printf( "s? " );
    fflush( stdout );
    fgets( s, 1000, stdin );

    p = strchr( s, '\n' );
    if (p) *p = '\0';
  
    if (!strlen( s )) break;
  
    printf( "\"%s\"\n", trim_and_reduce_spaces( s ) );
    }
    
  return 0;
  }
Out of interest, how does the handling of . and , fit into your scheme?

Is it the case that (e.g.) " hello , world . " needs to end up as "hello, world." ??

i.e. all leading and trailing spaces are removed; all spaces before periods and commas are removed; all periods and commas have a single space following them; and all remaining runs of whitespace are replaced by a single space?

Andy
Oooh, I didn't think about that. '.' and ',' are different from ' ', so the space will be eliminated anyway, right?
No, not if you only reduce multiple consecutive spaces into one. That will not remove a single space, like the one between "hello" and ",".

If you remove all spaces, then you get "hello,world.", which is different from reducing spaces.
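To make the difference concrete, here is a rough sketch of the stricter rules asked about above: no leading or trailing spaces, no space before a period or comma, one space after each of them when more text follows, and every other run of spaces reduced to one. It is only one possible reading of the assignment, and the function name tidy_spaces_and_punctuation is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: writes the tidied copy of `in` to `out`.
   A space may be inserted after each '.' or ',', so `out` must have room
   for up to 2 * strlen(in) + 1 characters. */
void tidy_spaces_and_punctuation(const char in[], char out[])
{
    size_t i, j = 0;
    int oweSpace = 0;   /* a space is pending and is written only if needed */

    for (i = 0; in[i] != '\0'; i++) {
        char c = in[i];

        if (c == ' ') {
            oweSpace = 1;
        } else {
            int isPunct = (c == '.' || c == ',');

            /* No space at the very start and none before '.' or ','. */
            if (oweSpace && j > 0 && !isPunct)
                out[j++] = ' ';
            out[j++] = c;

            /* After '.' or ',' a space is always owed to the next character. */
            oweSpace = isPunct;
        }
    }
    out[j] = '\0';
}
```

With this sketch " hello , world . " becomes "hello, world.". Note that it also turns "1.5" into "1. 5", so it only suits plain sentences.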


The code by Duoas works in place, but it is quite trivial to change it to write to a separate output array.
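A rough sketch of what such a change could look like. This is not Duoas' code: it keeps the strspn() idea but uses indices, leaves the input untouched, and drops the single trailing space explicitly at the end. The function name trim_and_reduce_spaces_copy and the parameter order are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical variant: writes the result to `out` and leaves `in` unchanged.
   `out` must have room for at least strlen(in) + 1 characters. */
char *trim_and_reduce_spaces_copy(char out[], const char in[])
{
    size_t from = strspn(in, " ");   /* skip the leading spaces */
    size_t to = 0;

    while (in[from] != '\0') {
        out[to++] = in[from];
        if (in[from++] == ' ')
            from += strspn(&in[from], " ");   /* skip the rest of the run */
    }

    /* At most one trailing space can have been copied; drop it. */
    if (to > 0 && out[to - 1] == ' ')
        to--;
    out[to] = '\0';
    return out;
}
```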
As it stands, Duoas' code doesn't remove the final space; for example, where the input string has two leading spaces and two trailing spaces, I get

s? hello
"hello "

i.e. one trailing space.

Decrementing "to" once more just before line 15 (the final while loop, which strips the trailing spaces) fixes the problem, i.e.

char* trim_and_reduce_spaces( char* s )
  {
  if (s)
    {
    char* from = s + strspn( s, " " );
    char* to   = s;

    do
      {
      if ((*to = *from++) == ' ') 
        from += strspn( from, " " );
      }
    while (*to++);
    
    --to; // extra decrement
    while (*--to == ' ') 
      *to = '\0';
    }
  return s;
  }


(Where I also separated the do-while and if to make it easier for me to follow while debugging.)

s?   hello
"hello"

I think the culprit is the ++ in the do-while's while condition, which increments "to" one past the terminating null that causes the loop to exit. As a result, the loop that is supposed to remove the trailing spaces never runs, because the first thing it finds is always a null.

(output from original alg with code added to dump chars in string.)

s?   hello
orig   = {' ',' ','h','e','l','l','o',' ',' ','\0'}
step 1 = {'h','e','l','l','o',' ','\0',' ',' ','\0'}
step 2 = {'h','e','l','l','o',' ','\0',' ',' ','\0'}
"hello "


Andy
Duoas, thank you for your attention, but I can't use pointers. Although I have already studied them, they haven't been taught at the college yet, so I am supposed to use only what has already been explained. I found some algorithms while searching on Google, but all of them use pointers, and my problem is that I couldn't find any solution without them.
keskiverto, I was really mistaken. I erased the part I wrote about '.' and ',', so what should be "hello, world." is now "hello , world .", but I think that is no problem. For now I will let it be, because my code is working apart from the '.' and ',' problem. I thought about what you said first, and instead of handling "spaces at the beginning", "middle spaces" and "spaces at the end" in that order, I did the middle and the end first, and after that the beginning.

#include <stdio.h>
#include <stdlib.h>
#include <cstring>

int main ()
{
    int i=0, j=0 ;
    char string[225], stringWithNoExtraSpaces[225];
    puts("Type a string");
    gets(string);
    while(string[i]!='\0')
    {
        if((string[j]==32)&&(string[j+1]==32))
        {
            while(string[j+1]==32)
            {
                j++;
            }
        }
        stringWithNoExtraSpaces[i]=string[j];
        j++;
        i++;
    }
    stringWithNoExtraSpaces[i]='\0';
    while(stringWithNoExtraSpaces[0]==32)
    {
        for(i=0;i<strlen(stringWithNoExtraSpaces);i++)
        {
            stringWithNoExtraSpaces[i]=stringWithNoExtraSpaces[i+1];
        }
    }
    puts("New string:");
    puts(stringWithNoExtraSpaces);
    system("pause");
    return 0;
}


It seems like it's working perfectly now, so thank you a lot to everyone who helped me. Have a great week.
I couldn't use pointers.

It's easy enough to map pointer-based code to index-based code.

char* trim_and_reduce_spaces( char* s )
  {
  if (s)
    {
    int from = strspn( s, " " );
    int to   = 0;

    do
      {
      if ((s[to] = s[from++]) == ' ')
        {
        from += strspn( &s[from], " " );
        }
      }
    while (s[to++]);

    --to; // extra decrement
    while (s[--to] == ' ')
      s[to] = '\0';
    }
  return s;
  }


And, for what it's worth, the solution I'd use is probably something like:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h> // Visual C++ does not provide this for you
#include <string.h>

int main ()
{
    int i=0, j=0;
    bool oweSpace=false;
    char string[225], stringWithNoExtraSpaces[225];
    puts("Type a string");
    gets(string);
    while(string[i] != '\0')
    {
        if(string[i]==' ')
        {
            oweSpace = true;
        }
        else
        {
            if(oweSpace)
            {
                if(0 < j)
                    stringWithNoExtraSpaces[j++]=' ';
                oweSpace = false;
            }
            stringWithNoExtraSpaces[j++] = string[i];
        }
        i++;
    }
    stringWithNoExtraSpaces[j]='\0';
    puts("New string:");
    puts(stringWithNoExtraSpaces);
    system("pause");
    return 0;
}


Andy

PS If your code is supposed to be C, you should use

#include <string.h>

rather than the C++ wrapper header

#include <cstring>
Hey Andy,
I found your way of solving it very interesting, thank you. But there is just one thing I didn't understand. I read that the C language doesn't support a boolean data type; is that right? How could you use the boolean type?
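For context on that question: the original C standard (C89/C90) has no boolean type, but C99 added the built-in type _Bool and the header <stdbool.h>, which provides the names bool, true and false. With older compilers a plain int does the same job. A small sketch of both styles follows. The function names are made up for illustration.

```c
#include <stdbool.h>   /* C99 and later: provides bool, true and false */

/* C99 style: the return type is bool. */
bool is_space_c99(char c)
{
    return c == ' ';
}

/* Older style: a plain int, where 0 means false and nonzero means true. */
int is_space_c89(char c)
{
    return c == ' ';
}
```

In the program above, declaring oweSpace as an int and assigning 1 and 0 instead of true and false would work the same way without <stdbool.h>.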