Help removing EXTRA whitespaces and EXTRA newlines from file

Hi, I have to submit this assignment, which has 3 parts, to Vocareum. I'm stuck on removing extra whitespace and extra newlines from a file. This is my code for that:

case 's':
    myfile.open(argv[2]);
    str="";
    char ch;

    while(myfile.get(ch)){
        str+=ch;

        for(j=0; j<str.length(); j++){
            if(str[j]=='\n' && str[j+1]=='\n'){
                str.erase(j, 1);
            }
            if(str[j]==' ' && str[j+1]==' '){
                str.erase(j, 1);
            }
        }
    }
    cout << str;

    myfile.close();
    break;

For some reason Vocareum doesn't give me the points for this, and it shows me something like I'm missing something:

RUNNING prog0 -s second.txt
this line has nothing
but this line needs a squish
this also with extra newlines
and this has evil spaces at end
and this has spaces AND a skip
this has spaces and extra skips
done
5,6c5,6
< and this has spaces AND a skip
< this has spaces and extra skips
---
> and this has spaces AND a skip
> this has spaces and extra skips
RESULT: sq [ 0 / 1 ]
Is \n the only newline combination scored?
Does a mix of space and \n need to be cleaned up, like a space followed by a newline (an extra, pointless space)?
Does it score your run time or performance in any way? The code is woefully inefficient.

I would simply NOT ADD the character if it doubles down, rather than ADD it and then ERASE it which is worse because I think erase looks at the whole string(?). Is there a copy-if that might do it?

something like...
get ch
if string[last] == ' ' and ch == ' ' don't add
if string[last] == newline and ch == newline don't add
else add

Hi, I'm not sure what you mean by "add". The text file has some text with extra spaces and extra newlines that need to be removed. Only the extra ones should go; the newlines that belong there have to stay. I know my code is not right, so I changed it a little bit:
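myfile.open(argv[2]);
str="";
char ch;

while(myfile.get(ch)){
    str+=ch;

    // compare the new character with the one before it;
    // str[str.back()] would use the character itself as an index
    if(str.size() >= 2 && isspace(ch) && isspace(str[str.size()-2])){
        str.pop_back();
    }
}

cout << str;

myfile.close();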
while(myfile.get(ch)){
    str+=ch;   // <----- ADD (or concat, if you prefer)

I was saying wrap that in the conditions.

while(myfile.get(ch))
{
    // str.back() is the last character already stored; guard against an empty string
    if (str.empty() || !((str.back() == ' ' && ch == ' ') || (str.back() == '\n' && ch == '\n')))
        str+=ch;
}

The pop_back solution is much, much better, but why add it at all?
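
As a rough sketch of the "don't add it in the first place" idea, the check can be wrapped in a small function that works on a string, so it can be tried without a file. The name squeeze_repeats is made up for illustration and is not part of the assignment.

```cpp
#include <string>

// Hypothetical helper: copy `in`, skipping a space that directly follows
// a space and a newline that directly follows a newline.
std::string squeeze_repeats(const std::string& in)
{
    std::string out;
    for (char ch : in) {
        // a "repeat" is a space or newline identical to the last character kept
        const bool repeat = !out.empty() && ch == out.back()
                            && (ch == ' ' || ch == '\n');
        if (!repeat) out += ch;   // only append when it is not a repeat
    }
    return out;
}
```

Note that this only handles runs of the same character. A mixed run such as a space followed by a newline is left as is, which may be what the grader's diff on lines 5 and 6 is pointing at.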

Checking only for ' ' (space) and '\n' (new line) is naive.
In a standard text file, the white space characters could be any of ' ' (space), '\t' (tab), '\n' (new line), or rarer ones such as '\r' (carriage return);
ideally use std::isspace(c) to check (at runtime) if c is a white space character, passing c as an unsigned char.
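
A minimal sketch of a character-by-character version built on std::isspace, under one assumption about what "extra" means: every run of white space becomes a single '\n' if the run contained a new line, and a single space otherwise. The function name collapse_ws is hypothetical, and the grader may define "extra" differently.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: collapse each run of white space into one character.
// Assumption: a run containing '\n' becomes '\n', any other run becomes ' '.
// Leading white space is dropped entirely.
std::string collapse_ws(const std::string& in)
{
    std::string out;
    bool pending = false;      // currently inside a run of white space
    bool saw_newline = false;  // that run contained at least one '\n'

    for (char c : in) {
        // cast to unsigned char: std::isspace is undefined for negative values
        if (std::isspace(static_cast<unsigned char>(c))) {
            pending = true;
            if (c == '\n') saw_newline = true;
        } else {
            // emit one separator for the whole run, unless nothing precedes it
            if (pending && !out.empty()) out += saw_newline ? '\n' : ' ';
            pending = false;
            saw_newline = false;
            out += c;
        }
    }
    // keep a final new line if the input ended with one
    if (pending && saw_newline && !out.empty()) out += '\n';
    return out;
}
```

With this rule, spaces at the end of a line and blank lines between lines disappear together, because they belong to the same run of white space.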

Since formatted input from streams skips leading white space,
we can let an input stream do the work of removing extra white space for us.

For instance:
#include <string>
#include <sstream>
#include <fstream>

// remove extra white space from a line
std::string without_extra_ws( const std::string& line )
{
    std::string result ;

    // create an input string stream to read from the line
    std::istringstream stm(line) ;

    // read ws separated tokens one by one and add them to result
    // with a single space character between the tokens
    std::string token ;
    while( stm >> token ) result += token + ' ' ;
    if( !result.empty() ) result.pop_back() ; // remove the last ws
    // note: the ws at the end would be 'extra ws' because a new line is also ws

    return result ;
}

// remove extra white space and new lines
// assumes that the file is not gigantic (contents fit into available memory)
void remove_extra_ws_nl( const std::string& file_name )
{
    std::string cleaned_contents ;

    if( std::ifstream file{file_name} )
    {
        // read the contents of the file without extra white space and new lines into cleaned_contents
        std::string line ;
        while( std::getline( file, line ) ) // for each line in the file
        {
            const std::string cleaned = without_extra_ws(line) ;
            if( !cleaned.empty() ) // skip empty and white-space-only lines (extra new lines)
                cleaned_contents += cleaned + '\n' ;
        }
    }

    // overwrite the file with the characters in cleaned_contents
    std::ofstream(file_name) << cleaned_contents ;
}
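
Since the assignment in the first post prints the result with cout instead of rewriting the file, here is a hedged variant of the same token-based idea that reads from any input stream and writes to any output stream. The name squish and its signature are assumptions made for this sketch.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical variant: read lines from `in`, write each non-blank line to
// `out` with its tokens separated by exactly one space.
void squish(std::istream& in, std::ostream& out)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);   // operator>> skips leading white space
        std::string token;
        bool first = true;                 // no token written for this line yet
        while (tokens >> token) {
            if (!first) out << ' ';
            out << token;
            first = false;
        }
        if (!first) out << '\n';           // lines without tokens produce no output
    }
}
```

It could be called as squish(myfile, std::cout) once the file is open. Passing streams in also makes it easy to try on a std::istringstream without touching any file.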