find '\r' character in a string


Here is my piece of code:

void readToArray(array<string,ARR_SIZE> &arr){

  string word;
  int crPos=0;

    ifstream file("input.txt");
    if(file.is_open())
    {
      for(int i = 0; i < ARR_SIZE; ++i)
      {
        getline(file,word,'\t');
        arr[i]=word;
        if(i==ARR_SIZE-1){
          // a single search is enough; find returns string::npos when the character is absent
          crPos=word.find('\n');
          cout<<crPos;
        }
      }
    }
}

This reads one line of a tab-delimited file into an array, word by word. The problem is that there is no tab at the end of the line, so the very last element also contains a '\r' and a '\n'; the next tab only comes after the first element of the next line.

That's why the last array element contains:

1. the last word of the first line
2. an '\r' character
3. a '\n' character
4. the first element of the next line

If I put crPos=word.find('\n') in my code, it returns the position of the newline character correctly, but crPos=word.find('\r') doesn't return the position of the carriage return.

What is the reason for this?

If you know precisely how many elements are on each line then you can use the normal stream extractor >>, since tab will be treated as another form of white space.

If you don't, then you can use the two-parameter form of getline to read a whole line in one go, then stringstream the contents of that line into your array.

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

int main ()
{
   string line, word;
   vector<string> arr;

   stringstream ss( "aa\tbb\tcc\ndd\tee\tff" );  // to simulate the file

   while( getline( ss, line ) )                  // read a whole line
   {
      cout << "Line is " << line << endl;

      stringstream ssline( line );               // put line in a stringstream
      while( ssline >> word ) arr.push_back( word );   // and just extract parts

      cout << "Individual words are: ";
      for ( string s : arr ) cout << s << " ";
      cout << endl;
      arr.clear();
   }

}

Thanks for your detailed answer. My initial idea was also to fill the array in the two steps you suggested, but I thought there might be a simpler way to solve this problem.

Anyway, I'm still interested in why crPos=word.find('\n') works differently from crPos=word.find('\r').

@lastchance forgot that tab is not necessarily a space. An update:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

typedef std::string          field;
typedef std::vector <field>  record;
typedef std::vector <record> table;

std::string trim( std::string& s )
{
  return s.erase( s.find_last_not_of( ' ' ) + 1 )
          .erase( 0, s.find_first_not_of( ' ' ) );
}

table read_delimited_values( std::istream&& f, char delimiter )
{
  table result;
  std::string line;
  while (getline( f, line ))
  {
    record record;
    if (line.empty()) continue;
    
    std::istringstream ss( line );
    std::string field;
    while (getline( ss, field, delimiter ))
      record.push_back( trim( field ) );
      
    result.push_back( record );
  }
  return result;
}

int main()
{
  std::string TSV = 
    "Schmidt\tJohn Jacob Jingleheimer\t22\n"
    "Piper\t Peter \t48\n"
    " Simon\tSimple\t23\n"
    "Muffet \tLittle Miss\t17\n";
    
  table data = read_delimited_values( std::istringstream{ TSV }, '\t' );
  
  for (auto rec : data)
  {
    const char* prefix[2] = { "", ", " };
    std::size_t n = 0;
    for (auto field : rec) std::cout << prefix[!!n++] << '"' << field << '"';
    std::cout << "\n";
  }
}

If I put crPos=word.find('\n') in my code, it returns the position of the newline character correctly, but crPos=word.find('\r') doesn't return the position of the carriage return.

What is the reason for this?

If your program is running on a Windows system, a file opened in text mode like this: ifstream file("input.txt"); will automatically have the combination "\r\n" translated to just "\n" during the file input. That means as far as the program is concerned, the '\r' will not be seen, as it is translated out of existence.

You could test that by opening the file in binary mode, ifstream file("input.txt", ios::binary); so that no translation of line-endings takes place. Usually though we want the translation to take place, so that from inside the program the line endings are simply '\n', but outside the program, other applications such as notepad or other text editors will see the expected "\r\n", hence binary mode is usually not used for ordinary text files.

Thanks for all the replies. The background info described by Chervil is very useful to know.

In fact, the original file is arranged as a 15 x <I don't know how many> matrix, so lastchance's hint would also work:

If you know precisely how many elements are on each line then you can use the normal stream extractor >>