Pulling only numeric values from a mixed file?

I am working on this code to extract numeric values from a file that contains both numeric values and strings, separated by commas, and then write those values into a new file.

I think I have most of the code right, except for the part where I have to figure out whether what is being read from the file is an integer or a string. I've tried to use isdigit, but I'm not entirely sure how to use that when it comes to reading files...

//Rewrite data in lane9.dat into clane9.dat
//Include only numeric values less than or equal to 10
//Use commas as delimiters

#include <iostream>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <cctype>
using namespace std;

int main(){
    ifstream fin;
    ofstream fout;
    string input;

    //check to make sure file opens
    if (fin.fail()){
        cout << "Error opening the file" << endl;
        exit(1);
    }

    fin.open("lane9.dat");
    fout.open("clane9.dat");

    //loop through file text
    while (getline(fin, input)){
        fout << input << endl << endl;

        //create stringstream to parse text
        stringstream extract(input);
        string token;

        //introduce comma as delimiter
        getline(extract, token, ',');
        int i;
        extract >> i;
        fout << i << endl;
    }

    fin.close();
    fout.close();

    system("PAUSE");
    return 0;
}


This IS an assignment, so I don't care if the answer isn't given to me straight away. I would, however, really appreciate a pointer in the right direction.
Anyone?

Should I perhaps try making an array of the file?
Well, it depends on the content of the file. The only thing I know so far is that there are some items separated by commas. The description above says "numeric values and strings", but a numeric item can be considered a special case of a string. Likewise, a string may contain some numeric content.
Example:
123,Fred Bloggs,1240 Centenary Ave,Apple,456
The question is what to do with cases where a string begins with a number. If there are none, then it's simply a matter of reading everything up to the next comma as a string, then using a stringstream to attempt to extract a number from that string. If the extraction succeeds, it's a number; if it fails, it's a string.
The text from the file is as follows:
8,split,1,9,1,spare,7,2,10,strike,8,2,spare,6,split,2,7,3,spare,10,strike,10,strike,10,turkey,8,1


Now with my code above, I'm able to print
2883586


Which isn't right at all. Obviously those numbers are coming from the file, but not all of them are showing up and I'm not sure where the order is coming from either.
Thanks for the file data. It helps me to think more clearly.

Just a couple of comments on the code in the first post of this thread. The check at line 19, if (fin.fail()), should be moved so that it is done after the files are opened at lines 24 and 25. Before open() is called there is nothing for the check to detect.

At line 28, I would change this:
 
    while (getline(fin, input)){

to this instead:
 
    while (getline(fin, input, ',')) {


That way, splitting on the comma delimiter is the very first thing that is done with the file. Inside the body of that while loop, use the stringstream as at present; extract >> i; should work OK, I think. But you can do better: by writing if (extract >> i), you will be able to isolate just the numbers.
Thanks for those catches! Makes sense.

Now for the output, I get the same text as in the file, but each string is on its own line.
I figured it out: having the second getline was messing things up, especially after changing the initial getline to use the comma as the delimiter.

Thanks for your help!