Reading in words from a file one at a time and ignoring unnecessary characters in it?

Hey everyone,

I have to read in words from a file one at a time, make them lower-case, then pass them into a function. The problem is that sometimes when I read in a word, it also picks up characters I don't want, like a comma, a semicolon, or anything else that may be attached to the beginning or end of the word.

For example:

--such

that;

[george


I need just the word to be passed into the function, not the other stuff attached to it. How could I go about fixing this?
I pretty much want to remove everything except for alphabet letters though, so is making a list of characters that I don't want and checking for them the right way to do this?
isalpha(int c), declared in <cctype>, returns a nonzero value if c is a letter. You can then iterate over the whole string and add all letters to a new string, like so:

  int i=0;
  char str[]="C++";
  char new_str[] = "";
  while (str[i])
  {
    if (isalpha(str[i])) new_str += str[i];
    i++;
  }


If you want to keep more than letters, you can check if (isalpha(c) || c == '.' || ...), or create your own array of all the chars you want to keep, iterate over the string again, and check whether the char at i is in your array.
char new_str[] = "";
// ...
new_str += str[i];


Something's wrong!
Oh, obviously... I usually prefer using std::string, so

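#include <cctype>
#include <iostream>
#include <string>

int main()
{
  int i = 0;
  char str[] = "C++";
  std::string new_str;  // unlike a char array, std::string supports += with a char
  while (str[i])
  {
    // Cast to unsigned char: isalpha is undefined for negative values
    if (std::isalpha(static_cast<unsigned char>(str[i]))) new_str += str[i];
    i++;
  }
  std::cout << new_str << '\n';  // prints "C"
}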


does it.
Just for fun, a solution that edges into the very useful world of C++ STL algorithms:

#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>

int main()
{
  std::string str = "Text;with-bad(characters";
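  // Erase-remove idiom: remove_if shifts the kept chars forward, erase trims the tail
  str.erase(std::remove_if(str.begin(),
                           str.end(),
                           // unsigned char parameter: isalpha is undefined for negative values
                           [](unsigned char x){return !std::isalpha(x);}),
            str.end());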
  std::cout << str << '\n';
}