Bug when getting the words from an article (reading a txt file)

Hello. I am trying to get the words from an article and count the frequency of each word. But there is a problem when I fscanf the txt file: my program does not ignore the symbol ' – ' and reads it as 'V'.

#include <stdio.h>
#include <string.h>

typedef struct {
	char storeword[1024][30];
	int wordfreq[1024];
	int wordnum;
} list;

void readfile(char* filename, list* wordlist) {
	FILE* fp;
	char data[256];
	char buffer[256];
	int i;
	fp = fopen(filename, "r");
	if (fp == NULL)
		printf("The file opened failed\n");
	else {
		while (fscanf(fp, "%[A-Z|a-z/’]%[^A-Za-z]", data, buffer) == 2) {
			printf("%s\n", data);
		}
		fclose(fp);
	}
}

int main() {
	list wordlist;
	readfile("article.txt", &wordlist);
}

Here is an example.
article.txt:
An Environment Bureau proposal submitted to the Legislative Council on Monday said the payment to independent producers – known as the feed-in tariff – would be set at HK$3 (US$0.38) to HK$5 per kilowatt-hour (kWh) of electricity to spur investment in clean energy production.
The result:
An
Environment
Bureau
proposal
submitted
to
the
Legislative
Council
on
Monday
said
the
payment
to
independent
producers
V ----------------------------------> the ' – ' that should be ignored by fscanf
known
as
the
feed
in
tariff
V ----------------------------------> the ' – ' that should be ignored by fscanf
would
be
set
at
HK
US
to
HK
per
kilowatt
hour
kWh
of
electricity
to
spur
investment
in
clean
energy
production
Hello toby1a05,

The program ran fine for me. I could not duplicate the problem.

When I looked at the hex code for that particular symbol, it came up as 0x96, or a decimal value of 150. My program that prints an ASCII table shows that value as a lowercase 'a' with a '^' over it. There is a good chance that in your code page this value maps to something different, so your program picks it up differently.
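
One way to check which bytes the file actually contains on a given machine is to print every byte above 0x7F together with its offset. The following is only a rough sketch: dump_non_ascii is a made-up helper name, not something from the program above, and it assumes the file is opened in binary mode so no translation happens on the way in.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the original program):
   print the offset and hex value of every byte above 0x7F read
   from `in`, and return how many such bytes were seen. */
long dump_non_ascii(FILE *in, FILE *out) {
    long offset = 0;
    long count = 0;
    int c;

    /* fgetc returns each byte as an unsigned char converted to int,
       so data bytes are always in the range 0..255. */
    while ((c = fgetc(in)) != EOF) {
        if (c > 0x7F) {
            fprintf(out, "offset %ld: 0x%02X\n", offset, (unsigned)c);
            count++;
        }
        offset++;
    }
    return count;
}
```

It could be called as dump_non_ascii(fp, stdout) after fp = fopen("article.txt", "rb"). A single 0x96 where the dash sits would be consistent with Windows-1252, while the three bytes 0xE2 0x80 0x93 would indicate UTF-8. Other encodings will show other byte sequences.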

There are a few other things I want to try to see how the program is working.
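
One thing that may be worth trying is to drop the fscanf scanset and read the file one byte at a time, treating only the ASCII letters A-Z and a-z as part of a word. This is a sketch under assumptions, not a drop-in fix: next_word is a hypothetical helper, it does not keep the apostrophe, '|' or '/' that the original scanset allowed, and it assumes an encoding in which non-ASCII characters use only bytes above 0x7F (for example Windows-1252 or UTF-8). In a double-byte encoding such as Big5 a trail byte can itself be an ASCII letter, so the text would have to be decoded properly first.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post):
   read the next word made only of ASCII letters from `fp` into `word`,
   storing at most size-1 characters plus the terminator (size must be >= 1).
   Every other byte acts as a separator.
   Returns 1 if a word was read, 0 if end of file was reached first. */
int next_word(FILE *fp, char *word, size_t size) {
    size_t n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        int is_letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        if (is_letter) {
            if (n + 1 < size)
                word[n++] = (char)c;   /* extra letters of an over-long word are dropped */
        } else if (n > 0) {
            break;                     /* separator after a word: the word is complete */
        }
        /* separators before the first letter are simply skipped */
    }
    word[n] = '\0';
    return n > 0;
}
```

In readfile, the loop could then look like while (next_word(fp, data, sizeof data)) printf("%s\n", data); with the buffer array no longer needed.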

Hope that helps,

Andy