string parsing issue

I have the following null-terminated char array, which I want to parse as follows:
look for the first '\n', then look for the '^' after it. From there, convert the ASCII digits into integers separated by ',' and save them in the array recvd[].
So the following array would be read as:

recvd[0]=1023, recvd[1]=1023, recvd[2]=1023, recvd[3]=735, recvd[4]=100,
recvd[5]=97, recvd[6]=255, recvd[7]=255

Here is my code, but it is giving me only the last value, and that in recvd[0] instead of recvd[7]. The rest of the elements are 0.


uint16_t recvd[100];
char recvd_chars[100];
recvd_chars[99]='\0';   // last valid index is 99; index 100 is out of bounds
char * p = recvd_chars; // declared after the array it points to
int counter, i;

/* assigns the following to recvd_chars[] 
48 44 10 63 94 49 48 50 51 44
49 48 50 51 44 49 48 50 51 44
55 51 53 44 49 48 48 44 57 55
44 50 53 53 44 50 53 53 33 10
*/
	counter=0, i=0;
	while(*p != '\n' )
	{
	  p++;  //do nothing ..skip ahead to the 1st '\n' before processing the string
	  counter=counter+1;
	}
	p++;
	while(1) {
		recvd[i] = (uint16_t)atoi(p);
		while ((*p != ',') && (*p != '\n')) {
			p++;
		}
		if (*p == '\n') break;
		p++;
	}
TFT_String(120,10,itoa(recvd[0],buffer,10),RED,txt_bg_color,3);
TFT_String(140,10,itoa(recvd[1],buffer,10),RED,txt_bg_color,3);
TFT_String(160,10,itoa(recvd[2],buffer,10),RED,txt_bg_color,3);
TFT_String(180,10,itoa(recvd[3],buffer,10),RED,txt_bg_color,3);

255
0
0
0
OK, I used the strtok() function and it extracted all the tokens for me, but how can I save them in individual array elements and also convert them into integers?

void split_string()
{
 int a=10;

  char * pch;
  pch = strtok (recvd_chars," ,!^?");
  while (pch != NULL)
  {
   sprintf (cmdbuff,"%s\n",pch);
   TFT_String(a,20,cmdbuff,RED,txt_bg_color,2);
   pch = strtok (NULL, " ,.-");
   a=a+20;
  }

}

I tried a C++ solution using stringstream, but that probably isn't what you want. Here is an idea using sscanf().
The function expects a string and an integer array as parameters. It returns the number of integers extracted.

int parse_string(char * input, int * output)
{
    // find newline
    char * p = strchr(input, '\n');  
    if (!p)
    {
        printf("newline not found\n");
        return 0;
    }
    
    // find '^'
    p = strchr(p, '^');
    if (!p)
    {
        printf("'^' not found\n");
        return 0;
    }    
    
    p++;  // skip past character just found
    
    // variables used by scanf
    int n;
    char ch;
    int count = 0;
    
    // output extracted to array here
    int index = 0;
    
    while ((sscanf(p, " %d%c%n", &n, &ch, &count) > 1))
    {
        p += count;
        output[index++] = n;
    }    
    
    return index;
}
How can I apply sscanf() to my function?
How can I apply sscanf() to my function?

sscanf() is applied to a character string. As illustrated in the example I gave.

I'm sorry I didn't try to understand your code, my first task was to understand the input, which looks like this:
char data[] = "0,\n?^1023,1023,1023,735,100,97,255,255!";
By the time I'd figured that out, I just went ahead and wrote the rest of the code without looking at what you'd tried.

The rest of my code looks like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int parse_string(char * input, int * output);  // defined in my previous post

int main()
{
    char data[] = "0,\n?^1023,1023,1023,735,100,97,255,255!";  

    int recvd[100];
    int count = parse_string(data, recvd);
    
    printf("recvd: ");
    for (int i=0; i<count; i++)
    {
        printf("%5d", recvd[i]);
    }

}


Put that together with my previous post and you have a complete working example.
Awesome!
It's actually doing something extra that I was planning to do: I was planning to extract the string from the first '\n' char to the next '\n' char and process only this substring, but your code is already doing that.
How does sscanf() work here? Can you please explain?
Also, I am not seeing how sscanf() separates the tokens on ",".

Also, I noticed that the sender is sending the string in this format:
"?^%u,%u,%u,%u,%u,%u,%u,%u!\n". There is a "!" before the '\n',
and even though you are not handling it in your code, the tokens are successfully extracted. How?

Thanks
A quick summary of my use of sscanf. It may not answer all of your questions, but it's a start.
My code:
    int n;
    char ch;
    int count = 0;

sscanf(p, " %d%c%n", &n, &ch, &count)

p is a pointer to the input character array (string).
Format string: " %d%c%n"

• space - ignore leading whitespace (not really needed here).
• %d - read an integer
• %c - read a character, in this case the comma separator (or the ending '!')
• %n - don't read anything, but store the number of characters consumed from the string so far.

These items are stored in the variables passed by a pointer to each. In the same order as the format string,
&n, &ch, &count

That will just read a single integer, and a single comma. The useful part is to increment the pointer by the value of count. That makes it ready to read the next part of the string.


By the way, you should also refer to the reference page and consider that alongside my own comments:
sscanf - just a special case of scanf, which works with strings, hence the extra 's' in its name.
http://www.cplusplus.com/reference/cstdio/sscanf/
scanf - get formatted input:
http://www.cplusplus.com/reference/cstdio/scanf/

It's actually doing something extra that I was planning to do: I was planning to extract the string from the first '\n' char to the next '\n' char and process only this substring, but your code is already doing that.

Maybe my code isn't quite doing that. My input string ends with a trailing exclamation mark '!', there is nothing else after that.

The code could be modified to stop when it hits the '!', if that would be reliable. If not, you might want to start by copying just the substring which is enclosed between the first and second newlines, and use that substring as the input. I can think of other ideas, such as modifying the original string (that's what strtok does).

Edit: one tricky part of the code is the return value from the function sscanf(). Usually we hope it will read two values, an integer and a character. The return value should be 2 in that case. When the end of the string is reached before anything is converted, it returns EOF, a macro defined in <stdio.h> that expands to a negative int (commonly -1).
http://www.cplusplus.com/reference/cstdio/EOF/
Maybe my code isn't quite doing that. My input string ends with a trailing exclamation mark '!', there is nothing else after that.


Please see the input string. I am just copying part of the 100-char string; it shows two occurrences of '\n' (which is 10) and a "!" before the second '\n'.

/* assigns the following to recvd_chars[] 
48 44 10 63 94 49 48 50 51 44
49 48 50 51 44 49 48 50 51 44
55 51 53 44 49 48 48 44 57 55
44 50 53 53 44 50 53 53 33 10
*/
Thanks yes, you are right. I had that trailing newline in one of my earlier codes (I wrote in C++ at first) but it fell off somewhere when I copied my code to a C version. :)

If my code was run on the full 100 character string, its behaviour would depend on what came next, after the newline - if there's any chance there could be an integer there, the code would need modification. But if it is always some characters which don't look like a decimal integer, it will stop because the sscanf() will be unable to read an integer.

One suggestion would be to add some extra code inside the while loop, just before the closing brace '}'

        if (ch == '!')    // optional extra validation, 
            break;
I don't understand. Can you give me an example of when this code might not work?
Thanks
can you give me an example of when this code might not work?

No problem.

1. Initial testing:
 
    char data[] = "0,\n?^1023,1023,1023,735,100,97,255,255!\n"; 
Output (ok):
recvd:     1023    1023    1023     735     100      97     255     255


2. Now a test on a 100-byte string:
 
    char data[] = "10,11,12,13,14,1516,17,18,19,20,0,\n?^1023,1023,1023,735,100,97,255,255!\n21,22,23,24,25,26,27,28,29,";
Output (wrong):
recvd:     1023    1023    1023     735     100      97     255     255      21      22      23      24      25      26      27      28      29


3. Another 100-byte string:
 
    char data[] = "10,11,12,13,14,1516,17,18,19,20,0,\n?^1023,1023,1023,735,100,97,255,255!\na1,22,23,24,25,26,27,28,29,"; 
Output (ok):
recvd:     1023    1023    1023     735     100      97     255     255


What is the difference between examples 2 and 3? 3 has a non-numeric character (the letter 'a') following the second newline.

That means success or failure would depend on something outside the part of the string we have been looking at. In order to close that loophole, check for the final '!' if that is guaranteed to always be there. If not, then check whether or not the next character pointed to by p is a newline '\n'.
Hi,
I would rather check for the next '\n', but I am not sure how to check that in your code. Can you please point it out?

int parse_string(char * input, int * output)
{
	// find newline
	char * p = strchr(input, '\n');
	if (!p)
	{
		printf("newline not found\n");
		return 0;
	}
	
	// find '^'
	p = strchr(p, '^');
	if (!p)
	{
		printf("'^' not found\n");
		return 0;
	}
	
	p++;  // skip past character just found
	
	// variables used by scanf
	int n;
	char ch;
	int count = 0;
	
	// output extracted to array here
	int index = 0;
	
	while ((sscanf(p, " %d%c%n", &n, &ch, &count) > 1))
	{
		p += count;
		output[index++] = n;
	}
	
	return index;
}

If not, then check whether or not the next character pointed to by p is a newline '\n'.


I would rather check for the next '\n', but I am not sure how to check that in your code. Can you please point it out?


Use * to dereference the pointer.
I think I will go with your earlier suggestion, that is, to extract the substring between the first two '\n'.
Can I use strncpy() in the following way? But then how will I specify the position of the first and second '\n'?

strncpy(targetstr,sourcestr, <first \n>, <second \n> )

how will I specify the position of the first and second '\n'?

You could use strchr()
There are a couple of examples of its use in my code above, where I find the first '\n' and the '^'.

Lines 4 and 12 (the two strchr() calls) of this post:
www.cplusplus.com/forum/general/183227/#msg897003

http://www.cplusplus.com/reference/cstring/strchr/

also your parameters passed to strncpy() don't match the specification


strncpy(targetstr,sourcestr, <first \n>, <second \n> )

should be
strncpy(targetstr,sourcestr, length )

http://www.cplusplus.com/reference/cstring/strncpy/





The first char I can find using strchr(), but how do I get the second occurrence using strchr()?
I am reading about strchr() and it says the following:
strchr

const char * strchr ( const char * str, int character );
char * strchr ( char * str, int character );
Locate first occurrence of character in string
See how I found the '^' character. I use the value of p, which is the position of the first newline.

The only difference in your case is that you don't want to disturb the existing value of the pointer, because you want to keep two pointers.

Oh, there is one other difference, you don't want to keep re-finding the same newline, so start the second strchr() from one position after the first newline.
OK, I now have one pointer pointing to the first occurrence of '\n' and another pointer pointing to the second occurrence of '\n'. Now how can I extract the string between them?

int parse_string(char * input, int * output)
{
	// find newline
	char * p = strchr(input, '\n');
	char * p1 = p;   // pointer p1 will point to the first '\n'
	char * p2;
	
	if (!p)
	{
		printf("newline not found\n");
		return 0;
	}
	
	// find '^'
	p = strchr(p, '^');
	if (!p)
	{
		printf("'^' not found\n");
		return 0;
	}
	p2= strchr(p, '\n');     // this will point p2 to the second '\n' (NULL if there is none)
	
	p++;  // skip past character just found
	
	// variables used by scanf
	int n;
	char ch;
	int count = 0;
	
	// output extracted to array here
	int index = 0;
	
	while ((sscanf(p, " %d%c%n", &n, &ch, &count) > 1))
	{
		p += count;
		output[index++] = n;
	}
	
	return index;
}
Well my feeling is this may be a little overcomplicated. But I think it is still useful for you to go through some of the coding for yourself, that's how to learn.

So, you need to calculate the length of the part contained between the two newlines. That would be length = p2 - p1 (an integer result). Note that this count includes the first newline itself, so the text strictly between the two is p2 - p1 - 1 characters long and starts at p1 + 1. I recommend you display (printf) that length, to confirm that it is what you expect, before proceeding. (Once you have the code working, you can remove or comment out the extra printf.)
But that will only give me the length of the string. How do I get the string itself?
Is there no function in C that takes the beginning and end of the substring and extracts it for you?