Are my casts correct?

Hello,

I would like to incorporate the Snowball stemming library (snowball.tartarus.org) into my code. This library is written in C, and I need some casts to convert between the string representation used by the library and std::string.

My code works, but I don’t know if I used the casts correctly. Can someone tell me whether my code is correct and efficient?

Thanks.

The library represents words as “sb_symbol”, which is a typedef for unsigned char:

 
typedef unsigned char sb_symbol;


To stem a word, the function “sb_stemmer_stem” is used:

const sb_symbol *   sb_stemmer_stem(struct sb_stemmer * stemmer,
                    const sb_symbol * word, int size);



Here is the code I use to wrap this in a C++ class (see my function std::string snowball_stemmer::stem(std::string to_stem) ).

To cast a std::string to an sb_symbol pointer, I use

 
(sb_symbol *) std::string.c_str()  


And to cast an sb_symbol pointer back to a std::string

 
std::string( (const char *) sb_symbol )
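
The two conversions described above can be wrapped in small helpers. This is a minimal sketch, assuming only the sb_symbol typedef shown earlier; the names as_symbols and symbols_to_string are hypothetical and not part of the library. It uses reinterpret_cast and keeps the pointer const, since a C-style cast to (sb_symbol *) would silently drop the const that c_str() carries.

```cpp
#include <cstddef>
#include <string>

typedef unsigned char sb_symbol;  // same typedef as in the library header

// Hypothetical helper: view the bytes of a std::string as const sb_symbol*.
// No data is copied; the pointer is only valid while `s` is alive and unmodified.
inline const sb_symbol* as_symbols(const std::string& s) {
    return reinterpret_cast<const sb_symbol*>(s.c_str());
}

// Hypothetical helper: copy `len` symbols into a new std::string.
// Passing the length avoids relying on a terminating zero byte.
inline std::string symbols_to_string(const sb_symbol* p, std::size_t len) {
    return std::string(reinterpret_cast<const char*>(p), len);
}
```

With these, the call site would read roughly as sb_stemmer_stem(stemmer, as_symbols(word), size), with the result passed to symbols_to_string.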



stemmer.h


class snowball_stemmer{
 
private :
 
   sb_stemmer * stemmer;
 
public :
   snowball_stemmer(char* , char* );
   ~snowball_stemmer();
  
   std::string stem(std::string);

 
};



stemmer.cpp

#include "stemmer.h"
 
snowball_stemmer::snowball_stemmer(char * language , char * charenc){
 
stemmer = sb_stemmer_new(language, charenc);
 
};
 
 
snowball_stemmer::~snowball_stemmer(){
   sb_stemmer_delete(stemmer);
};
 
std::string snowball_stemmer::stem(std::string to_stem){
 
   auto stemmed = sb_stemmer_stem(stemmer, (sb_symbol *) to_stem.c_str() , to_stem.size());
 
   return std::string( (const char *) stemmed );
 
}
It is better to use C++ casts instead of C-style ones.

The casts look fine, but it is better and safer to cast to (const sb_symbol *), since the source is a const pointer.
Thanks MiiniPaa,

I was told that casts are tricky, so I preferred to ask for a double check on my first try!

By "Better to use C++ casts", do you mean static_cast? If so, should this code be OK? Thanks.

std::string snowball_stemmer::stem(std::string to_stem){
 
   auto stemmed = sb_stemmer_stem(stemmer, static_cast<const sb_symbol *> to_stem.c_str() , to_stem.size());
 
   return std::string( static_cast<const char *> stemmed );
 
}
You shouldn't need a cast there; you should be able to construct a string without changing the arguments passed to it.
std::string does not have a constructor accepting unsigned char*:
error: no matching function for call to 'std::basic_string<char>::basic_string(const sb_symbol*&)'
Should be static_cast then.
Thanks for your advice. It seems that because unsigned char and char are unrelated types, I need to use reinterpret_cast.

error: invalid static_cast from type ‘std::basic_string<char>::size_type {aka long unsigned int}’ to type ‘const sb_symbol* {aka const unsigned char*}’
    auto stemmed = sb_stemmer_stem(stemmer, static_cast<const sb_symbol *> to_stem.c_str() , to_stem.size());
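
The reason static_cast is rejected between these pointer types can be checked at compile time. The sketch below assumes only the sb_symbol typedef from the library; the helper names as_chars and as_chars_via_void are hypothetical. It also shows that two static_casts through const void* are an alternative to reinterpret_cast.

```cpp
#include <string>
#include <type_traits>

typedef unsigned char sb_symbol;  // same typedef as in the library header

// char and unsigned char are distinct types, even on platforms where
// plain char happens to be unsigned.
static_assert(!std::is_same<char, sb_symbol>::value,
              "char and unsigned char are different types");

// No implicit conversion exists between the two pointer types, so
// std::string has no matching constructor and a direct static_cast
// between them does not compile.
static_assert(!std::is_convertible<const sb_symbol*, const char*>::value,
              "no implicit conversion from const unsigned char* to const char*");

// Option 1: reinterpret_cast, as used in this thread.
inline const char* as_chars(const sb_symbol* p) {
    return reinterpret_cast<const char*>(p);
}

// Option 2: two static_casts through const void*, giving the same pointer.
inline const char* as_chars_via_void(const sb_symbol* p) {
    return static_cast<const char*>(static_cast<const void*>(p));
}
```

Reading the bytes through a char pointer is permitted by the language, so either form is safe here as long as the data really is a zero-terminated or length-delimited byte sequence.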


With reinterpret_cast, this is OK:

std::string snowball_stemmer::stem(const std::string &to_stem){

   auto stemmed = sb_stemmer_stem(stemmer, reinterpret_cast<const sb_symbol *>(to_stem.c_str()) , to_stem.size());

   return std::string( reinterpret_cast<const char*>(stemmed) );

}
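
One remaining risk in the function above is that the returned pointer is used without a check and is assumed to be zero-terminated. The sketch below shows a more defensive conversion; make_result is a hypothetical helper, and it assumes (worth verifying against your version of libstemmer.h) that sb_stemmer_stem returns NULL on an out-of-memory error and that sb_stemmer_length(stemmer) reports the length of the last result.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

typedef unsigned char sb_symbol;  // same typedef as in the library header

// Hypothetical helper: build a std::string from a stemmer result pointer and
// its length, refusing a null pointer or a negative length instead of
// invoking undefined behaviour in the std::string constructor.
inline std::string make_result(const sb_symbol* stemmed, int length) {
    if (stemmed == nullptr || length < 0) {
        throw std::runtime_error("stemming failed");
    }
    return std::string(reinterpret_cast<const char*>(stemmed),
                       static_cast<std::size_t>(length));
}
```

Inside stem(), under the assumptions above, the return statement would become something like return make_result(stemmed, sb_stemmer_length(stemmer)). Since the size parameter of sb_stemmer_stem is an int, an explicit static_cast<int>(to_stem.size()) also makes that narrowing visible.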
That's better :)