Adding character sequence as a single consonant element.

I'm working on an assignment and I have to hyphenate the words in a C-style string if they have a vowel-consonant-consonant-vowel or vowel-consonant-vowel pattern. For example, “Lorem ipsum dolor sit amet” will be returned as “Lo-rem ip-sum do-lor sit a-met”.

But a requirement is that the following character sequences should never be hyphenated; they should act as single consonants: “qu”, “tr”, “br”, “str”, “st”, “sl”, “bl”, “cr”, “ph”, “ch”. I am not sure how to incorporate that into my code, in particular into my VCCV and VCV methods. Here is what I have:

#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

using namespace std;

//Check if char is a vowel.
bool isAVowel(char c){
    bool vowel = false;
    switch(tolower(c)){
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':
        case 'y':
            vowel = true;
    }
    return vowel;
}

//Check if char is a consonant.
//True if it is a letter and is not a vowel.
bool isAConsonant(char c){
    return isalpha(c) && !isAVowel(c);
}

//Returns true if letter is followed by the vowel-consonant-consonant-vowel pattern.
bool VCCV(string word, int index){
    bool correctMatch = false;
    
    if (index + 3 < word.length()){
        if (isAVowel(word.at(index))
            && isAConsonant(word.at(index+1))
            && isAConsonant(word.at(index+2))
            && isAVowel(word.at(index+3))){
            correctMatch = true;
        }
    }
    return correctMatch;
}

//Returns true if letter is followed by the vowel-consonant-vowel pattern.
bool VCV(string word, int index){
    bool correctMatch = false;
    if (index+2 < word.length()){
        if (isAVowel(word.at(index))
            && isAConsonant(word.at(index+1))
            && isAVowel(word.at(index+2))){
            correctMatch = true;
        }
    }
    return correctMatch;
}

char* process(const char* input) {
    
    string word;
    string stringResult;
    char* result = new char();
    istringstream iss(input);
    //Read in each word from the input
    while (iss >> word) {
        for(int index = 0; index < word.length(); index++){
            //If VCCV pattern add -
            if(VCCV(word, index)){
                word.insert(index + 2, "-");
            }
            //If VCV pattern add -
            if(VCV(word, index)){
                word.insert(index + 1, "-");
            }
        }
        //Add edited word to stringResult
        stringResult += word + " ";
    }
    //Erase trailing whitespace
    stringResult.erase(stringResult.size()-1);
    //Convert string to char
    result = &stringResult[0u];
    //cout << result << endl;;
    return result;
}
You need another function that tests whether the characters you are about to hyphenate are in your list. Use string's substr method (it takes a start position and a length) to get the substring, then compare, e.g.
if (input.substr(start, length) == "ph") return false;
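
A minimal sketch of that check, assuming a hypothetical helper named protectedLengthAt (not part of the original code). It looks only at the given position, compares case-insensitively, and tries "str" before "st" so the longer sequence wins:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length of the protected sequence that
// starts exactly at position pos in word, or 0 if none starts there.
std::size_t protectedLengthAt(const std::string& word, std::size_t pos) {
    // "str" is listed first so that it is preferred over "st".
    static const std::string sequences[] = {
        "str", "qu", "tr", "br", "st", "sl", "bl", "cr", "ph", "ch"};

    if (pos >= word.size()) return 0;  // nothing left to inspect

    for (const std::string& seq : sequences) {
        // substr(pos, count) yields at most count characters from pos.
        std::string slice = word.substr(pos, seq.size());
        for (char& c : slice)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        if (slice == seq) return seq.size();
    }
    return 0;
}
```

A caller such as VCCV or VCV could then skip ahead by the returned length and treat the whole sequence as one consonant.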

I didn't see much use of C-style strings here. That is not a major problem, but do you need to know how to do this in C, or is pulling the C-string out of a std::string, as you are already doing, sufficient? Strings have a built-in method, c_str(), that returns their C-string, so don't take the address of the first character manually. Doing it manually is dangerous: you hand someone a pointer, they assume it is a C-string they can edit, and editing it will cause 'bad things' to happen. If this were anything bigger than a homework problem, you might want to allocate a real C-string and copy your answer into it, just to be extra safe.
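
As a sketch of the "allocate a real C-string and copy" suggestion, assuming a hypothetical helper named toOwnedCString (the name and the delete[] ownership convention are assumptions, not part of the assignment). This also matters for process() as posted, because stringResult is a local variable and a pointer into it is no longer valid once the function returns:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: returns a heap-allocated, NUL-terminated copy of s.
// The caller owns the buffer and must release it with delete[].
char* toOwnedCString(const std::string& s) {
    char* buffer = new char[s.size() + 1];         // +1 for the terminator
    std::memcpy(buffer, s.c_str(), s.size() + 1);  // c_str() ends with '\0'
    return buffer;
}
```

With something like this, process() could end with return toOwnedCString(stringResult); and the caller would free the result with delete[].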
So I created the following function, but I'm not sure how I would incorporate it:

//Checks if the word contains one of the sequences to be ignored by hyphens
bool sequenceConstant(string word){
    bool result = true;
    std::transform(word.begin(), word.end(), word.begin(), ::tolower);
    if(word.find("qu") != string::npos||
       word.find("tr") != string::npos||
       word.find("br") != string::npos||
       word.find("str") != string::npos||
       word.find("st") != string::npos||
       word.find("sl") != string::npos||
       word.find("bl") != string::npos||
       word.find("cr") != string::npos||
       word.find("ph") != string::npos||
       word.find("ch") != string::npos){
        result = false;
    }
    return result;
}
Yes, that would be the general idea.
To put it all together, something along these lines, perhaps:

#include <iostream>
#include <string>
#include <cctype>
#include <utility>

bool is_vowel( char c ) // return true if c is a vowel (non-phonetic)
{
    static const std::string vowels = "AEIOUaeiou" ;
    return vowels.find(c) != std::string::npos ;
}

std::string to_lower( std::string str )
{
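    // cast to unsigned char first: std::tolower is undefined for negative values
    for( char& c : str ) c = static_cast<char>( std::tolower( static_cast<unsigned char>(c) ) ) ;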
    return str ;
}

// return true if the next three characters from position index
// form a three-character consonant (currently, only "str")
bool is_consonant3( const std::string& str, std::size_t index )
{ return to_lower( str.substr( index, 3 ) ) == "str" ; }

// return true if the next two characters from position index
// form a two-character consonant
bool is_consonant2( const std::string& str, std::size_t index )
{
    static const std::string consonants[] = { "qu", "tr", "br", "st", "sl", "bl", "cr", "ph", "ch" } ;

    const std::string candidate = to_lower( str.substr( index, 2 ) ) ;
    for( const auto& cons2 : consonants ) if( candidate == cons2 ) return true ;
    return false ;
}

enum type { VOWEL, CONSONANT, NEITHER };

// classify the next component of str, starting at position index
// return a pair with first == type and second == number of characters
std::pair< type, std::size_t > classify( const std::string& str, std::size_t index )
{
    // no more characters
    if( index == str.size() ) return { NEITHER, 0 } ;

    // consonant, 3 characters
    if( is_consonant3( str, index ) ) return { CONSONANT, 3 } ;

    // consonant, 2 characters
    if( is_consonant2( str, index ) ) return { CONSONANT, 2 } ;

    // vowel, 1 character
    if( is_vowel( str[index] ) ) return { VOWEL, 1 } ;

    // alpha, consonant, 1 character
    if( std::isalpha( static_cast<unsigned char>( str[index] ) ) ) return { CONSONANT, 1 } ;

    // non-alpha, 1 character
    else return { NEITHER, 1 } ;
}

int main()
{
    const std::string test_str = "abra(aqusto) trabrbst astrid!" ;
    std::size_t index = 0 ;

    while( index < test_str.size() )
    {
        const auto [ ty, n ] = classify( test_str, index ) ;

        std::cout << "\n\n" << test_str << '\n' << std::string( index, ' ' ) << std::string( n, '^' )
                  << "\n'" << test_str.substr(index,n) << "'  " ;

        if( ty == VOWEL ) std::cout << "VOWEL\n" ;
        else if( ty == CONSONANT ) std::cout << "CONSONANT-" << n << '\n' ;
        else std::cout << "NEITHER\n" ;

        index += n ;
    }
}

http://coliru.stacked-crooked.com/a/2b5773abb4084bfe
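
One possible way to go from that classification to the actual hyphenation, as a sketch only. The names Unit, splitIntoUnits and hyphenate are hypothetical, and two assumptions are made: 'y' counts as a vowel (as in the original isAVowel), and the hyphen goes between the two consonants for VCCV and before the consonant for VCV, which is what the "Lorem ipsum" example suggests. Hyphen positions are computed on the unmodified text, so inserting one hyphen never shifts the index of the next:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Vowel, Consonant, Other };

struct Unit {
    Kind kind;
    std::size_t start;   // index of the unit's first character
    std::size_t length;  // number of characters in the unit
};

char lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Assumption: 'y' is a vowel, as in the original isAVowel.
bool isVowel(char c) {
    return std::string("aeiouy").find(lower(c)) != std::string::npos;
}

// Length of the protected sequence starting at pos, or 0 if there is none.
std::size_t clusterLength(const std::string& text, std::size_t pos) {
    static const std::string clusters[] = {
        "str", "qu", "tr", "br", "st", "sl", "bl", "cr", "ph", "ch"};
    for (const std::string& cluster : clusters) {
        std::string candidate = text.substr(pos, cluster.size());
        for (char& c : candidate) c = lower(c);
        if (candidate == cluster) return cluster.size();
    }
    return 0;
}

// Split text into vowels, consonants (single or protected) and other characters.
std::vector<Unit> splitIntoUnits(const std::string& text) {
    std::vector<Unit> units;
    for (std::size_t pos = 0; pos < text.size();) {
        std::size_t length = clusterLength(text, pos);
        Kind kind = Kind::Consonant;  // a protected sequence is one consonant
        if (length == 0) {
            length = 1;
            const unsigned char c = static_cast<unsigned char>(text[pos]);
            kind = isVowel(text[pos]) ? Kind::Vowel
                 : std::isalpha(c)    ? Kind::Consonant
                                      : Kind::Other;
        }
        units.push_back({kind, pos, length});
        pos += length;
    }
    return units;
}

std::string hyphenate(const std::string& text) {
    const std::vector<Unit> units = splitIntoUnits(text);
    std::vector<bool> hyphenBefore(text.size() + 1, false);
    auto kindAt = [&](std::size_t i, Kind k) {
        return i < units.size() && units[i].kind == k;
    };
    for (std::size_t i = 0; i < units.size(); ++i) {
        if (!kindAt(i, Kind::Vowel)) continue;
        if (kindAt(i + 1, Kind::Consonant) && kindAt(i + 2, Kind::Consonant) &&
            kindAt(i + 3, Kind::Vowel))
            hyphenBefore[units[i + 2].start] = true;  // VC-CV
        else if (kindAt(i + 1, Kind::Consonant) && kindAt(i + 2, Kind::Vowel))
            hyphenBefore[units[i + 1].start] = true;  // V-CV
    }
    std::string result;
    for (std::size_t pos = 0; pos < text.size(); ++pos) {
        if (hyphenBefore[pos]) result += '-';
        result += text[pos];
    }
    return result;
}
```

Because spaces and punctuation are classified as Other, they break every pattern, so the whole input can be passed in one call without splitting it into words first.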
I am not sure you are 100% on the right track (close, but some details are off).
Question one is whether you need to treat these new sequences as a single letter (the instructions indicate yes) and THEN break according to the rules as if each were one letter.

Question two is what happens if you had
Loremph

Is that Lo-remph? find would locate the "ph" here.

When find returns a position, you have to check whether that position lies inside the region of interest. You are currently looking at cvcvccc, or cVCVccc with the match marked, and the "ph" is outside the VCV area.

If you condense the "ph" to a single c, it's cVCVcc, of course.
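
To make the condensing step concrete, here is a small sketch with a hypothetical helper, condensedPattern, that writes one symbol per unit. It assumes 'y' counts as a vowel, as in the original isAVowel, and it uses compare at a fixed position rather than find, so a sequence elsewhere in the word cannot affect the result:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a word to its pattern, one symbol per unit:
// 'v' for a vowel, 'c' for a consonant or a protected sequence, '?' otherwise.
std::string condensedPattern(const std::string& word) {
    static const std::string protectedSeqs[] = {
        "str", "qu", "tr", "br", "st", "sl", "bl", "cr", "ph", "ch"};

    std::string lowered;
    for (char c : word)
        lowered += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    std::string pattern;
    for (std::size_t pos = 0; pos < lowered.size();) {
        std::size_t step = 1;
        char symbol = '?';
        for (const std::string& seq : protectedSeqs) {
            // compare(pos, len, str) checks the text at pos only, unlike find().
            if (lowered.compare(pos, seq.size(), seq) == 0) {
                step = seq.size();
                symbol = 'c';
                break;
            }
        }
        if (symbol == '?') {
            const char c = lowered[pos];
            if (std::string("aeiouy").find(c) != std::string::npos) symbol = 'v';
            else if (std::isalpha(static_cast<unsigned char>(c))) symbol = 'c';
        }
        pattern += symbol;
        pos += step;
    }
    return pattern;
}
```

For "Loremph" this gives cvcvcc: the only VCV match is o-r-e, and the trailing "ph" is a single consonant that is never split.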