New to working with bits. Not exactly going my way. Help!

I'm working on this problem here. Its just for my own enjoyment/benefit. I'm not even sure if you'll be able to view the problem without being signed in.
https://www.hackerrank.com/challenges/acm-icpc-team/problem

If you have google its like a one click sign in and im not here to promote the site even tho i like it alot. Anyways.

I've always had a vague understanding of binary. This problem seems perfect for a binary type solution bc it should add some level of efficiency as opposed to doing the addition one digit at a time. Least i'd assume. So to the problem i'm having.

Taking a string and converting it to binary seems to work fine via


void strtobiarr(string& str){
unsigned long long tmp=0;
int yy=sizeof(unsigned long long)-1;
    for(int x=0;x<str.size();x++){

        if (str[x]=='1'){
            tmp^= 1<<yy;
        }
        
       // cout<<bitset<sizeof(8)>(tmp)<<'\n';
        yy--;
        if(yy==-1){
            yy=sizeof(unsigned long long)-1;
            biarr.push_back(tmp);
            tmp=0;
        }
    }
}



The issue i'm having with this is that i can't seem to get more than 8 bits to attach to tmp? I realize from what i posted that would make sense but if i change the range from 63-0 some really weird stuff starts happening. From what i've seen an ULL is supposed to be 64 bits? 8 bytes * 8 bits. However nothing i seem to do allows me to actually store 64 bits. I'm attempting to create a struct that breaks the string up into a vector of vector<unsigned long long> 's and then use operator overloads on the elements to do various operations which everything seems to work pretty smoothly with 8 bits but if all i'm accomplishing is taking one iteration through and dividing it by 8 its hardly worth the effort. Any help would be appreciated. Thanks guys.
Last edited on
no such thing as biarr
is this what you want?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19


unsigned long long strtobiarr(string& str)
{
  unsigned long long tmp = 0;
  int pow2 = 1;
  for(int i = str.length()-1; i>=0; i--, pow2*=2)	  
  {
    tmp += (str[i] == '1')* pow2;	
  }
return tmp;
}

int main()
{
  string s = "1000000010";  //512+2 = 514
  cout << strtobiarr(s) << endl;
}


I am not sure ULL has to be 64 bits. I think it 'usually is'. Its 64 on my compiler.
if you want to be sure, use the named ones. uint64_t type stuff that tells you how many bits you want. Endian matters.
Last edited on
no such thing as biarr


thats a struct member variable. I wasnt going to post it all. I will if you want me to. I didn't think that variable had much relevance as to what the code is trying to accomplish which is compare the char, set the bit or not and print it out.
Last edited on
Looks like I can read the problem without an account. I can't submit it though; try this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
vector<int> acmTeam(vector<string> strs) {
  std::vector<bitset<500>> topics {
    std::make_move_iterator(strs.begin()), 
    std::make_move_iterator(strs.end()) };

  int max_tk = 0, n_teams_max_tk = 0;

  for (int i = 0; i < topics.size(); ++i)
    for (int j = i + 1; j < topics.size(); ++j)
    {
      int tk = (topics[i] | topics[j]).count(); 
      if (tk >= max_tk)
      {
        if (tk == max_tk) n_teams_max_tk++;
        else n_teams_max_tk = 1;

        max_tk = tk;
      }
    }

  return std::vector<int>{ max_tk, n_teams_max_tk };
}
Last edited on
oh, and pow2 needs to be ull too, oops ;)
Looks like I can read the problem without an account. I can't submit it though; try this:


I wasn't seeking an easy answer but what you have posted is definitely inline with the path i was traveling. I had the problem solved with the test cases that they provide the thing that was getting me was the string being up to [500]. Was not aware that i could use bitset as a datatype... that might have been useful 3 hrs ago ha!.

@Jonin the reason why i was trying to view output as binary was so that i could see if it matched up with the input after i was done shuffling it around etc.

i'm going to see what i can do with that bitset data type, that will definitely simplify a few things. Thanks guys i'll report back here in a few.
ok sweet this actually gives me the output that i'm expecting even though for some reason i have to write to the bitset vector in reverse order but progress. Super sweet.
1
2
3
4
5
6
7
8
vector<bitset<500>> sets(topic.size(),0);

for(int x=0;x<topic.size();x++){
    for(int y=0;y<topic[x].size();y++){
        if(topic[x][y]=='1'){sets[x][topic[x].size()-(y+1)]=1;}
    }
    cout<<sets[x]<<'\n';
}


great i got everyone to shutup in my house and now i think theres a cricket in here. fantastic

Edit: yep that basically solved all my problems. worked like a champ passed all the test cases. Figured i'd post the results.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

vector<int> acmTeam(vector<string> topic) {
vector<int> result(2,0);
vector<bitset<500>> sets(topic.size(),0);

for(int x=0;x<topic.size();x++){
    for(int y=0;y<topic[x].size();y++){
        if(topic[x][y]=='1'){sets[x][topic[x].size()-(y+1)]=1;}
    }
}

int teams=0;
int subs=0;
for(int x=0;x<sets.size();x++){
    
    for(int y=x+1;y<sets.size();y++){
        bitset<500> tmp=sets[x]|sets[y];
        int tempsubs=0;
        for(int z=0;z<tmp.size();z++){
            if(tmp[z]){tempsubs++;}
        }
        if(subs<tempsubs){subs=tempsubs;teams=1;}
        else if(subs==tempsubs){teams++;}
    }
}

 result[0]=subs;result[1]=teams;
return result;

}


I was thinking i could make the bitset sizes more dynamic to shave off time but it passed thats all i care about.

after thinking about the direction the bitset was written i guess it didn't really even matter. the result of the or operation would have been the same regardless. I will definitely say this is a handy tool to have bc theres so many problems that get dumbed down to true or false and being able to weed out differences between sets of objects en mass via 1 operation. fantastic.
Last edited on
Dynamic sizing of bitsets is a problem hence the 500 to cover all cases.

It's easy enough to extract the data from the sample file as bitsets and convert the bits to strings of length m as below, even if you aren't in a position to use stringview.

Despite that maybe just stick to bitsets and use OR/AND blah blah on the two team members over the range of m and keep count of that. i.e. conversion to strings most likely isn't needed.

(The number of teams is a worry for time.)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#include <iostream>
#include <fstream>

int main()
{

    std::ifstream in_file;
    in_file.open("bits_ACM.txt");
    
    int n{0}; // NO. OF ATTENDEES
    int m{0}; // NO> OF TOPICS
    
    if(in_file)
    {
        in_file >> n >> m;
        
        std::bitset<500> temp;
        
        for(int i = 0; i < n; i++)
        {
            in_file  >> temp;
            std::cout << temp.to_string().substr(500-m, 500) << '\n';
        }
        
        in_file.close();
    }
    else
        return -1;
    
    return 0;
}



10101
11100
11010
00101
Program ended with exit code: 0
I was playing around with it for a bit. seems like its pretty well hell bent against it. It allows you to use sizeof() but only so much can be done with that. I was looking at some of the built in functions for bitset and it figures theres a function that returns a count of the bits that are set... good for future reference tho.
Works with the sample data set by the look of it but whether it scales up?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
#include <iostream>
#include <fstream>
#include <vector>

struct Team
{
    int i{0};
    int j{0};
    size_t count{0};
    
    friend std::ostream& operator<< (std::ostream& out, Team& tm)
    {
        out << '(' << tm.i << ',' << tm.j << ") " <<  tm.count;
        return out;
    }
};

int main()
{
    std::ifstream in_file;
    in_file.open("bits_ACM.txt");
    
    std::vector<Team> Team_List;
    
    int n{0}; // NO. OF ATTENDEES
    int m{0}; // NO> OF TOPICS
    
    Team temp;
    
    if(in_file)
    {
        in_file >> n >> m;
        
        std::bitset<500>* attendee = new std::bitset<500>[n]; // EDIT n, not m
        
        for(int i = 0; i < n; i++)
        {
            in_file  >> attendee[i];
        }
        
        // COMBINE PLAYERS FOR ALL POSSIBLE TEAMS
        for(int i = 0; i < n; i++)
        {
            for(int j = i + 1; j < n; j++)
            {
                temp.i = i + 1;
                temp.j = j + 1;
                temp.count = (attendee[i] | attendee[j]).count();
                
                Team_List.push_back(temp);
            }
        }
        
        // DISPLAY ALL TEAMS
        for(auto t: Team_List)
            std::cout << t << '\n';
        std::cout << '\n';
        
        // FIND MAXIMUM COUNT
        auto it = std::max_element
        ( Team_List.begin(), Team_List.end(), [](const Team& A,const Team& B)
        { return A.count < B.count; } );
        
        size_t max_count = (*it).count;
        
        // DISPLAY TEAM(S WITH MAXIMUM COUNT
        for(auto t: Team_List)
        {
            if (t.count == max_count)
                std::cout << t << '\n';
        }
        
        in_file.close();
    }
    else
        return -1;
    
    return 0;
}



(1,2) 4
(1,3) 5
(1,4) 3
(2,3) 4
(2,4) 4
(3,4) 5

(1,3) 5
(3,4) 5
Program ended with exit code: 0


EDIT: Changed the lot to latest final - scale-up/time might be an issue.
Last edited on
you may also look at vector bool, which is oddly sometimes better suited than bitset
@jonin with this particular exercise tho if you used a vector bool then it would almost defeat the purpose of doin any conversion away from the string to begin with. i'm sure there would be some speed benefit due to the more efficient nature of the vector bool but wouldn't the conversion process negate it? Unless there is a way to or 2 vector bools together. As i said the added benefit using the bitset is that count method.

@againstry my final answer is what it is bc i'm only saving the relevant information. The problem only requires the answer have the number of sets of teams that have the maximum number of subjects and what the max subjects number is of the combined teams. Part of the reason to shoot for maximum efficiency bc if the code doesn't run under a certain time on these tests it will fail you. That is usually the difference between a couple extra memory operations like push_backs or an extra loop or 2. Your code looks legit, it would probably get the job done. The only thing i really don't get is why you're creating attendee on the heap? I realize this is just playing around but shouldn't you delete when you're done with it?
Last edited on
I don't know if you can swindle a vec bool into a 64 bit register and do it in chunks efficiently or not. I have not tried. Its one of those things that if you are unhappy with your best performance, you can start fooling with exotic alternate approaches...
(not even bitset can or something bigger than 1 register in one clock... ) it may be fast, but its still going to have to iterate if you exceed the hardware.
Last edited on
Counting the number of set bits is potentially a slow operation.

It's fairly instructive to look at the generated code (with and without -march=native):
https://godbolt.org/z/6bGh7j

If this is a bottleneck, there's potentially a lot of room for optimization.
http://0x80.pl/articles/sse-popcount.html

SA:
https://github.com/kimwalisch/libpopcnt/blob/master/libpopcnt.h
Last edited on
@martyrocks
I am reproducing the sample info and step by step making sure of that, hence all the display stuff etc. All of that would be cut out, along with the struct etc, if I wanted to submit it.

attendee on the heap is because they specify how many and therefore why not do the absolute minimum instead of assuming the maximum was the reason.

delete should be there but program ending solves that - if not they’re in for a shock ;)
And, cutting all the overheads:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#include <iostream>
#include <fstream>
#include <bitset>

int main()
{
    std::ifstream in_file;
    in_file.open("bits_ACM3.txt");
    
    int no_attendees{0};
    int no_topics{0};
    
    std::bitset<500>* attendee = nullptr;
    
    int count_arr[500]{0};
    
    if(in_file)
    {
        in_file >> no_attendees >> no_topics;
        attendee = new std::bitset<500>[no_attendees];
        
        for(int i = 0; i < no_attendees; i++)
        in_file  >> attendee[i];
        
        size_t team_count{0};
        size_t team_count_max{0};
        
        for(int i = 0; i < no_attendees; i++)
        {
            for(int j = i + 1; j < no_attendees; j++)
            {
                team_count = (attendee[i] | attendee[j]).count();
                count_arr[team_count]++;
                
                if(team_count > team_count_max)
                    team_count_max = team_count;
            }
        }
        
        std::cout << team_count_max << '\n'; // MAX NO. TOPICS
        std::cout << count_arr[team_count_max] << '\n'; // NO. TEAMS WITH MAX
        
        in_file.close();
    }
    else
        return -1;
    
    return 0;
}
Looks good. The only thing that I'd probably change is the attendee being created on the heap, the count array is probably unnecessary, i also like using vectors bc you can use .size() instead of a separate counter but bitset on its own has its own counter which is conveniently also .size(). Bitset is probably a template placed over top of a vector bool anyways so I guess that would make sense. I like how you or'ed them together then used .count() on the result. Tricky.

My keyboard on my laptop broke for no apparent reason. Idk wtf happened. Must have beat it to death. Ridiculous.

You should join hackerrank the site is amazing. Improving my skills every day.
Last edited on
Bitset is probably a template placed over top of a vector bool anyways so I guess that would make sense.
Since the size of a bitset is set at compile time, I doubt that it's implemented on top of vector<bool>
you cannot rely on it, but *some* intel chips have an assembly command for this.
https://www.felixcloutier.com/x86/popcnt

so if its on intel and you don't mind a couple lines of inline asm...

one would think an intel library bitset would use this, but all too often the libraries are not assembly optimized.
Last edited on
Topic archived. No new replies allowed.