Going a "bit" bananas!! HA


Ok so I had read a post the other day and I believe it was jonin that said something about being able to force a number to be odd via bitwise or=1. I thought that was pretty cool and playing around in the bits wasn't something I had really done b4. I've touched on it but figured it was time to really get my feet wet. not to mention that I was kinda tired of not having even a simple set of functions that would tell me which bits were flipped and to target specific bits and flip them individually if I wanted ect.

That and I'm interested in finding primes in big numbers and the prime sieve has come up many times and the point of this is instead of using bitset container or a vector just go straight to the source and start my own class geared towards making a bitset type container with the intention of eventually using it with the sieve and finding primes in big numbers... (nerd)

I kinda wanted to post this bc i'm pretty happy with the way its turning out so far. But I did have a question. In my class I included a destructor. The only thing the destructor does is delete the array of bytes which is the main focus of the whole class. I was under the assumption that the destructor wasn't called until the object went out of scope but for some reason if the overloaded operator[] returns a copy of the class by value it goes out of scope but if I return a class pointer its fine. I guess in that instance the original would go out of scope. Its kinda irritating bc I know returning the pointer is always going to be faster than making a copy but then the pointer needs dereferenced to then be used with the other operators. I realize that its better for efficiency to be passing around pointers but when you start with just an object then it miraculously turns into a object pointer that can be very confusing...Any suggestions on how to work that all out?

I'm open to any critiques or criticisms. Be gentle I just learned how to count in binary yesterday.

o one last thing. I have it set up so bit 0-7 is left to right. So backwards
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152

//part 1

#include <Windows.h>
#include<string>
#include<iostream>
using namespace std;

struct b {
private:
	byte control;
	size_t size;
	size_t index;
	unsigned long long realsize;
	int bit;
	byte* _byte;
public:

	b() : index(0), bit(0), size(1), realsize(1), control(0) {
		_byte = new byte[1]{};
		this->populate();
	}
	b(unsigned long long i) : index(0), bit(0), size(ceil(i/8.0) ? ceil(i / 8.0) : 1), realsize(i),control(0) {
		_byte = new byte[ size ];
		this->populate();
	}
	b(unsigned long long i, byte* bi) : index(0), bit((i % 8)), size(ceil(i / 8.0)), realsize(i),control(*bi) {
		_byte = new byte[size];
		this->populate();
		this->clearRemainder();
	}
	b(unsigned long long i, bool bo) : index(0), bit((i % 8)), size(i / 8.0 ? ceil(i / 8.0) : 1), realsize(i), control( bo==1 ? 255 : 0) {
		_byte = new byte[size];
		this->populate();
		this->clearRemainder();
	}
	~b() {
		delete[] _byte;
	}
private:
	
	void realocate() { 
		byte* t = (byte*)malloc(this->size+1); //get new memory

		if (t) {
			ZeroMemory(t, size + 1);
			memcpy(t, this->_byte, size);
			delete[] this->_byte;

			this->_byte = t;  //watch this go down in flames.
			size++;
		}
	}
	void realocate(unsigned long long ull) {
		this->setIndexAndBit(ull);
		byte* t = (byte*)malloc(this->index + 1); 

		if (t) {
			ZeroMemory(t, index + 1);
			memcpy(t, this->_byte, size);
			delete[] this->_byte;

			this->_byte = t;  
			size = index + 1;
		}
	}
	void setIndexAndBit(unsigned long long i) {
		this->bit = i % 8;
		this->index = (i - bit) > 7 ? (i - bit) >> 3 : 0;
	}
	void populate() {  
		for (size_t i = 0; i < size;i++) {
			_byte[i] = control;
		}
	}
	void clearRemainder() {
		if (!bit) { return; }
		for (int i=7;i> bit-1; i--) {
			this->_byte[size - 1] &= ~(1 << (7 - i)); 
		}
		bit = 0;
	}
	string ByteToString() {
		string s{};
		for (int i = 0; i < 8; i++) {
			if (checkbit(i)) { s.push_back('1'); }
			else { s.push_back('0'); }
		}
		return s;
	}
	bool checkbit(int bi) {
		if (this->index + 1 > this->size) { return false; }
		if (this->_byte[this->index] == (this->_byte[index] | (1 << (7 - bi)))) {
			return true;
		}
		return false;
	}
	bool checkbit(byte bi, int i) {
		if (bi == (bi | (1 << (7 - i)))) {
			return true;
		}
		return false;
	}
	void bitshiftright(unsigned long long ull) {
		unsigned long long s = (unsigned long long)this->size;
		if (realsize + ull > s * 8) {
			this->realocate(realsize + ull);
		}
		for (unsigned long long i = realsize - 1; ; i--) {
			this->SetBit(i+ull, test(i));
			if (i == 0) { break; }
		}
		
		for (unsigned long long i = 0; i < ull; i++) { //cleans up the left
			this->SetBit(i, 0);
			this->SetBit(realsize + i, 0);
		}
		this->shrink();
	}
	void bitshiftleft(unsigned long long ull) {
		unsigned long long s = (unsigned long long)this->size;
		if (realsize + ull > s * 8) {
			this->realocate(realsize + ull);
		}
		for (unsigned long long i = 0; i < realsize; i++) {
			this->SetBit(i, test(i + ull));
		}
		for (unsigned long long i = realsize; i < ull + realsize; i++) { //cleans up the right
			this->SetBit(i, 0);
		}
	}
	void alignright() { 
		
		while (!this->_byte[this->size - 1]) {
			for (auto i = size - 1; i > 0; i--) {
				if (!this->_byte[i]) {
					this->_byte[i] = this->_byte[i - 1];
					this->_byte[i - 1] = 0;
				}
			}
		}
		unsigned long long s = (unsigned long long)this->size;
		this->realsize = s * 8;
	}
	bool compare(bool bo) {
		if (bo) {
			return _byte[index] == (_byte[index] | (1 << (7 - bit)));
		}
		else {
			return _byte[index] == (_byte[index] & ~(1 << (7 - bit)));
		}
	}


i was just thinking that anyplace i have pow(ect, I could just use 1<<bit. probably alot more efficient.

Last edited on
Do you know about the std::bitset class? See http://www.cplusplus.com/reference/bitset/bitset/
vector<bool> is supposed to be (I think the standard has a soft MAY in there, or is it hard now?) optimized to the bit level. That is, vector<bool> x(128) will be optimally be stored in 2 64 bit ints or 4 32 bit ints or something, rather than 128 bytes. Vector also manages memory for you, so you can avoid the destructor problem you are having (or if you want to continue making bytes, use a vector there and it will clean itself up properly).

Try hard to write this avoiding dynamic memory, unless your goal is to study dynamic memory, in which case, I can look deeper and try to fix what you have.

if you want to do large primes / large integer work, use a free large integer library from somewhere. If you want to write your own for practice, use the largest integer type you have (probably 64 bit int) as the base .. when you multiply or add or whatever you just carry it over, but instead of base 10 use base 2^63 or whatever. The idea is that the cpu does the work in the largest chunks it can handle for performance and memory footprint etc reasons.
@seeplus I appreciate the response but I'm aware of the bitset class. I had mentioned that in my original post. The only issue I have with it is that I have no idea what is going on in there, I could always find out but look how much fun I'm having. This is just a learning experience.

@jonin if you want to play around with it be my guest. I'm not too worried about it. What you mentioned about the vector bool is what I was thinking about. I had played with the vector bool and broke it bc the numbers I was looking at exceeded the maximum vector size. The point of what im trying to do is essentially make a pseudo vector bool at the bit level with 0 or 1 signifying true or false. Any which way you slice it the size is not going to get any smaller unless I pretend every bit is odd, hence skip the even entirely.

The issue with the [ ] operator is that if it returns a vector or even a byte then the subsequent = or == operator wouldn't work. Honestly tho I'm fine with it this way. I'd rather know the main memory isn't being copied or deleted erroneously. The ultimate goal here is efficiency. As it is it has fat to cut but I just wanted to get it working and then I can clean it up some. I'm just happy I was able to make it work. The other goals being learning and actual comprehension. I definitely feel like that mission has been accomplished.

I'm not really interested in creating a big number library but that being said any number of any size could be saved via a byte array. The only trick would be doing math operations on a array of bytes with another array. I guess in that regard you could use the biggest number variable size. Maybe that's something I'll have to think about. I've done array on array math b4 but not at the bit level. It's something id have to sit down with pen and paper and see what I came up with.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274

//part 2

public:
	
	void shrink() {
		unsigned long long ull = size - 1;

		while (_byte[ull] == 0) {
			ull--;
			if (ull == 0) { break; }
		}
		this->resize(((ull+1) << 3));
	}
	void operator>>(unsigned long long ull) {
		this->bitshiftright(ull);
	}
	void operator<<(unsigned long long ull) {
		return this->bitshiftleft(ull);
	}
	bool operator==(bool bo) {
		return this->compare(bo);
	}
	void operator=(bool bo) {
		if (bo) {
			this->_byte[index] |= (1 << (7 - bit));
		}
		else {
			this->_byte[index] &= (~(1 << (7 - bit)));
		}
	}
	b& operator=(b& other) {

		while (other.realsize > this->realsize) { this->pushback(false); }
		memcpy(this->_byte, other._byte, size);
		return *this;
	}
	b& operator-(b& other) {

	}

	b& subtract(b& other) { //untested not finished

		if (this->realsize < other.realsize) {
			while (this->realsize < other.realsize) { this->bitshiftright(1); } //not efficient
		}
		else if (other.realsize < this->realsize) {
			while (other.realsize < this->realsize) { other.bitshiftright(1); }
		}

		//this->bitshiftright(1);
		//other.bitshiftright(1);
		other.shrink();
		this->shrink();

		for (unsigned long long i = 0; i < realsize; i++) {

			if ( this->test(i) == 1 && other.test(i) == 0 ) {
				this->SetBit(i, 1);//unecessary
			}
			else if (this->test(i) == 1 && other.test(i) == 1) {
				this->SetBit(i, 0);
			}
			else if (this->test(i) == 0 && other.test(i) == 1) {
				for (unsigned long long j = i;; j--) {
					if (this->test(j) == 0) {
						this->SetBit(j, 1);
					}
					else if (this->test(j) == 1) {
						this->SetBit(j, 0);
						break;
					}

					if (j == 0) { break; } //gets to here it would be negitive... not sure how to handle that
				}

			}

		}
		//this->bitshiftright(7);
		//this->shrink();
		return *this;

	}

	b& operator[](unsigned long long i) {
		this->setIndexAndBit(i) ;
		if (index + 1 > size) { 
			this->realocate(); 
			this->realsize = i + 1;
		}
		return *this;
	}
	b& operator+(b& other) { 

		return this->add(other);
	}
	b& add(b& other) {
		
			while (this->realsize < other.realsize) { this->pushback(false); } 
			
			while (other.realsize < this->realsize) { other.pushback(false); }
			
			/*this->print();
			cout << "\n";
			other.print();*/


			this->pushback(false);
			other.pushback(false);
			this->bitshiftright(1);
			other.bitshiftright(1);
			this->alignright();
			other.alignright();

			//this->print();
			//cout << "\n";
			//other.print();
		for (unsigned long long i = 0; i < realsize; i++) {

			if ((this->test(i) == 0 && other.test(i) == 1) || (this->test(i) == 1 && other.test(i) == 0)) {
				this->SetBit(i, 1);
			}
			else if (this->test(i) == 1 && other.test(i) == 1) {
				for (unsigned long long j = i;; j--) {
					if (this->test(j) == 1) {
						this->SetBit(j, 0);
					}
					else if(this->test(j)==0){
						this->SetBit(j, 1);
						break;
					}

					if (j == 0) { break; } //if it gets to here something went very wrong
				}

			}

		}
		return *this;
	}
	void resize(unsigned long long ull) {
		this->setIndexAndBit(ull-1);
		byte* t = (byte*)malloc(this->index + 1);

		if (t) {
			ZeroMemory(t, index + 1);
			memcpy(t, this->_byte, index + 1);
			delete[] this->_byte;
			this->_byte = t;
			size = index + 1;
		}
		this->realsize = ull; //this could be clear remainder...
		int i = 0;
		while ((realsize + i) % 8 != 0) {
			this->SetBit(realsize + i, 0);
			i++;
		}
	}
	unsigned long long capacity() {
		unsigned long long s = (unsigned long long)this->size;
		return s << 3;
	}
	void SetBit(unsigned long long ull, bool bo) {
		unsigned long long s = (unsigned long long)this->size;
		if (ull > (s<<3) - 1) { 
			this->realocate(ull); 
			this->realsize = ull + 1;
		}
		this->setIndexAndBit(ull);
		if (bo) {
			this->_byte[index] |= (1 << (7 - bit));
		}
		else {
			this->_byte[index] &= (~(1 << (7 - bit)));
		}
	}
	bool isEmpty() {
		return !realsize;
	}
	void swap(unsigned long long b1,unsigned long long b2) {
		bool bb1 = this->test(b1);
		this->SetBit(b1,this->test(b2));
		this->SetBit(b2, bb1);
	}
	bool test(unsigned long long ull) {
		this->setIndexAndBit(ull);
		return this->checkbit(bit);
	}
	void pushback(bool bo) {
		unsigned long long s = (unsigned long long)this->size;
		if (realsize+1 > s << 3 ) {
			this->realocate();
		}
		realsize++;
		this->setIndexAndBit(realsize - 1);
		this->SetBit(realsize - 1, bo);
	}
	void pushback(byte* bp) { //this might be useless
		for (int i = 0; i < 8; i++) {
			this->pushback(checkbit(*bp, i));
		}
	}
	void clear() {
		this->control = 0b00000000;
		this->populate();
	}
	void popback() {
		this->SetBit(realsize - 1, 0);
		realsize--;
	}
	void insert(unsigned long long ull, bool bo) {
		unsigned long long s = (unsigned long long)this->size;
		if (ull > ( s << 3 ) -1 ) {
			this->SetBit(ull,bo);
			realsize++;
			return;
		}
		else if (realsize + 1 > s << 3 ) {
			this->realocate();
		}
		for (unsigned long long i = realsize; i > ull; i--) {
			this->swap(i, i - 1);
		}
		this->SetBit(ull, bo);
		realsize++;
	}
	void flip(unsigned long long ull) {
		this->setIndexAndBit(ull);
		if (this->checkbit(bit)) {
			this->SetBit(ull, 0);
		}
		else { this->SetBit(ull, 1); }
	}
	void print() {
		for (size_t i = 0; i < size; i++) {
			index = i;
			cout << this->ByteToString() << "\n";
		}

	}
	unsigned long long count() {
		unsigned long long c{0};

		for (unsigned long long i = 0; i < realsize; i++) {
			if (this->test(i)) {
				c++;
			}
		}
		return c;
	}
	unsigned long long Size() {
		return this->realsize;
	}
};

int main() {
	b a(16, true);
	b ab(8, true);
	b c{};
	ab.SetBit(21, true);
	//cout << ab.Size();
	//cout << "\n";
	//ab.print();
	//cout << "\n";
	//a.print();
	c = a + ab;

	c.print();
	//cout << "\n";
	//c >> 5;
	//c.print();
} 
Last edited on
> I had played with the vector bool and broke it bc the numbers I was looking at
> exceeded the maximum vector size.

For huge numbers, consider implementing a segmented sieve.
https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Segmented_sieve
@JL that is a good point. The way the above code is setup I could easily take a b[ ] and have the sieve wrap around the corner essentially going forever. This has all got me thinking about a big number library.
operator[] should return a bool representing the bit.

Do you want to get fancy so you can set the bit like this?:
1
2
3
b mine;  // btw, "b" is a bad name for a type
...
b[22] = false;  // set a bit using the [] operator 


If so then have operator [] return a helper class:
BitHelper b::operator[](size_t pos) { return BitHelper(*this, pos); }

and the BitHelper looks something like this:
1
2
3
4
5
6
7
8
9
class BitHelper {
public:
    BitHelper(b &, size_t p) : myB(b), pos(p) {}
    operator bool() const;  // returns the value of bit #pos in myB
    void operator=(bool val); // sets the value of bit #pos in myB
private:
    b &myB;
    size_t pos;
};


1
2
3
4
5
6
7
8
9


Do you want to get fancy so you can set the bit like this?:

1
2
3
b mine; // btw, "b" is a bad name for a type ... b[22] = false; // set a bit using the [] operator 


This is currently the way it's designed. If you notice when the [ ] operator is called it sets the index position of the byte array and the bit variable is set which is the bit in question but it's reversed 0 is all the way to left via 01234567. It's a bit confusing to look at but I think it will make operations down the line more streamlined.

I guess can I overload the operator by return value alone? Maybe overload it with one taking an int returning bool, the other taking a bool returning b*?

It does actually work the way that it is. b[i]= will set a specific bit.

The code as it stands isn't very defensive, I just wanted to keep it light until its working the way I want it.
Last edited on
I figured i'd give an update bc its actually starting to look like somethin. I will say that I know that dynamic memory isn't exactly optimal but what I will say is that i've written this code in a way that makes it difficult maybe impossible to go out of bounds. The down fall to this is that I've determined that it is possible to get into a infinite loop. I wasn't really able to recreate it but it was similar to a situation where the variable that keeps track of the size was getting updated as memory allocations where happening in the same loop and it was kinda a snow ball effect. Anyways lets talk about the good.

If i was to create a b(100,true) and fill it with ones. if I then attempt to access say element 114. Theres a check depending on what type of access i'm after it either reallocates and does what I need it to do or if its just a simple check it will just return false bc clearly it doesn't exist so its 0. I found the draw back of this very quickly and i realize if I write code like this i'm going to be weary of the possibility of a math error leading to infinite loops. As it stands it handles a fair amount of being out of bounds. I may dial it back to ensure infinite loops don't happen. But I will say for only working on this for a few hours the other night and a solid 6-8 tonight, It almost looks like I know what i'm doing. You be the judge.

snip. updated the first post.
Last edited on
Here are several comments. I don't mean to nitpick. The code seems to mostly work and it's a great job for a beginner. I'm just trying to suggest ways to make it even better.

When I compile your code, I get a ton of warnings. You should review warnings and fix then as needed.

I don't like the fact that the class stores the "current" index and bit position. If nothing else, it means that many of your methods can't be const.

Why do some constructors call clearRemainder() and others don't?

realocate() should be reallocate()
Realocate() could simply call realloate(size+1);

Why do you access the bits backwards? I mean why use 7 - i everywhere when you could use i instead?

checkbit(int bi) should validate that bi is in range. Or perhaps use the low order 3 bits. Also, after parameter checking , the code should simply be
return _byte[index] & (1 << (7-bi));

I can't figure out what operator >>(unsigned long long ) and operator <<(unsigned long long) are supposed to do. What should b >> 8543 do?

After calling setIndexAndBit(), SetBit() should just do *this = bo;

You use many different parameter names to refer to "a bit position within the array". Sometimes it's i, sometimes it's ull, sometimes it's b1 or b2. Use a consistent, descriptive name, like "pos" (for position).

flip() should use the ^ operator: _byte[index] ^= 1 << (7-bit);

operator== could be bo == checkbit(bit);

Why does operator<< (ostream &, b&) print only one byte?
1
2
3
4
5
6
7
8
9
10
byte* t = (byte*)malloc(this->size+1); //get new memory

		if (t) {
			ZeroMemory(t, size + 1);
			memcpy(t, this->_byte, size);
			delete[] this->_byte;

			this->_byte = t;  //watch this go down in flames.
			size++;
		}


No. Don't mix malloc and delete. In C++ you should use new instead of malloc:

1
2
3
4
5
6
7
8
9
auto t = new (std::nothrow) byte [size + 1] {}; //get new memory

if (t) {
	memcpy(t, _byte, size);
	delete[] _byte;

	_byte = t;  //watch this go down in flames.
	++size;
}


What if memory allocation fails? By default, new will throw an exception so testing of variable is not needed.
@dhayden I appreciate it man. I totally agree with your point on the warnings, the majority of the warnings stem from casting unsigned long long to size_t. Size_t being the value placed in array indexes. It's definitely something I'll look into.

As far as accessing the bits in reverse hence 7-i vs i. I did this to try and ward off confusion for myself later on down the pike. Like to print the bits out in the correct order id have print them out in reverse not to mention iterating through them from 0-n would assume left to right and having an array of bits that are supposedly connected it seems to make sense that the string wouldn't reverse every 8 bits.. idk I could be wrong on that but it seems to make sense to me.

As for some of your other questions they can be chalked up to just getting the feel of what im trying to accomplish and early code vs being more settled in. Even tho the ull type probably isn't going to fly for everything.

You mention b1, b2. Thats the swap function so they actually represent 2 different bits. The ull name has kinda just turned into a utility all purpose name.

Ummmm. The ostream operator was for testing, I deleted it bc its useless. There's been many changes similar to some of your suggestions b4 I even read this. The >> operator is actually bitshift. It might have been broken in this version but it indeed slides the bits to the right the specified number of times. Like 11111111>>5 =0000011111111; I know it should clip the end but it doesn't atm.

The reason I have 2 different reallocate functions is bc the one does just add 1 byte the other adds as many as needed to satisfy the current operation. Last but not least the constructors that don't call clear remainder I believe the initialize everything to zero anyways. Kinda unnecessary. Like if I do

b a(10,true) if I didn't clear it would result in 11111111 11111111 vs 11111111 11000000.

@seeplus I'll keep that in mind. I only did that to just try out different methods to do things. If the malloc failed I could always recursively call the function (which could result in infinite loop of fails) but I've tested this thousands of times and it hasn't failed yet. If I was going to release this I'd clean it up better. Also as I said before to both of you guys this isn't very defensive and I'm aware of that.

I appreciate any input. I'll post an update tonight and you guys can tear it apart if you like. I'm never going to be a pro unless I take my licks. Thanks guys. I know wall of text sry.
Topic archived. No new replies allowed.