Understanding binary search trees

Hi everyone,

I have recently seen binary search trees and I'm trying to understand how its code works and how to implement it myself. The code below is the example I was given in my lecture. There are parts I do not understand and any help would be appreciated. I understande the theory behind binary search trees but the problem is understanding the code. My main issues are:

1.) What does this line of code actually do ?
 
typedef tree_node* NodePtr;


As far as I know it is basically allowing the pointer to have a different name so that instead of just being called tree_node, you can also call it NodePtr but I would like to be sure if that's all or if thereĀ“s anything else about it.


2.) In the insert function, why is ptr passed as a reference ? What's the meaning of this ?

3.) For the print in order function, When ptr=null because it starts from the root (or reaches a leave), what happens next ? How does the program now that now it has to go to travel up the left subtree throught the right branches, back to the root and then go into the right subtree ?

Please any help in making me understand binary search trees would be appreciated because right now I feel confused trying to understand how the code works.

Thank you.



Tree.h (Header file)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
  #ifndef TREE_H
#define TREE_H

struct tree_node { //Create Node "template".
    tree_node *left;   // left subtree has smaller elements
    tree_node *right;  // right subtree has larger elements
    int number; // This tree stores int values
};

typedef tree_node* NodePtr; /*Shortcut name. Instead of tree_node
we call it NodePtr. IS THIS RIGHT ????????????????????????????????????????????????????????*/



class Tree
{
private:
    NodePtr fRoot;
public:
    Tree();
    ~Tree();
    NodePtr &getRoot() { return fRoot; } //Why a reference ? Is this a function or an object ???? ????????????????????????????????????????
    void print_inorder(NodePtr ptr);
    void insert(NodePtr &ptr, int aNumber);
    void removeNode (NodePtr &ptr);
//    bool find(NodePtr ptr, int aNumber);
//    int getCount(NodePtr ptr);
};

#endif // TREE_H 



Tree.cpp (Source file)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
#include "tree.h"
#include <iostream>

using namespace std;

Tree::Tree()
{
    fRoot = NULL;
}

Tree::~Tree()
{
  removeNode(fRoot);
}

void Tree::print_inorder(NodePtr ptr) // WHAT'S THE VALUE OF PTR ???????????????????????????????????????????
{ /*When ptr=null because it has reached a leave, what happens next ? How does the program now that
   now it has to go to travel up the left subtree throught the right branches,
   back to the root and then go into the right subtree ????????????????????????????????????????????????????*/
    if (ptr != NULL)
    {
        print_inorder(ptr->left);       // print left subtree
        cout << ptr->number << endl;    // print this node--------WILL THIS LINE BE EXCUTED OR WHAT'S GOING ON EXACTLY HERE ??????????????????????????????????
        print_inorder(ptr->right);      // print right subtree--------WILL THIS LINE BE EXCUTED OR WHAT'S GOING ON EXACTLY HERE ??????????????????????????????????
    }
}


void Tree::insert(NodePtr &ptr, int aNumber){  //Why ptr passed as a reference here ????????????????????????????????????????
    //If we are at a leaf---End of tree
    if (ptr == NULL){ /* Enter here to create a new node (leaf)
        but how do you know if it is left or right child ?-----The else statement is used to test wheter your No. should
        go down a left or right branch. */

        ptr = new tree_node; // Create a new node.
        ptr->left = NULL; //Set the left branch to 0 (No child).
        ptr->right = NULL; //Set the right branch to 0 (No child).
        ptr->number = aNumber; //Store number in node ?

    } else {
        if (aNumber == ptr->number){ /* If Number = Number stored in
            node, exit function */

            return;
        } else if (aNumber > ptr->number){ /* If Number greater
          than the one in node, enter right branch. */

            //insert in right subtree
            insert(ptr->right, aNumber); /* From right branch keep
            moving down the tree (go back to start of function) until you
            find the leave. */

        } else {

            insert(ptr->left, aNumber);

            /* From left branch keep moving down the tree (go back to start of function)
             *  until you find the leave. */
        }
    }
}

void Tree::removeNode(NodePtr &ptr){
    if(ptr != NULL)
    {

            removeNode (ptr->left);

            removeNode (ptr->right);

        delete ptr;

    }
}


1.) What does this line of code actually do ?

typedef tree_node* NodePtr;

typedefs let you rename one type to another, for example calling unsigned char "byte" or something. Here, it says the type NodePtr is a tree_node*

2) the usual reason to pass a pointer as a reference is to change the pointer.
consider
void foo (int *&x)
{ x = new int; //you DO want the new value of x to be pushed back out to the caller here!!! }
this is a point of confusion for new programmers. you CAN Modify the thing pointed TO by the pointer, but not the pointer, when passed by value. You cannot change the pointer itself, though. To do that, the pointer itself must be by reference. The pointer itself and what it points to -- you must understand the difference for this detail.

3) print in order is recursive. So it prints the current value and then calls itself on left and right. eventually it is called with a null, does nothing, backtracks ...
so the algorithm here is 'print current node if not null', go left, go right. go left is the same algorithm on new data... and go right will be as well when it gets there..

1
2
3
 7
2 8
1

print 7. go left, root is 2. print 2. go left, root is 1. print 1. go left, its null, do nothing return root is 1. go right, null, do nothing, pop back out root is 1, but 1 is done, pop back out root is 2, go right, do nothing, pop back, 2 is done, pop back, root is 7, go right, print 8, go L/R do nothings, pop back, all done.. stop..

the way it 'knows' what to do is the implicit tracking done by the recursive call stack.
Last edited on
Hi Jonnin,

Thanks for your reply. I don't understand your answer to the 2nd question. As far as I have always undertood you can modify where the ppointer is pointing at and the value of the variable it's pointing at. But from what you are saying when passing a pointer by value:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
main(){
 
int x=5;
*p=x;

myfunction(p);

return 0;
}

myfunction(p)
{

int y=7;
*p=y;  

return (p)
}


Would be llegal ?
Last edited on
Also what I still don't get is that if I have my print in order function:

1
2
3
4
5
6
7
8
9
void Tree::print_inorder(NodePtr ptr) 
{
    if (ptr != NULL)
    {
        print_inorder(ptr->left);       // print left subtree
        cout << ptr->number << endl;    // print this node--------WILL THIS LINE BE EXCUTED OR 
        print_inorder(ptr->right);      // print right subtree--------WILL THIS LINE BE EXCUTED OR 
    }
}


Why if there is a left child, would the cout statement be printed ? Wouldn't the:

 
print_inorder(ptr->left);


make the program recurse back to the start of the function without printing and go into the left child ?
trees are not the best place to start learning recursion; have you seen it before a bit? You may need to drop out of the trees and study a simple example or two if you haven't.

Ill try again.. (and I mixed my head up and gave you pre-order not in-order because Im senile or something). But that is mostly irrelevant. My mistake just changes when it prints (there are 3 prints from trees, each with different uses). the only difference is the order of the calls in the code.

lets take it one input and one call at a time.
main calls print with the root, in my example, 7.
if its not null (it isnt, it 7)
call with 7's left, left is 2.
is 2 null, no.
call it with let. left is 1.
is 1 null? no.
call it with 1's left.
is it null? yes. the if statement fails, and it returns.
you are NOW in the cout statement of the previous call, which is with 1 passed in.
so it prints 1.
then it calls 1. right.
that is null, so it returns.
you are on the next statement in the '1' call .. which is nothing, so it exits the 1 call.
that pops the recursion back to 2, and its on the cout statement, so it prints 2 and then calls 2 .right which is null again.. and so on.

so just carefully watch what is passed in each time and what statement it is on as you trace it by hand for a small example.

you can MAKE the program generate this kind of text for you.
add some cout statements...
for example add this at the top inside the if statement to see how it moves through the tree.
cout <<"*" << ptr->number << "\n"; //* means debug statement to me, vs normal output lines. then you can add "going left" and "going right" or whatever you want to see to unravel it in action.
Last edited on
jefazo92 wrote:
I don't understand your answer to the 2nd question. As far as I have always undertood you can modify where the ppointer is pointing at and the value of the variable it's pointing at. But from what you are saying when passing a pointer by value:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
main(){
 
   int x=5;
   *p=x;

   myfunction(p);

   return 0;
}

myfunction(p)
{
   int y=7;
   *p=y;  

   return (p)
}


Would be llegal ?


(Indentation added by me)

That code is illegal for all sorts of reasons. You haven't defined defined any of the variables called p, not have you given p a value in main(). p doesn't point to anything, so when you try and dereference it using *p, you'll have undefined behaviour - if you're lucky, it will cause a crash. Nor have you defined the return type of myfunction.

But that's beside the point, I guess.

To turn it into legal C++, and assuming I understand the point you're trying to make, here's what you should have written:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
int main(){
 
   int x=5;
   int* p = &x;

   std::cout << "The value of x before calling myfunction is " << x;
   myfunction(p);
   std::cout << "The value of x after calling myfunction is " << x;
   return 0;
}

void myfunction(int* p)
{
   int y=7;
   *p=y;  

   // I don't know why you put this here, when you don't use the return value 
   // anywhere, and it obscures the whole point of using pointers to simulate
   // passing by reference.
   // return (p) 
}


So, once we've made the code legal and sensible, this would behave the way you expect. myfunction() would change the value of the thing p is pointing to, and this change would be seen in main() as a change to the value of of x.

We call this "passing by reference", although that's not technically true. We're passing the value of p - which is a pointer, remember - by value, and then changing the value of the thing p points to. The variable called p in myfunction() is a copy of the variable called p in main(). They both point to the same memory location, which means that when you change what's stored in that location in myfunction(), and then look at what's stored at that location in main(), you see the changed value.

So, although the pointer is passed by value, we can use it to simulate passing by reference the thing that the pointer is pointing to.

However, you're misunderstanding what jonnin said. You're talking about changing the value of the thing p points to. He's talking about the value of p itself. Passing p as a reference allows the function to change the value of p itself, and have that change be made to the value of p in main().
Last edited on
Topic archived. No new replies allowed.