Data Structure - LinkedList and how Pointers come into play - C++

         In my quest to solve Leetcode problems, I have come across Linkedlist. It has been long overdue for me to go through Linkedlist. Overall, I was familiar with the basic concepts and for the most part, It is not really hard to understand mainly because in Java, we do not really run into the complex part of pointers. That is because Java automatically handles the underlaying concept of pointers, I assume. So these two are really interdependent and I think the main reason you should learn Linkedlist is to understand pointers.   
       
        After really getting deeper into Linkedlist, I came to realization that this data structure isn't much about, in my opinion, about storing values and linking, but it is more about understanding how objects are connected using pointers. In this tutorial, I am trying to explain my understanding of Linkedlist and pointers using the things I learned. My personal desclaimer is that this sort of understanding is only possible if you really get your hands dirty with codes because even I could code Linkedlist without understanding pointers but once you seek the true meaning / understanding of this data structure, it is essential that you try the code by yourself and question everything that is written. Some of the things, for C++ programmers, may seem obvious but for new beginners, I think this explanation will help to visualize a bit more clearly. 

LinkedList & Pointers in C++

In Linkedlist, it is simply a list of objects. This list is connected by pointers. So Let's define a collection of items that is going to be added in linkedlist.

 
 //Node.h file
class Node {
    public:
        int data;
        Node* next;
};

So basically the objects of this class (Node), can be connected together using pointers to make a list. 

Now there are tons of tutorials on Linkedlist, so I am only explain my understanding of it using Push method. The simple operation of adding an object (also known as List item) at the start of a linked list is called push method. 

Now let's add code for how to create a node. 
 int main(){
    Node* head = NULL;
    Node* second = NULL;
    Node* third = NULL;

    //allocate three nodes in the heap
    head = new Node();
    second = new Node();
    third = new Node();
    /*
    Three blocks have been initialized dynamically. We have pointers to these 
    three blocks as head, second, and third. 
    */

    head->data = 1;
    head->next = second;

    second->data = 2;
    second->next = third;

    third->data = 3;
    third->next = NULL;

    //adds a node at the start, also known as push method.
    addNode(&head, 4);
    cout << "Push() = 4" << endl;
    printList(head);
    
    return 0;
}
 
This above code, creates  3 pointers of Node object. And calls method addNode(&head,4) which will perform the action of adding a node object to the front of the list of exisiting objects.

What is a pointer? 
Pointer is a place in memory which stores the address of a given type. 

In above picture, imagine a bix box is a memory board where all data is saved. I picked 4 blocks of variable "a" and "p" is because Int uses 4 bytes of memory, but you can represent it in one simple any box too.
So Pointer is simply a memory block where the memory address is saved of a type int since we initialized an int pointer, "int *p;" . 

So back to the Linkedlist, then what we did in the code was to create 3 pointers of Node object, so that visual representation could be seen like this. 


The fundamental force, if you will, that connects all objects together is pointer because each object holds memory address of next object (list item of Node object) and using that pointer, the next item can be found.
The main reason I didn't draw object pictures in an ascending order is to illustrate the fact that on memory board there are many objects can be allocated anywhere, but what connects them? the answer is the pointer

Now let's look at the code of adding a simple node of push method i.e. adding a node at the start. 
/*
    Note: New node is always added before the head of a given Linked list.
    So 1->2->3->4, if we add 5, then it would be 5->1->2->3->4

    and newly added node becomes the new head of the new linked list. 
*/
void addNode(Node** head_reference, int new_data){
        Node* new_node = new Node(); //Initialized new node in memory

        new_node->data = new_data; //memory's data ie node no. added, for example 4

        new_node->next = (*head_reference);  //change new node (4) next as head (1) so 4->1 (4->next =1 )

        (*head_reference) = new_node; //change assign new_node to head ref address.
    }
This code can be visulized step by strep like this, code lines are numbered in the below picture. Please click on the image or download and zoom it if cannot be seen clearly. 

Now conceptually, it seems really easy but for as beginner and certainly myself had a question regarding how pointer was passed. 

    Pointer to a Pointer

For me the biggest questions was
  • What is pointer to a pointer, the parameter passed in addNode(Node **head, int data) in which ** is puzzling. I wanted to know what's the meaning behind that. 
This line in code ..about why ** is used rather than * only.
void addNode(Node** head_reference, int new_data)
So to understand that, we have to first know, what is pass-by-reference and referencing and dereferencing works.

First of all,  basics of pointers...
 #include<iostream>
using namespace std;

int main(){

    int a = 5; 
    int *p; // p = address, *p = value at address.

    p = &a;

    cout << p << " = P pointer value, which is address of a!" <<endl;

    cout << &a << " = Address of 'a' variable"<< endl;

    cout << &p <<" = P Pointer's own address in memory" <<endl; 
    //pointer itself lives on diff diff address
    //pointer is in the momory, and memory has an address for pointer itself too.

    cout << *p << " = *p gives value of pointer-dereferencing" << endl; 
}
 
The output of above program... 
0x28fecc = P pointer value, which is address of a!
0x28fecc = Address of 'a' variable
0x28fec8 = P Pointer's own address in memory
5 = *p gives value of pointer-dereferencing
The above code and output can be represented in memory like this...


Now we can understand pass-by-reference using the example below...
#include<bits/stdc++.h>

using namespace std;

//Pass by reference with Pointer args
void square2(int *n){

    //Address of n in Square2 and Main() n2, is the same.  
    cout << "Address of n in square2() " << n <<endl;

    //Explicit dereferencing to get the value pointed to
    *n *= *n;  // n* = *n * n*;

    /* no need to return because it changes value from address
    so one variable is accessed from variable and changed so this change
    would affect n2 in main method as well. */
}
int main(){
    int n2= 20;
    cout << "Address of n2 in main" << &n2 <<endl;
    square2(&n2);
    cout<<"Square on n2: (changed in main too)" << n2 << endl;

}
In this simple example, we are passing address of n2 in the method square2(&n2). 

You can simply pass an address of any variable by using &.  Now this is perhaps the most important thing to understand that when you leave a scope of any function, in this case main to square2, square2 allocates different block memory in which it will have its own set of variables, however, when you use pass-by-reference  method, you are passing address of variable so using that address (Pointer) program can find any value from all the momory available and can access and/or change values. 

In this case, even changing value in the function square2(int *n)  changes the value universally because n2 wasn't passed, the pointer was passed by :
square2(&n2);
So similarly, the reason we are using pointer to a pointer in this case is because if you noticed this block of code :

    Node* head = NULL;
    Node* second = NULL;
    Node* third = NULL;

    //allocate three nodes in the heap
    head = new Node();
    second = new Node();
    third = new Node();
 
That head, second and third are instances of Node(), but more importantly, they are pointers too. 

So here in the method addNode(Node **head, int data),  we want to change the head reference to newly added node, (4) in this case so technically we want to pass the address of head so we can change even in an outside function correct?

So we call addNode function in main() method like this.
    addNode(&head, 4);
So here we are passing &head,  as parameter, however, head is a pointer. So simply put 

we are passing in other words...  addNode(&(*head), 4);  . Hope that makes sense.

which is why on the definition, we add 

addNode(Node **head_ref, int data); 
because technically this is how it passed, 
addNode( Node * (*head_ref), int data);
        That is why it is pointer to a pointer notation, i.e ** is used as parameters of the function.
As I mentioned above, Linkedlist is only possible using pointers and the purpose of understanding Linkedlist is to understand pointer itself. Thus, I believe that Pointers are really essential to learn and Linkedlist data structure is a really good data structure to learn because it uses many use-cases of pionters.  
        Please try out code yourself to understand and hopefully my explanation helped to understand the use of pointer-to-pointer  use in Linkedlist. If you have any questions or noticed I have made a mistake in understanding it, feel free to drop a comment, happy to answer and discuss. Thanks and happy coding.

Post a Comment

0 Comments