Heap

  • Heap are the large area for dynamic space allocations
  • Stores large data
  • Data remains until explicitly removed
    • Unlike stack where a function execution is complete, the data is removed or rather the stack pointer is updated so the for new function data will be overwritten
    • But in the heap we explicitly need to remove the data
  • All function can access the heap
  • It grows from low to high (address)
    • Unlike data & stack section which grows from high to low
  • We need functions like malloc, free, new to allocate and de-allocate the space in heap

Pointers

  • Points to memory address in the heap
  • A pointer is stored in a stack or a data or heap section
    • When a large object is stored in the heap, the pointer of that may be stored in the stack section as a local variable to access it fast.
    • If the data is constant or a global data, then the data will be stored in heap but the pointer of that data will be stored in “data” section.
  • The pointer stores the address of first byte of the heap
    • So if large data is stored and that data may be sparse or continuous in the heap, then only the address of first byte will be stored in the pointer.
  • Pointer also stores the type of data it points to.
    • This helps in determining what will be the size of the data.
    • For eg in a 32 bit machine, if a integer is stored in the heap, than the pointer data type will be int and the kernel will know that the data size if 4 byte, so need to read 4 bytes.
    • Also, will need type when allocation memory for storing the data.

Process execution

int main() {
	int *ptr = malloc(sizeof(int));
	*ptr = 10
	*ptr += 1

	free(ptr)
	return 0;
}
  1. Program is loaded, and all the registers are initialised
  2. sizeof(int) i.e “#4” will be stored to register as to pass it as a parameter to malloc function
  3. malloc function will start executing
    1. process will change from “user mode” to “kernal mode”
    2. As user mode is not authorised to deal with memory allocation and stuff
    3. Only kernal can do it.
    4. Note: As mode is switching, all the CPU data will be stored and loaded again, so this is a heavy operation
  4. malloc will return the pointer of the space
    1. The pointer will be stored in the register as well as in the stack
  5. cpu will store “10” in the ptr location
  6. free function will start executing
    1. This will also make the process change to kernel mode

malloc & free

  • malloc & free is a function call, which is executed in kernel mode only
  • malloc allocates the size given in the heap and returns the address
  • malloc also stores header of fix size before the address (space allocated)
    • The size is stored in the header
  • free deletes the data and frees the space
    • free reads the header and get the info about the size of memory it needs to free

Memory leaks

  • freeing the heap memory is important
  • memory leaks happens in the heap
  • When a pointer (stored in stack) is created by function which points to a memory in heap, and that function completes it’s execution then the the stack is cleared.
    • But the memory is not freed in the heap
    • And heap memory is large but limited, so when the heap is full a memory leak is happened
  • Some languages handles this freeing of memory by using “Refcounting” i.e “Garbage Collection”
    • Number if references of the memory in the code is stored. When is code is executed we either increment or decrement the reference count
    • So even a read can became write in memory

Dangling Pointer

  • When 2 pointer is pointing to same address, for eg main func call a func1 and func1 is using same data passed from main, so both have a local variable which points to the same address
  • Not some function calls free with their local pointer, then the memory will be freed. But the pointer1 will still point to the same address.
  • So reading data from ptr1 will given wrong data or segfault

Performance Heap vs Stack

  • Stack comes with builtin memory management, in heap we explicitly needs to free the memory
  • Stack variables are in sequence vs heap is random
    • So CPU will cache next variable data in the L1 for stack, but can not for heap
  • Stack space is limited

Order of variable is important, google was able to improve the performance of socket code just by reordering the variables.

  • All the variables that were accessed nearby were created in a sequential order. So all these variables are stored sequentially in the stack and just by accessing 1 variable, we were caching other variables. So a memory call was reduced for that.