Heap
- Heap are the large area for dynamic space allocations
- Stores large data
- Data remains until explicitly removed
- Unlike stack where a function execution is complete, the data is removed or rather the stack pointer is updated so the for new function data will be overwritten
- But in the heap we explicitly need to remove the data
- All function can access the heap
- It grows from low to high (address)
- Unlike data & stack section which grows from high to low
- We need functions like malloc, free, new to allocate and de-allocate the space in heap
Pointers
- Points to memory address in the heap
- A pointer is stored in a stack or a data or heap section
- When a large object is stored in the heap, the pointer of that may be stored in the stack section as a local variable to access it fast.
- If the data is constant or a global data, then the data will be stored in heap but the pointer of that data will be stored in “data” section.
- The pointer stores the address of first byte of the heap
- So if large data is stored and that data may be sparse or continuous in the heap, then only the address of first byte will be stored in the pointer.
- Pointer also stores the type of data it points to.
- This helps in determining what will be the size of the data.
- For eg in a 32 bit machine, if a integer is stored in the heap, than the pointer data type will be int and the kernel will know that the data size if 4 byte, so need to read 4 bytes.
- Also, will need type when allocation memory for storing the data.
Process execution
int main() {
int *ptr = malloc(sizeof(int));
*ptr = 10
*ptr += 1
free(ptr)
return 0;
}
- Program is loaded, and all the registers are initialised
- sizeof(int) i.e “#4” will be stored to register as to pass it as a parameter to malloc function
- malloc function will start executing
- process will change from “user mode” to “kernal mode”
- As user mode is not authorised to deal with memory allocation and stuff
- Only kernal can do it.
- Note: As mode is switching, all the CPU data will be stored and loaded again, so this is a heavy operation
- malloc will return the pointer of the space
- The pointer will be stored in the register as well as in the stack
- cpu will store “10” in the ptr location
- free function will start executing
- This will also make the process change to kernel mode
malloc & free
- malloc & free is a function call, which is executed in kernel mode only
- malloc allocates the size given in the heap and returns the address
- malloc also stores header of fix size before the address (space allocated)
- The size is stored in the header
- free deletes the data and frees the space
- free reads the header and get the info about the size of memory it needs to free
Memory leaks
- freeing the heap memory is important
- memory leaks happens in the heap
- When a pointer (stored in stack) is created by function which points to a memory in heap, and that function completes it’s execution then the the stack is cleared.
- But the memory is not freed in the heap
- And heap memory is large but limited, so when the heap is full a memory leak is happened
- Some languages handles this freeing of memory by using “Refcounting” i.e “Garbage Collection”
- Number if references of the memory in the code is stored. When is code is executed we either increment or decrement the reference count
- So even a read can became write in memory
Dangling Pointer
- When 2 pointer is pointing to same address, for eg main func call a func1 and func1 is using same data passed from main, so both have a local variable which points to the same address
- Not some function calls free with their local pointer, then the memory will be freed. But the pointer1 will still point to the same address.
- So reading data from ptr1 will given wrong data or segfault
Performance Heap vs Stack
- Stack comes with builtin memory management, in heap we explicitly needs to free the memory
- Stack variables are in sequence vs heap is random
- So CPU will cache next variable data in the L1 for stack, but can not for heap
- Stack space is limited
Order of variable is important, google was able to improve the performance of socket code just by reordering the variables.
- All the variables that were accessed nearby were created in a sequential order. So all these variables are stored sequentially in the stack and just by accessing 1 variable, we were caching other variables. So a memory call was reduced for that.