Simple Program execution
Consider a simple below program
void func1 () {
int z = 1;
z = z + 1;
}
int main() {
int a = 1;
int b = 3;
func1();
int c = a + b;
return 0;
}
The process execution steps from CPU & Stack perspective
- Program is started, process is created, code (machine code) is loaded in the code section in memory
- In the code file header, the address of main will be written, initialise the pc to that address and start executing the code
- Store the start pointer of stack to sp register
- Store the base pointer & link register of stack to bp & lr register
- Move “1” to ro -⇒ int a = 1; line
- store the ro to the memory at location [bp - 8], this will do a memory write
- Similar step 5 & 6 for b = 3;
- Call func1(), load the pc to lr
- update pc to the address of func1
- initilize the sp by sp + 8 and save it to register
- Store the main’s bp in the memory
- Repeat steps from 3
- Func1 ends: update the pc to the address of lr
- load sp from the current func’s bp
- load r0; read the memory to load “a” to ro
- load r1; read the memory to load “b” to r1
- add and store in r3 = r0 + r1
- write the r3 to memory
- load the kernal lr and base pointer
- return
- When a infinite function calls happens, that time the stack grows and reaches limit. This is what a stack overflow is called.
- Also when a large local variable is stored in the stack
- Stack has a limit, which can be overridden by the compiler
- When a function call is done, CPU will perform a context switch for function and will have to store the current registers data to memory and load the register for new functions
- When the function is done executing then the previous func data which was stored in the memory will have to be loaded again to the registers
- So for every function call, the pc will update and a new memory read will happen
- A single memory read will cache continuous code from the pc, so when a new function is called the pc will change and all the cache would not be useful.
- So multiple function call can make the execution slow because of cache invalidation and context switching
- Some compiler so allow to make the function inline to all the place where it’s code.
- Basically copy the function code and place it the caller’s code
- But the downside of doing this is it will make the binary file larger
- Parameters and return values are passed by registers