Simple Program execution

Consider the simple program below:

void func1 () {
	int z = 1;
	z = z + 1;
}

int main() {
	int a = 1;
	int b = 3;
	func1();
	int c = a + b;
	return 0;
}

The execution steps of the process, from the perspective of the CPU and the stack:

  1. The program is started: a process is created and the code (machine code) is loaded into the code section of memory
  2. The executable’s header records the entry-point address (main, in this simplified model); pc is initialised to that address and execution begins
  3. Store the starting address of the stack in the sp register
  4. Set up the bp and lr registers: bp holds the base of the current stack frame and lr holds the return address
  5. Move the constant 1 into r0 (for the line int a = 1;)
  6. Store r0 to memory at location [bp - 8]; this is a memory write
  7. Repeat steps 5 and 6 for b = 3;
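  8. Call func1(): save the return address (the instruction after the call) in lr
  9. Update pc to the address of func1
  10. Adjust sp by 8 bytes to make room for func1’s frame (sp - 8 when the stack grows downward, as on most architectures) and keep the new value in the sp register
  11. Store main’s bp in memory (on the stack) so it can be restored later
  12. Repeat from step 3 for func1’s own frame and body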
  13. func1 ends: restore sp from func1’s bp and reload main’s saved bp from memory
  14. Set pc to the return address held in lr, so execution continues in main right after the call
  15. Load r0: read memory to load “a” into r0
  16. Load r1: read memory to load “b” into r1
  17. Add them and keep the result in r3: r3 = r0 + r1
  18. Write r3 to memory (this is c)
  19. Restore the lr and base pointer of main’s caller (the kernel or startup code that invoked main)
  20. Return
  • When function calls nest without end (for example, infinite recursion), the stack keeps growing until it reaches its limit. This is called a stack overflow.
    • It can also happen when a very large local variable is stored on the stack
    • The stack has a size limit, which can usually be changed through compiler/linker options or OS settings
  • When a function is called, the CPU has to save the register values that are still needed to memory (the stack) and load the registers for the new function. This resembles a small context switch, although that term normally refers to switching between threads or processes.
    • When the function finishes executing, the previous function’s data that was saved in memory has to be loaded back into the registers
  • So on every function call the pc jumps to a new address, and the instructions there have to be fetched from memory
  • A single memory read caches a contiguous block of code following the pc, so when a new function is called and the pc moves elsewhere, the cached code may no longer be useful.
    • So many function calls can make execution slower, because of instruction-cache misses and the cost of saving and restoring registers
  • Some compilers can inline a function at every place where it is called.
    • Basically, the function’s code is copied into the caller’s code
    • The downside is that this makes the binary file larger
  • Parameters and return values are usually passed in registers (when there are more than fit, the rest go on the stack)