{"id":1215,"date":"2021-08-27T23:55:57","date_gmt":"2021-08-27T15:55:57","guid":{"rendered":"https:\/\/markjohntaylor.com\/blog\/wordpress\/?p=1215"},"modified":"2023-07-16T14:49:20","modified_gmt":"2023-07-16T06:49:20","slug":"a-running-example","status":"publish","type":"post","link":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/2021\/08\/27\/a-running-example\/","title":{"rendered":"A Running Example"},"content":{"rendered":"\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">#include &lt;stdio.h>\n  \nint add(int a, int b)\n{\n    int c = a+b;\n    printf(\"@add(): &amp;a=%p, &amp;b=%p\\n\", &amp;a, &amp;b);\n    return c;\n}\n\nint main()\n{\n    int i = 3;\n    int j = 4;\n    int k = add(i,j);\n    printf(\"@main(): &amp;i=%p, &amp;j=%p, &amp;k=%p\\n\", &amp;i, &amp;j, &amp;k);\n    return k;\n}<\/pre>\n\n\n\n<p>I made a test in Dev-C++ (version 5.7.1, with MinGW GCC 4.8.1 32-bit), and in debug mode, via the CPU Window it produced the following instructions:<span style=\"color:#111\" class=\"tadv-color\"><\/span><\/p>\n\n\n\n<p>procedure <em>main<\/em>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">   0x004016e0 &lt;+0>:\tpush   %ebp\n   0x004016e1 &lt;+1>:\tmov    %esp,%ebp\n   0x004016e3 &lt;+3>:\tand    $0xfffffff0,%esp\n   0x004016e6 &lt;+6>:\tsub    $0x20,%esp\n   0x004016e9 &lt;+9>:\tcall   0x401cc0 &lt;__main>\n=> 0x004016ee &lt;+14>:\tmovl   $0x3,0x1c(%esp)\n   0x004016f6 &lt;+22>:\tmovl   $0x4,0x18(%esp)\n   0x004016fe &lt;+30>:\tmov    0x18(%esp),%edx\n   0x00401702 &lt;+34>:\tmov    0x1c(%esp),%eax\n   0x00401706 &lt;+38>:\tmov    %edx,0x4(%esp)\n   
0x0040170a &lt;+42>:\tmov    %eax,(%esp)\n   0x0040170d &lt;+45>:\tcall   0x4016b0 &lt;add>\n   0x00401712 &lt;+50>:\tmov    %eax,0x14(%esp)\n   0x00401716 &lt;+54>:\tlea    0x14(%esp),%eax\n   0x0040171a &lt;+58>:\tmov    %eax,0xc(%esp)\n   0x0040171e &lt;+62>:\tlea    0x18(%esp),%eax\n   0x00401722 &lt;+66>:\tmov    %eax,0x8(%esp)\n   0x00401726 &lt;+70>:\tlea    0x1c(%esp),%eax\n   0x0040172a &lt;+74>:\tmov    %eax,0x4(%esp)\n   0x0040172e &lt;+78>:\tmovl   $0x40507a,(%esp)\n   0x00401735 &lt;+85>:\tcall   0x4036c8 &lt;printf>\n   0x0040173a &lt;+90>:\tmov    0x14(%esp),%eax\n   0x0040173e &lt;+94>:\tleave  \n   0x0040173f &lt;+95>:\tret    <\/pre>\n\n\n\n<p>subroutine <em>add<\/em>:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">   0x004016b0 &lt;+0>:\tpush   %ebp\n   0x004016b1 &lt;+1>:\tmov    %esp,%ebp\n   0x004016b3 &lt;+3>:\tsub    $0x28,%esp\n   0x004016b6 &lt;+6>:\tmov    0x8(%ebp),%edx\n   0x004016b9 &lt;+9>:\tmov    0xc(%ebp),%eax\n   0x004016bc &lt;+12>:\tadd    %edx,%eax\n   0x004016be &lt;+14>:\tmov    %eax,-0xc(%ebp)\n   0x004016c1 &lt;+17>:\tlea    0xc(%ebp),%eax\n   0x004016c4 &lt;+20>:\tmov    %eax,0x8(%esp)\n   0x004016c8 &lt;+24>:\tlea    0x8(%ebp),%eax\n   0x004016cb &lt;+27>:\tmov    %eax,0x4(%esp)\n   0x004016cf &lt;+31>:\tmovl   $0x405064,(%esp)\n   0x004016d6 &lt;+38>:\tcall   0x4036c8 &lt;printf>\n   0x004016db &lt;+43>:\tmov    -0xc(%ebp),%eax\n   0x004016de &lt;+46>:\tleave  \n   0x004016df &lt;+47>:\tret    <\/pre>\n\n\n\n<p>Following these instructions, I drew a simple diagram (partial, incomplete):<\/p>\n\n\n\n<figure class=\"wp-container-2 wp-block-gallery-1 wp-block-gallery has-nested-images columns-default is-cropped\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"768\" data-id=\"1233\"  
src=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-1024x768.jpg\" alt=\"\" class=\"wp-image-1233\" srcset=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-1024x768.jpg 1024w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-300x225.jpg 300w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-768x576.jpg 768w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-1536x1152.jpg 1536w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-2048x1535.jpg 2048w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2021\/08\/stack-1568x1176.jpg 1568w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The Running Stack<\/figcaption><\/figure>\n<\/figure>\n\n\n\n<p>The arguments for <code>printf<\/code> were left out of this diagram for simplicity, but it is not difficult to imagine the whole running process with the aid of the above instructions. The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Calling_convention\">calling convention<\/a>, however, can differ from one compiler and platform to another. For example, on my Ubuntu 20.04.1 LTS server with GCC 9.3.0, the same program prints something like<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"false\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">@add(): &amp;a=0x7ffdf0f69a3c, &amp;b=0x7ffdf0f69a38\n@main(): &amp;i=0x7ffdf0f69a6c, &amp;j=0x7ffdf0f69a70, &amp;k=0x7ffdf0f69a74<\/pre>\n\n\n\n<p>which is the exact opposite of the above diagram: the passed arguments are laid out at decreasing addresses, and the local variables at increasing addresses. 
For local variables, the compiler knows in advance how much space to allocate on the stack by consulting its symbol table, and it reserves that space by subtracting a constant from the RSP register. The compiler is therefore free to place the local variables wherever it likes within the frame; it only needs to address them by an offset from the RSP or RBP register. A good explanation for the locations of the passed arguments is that, <span style=\"color:#0073a8\" class=\"tadv-color\">as opposed to storing them on the caller stack, the caller first copies those arguments to registers (if possible) in reverse order, and the callee then saves them to its own stack <\/span><s><span style=\"color:#a30003\" class=\"tadv-color\">near the beginning<\/span><\/s><span style=\"color:#0073a8\" class=\"tadv-color\"> in argument order (RDI, RSI, RDX, RCX, R8, R9)<\/span>. In this way, the value in the EDI register (the 1st argument) is stored into the callee stack first and ends up at a higher address than the 2nd argument. Aha! Now it makes perfect sense! 
Next, let&#8217;s check it out by looking at the disassembly by running <code>gdb a.out -batch -ex 'disassemble\/s main'<\/code> (or <code>add<\/code>).<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"19-22\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">Dump of assembler code for function main:\nadd.c:\n11\t{\n   0x00000000000011a7 &lt;+0>:\tendbr64 \n   0x00000000000011ab &lt;+4>:\tpush   %rbp\n   0x00000000000011ac &lt;+5>:\tmov    %rsp,%rbp\n   0x00000000000011af &lt;+8>:\tsub    $0x20,%rsp\n   0x00000000000011b3 &lt;+12>:\tmov    %fs:0x28,%rax\n   0x00000000000011bc &lt;+21>:\tmov    %rax,-0x8(%rbp)\n   0x00000000000011c0 &lt;+25>:\txor    %eax,%eax\n\n12\t    int i = 3;\n   0x00000000000011c2 &lt;+27>:\tmovl   $0x3,-0x14(%rbp)\n\n13\t    int j = 4;\n   0x00000000000011c9 &lt;+34>:\tmovl   $0x4,-0x10(%rbp)\n\n14\t    int k = add(i,j);\n   0x00000000000011d0 &lt;+41>:\tmov    -0x10(%rbp),%edx\n   0x00000000000011d3 &lt;+44>:\tmov    -0x14(%rbp),%eax\n   0x00000000000011d6 &lt;+47>:\tmov    %edx,%esi\n   0x00000000000011d8 &lt;+49>:\tmov    %eax,%edi\n   0x00000000000011da &lt;+51>:\tcallq  0x1169 &lt;add>\n   0x00000000000011df &lt;+56>:\tmov    %eax,-0xc(%rbp)\n\n15\t    printf(\"@main(): &amp;i=%p, &amp;j=%p, &amp;k=%p\\n\", &amp;i, &amp;j, &amp;k);\n   0x00000000000011e2 &lt;+59>:\tlea    -0xc(%rbp),%rcx\n   0x00000000000011e6 &lt;+63>:\tlea    -0x10(%rbp),%rdx\n   0x00000000000011ea &lt;+67>:\tlea    -0x14(%rbp),%rax\n   0x00000000000011ee &lt;+71>:\tmov    %rax,%rsi\n   0x00000000000011f1 &lt;+74>:\tlea    0xe22(%rip),%rdi        # 0x201a\n   0x00000000000011f8 &lt;+81>:\tmov    $0x0,%eax\n   0x00000000000011fd &lt;+86>:\tcallq  0x1070 &lt;printf@plt>\n\n16\t    return k;\n   0x0000000000001202 &lt;+91>:\tmov    -0xc(%rbp),%eax\n\n17\t}\n   0x0000000000001205 &lt;+94>:\tmov    -0x8(%rbp),%rsi\n   
0x0000000000001209 &lt;+98>:\txor    %fs:0x28,%rsi\n   0x0000000000001212 &lt;+107>:\tje     0x1219 &lt;main+114>\n   0x0000000000001214 &lt;+109>:\tcallq  0x1060 &lt;__stack_chk_fail@plt>\n   0x0000000000001219 &lt;+114>:\tleaveq \n   0x000000000000121a &lt;+115>:\tretq   \nEnd of assembler dump.<\/pre>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"8,9,12-15\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">Dump of assembler code for function add:\nadd.c:\n4\t{\n   0x0000000000001169 &lt;+0>:\tendbr64 \n   0x000000000000116d &lt;+4>:\tpush   %rbp\n   0x000000000000116e &lt;+5>:\tmov    %rsp,%rbp\n   0x0000000000001171 &lt;+8>:\tsub    $0x20,%rsp\n   0x0000000000001175 &lt;+12>:\tmov    %edi,-0x14(%rbp)\n   0x0000000000001178 &lt;+15>:\tmov    %esi,-0x18(%rbp)\n\n5\t    int c = a+b;\n   0x000000000000117b &lt;+18>:\tmov    -0x14(%rbp),%edx\n   0x000000000000117e &lt;+21>:\tmov    -0x18(%rbp),%eax\n   0x0000000000001181 &lt;+24>:\tadd    %edx,%eax\n   0x0000000000001183 &lt;+26>:\tmov    %eax,-0x4(%rbp)\n\n6\t    printf(\"@add(): &amp;a=%p, &amp;b=%p\\n\", &amp;a, &amp;b);\n   0x0000000000001186 &lt;+29>:\tlea    -0x18(%rbp),%rdx\n   0x000000000000118a &lt;+33>:\tlea    -0x14(%rbp),%rax\n   0x000000000000118e &lt;+37>:\tmov    %rax,%rsi\n   0x0000000000001191 &lt;+40>:\tlea    0xe6c(%rip),%rdi        # 0x2004\n   0x0000000000001198 &lt;+47>:\tmov    $0x0,%eax\n   0x000000000000119d &lt;+52>:\tcallq  0x1070 &lt;printf@plt>\n\n7\t    return c;\n   0x00000000000011a2 &lt;+57>:\tmov    -0x4(%rbp),%eax\n\n8\t}\n   0x00000000000011a5 &lt;+60>:\tleaveq \n   0x00000000000011a6 &lt;+61>:\tretq   \nEnd of assembler dump.<\/pre>\n\n\n\n<p>The above assembler code agrees with our speculation. Note, however, the location of the local variable <code>c<\/code> here. 
It sits above the two arguments in the stack, at <code>-0x4(%rbp)<\/code>, just one 4-byte slot below the address held in the RBP register. If we rewrite the <code>main<\/code> function as <code>int main(int argc, char** argv)<\/code>, then we get something like<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"4-6\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">   0x00000000000011a7 &lt;+0>:\tendbr64 \n   0x00000000000011ab &lt;+4>:\tpush   %rbp\n   0x00000000000011ac &lt;+5>:\tmov    %rsp,%rbp\n   0x00000000000011af &lt;+8>:\tsub    $0x30,%rsp\n   0x00000000000011b3 &lt;+12>:\tmov    %edi,-0x24(%rbp)\n   0x00000000000011b6 &lt;+15>:\tmov    %rsi,-0x30(%rbp)\n   0x00000000000011ba &lt;+19>:\tmov    %fs:0x28,%rax\n   0x00000000000011c3 &lt;+28>:\tmov    %rax,-0x8(%rbp)\n   0x00000000000011c7 &lt;+32>:\txor    %eax,%eax<\/pre>\n\n\n\n<p>in the setup of the stack of <code>main<\/code>. We can see that the two passed arguments <code>argc<\/code> and <code>argv<\/code> are copied from the EDI and RSI registers, respectively, into the top of the stack frame (its lowest addresses).<\/p>\n\n\n\n<p>(Jul 15, 2023)<\/p>\n\n\n\n<p>Effective implementations of some x86 instructions (slides from <a href=\"https:\/\/www.cs.princeton.edu\/courses\/archive\/spr17\/cos217\/lectures\/15_AssemblyFunctions.pdf\">here<\/a>):<\/p>\n\n\n\n<figure class=\"wp-container-4 wp-block-gallery-3 wp-block-gallery has-nested-images columns-default is-cropped\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1907\" height=\"1474\" data-id=\"1755\"  src=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl.png\" alt=\"\" class=\"wp-image-1755\" srcset=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl.png 1907w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl-300x232.png 300w, 
https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl-1024x791.png 1024w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl-768x594.png 768w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl-1536x1187.png 1536w, https:\/\/markjohntaylor.com\/blog\/wordpress\/wp-content\/uploads\/2023\/07\/ret_impl-1568x1212.png 1568w\" sizes=\"(max-width: 1907px) 100vw, 1907px\" \/><\/figure>\n<\/figure>\n","protected":false},"excerpt":{"rendered":"<p>I made a test in Dev-C++ (version 5.7.1, with MinGW GCC 4.8.1 32-bit), and in debug mode, via the CPU Window it produced the following instructions: procedure main: subroutine add: Following these instructions, I drew a simple diagram (partial, incomplete): The arguments for printf were left out in this diagram for simplicity, but it is &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/2021\/08\/27\/a-running-example\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A Running 
Example&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[15],"tags":[],"_links":{"self":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1215"}],"collection":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/comments?post=1215"}],"version-history":[{"count":32,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1215\/revisions"}],"predecessor-version":[{"id":1758,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1215\/revisions\/1758"}],"wp:attachment":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/media?parent=1215"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/categories?post=1215"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/tags?post=1215"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}