{"id":1501,"date":"2022-11-06T16:10:39","date_gmt":"2022-11-06T08:10:39","guid":{"rendered":"https:\/\/markjohntaylor.com\/blog\/wordpress\/?p=1501"},"modified":"2022-11-06T19:14:25","modified_gmt":"2022-11-06T11:14:25","slug":"zero-pointer-dereference-huh","status":"publish","type":"post","link":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/2022\/11\/06\/zero-pointer-dereference-huh\/","title":{"rendered":"Zero Pointer Dereference, Huh?"},"content":{"rendered":"\n<p>Consider the following code:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"c\" data-enlighter-theme=\"\" data-enlighter-highlight=\"10-11\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">int main()\n{\n    struct s {\n        int m0;\n        int m1;\n        int m2;\n    };\n\n    struct s *p = NULL;\n    int a = (int) &amp;((struct s *)p)->m2;  \/\/ crash ??\n    int b = (int) &amp;((struct s *)0)->m2;  \/\/ crash ??\n    int c = (int) p->m2;  \/\/ definitely dead\n\n    return 0;\n}<\/pre>\n\n\n\n<p>So, do you think the two highlighted lines above will crash the program? Well, at least to me, they will, since they appear to dereference null pointers. Take <code data-enlighter-language=\"generic\" class=\"EnlighterJSRAW\">int b = (int) &amp;((struct s *)0)->m2;<\/code>, for example: we first dereference the zero pointer to get the member <code>m2<\/code>, and then obtain its address. Right? This is how we <em>literally<\/em> read <code>&amp;p_struct->member<\/code>.<\/p>\n\n\n\n<p>However, this is not how the compiler interprets it. The compiler is a very cool and knowledgeable guy who is assumed to know <em>everything<\/em>. So the compiler <strong>knows the offset of each member in a structure<\/strong>, and he says, &#8220;Well, why do I even care about the values of those members (dereferencing)? 
To obtain the address of a member in a structure, I just need to add the offset of the member to the address of the structure: <code>&amp;p_struct->member := p_struct + offset(member)<\/code>.&#8221;<\/p>\n\n\n\n<p>With this ability, the compiler can generate the following assembly code:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"asm\" data-enlighter-theme=\"\" data-enlighter-highlight=\"16-17,21,25\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">6       {\n   0x0000000000001129 &lt;+0>:     endbr64\n   0x000000000000112d &lt;+4>:     push   %rbp\n   0x000000000000112e &lt;+5>:     mov    %rsp,%rbp\n\n7           struct s {\n8               int m0;\n9               int m1;\n10              int m2;\n11          };\n12\n13          struct s *p = NULL;\n   0x0000000000001131 &lt;+8>:     movq   $0x0,-0x8(%rbp)\n\n14          int a = (int) &amp;((struct s *)p)->m2;  \/\/ okay\n   0x0000000000001139 &lt;+16>:    mov    -0x8(%rbp),%rax\n   0x000000000000113d &lt;+20>:    add    $0x8,%rax\n   0x0000000000001141 &lt;+24>:    mov    %eax,-0x14(%rbp)\n\n15          int b = (int) &amp;((struct s *)0)->m2;  \/\/ okay\n   0x0000000000001144 &lt;+27>:    movl   $0x8,-0x10(%rbp)\n\n16          int c = (int) p->m2;  \/\/ dead\n   0x000000000000114b &lt;+34>:    mov    -0x8(%rbp),%rax\n   0x000000000000114f &lt;+38>:    mov    0x8(%rax),%eax # segfault, since it tries to access invalid memory address 0x8 (page 0)\n   0x0000000000001152 &lt;+41>:    mov    %eax,-0xc(%rbp)\n\n17\n18          return 0;\n   0x0000000000001155 &lt;+44>:    mov    $0x0,%eax\n\n19      }\n   0x000000000000115a &lt;+49>:    pop    %rbp\n   0x000000000000115b &lt;+50>:    ret<\/pre>\n\n\n\n<p>We can see that no memory is dereferenced when we take the address of a member in a structure; the compiler simply adds the member offset to the address of the structure.<\/p>\n\n\n\n<p>Therefore, as a special case with the 
structure address being zero, <code>&amp;((TYPE *)0)->MEMBER<\/code> yields the exact offset of the member within the structure, which accounts for why the macro<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"c\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">#define offsetof(TYPE, MEMBER) ((size_t) &amp;((TYPE *)0)->MEMBER)<\/pre>\n\n\n\n<p>works in the Linux kernel. Strictly speaking, the C standard leaves this expression undefined, so portable code should use <code>offsetof<\/code> from <code>&lt;stddef.h&gt;<\/code> instead of relying on what a particular compiler generates.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Consider the following code: So, do you think the above two highlighted lines will crash the program? Well, at least to me, it will, since they&#8217;re dereferencing null pointers. Take, int b = (int) &amp;((struct s *)0)->m2;, for example, we first dereference the zero pointer to get the member m2, and then obtain its address. &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/2022\/11\/06\/zero-pointer-dereference-huh\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Zero Pointer Dereference, 
Huh?&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[15],"tags":[],"_links":{"self":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1501"}],"collection":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/comments?post=1501"}],"version-history":[{"count":56,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1501\/revisions"}],"predecessor-version":[{"id":1561,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/1501\/revisions\/1561"}],"wp:attachment":[{"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/media?parent=1501"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/categories?post=1501"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/markjohntaylor.com\/blog\/wordpress\/index.php\/wp-json\/wp\/v2\/tags?post=1501"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}