Does a C++ reference occupy memory space?

Posted by opencombatclan on Mon, 07 Mar 2022 11:28:13 +0100

By inspecting the disassembly, we can check whether a reference variable has an address of its own, and therefore whether it occupies memory.
Test code:

int main(){
    int num = 6;
    int &r = num;
    int *p = &num;
    int x = r;
    int y = *p;
    return 0;
}

1. X86-64 platform

Obtain the disassembly of the main function with GDB or objdump. By default, the assembly is shown in AT&T syntax.
The following disassembly was obtained with GDB:

   0x000055555555466a <+0>:        push   %rbp
   0x000055555555466b <+1>:        mov    %rsp,%rbp
   0x000055555555466e <+4>:        sub    $0x30,%rsp         
   0x0000555555554672 <+8>:        mov    %fs:0x28,%rax
   0x000055555555467b <+17>:        mov    %rax,-0x8(%rbp)
   0x000055555555467f <+21>:        xor    %eax,%eax
   0x0000555555554681 <+23>:        movl   $0x6,-0x24(%rbp)  # num is stored at rbp-0x24
   0x0000555555554688 <+30>:        lea    -0x24(%rbp),%rax  # Load the address of num into rax
   0x000055555555468c <+34>:        mov    %rax,-0x18(%rbp)  # r lives at rbp-0x18 and holds the address of num
   0x0000555555554690 <+38>:        lea    -0x24(%rbp),%rax  # Load the address of num into rax
   0x0000555555554694 <+42>:        mov    %rax,-0x10(%rbp)  # p lives at rbp-0x10 and holds the address of num
   0x0000555555554698 <+46>:        mov    -0x18(%rbp),%rax  # Load the address stored in r
   0x000055555555469c <+50>:        mov    (%rax),%eax       # Read num through it
   0x000055555555469e <+52>:        mov    %eax,-0x20(%rbp)  # x is stored at rbp-0x20
   0x00005555555546a1 <+55>:        mov    -0x10(%rbp),%rax  # Load the address stored in p
   0x00005555555546a5 <+59>:        mov    (%rax),%eax       # Read num through it
   0x00005555555546a7 <+61>:        mov    %eax,-0x1c(%rbp)  # y is stored at rbp-0x1c
=> 0x00005555555546aa <+64>:        mov    $0x0,%eax
   0x00005555555546af <+69>:        mov    -0x8(%rbp),%rdx
   0x00005555555546b3 <+73>:        xor    %fs:0x28,%rdx
   0x00005555555546bc <+82>:        je     0x5555555546c3 <main()+89>
   0x00005555555546be <+84>:        callq  0x555555554540 <__stack_chk_fail@plt>
   0x00005555555546c3 <+89>:        leaveq 
   0x00005555555546c4 <+90>:        retq   

The assembly above shows that all five variables are local variables stored on the stack:

  • Address of num: rbp-0x24
  • Address of r: rbp-0x18
  • Address of p: rbp-0x10
  • Address of x: rbp-0x20
  • Address of y: rbp-0x1c

In this build, the reference r occupies a pointer-sized stack slot (8 bytes, from rbp-0x18 up to rbp-0x10) that holds the address of num, and reading through r compiles to the same two instructions as reading through p. At the assembly level, the C++ reference behaves like a pointer.

2. ARM platform

Use clang++ with the -target arm-linux-android21 -S options to obtain the assembly code. For reference, the instructions that appear below have the following general forms:
MOV{condition}{S} destination register, source operand
STR{condition} source register, <memory address>
LDR{condition} destination register, <memory address>

main:
    .fnstart
@ %bb.0:
    .pad    #24
    sub sp, sp, #24     
    mov r0, #0          
    str r0, [sp, #20]    
    mov r1, #6          
    str r1, [sp, #16]   # Address of num sp+16
    add r1, sp, #16     # r1 = sp + 16   
    str r1, [sp, #12]   # Address of r sp + 12
    str r1, [sp, #8]    # p's address sp + 8
    ldr r1, [sp, #12]   
    ldr r1, [r1]
    str r1, [sp, #4]    # Address of x sp + 4
    ldr r1, [sp, #8]    
    ldr r1, [r1]
    str r1, [sp]        # Address of y sp
    add sp, sp, #24
    bx  lr
.Lfunc_end0:

As on the X86-64 platform, the assembly above shows that the C++ reference occupies a pointer-sized stack slot (r at sp+12, p at sp+8) and, at the assembly level, behaves like a pointer.

3. Can the address of a reference be obtained?

Reference as a local variable
Test code:

#include <stdio.h>
int main(){
    int num = 9;
    int& r = num;
    int& rr = r;
    printf("%p %p %p\n", (void*)&num, (void*)&r, (void*)&rr);
    return 0;
}

The output is as follows:
0x7ffee9a90118 0x7ffee9a90118 0x7ffee9a90118
As you can see, taking the address of a reference, or of a reference initialized from another reference, yields the address of the object that the original reference is bound to.
Reference as a member variable
When a reference is a member variable, taking its address in the normal way likewise yields only the address of the object it refers to.
However, with some pointer tricks you can still reach the real address of the reference itself and even change what it refers to.
The test code is as follows:

#include <cstdio>


struct S
{
    S(int &a,int *_p):r(a),p(_p){ }
    int& r;
    int *p;
};

int main()
{
    int num = 12;
    int tem = 199;
    S c(num,&num);

    printf("Address of num %p\n",(void*)&num);
    printf("Normally, get the address of the reference type:\n");
    printf("c.r address: %p c.p address: %p\n\n",(void*)&(c.r),(void*)&(c.p));

    // Step back one pointer-sized slot from c.p to reach the storage of c.r.
    // This relies on the compiler's member layout and is undefined behavior.
    long long *p = (long long *)(&(c.p)-1);
    printf("Get the address of the reference through the pointer:\n");
    printf("Real address of the reference c.r %p\n\n",(void*)p);
    
    printf("Dereference to get the address of num %p\n\n",(void*)(*p));
    printf("Through the pointer, try to change the direction of the reference:\n");
    printf("Original c.r = %d\n", c.r);

    // Overwrite the slot with the address of tem, so that c.r now refers to tem.
    long long temAddress = (long long)&tem;
    *p = temAddress;

    printf("Now c.r = %d\n",c.r);
    return 0;
}

The output is as follows:
Address of num 0x7fffac56c028
Normally, get the address of the reference type:
c.r address: 0x7fffac56c028 c.p address: 0x7fffac56c048

Get the address of the reference through the pointer:
Real address of the reference c.r 0x7fffac56c040

Dereference to get the address of num 0x7fffac56c028

Through the pointer, try to change the direction of the reference:
Original c.r = 12
Now c.r = 199

The output shows that taking the address of a reference member in the normal way always yields the address of the object it refers to. We cannot take the address of the reference member directly, but we can take the address of the neighboring pointer member. Since the assembly in the earlier sections showed that a reference occupies a pointer-sized slot, stepping back one pointer size from &c.p (0x7fffac56c048) gives the real storage address of c.r (0x7fffac56c040). That slot holds the address of num, and overwriting it with the address of tem makes c.r refer to tem.

4. Summary

C++ references are less powerful than pointers, but they are safer. A reference must be initialized when it is defined, and once defined it cannot be rebound to another object. There are no references to references: a reference initialized from another reference refers to the object the original reference is bound to.
At the language level, a reference is just an alias for an object. Taking the address of a reference only yields the address of the object it refers to.

These tests only show that, in the builds examined here, a reference occupies the same amount of memory as a pointer and that its real address can be reached through some special pointer operations. In production code, do not try to obtain the address of a reference, let alone change what it refers to: doing so goes against the design of the language and may cause unpredictable problems.

Topics: C++