Ownership in Rust

Posted by kickassamd on Mon, 03 Jan 2022 18:09:25 +0100

What is ownership

Ownership is a core concept that is unique to Rust. It lets Rust guarantee memory safety without a garbage collector, so understanding how ownership works is essential to learning the language. Related topics include borrowing, slices, and how data is laid out in memory.

Memory is a limited resource, and every program needs some way to manage the memory it uses. Common languages take different approaches:

Language	Memory management scheme
Java, Go	A garbage collector periodically looks for memory that is no longer in use and frees it, at the cost of extra memory and CPU time
C/C++	The programmer allocates and frees memory manually, which is error-prone and hard to debug
Rust	An ownership system manages memory through a set of rules that are checked only at compile time

Stack and heap
In many languages you rarely need to think about the difference between the stack and the heap, but in a language like Rust, whether a value lives on the stack or on the heap affects how the program behaves. Both are regions of memory available to the program at run time, but they are used differently. The stack is last in, first out, like a stack of plates: data is always added to and removed from the top. Everything stored on the stack must have a size that is known at compile time, which makes stack access very fast. Data whose size is unknown at compile time, or whose size may change, goes on the heap instead. When the program needs to store such data, it asks the memory allocator for a block of the required size (allocating memory) and receives a pointer to it; when the data is no longer needed, the block is returned (freeing memory) so that it can be reused.
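The difference can be sketched in a few lines. This is only an illustration: the helper name stack_and_heap is made up here, and Box is used as the simplest standard-library type that places a value on the heap.

```rust
use std::mem::size_of;

// Hypothetical helper for this illustration: returns one value that lives
// on the stack and one that lives on the heap.
fn stack_and_heap() -> (i32, Box<i32>) {
    let on_stack: i32 = 5;               // stored directly in the stack frame
    let on_heap: Box<i32> = Box::new(5); // value on the heap, pointer on the stack
    (on_stack, on_heap)
}

fn main() {
    let (a, b) = stack_and_heap();
    // `*b` follows the pointer to the heap to read the value.
    println!("stack: {}, heap: {}", a, *b);
    // The Box itself is just a pointer, so its size is known at compile time.
    println!("size of Box<i32>: {} bytes", size_of::<Box<i32>>());
}
```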

Ownership rules

Each value in Rust has a variable that is called its owner
A value has exactly one owner at a time
When the owner goes out of scope, the value is dropped

Variable scope

{ // s is not valid here; it's not yet declared
    let s = "hello"; // s is valid from this point forward
    // do stuff with s
} // this scope is now over, and s is no longer valid

String type

To illustrate ownership, we need a more complex type.

As we have seen before, string literals cannot be modified. Rust also provides a second string type, String, which can be modified. We can create a String from a string literal:

let s = String::from("hello");

The :: operator lets us call the from function, which is defined in the namespace of the String type.

let mut s = String::from("hello"); // Modifiable
s.push_str(", world!"); // push_str() appends a literal to a String
println!("{}", s); // this will print `hello, world!`

Memory allocation

The content of a string literal is known at compile time and is hard-coded into the executable, which is why string literals are fast and efficient. A String, however, must be able to hold text whose size and content are known only at run time, and it can grow dynamically. Its content therefore cannot be stored on the stack, where every value has a fixed size known to the compiler; it is stored on the heap, in memory that the program requests at run time.

This means two things: the memory must be requested at run time, and it must be returned when we are done with the String so that it can be reused; otherwise the program would eventually run out of memory.
The first part is simple and is done by us: calling String::from requests the memory.
The second part is harder. Languages with a garbage collector (GC) handle it automatically. In languages without a GC, the programmer has to free the memory at exactly the right time, which is one of the most error-prone parts of programming. If we forget to free it, the memory is wasted because it can never be used again. If we free it too early, any code that still points to it will access invalid memory.

Rust takes a different approach: once the variable that owns the memory goes out of scope, the memory is freed automatically!

{
    let s = String::from("hello"); // s is valid from this point forward
    // do stuff with s
} // this scope is now over, and s is no longer valid

When a variable goes out of scope, Rust calls a special function for us, named drop. The author of String put the code that frees the memory in this function. Rust calls drop automatically at the closing }.
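To watch drop being called, here is a minimal sketch. The type Noisy and the function drop_order are invented for this illustration; the shared log (an Rc<RefCell<...>>) is just one convenient way to record what happened.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type for this illustration: it records its name in a shared
// log at the moment it is dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // Rust calls this automatically when the owner goes out of scope.
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Noisy { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Noisy { name: "inner", log: Rc::clone(&log) };
        } // `_inner` is dropped here
    } // `_outer` is dropped here
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "outer"]
}
```

Nothing in drop_order calls drop explicitly; the calls happen at the two closing braces.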

In C++, this pattern of releasing a resource when its owner goes out of scope is called RAII (Resource Acquisition Is Initialization).
So far the pattern looks simple. It becomes more involved when several variables want to use the same data on the heap. Let's continue.

How variables and data interact: move

Let's look at a simple example:

fn main() {
    let x = 5;
    let y = x;
    println!("x is {}, y is {}", x, y);
    
}

First, a variable x is defined and bound to 5; then a variable y is defined. To initialize y, the value of x is copied and the copy is bound to y, so a later change to one does not affect the other. Note that x and y are integers: their size is known at compile time, so they are stored entirely on the stack. A simple type like this involves no heap allocation.

Now let's try the same thing with a String, which can grow dynamically and keeps its content on the heap:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1;
    println!("s1 is {}, s2 is {}", s1, s2);
}

There seems to be no problem, but an error is reported during compilation:

error[E0382]: borrow of moved value: `s1` 
value borrowed here after move

s1 has been moved and can no longer be borrowed!

A String value consists of three parts: a pointer to the memory that holds the content, a length, and a capacity. These three fields are stored on the stack, and the pointer refers to the content on the heap.

When s2 = s1 runs, only the three stack fields (pointer, length, and capacity) are copied; their size is fixed. The heap memory that the pointer refers to is not copied, because its size is not known at compile time and it may be large, which would make copying slow.

If that were the whole story, with no move involved, s1 and s2 would both point to the same heap memory.

That is a problem. As mentioned earlier, a value is released when its owner goes out of scope, so both s1 and s2 would try to free the same memory. Freeing memory twice (a double free) can corrupt memory and cause serious errors.
To prevent this, Rust treats s1 as invalid after s2 = s1. Nothing is freed when s1 goes out of scope, and the program above can no longer use s1.

After y = x, x can still be used because y received a complete copy of x, and each of them can be cleaned up independently when it goes out of scope. After s2 = s1 that is not the case: Rust copied only the stack part of s1 (pointer, length, and capacity), not the heap memory behind the pointer. To avoid freeing that memory twice, only s2 frees it, and s1 is no longer valid.
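One way to check that a move copies only the stack part is to compare the address of the heap buffer before and after. This is a sketch; the helper name move_keeps_buffer is made up for the illustration.

```rust
// Hypothetical helper: moves a String and reports whether the heap buffer
// is still at the same address afterwards.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap buffer (a raw pointer, not a borrow)
    let s2 = s1;              // move: pointer, length and capacity are copied, s1 becomes invalid
    let after = s2.as_ptr();  // same buffer, now owned by s2
    before == after
}

fn main() {
    println!("same heap buffer after move: {}", move_keeps_buffer());
}
```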

How variables and data interact: clone

What if we really do want to copy a String, including the data on the heap and not just the data on the stack? We can use a common method called clone:

fn main() {
    let s1 = String::from("hello");
    let s2 = s1.clone();
    println!("s1 is {}, s2 is {}", s1, s2);
}
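To see that the clone really is independent, the following sketch changes the copy and leaves the original alone. The helper name clone_is_independent is an assumption made for this example.

```rust
// Hypothetical helper: clones a String, changes only the clone, and
// returns both so the caller can compare them.
fn clone_is_independent() -> (String, String) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // deep copy: s2 gets its own heap buffer
    s2.push_str(", world");  // only the clone changes
    (s1, s2)
}

fn main() {
    let (s1, s2) = clone_is_independent();
    println!("s1 is {}, s2 is {}", s1, s2);
}
```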

Stack-only data: Copy

Recall the earlier example with x and y:

let x = 5;
let y = x;
println!("x = {}, y = {}", x, y);

Here x is still valid after the assignment even though we never called clone, which seems to contradict what we just saw with String.
The reason is that types such as integers have a size known at compile time and are stored entirely on the stack, so copying them is cheap. There is no reason to invalidate x after y is created. For these types there is no difference between a deep copy and a shallow copy, so calling clone would do nothing different from the ordinary assignment.

If a type implements the Copy trait, a variable of that type is still usable after it has been assigned to another variable. A type cannot implement Copy if it, or any of its parts, implements the Drop trait; you have to choose one of the two.

Which types implement the Copy trait?

All integer types, such as i32 and u32
The Boolean type, bool
The character type, char
All floating-point types, such as f64
Tuples whose elements are all Copy. For example, (i32, i32) is Copy, but (i32, String) is not
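Your own types can opt in as well, as long as every field is Copy. The sketch below uses a made-up Point struct and a made-up shifted function to show that the original stays usable after assignment and after being passed by value.

```rust
// Hypothetical type: all fields are Copy, so the struct may derive Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes the point by value; for a Copy type the caller keeps its own copy.
fn shifted(p: Point) -> Point {
    Point { x: p.x + 1, y: p.y + 1 }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a;          // bitwise copy, `a` is still usable
    let c = shifted(a); // passing by value copies again
    println!("{:?} {:?} {:?}", a, b, c);
}
```

Adding a String field to Point would make the derive fail to compile, because String is not Copy.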

Ownership and functions

Passing a value to a function either copies it or moves it, following the same rules as assignment.

fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s); // s's value moves into the function...
                        // ... and so is no longer valid here
    let x = 5; // x comes into scope
    makes_copy(x); // x would move into the function,
                   // but i32 is Copy, so it's okay to
                   // still use x afterward
} // Here, x goes out of scope, then s. But because s's value was moved,
  // nothing special happens.

fn takes_ownership(some_string: String) {
    // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.
  
fn makes_copy(some_integer: i32) {
    // some_integer comes into scope
    println!("{}", some_integer);
} // Here, some_integer goes out of scope. Nothing special happens.

Return value and scope

Returning a value from a function also transfers ownership.

fn main() {
    let s1 = gives_ownership(); // gives_ownership moves its return
                                // value into s1
    let s2 = String::from("hello"); // s2 comes into scope
    let s3 = takes_and_gives_back(s2); // s2 is moved into
                                       // takes_and_gives_back, which also
                                       // moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 goes out of scope but was
  // moved, so nothing happens. s1 goes out of scope and is dropped.

// gives_ownership moves its return value into the function that calls it
fn gives_ownership() -> String {
    let some_string = String::from("hello"); // some_string comes into scope
    some_string // some_string is returned and moves out to the calling function
}

// takes_and_gives_back will take a String and return one
fn takes_and_gives_back(a_string: String) -> String {
    // a_string comes into scope
    a_string // a_string is returned and moves out to the calling function
}

Ownership always follows the same pattern: assigning a value to another variable moves it (types that implement Copy are copied instead, so both variables stay valid). When a variable that owns heap data goes out of scope, drop is called to clean the data up, unless ownership has been moved elsewhere.

Reference and borrowing

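Without references, a function that should not consume its argument has to hand the value back to the caller, for example in a tuple together with the result: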

fn main() {
    let s1 = String::from("hello");
    let (s2, len) = calculate_length(s1);
    println!("The length of '{}' is {}.", s2, len);
}

fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length of a String
    (s, length)
}

Passing ownership in and handing it back every time is tedious. Fortunately, Rust supports references, which solve this problem. Here is a simple example:

fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1);
    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}

The argument is &s1. The & creates a reference to s1 instead of transferring ownership, so s1 is not dropped after it has been passed to calculate_length. The parameter type &String, also written with &, says that the function accepts a reference to a String. Because s does not own the value it refers to, nothing is dropped when s goes out of scope.
In memory, s is a pointer to s1, which in turn holds the pointer to the string data on the heap.

Having a reference as a function parameter is called borrowing.

To modify a borrowed value inside a function, three things must change: the original variable must be declared with mut, the argument must be passed as &mut s, and the parameter must have the type &mut String.

fn main() {
    let mut s = String::from("hello");
    change(&mut s);
}
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

Mutable references have one big restriction: there can be only one mutable reference to a given value at a time. This restriction prevents data races.
The following code fails to compile:

let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s; // error[E0499]: cannot borrow `s` as mutable more than once at a time
println!("r1 is {}, r2 is {}", r1, r2); // the error is reported only because r1 is still used here

A data race occurs when these three conditions hold:

Two or more pointers access the same data at the same time;
At least one of the pointers is used to write to the data;
No mechanism is used to synchronize access to the data;

Data races cause undefined behavior and are hard to diagnose and fix at run time. Rust does not let the problem reach run time: the compiler detects it and refuses to compile the code.

As usual, we can create a new scope with a pair of curly braces {}, which allows multiple mutable references, just not simultaneous ones.

let mut s = String::from("hello");
{
    let r1 = &mut s;
} // r1 goes out of scope here, so we can make a new reference with no
// problems.
let r2 = &mut s;

A similar rule applies to combining mutable and immutable references to the same value.

let mut s = String::from("hello");
let r1 = &s; // no problem
let r2 = &s; // no problem
let r3 = &mut s; // error[E0502], but only because r1 is used again below
println!("r1 is {}", r1);
println!("r3 is {}", r3);

Error message:

error[E0502]: cannot borrow `s` as mutable because it is also borrowed as immutable

While a value is borrowed immutably, it cannot also be borrowed mutably.
This is not allowed because code that holds an immutable reference expects the value not to change underneath it, and a mutable reference would break that expectation.
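Note that a borrow lasts only until the reference is used for the last time, not until the end of the enclosing block. The following sketch (with a made-up function name, borrows_in_sequence) compiles because the immutable references are finished before the mutable one is created.

```rust
// Hypothetical helper showing that a borrow ends at its last use.
fn borrows_in_sequence() -> String {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s;
    let joined = format!("{} {}", r1, r2); // last use of r1 and r2

    let r3 = &mut s; // allowed: the immutable borrows are no longer in use
    r3.push_str(", world");

    format!("{} | {}", joined, s)
}

fn main() {
    println!("{}", borrows_in_sequence());
}
```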

Dangling references

In languages with pointers, a common mistake is the dangling pointer: a pointer to memory that has already been freed or handed to something else. In Rust, by contrast, the compiler guarantees that references never dangle.
For example, the following code does not compile:

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // error[E0106]: missing lifetime specifier
    let s = String::from("hello");
    &s
}

The error message refers to a Rust feature we have not covered yet: lifetimes. They will be explained in detail later.

Why does this error occur? s is created inside dangle, and the function returns a reference to it. But when dangle returns, s goes out of scope and is dropped, so the reference would point to freed memory, that is, it would dangle.

The fix is to return the String itself, which moves ownership out to the caller:

fn main() {
    let reference_to_nothing = dangle();
}
fn dangle() -> String {
    let s = String::from("hello");
    s
}

Let's summarize the rules of references:
At any given time, you can have either one mutable reference or any number of immutable references;
References must always be valid;

Slice type

The slice is another kind of type that does not take ownership. A slice is a reference to a contiguous sequence of elements in a collection, rather than to the whole collection.

String slice

fn main() {
    let s = String::from("Hello world.");

    let hello = &s[0..5];
    let world = &s[6..11];
    println!("{}   and {}", hello, world);
}

The slice hello refers to bytes 0 through 4 of s; the byte at index 5 is not included.
We create a slice by giving a range in brackets, [start..end], where start is the first position in the slice and end is one past the last position. The length of the slice is end - start.
The following two slices are the same:

let s = String::from("hello");
let slice = &s[0..2];
let slice = &s[..2]; // The beginning 0 is omitted

// The following two slices are the same:
let s = String::from("hello");
let len = s.len();
let slice = &s[3..len];
let slice = &s[3..]; // Omission of ending

// The following two slices are the same:
let s = String::from("hello");
let len = s.len();
let slice = &s[0..len];
let slice = &s[..]; // Omission of beginning and end

Be careful with multi-byte UTF-8 text: slice indices are byte offsets, and if an index falls in the middle of a multi-byte character, the program panics at run time.
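When the index comes from outside and may not be a valid boundary, the standard method str::get returns an Option instead of panicking. A small sketch follows; the wrapper name safe_prefix is made up for this example.

```rust
// Hypothetical helper: returns the first `n` bytes of `s` as a slice, or
// None if `n` is out of range or falls inside a multi-byte character.
fn safe_prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n) // `str::get` is the non-panicking counterpart of `&s[..n]`
}

fn main() {
    let s = "h\u{e9}llo"; // U+00E9 (e with acute accent) takes two bytes in UTF-8
    println!("{:?}", safe_prefix(s, 1)); // the one-byte prefix "h"
    println!("{:?}", safe_prefix(s, 2)); // None: byte 2 is inside the two-byte character
    println!("{:?}", safe_prefix(s, 3)); // the prefix up to and including that character
}
```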

String literals are slices

let s = "Hello, world!";

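The type of s is &str: a slice pointing at the text stored in the program binary. Because &str is an immutable reference, string literals cannot be modified.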

String slice as parameter

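// Accepts only a reference to a String
fn first_word(s: &String) -> &str {
    // ...
}

// Accepts any string slice, so it works for both String and &str values
fn first_word(s: &str) -> &str {
    // ...
}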

String literals can also be passed in directly:

fn main() {
    let my_string = String::from("hello world");
    
    // first_word works on slices of `String`s
    let word = first_word(&my_string[..]);
    let my_string_literal = "hello world";
    
    // first_word works on slices of string literals
    let word = first_word(&my_string_literal[..]);
    
    // Because string literals *are* string slices already,
    // this works too, without the slice syntax!
    let word = first_word(my_string_literal);
}
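The body of first_word is elided above. One possible implementation, given here only as a sketch and assuming that "word" means everything before the first space, scans the bytes and returns a slice of the input:

```rust
// One possible body for first_word: returns the slice up to the first
// space, or the whole string if it contains no space.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            return &s[..i]; // a space is a single byte, so `i` is a valid boundary
        }
    }
    s
}

fn main() {
    let my_string = String::from("hello world");
    println!("{}", first_word(&my_string)); // &String coerces to &str
    println!("{}", first_word("single"));
}
```

Because the result borrows from the input, the compiler will not let the String be modified or dropped while the returned slice is still in use.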

Other slices
String slices are specific to strings. There is also a more general slice type that works for other collections, such as arrays:

let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];

The type of this slice is &[i32], which is different from the string slice type &str.
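As with &str, taking &[i32] as a parameter makes a function more flexible. In the sketch below (the function name sum is chosen for the illustration), the same function accepts part of an array, a whole array, and a Vec.

```rust
// Hypothetical helper: sums any contiguous run of i32 values.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let v = vec![10, 20, 30];
    println!("{}", sum(&a[1..3])); // elements at index 1 and 2
    println!("{}", sum(&a));       // the whole array, coerced to a slice
    println!("{}", sum(&v));       // &Vec<i32> also coerces to &[i32]
}
```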

Topics: Back-end Rust

Programmer Think