The Rust lifetimes you don't know

Posted by ctcmedia on Wed, 05 Jan 2022 17:55:16 +0100

Rust - lifetimes

Original text: https://hashrust.com/blog/lifetimes-in-rust/

Translator: Han Xuanliang (a Go developer who loves open source and Rust)

Introduction

For many Rust beginners, lifetimes are a difficult concept to master. I also struggled for some time before I began to understand how important they are for the Rust compiler to do its job (checking moves and borrows). Lifetimes themselves are not difficult, but they are so novel that most programmers have never seen them in other languages. What's worse, people overload the word "lifetime" to talk about several closely related ideas. In this article, I will separate these ideas to give you the tools to think clearly about lifetimes.

Purpose

Before we get into the details, let's look at why we need lifetimes. What is their purpose? Lifetimes help the compiler enforce one simple rule: no reference may outlive the value it refers to. In other words, lifetimes help the compiler eliminate dangling-pointer bugs (translator's note: that is, lifetimes only come into play where references are involved).

In the examples below, you will see that the compiler does this by analyzing the lifetimes of the variables involved. If the lifetime of a reference is contained in the lifetime of the value it refers to, compilation succeeds; otherwise it fails.

"lifetime"

Part of the reason lifetimes are so confusing is that most writing about Rust uses the term loosely to refer to three different things:

  • The actual lifetimes of variables
  • Lifetime constraints
  • Lifetime annotations

Let's go through them one by one.

Variable lifetimes and constraints

The way variables interact in the code imposes constraints on their lifetimes. For example, in the following code, the line x = &y; adds the constraint that the lifetime of x must be contained in the lifetime of y (x ≤ y):

// error: `y` does not live long enough
{
    let x: &Vec<i32>;
    {
        let y = Vec::new();//----+
//                               | y's lifetime
//                               |
        x = &y;//----------------|--------------+
//                               |              |
    }// <------------------------+              | x's lifetime
    println!("x's length is {}", x.len());//    |
}// <-------------------------------------------+

Without this constraint, x would access invalid memory in the println! line, because x is a reference to y, and y is destroyed on the previous line.

Note that a constraint does not change the actual lifetimes. The lifetime of x, for example, still extends to the end of the outer block. Lifetime constraints are just a tool the compiler uses to reject dangling references. In the example above, the actual lifetimes do not satisfy the constraint: x outlives y. Therefore, the code does not compile.
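One way to satisfy the constraint is to make y live at least as long as every use of x. The following is only a minimal sketch of that idea, not part of the original example: the helper name length_via_reference and the vector contents are made up for illustration.

```rust
// Hypothetical helper (not from the original example): `y` is declared in
// the same scope as `x`, so `y` is still alive whenever `x` is used.
fn length_via_reference() -> usize {
    let y: Vec<i32> = vec![1, 2, 3];
    let x: &Vec<i32>;
    {
        x = &y; // constraint: x must not outlive y
    }
    x.len() // fine: y has not been dropped yet
}

fn main() {
    println!("x's length is {}", length_via_reference());
}
```

Here the inner block no longer owns y, so leaving it does not invalidate x.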

Lifetime annotations

As the previous section showed, the compiler can often generate all the lifetime constraints on its own. As code becomes more complex, however, the compiler needs the developer to add constraints manually, which is done with lifetime annotations. In the following code, for example, the compiler needs to know whether the reference returned by print_ret() borrows from s1 or s2, so it requires the programmer to state this constraint explicitly:

// error: missing lifetime specifier
// this function's return type contains a borrowed value,
// but the signature does not say whether it is borrowed from `s1` or `s2`
fn print_ret(s1: &str, s2: &str) -> &str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();
    let s1 = print_ret(&some_str, &other_str);
}

📒:
If you want to know why the compiler cannot work out for itself that the returned reference borrows from s2, take a look at this answer: here. To see when lifetime annotations can be left out, see the section on omitting lifetime annotations below.

The developer then marks both s2 and the returned reference with 'a to tell the compiler that the return value borrows from s2:

fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();
    let s1 = print_ret(&some_str, &other_str);
}

I want to emphasize, however, that marking both the parameter s2 and the returned reference with 'a does not mean that they have exactly the same lifetime. Instead, read it as: a returned reference marked 'a borrows from the parameter carrying the same annotation.

Because s2 in turn borrows from other_str, the lifetime constraint is that the returned reference must not outlive other_str. The constraint is satisfied here, so compilation succeeds:

fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}
fn main() {
    let some_str: String = "Some string".to_string();
    let other_str: String = "Other string".to_string();//-------------+
    let ret = print_ret(&some_str, &other_str);//---+                 | other_str's lifetime
    //                                              | ret's lifetime  |
}// <-----------------------------------------------+-----------------+
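Because s1 carries no 'a annotation, the returned reference is not tied to some_str at all. The sketch below rearranges the scopes in main to illustrate this (the rearrangement is an assumption made for the example, not taken from the original): some_str is dropped before ret is used, and the code should still compile because ret only borrows from other_str.

```rust
fn print_ret<'a>(s1: &str, s2: &'a str) -> &'a str {
    println!("s1 is {}", s1);
    s2
}

fn main() {
    let other_str: String = "Other string".to_string();
    let ret: &str;
    {
        let some_str: String = "Some string".to_string();
        ret = print_ret(&some_str, &other_str);
    } // some_str is dropped here, but ret does not borrow from it
    println!("ret is {}", ret);
}
```

Swapping the two arguments in the call would make ret borrow from some_str instead, and the compiler would then reject the final println!.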

Before showing more examples, here is a brief introduction to the lifetime annotation syntax.

To write a lifetime annotation, you must first declare a lifetime parameter. For example, <'a> is a lifetime declaration. A lifetime parameter is a kind of generic parameter, and you can read <'a> as "for some lifetime 'a...". Once declared, the lifetime parameter can be used on references to create lifetime constraints.

Remember, by marking references with 'a, the programmer is only stating constraints; it is then the compiler's job to find a concrete lifetime for 'a that satisfies them.

Example

Next, consider a function that finds the minimum of two values:

fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}
fn main() {
    let p = 42;
    {
        let q = 10;
        let r = min(&p, &q);
        println!("Min is {}", r);
    }
}

Here, the lifetime annotation 'a marks the parameters x and y as well as the return value. This means the return value may borrow from either x or y. Since x and y in turn borrow from p and q respectively, the lifetime of the returned reference must be contained in the lifetimes of both p and q. This code compiles because those constraints are satisfied:

fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}
fn main() {
    let p = 42;//-------------------------------------------------+
    {//                                                           |
        let q = 10;//------------------------------+              | p's lifetime
        let r = min(&p, &q);//------+              | q's lifetime |
        println!("Min is {}", r);// | r's lifetime |              |
    }// <---------------------------+--------------+              |
}// <-------------------------------------------------------------+

In general, when the same lifetime parameter marks two or more parameters of a function, the returned reference must not outlive the shortest-lived of the corresponding arguments.

As a final example, consider a mistake many novice C++ developers make: returning a pointer to a local variable. Rust does not allow the equivalent:
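A practical consequence is that the reference returned by min cannot leave the block in which q lives. If the value is needed afterwards, one option is to copy it out while the reference is still valid. The sketch below assumes that approach; the variable name smallest is introduced only for illustration.

```rust
fn min<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y {
        x
    } else {
        y
    }
}

fn main() {
    let p = 42;
    let smallest: i32;
    {
        let q = 10;
        // The returned reference may point at `q`, so it must not escape
        // this block. Dereferencing copies the i32 out instead.
        smallest = *min(&p, &q);
    }
    println!("Min is {}", smallest);
}
```

This works because i32 is Copy; for a type that is not Copy, cloning the value would play the same role.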

// error: cannot return reference to local variable `i`
fn get_int_ref<'a>() -> &'a i32 {
    let i: i32 = 42;
    &i
}
fn main() {
    let j = get_int_ref();
}

Because get_int_ref() has no parameters, the returned reference here can only borrow from the local variable i, which is not allowed. The compiler rightly prevents this bug: the local variable would already have been cleaned up by the time the returned reference was used to access the memory it points to.

fn get_int_ref<'a>() -> &'a i32 {
    let i: i32 = 42;//-------+
    &i//                     | i's lifetime
}// <------------------------+
fn main() {
    let j = get_int_ref();//-----+
//                               | j's lifetime
}// <----------------------------+
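Two ways of getting such a function to compile are sketched below, as assumptions about what the caller might want rather than as the one correct fix: return the integer by value, or return a reference to data that lives for the whole program. The names get_int, get_static_int_ref and ANSWER are hypothetical.

```rust
// Hypothetical static item: it lives for the entire program, so a
// reference to it can have the 'static lifetime.
static ANSWER: i32 = 42;

// Option 1: return the value itself, so nothing is borrowed.
fn get_int() -> i32 {
    let i: i32 = 42;
    i
}

// Option 2: return a reference to data that outlives the function call.
fn get_static_int_ref() -> &'static i32 {
    &ANSWER
}

fn main() {
    let j = get_int();
    let k = get_static_int_ref();
    println!("j is {}, k is {}", j, k);
}
```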

Lifetime elision

When the compiler lets developers leave out lifetime annotations, this is called lifetime elision. Again, the term is misleading: lifetimes are inseparably tied to the creation and destruction of variables, so how could a lifetime be elided?

What is elided, then, is not the lifetime but the lifetime annotation and, by extension, the lifetime constraint. In early versions of the Rust compiler, elision was not allowed and every lifetime had to be annotated. Over time, the compiler team observed that the same annotation patterns kept recurring, so the compiler was changed to infer them.

The programmer can omit the annotations in the following cases:

  1. When there is exactly one input reference: in this case, the lifetime of that input reference is assigned to all output references. For example: fn some_func(s: &str) -> &str is inferred as fn some_func<'a>(s: &'a str) -> &'a str
  2. When there are multiple input references, but one of them is &self or &mut self: in this case, the lifetime of self is assigned to all output references. For example: fn some_method(&self) -> &str is equivalent to fn some_method<'a>(&'a self) -> &'a str

Eliding lifetime annotations reduces clutter in the code, and the compiler may infer lifetime constraints for more patterns in the future.
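
The two rules above can be sketched as follows. The function and type names (first_word, Greeter, name_with_hint) are invented for illustration; each elided signature is paired with the explicit form the compiler treats it as.

```rust
// Rule 1: a single input reference, so its lifetime is used for the output.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the annotations written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Hypothetical type used to illustrate rule 2.
struct Greeter {
    name: String,
}

impl Greeter {
    // Rule 2: several input references, but one of them is `&self`,
    // so the output borrows from `self`, not from `_hint`.
    fn name_with_hint(&self, _hint: &str) -> &str {
        &self.name
    }

    // The same method with the annotations written out.
    fn name_with_hint_explicit<'a>(&'a self, _hint: &str) -> &'a str {
        &self.name
    }
}

fn main() {
    let greeter = Greeter { name: String::from("Ferris") };
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
    println!("{}", greeter.name_with_hint("unused"));
    println!("{}", greeter.name_with_hint_explicit("unused"));
}
```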

Summary

Many newcomers to Rust find lifetimes hard to understand. The problem, however, is not lifetimes themselves but the way the concept is presented in much of the writing about Rust. In this article, I have tried to untangle the meanings hidden behind the overloaded word "lifetime".

The lifetimes of variables must satisfy constraints imposed by the compiler and by the developer before the compiler can be sure the code is sound. Without lifetimes, the compiler could not guarantee the safety of most Rust programs.

Topics: Rust