The Rust Programming Language - Chapter 19 advanced features - 19.1 unsafe Rust

Posted by bliss322 on Sat, 04 Dec 2021 23:07:41 +0100

19 advanced features

We will learn more advanced features in this chapter

19.1 unsafe Rust

So far, all the code we have written has been checked by Rust at compile time to guarantee memory safety. However, Rust also provides a second mode, unsafe Rust. Such code is written in unsafe blocks. It looks no different from regular code, but it gives us extra capabilities that safe Rust cannot provide

Why design unsafe Rust? There are two reasons:

1. Static analysis is conservative by nature. When the compiler checks whether a piece of code upholds its guarantees and cannot be sure, it rejects the code, even if the code itself is safe. As a result, some valid programs are rejected by mistake

We can write code that the compiler cannot verify in unsafe Rust, but the downside is that we are responsible for the safety of that code ourselves

2. The underlying computer hardware is inherently unsafe. If unsafe Rust were not allowed, some tasks would be impossible. Rust needs to support low-level systems programming, such as interacting directly with the operating system or even writing our own operating system. This is one of Rust's goals and one of its strengths

The superpowers of unsafe Rust are the following five

1. Dereference a raw pointer

2. Call an unsafe function or method

3. Access or modify a mutable static variable

4. Implement an unsafe trait

5. Access the fields of a union

Also note: unsafe does not turn off the borrow checker or disable any of Rust's other checks. If you use references in unsafe code, they are still checked. The unsafe keyword only gives access to the five features above, which the compiler does not check for memory safety. So we still get some degree of safety inside unsafe blocks

Also, code in an unsafe block is not necessarily dangerous. As mentioned above, it may be perfectly safe code that the compiler simply cannot verify

When we write unsafe code, we want to isolate it. Wrapping it in a safe abstraction and exposing a safe API is a good approach. This keeps unsafe from leaking out into the rest of the program

Let's look at these features

Dereferencing a raw pointer

In regular code, the compiler always ensures that references are valid

Unsafe Rust has two new types similar to references called raw pointers. Like references, raw pointers are either immutable or mutable, written as *const T and *mut T respectively. The * here is part of the type name, not the dereference operator. Immutable means that the pointer cannot be assigned through directly after it is dereferenced

Differences between raw pointers and references or smart pointers

1. Raw pointers are allowed to ignore the borrowing rules. You can have both immutable and mutable pointers, or multiple mutable pointers, to the same location

2. They are not guaranteed to point to valid memory

3. They are allowed to be null

4. They do not implement any automatic cleanup

By giving up Rust's safety guarantees, we can gain performance or the ability to interface with another language or with hardware

fn main() {
    let mut num = 5;

    let r1 = &num as *const i32;
    let r2 = &mut num as *mut i32;

    println!("{:?},{:?}",r1,r2)
}
     Running `target/debug/advancedfunction`
0x7ffeed3519fc,0x7ffeed3519fc

Creating immutable and mutable raw pointers from references and printing them (a raw pointer created directly from a safe reference is guaranteed to be valid). Here, as casts the immutable reference and the mutable reference to their corresponding raw pointer types

We can create raw pointers outside an unsafe block; we just cannot dereference them outside one

Let's create a raw pointer whose validity cannot be determined. Trying to use arbitrary memory is undefined behavior. There may or may not be data at this address. The compiler might optimize the memory access away, or the program might fail with a segmentation fault. There is usually no good reason to write such code, but it is possible

fn main() {
  let address = 0x012345usize;
  let r = address as *const i32;

  println!("{:?}",r)
}

Creating a raw pointer to an arbitrary memory address

Now we dereference the raw pointer in the unsafe block, because we can't dereference the raw pointer outside the unsafe block

fn main() {
    let mut num = 5;
    
    let r1 = &num as *const i32;
    let r2 = &mut num as *mut i32;

    println!("{:?},{:?}",r1,r2);
        
    unsafe {
        println!("r1 is: {}", *r1);
        println!("r2 is: {}", *r2)
    }
}
     Running `target/debug/advancedfunction`
0x7ffeec00695c,0x7ffeec00695c
r1 is: 5
r2 is: 5	

Creating a raw pointer does no harm. Only when we access the value it points to might we end up dealing with an invalid value
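As a small illustration of that point (a sketch that is not part of the original notes; the function name null_and_valid is made up for this example), creating a null pointer and a valid pointer needs no unsafe block, and only the read does:

```rust
use std::ptr;

// Hypothetical helper for illustration only.
// Returns (is the null pointer null?, value read through the valid pointer).
fn null_and_valid() -> (bool, i32) {
    let value = 7;

    // Creating raw pointers, even a null one, needs no unsafe block.
    let null_ptr: *const i32 = ptr::null();
    let valid_ptr: *const i32 = &value;

    // Only the dereference is unsafe; check for null before reading.
    let read = if valid_ptr.is_null() {
        0
    } else {
        // SAFETY: valid_ptr comes from a live reference to `value`.
        unsafe { *valid_ptr }
    };

    (null_ptr.is_null(), read)
}

fn main() {
    let (is_null, read) = null_and_valid();
    println!("null pointer is null: {}, value read: {}", is_null, read);
}
```

A null check only rules out one kind of invalid pointer. A non-null pointer can still dangle, so the responsibility stays with whoever writes the unsafe block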

Although raw pointers let us create a mutable pointer and an immutable pointer to the same address at the same time, modifying the data through the mutable pointer can potentially cause a data race. Be careful
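To make the aliasing concrete, here is a minimal single-threaded sketch (not from the original notes; bump_through_raw is a made-up name). It assumes both raw pointers are derived from one mutable borrow and that no other thread touches the value, which is what keeps this particular use sound:

```rust
// Hypothetical helper for illustration only.
// Returns the value seen through the read pointer before and after the write.
fn bump_through_raw(value: &mut i32) -> (i32, i32) {
    // Derive both raw pointers from the same mutable borrow.
    let writer: *mut i32 = value;
    let reader: *const i32 = writer;

    // SAFETY: both pointers point at `value`, which is live and exclusively
    // borrowed for the duration of this call, and everything runs on one thread.
    unsafe {
        let before = *reader;
        *writer += 1; // the write is visible through the aliasing read pointer
        let after = *reader;
        (before, after)
    }
}

fn main() {
    let mut num = 5;
    let (before, after) = bump_through_raw(&mut num);
    println!("before: {}, after: {}", before, after);
}
```

If the two pointers were handed to different threads without synchronization, the same reads and writes would be a data race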

Since raw pointers are this dangerous, why have them at all? One use is calling C code. Another is building safe abstractions that the borrow checker cannot understand

Calling an unsafe function or method

Unsafe functions and methods look just like regular functions and methods, except that they have the unsafe keyword in front of the definition. The keyword means the function has requirements that we are responsible for upholding when we call it, because Rust cannot check them for us

unsafe fn dangerous() {}

fn main() {
    // Calling an unsafe function requires an unsafe block
    unsafe {
        dangerous();
    }
}

An unsafe function must be called inside an unsafe block; otherwise the compiler reports an error

src/main.rs:14:5
   |
14 |     dangerous()
   |     ^^^^^^^^^^^ call to unsafe function
   |
   = note: consult the function's documentation for information on how to avoid undefined behavior

Wrapping the call in an unsafe block tells the compiler that we know what we are doing. The body of an unsafe function also acts as an unsafe block, so performing other unsafe operations inside it needs no additional unsafe block (the Rust 2024 edition warns by default when code relies on this and prefers an explicit unsafe block)
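A slightly fuller sketch of an unsafe function with a documented contract may help here. It is not from the original notes: read_at is a hypothetical name, and the "# Safety" comment is just a common documentation convention. The explicit unsafe block inside the body is optional in older editions:

```rust
/// Reads the i32 stored `index` elements past `ptr`.
///
/// # Safety
/// `ptr` must point to at least `index + 1` initialized, readable i32 values.
/// (Hypothetical helper for illustration only.)
unsafe fn read_at(ptr: *const i32, index: usize) -> i32 {
    // SAFETY: the caller promises that `ptr.add(index)` stays in bounds.
    unsafe { *ptr.add(index) }
}

fn main() {
    let data = [10, 20, 30];

    // SAFETY: `data` has three elements, so index 1 is in bounds.
    let second = unsafe { read_at(data.as_ptr(), 1) };
    println!("second element: {}", second);
}
```

The caller's unsafe block is where the contract is discharged: the comment there states why the requirement in the function's documentation holds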

Creating a safe abstraction over unsafe code

A function that contains some unsafe code is not necessarily unsafe as a whole. Wrapping unsafe code in a safe function is a common abstraction

fn main() {
    let mut v = vec![1,2,3,4,5,6];

    let r = &mut v[..];

    let(a,b) = r.split_at_mut(3);

    assert_eq!(a,&mut [1,2,3]);
    assert_eq!(b,&mut [4,5,6]);
}

Here we use the split_at_mut function, which splits one slice into two

An implementation using only safe Rust might look like the following. For simplicity it works only on i32 rather than on a generic T

pub fn split_at_mut(slice: &mut [i32],mid:usize) -> (&mut [i32],&mut [i32]) {
    let len = slice.len();
    assert!(mid <= len);

    (&mut slice[..mid],
     &mut slice[mid..])
}
error[E0499]: cannot borrow `*slice` as mutable more than once at a time
 --> src/main.rs:9:11
  |
4 |   pub fn split_at_mut(slice: &mut [i32],mid:usize) -> (&mut [i32],&mut [i32]) {
  |                              - let's call the lifetime of this reference `'1`
...
8 |       (&mut slice[..mid],
  |       -     ----- first mutable borrow occurs here
  |  _____|
  | |
9 | |      &mut slice[mid..])
  | |___________^^^^^_______- returning this value requires that `*slice` is borrowed for `'1`
  |             |
  |             second mutable borrow occurs here

We borrowed two different parts of the slice. This is safe because the two parts do not overlap, but Rust is not smart enough to see that. It only knows that we borrowed from the same slice twice, so it rejects the code. We have to use unsafe Rust to make this work

use std::slice;

pub fn split_at_mut(slice: &mut [i32],mid:usize) -> (&mut [i32],&mut [i32]) {
    let len = slice.len();
    let ptr = slice.as_mut_ptr();
    assert!(mid <= len);

    unsafe {
        (slice::from_raw_parts_mut(ptr, mid),
         slice::from_raw_parts_mut(ptr.add(mid), len-mid),)
    }
}

Let's look at the details

Because we want to borrow the same slice twice (actually two different parts of it), we use a raw pointer. as_mut_ptr returns a raw pointer to the start of a mutable slice

pub const fn as_mut_ptr(&mut self) -> *mut T {
        self as *mut [T] as *mut T
    }

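With raw pointers, we can use the same memory more than once inside an unsafe block. Now let's look at what from_raw_parts_mut does: it builds a mutable slice from a raw pointer and a length. Calling it twice, once for each half, is what splits the slice in two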

pub unsafe fn from_raw_parts_mut<'a, T>(data: *mut T, len: usize) -> &'a mut [T] {
    debug_assert!(is_aligned_and_not_null(data), "attempt to create unaligned or null slice");
    debug_assert!(
        mem::size_of::<T>().saturating_mul(len) <= isize::MAX as usize,
        "attempt to create slice covering at least half the address space"
    );
    unsafe { &mut *ptr::slice_from_raw_parts_mut(data, len) }
}

Note: we do not need to mark the resulting split_at_mut function as unsafe, and it can be called from safe Rust. We have created a safe abstraction over unsafe code: the implementation uses unsafe in a safe way, because the raw pointers it creates come directly from the function's parameters, which are valid, and the two resulting slices do not overlap
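As a sketch of how this looks from the caller's side (not from the original notes; split_at_mut_generic is a made-up name, and making it generic over T is an assumption that goes beyond the i32 version above), the caller never writes unsafe:

```rust
use std::slice;

// Hypothetical generic variant of the function above, for illustration only.
fn split_at_mut_generic<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // This check is what makes the unsafe block below sound.
    assert!(mid <= len);

    // SAFETY: `ptr` is valid for `len` elements and `mid <= len`, so the two
    // slices cover [0, mid) and [mid, len) and never overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut words = vec!["a", "b", "c", "d"];

    // No unsafe block is needed at the call site.
    let (left, right) = split_at_mut_generic(&mut words, 1);
    left[0] = "z";
    right[0] = "y";

    println!("{:?}", words);
}
```

Both halves can be modified at the same time, which is exactly what the borrow checker refused to allow in the safe-only version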

However, the following code would likely crash when the slice is used

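use std::slice;

fn main() {
    let address = 0x012345usize;
    // from_raw_parts_mut takes a *mut T, so cast to a mutable raw pointer
    let r = address as *mut i32;

    // Undefined behavior: we do not own the memory at this address
    let slice: &[i32] = unsafe {
        slice::from_raw_parts_mut(r, 10000)
    };
}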

Creating a slice from an arbitrary memory address

We do not own the memory at this arbitrary address, and there is no guarantee that the slice this code creates contains valid i32 values. Attempting to use the slice as though it were valid results in undefined behavior

Using extern functions to call external code

The extern keyword in Rust lets Rust code interact with code written in other languages by creating and using a foreign function interface (FFI). An FFI is the way a programming language defines functions so that a different (foreign) programming language can call them

extern "C" {
    fn abs(input:i32)-> i32;
}
fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}",abs(-3));
    }
}

Declaring and calling an extern function defined in another language

Functions declared in an extern block are always unsafe to call from Rust code, because other languages do not enforce Rust's rules and Rust cannot check them, so it is the programmer's responsibility to ensure safety

Within the extern "C" block, we list the names and signatures of the functions from another language that we want to call. The "C" part defines which application binary interface (ABI) the external function uses; the ABI defines how to call the function at the assembly level. The "C" ABI is the most common and follows the C programming language's ABI

Calling Rust functions from other languages

We can also use extern to create an interface that allows other languages to call Rust functions

Unlike with the extern block, here we add the extern keyword just before the fn keyword and specify the ABI to use. We also need to add a #[no_mangle] annotation to tell the Rust compiler not to mangle the name of this function. Mangling is when a compiler changes the name we gave a function to a different name that carries more information for other parts of the compilation process but is harder to read. Every programming language's compiler mangles names slightly differently, so for a Rust function to be nameable from other languages, we must disable the Rust compiler's name mangling

In the following example, once the code is compiled to a shared library and linked from C, the call_from_c function can be called from C code

#[no_mangle]
pub extern "C" fn call_from_c() {
    println!("Just called a Rust function from C!");
}

This use of extern does not require unsafe

Accessing or modifying a mutable static variable

Global variables are called static variables in Rust. Rust supports them, but they can be problematic with the ownership rules. If two threads access the same mutable global variable, a data race can result

static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("name is: {}", HELLO_WORLD);
}

Defining and using an immutable static variable

Static variables are similar to constants. By convention their names are written in uppercase with underscores between words. Accessing an immutable static variable is safe. One difference from constants is that a static variable has a fixed address in memory, whereas a constant is allowed to duplicate its data wherever it is used. Another difference is that static variables can be mutable. Accessing or modifying a mutable static variable is unsafe

static mut COUNTER:u32 =0;

fn add_to_count (inc:u32) {
    unsafe {
        COUNTER += inc;
    }
}
fn main() {
    add_to_count(3);

    unsafe {
        println!("COUNTER:{}", COUNTER);
    }
}

To avoid this kind of data race, prefer the concurrency techniques and thread-safe smart pointers discussed in Chapter 16, so that the compiler can check that data shared between threads is accessed safely
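One way to follow that advice for the counter above is an atomic integer. The sketch below is not from the original notes: it assumes std::sync::atomic::AtomicU32 is an acceptable replacement for the static mut, and it reuses the COUNTER and add_to_count names only to mirror the earlier example:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Assumption of this sketch: an atomic replaces `static mut COUNTER`.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    // fetch_add is a single atomic read-modify-write, so no unsafe is needed.
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

fn main() {
    // Several threads update the same global without a data race.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("COUNTER: {}", COUNTER.load(Ordering::SeqCst));
}
```

The static is no longer declared mut, so the compiler can verify every access, and the unsafe blocks disappear entirely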

Implementing an unsafe trait

A trait is unsafe when at least one of its methods has an invariant that the compiler cannot verify. We declare a trait unsafe by adding the unsafe keyword before trait, and every implementation of the trait must be marked unsafe as well

unsafe trait Foo {
    //methods go here
}
unsafe impl Foo for i32 {
    //method implementations go here
}

Defining and implementing an unsafe trait

Recall from the section "Extensible Concurrency with the Sync and Send Traits" in Chapter 16 that the compiler automatically implements Send and Sync for types composed entirely of Send and Sync types. If we implement a type that contains something that is not Send or Sync, such as a raw pointer, and we want to mark that type as Send or Sync, we must use unsafe
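A minimal sketch of that situation follows. It is not from the original notes: OwnedRaw and increment_on_other_thread are made-up names, and the claim that Send is sound here rests on the stated assumption that the wrapper is the sole owner of its heap allocation:

```rust
use std::thread;

// Hypothetical wrapper for illustration only. A raw pointer is not Send,
// so the compiler does not implement Send for this struct automatically.
struct OwnedRaw(*mut u32);

// SAFETY (assumption of this sketch): OwnedRaw is the only owner of the
// allocation behind the pointer, so moving it to another thread is sound.
unsafe impl Send for OwnedRaw {}

impl OwnedRaw {
    fn new(value: u32) -> Self {
        OwnedRaw(Box::into_raw(Box::new(value)))
    }

    fn increment(&mut self) {
        // SAFETY: the pointer came from Box::into_raw and is still owned by self.
        unsafe { *self.0 += 1 }
    }

    fn get(&self) -> u32 {
        // SAFETY: same as above.
        unsafe { *self.0 }
    }
}

impl Drop for OwnedRaw {
    fn drop(&mut self) {
        // SAFETY: reclaim the Box exactly once so the allocation is freed.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

fn increment_on_other_thread(mut owned: OwnedRaw) -> u32 {
    // Moving `owned` into the spawned thread compiles only because of the
    // unsafe impl Send above.
    thread::spawn(move || {
        owned.increment();
        owned.get()
    })
    .join()
    .unwrap()
}

fn main() {
    println!("{}", increment_on_other_thread(OwnedRaw::new(41)));
}
```

Removing the unsafe impl line makes thread::spawn reject the closure, which shows that the promise about thread safety is ours and not the compiler's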

Accessing fields of a union

A union is similar to a struct, but only one of its declared fields is used in a given instance at any one time

Unions are mainly used to interface with unions in C code. Accessing a union's fields is unsafe because Rust cannot guarantee the type of the data currently stored in the union instance. See the Rust Reference for more information
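The notes give no union example, so here is a small sketch under stated assumptions: IntOrFloat and float_bits are made-up names, and both fields are 4 bytes wide with every bit pattern valid for u32, which is why reading the other field is sound in this case:

```rust
// Hypothetical union for illustration only. Both fields share the same
// 4 bytes of storage; #[repr(C)] gives it a C-compatible layout.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

// Stores an f32 and reads the same bytes back as a u32.
fn float_bits(value: f32) -> u32 {
    // Writing a field when constructing the union is safe.
    let u = IntOrFloat { f: value };

    // SAFETY: both fields are 4 bytes and any bit pattern is a valid u32.
    // Reading a field is unsafe because Rust does not track which one was
    // written last.
    unsafe { u.i }
}

fn main() {
    println!("bits of 1.0: {:#x}", float_bits(1.0));
}
```

For this particular conversion, safe code would normally call f32::to_bits instead; the union version is shown only to illustrate field access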

When to use unsafe code

Use unsafe code only when it is necessary and we can guarantee its safety ourselves, because in these cases the compiler cannot help uphold memory safety
