The Beauty of Rust: Applying Iterators to an Algorithm Problem

Posted by Cook on Sat, 15 Jan 2022 05:51:22 +0100

Preface

Starting from a simple algorithm problem, this article introduces several iterator-related methods in Rust and shows how to use chars, all, try_fold, enumerate, try_for_each, and char_indices.

The problem is as follows

We define the use of capitals in a word to be correct when one of the following holds:
All letters in the word are uppercase, such as "USA".
No letter in the word is uppercase, such as "leetcode".
Only the first letter of the word is uppercase, such as "Google".
Given a string word, return true if its use of capitals is correct; otherwise, return false.
Source: LeetCode
Link: https://leetcode-cn.com/probl...
The copyright belongs to LingKou Network. For commercial reprints, please contact the official site for authorization; for non-commercial reprints, please cite the source.

Analysis
According to the problem statement, a string satisfies the condition if:

  • all of its characters are lowercase; or
  • all of its characters are uppercase; or
  • the first character is uppercase and the rest are lowercase.

Original solution
Based on the analysis above, we only need to make the following three checks:

  • If all characters in word are uppercase, return true
  • If all characters in word are lowercase, return true
  • If the first character in word is uppercase and the remaining characters are lowercase, return true
  • Return false in all other cases
pub fn detect_capital_use(word: String) -> bool {
    if word.len() < 2 {
        return true;
    }

    // Check whether all letters are uppercase
    let mut res = true;
    for c in word.as_bytes() {
        if c.is_ascii_lowercase() {
            res = false;
            break;
        }
    }
    if res {
        return res;
    }

    // Check whether all letters are lowercase
    let mut res = true;
    for c in word.as_bytes() {
        if c.is_ascii_uppercase() {
            res = false;
            break;
        }
    }
    if res {
        return res;
    }

    // Check whether the first letter is uppercase and the rest are lowercase
    if word.as_bytes()[0].is_ascii_lowercase() {
        return false;
    }
    let mut res = true;
    for c in &word.as_bytes()[1..] {
        if c.is_ascii_uppercase() {
            res = false;
            break;
        }
    }
    if res {
        return res;
    }

    false
}

Using Iterators
The code above uses three explicit loops. With Rust's iterators it can be written much more concisely. The corresponding code is as follows:

pub fn detect_capital_use(word: String) -> bool {

    if word.len() ==0{
        return true
    }
    if word.len() ==1{
        return true
    }

    let mut word1 = word.chars(); // Returns the iterator of characters in word
    if word1.all(|x|x.is_lowercase()){ // All lowercase
        return true 
    }

    let mut word1 = word.chars();
    if word1.all(|x|x.is_uppercase()){ // All capitalized
        return true 
    }

    let mut word1 = word.chars();
    let first_word = word1.next().unwrap(); // Get the first character
    if first_word.is_lowercase(){  // Return false if the first character is lowercase
        return false
    }
    if word1.all(|x|x.is_lowercase()){ // The remaining lowercase
        return true 
    }

    false

}

Code analysis
In the code above, we call the chars() method to obtain an iterator over the characters of the String, and then use the iterator's all method to do the checking. The logic reads almost like natural language, which makes the code very readable. Let's first introduce the all method, which tests whether every item produced by the iterator satisfies the closure. The official documentation describes it as follows:

  1. Tests if every element of the iterator matches a predicate.
  2. all() takes a closure that returns true or false. It applies this closure to each element of the iterator, and if they all return true, then so does all(). If any of them return false, it returns false.
  3. all() is short-circuiting; in other words, it will stop processing as soon as it finds a false, given that no matter what else happens, the result will also be false.
  4. An empty iterator returns true.

In other words:

  1. It tests whether every element of the iterator satisfies the predicate (that is, the closure passed in).
  2. all() takes a closure that returns true or false and calls it on each element as the iterator is traversed. If the closure returns true for every element, all() returns true; otherwise all() returns false.
  3. all() is short-circuiting: as soon as the closure returns false for some element, traversal stops immediately, because the final result of all() is false no matter what the remaining elements are.
  4. all() on an empty iterator always returns true.
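
The points above can be checked with a small experiment. The sketch below is only an illustration; the helper name all_lowercase_counting is hypothetical and not part of the original solution. It counts how many times the closure is actually called, which makes the short-circuiting visible.

```rust
/// Hypothetical helper: runs `all` over the characters of `s` and reports
/// both the result and how many times the predicate was called.
fn all_lowercase_counting(s: &str) -> (bool, usize) {
    let mut calls = 0;
    let result = s.chars().all(|c| {
        calls += 1;
        c.is_lowercase()
    });
    (result, calls)
}

fn main() {
    // Every character passes, so the closure runs once per character.
    assert_eq!(all_lowercase_counting("leetcode"), (true, 8));
    // 'C' is the third character: all() stops there and never sees "def".
    assert_eq!(all_lowercase_counting("abCdef"), (false, 3));
    // An empty iterator never calls the closure and yields true.
    assert_eq!(all_lowercase_counting(""), (true, 0));
    println!("all() behaves as described");
}
```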

The source code corresponding to all() is as follows:

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn all<F>(&mut self, f: F) -> bool
    where
        Self: Sized,
        F: FnMut(Self::Item) -> bool,
    {
        #[inline]
        fn check<T>(mut f: impl FnMut(T) -> bool) -> impl FnMut((), T) -> ControlFlow<()> {
            move |(), x| {
                if f(x) { ControlFlow::CONTINUE } else { ControlFlow::BREAK }
            }
        }
        self.try_fold((), check(f)) == ControlFlow::CONTINUE
    }

You can see that all is implemented on top of try_fold(). The short-circuiting behavior of all() mentioned above comes directly from try_fold.

  • First, all wraps the incoming closure f in a new closure of type impl FnMut((), T) -> ControlFlow<()>. The new closure returns ControlFlow::CONTINUE when f(x) is true and ControlFlow::BREAK when f(x) is false. try_fold() exits early as soon as the closure returns ControlFlow::BREAK, which is what makes all() short-circuiting.
  • If try_fold() finally returns ControlFlow::CONTINUE, f(x) returned true for every element. If it returns ControlFlow::BREAK, f(x) returned false for some element along the way.
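
The same construction can be reproduced in user code. The following is a minimal sketch, not the standard library's implementation: my_all is a hypothetical name, and it uses the ControlFlow::Continue(()) and ControlFlow::Break(()) enum variants, which are what stable Rust exposes, rather than the CONTINUE and BREAK constants that appear in the quoted source.

```rust
use std::ops::ControlFlow;

/// Hypothetical helper that mirrors `Iterator::all` on top of `try_fold`.
fn my_all<I, F>(iter: &mut I, mut f: F) -> bool
where
    I: Iterator,
    F: FnMut(I::Item) -> bool,
{
    // The accumulator is `()`: we only care whether folding was interrupted.
    let flow: ControlFlow<()> = iter.try_fold((), |(), x| {
        if f(x) {
            ControlFlow::Continue(())
        } else {
            ControlFlow::Break(())
        }
    });
    matches!(flow, ControlFlow::Continue(()))
}

fn main() {
    let mut chars = "abCde".chars();
    assert!(!my_all(&mut chars, |c| c.is_lowercase()));
    // Short-circuit: 'C' was consumed, but "de" was never visited.
    assert_eq!(chars.as_str(), "de");

    assert!(my_all(&mut "usa".chars(), |c| c.is_lowercase()));
    assert!(my_all(&mut "".chars(), |c| c.is_lowercase()));
}
```

Taking the iterator by mutable reference makes it possible to inspect what is left after the early exit.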

Next, let's take a closer look at the try_fold method. The official documentation says:

  • An iterator method that applies a function as long as it returns successfully, producing a single, final value.
  • try_fold() takes two arguments: an initial value, and a closure with two arguments: an 'accumulator', and an element. The closure either returns successfully, with the value that the accumulator should have for the next iteration, or it returns failure, with an error value that is propagated back to the caller immediately (short-circuiting).
  • The initial value is the value the accumulator will have on the first call. If applying the closure succeeded against every element of the iterator, try_fold() returns the final accumulator as success.
  • Folding is useful whenever you have a collection of something, and want to produce a single value from it.

In other words:

  • try_fold applies a function to the elements of the iterator. As long as the function keeps returning success, it continues until a single final value is produced; if the function fails, it exits early.
  • try_fold takes two arguments: an initial value and a closure. The closure takes two arguments, the accumulated value and an element. On success, the closure returns the accumulated value to use for the next step; on failure, the error value is returned to the caller immediately (short-circuiting).
  • The initial value is the accumulated value used for the first closure call. If the closure succeeds for every element of the iterator, try_fold() returns the final accumulated value as success.
  • Folding is useful when you have a collection of elements and want to produce a single value from it.
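
As a small illustration of this description, the sketch below sums a slice of i8 values with try_fold and checked_add, so the fold stops at the first overflow. The function name checked_sum is made up for this example.

```rust
/// Hypothetical helper: sums the values, returning None as soon as the
/// running total no longer fits in an i8.
fn checked_sum(values: &[i8]) -> Option<i8> {
    // 0 is the initial accumulator; the closure returns Some(new_acc) on
    // success and None on overflow, which ends the fold immediately.
    values.iter().try_fold(0i8, |acc, &x| acc.checked_add(x))
}

fn main() {
    assert_eq!(checked_sum(&[1, 2, 3]), Some(6));
    // 100 + 20 = 120 still fits, but 120 + 10 = 130 exceeds i8::MAX (127).
    assert_eq!(checked_sum(&[100, 20, 10, 1]), None);
    // With no elements, the initial value is returned as success.
    assert_eq!(checked_sum(&[]), Some(0));
}
```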

The source code of try_fold is as follows, and it is also fairly simple:

    #[inline]
    #[stable(feature = "iterator_try_fold", since = "1.27.0")]
    fn try_fold<B, F, R>(&mut self, init: B, mut f: F) -> R
    where
        Self: Sized,
        F: FnMut(B, Self::Item) -> R,
        R: Try<Output = B>,
    {
        let mut accum = init;
        while let Some(x) = self.next() {
            accum = f(accum, x)?;
        }
        try { accum }
    }

You can see that try_fold repeatedly fetches elements through the iterator's next method and calls the function f on each one: accum = f(accum, x)?;. The ? here is Rust's error-propagation operator: if f(accum, x) returns a failure value such as ControlFlow::BREAK, try_fold returns immediately with that value as its own return value; otherwise the unwrapped result becomes the new accumulator.
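
The generic source relies on the unstable Try trait, but the same loop can be written on stable Rust for one concrete return type. The sketch below is a hypothetical, Option-only version (try_fold_option is not a standard library function) that shows how ? ends the loop early.

```rust
/// Hypothetical, Option-only version of `try_fold`, written with `?`.
fn try_fold_option<I, B, F>(iter: &mut I, init: B, mut f: F) -> Option<B>
where
    I: Iterator,
    F: FnMut(B, I::Item) -> Option<B>,
{
    let mut accum = init;
    while let Some(x) = iter.next() {
        // If f returns None, `?` returns None from this function at once.
        accum = f(accum, x)?;
    }
    Some(accum)
}

fn main() {
    let mut it = vec![100u8, 100, 100, 1].into_iter();
    let total = try_fold_option(&mut it, 0u8, |acc, x| acc.checked_add(x));
    // 100 + 100 = 200 fits in a u8, but adding the third 100 overflows.
    assert_eq!(total, None);
    // The loop stopped at the third element, so the last one is still unread.
    assert_eq!(it.next(), Some(1));
}
```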

Simplifying the all-lowercase and all-uppercase checks

In the code above, the all-lowercase check takes two lines: first obtain the iterator with chars, then call all on it. The two can be chained into a single expression, word.chars().all(|x| x.is_lowercase()), and the same goes for the all-uppercase check, as follows:

pub fn detect_capital_use(word: String) -> bool {

    if word.len() ==0{
        return true
    }
    if word.len() ==1{
        return true
    }

    if word.chars().all(|x|x.is_lowercase()){ // If all are lowercase
        return true 
    }

    if word.chars().all(|x|x.is_uppercase()){ // If all are capitalized
        return true 
    }

    let mut word1 = word.chars();
    let first_word = word1.next().unwrap(); // Get first character
    if first_word.is_lowercase(){  //Returns false if the first character is lowercase
        return false
    }
    if word1.all(|x|x.is_lowercase()){  // The rest are lowercase and return true
        return true 
    }

    false

}

Simplifying the check for an uppercase first character followed by lowercase characters

The iterator returned by chars yields only the elements themselves, not their positions in the original string. That is why we first call next to get the first character and check that it is uppercase, and then use all to check that the remaining characters are lowercase. If we could get the index and the element at the same time, we could check the first character and the others together inside the closure passed to all. To get an iterator that also yields the index, we can wrap the iterator with enumerate. The code is as follows:

pub fn detect_capital_use(word: String) -> bool {
    if word.len() == 0 {
        return true;
    }
    if word.len() == 1 {
        return true;
    }

    if word.chars().all(|x| x.is_lowercase()) {// If all are lowercase
        return true;
    }

    if word.chars().all(|x| x.is_uppercase()) {// If all are capitalized
        return true;
    }

    if word.chars().enumerate().all(|(id, x)| {
        if id == 0 {
            x.is_uppercase()  // The first character is uppercase
        } else {
            x.is_lowercase()  // The rest are lowercase
        }
    }) {
        return true;
    }

    false
}

You can see that each item produced by enumerate() is a tuple whose first element is the index, so we can decide based on both the index and the element. The enumerate() method returns a new iterator of type Enumerate. The corresponding source code is as follows:

pub struct Enumerate<I> {
    iter: I,
    count: usize,
}
impl<I> Enumerate<I> {
    pub(in crate::iter) fn new(iter: I) -> Enumerate<I> {
        Enumerate { iter, count: 0 }
    }
}

#[stable(feature = "rust1", since = "1.0.0")]
impl<I> Iterator for Enumerate<I>
where
    I: Iterator,
{
    type Item = (usize, <I as Iterator>::Item);

    /// # Overflow Behavior
    ///
    /// The method does no guarding against overflows, so enumerating more than
    /// `usize::MAX` elements either produces the wrong result or panics. If
    /// debug assertions are enabled, a panic is guaranteed.
    ///
    /// # Panics
    ///
    /// Might panic if the index of the element overflows a `usize`.
    #[inline]
    #[rustc_inherit_overflow_checks]
    fn next(&mut self) -> Option<(usize, <I as Iterator>::Item)> {
        let a = self.iter.next()?;
        let i = self.count;
        self.count += 1;
        Some((i, a))
    }

    ...
}

You can see that Enumerate simply wraps the underlying iterator and maintains an internal count that records the index; each call to next returns the current count as the first element of the tuple and then increments it. Merging the three checks into one expression gives:

pub fn detect_capital_use(word: String) -> bool {
    if word.len() == 0 {
        return true;
    }
    if word.len() == 1 {
        return true;
    }

    if word.chars().all(|x| x.is_lowercase()) // If all are lowercase
        || word.chars().all(|x| x.is_uppercase()) // If all are uppercase
        || word.chars().enumerate().all(|(id, x)| {
            if id == 0 {
                x.is_uppercase() // The first character is uppercase
            } else {
                x.is_lowercase() // The rest are lowercase
            }
            }
        })
    {
        return true;
    }

    false
}

Because all() on an empty iterator always returns true, and a single-letter word is always either all lowercase or all uppercase, the length checks above can also be omitted, which gives:

pub fn detect_capital_use(word: String) -> bool {

    if word.chars().all(|x| x.is_lowercase())
        || word.chars().all(|x| x.is_uppercase())
        || word.chars().enumerate().all(|(id, x)| {
            if id == 0 {
                x.is_uppercase()
            } else {
                x.is_lowercase()
            }
        })
    {
        return true;
    }

    false
}

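In addition, word.chars().enumerate() can be replaced here with word.char_indices(). Note that char_indices() yields byte offsets rather than character counts, but the first character is always at offset 0, so the check id == 0 behaves the same. The code can therefore be written as: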

pub fn detect_capital_use(word: String) -> bool {

    if word.chars().all(|x| x.is_lowercase())
        || word.chars().all(|x| x.is_uppercase())
        || word.char_indices().all(|(id, x)| {
            if id == 0 {
                x.is_uppercase()
            } else {
                x.is_lowercase()
            }
        })
    {
        return true;
    }

    false
}
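
For ASCII words the two iterators produce identical pairs, but they differ once a character takes more than one byte. The sketch below, with hypothetical helper names counted and offsets, shows the difference on a two-character string whose first character is U+00C9; in both cases the first pair carries index 0, which is all the solution above relies on.

```rust
/// Hypothetical helper: pairs each character with its position in
/// character counts.
fn counted(word: &str) -> Vec<(usize, char)> {
    word.chars().enumerate().collect()
}

/// Hypothetical helper: pairs each character with its byte offset in the
/// UTF-8 encoding of the string.
fn offsets(word: &str) -> Vec<(usize, char)> {
    word.char_indices().collect()
}

fn main() {
    // U+00C9 (an uppercase E with acute accent) takes two bytes in UTF-8.
    let word = "\u{c9}a";
    assert_eq!(counted(word), vec![(0, '\u{c9}'), (1, 'a')]);
    assert_eq!(offsets(word), vec![(0, '\u{c9}'), (2, 'a')]);

    // For pure ASCII input the two agree element by element.
    assert_eq!(counted("Google"), offsets("Google"));
}
```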

Turning three iterations into one
For a word in which only the first letter is capitalized, the algorithm above iterates over the word three times. This can be optimized further:

  • Check whether all letters after the first share the same case. In practice: check that every letter from the third onward has the same case as the second letter
  • If their cases differ, return false immediately
  • Otherwise, decide based on the first and second letters:
  • If the first letter is uppercase, return true (the word is all uppercase or has only its initial capitalized)
  • If both the first and second letters are lowercase, return true (the word is all lowercase)
  • Otherwise, return false

The code is as follows:

pub fn detect_capital_use(word: String) -> bool {

    let mut word = word.chars();
    let first = word.next();
    if first.is_none(){
        return true
    }
    let first = first.unwrap();

    if let Some(second) = word.next(){
        let res = word.try_for_each(move |x|{
            if second.is_lowercase() && x.is_lowercase(){
                return Ok(())
            }

            if second.is_uppercase() && x.is_uppercase(){
                return Ok(())
            }

            Err(())
        });

        if res.is_err(){
            return false
        }
        if first.is_uppercase(){
            return true
        }

        if first.is_lowercase() && second.is_lowercase(){
            return true
        }

        false
    }else{
        true
    }

}

The code here uses try_for_each. The official description is:
An iterator method that applies a fallible function to each item in the iterator, stopping at the first error and returning that error. That is, it runs a function that may fail on each element of the iterator; if the function returns an error, iteration stops immediately and that error is returned. The source code of try_for_each is as follows; you can see that it is also implemented by calling try_fold:

    fn try_for_each<F, R>(&mut self, f: F) -> R
    where
        Self: Sized,
        F: FnMut(Self::Item) -> R,
        R: Try<Output = ()>,
    {
        #[inline]
        fn call<T, R>(mut f: impl FnMut(T) -> R) -> impl FnMut((), T) -> R {
            move |(), x| f(x)
        }

        self.try_fold((), call(f))
    }
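
To make the relationship concrete, here is a small sketch with two hypothetical helpers that perform the same check, once with try_for_each and once with try_fold and a unit accumulator. Both stop at the first character that is not lowercase and report it as the error value.

```rust
/// Hypothetical helper: returns Err with the first character that is not
/// lowercase, or Ok(()) if there is none.
fn first_non_lowercase(s: &str) -> Result<(), char> {
    s.chars()
        .try_for_each(|c| if c.is_lowercase() { Ok(()) } else { Err(c) })
}

/// The same check expressed with try_fold; the accumulator carries no data.
fn first_non_lowercase_fold(s: &str) -> Result<(), char> {
    s.chars()
        .try_fold((), |(), c| if c.is_lowercase() { Ok(()) } else { Err(c) })
}

fn main() {
    assert_eq!(first_non_lowercase("leetcode"), Ok(()));
    // 'E' is the first offending character; 'C' is never reached.
    assert_eq!(first_non_lowercase("leEtCode"), Err('E'));
    assert_eq!(first_non_lowercase_fold("leetcode"), Ok(()));
    assert_eq!(first_non_lowercase_fold("leEtCode"), Err('E'));
}
```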

So in the solution above, try_for_each can be replaced with a direct call to try_fold, as follows:

pub fn detect_capital_use(word: String) -> bool {
    let mut word = word.chars();
    let first = word.next();
    if first.is_none() {
        return true;
    }
    let first = first.unwrap();

    if let Some(second) = word.next() {
        let res = word.try_fold(second, move |sd, x| {
            if sd.is_lowercase() && x.is_lowercase() {
                return Ok(sd);
            }

            if sd.is_uppercase() && x.is_uppercase() {
                return Ok(sd);
            }

            Err(())
        });

        if res.is_err() {
            return false;
        }
        if first.is_uppercase() {
            return true;
        }

        if first.is_lowercase() && second.is_lowercase() {
            return true;
        }

        false
    } else {
        true
    }
}
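
The single-pass idea can also be expressed with all alone. The following is only a sketch of one possible variant, not the approach used above: the name detect_capital_use_compact and the &str parameter are assumptions made for this example, and it assumes, as the problem states, that the word consists of letters only.

```rust
/// Hypothetical compact variant of the single-pass check.
/// Assumes `word` contains only letters, as the problem guarantees.
pub fn detect_capital_use_compact(word: &str) -> bool {
    let mut chars = word.chars();
    // Words with fewer than two letters are always valid.
    let (first, second) = match (chars.next(), chars.next()) {
        (Some(f), Some(s)) => (f, s),
        _ => return true,
    };
    // A lowercase first letter followed by an uppercase one is never valid.
    if first.is_lowercase() && second.is_uppercase() {
        return false;
    }
    // Every remaining letter must have the same case as the second letter.
    chars.all(|c| c.is_uppercase() == second.is_uppercase())
}

fn main() {
    for word in ["USA", "leetcode", "Google", "", "a", "A"] {
        assert!(detect_capital_use_compact(word));
    }
    for word in ["FlaG", "gOOGLE", "aB", "ABc"] {
        assert!(!detect_capital_use_compact(word));
    }
}
```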

Summary
As this article shows, even a simple algorithm problem gives us the chance to use Rust's iterators and several related methods: chars, all, try_fold, enumerate, try_for_each, and char_indices. It also shows how flexible and powerful the Rust language is.

Topics: Database Rust