How to implement a user-defined iterator

Posted by gl_itch on Thu, 28 Oct 2021 21:44:01 +0200


hedzr Released today at 19:15

Implement your own iterator

Using std::iterator

Before C++17, the recommended way to implement a custom iterator was to derive from std::iterator.

Basic definition of std::iterator

std::iterator is defined as follows:

template<
    class Category,
    class T,
    class Distance = std::ptrdiff_t,
    class Pointer = T*,
    class Reference = T&
> struct iterator;

Here T is the value type, that is, the type you get by dereferencing the iterator; it is not the container type. Category is the so-called iterator tag, which must be specified first (see the reference on iterator tags). The categories are:

  1. input_iterator_tag: input iterator
  2. output_iterator_tag: output iterator
  3. forward_iterator_tag: forward iterator
  4. bidirectional_iterator_tag: bidirectional iterator
  5. random_access_iterator_tag: random access iterator
  6. contiguous_iterator_tag: contiguous iterator (added in C++20)

These tags can seem puzzling: their purpose looks obvious at first glance, yet they are hard to understand and to choose between.

Iterator tags

The following is a rough introduction to their characteristics and the entities associated with them, to help you understand them.

These tags are empty types. The standard library associates each one with a named set of iterator requirements (input iterator and so on) and uses the tag to select, through overloading or template specialization, a dedicated implementation of functions such as distance() and advance(), so that each kind of iterator gets the most efficient algorithm available to it.
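
The dispatch described above can be sketched as follows. This is only an illustration under assumptions: the names advance_impl and my_advance are hypothetical, and the code is not the standard library's actual implementation. Only std::iterator_traits and the tag types are real.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace tag_dispatch_demo {
  // Fallback for input/forward/bidirectional iterators: step n times (n >= 0 assumed).
  template<class It>
  void advance_impl(It &it, long n, std::input_iterator_tag) {
    while (n-- > 0) ++it;
  }

  // Random access iterators can jump in a single step.
  template<class It>
  void advance_impl(It &it, long n, std::random_access_iterator_tag) {
    it += n;
  }

  // Hypothetical helper: selects an overload from the iterator's category tag.
  template<class It>
  void my_advance(It &it, long n) {
    using category = typename std::iterator_traits<It>::iterator_category;
    advance_impl(it, n, category{});
  }

  inline int third_of_list() {
    std::list<int> l{1, 2, 3, 4};
    auto it = l.begin();
    my_advance(it, 2); // bidirectional tag converts to its base, input_iterator_tag
    return *it;
  }

  inline int third_of_vector() {
    std::vector<int> v{10, 20, 30, 40};
    auto it = v.begin();
    my_advance(it, 2); // random access overload: it += 2
    return *it;
  }
}
```

Because the tags form an inheritance chain, an iterator with a stronger tag falls back to the weaker overload automatically when no better match exists.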

input_iterator_tag

input_iterator_tag marks an iterator that can wrap the output of a function so that others can consume it as an input stream. It can therefore only be incremented (+1). You cannot add n to it directly; you can only simulate that by incrementing it n times. An input iterator cannot be decremented (-1), because an input stream has no such capability. Its value (*it) is read-only; you cannot assign to it.

By contrast, the value of an output iterator can be written, and the value of a mutable forward iterator can be both read and written. Writing through an iterator looks like this:

std::list<int> l{1,2,3};
auto it = l.begin();
++it;
(*it) = 5; // <- set value back into the container pointed by iterator

An input iterator presents its source as an input stream, so you can use an input iterator to receive an incoming stream of data.
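
As a small illustration of the single-pass, read-only behavior, here is a sketch that uses std::istream_iterator from the standard library. The function name sum_ints is made up for this example.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

namespace input_iter_demo {
  // Sum whitespace-separated integers by reading the stream through an input iterator.
  inline int sum_ints(const std::string &text) {
    std::istringstream in(text);
    std::istream_iterator<int> it(in), end; // a default-constructed one marks end of stream
    int sum = 0;
    for (; it != end; ++it) sum += *it;     // single pass: read a value, then step forward
    return sum;
  }
}
```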

output_iterator_tag

output_iterator_tag is rarely used directly by users. It is usually seen together with back_insert_iterator, front_insert_iterator, insert_iterator, ostream_iterator and the like.

An output iterator can be incremented but not decremented, and its value cannot be read. You can write or place new values into the destination it refers to, and that's all.

If you need to present something in an output-stream style, this is the one to choose.
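
A short sketch of the two typical uses mentioned above, std::back_inserter and std::ostream_iterator. The helper names copy_all and join are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

namespace output_iter_demo {
  // Copy src into a new vector through std::back_inserter (an output iterator).
  inline std::vector<int> copy_all(const std::vector<int> &src) {
    std::vector<int> dst;
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
  }

  // Render the values as text through std::ostream_iterator.
  inline std::string join(const std::vector<int> &src) {
    std::ostringstream out;
    std::copy(src.begin(), src.end(), std::ostream_iterator<int>(out, ","));
    return out.str(); // every value is followed by the delimiter
  }
}
```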

forward_iterator_tag

forward_iterator_tag represents a forward iterator, so it can only be incremented and cannot step back. The tag derives from input_iterator_tag, so a forward iterator has all the basic capabilities of an input iterator, with enhancements such as allowing values to be set.

In terms of capability, a forward iterator supports reading and setting values and walking forward by increment. Walking backward is not supported (it would have to be simulated, which is inefficient), and +n has to be simulated with a loop, which is also inefficient. However, if this is all your container needs to expose, forward_iterator_tag is the best choice.

In principle, a container that supports forward iterators must implement at least begin/end.

bidirectional_iterator_tag

bidirectional_iterator_tag marks a bidirectional iterator, which can walk in both directions: both it++ and it-- are allowed, as with std::list. As with forward_iterator_tag, a bidirectional iterator cannot do +n (or -n) directly, so advance() has to loop n times, stepping by one each time (that is, it simulates the jump by incrementing or decrementing n times).

In principle, a container that supports bidirectional iterators should implement both begin/end and rbegin/rend.
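
A brief sketch of what bidirectional walking looks like with std::list. The function names reversed and last_of are made up for illustration.

```cpp
#include <cassert>
#include <list>
#include <vector>

namespace bidi_demo {
  // Walk a std::list backwards using rbegin()/rend().
  inline std::vector<int> reversed(const std::list<int> &l) {
    std::vector<int> out;
    for (auto it = l.rbegin(); it != l.rend(); ++it) out.push_back(*it);
    return out;
  }

  // Step back from end() with operator--; the list must not be empty.
  inline int last_of(const std::list<int> &l) {
    auto it = l.end();
    --it;
    return *it;
  }
}
```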

random_access_iterator_tag

random_access_iterator_tag represents a random access iterator, which supports reading and setting values, incrementing and decrementing, and +n/-n.

Because a random access iterator supports efficient +n/-n, it also allows efficient direct positioning. A container with this kind of iterator usually supports subscript access through operator[] as well, just like std::vector.
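
The extra operations can be seen on a std::vector iterator. This is a small sketch; the function name demo is arbitrary.

```cpp
#include <cassert>
#include <vector>

namespace random_access_demo {
  inline void demo() {
    std::vector<int> v{10, 20, 30, 40, 50};
    auto it = v.begin() + 3;   // jump forward by 3 in one step
    assert(*it == 40);
    it -= 2;                   // jump back by 2
    assert(*it == 20);
    assert(it[2] == 40);       // subscript relative to the iterator
    assert(v.end() - it == 4); // distance by plain subtraction
    assert(it < v.end());      // ordering comparison
  }
}
```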

contiguous_iterator_tag

contiguous_iterator_tag was added in C++20 (C++17 only introduced the contiguous iterator requirement, without a tag). Compiler and library support is still uneven, so it is not covered in detail here. When implementing your own iterator, you can ignore it for now.

Implementation of custom iterator

A custom iterator needs to select an iterator tag, that is, the set of capabilities the iterator supports. Here is an example:

#include <iostream>
#include <iterator>

namespace customized_iterators {
  template<long FROM, long TO>
  class Range {
    public:
    // member typedefs provided through inheriting from std::iterator
    class iterator : public std::iterator<std::forward_iterator_tag, // iterator_category
    long,                      // value_type
    long,                      // difference_type
    const long *,              // pointer
    const long &               // reference
      > {
      long num = FROM;

      public:
      iterator(long _num = 0)
        : num(_num) {}
      iterator &operator++() {
        num = TO >= FROM ? num + 1 : num - 1;
        return *this;
      }
      iterator operator++(int) {
        iterator ret_val = *this;
        ++(*this);
        return ret_val;
      }
      bool operator==(iterator other) const { return num == other.num; }
      bool operator!=(iterator other) const { return !(*this == other); }
      long operator*() const { return num; }
    };
    iterator begin() { return FROM; }
    iterator end() { return TO >= FROM ? TO + 1 : TO - 1; }
  };

  void test_range() {
    Range<5, 13> r;
    for (auto v : r) std::cout << v << ',';
    std::cout << '\n';
  }

}

This example is adapted, with slight modifications, from the std::iterator page on cppreference; credit goes to its original author.

Overloading the increment and decrement operators

This gets its own section because there are too many bad tutorials out there.

Increment and decrement operator overloads each come in two forms, prefix and postfix. The prefix form returns a reference; the postfix form returns a new copy:

struct X {
  // Prefix increment
  X& operator++() {
    // The actual increment is done here
    return *this; // Return the new value by reference
  }

  // Postfix increment
  X operator++(int) {
    X old = *this; // Copy the old value
    operator++();  // Delegate to prefix increment
    return old;    // Return the old value
  }

  // Prefix decrement
  X& operator--() {
    // The actual decrement is done here
    return *this; // Return the new value by reference
  }

  // Postfix decrement
  X operator--(int) {
    X old = *this; // Copy the old value
    operator--();  // Delegate to prefix decrement
    return old;    // Return the old value
  }
};

You can also consult the corresponding pages on cppreference rather than those tutorials; I could hardly find two that get it right.

The correct approach is to implement the prefix overload first and then implement the postfix overload on top of it:

struct incr {
  int val{};
  incr &operator++() {
    val++;
    return *this;
  }
  incr operator++(int) {
    incr ret_val = *this;
    ++(*this);
    return ret_val;
  }
};

If necessary, you may also need to implement operator= or the copy constructor X(X const &o). For simple, trivial structs they can be omitted, because the compiler generates them. If you are not sure whether the compiler-generated copy is adequate, consider inspecting the generated assembly, or simply implement operator= and the copy constructor X(X const &o) explicitly.
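
As an alternative to reading assembly, a compile-time check can answer the same question. The following is a sketch of that idea and is a suggestion of this edit rather than something the original text prescribes; the namespace copy_check_demo is hypothetical, and the struct mirrors incr from above.

```cpp
#include <cassert>
#include <type_traits>

namespace copy_check_demo {
  struct incr {
    int val{};
    incr &operator++() { ++val; return *this; }
    incr operator++(int) { incr old = *this; ++(*this); return old; }
  };

  // No user-written copy operations, so the compiler-generated ones are trivial.
  static_assert(std::is_trivially_copyable<incr>::value, "incr can be copied bit by bit");
  static_assert(std::is_copy_assignable<incr>::value, "operator= is generated implicitly");

  inline void demo() {
    incr x;
    incr old = x++; // postfix relies on the implicit copy constructor
    assert(old.val == 0);
    assert(x.val == 1);
  }
}
```

If one of the static_assert lines fails for your own type, that is the signal to write the copy operations by hand.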

Since C++17

However, std::iterator has been deprecated since C++17.

If you are curious about the background, you can read the discussion here.

In most cases you can still derive from std::iterator to save some typing, although compilers may emit a deprecation warning. Note that only the std::iterator helper base class is deprecated; the iterator tags and categories themselves remain part of the standard library.

Fully handwritten iterator

Therefore, from C++17 onward, a custom iterator has to be written by hand for the time being.

#include <iostream>
#include <iterator>

namespace customized_iterators {
  namespace manually {
    template<long FROM, long TO>
    class Range {
      public:
      class iterator {
        long num = FROM;

        public:
        iterator(long _num = 0)
          : num(_num) {}
        iterator &operator++() {
          num = TO >= FROM ? num + 1 : num - 1;
          return *this;
        }
        iterator operator++(int) {
          iterator ret_val = *this;
          ++(*this);
          return ret_val;
        }
        bool operator==(iterator other) const { return num == other.num; }
        bool operator!=(iterator other) const { return !(*this == other); }
        long operator*() const { return num; }
        // iterator traits
        using difference_type = long;
        using value_type = long;
        using pointer = const long *;
        using reference = const long &;
        using iterator_category = std::forward_iterator_tag;
      };
      iterator begin() { return FROM; }
      iterator end() { return TO >= FROM ? TO + 1 : TO - 1; }
    };
  } // namespace manually

  void test_range() {
    manually::Range<5, 13> r;
    for (auto v : r) std::cout << v << ',';
    std::cout << '\n';
  }

}

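The iterator traits block in the example is not required for the range-based for loop, so you may leave it out. It is needed, however, if the iterator is to be used with std::iterator_traits or with standard algorithms such as std::distance.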
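
To show what the member typedefs are used for, here is a sketch with a deliberately tiny iterator. The class int_iter and the function count_between are hypothetical names; std::iterator_traits and std::distance are the real library facilities that read the typedefs.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

namespace traits_demo {
  // Hypothetical minimal iterator over consecutive integers.
  class int_iter {
    long num;

   public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = long;
    using difference_type = std::ptrdiff_t;
    using pointer = const long *;
    using reference = const long &;

    explicit int_iter(long n = 0) : num(n) {}
    reference operator*() const { return num; }
    int_iter &operator++() { ++num; return *this; }
    int_iter operator++(int) { int_iter old = *this; ++num; return old; }
    bool operator==(const int_iter &o) const { return num == o.num; }
    bool operator!=(const int_iter &o) const { return num != o.num; }
  };

  // std::iterator_traits picks up the nested typedefs declared above.
  static_assert(std::is_same<std::iterator_traits<int_iter>::value_type, long>::value,
                "value_type comes from the member typedef");

  inline long count_between(long first, long last) {
    // std::distance needs iterator_category and difference_type in order to compile.
    return static_cast<long>(std::distance(int_iter(first), int_iter(last)));
  }
}
```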

Things to take care of

Considerations for a fully handwritten iterator include:

  1. begin() and end()
  2. An iterator class, usually nested in the container (though it does not have to be), which implements at least:

    1. The increment operator, for walking forward
    2. The decrement operator, if the iterator is bidirectional_iterator_tag or random_access_iterator_tag
    3. operator*, for obtaining the iterator's value
    4. operator!=, for detecting the end of the iteration range. You can also overload operator== explicitly; since C++20 the compiler derives != from a user-provided ==, but not the other way around.

If your class supports iteration in this way, you can use the range-based for loop:

your_collection coll;
for(auto &v: coll) {
  std::cout << v << '\n';
}

For how the range-based for loop is expanded, see here.
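
Roughly speaking, the loop is rewritten in terms of begin(), end(), operator!=, operator++ and operator*. The sketch below spells that out by hand; the variable names range, first and last are illustrative, since the compiler uses unnamed internal variables.

```cpp
#include <cassert>
#include <vector>

namespace range_for_demo {
  inline int sum_expanded(std::vector<int> &coll) {
    int sum = 0;
    // Approximately what `for (auto &v : coll) sum += v;` turns into:
    {
      auto &&range = coll;
      auto first = range.begin();
      auto last = range.end();
      for (; first != last; ++first) {
        auto &v = *first;
        sum += v;
      }
    }
    return sum;
  }
}
```

This is why the checklist above is sufficient: the loop never touches anything beyond those five operations.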

After C++20

With C++20, iterators changed dramatically. However, because real-world adoption is still at an early stage, this is not discussed in this article.

Other related

Besides iterator, there is const_iterator

For code hygiene and safety, getters are usually provided in pairs, one writable and one read-only:

struct incr {
  int &val(){ return _val; }
  int const &val() const { return _val; }
  private:
  int _val{};
};

In the same way, begin() and end() should come in at least two versions, const and non-const. In general, you can provide the multiple versions by delegating to a separate implementation:

struct XXX {
  
  // ... struct leveled_iter_data {
  //    static leveled_iter_data begin(NodePtr root_) {...}
  //    static leveled_iter_data end(NodePtr root_) {...}
  // }
  
  using iterator = leveled_iter_data;
  using const_iterator = const iterator;
  iterator begin() { return iterator::begin(this); }
  const_iterator begin() const { return const_iterator::begin(this); }
  iterator end() { return iterator::end(this); }
  const_iterator end() const { return const_iterator::end(this); }

};

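This is the no-brainer way. Read/write safety is enforced inside XXX: the owner, naturally, knows what should be exposed and what needs to be restricted for the time being. Note that const iterator only makes the iterator object itself const; if read-only access to the elements must be guaranteed, const_iterator should be a separate type whose operator* returns a const reference.

Besides iterator and const_iterator, you may also consider implementing rbegin/rend, cbegin/cend and so on.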
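
For comparison, here is a sketch of a const_iterator that is a distinct type rather than a const-qualified alias. It is one common way of doing this, not the approach used in XXX above; small_buf, iter, sum and bump_and_sum are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <type_traits>

namespace const_iter_demo {
  // Hypothetical fixed-size container used only for illustration.
  class small_buf {
    int data_[3] = {1, 2, 3};

    // One template serves both flavors; IsConst selects the element type.
    template<bool IsConst>
    class iter {
      using elem = typename std::conditional<IsConst, const int, int>::type;
      elem *p_;

     public:
      explicit iter(elem *p) : p_(p) {}
      elem &operator*() const { return *p_; }
      iter &operator++() { ++p_; return *this; }
      bool operator==(const iter &o) const { return p_ == o.p_; }
      bool operator!=(const iter &o) const { return p_ != o.p_; }
    };

   public:
    using iterator = iter<false>;
    using const_iterator = iter<true>;

    iterator begin() { return iterator(data_); }
    iterator end() { return iterator(data_ + 3); }
    const_iterator begin() const { return const_iterator(data_); }
    const_iterator end() const { return const_iterator(data_ + 3); }
    const_iterator cbegin() const { return begin(); }
    const_iterator cend() const { return end(); }
  };

  inline int sum(const small_buf &b) {
    int s = 0;
    for (int v : b) s += v; // picks the const overloads
    return s;
  }

  inline int bump_and_sum() {
    small_buf b;
    for (int &v : b) v += 10; // picks the non-const overloads
    return sum(b);
  }
}
```

With this layout, writing through a const_iterator fails to compile, because its operator* yields const int &.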

Note: using iterators

When using iterators, follow the rule of obtaining an iterator right before you use it: do not hold on to one across operations that modify the container.

void test_iter_invalidate() {
  std::vector<int> vi{3, 7};
  auto it = vi.begin();
  it = vi.insert(it, 11);
  vi.insert(it, 5000, 23);
  vi.insert(it, 1, 31);                // undefined behavior: 'it' is invalid, may crash here!
  std::cout << (*it) << '\n';
  return;
}

In most environments, the statement vi.insert(it, 5000, 23); will very likely force the vector to reallocate its internal array. After that statement, the pointer held inside it is meaningless, because it still points into the old buffer. Using it on the next line is therefore undefined behavior that reads and writes the wrong memory. This often ends in a fatal SIGSEGV, because the obsolete buffer may already have been unmapped. If you get the SIGSEGV, consider yourself lucky. If the obsolete buffer is still accessible, the statement runs without reporting any error, which is far worse.
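
One way to stay safe is to refresh the iterator from the return value of every insert(), which for std::vector points at the first inserted element. A minimal sketch; the function name safe_inserts is made up.

```cpp
#include <cassert>
#include <vector>

namespace invalidate_demo {
  inline int safe_inserts() {
    std::vector<int> vi{3, 7};
    auto it = vi.begin();
    it = vi.insert(it, 11);       // refresh: insert returns a valid iterator
    it = vi.insert(it, 5000, 23); // refresh again after a likely reallocation
    it = vi.insert(it, 1, 31);    // safe: 'it' was re-obtained on the previous line
    assert(vi.size() == 5004);
    return *it;                   // the element inserted last
  }
}
```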

Searching and deleting with iterators

The standard library uses an idiom called erase-remove to actually delete elements. Take std::vector as an example: std::remove_if() moves the elements to keep toward the front of the range and returns an iterator iter to the new logical end. The elements from iter onward are leftovers with unspecified values, and they have not yet been removed from the container. To remove them, you have to erase them explicitly with v.erase(iter, v.end()).

So elements are deleted as follows:

bool IsOdd(int i) { return i & 1; }

std::vector<int> v = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
v.erase(std::remove_if(v.begin(), v.end(), IsOdd), v.end());

std::list<int> l = { 1,100,2,3,10,1,11,-1,12 };
l.remove_if(IsOdd); // the member function erases the matching nodes itself

std::vector has no remove_if() member function, so searching and erasing on it requires std::remove_if together with erase(). std::list, being a linked list, can unlink the matching nodes directly, so its member function remove_if does the whole job and the code is a little more concise.

Since C++20, erase-remove can be simplified to the non-member functions std::erase() and std::erase_if(), for example std::erase, std::erase_if (std::vector).
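
A sketch of the C++20 form next to the classic idiom. The feature-test macro __cpp_lib_erase_if is used here so that the same code also builds with older standards; the names is_odd and evens_only are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace erase_demo {
  inline bool is_odd(int i) { return i & 1; }

  inline std::vector<int> evens_only(std::vector<int> v) {
#if defined(__cpp_lib_erase_if)
    std::erase_if(v, is_odd);                                     // C++20 and later
#else
    v.erase(std::remove_if(v.begin(), v.end(), is_odd), v.end()); // erase-remove idiom
#endif
    return v;
  }
}
```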

Postscript

This article on customizing your own STL-like iterator offers some guidelines based on personal understanding and best practice, but there is still more to be said.

Next time, I may introduce a tree_t and its iterator implementation, which may have more reference value.


Topics: STL iterator