Implement your own iterators

Using STD: : iterator

Prior to C++17, it was recommended to implement custom iterators derived from STD ::iterator.

STD :: Basic definition of iterator

Std:: Iterator has this definition:

template<
    class Category.class T.class distance = std::ptrdiff_t.class Pointer = T*,
    class Reference = T&
> struct iterator;
Copy the code

Where T is your container class type, needless to mention. Category is the so-called iterator tag that must be specified first, see here. Categories can be:

  1. Input_iterator_tag: input iterator
  2. Output_iterator_tag: output iterator
  3. Forward_iterator_tag: forward iterator
  4. Bidirectional_iterator_tag: bidirectional iterator
  5. Random_access_iterator_tag: random access iterator
  6. Contiguous_iterator_tag: continuous iterator

The labels seemed rather baffling, as if I knew what they meant, but in reality they were hard to read and hard to pick out.

Iterator tag

Here’s a rough description of their features and their associated entities to help you understand them.

These tags are actually bound to entity classes of the same name such as Input_iterator and so on, implementing proprietary distance() and advance() through template specialization techniques to achieve specific iterative optimization.

input_iterator_tag

Input_iterator_tag can wrap the output of a function — to be used as the input stream of another person. So it is incrementable only (+1 only), you can’t increment it, you can only simulate the effect by incrementing it n times. Input_iterator cannot decrement (-1) because the input stream does not have such a property. Its iterator value (*it) is read only and you cannot set it.

But the iterator values for output_iterator_tag, forward_iterator_tag, are read and written. Read/write iterator values are:

std::list<int> l{1.2.3};
auto it = l.begin(a); ++it; (*it) =5; // <- set value back into the container pointed by iterator
Copy the code

The input_iterator renders the container as an input stream. You can receive the input data stream through the input_iterator.

output_iterator_tag

Output_iterator_tag is rarely used directly by users, It is usually used in conjunction with back_insert_iterator/ front_insert_iterator/ insert_iterator and ostream_iterator.

Output_iterator has no ++/– capability. You can write/put new values into the container pointed to by the output_iterator, and that’s it.

If you have an output stream style rendering requirement, choose it.

forward_iterator_tag

Forward_iterator_tag represents a forward iterator, so it can only be incremented, not retracted. It inherits all the basic capabilities of input_iterator_tag, with some enhancements, such as allowing values to be set.

Input_iterator is capable of reading/setting values and incremental walks, but not decrement walks (requiring emulation, inefficient). +n is inefficient because it requires loop emulation, but if your container has only such explicit requirements, then forward_iterator_tag is the best choice.

In theory, iterators that support forward_iterator_tag must implement at least begin/end.

bidirectional_iterator_tag

The associated entity of biDirectional_iterator_tag is bidirectional_iterator, which can be either IT ++ or IT –, such as STD ::list. Like forward_iterator_tag, bidirectional_iterator_tag cannot be +n (and -n) directly, so +n requires a specialized advance function to loop n times, +1 each time (that is, simulated by a loop of n increments or decrements).

In theory, iterators that support biDirectional_iterator_tag must implement both BEGIN /end and RBEGIN /rend.

random_access_iterator_tag

Random_access_iterator_tag Is a random-access iterator that reads/sets values, increments and decrement values, and supports +n/-n.

Since random_access_iterator supports efficient +n/-n, which means that it allows efficient direct positioning, the container to which the iterator belongs usually also incidentally supports operator [] subscript access, as in STD ::vector.

contiguous_iterator_tag

Access to a contiguous_iterator_tag has been introduced in C++17, but there are some problems with compiler support, so we cannot introduce it in detail at present, and its existence is unnecessary for implementation.

Implementation of custom iterators

A custom iterator needs to select an iterator tag, that is, the set of supported capabilities for the iterator. Here is an example:

namespace customized_iterators {
  template<long FROM, long TO>
  class Range {
    public:
    // member typedefs provided through inheriting from std::iterator
    class iterator : public std::iterator<std::forward_iterator_tag, // iterator_category
    long.// value_type
    long.// difference_type
    const long *,              // pointer
    const long &               // reference
      > {
      long num = FROM;

      public:
      iterator(long _num = 0)
        : num(_num) {}
      iterator &operator++() {
        num = TO >= FROM ? num + 1 : num - 1;
        return *this;
      }
      iterator operator+ + (int) {
        iterator ret_val = *this; + + (*this);
        return ret_val;
      }
      bool operator==(iterator other) const { return num == other.num; }
      bool operator! =(iterator other)const { return! (*this == other); }
      long operator* () {returnnum; }};iterator begin(a) { return FROM; }
    iterator end(a) { return TO >= FROM ? TO + 1 : TO - 1; }};void test_range(a) {
    Range<5.13> r;
    for (auto v : r) std::cout << v << ', ';
    std::cout << '\n'; }}Copy the code

The prototype for this example comes from STD :: Iterator and its author on CPpreference, with minor modifications.

Increment and decrement operator overloading

Have a separate section, because there are too many spam tutorials.

Operator overloading can be either prefixed or postfix. Prefix returns a reference, or postfix returns a new copy:

struct X {
  // The prefix is added automatically
  X& operator+ + () {// The actual increment takes place here
    return *this; // Return the new value by reference
  }

  // The suffix is added
  X operator+ + (int) {
    X old = *this; // Copy the old value
    operator+ + ();// The prefix is added automatically
    return old;    // Return the old value
  }

  // The prefix is decrement
  X& operator--() {
    // The actual decrement takes place here
    return *this; // Return the new value by reference
  }

  // the suffix is subtracted
  X operator--(int) {
    X old = *this; // Copy the old value
    operator--();  // The prefix is decrement
    return old;    // Return the old value}};Copy the code

Or go to the documentation for CPPreference and the documentation, don’t go to the tutorials, you can’t find two of them.

The correct encoding is to implement a prefix overload and then implement a suffix overload based on it:

struct incr {
  int val{};
  incr &operator++() {
    val++;
    return *this;
  }
  incr operator+ + (int d) {
    incr ret_val = *this; + + (*this);
    returnret_val; }};Copy the code

If necessary, you may need to implement the operator= or X(X const &o) copy constructor. But for simple trivial structs it can be omitted (if you are not sure whether automatic memory copying is provided, consider looking at assembly code, or simply implementing the operator= or X(X const& o) copy constructor explicitly).

C + + 17 up

But STD ::iterator has been deprecated since C++17.

If you really care about gossip, you can check out the discussion here.

In most cases, you can still use STD :: Iterator to simplify code writing, but that feature and the earlier concepts of iterator tags, categories, and so on are outdated.

Fully handwritten iterator

So in the new era starting with C++17, custom iterators were, in principle, only handwritten for the time being.

namespace customized_iterators {
  namespace manually {
    template<long FROM, long TO>
    class Range {
      public:
      class iterator {
        long num = FROM;

        public:
        iterator(long _num = 0)
          : num(_num) {}
        iterator &operator++() {
          num = TO >= FROM ? num + 1 : num - 1;
          return *this;
        }
        iterator operator+ + (int) {
          iterator ret_val = *this; + + (*this);
          return ret_val;
        }
        bool operator==(iterator other) const { return num == other.num; }
        bool operator! =(iterator other)const { return! (*this == other); }
        long operator* () {return num; }
        // iterator traits
        using difference_type = long;
        using value_type = long;
        using pointer = const long *;
        using reference = const long &;
        using iterator_category = std::forward_iterator_tag;
      };
      iterator begin(a) { return FROM; }
      iterator end(a) { return TO >= FROM ? TO + 1 : TO - 1; }}; }// namespace manually

  void test_range(a) {
    manually::Range<5.13> r;
    for (auto v : r) std::cout << v << ', ';
    std::cout << '\n'; }}Copy the code

The Iterator traits sections in the example are not required, and you don’t need to support them at all.

Things to take care of

Considerations for fully handwritten iterators include:

  1. Begin () and end ()
  2. Iterator embedded classes (not necessarily limited to embedding) implement at least:
    1. Increment operator overload for walking
    2. Decrement operator overload if bidirectional_iterator_tag or random walk (random_access_iterator_tag)
    3. operator*The operator is overloaded to allow the iterator to evaluate
    4. operator! =Operator overload to calculate iteration range; You can also explicitly overload if necessaryoperator==(By default, the compiler automatically switches from! =Generate a matching substitute on the operator)

If your code supports iterative ranges, then you can use a for range loop:

your_collection coll;
for(auto &v: coll) {
  std::cout << v << '\n';
}
Copy the code

For an expansion of the for scope loop, see here.

C + + 20 after

Iterators have changed dramatically since C++20. But because its engineering implementation is still very early, so this paper will not discuss.

Other related

In addition to iterator there is const_iterator

For code specification and security, getters usually provide two, writable and unwritable, at a time:

struct incr {
  int &val(a){ return _val; }
  int const &val(a) const { return _val; }
  private:
  int _val{};
}
Copy the code

By the same token, begin() and end() of iterators should provide at least two const and nonconst versions. In general you can help provide multiple versions with a separate implementation:

struct XXX {
  
  // ... struct leveled_iter_data {
  // static leveled_iter_data begin(NodePtr root_) {... }
  //. static leveled_iter_data end(NodePtr root_) {... }
  // }
  
  using iterator = leveled_iter_data;
  using const_iterator = const iterator;
  iterator begin(a) { return iterator::begin(this); }
  const_iterator begin(a) const { return const_iterator::begin(this); }
  iterator end(a) { return iterator::end(this); }
  const_iterator end(a) const { return const_iterator::end(this); }}Copy the code

This is a no-brainer in that read and write security is constrained within XXX: the owner can of course understand what should be exposed and what needs to be temporarily constrained to be exposed.

Besides Iterator and const_iterator, Rbegin /rend, Cbegin /cend, etc. can also be considered to be implemented.

Note: Use of iterators

The use of iterators must be guided by the rule of “take as you go”.

void test_iter_invalidate(a) {
  std::vector<int> vi{3.7};
  auto it = vi.begin(a); it = vi.insert(it, 11);
  vi.insert(it, 5000.23);
  vi.insert(it, 1.31);				// crach here!
  std::cout << (*it) << '\n';
  return;
}
Copy the code

In most OS environments, vi. Insert (it, 5000, 23); The statement has a high probability of causing the vector to have to reallocate its internal array space. Therefore, after the statement is executed, the internal pointer held by IT is meaningless (it still points to a location in the old buffer), so continuing to use it in the next line will result in incorrect pointing and writing. This error often results in a SIGSEGV fatal exception, since there is a high probability that the obsolete buffer has been scheduled to be in a page-missing state. If a SIGSEGV signal is generated, you may be lucky, but if the stale buffer is still valid, the statement can be executed without any errors.

Iterator search and delete

The stdlib container uses an idiom called erase and remove to actually remove an element. Taking STD ::list as an example, remove_if() can find qualifying elements from the list, gather them to the end of the list, and then return iter, the location of the first element in the set. However, these elements have not been removed from the list. If you need to remove them, you need to remove them explicitly with list.erase(iter, list.end()).

So deleting elements looks like this:

bool IsOdd(int i) { return i & 1; }

std::vector<int> v = {0.1.2.3.4.5.6.7.8.9};
v.erase(std::remove_if(v.begin(), v.end(), IsOdd), v.end());

std::list<int> l = { 1.100.2.3.10.1.11.- 1.12 };
l.erase(l.remove_if(IsOdd), l.end());
Copy the code

Since STD ::vector does not aggregate elements to the end of the list like STD ::list does, it does not have a remove_if() member function, so search and erase on it requires STD ::remove_if. STD ::list can be done directly using the remove_if member function, and the code is a little cleaner.

Since C++20, erase and remove_if can be simplified to STD ::erase_if() or erase_if() member functions, e.g. STD ::erase, STD ::erase_if(STD ::vector).

Afterword.

This About Customizing Your Own STl-like Iterator provides some guidelines for personal understanding and best practices, but leaves a little more to be said.

It might be more useful to consider introducing a tree_t and its iterator implementation next time.

Refs

  • Scope-based for loops (C++11) – cppreference.com

  • std::iterator – cppreference.com

  • std::input_iterator_tag, std::output_iterator_tag, std::forward_iterator_tag, std::bidirectional_iterator_tag, std::random_access_iterator_tag, std::contiguous_iterator_tag – cppreference.com

  • Increment/decrement operators – cppreference.com

  • operator overloading – cppreference.com

  • Traversing Trees with Iterators

  • c++ – How to implement an STL-style iterator and avoid common pitfalls? – Stack Overflow

  • c++ – Writing your own STL Container – Stack Overflow

  • STL & Generic Programming: Writing Your Own Iterators – Dr Dobb’s

    c++ – How to correctly implement custom iterators and const_iterators? – Stack Overflow

:end: