std::vector source code analysis
This article looks at the design of the STL from the source code. The implementation examined is libstdc++ (GCC 4.8.5).
We only care about the implementation of vector, and almost all of it lives in header files, so the following method yields a reasonably clean copy of the source:
// main.cpp
#include <vector>
int main() {
  std::vector<int> v;
  v.emplace_back(1);
}
g++ -E main.cpp -std=c++11 > vector.cpp
Open vector.cpp in VS Code and use the regular expression "#.*\n" to delete every line that starts with #, that is, the line markers emitted by the preprocessor. The result has all preprocessor directives already resolved, does not depend on any external header, and makes jumping between definitions painless.
allocator
An allocator must implement at least the following:
- allocate: allocate memory
- deallocate: release memory
The smallest unit an allocator hands out is one object, so it also needs to report the maximum number of objects it can allocate:
- max_size: maximum number of objects that can be allocated
These are the most basic functions of an allocator. On top of them, object construction and destruction are added, so that users of the allocator, such as the STL containers, do not have to deal with the memory details of constructing and destroying objects themselves.
- construct: construct an object; it has to be a template so that it works for arbitrary constructor arguments
- destroy: destroy an object
To sum up, the interface that allocator_traits expects from an allocator is:
- allocate: allocate memory
- deallocate: release memory
- construct: construct an object (a template, for generality)
- destroy: destroy an object
- max_size: maximum number of objects that can be allocated
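As a rough illustration of the list above, here is a minimal sketch of a user-defined allocator plugged into std::vector. The names CountingAllocator and allocations_for are hypothetical and exist only for this example. Since C++11, std::allocator_traits supplies defaults for construct and destroy, so the sketch only provides allocate, deallocate, and max_size.

```cpp
#include <cstddef>
#include <limits>
#include <new>
#include <vector>

// Hypothetical allocator for illustration; it counts calls to allocate().
template <typename T>
struct CountingAllocator {
    typedef T value_type;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    // allocate: raw, uninitialized storage for n objects of type T.
    T* allocate(std::size_t n) {
        if (n > max_size()) throw std::bad_alloc();
        ++allocations();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    // deallocate: give back storage obtained from allocate().
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // max_size: largest n that allocate() could possibly satisfy.
    std::size_t max_size() const {
        return std::numeric_limits<std::size_t>::max() / sizeof(T);
    }

    // Call counter shared by all CountingAllocator<T> of the same T.
    static int& allocations() {
        static int count = 0;
        return count;
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}

// Appends n ints through the custom allocator and reports how many
// times the container asked it for memory.
int allocations_for(int n) {
    CountingAllocator<int>::allocations() = 0;
    std::vector<int, CountingAllocator<int> > v;
    for (int i = 0; i < n; ++i) v.push_back(i);
    return CountingAllocator<int>::allocations();
}
```

Calling allocations_for(100) returns far fewer than 100 allocations, because the container grows its storage in large steps rather than one element at a time.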
std::allocator
The standard library's allocator is fairly simple. Allocation and deallocation go through ::operator new and ::operator delete:
pointer allocate(size_type __n, const void * = 0) {
  if (__n > this->max_size())
    std::__throw_bad_alloc();
  return static_cast<_Tp *>(::operator new(__n * sizeof(_Tp)));
}

void deallocate(pointer __p, size_type) { ::operator delete(__p); }
The maximum allocation is, in principle, the whole (virtual) address space of the process:
// sizeof(size_t) = address width of the process
size_type max_size() const throw() { return size_t(-1) / sizeof(_Tp); }
Objects are constructed with placement new and destroyed by calling the destructor explicitly:
void construct(pointer __p, const _Tp &__val) { ::new ((void *)__p) _Tp(__val); }

void destroy(pointer __p) { __p->~_Tp(); }
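To make the separation between memory and object lifetime concrete, the following sketch performs the four steps (allocate, construct, destroy, deallocate) by hand for a single object. Tracked and alive_while_constructed are hypothetical names introduced for this example only.

```cpp
#include <new>

// Hypothetical type that records how many instances are currently alive.
struct Tracked {
    static int& alive() { static int n = 0; return n; }
    int value;
    explicit Tracked(int v) : value(v) { ++alive(); }
    Tracked(const Tracked& other) : value(other.value) { ++alive(); }
    ~Tracked() { --alive(); }
};

// Runs allocate / construct / destroy / deallocate manually and returns
// the live-object count observed between construct and destroy.
int alive_while_constructed(int v) {
    void* raw = ::operator new(sizeof(Tracked));  // allocate: no object yet
    Tracked* p = ::new (raw) Tracked(v);          // construct: placement new
    int seen = Tracked::alive();
    p->~Tracked();                                // destroy: memory still owned
    ::operator delete(raw);                       // deallocate: memory returned
    return seen;
}
```

The function returns 1, and Tracked::alive() is back to 0 afterwards: the object's lifetime begins and ends independently of when the memory is obtained and released.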
std::vector
A general-purpose sequence container that supports user-defined allocators.
Basic implementation
libstdc++ defines vector as follows:
template <typename _Tp, typename _Alloc = std::allocator<_Tp>>
class vector : protected _Vector_base<_Tp, _Alloc> {};
There are two template parameters: the element type and the allocator type. The allocator type has a default and is optional.
vector inherits from _Vector_base with protected inheritance. This inheritance is not for the empty base optimization (EBO); it isolates storage management in a separate class.
_Vector_base in turn contains an impl object:
template <typename _Tp, typename _Alloc>
struct _Vector_base {
  typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template rebind<_Tp>::other _Tp_alloc_type;
  typedef typename __gnu_cxx::__alloc_traits<_Tp_alloc_type>::pointer pointer;

  struct _Vector_impl : public _Tp_alloc_type {
    pointer _M_start;
    pointer _M_finish;
    pointer _M_end_of_storage;
  };

public:
  _Vector_impl _M_impl;
};
_Vector_base provides vector's memory operations: allocating and releasing storage. _Vector_impl publicly inherits from _Tp_alloc_type (std::allocator<_Tp> by default), so in C++ terms a _Vector_impl is an allocator (is-a). Because the allocator is a base class rather than a member, an empty allocator adds nothing to the size of _Vector_impl.
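Deriving _Vector_impl from the allocator, rather than storing the allocator as a member, also affects object size. The sketch below compares the two layouts; ImplByInheritance and ImplByMember are hypothetical stand-ins, and the sizes mentioned are what mainstream compilers produce rather than something this article's source states.

```cpp
#include <cstddef>
#include <memory>

// Allocator as a base class, the way _Vector_impl is written.
struct ImplByInheritance : std::allocator<int> {
    int* start;
    int* finish;
    int* end_of_storage;
};

// Allocator as an ordinary data member, for comparison.
struct ImplByMember {
    std::allocator<int> alloc;
    int* start;
    int* finish;
    int* end_of_storage;
};

// Helper functions so the two sizes can be printed or compared.
std::size_t size_with_base() { return sizeof(ImplByInheritance); }
std::size_t size_with_member() { return sizeof(ImplByMember); }
```

With an empty allocator such as std::allocator<int>, size_with_base() is the size of three pointers, while size_with_member() is larger because even an empty member occupies at least one byte plus padding.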
_Vector_impl
The implementation of _Vector_impl is fairly simple. Three member variables form the underlying representation of vector:
- _M_start: start address of the element storage, the address returned by data()
- _M_finish: address one past the last element, which determines size()
- _M_end_of_storage: address one past the end of the allocated storage, which determines capacity()
struct _Vector_impl : public _Tp_alloc_type {
  pointer _M_start;
  pointer _M_finish;
  pointer _M_end_of_storage;

  _Vector_impl()
      : _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0) {}

  _Vector_impl(_Tp_alloc_type const &__a)
      : _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0) {}

  void _M_swap_data(_Vector_impl &__x) {
    std::swap(_M_start, __x._M_start);
    std::swap(_M_finish, __x._M_finish);
    std::swap(_M_end_of_storage, __x._M_end_of_storage);
  }
};
_Vector_base
_Vector_impl provides the representation of the underlying storage. _Vector_base initializes that representation, hides how the memory is obtained, and offers the layer above an interface for allocating and releasing it.
// Only one constructor is shown
_Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); }

void _M_create_storage(size_t __n) {
  this->_M_impl._M_start = this->_M_allocate(__n);
  this->_M_impl._M_finish = this->_M_impl._M_start;
  this->_M_impl._M_end_of_storage = this->_M_impl._M_start + __n;
}

// Free memory
~_Vector_base() {
  _M_deallocate(this->_M_impl._M_start,
                this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
}

pointer _M_allocate(size_t __n) { return __n != 0 ? _M_impl.allocate(__n) : 0; }

void _M_deallocate(pointer __p, size_t __n) {
  if (__p)
    _M_impl.deallocate(__p, __n);
}
Constructor
Take three constructors as examples. Note that the latter two copy every element, so they cost n copy constructions, where n is the resulting size().
L174: the constructor that takes only an allocator; like the default constructor, it does nothing beyond basic initialization
L209: constructs a container with the contents of the initializer_list
L214: constructs a container with the contents of the range [first, last)
174 explicit vector(const allocator_type &__a) : _Base(__a) {}
209 vector(initializer_list<value_type> __l,
210        const allocator_type &__a = allocator_type())
211     : _Base(__a) {
212   _M_range_initialize(__l.begin(), __l.end(), random_access_iterator_tag());
213 }
214 template <typename _InputIterator,
215           typename = std::_RequireInputIter<_InputIterator>>
216 vector(_InputIterator __first, _InputIterator __last,
217        const allocator_type &__a = allocator_type())
218     : _Base(__a) {
219   _M_initialize_dispatch(__first, __last, __false_type());
220 }
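The copy cost of the range constructor can be observed with an element type that counts its own copy constructions. This is only a sketch; Counted and copies_for_range are hypothetical names for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copy constructions.
struct Counted {
    static int& copies() { static int n = 0; return n; }
    Counted() {}
    Counted(const Counted&) { ++copies(); }
};

// Builds a vector from a range of n elements and returns how many copy
// constructions the range constructor performed.
int copies_for_range(std::size_t n) {
    std::vector<Counted> source(n);  // source elements; counter reset below
    Counted::copies() = 0;
    std::vector<Counted> v(source.begin(), source.end());
    return Counted::copies();
}
```

Because vector iterators are random access, the constructor knows the distance up front, allocates once, and copies each element exactly once, so copies_for_range(5) returns 5.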
Methods
To understand the underlying implementation of std::vector, we now look directly at the methods it provides. The most basic ones insert, remove, modify, and query elements, and report sizes.
Size-related
size() returns the number of elements; it is implemented as
size_type size() const {
  return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
}
capacity() returns the number of elements the allocated storage can hold; it is implemented as
size_type capacity() const {
  return size_type(this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
}
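The relationship between the three pointers and the two queries can be modelled in a few lines. ThreePointers and describe are hypothetical names used only in this sketch; the buffer is supplied by the caller, so no allocation is involved.

```cpp
#include <cstddef>

// Hypothetical stand-in for _Vector_impl: three pointers into one buffer.
struct ThreePointers {
    int* start;           // first element
    int* finish;          // one past the last constructed element
    int* end_of_storage;  // one past the end of the allocation

    std::size_t size() const { return std::size_t(finish - start); }
    std::size_t capacity() const { return std::size_t(end_of_storage - start); }
    std::size_t unused() const { return std::size_t(end_of_storage - finish); }
};

// Describes a buffer of cap ints of which the first used are in use.
ThreePointers describe(int* buffer, std::size_t used, std::size_t cap) {
    ThreePointers p = { buffer, buffer + used, buffer + cap };
    return p;
}
```

For an 8-element buffer with 3 elements in use, size() is 3, capacity() is 8, and unused() is 5; all three are plain pointer subtractions.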
push_back
push_back is the most frequently used method. Once its implementation is understood, the growth strategy of the whole vector becomes clear.
void push_back(const value_type &__x) {
  if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage) {
    _Alloc_traits::construct(this->_M_impl, this->_M_impl._M_finish, __x);
    ++this->_M_impl._M_finish;
  } else
    _M_emplace_back_aux(__x);
}

void push_back(value_type &&__x) { emplace_back(std::move(__x)); }

template <typename _Tp, typename _Alloc>
template <typename... _Args>
void vector<_Tp, _Alloc>::emplace_back(_Args &&...__args) {
  if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage) {
    _Alloc_traits::construct(this->_M_impl, this->_M_impl._M_finish,
                             std::forward<_Args>(__args)...);
    ++this->_M_impl._M_finish;
  } else
    _M_emplace_back_aux(std::forward<_Args>(__args)...);
}
The rvalue overload of push_back() forwards to emplace_back() (C++11), and the const-reference overload follows the same two branches:
- When size() < capacity(), the element is copy / move constructed directly in the slot after the last element, and _M_finish is advanced by 1
- When size() == capacity(), a new block of memory has to be allocated before the new element is inserted, and the existing elements have to be moved into it. The implementation follows, with exception handling and unnecessary branches omitted.
11 template <typename _Tp, typename _Alloc>
12 template <typename... _Args>
13 void vector<_Tp, _Alloc>::_M_emplace_back_aux(_Args &&...__args) {
14   const size_type __len =
15       _M_check_len(size_type(1), "vector::_M_emplace_back_aux");
16   pointer __new_start(this->_M_allocate(__len));
17   pointer __new_finish(__new_start);
19   _Alloc_traits::construct(this->_M_impl, __new_start + size(),
20                            std::forward<_Args>(__args)...);
21   __new_finish = 0;
22   __new_finish = std::__uninitialized_move_if_noexcept_a(
23       this->_M_impl._M_start, this->_M_impl._M_finish, __new_start,
24       _M_get_Tp_allocator());
25   ++__new_finish;
26   std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
27                 _M_get_Tp_allocator());
28   _M_deallocate(this->_M_impl._M_start,
29                 this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
30   this->_M_impl._M_start = __new_start;
31   this->_M_impl._M_finish = __new_finish;
32   this->_M_impl._M_end_of_storage = __new_start + __len;
33 }
_M_check_len checks whether there is enough room left to grow and returns the new capacity. It is implemented as follows:
size_type _M_check_len(size_type __n, const char *__s) const {
  if (max_size() - size() < __n)
    __throw_length_error((__s));
  const size_type __len = size() + std::max(size(), __n);
  return (__len < size() || __len > max_size()) ? max_size() : __len;
}
As you can see, after the first push_back, size() == capacity() == 1. The second push_back grows the capacity to 2, and from then on the capacity doubles each time it runs out, up to a maximum of size_t(-1)/sizeof(T).
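The arithmetic of _M_check_len can be tried in isolation. The sketch below restates it as a free function, with the container state passed in as parameters; check_len and its parameter names are hypothetical and not part of libstdc++.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Re-statement of the _M_check_len arithmetic. size is the current element
// count, n the number of elements to add, max_size the allocator limit.
std::size_t check_len(std::size_t size, std::size_t n, std::size_t max_size) {
    // Not enough room left even in principle: report a length error.
    if (max_size - size < n) throw std::length_error("check_len");
    // Grow by the larger of "double" and "exactly what is needed".
    const std::size_t len = size + std::max(size, n);
    // len < size detects unsigned wrap-around; clamp to max_size.
    return (len < size || len > max_size) ? max_size : len;
}
```

With a large max_size, sizes 0, 1, 2, 4 yield 1, 2, 4, 8 for n = 1. A request larger than the current size, such as check_len(4, 10, 1000), yields 14 rather than 8, and a result that would exceed max_size is clamped to it.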
The steps of _M_emplace_back_aux are:
- L14: compute the size of the new allocation
- L16: allocate a new block of memory
- L19: construct the new element
- L22: copy / move the old elements into the new memory
- L26: destroy the old elements
- L28: release the old storage
- L30-L32: update the three pointers of the underlying representation
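The steps above can be sketched as a small stand-alone container. TinyVec and its members are hypothetical names; the sketch uses ::operator new directly instead of an allocator and, like the excerpt, leaves out exception handling. The comments refer to the line numbers of _M_emplace_back_aux.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical, simplified container: only the growth path of push_back.
template <typename T>
class TinyVec {
    T* start_;
    T* finish_;
    T* end_of_storage_;

public:
    TinyVec() : start_(0), finish_(0), end_of_storage_(0) {}
    TinyVec(const TinyVec&) = delete;
    TinyVec& operator=(const TinyVec&) = delete;
    ~TinyVec() {
        for (T* p = start_; p != finish_; ++p) p->~T();
        ::operator delete(start_);
    }

    std::size_t size() const { return std::size_t(finish_ - start_); }
    std::size_t capacity() const { return std::size_t(end_of_storage_ - start_); }
    const T& operator[](std::size_t i) const { return start_[i]; }

    void push_back(const T& x) {
        if (finish_ != end_of_storage_) {  // spare room: construct in place
            ::new (static_cast<void*>(finish_)) T(x);
            ++finish_;
            return;
        }
        const std::size_t old_size = size();
        const std::size_t len = old_size + (old_size ? old_size : 1);      // L14
        T* new_start = static_cast<T*>(::operator new(len * sizeof(T)));   // L16
        // L19: the new element is built first, while x (which may refer
        // to an element of the old buffer) is still intact.
        ::new (static_cast<void*>(new_start + old_size)) T(x);
        for (std::size_t i = 0; i < old_size; ++i)                         // L22
            ::new (static_cast<void*>(new_start + i)) T(std::move(start_[i]));
        for (T* p = start_; p != finish_; ++p) p->~T();                    // L26
        ::operator delete(start_);                                         // L28
        start_ = new_start;                                                // L30-L32
        finish_ = new_start + old_size + 1;
        end_of_storage_ = new_start + len;
    }
};
```

Pushing five ints into a TinyVec<int> takes the capacity through 1, 2, 4, 4, 8, which matches the doubling described for libstdc++.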
So the storage underlying vector is a contiguous array (a sequence table). It can live on the stack (with a custom allocator) or on the heap (the default).
When the capacity runs out, it grows by a factor of 2, subject to a maximum size. Integer overflow is handled as well.
As for constructors, every insertion calls one copy or move constructor for the new element, and a reallocation additionally copies or moves every existing element.
insert
Inserts elements at the specified position in the container.
The implementation of insert is much like that of push_back. The difference is that the (size() - pos) elements after the insertion point also have to be copied or moved.
resize
Changes the number of elements stored in the container
Here we only look at the overload that value-initializes the new elements
298 void resize(size_type __new_size) {
299   if (__new_size > size())
300     _M_default_append(__new_size - size());
301   else if (__new_size < size())
302     _M_erase_at_end(this->_M_impl._M_start + __new_size);
303 }
525 void _M_erase_at_end(pointer __pos) {
526   std::_Destroy(__pos, this->_M_impl._M_finish, _M_get_Tp_allocator());
527   this->_M_impl._M_finish = __pos;
528 }
408 void vector<_Tp, _Alloc>::_M_default_append(size_type __n) {
409   if (__n != 0) {
410     if (size_type(this->_M_impl._M_end_of_storage -
411                   this->_M_impl._M_finish) >= __n) {
412       std::__uninitialized_default_n_a(this->_M_impl._M_finish, __n,
413                                        _M_get_Tp_allocator());
414       this->_M_impl._M_finish += __n;
415     } else {
416       const size_type __len = _M_check_len(__n, "vector::_M_default_append");
417       const size_type __old_size = this->size();
418       pointer __new_start(this->_M_allocate(__len));
419       pointer __new_finish(__new_start);
420       try {
421         __new_finish = std::__uninitialized_move_if_noexcept_a(
422             this->_M_impl._M_start, this->_M_impl._M_finish, __new_start,
423             _M_get_Tp_allocator());
424         std::__uninitialized_default_n_a(__new_finish, __n,
425                                          _M_get_Tp_allocator());
426         __new_finish += __n;
427       } catch (...) {
428         std::_Destroy(__new_start, __new_finish, _M_get_Tp_allocator());
429         _M_deallocate(__new_start, __len);
430         throw;
431       }
432       std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
433                     _M_get_Tp_allocator());
434       _M_deallocate(this->_M_impl._M_start,
435                     this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
436       this->_M_impl._M_start = __new_start;
437       this->_M_impl._M_finish = __new_finish;
438       this->_M_impl._M_end_of_storage = __new_start + __len;
439     }
440   }
441 }
resize also has three cases
When the new size equals the current size, nothing is done
When the new size is smaller than the current size, handling is simple: the elements past the new end are destroyed and _M_finish is moved back. No memory is released, so capacity() is unchanged
When the new size is larger than the current size:
- If the new size is less than or equal to the capacity, the extra elements are value-initialized directly at the tail
- If the new size is larger than the capacity, then, as with push_back, new memory is allocated first, the existing elements are copied / moved into it, and the extra elements are appended as in the previous case
L416-L423 allocate the new memory and copy / move the existing elements
L424 appends the extra elements at the tail, value-initialized (default-constructed for class types)
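The shrink and grow cases can be checked from the outside using only the public interface. ResizeReport and observe_resize are hypothetical names for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of what was observed around two resize() calls.
struct ResizeReport {
    std::size_t size_after_shrink;
    bool capacity_kept;
    int appended_value;
};

ResizeReport observe_resize() {
    std::vector<int> v(5, 7);  // five elements, all equal to 7
    const std::size_t cap = v.capacity();

    v.resize(2);  // shrink: the last three elements are destroyed
    ResizeReport r;
    r.size_after_shrink = v.size();
    r.capacity_kept = (v.capacity() == cap);

    v.resize(4);               // grow within the existing capacity
    r.appended_value = v[3];   // new elements are value-initialized
    return r;
}
```

The report shows a size of 2 after shrinking with the capacity untouched, and the appended int is 0 rather than the old value 7 that used to occupy that slot.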
clear
Destroys all elements in the container; afterwards size() == 0, while capacity() stays the same
The implementation is fairly simple
void clear() noexcept { _M_erase_at_end(this->_M_impl._M_start); }

void _M_erase_at_end(pointer __pos) {
  std::_Destroy(__pos, this->_M_impl._M_finish, _M_get_Tp_allocator());
  this->_M_impl._M_finish = __pos;
}
reserve
Reserves storage: increases the capacity of the vector to a value greater than or equal to new_cap
The implementation is also fairly simple. When new_cap (__n in the code) is greater than the current capacity, new storage is allocated, the elements are copied / moved into it, and finally the three pointers are updated
template <typename _Tp, typename _Alloc>
void vector<_Tp, _Alloc>::reserve(size_type __n) {
  if (__n > this->max_size())
    __throw_length_error(("vector::reserve"));
  if (this->capacity() < __n) {
    const size_type __old_size = size();
    pointer __tmp = _M_allocate_and_copy(
        __n, std::__make_move_if_noexcept_iterator(this->_M_impl._M_start),
        std::__make_move_if_noexcept_iterator(this->_M_impl._M_finish));
    std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
                  _M_get_Tp_allocator());
    _M_deallocate(this->_M_impl._M_start,
                  this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
    this->_M_impl._M_start = __tmp;
    this->_M_impl._M_finish = __tmp + __old_size;
    this->_M_impl._M_end_of_storage = this->_M_impl._M_start + __n;
  }
}
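A practical consequence of reserve is that no reallocation happens while the size stays within the reserved capacity, so the address returned by data() does not change. The sketch below checks this; stable_after_reserve is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Returns true if data() stayed the same while n elements were appended
// after a call to reserve(n).
bool stable_after_reserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                    // one allocation up front
    const int* before = v.data();    // address of the reserved block
    for (std::size_t i = 0; i < n; ++i) v.push_back(int(i));
    return v.data() == before && v.capacity() >= n;
}
```

stable_after_reserve(1000) returns true: all thousand push_back calls take the fast branch that constructs at _M_finish.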
shrink_to_fit
Requests the removal of unused capacity
void shrink_to_fit() { _M_shrink_to_fit(); }

template <typename _Tp, typename _Alloc>
bool vector<_Tp, _Alloc>::_M_shrink_to_fit() {
  if (capacity() == size())
    return false;
  return std::__shrink_to_fit_aux<vector>::_S_do_it(*this);
}

template <typename _Tp>
struct __shrink_to_fit_aux<_Tp, true> {
  static bool _S_do_it(_Tp &__c) {
    _Tp(__make_move_if_noexcept_iterator(__c.begin()),
        __make_move_if_noexcept_iterator(__c.end()), __c.get_allocator())
        .swap(__c);
    return true;
  }
};
That is a lot of template machinery and hard to read. Put another way, it is equivalent to the following:
std::vector<int> v;
v.push_back(1); // size()=1 capacity()=1
v.push_back(1); // size()=2 capacity()=2
v.push_back(1); // size()=3 capacity()=4
std::vector<int>(v.begin(), v.end()).swap(v); // size()=3 capacity()=3
Time complexity analysis
| Complexity | Method | Explanation |
|---|---|---|
| \(O(1)\) | size() | Pointer subtraction |
| \(O(1)\) | capacity() | Pointer subtraction |
| \(O(1)\) | push_back() | Amortized; at most 3 constructions per call on average |
| \(O(n)\) | insert() | Has to copy / move size() - pos elements |
| \(O(n)\) | clear() | size() destructor calls |
| \(O(n)\) | reserve() | size() copies / moves when the capacity grows |
| \(O(n)\) | shrink_to_fit() | size() copies / moves to build the temporary; swap() is constant time |
push_back complexity proof
Assume libstdc++, where the growth factor of vector is 2, and analyze the cost of executing n push_back calls on an empty vector.
Let \(c_i\) be the number of copy constructions required by the \(i\)-th operation. There are two cases:
- size() < capacity(): \(c_i = 1\)
- size() == capacity(): the vector is expanded, \(c_i = i\)
The cost of each operation is therefore:
\[ c_i = \begin{cases} i, & \text{if } i - 1 \text{ is an exact power of } 2 \\ 1, & \text{otherwise} \end{cases} \]
The total number of copy constructions for n push_back calls is:
\[ \sum_{i=1}^{n} c_i \le n + \sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^j < n + 2n = 3n \]
So n push_back calls cost less than 3n in total, the amortized cost of a single call is at most 3, and the complexity is \(O(1)\).
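The bound can also be checked empirically by counting copy constructions. The sketch below is only an experiment, not a proof; CopyCounter and copies_for_push_back are hypothetical names. The element type declares a copy constructor and therefore has no move constructor, so every reallocation has to copy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copy constructions. Declaring the
// copy constructor suppresses the implicit move constructor.
struct CopyCounter {
    static long& copies() { static long n = 0; return n; }
    CopyCounter() {}
    CopyCounter(const CopyCounter&) { ++copies(); }
};

// Appends n elements and returns the number of copy constructions,
// optionally reserving the full capacity beforehand.
long copies_for_push_back(std::size_t n, bool reserve_first) {
    std::vector<CopyCounter> v;
    if (reserve_first) v.reserve(n);
    CopyCounter::copies() = 0;
    const CopyCounter element;
    for (std::size_t i = 0; i < n; ++i) v.push_back(element);
    return CopyCounter::copies();
}
```

With reserve, the count is exactly n. Without it, a growth factor of 2 should give 1000 + (1 + 2 + ... + 512) = 2023 copies for n = 1000, which stays below 3n = 3000; other standard libraries may use a different factor and produce a different total.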