Skip to main content

C++ benchmark - std::vector VS std::list

A updated version of this article is available: C++ benchmark – std::vector VS std::list VS std::deque

In C++, the two most used data structures are the std::vector and the std::list. In this article, we will compare the performance in practice of these two data structures on several different workloads. In this article, when I talk about a list it is the std::list implementation and vector refers to the std::vector implementation.

It is generally said that a list should be used when random insert and remove will be performed (performed in O(1) versus O(n) for a vector). If we look only at the complexity, search in both data structures should be roughly equivalent, complexity being in O(n). When random insert/replace operations are performed on a vector, all the subsequent data needs to be moved and so each element will be copied. That is why the size of the data type is an important factor when comparing those two data structures.

However, in practice, there is a huge difference, the usage of the memory caches. All the data in a vector is contiguous where the std::list allocates separately memory for each element. How does that change the results in practice ?

Keep in mind that all the tests performed are made on vector and list even if other data structures could be better suited to the given workload.

In the graphs and in the text, n is used to refer to the number of elements of the collection.

All the tests performed have been performed on an Intel Core i7 Q 820  @ 1.73GHz. The code has been compiled in 64 bits with GCC 4.7.2 with -02 and -march=native. The code has been compiled with C++11 support (-std=c++11).

Fill

The first test that is performed is to fill the data structures by adding elements to the back of the container. Two variations of vector are used, vector_pre being a std::vector with the size passed in parameters to the constructor, resulting in only one allocation of memory.

Fill (8 bytes)vector_…vectorlist100010000100000100000002004006008001,0001,200Milliseconds
xvector_prevectorlist
1000001
100000110
100000411100
100000072341,023

Fill (1024 bytes)vector_…vectorlist100010000100000100000005,00010,00015,00020,00025,000Milliseconds
xvector_prevectorlist
1000091
100001224518
1000009492,6351,153
10000009,13823,65411,270

All data structures are impacted the same way when the data size increases, because there will be more memory to allocate. The vector_pre is clearly the winner of this test, being one order of magnitude faster than a list and about twice faster than a vector without pre-allocation. The result are directly linked to the allocations that have to be performed, allocation being slow. Whatever the data size is, push_back to a vector will always be faster than to a list. This is logical becomes vector allocates more memory than necessary and so does not need to allocate memory for each element.

But this test is not very interesting, generally building the data structure is not critical. What is critical is the operations that are performed on the data structure. That will be tested in the next sections.

Random Find

The first operation is that is tested is the search. The container is filled with all the numbers in [0, N] and shuffled. Then, each number in [0,N] is searched in the container with std::find that performs a simple linear search.

Find (8 bytes)vectorlist1001000500010000200000200,000400,000600,000800,000Microseconds
xvectorlist
100011
100001,545
5000035,886
100000150,865
200000614,496

Yes, vector is present in the graph, its line is the same as the x line ! Performing a linear search in a vector is several orders of magnitude faster than in a list.

The only reason is the usage of the cache line. When a data is accessed, the data is fetched from the main memory to the cache. Not only the accessed data is accessed, but a whole cacheline is fetched. As the elements in a vector are contiguous, when you access an element, the next element is automatically in the cache. As the main memory is orders of magnitude slower than the cache, this makes a huge difference. In the list case, the processor spends its whole time waiting for data being fetched from memory to the cache.

If we augment the size of the data type to 1KB, the results remain the same, but slower:

Find (1024 bytes)vectorlist10010005000100002000001,000,0002,000,0003,000,0004,000,000Microseconds
xvectorlist
100011
100003,551
50000195,429
100000829,631
2000003,356,432

Random Insert

In the case of random insert, in theory, the list should be much faster, its insert operation being in O(1) versus O(n) for a vector.

The container is filled with all the numbers in [0, N] and shuffled. Then, 1000 random values are inserted at a random position in the container. The random position is found by linear search. In both cases, the complexity of the search is O(n), the only difference comes from the insert that follow the search.

Insert (8 bytes)vectorlist1000200040006000800010000020406080100120Milliseconds
xvectorlist
1000985
2000985
40001094
60001298
800013106
1000014106

When, the vector should be slower than the list, it is almost an order of magnitude faster. Again, this is because finding the position in a list is much slower than copying a lot of small elements.

If we increase the size:

Insert (32 bytes)vectorlist1000200040006000800010000050100150200Milliseconds
xvectorlist
100027120
200030113
400034122
600037140
800042145
1000047155

The two lines are getting closer, but vector is still faster.

Increase it to 1KB:

Insert (1024 bytes)vectorlist100020004000600080001000001,0002,0003,0004,000Milliseconds
xvectorlist
10001,821167
20001,941163
40002,383191
60002,679207
80002,960214
100003,308228

This time, list outperforms vector by an order of magnitude ! The performance of random insert in a list are not impacted much by the size of the data type, where vector suffers a lot when big sizes are used. We can also see that list doesn't seem to care about the size of the collection. It is because the size of the collection only impact the search and not the insertion and as few search are performed, it does not change the results a lot.

If the iterator was already known (no need for linear search), it would be faster to insert into a list than into the vector.

Random Remove

In theory, random remove is the same case than random insert. Now that we've seen the results with random insert, we could expect the same behavior for random remove.

The container is filled with all the numbers in [0, N] and shuffled. Then, 1000 random values are removed from a random position in the container.

Push front (8 bytes)vectorlist10010001000050000100000200000300000010,00020,00030,00040,00050,000Milliseconds
xvectorlist
10000
100000
10000400
500009492
1000003,9374
20000016,0039
30000042,39312

Again, vector is several times faster and looks to scale better. Again, this is because it is very cheap to copy small elements.

Let's increase it directly to 1KB element.

Sort (8 bytes)vectorlist100010000100000100000005,00010,00015,00020,000Milliseconds
xvectorlist
100000
10000226
100000163684
10000002,14715,950

The two lines have been reversed !

The behavior of random remove is the same as the behavior of random insert, for the same reasons.

Push Front

The next operation that we will compare is inserting elements in front of the collection. This is the worst case for vector, because after each insertion, all the previously inserted will be moved and copied. For a list, it does not make a difference compared to pushing to the back.

Push front (8 bytes)vectorlist10010001000050000100000200000300000010,00020,00030,00040,00050,000Milliseconds
xvectorlist
10000
100000
10000400
500009492
1000003,9374
20000016,0039
30000042,39312

The results are crystal-clear and as expected. vector is very bad at inserting elements to the front. This does not need further explanations. There is no need to change the data size, it will only make vector much slower.

Sort

The next operation that is tested is the performance of sorting a vector or a list. For a vector std::sort is used and for a list the member function sort is used.

Sort (8 bytes)vectorlist100010000100000100000005,00010,00015,00020,000Milliseconds
xvectorlist
100000
10000226
100000163684
10000002,14715,950

We can see that sorting a list is several times slower. It comes from the poor usage of the cache.

If we increase the size of the element to 1KB:

Sort (1024 bytes)vectorlist1000100001000001000000010,00020,00030,00040,00050,00060,000Milliseconds
xvectorlist
100020
1000022450
1000004,2891,083
100000050,97317,975

This time the list is faster than the vector. It is not very clear on the graph, but the values for the list are almost the same as for the previous results. That is because std::list::sort() does not perform any copy, only pointers to the elements are changed. On the other hand, swapping two elements in a vector involves at least three copies, so the cost of sorting will increase as the cost of copying increases.

Number Crunching

Finally, we can also test a number crunching operation. Here, random elements are inserted into the container that is kept sorted. It means, that the position where the element has to be inserted is first searched by iterating through elements and the inserted. As we talk about number crunching, only 8 bytes elements are tested.

Random Sorted Insert (8 bytes)vectorlist100010000500001000002000003000000200,000400,000600,000800,000Milliseconds
xvectorlist
100000
1000045166
5000092810,665
1000003,75350,766
20000015,185231,480
30000034,293715,892

We can clearly see that vector is more than an order of magnitude faster than list and this will only be more as the size of the collection increase. This is because traversing the list is much more expensive than copying the elements of the vector.

Conclusion

To conclude, we can get some facts about each data structure:

  • std::vector is insanely faster than std::list to find an element
  • std::vector performs always faster than std::list with very small data
  • std::vector is always faster to push elements at the back than std::list
  • std::list handles very well large elements, especially for sorting or inserting in the front

This draw simple conclusions on usage of each data structure:

  • Number crunching: use std::vector
  • Linear search: use std::vector
  • Random Insert/Remove: use std::list (if data size very small (< 64B on my computer), use std::vector)
  • Big data size: use std::list (not if intended for searching)

If you have the time, in practice, the best way to decide is always to benchmark both versions, or even to try another data structures.

I hope that you found this article interesting. If you have any comment or have an idea about an other workload that you would like to test, don't hesitate to post a comment ;) If you have a question on results, don't hesitate as well.

The code source of the benchmark is available online: https://github.com/wichtounet/articles/blob/master/src/vector_list/bench.cpp

Comments

vector_pre
vector_pre