C++ benchmark - std::vector VS std::list

Baptiste Wicht

An updated version of this article is available: C++ benchmark – std::vector VS std::list VS std::deque

In C++, the two most commonly used data structures are std::vector and std::list. In this article, we will compare the practical performance of these two data structures on several different workloads. Throughout, "list" refers to the std::list implementation and "vector" to the std::vector implementation.

It is generally said that a list should be used when random inserts and removes will be performed (O(1) for a list versus O(n) for a vector). If we look only at the complexity, search in both data structures should be roughly equivalent, being O(n). When random insert/remove operations are performed on a vector, all the subsequent elements need to be moved, so each of them will be copied. That is why the size of the data type is an important factor when comparing these two data structures.

However, in practice, there is a huge difference: the usage of the memory caches. All the data in a vector is contiguous, whereas std::list allocates memory separately for each element. How does that change the results in practice?

Keep in mind that all the tests are performed on vector and list only, even if other data structures could be better suited to the given workload.

In the graphs and in the text, n is used to refer to the number of elements of the collection.

All the tests have been performed on an Intel Core i7 Q 820 @ 1.73GHz. The code has been compiled in 64-bit mode with GCC 4.7.2, using -O2 and -march=native, with C++11 support enabled (-std=c++11).

Fill

The first test fills the data structures by adding elements to the back of the container. Two variations of vector are used: vector is a std::vector without pre-allocation, and vector_pre is a std::vector with the size passed as a parameter to the constructor, resulting in only one allocation of memory.
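As a rough illustration of the variants, here is a minimal sketch. It is not the benchmark code: the function names fill_back and fill_presized are hypothetical, and the pre-sized variant assumes that elements are assigned in place once the vector has been constructed with its final size.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: grow any container one push_back at a time.
// A vector reallocates from time to time; a list allocates one node per element.
template <typename Container>
Container fill_back(std::size_t n) {
    Container c;
    for (std::size_t i = 0; i < n; ++i) {
        c.push_back(i);
    }
    return c;
}

// Hypothetical helper: the size is passed to the constructor, so the
// storage is allocated once and the elements are then assigned in place.
inline std::vector<std::size_t> fill_presized(std::size_t n) {
    std::vector<std::size_t> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = i;
    }
    return v;
}
```

For example, fill_back<std::list<std::size_t>>(n) and fill_presized(n) build the same sequence of values, but with very different allocation patterns.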

x	vector_pre	vector	list
1000	0	0	1
10000	0	1	10
100000	4	11	100
1000000	7	234	1,023

x	vector_pre	vector	list
1000	0	9	1
10000	12	245	18
100000	949	2,635	1,153
1000000	9,138	23,654	11,270

All data structures are impacted the same way when the data size increases, because there is more memory to allocate. vector_pre is clearly the winner of this test, being an order of magnitude faster than a list and about twice as fast as a vector without pre-allocation. The results are directly linked to the allocations that have to be performed, allocation being slow. Whatever the data size is, push_back to a vector will always be faster than to a list. This is logical because vector allocates more memory than necessary and so does not need to allocate memory for each element.

But this test is not very interesting: generally, building the data structure is not critical. What is critical is the operations that are performed on the data structure. These are tested in the next sections.

Random Find

The first operation tested is search. The container is filled with all the numbers in [0, N] and shuffled. Then, each number in [0, N] is searched for in the container with std::find, which performs a simple linear search.
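The procedure above could be sketched roughly as follows. This is an illustration, not the benchmark code: make_shuffled and find_all are hypothetical names, the seed is arbitrary, and the sketch uses the values 0 to n-1. The shuffle is done in a temporary vector because std::shuffle needs random-access iterators, which std::list does not provide.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: build a container holding 0..n-1 in shuffled order.
template <typename Container>
Container make_shuffled(std::size_t n, unsigned seed = 42) {
    std::vector<std::size_t> tmp(n);
    std::iota(tmp.begin(), tmp.end(), std::size_t{0});
    std::mt19937 gen(seed);
    std::shuffle(tmp.begin(), tmp.end(), gen);
    return Container(tmp.begin(), tmp.end());
}

// Hypothetical helper: linear search for every value, returning the hit count.
template <typename Container>
std::size_t find_all(const Container& c) {
    std::size_t found = 0;
    for (std::size_t value = 0; value < c.size(); ++value) {
        if (std::find(c.begin(), c.end(), value) != c.end()) {
            ++found;
        }
    }
    return found;
}
```

The same find_all template works for both containers, so any difference in timing comes from how the elements are laid out in memory, not from the algorithm.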

x	vector	list
100	0	11
1000	0	1,545
5000	0	35,886
10000	0	150,865
20000	0	614,496

Yes, vector is present in the graph: its line coincides with the x axis! Performing a linear search in a vector is several orders of magnitude faster than in a list.

The only reason is the usage of the cache lines. When a piece of data is accessed, it is fetched from main memory into the cache. Not only the accessed data is fetched, but a whole cache line. As the elements in a vector are contiguous, when you access an element, the next element is usually already in the cache. As main memory is orders of magnitude slower than the cache, this makes a huge difference. In the list case, the processor spends most of its time waiting for data to be fetched from memory into the cache.

If we increase the size of the data type to 1KB, the picture remains the same, only slower:

x	vector	list
100	0	11
1000	0	3,551
5000	0	195,429
10000	0	829,631
20000	0	3,356,432

Random Insert

In the case of random insert, in theory, the list should be much faster, its insert operation being in O(1) versus O(n) for a vector.

The container is filled with all the numbers in [0, N] and shuffled. Then, 1000 random values are inserted at a random position in the container. The random position is found by linear search. In both cases, the complexity of the search is O(n); the only difference comes from the insert that follows the search.
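A minimal sketch of this search-then-insert pattern is shown below. It is an assumption-laden illustration rather than the benchmark code: random_insert is a hypothetical name, the seed is arbitrary, and the container is assumed to hold the values 0 to n-1 so that every randomly drawn target exists.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <random>
#include <vector>

// Hypothetical helper: insert `inserts` new values, each in front of a
// randomly chosen existing value that is located by linear search.
template <typename Container>
void random_insert(Container& c, std::size_t inserts, unsigned seed = 42) {
    if (c.empty()) {
        return;
    }
    std::mt19937 gen(seed);
    // Targets are drawn from the original value range, so they are always present.
    std::uniform_int_distribution<std::size_t> pick(0, c.size() - 1);
    for (std::size_t i = 0; i < inserts; ++i) {
        const std::size_t target = pick(gen);
        auto pos = std::find(c.begin(), c.end(), target);  // O(n) for both containers
        c.insert(pos, c.size());  // O(1) for a list, O(n) element moves for a vector
    }
}
```

The inserted value is the current size of the container, which is never smaller than the original n and therefore never collides with a search target.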

x	vector	list
1000	9	85
2000	9	85
4000	10	94
6000	12	98
8000	13	106
10000	14	106

Where the vector should be slower than the list, it is almost an order of magnitude faster. Again, this is because finding the position in a list is much slower than copying a lot of small elements.

If we increase the size of the data type:

x	vector	list
1000	27	120
2000	30	113
4000	34	122
6000	37	140
8000	42	145
10000	47	155

The two lines are getting closer, but vector is still faster.

Increase it to 1KB:

x	vector	list
1000	1,821	167
2000	1,941	163
4000	2,383	191
6000	2,679	207
8000	2,960	214
10000	3,308	228

This time, list outperforms vector by an order of magnitude! The performance of random insert in a list is not impacted much by the size of the data type, whereas vector suffers a lot when big sizes are used. We can also see that list doesn't seem to care about the size of the collection. That is because the size of the collection only impacts the search and not the insertion, and as few searches are performed, it does not change the results much.

If the iterator were already known (no need for a linear search), it would be faster to insert into a list than into a vector.

Random Remove

In theory, random remove is the same case as random insert. Now that we've seen the results for random insert, we can expect the same behavior for random remove.

The container is filled with all the numbers in [0, N] and shuffled. Then, 1000 random values are removed from a random position in the container.
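One way to sketch this removal loop is given below. It is only an illustration under stated assumptions: remove_values is a hypothetical name, and the sketch relies on the container having been shuffled beforehand, so that removing the values 0, 1, 2, ... hits effectively random positions.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: remove the values 0..removals-1, each located by
// linear search. In a shuffled container these sit at random positions.
template <typename Container>
void remove_values(Container& c, std::size_t removals) {
    for (std::size_t value = 0; value < removals; ++value) {
        auto pos = std::find(c.begin(), c.end(), value);  // O(n) search
        if (pos != c.end()) {
            c.erase(pos);  // O(1) for a list, O(n) element moves for a vector
        }
    }
}
```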

x	vector	list
100	0	0
1000	0	0
10000	40	0
50000	949	2
100000	3,937	4
200000	16,003	9
300000	42,393	12

Again, vector is several times faster and looks to scale better. Again, this is because it is very cheap to copy small elements.

Let's increase the element size directly to 1KB.

x	vector	list
1000	0	0
10000	2	26
100000	163	684
1000000	2,147	15,950

The two lines have been reversed!

The behavior of random remove is the same as the behavior of random insert, for the same reasons.

Push Front

The next operation that we will compare is inserting elements at the front of the collection. This is the worst case for vector, because on each insertion, all the previously inserted elements have to be moved and copied. For a list, it makes no difference compared to pushing to the back.
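To make the difference concrete, here is a small sketch with hypothetical function names. std::vector has no push_front member, so the sketch assumes the front insertion is done with insert at begin(), which shifts every existing element by one slot.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: front insertion into a vector. Each call to insert
// moves all the elements already stored, so the loop is O(n^2) overall.
inline std::vector<std::size_t> vector_push_front(std::size_t n) {
    std::vector<std::size_t> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.insert(v.begin(), i);
    }
    return v;
}

// Hypothetical helper: front insertion into a list only links a new node,
// so each push_front is O(1).
inline std::list<std::size_t> list_push_front(std::size_t n) {
    std::list<std::size_t> l;
    for (std::size_t i = 0; i < n; ++i) {
        l.push_front(i);
    }
    return l;
}
```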

x	vector	list
100	0	0
1000	0	0
10000	40	0
50000	949	2
100000	3,937	4
200000	16,003	9
300000	42,393	12

The results are crystal clear and as expected: vector is very bad at inserting elements at the front. This does not need further explanation. There is no need to change the data size; it would only make vector much slower.

Sort

The next operation that is tested is the performance of sorting a vector or a list. For a vector std::sort is used and for a list the member function sort is used.
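The two calls can be put behind a common name, as in the sketch below. The overload set sort_container is hypothetical and only serves the illustration. std::sort requires random-access iterators, which is why std::list provides its own sort member function.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Hypothetical overload for vectors: std::sort moves the elements themselves.
template <typename T>
void sort_container(std::vector<T>& v) {
    std::sort(v.begin(), v.end());
}

// Hypothetical overload for lists: the member sort relinks nodes
// and does not copy the elements.
template <typename T>
void sort_container(std::list<T>& l) {
    l.sort();
}
```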

x	vector	list
1000	0	0
10000	2	26
100000	163	684
1000000	2,147	15,950

We can see that sorting a list is several times slower. It comes from the poor usage of the cache.

If we increase the size of the element to 1KB:

x	vector	list
1000	2	0
10000	224	50
100000	4,289	1,083
1000000	50,973	17,975

This time the list is faster than the vector. It is not very clear on the graph, but the values for the list are almost the same as for the previous results. That is because std::list::sort() does not perform any copy, only pointers to the elements are changed. On the other hand, swapping two elements in a vector involves at least three copies, so the cost of sorting will increase as the cost of copying increases.

Number Crunching

Finally, we can also test a number crunching operation. Here, random elements are inserted into a container that is kept sorted. This means that the position where the element has to be inserted is first searched for by iterating through the elements, and the element is then inserted. As we are talking about number crunching, only 8-byte elements are tested.
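A single step of this workload might look like the following sketch. The name insert_sorted is hypothetical, and the choice of inserting in front of the first strictly greater element is an assumption made for the illustration.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Hypothetical helper: keep the container sorted by walking it linearly
// up to the first element greater than `value`, then inserting there.
template <typename Container>
void insert_sorted(Container& c, const typename Container::value_type& value) {
    typedef typename Container::value_type T;
    auto pos = std::find_if(c.begin(), c.end(),
                            [&value](const T& x) { return x > value; });
    c.insert(pos, value);  // O(1) for a list, O(n) element moves for a vector
}
```

The walk is linear for both containers, so the comparison comes down to pointer chasing through list nodes versus shifting contiguous elements in a vector.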

x	vector	list
1000	0	0
10000	45	166
50000	928	10,665
100000	3,753	50,766
200000	15,185	231,480
300000	34,293	715,892

We can clearly see that vector is more than an order of magnitude faster than list, and the gap will only grow as the size of the collection increases. This is because traversing the list is much more expensive than copying the elements of the vector.

Conclusion

To conclude, we can get some facts about each data structure:

std::vector is insanely faster than std::list at finding an element
std::vector is always faster than std::list with very small data
std::vector is always faster than std::list at pushing elements to the back
std::list handles large elements very well, especially for sorting or inserting at the front

This leads to simple conclusions on the usage of each data structure:

Number crunching: use std::vector
Linear search: use std::vector
Random Insert/Remove: use std::list (if the data size is very small (< 64B on my computer), use std::vector)
Big data size: use std::list (unless it is intended for searching)

If you have the time, in practice, the best way to decide is always to benchmark both versions, or even to try other data structures.
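If you want to try this yourself, a very small timing helper is enough to get started. The sketch below is a generic illustration based on std::chrono, not the harness used for this article, and time_ms is a hypothetical name.

```cpp
#include <chrono>

// Hypothetical helper: run a callable once and return the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Function>
long long time_ms(Function&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

It can be called with a lambda wrapping the operation under test, once for each container. For meaningful numbers, the operation should be repeated several times and its result should be used afterwards, so that the optimizer cannot remove the work.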

I hope that you found this article interesting. If you have any comments or an idea for another workload that you would like to see tested, don't hesitate to post a comment ;) If you have a question about the results, don't hesitate either.

The source code of the benchmark is available online: https://github.com/wichtounet/articles/blob/master/src/vector_list/bench.cpp