What are the options to clone or copy a list in Python?

When I use new_list = my_list, any modification to my_list also shows up in new_list.
Why is this?


With new_list = my_list, you don't actually have two lists. The assignment just copies the reference to the list, not the actual list, so both new_list and my_list refer to the same list after the assignment.
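A minimal sketch of the aliasing described above. The variable names follow the question; the values are made up for illustration.

```python
my_list = [1, 2, 3]
new_list = my_list          # binds a second name to the same list object

new_list.append(4)          # mutates the one shared list

print(new_list is my_list)  # True: both names refer to one object
print(my_list)              # [1, 2, 3, 4]
```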

To actually copy the list, you have various possibilities:

  • You can slice it:

    new_list = old_list[:]
    

    Alex Martelli's opinion (at least back in 2007) is that this is a weird syntax and that it never makes sense to use it. ;) (In his opinion, the next one is more readable.)

  • You can use the built-in list() function:

    new_list = list(old_list)
    
  • You can use generic copy.copy():

    import copy
    new_list = copy.copy(old_list)
    

    This is a little slower than list() because it has to find out the datatype of old_list first.

  • If the list contains objects and you want to copy them as well, use generic copy.deepcopy():

    import copy
    new_list = copy.deepcopy(old_list)
    

    Obviously the slowest and most memory-hungry method, but sometimes unavoidable.

Example:

import copy

class Foo(object):
    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return str(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a[:]
c = list(a)
d = copy.copy(a)
e = copy.deepcopy(a)

# edit original list and instance
a.append('baz')
foo.val = 5

print('original: %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e))

Result:

original: ['foo', 5, 'baz']
slice: ['foo', 5]
list(): ['foo', 5]
copy: ['foo', 5]
deepcopy: ['foo', 1]
    
Just curious, instead of using copy.deepcopy(old_list), couldn't you do map(copy.copy, old_list)? That's what worked for me. Which one is more efficient? – Rohan May 25 at 1:41
    
@Rohan: AFAIK deepcopy creates copies of every value at any level. map(copy.copy, old_list) would only create a copy of values at the first level. I.e. cloning [[[1,2]]] would produce different results with these two methods. – Felix Kling May 25 at 1:43
    
So if I need to copy at only one level in (because I need to), map would be more efficient? – Rohan May 25 at 3:48
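The difference discussed in these comments can be sketched as follows. This is only an illustration using the [[[1,2]]] value from the comment; the list comprehension stands in for map(copy.copy, old_list), which in Python 3 returns an iterator rather than a list.

```python
import copy

old_list = [[[1, 2]]]

# Copies each first-level element only (the equivalent of
# list(map(copy.copy, old_list)) in Python 3).
one_level = [copy.copy(item) for item in old_list]
# Copies every level.
deep = copy.deepcopy(old_list)

old_list[0][0].append(3)   # mutate the innermost list

print(one_level)  # [[[1, 2, 3]]]  -- the innermost list is still shared
print(deep)       # [[[1, 2]]]     -- fully independent
```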

Felix already provided an excellent answer, but I thought I'd do a speed comparison of the various methods:

  1. 10.59 sec (105.9us/itn) - copy.deepcopy(old_list)
  2. 10.16 sec (101.6us/itn) - pure python Copy() method copying classes with deepcopy
  3. 1.488 sec (14.88us/itn) - pure python Copy() method not copying classes (only dicts/lists/tuples)
  4. 0.325 sec (3.25us/itn) - for item in old_list: new_list.append(item)
  5. 0.217 sec (2.17us/itn) - [i for i in old_list] (a list comprehension)
  6. 0.186 sec (1.86us/itn) - copy.copy(old_list)
  7. 0.075 sec (0.75us/itn) - list(old_list)
  8. 0.053 sec (0.53us/itn) - new_list = []; new_list.extend(old_list)
  9. 0.039 sec (0.39us/itn) - old_list[:] (list slicing)

So the fastest is list slicing. But be aware that copy.copy(), old_list[:] and list(old_list), unlike copy.deepcopy() and the pure Python version, don't copy any lists, dictionaries or class instances contained in the list, so if those contained objects change, the change shows up in the copied list too, and vice versa.

(Here's the script if anyone's interested or wants to raise any issues:)

from copy import deepcopy

class old_class:
    def __init__(self):
        self.blah = 'blah'

class new_class(object):
    def __init__(self):
        self.blah = 'blah'

dignore = {str: None, unicode: None, int: None, type(None): None}

def Copy(obj, use_deepcopy=True):
    t = type(obj)

    if t in (list, tuple):
        if t == tuple:
            # Convert to a list if a tuple to 
            # allow assigning to when copying
            is_tuple = True
            obj = list(obj)
        else: 
            # Otherwise just do a quick slice copy
            obj = obj[:]
            is_tuple = False

        # Copy each item recursively
        for x in xrange(len(obj)):
            if type(obj[x]) in dignore:
                continue
            obj[x] = Copy(obj[x], use_deepcopy)

        if is_tuple: 
            # Convert back into a tuple again
            obj = tuple(obj)

    elif t == dict: 
        # Use the fast shallow dict copy() method and copy any 
        # values which aren't immutable (like lists, dicts etc)
        obj = obj.copy()
        for k in obj:
            if type(obj[k]) in dignore:
                continue
            obj[k] = Copy(obj[k], use_deepcopy)

    elif t in dignore: 
        # Numeric or string/unicode? 
        # It's immutable, so ignore it!
        pass 

    elif use_deepcopy: 
        obj = deepcopy(obj)
    return obj

if __name__ == '__main__':
    import copy
    from time import time

    num_times = 100000
    L = [None, 'blah', 1, 543.4532, 
         ['foo'], ('bar',), {'blah': 'blah'},
         old_class(), new_class()]

    t = time()
    for i in xrange(num_times):
        Copy(L)
    print 'Custom Copy:', time()-t

    t = time()
    for i in xrange(num_times):
        Copy(L, use_deepcopy=False)
    print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t

    t = time()
    for i in xrange(num_times):
        copy.copy(L)
    print 'copy.copy:', time()-t

    t = time()
    for i in xrange(num_times):
        copy.deepcopy(L)
    print 'copy.deepcopy:', time()-t

    t = time()
    for i in xrange(num_times):
        L[:]
    print 'list slicing [:]:', time()-t

    t = time()
    for i in xrange(num_times):
        list(L)
    print 'list(L):', time()-t

    t = time()
    for i in xrange(num_times):
        [i for i in L]
    print 'list expression(L):', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(L)
    print 'list extend:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        for y in L:
            a.append(y)
    print 'list append:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(i for i in L)
    print 'generator expression extend:', time()-t

EDIT: Added new-style and old-style classes and dicts to the benchmarks, made the pure Python version much faster, and added some more methods, including list comprehensions and extend().

    
Since you are benchmarking, it might be helpful to include a reference point. Are these figures still accurate in 2017 using Python 3.6 with fully compiled code? I'm noting the answer below (stackoverflow.com/a/17810305/26219) already questions this answer. – Mark Edington Apr 3 at 19:52
    
@MarkEdington I got inspired to do this by another SO question, but this question looked like the best place to post my results – River Apr 5 at 1:02

I've been told that Python 3.3+ adds a list.copy() method, which should be as fast as slicing:

newlist = old_list.copy()


What are the options to clone or copy a list in Python?

There are two semantic ways to copy a list. A shallow copy creates a new list containing the same objects; a deep copy creates a new list containing new, equivalent objects.

Shallow list copy

A shallow copy only copies the list itself, which is a container of references to the objects in the list. If the objects contained themselves are mutable and one is changed, the change will be reflected in both lists.

There are different ways to do this in Python 2 and 3. The Python 2 ways will also work in Python 3.

Python 2

In Python 2, the idiomatic way of making a shallow copy of a list is with a complete slice of the original:

a_copy = a_list[:]

You can also accomplish the same thing by passing the list through the list constructor,

a_copy = list(a_list)

but using the constructor is less efficient:

>>> import timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844

Python 3

In Python 3 (3.3 and later), lists get the list.copy method:

a_copy = a_list.copy()

In Python 3.5:

>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125

Making another pointer does not make a copy

When I use new_list = my_list, any modification to my_list also shows up in new_list. Why is this?

my_list is a pointer to the actual list in memory. When you say new_list = my_list you're not making a copy, you're just adding another name that points at that original list in memory. We can have similar issues when we make copies of lists.

>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]

The list is just an array of pointers to the contents, so a shallow copy just copies the pointers, and so you have two different lists, but they have the same contents. To make copies of the contents, you need a deep copy.

Deep copies

To make a deep copy of a list, in Python 2 or 3, use deepcopy in the copy module:

import copy
a_deep_copy = copy.deepcopy(a_list)

To demonstrate how this allows us to make new sub-lists:

>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]

And so we see that the deep copied list is an entirely different list from the original. You could roll your own function, but don't. You're likely to introduce bugs that you would avoid by using the standard library's deepcopy function.
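One reason hand-rolled deep copies tend to go wrong is shared or self-referencing elements. A small sketch (the list contents are invented for illustration) of two cases that copy.deepcopy handles through its memo dictionary:

```python
import copy

shared = ['x']
original = [shared, shared]      # the same inner list appears twice
original.append(original)        # and the list contains itself

clone = copy.deepcopy(original)

print(clone[0] is clone[1])      # True: sharing is preserved inside the copy
print(clone[0] is shared)        # False: the copy's inner list is a new object
print(clone[2] is clone)         # True: the cycle points at the copy, not the original
```

A naive recursive copy would duplicate the shared inner list and recurse forever on the cycle.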

Don't use eval

You may see this used as a way to deepcopy, but don't do it:

problematic_deep_copy = eval(repr(a_list))

  1. It's dangerous, particularly if you're evaluating something from a source you don't trust.
  2. It's not reliable if a subelement you're copying doesn't have a representation that can be eval'd to reproduce an equivalent element.
  3. It's also less performant.

In 64-bit Python 2.7:

>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206

On 64-bit Python 3.5:

>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644
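To illustrate the reliability point, here is a sketch of inputs whose repr() cannot be evaluated back. Point and eval_copy are hypothetical names introduced only for this example.

```python
class Point:
    """Hypothetical class without a custom __repr__."""
    def __init__(self, x):
        self.x = x

def eval_copy(obj):
    """The discouraged eval(repr(...)) 'copy', wrapped so failures are visible."""
    try:
        return eval(repr(obj))
    except (SyntaxError, NameError) as exc:
        return type(exc).__name__

print(eval_copy([1, [2, 3]]))      # [1, [2, 3]]  -- works for plain literals
print(eval_copy([float('inf')]))   # NameError    -- the repr is "[inf]"
print(eval_copy([Point(1)]))       # SyntaxError  -- the repr starts with "[<"
```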

This answer is only for Python 2. I haven't upgraded to Python 3 yet.

There are many answers already that tell you how to make a proper copy, but none of them say why your original 'copy' failed.

Python doesn't store values in variables; it binds names to objects. Your original assignment took the object referred to by my_list and bound it to new_list as well. No matter which name you use there is still only one list, so changes made when referring to it as my_list will persist when referring to it as new_list. Each of the other answers to this question gives you a different way of creating a new object to bind to new_list.

Each element of a list acts like a name, in that each element binds non-exclusively to an object. A shallow copy creates a new list whose elements bind to the same objects as before.

new_list = list(my_list)  # or my_list[:], but I prefer this syntax
# is simply a shorter way of:
new_list = [element for element in my_list]

To take your list copy one step further, copy each object that your list refers to, and bind those element copies to a new list.

import copy  
# copy.copy works on most objects by default; a class may define __copy__ to customize it
new_list = [copy.copy(element) for element in my_list]

This is not yet a deep copy, because each element of a list may refer to other objects, just like the list is bound to its elements. To recursively copy every element in the list, and then each other object referred to by each element, and so on: perform a deep copy.

import copy
# copy.deepcopy works on most objects by default; a class may define __deepcopy__ to customize it
new_list = copy.deepcopy(my_list)

See the documentation for more information about corner cases in copying.
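The three levels of copying described in this answer can be compared side by side. This is a sketch only: Tag is a hypothetical element class that defines neither __copy__ nor __deepcopy__, and the code is written to run on Python 3.

```python
import copy

class Tag:
    """Hypothetical element type with a mutable attribute."""
    def __init__(self, labels):
        self.labels = labels

my_list = [Tag(['a'])]

shallow = list(my_list)                         # new list, same Tag object
per_element = [copy.copy(e) for e in my_list]   # new Tag, same labels list
deep = copy.deepcopy(my_list)                   # new Tag, new labels list

print(shallow[0] is my_list[0])                     # True
print(per_element[0] is my_list[0])                 # False
print(per_element[0].labels is my_list[0].labels)   # True
print(deep[0].labels is my_list[0].labels)          # False
```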


new_list = list(old_list)


Python's idiom for doing this is newList = oldList[:]


Use thing[:]

>>> a = [1,2]
>>> b = a[:]
>>> a += [3]
>>> a
[1, 2, 3]
>>> b
[1, 2]
>>> 

All of the other contributors gave great answers, which work when you have a single-dimension (single-level) list. However, of the methods mentioned so far, only copy.deepcopy() clones a multidimensional, nested list (a list of lists) without the copy still pointing at the nested list objects. While Felix Kling refers to it in his answer, there is a little bit more to the issue, and possibly a workaround using built-ins that might prove a faster alternative to deepcopy.

While new_list = old_list[:], copy.copy(old_list) and, for Py3k, old_list.copy() work for single-level lists, the copy they produce still points at the list objects nested within old_list, so changes to one of those nested list objects show up in both old_list and new_list.

Edit: New information brought to light

As was pointed out by both Aaron Hall and PM 2Ring using eval() is not only a bad idea, it is also much slower than copy.deepcopy().

This means that for multidimensional lists, the only option is copy.deepcopy(). With that being said, it really isn't an option as the performance goes way south when you try to use it on a moderately sized multidimensional array. I tried to timeit using a 42x42 array, not unheard of or even that large for bioinformatics applications, and I gave up on waiting for a response and just started typing my edit to this post.

It would seem that the only real option then is to initialize multiple lists and work on them independently. If anyone has any other suggestions for how to handle multidimensional list copying, they would be appreciated.
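One option not mentioned in this answer, sketched here under the assumption that the data is a list of lists whose innermost values are immutable (ints, floats, strings), is to slice-copy each row. It copies exactly two levels, so it is not a general replacement for deepcopy, but it tends to be considerably cheaper; that is worth measuring on your own data.

```python
grid = [[0] * 42 for _ in range(42)]   # 42x42 list of lists of ints

# Copy the outer list and each row. This is sufficient here because
# the innermost values (ints) are immutable.
grid_copy = [row[:] for row in grid]

grid_copy[0][0] = 1
print(grid[0][0])               # 0: the original row is untouched
print(grid_copy[0] is grid[0])  # False: each row is a separate list
```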

As others have stated, there can be significant performance issues using the copy module and copy.deepcopy for multidimensional lists. While trying to work out a different way of copying a multidimensional list without using deepcopy (I was working on a problem for a course that only allows 5 seconds for the entire algorithm to run in order to receive credit), I came up with a way of using built-in functions to make a copy of the nested list without the copies pointing at one another or at the list objects nested within them. I used eval() and repr() in the assignment to copy the old list into the new list without creating a link to the old list. It takes the form of:

new_list = eval(repr(old_list))

Basically what this does is make a representation of old_list as a string and then evaluates the string as if it were the object that the string represents. By doing this, no link to the original list object is made. A new list object is created and each variable points to its own independent object. Here is an example using a 2 dimensional nested list.

x, y = 4, 3  # example dimensions: 4 rows of 3 columns

old_list = [[0 for j in range(y)] for i in range(x)]  # initialize (x, y) nested list

# assign a copy of old_list to new_list without them pointing to the same list object
new_list = eval(repr(old_list))

# make a change to new_list
for j in range(y):
    for i in range(x):
        new_list[i][j] += 1

If you then check the contents of each list, for example a 4 by 3 list, Python will return

>>> new_list

[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]

>>> old_list

[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

While this probably isn't the canonical or syntactically correct way to do it, it seems to work well. I haven't tested performance, but I am going to guess that eval() and repr() will have less overhead to run than deepcopy will.

This won't always work, since there's no guarantee that the string returned by repr() is sufficient to re-create the object. Also, eval() is a tool of last resort; see Eval really is dangerous by SO veteran Ned Batchelder for details. So when you advocate the use of eval() you really should mention that it can be dangerous. – PM 2Ring Jul 10 '15 at 14:51
    
Fair point. Though I think that Batchelder's point is that having the eval() function in Python in general is a risk. It isn't so much whether or not you make use of the function in code but that it is a security hole in Python in and of itself. My example isn't using it with a function that receives input from input(), sys.argv, or even a text file. It is more along the lines of initializing a blank multidimensional list once, and then just having a way of copying it in a loop instead of reinitializing at each iteration of the loop. – AMR Jul 10 '15 at 16:41
    
As @AaronHall has pointed out, there is likely a significant performance issue to using new_list = eval(repr(old_list)), so besides it being a bad idea, it probably is also way too slow to work. – AMR Jul 10 '15 at 17:19

Unlike other languages, which have variables and values, Python has names and objects.

a=[1,2,3]

gives the list object the name "a", and

b=a

just gives the same object a new name "b", so whenever you do something with a, the object changes and therefore b changes.

The only way to make a real copy of a is to create a new object, as other answers have said.

You can see more about this here
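A short sketch of the name/object distinction, with made-up values. It also shows a case the description above glosses over: rebinding a name to a new object does not affect other names for the old one.

```python
a = [1, 2, 3]
b = a              # b is a second name for the same list

a.append(4)        # mutates the shared object
print(b)           # [1, 2, 3, 4]

a = a + [5]        # builds a new list and rebinds the name a
print(a)           # [1, 2, 3, 4, 5]
print(b)           # [1, 2, 3, 4]  -- b still names the old object
print(a is b)      # False
```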


Python 3.6.0 Timings

Here are the timing results using Python 3.6.0. Keep in mind these times are relative to one another, not absolute.

I stuck to only doing shallow copies, and also added some new methods that weren't possible in Python2, such as list.copy() (the Python3 slice equivalent) and list unpacking (*new_list, = list):

METHOD                  TIME TAKEN
b = a[:]                6.468942025996512   #Python2 winner
b = a.copy()            6.986593422974693   #Python3 "slice equivalent"
b = []; b.extend(a)     7.309216841997113
b = a[0:len(a)]         10.916740721993847
*b, = a                 11.046738261007704
b = list(a)             11.761539687984623
b = [i for i in a]      24.66165203397395
b = copy.copy(a)        30.853400873980718
b = []
for item in a:
  b.append(item)        48.19176080400939

We can see the old winner still comes out on top, but not really by a huge amount, considering the increased readability of the Python3 list.copy() approach.

Note that these methods do not output equivalent results for any input other than lists. They all work for sliceable objects, a few work for any iterable, but only copy.copy() works for any Python object.
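The note about non-list inputs can be checked with a small sketch. The tuple and generator here are arbitrary examples, not part of the benchmark.

```python
import copy

t = (1, 2, 3)

print(type(t[:]).__name__)           # tuple
print(type(list(t)).__name__)        # list
print(type(copy.copy(t)).__name__)   # tuple
*unpacked, = t
print(type(unpacked).__name__)       # list

gen = (i for i in range(3))
try:
    gen[:]                           # generators are iterable but not sliceable
except TypeError as exc:
    print("slice failed:", type(exc).__name__)
print(list(gen))                     # [0, 1, 2]
```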


Here is the testing code for interested parties (Template from here):

import timeit

COUNT = 50000000
print("Array duplicating. Tests run", COUNT, "times")
setup = 'a = [0,1,2,3,4,5,6,7,8,9]; import copy'

print("b = list(a)\t\t", timeit.timeit(stmt='b = list(a)', setup=setup, number=COUNT))
print("b = copy.copy(a)\t\t", timeit.timeit(stmt='b = copy.copy(a)', setup=setup, number=COUNT))
print("b = a.copy()\t\t", timeit.timeit(stmt='b = a.copy()', setup=setup, number=COUNT))
print("b = a[:]\t\t", timeit.timeit(stmt='b = a[:]', setup=setup, number=COUNT))
print("b = a[0:len(a)]\t", timeit.timeit(stmt='b = a[0:len(a)]', setup=setup, number=COUNT))
print("*b, = a\t", timeit.timeit(stmt='*b, = a', setup=setup, number=COUNT))
print("b = []; b.extend(a)\t", timeit.timeit(stmt='b = []; b.extend(a)', setup=setup, number=COUNT))
print("b = []\nfor item in a: b.append(item)\t", timeit.timeit(stmt='b = []\nfor item in a:  b.append(item)', setup=setup, number=COUNT))
print("b = [i for i in a]\t", timeit.timeit(stmt='b = [i for i in a]', setup=setup, number=COUNT))

Another method (that I feel is fairly readable) is to turn it into a string and then switch it back to a list. Note that this only reproduces the original list when every element is a single-character string.

new_list = list(''.join(my_list))
    
On my machine, using your syntax instead of slicing is about 35-40% slower. This is another way to do it, yes, but I think people are better off learning the slice syntax (it takes the same amount of time as learning the ''.join() syntax). It seems most of the difference comes from using list, anyway, instead of using a literal: [x for x in ''.join(my_list)] (this is arguably even less readable). – not_a_robot Jan 26 at 14:09
    
The question states what are the options. This is inefficient, but readable and not listed in other answers. I'm not deleting this answer. – JJFord3 Feb 7 at 14:44
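A brief sketch of where the string round trip does and does not reproduce the input; the sample lists are invented for illustration.

```python
chars = ['a', 'b', 'c']
print(list(''.join(chars)))    # ['a', 'b', 'c']  -- round-trips

words = ['ab', 'c']
print(list(''.join(words)))    # ['a', 'b', 'c']  -- not equal to words

numbers = [1, 2, 3]
try:
    list(''.join(numbers))     # str.join requires string items
except TypeError as exc:
    print("join failed:", type(exc).__name__)
```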
