Golang 101 hacks (12) — Reallocating underlying array of slice

When you append data to a slice and the underlying array of the slice doesn’t have enough space, a new array is allocated. The elements of the old array are copied into this new memory, and the new data is added after them. So when using Golang’s built-in append function, always keep in mind that “the array may have been changed”, and be very careful about it; otherwise, it may bite you!

Let me explain it through a contrived example:

package main

import (
    "fmt"
)

func addTail(s []int)  {
    var ns [][]int
    for _, v := range []int{1, 2} {
        ns = append(ns, append(s, v))
    }
    fmt.Println(ns)
}

func main() {
    s1 := []int{0, 0}
    s2 := append(s1, 0)

    for _, v := range [][]int{s1, s2} {
        addTail(v)
    }
}

s1 is [0, 0] and s2 is [0, 0, 0]. In the addTail function, I want to add 1 and 2 to the end of the slice, so the expected output is:

[[0 0 1] [0 0 2]]
[[0 0 0 1] [0 0 0 2]]

But the actual result is:

[[0 0 1] [0 0 2]]
[[0 0 0 2] [0 0 0 2]]

The operations on s1 succeed, but those on s2 do not.

Let’s use delve to debug this issue and check the internal mechanism of the slice. Add a breakpoint on the addTail function; it is first hit when processing s1:

(dlv) n
> main.addTail() ./slice.go:8 (PC: 0x401022)
     3: import (
     4:         "fmt"
     5: )
     6:
     7: func addTail(s []int)  {
=>   8:         var ns [][]int
     9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
    12:         fmt.Println(ns)
    13: }
(dlv) p s
[]int len: 2, cap: 2, [0,0]
(dlv) p &s[0]
(*int)(0xc82000a2a0)

The length and capacity of s1 are both 2, and the underlying array address is 0xc82000a2a0. So what happens when the following statement is executed?

ns = append(ns, append(s, v))

Since the length and capacity of s1 are both 2, there is no room for a new element. To append a new value, a new array must be allocated, and it contains both [0, 0] from s1 and the new value (1 or 2). You can think of append(s, v) as producing an anonymous new slice, which is then appended to ns. We can check this after running “ns = append(ns, append(s, v))”:

(dlv) n
> main.addTail() ./slice.go:9 (PC: 0x401217)
     4:         "fmt"
     5: )
     6:
     7: func addTail(s []int)  {
     8:         var ns [][]int
=>   9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
    12:         fmt.Println(ns)
    13: }
    14:
(dlv) p ns
[][]int len: 1, cap: 1, [
        [0,0,1],
]
(dlv) p ns[0]
[]int len: 3, cap: 4, [0,0,1]
(dlv) p &ns[0][0]
(*int)(0xc82000e240)
(dlv) p s
[]int len: 2, cap: 2, [0,0]
(dlv) p &s[0]
(*int)(0xc82000a2a0)

We can see that the length of the anonymous slice is 3, its capacity is 4, and its underlying array address is 0xc82000e240, different from s1’s (0xc82000a2a0). Continue executing until the loop exits:

(dlv) n
> main.addTail() ./slice.go:12 (PC: 0x40124c)
     7: func addTail(s []int)  {
     8:         var ns [][]int
     9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
=>  12:         fmt.Println(ns)
    13: }
    14:
    15: func main() {
    16:         s1 := []int{0, 0}
    17:         s2 := append(s1, 0)
(dlv) p ns
[][]int len: 2, cap: 2, [
        [0,0,1],
        [0,0,2],
]
(dlv) p &ns[0][0]
(*int)(0xc82000e240)
(dlv) p &ns[1][0]
(*int)(0xc82000e280)
(dlv) p &s[0]
(*int)(0xc82000a2a0)

We can see that s1, ns[0] and ns[1] have 3 independent arrays.

Now, let’s follow the same steps to check what happens with s2:

(dlv) n
> main.addTail() ./slice.go:8 (PC: 0x401022)
     3: import (
     4:         "fmt"
     5: )
     6:
     7: func addTail(s []int)  {
=>   8:         var ns [][]int
     9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
    12:         fmt.Println(ns)
    13: }
(dlv) p s
[]int len: 3, cap: 4, [0,0,0]
(dlv) p &s[0]
(*int)(0xc82000e220)

The length of s2 is 3 and its capacity is 4, so there is one free slot for a new element. Check the values of s2 and ns after executing “ns = append(ns, append(s, v))” the first time:

(dlv)
> main.addTail() ./slice.go:9 (PC: 0x401217)
     4:         "fmt"
     5: )
     6:
     7: func addTail(s []int)  {
     8:         var ns [][]int
=>   9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
    12:         fmt.Println(ns)
    13: }
    14:
(dlv) p ns
[][]int len: 1, cap: 1, [
        [0,0,0,1],
]
(dlv) p &ns[0][0]
(*int)(0xc82000e220)
(dlv) p s
[]int len: 3, cap: 4, [0,0,0]
(dlv) p &s[0]
(*int)(0xc82000e220)

We can see that the new anonymous slice’s array address is also 0xc82000e220. That’s because s2 has enough space to hold the new value, so no new array is allocated. Check s2 and ns again after adding 2:

(dlv)
> main.addTail() ./slice.go:12 (PC: 0x40124c)
     7: func addTail(s []int)  {
     8:         var ns [][]int
     9:         for _, v := range []int{1, 2} {
    10:                 ns = append(ns, append(s, v))
    11:         }
=>  12:         fmt.Println(ns)
    13: }
    14:
    15: func main() {
    16:         s1 := []int{0, 0}
    17:         s2 := append(s1, 0)
(dlv) p ns
[][]int len: 2, cap: 2, [
        [0,0,0,2],
        [0,0,0,2],
]
(dlv) p &ns[0][0]
(*int)(0xc82000e220)
(dlv) p &ns[1][0]
(*int)(0xc82000e220)
(dlv) p s
[]int len: 3, cap: 4, [0,0,0]
(dlv) p &s[0]
(*int)(0xc82000e220)

ns[0], ns[1] and s2 all point to the same array, so the later value (2) overwrites the earlier one (1).
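
The same observation can be made without a debugger by comparing element addresses. The following is only a minimal sketch: sameArray is a hypothetical helper that is not part of the original program, and s2 is built with make here so that its capacity (4) is fixed explicitly instead of being left to the runtime.

```go
package main

import "fmt"

// sameArray is a hypothetical helper: it reports whether two non-empty
// slices begin at the same address, i.e. share one underlying array.
func sameArray(a, b []int) bool {
	return &a[0] == &b[0]
}

func main() {
	s1 := []int{0, 0}       // len 2, cap 2: no free slot
	s2 := make([]int, 3, 4) // len 3, cap 4: one free slot

	for _, s := range [][]int{s1, s2} {
		a := append(s, 1)
		b := append(s, 2)
		// First value: does a share s's array? Second: do a and b share one?
		fmt.Println(sameArray(a, s), sameArray(a, b), a, b)
	}
}
```

For s1 both checks are false and the two results differ; for s2 both checks are true and both results read [0 0 0 2].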

In conclusion, append is tricky since it can modify the underlying array without notifying you. You must clearly understand the memory layout behind every slice, or the slice can give you a big, unwanted surprise!
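
One possible way to get the wanted output, shown here only as a sketch, is to limit the capacity of s with a full slice expression before appending, so that every append has to allocate its own array. This variant of addTail takes the values as parameters and returns the result instead of printing it; that signature is my assumption for illustration, not the original code.

```go
package main

import "fmt"

// addTail returns one new slice per value in vals: s followed by that value.
// s[:len(s):len(s)] sets the capacity equal to the length, so append cannot
// write into spare room that is shared with s and must allocate a new array.
func addTail(s []int, vals ...int) [][]int {
	var ns [][]int
	for _, v := range vals {
		ns = append(ns, append(s[:len(s):len(s)], v))
	}
	return ns
}

func main() {
	s1 := []int{0, 0}
	s2 := append(s1, 0)

	for _, s := range [][]int{s1, s2} {
		fmt.Println(addTail(s, 1, 2))
	}
}
```

The price is one allocation and copy per append, which is what the contrived example needs anyway, since each result must own its data.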

Golang 101 hacks (11) — Two-dimensional slice

Golang supports multi-dimensional slices, but I only want to introduce the two-dimensional slice here. One reason is that two-dimensional slices are common in everyday code, while higher dimensions are rare. If you often use slices of more than two dimensions, I personally think the code becomes a little clumsy and hard to maintain, so it may be worth checking whether there is a better approach. The other reason is that the principle behind a multi-dimensional slice is the same as for a two-dimensional one: if you know two-dimensional slices well, you can understand the rest too.

Let’s look at the following example:

package main

import "fmt"

func main() {
    s := make([][]int, 2)
    fmt.Println(len(s), cap(s), &s[0])

    s[0] = []int{1, 2, 3}
    fmt.Println(len(s[0]), cap(s[0]), &s[0][0])

    s[1] = make([]int, 3, 5)
    fmt.Println(len(s[1]), cap(s[1]), &s[1][0])
}

I again use gdb to inspect the execution flow:

5       func main() {
(gdb) n
6               s := make([][]int, 2)
(gdb)
7               fmt.Println(len(s), cap(s), &s[0])
(gdb)
2 2 &[]
9               s[0] = []int{1, 2, 3}
(gdb) p &s
$1 = (struct [][]int *) 0xc82003fe70
(gdb) x/24xb 0xc82003fe70
0xc82003fe70:   0x40    0x02    0x01    0x20    0xc8    0x00    0x00    0x00
0xc82003fe78:   0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82003fe80:   0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00

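s is a slice whose header is stored at 0xc82003fe70: the pointer to its underlying array (0xc820010240), followed by the length (2) and the capacity (2). Its elements are also slices. Let’s check the elements at 0xc820010240: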

(gdb) x/48xb 0xc820010240
0xc820010240:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010248:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010250:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010258:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010260:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010268:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

All the memory contents are 0, nothing exciting! Continue stepping:

(gdb) n
10              fmt.Println(len(s[0]), cap(s[0]), &s[0][0])
(gdb)
3 3 0xc82000e220
12              s[1] = make([]int, 3, 5)

Now that s contains a valid slice element, check its underlying array:

(gdb) x/48xb 0xc820010240
0xc820010240:   0x20    0xe2    0x00    0x20    0xc8    0x00    0x00    0x00
0xc820010248:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010250:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010258:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010260:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010268:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

Yes, the memory now holds the pointer, length and capacity of s[0], matching the previous output from fmt.Println. Check the underlying array of s[0]:

(gdb) x/24xb 0xc82000e220
0xc82000e220:   0x01    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82000e228:   0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82000e230:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00

We can see 3 elements: 1, 2, 3.

Follow the same method to check s[1]:

(gdb) n
13              fmt.Println(len(s[1]), cap(s[1]), &s[1][0])
(gdb)
3 5 0xc820010270
14      }
(gdb) x/48xb 0xc820010240
0xc820010240:   0x20    0xe2    0x00    0x20    0xc8    0x00    0x00    0x00
0xc820010248:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010250:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010258:   0x70    0x02    0x01    0x20    0xc8    0x00    0x00    0x00
0xc820010260:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010268:   0x05    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb) x/40xb 0xc820010270
0xc820010270:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010278:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010280:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010288:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc820010290:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

Now we can see that s contains all the information about its slice elements (pointer, length and capacity of each), and the elements of s[1] are initialized to 0.
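
In everyday code a two-dimensional slice is usually built row by row in a loop. The sketch below assumes a hypothetical helper named newGrid; it only illustrates that the outer slice stores one slice header per row and that every row owns a separate underlying array.

```go
package main

import "fmt"

// newGrid is a hypothetical helper that builds a rows x cols slice.
// The outer slice holds rows slice headers; each make call gives a row
// its own zero-initialized underlying array.
func newGrid(rows, cols int) [][]int {
	grid := make([][]int, rows)
	for i := range grid {
		grid[i] = make([]int, cols)
	}
	return grid
}

func main() {
	g := newGrid(2, 3)
	g[0][1] = 7 // writes into the first row's array only
	fmt.Println(g)
	fmt.Println(len(g), len(g[0]), len(g[1]))
}
```

Because the rows are independent slices, they do not have to be the same length; a row can be replaced or appended to without touching the others.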

Golang 101 hacks (10) — Pass slice as a function argument

In Golang, function parameters are passed by value. When a slice is used as a function argument, this means the function gets a copy of the slice: a pointer to the starting address of the underlying array, together with the length and capacity of the slice. Oh boy! Since you know the address of the memory that stores the data, you can now tweak the slice. Let’s see the following example:

package main

import (
    "fmt"
)

func modifyValue(s []int)  {
    s[1] = 3
    fmt.Printf("In modifyValue: s is %v\n", s)
}
func main() {
    s := []int{1, 2}
    fmt.Printf("In main, before modifyValue: s is %v\n", s)
    modifyValue(s)
    fmt.Printf("In main, after modifyValue: s is %v\n", s)
}

The result is here:

In main, before modifyValue: s is [1 2]
In modifyValue: s is [1 3]
In main, after modifyValue: s is [1 3]

As you can see, after running the modifyValue function, the content of slice s is changed. Although modifyValue only gets a copy of the memory address of the slice’s underlying array, that is enough!

See another example:

package main

import (
    "fmt"
)

func addValue(s []int) {
    s = append(s, 3)
    fmt.Printf("In addValue: s is %v\n", s)
}

func main() {
    s := []int{1, 2}
    fmt.Printf("In main, before addValue: s is %v\n", s)
    addValue(s)
    fmt.Printf("In main, after addValue: s is %v\n", s)
}

The result is like this:

In main, before addValue: s is [1 2]
In addValue: s is [1 2 3]
In main, after addValue: s is [1 2]

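This time, the addValue function has no effect on the s slice in the main function. That’s because it only manipulates its copy of s, not the “real” s: the result of append is assigned to the local copy, so the length seen by main never changes. Here the capacity of s is only 2, so the appended element even lands in a newly allocated array.

So if you really want the function to change the slice itself, you can pass the address of the slice: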
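
The copy also matters when the slice has spare capacity. In the following sketch (my own variation of the example, with s created by make so that its capacity is 4), append does not reallocate: the new element is written into the shared array, yet main still cannot see it through s, because the length stored in its own copy stays 2.

```go
package main

import "fmt"

// addValue appends to its private copy of the slice header.
func addValue(s []int) {
	s = append(s, 3) // only the local copy gets the new length
}

func main() {
	s := make([]int, 2, 4) // spare capacity: append will not reallocate
	s[0], s[1] = 1, 2

	addValue(s)
	fmt.Println(s, len(s)) // [1 2] 2
	fmt.Println(s[:3])     // [1 2 3]: the element is in the shared array
}
```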

package main

import (
    "fmt"
)

func addValue(s *[]int) {
    *s = append(*s, 3)
    fmt.Printf("In addValue: s is %v\n", s)
}

func main() {
    s := []int{1, 2}
    fmt.Printf("In main, before addValue: s is %v\n", s)
    addValue(&s)
    fmt.Printf("In main, after addValue: s is %v\n", s)
}

The result is like this:

In main, before addValue: s is [1 2]
In addValue: s is &[1 2 3]
In main, after addValue: s is [1 2 3]
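
Another common option, sketched here as an alternative rather than as the method of this post, is to return the new slice and let the caller assign it, which is the same contract the built-in append uses.

```go
package main

import "fmt"

// addValue returns the (possibly reallocated) slice, the way append does.
func addValue(s []int) []int {
	return append(s, 3)
}

func main() {
	s := []int{1, 2}
	s = addValue(s) // the caller stores the returned slice
	fmt.Println(s)  // [1 2 3]
}
```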

Golang 101 hacks (9) — The internals of slice

A slice has 3 components:
a) pointer: points to the start position of the slice in the underlying array;
b) length (type is int): the number of valid elements in the slice;
c) capacity (type is int): the total number of slots available to the slice, counted from the pointer.

Check the following code:

package main

import (
    "fmt"
    "unsafe"
)

func main() {
    var s1 []int
    fmt.Println(unsafe.Sizeof(s1))
}

The result is 24 on my 64-bit system (the pointer and the two ints each occupy 8 bytes).

In the next example, I will use gdb to poke at the internals of a slice. The code is like this:
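
The three components can also be read from Go code itself. The sketch below declares a hypothetical struct, sliceHeader, and reinterprets a slice variable through unsafe; it assumes the pointer/length/capacity layout described above, which is an implementation detail of the standard compiler and not something the language guarantees.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader is a hypothetical mirror of the three components above.
// Assumption: the compiler lays out a slice as pointer, length, capacity.
type sliceHeader struct {
	Data unsafe.Pointer
	Len  int
	Cap  int
}

// headerOf reinterprets the memory of the slice variable as a sliceHeader.
func headerOf(s *[]int) sliceHeader {
	return *(*sliceHeader)(unsafe.Pointer(s))
}

func main() {
	s := make([]int, 3, 5)
	h := headerOf(&s)

	fmt.Println(h.Len, h.Cap)                        // 3 5
	fmt.Println(h.Data == unsafe.Pointer(&s[0]))     // true
	fmt.Println(unsafe.Sizeof(h) == unsafe.Sizeof(s)) // true
}
```

This is for inspection only; real code should stick to len, cap and indexing.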

package main

import "fmt"

func main() {
        s1 := make([]int, 3, 5)
        copy(s1, []int{1, 2, 3})
        fmt.Println(len(s1), cap(s1), &s1[0])

        s1 = append(s1, 4)
        fmt.Println(len(s1), cap(s1), &s1[0])

        s2 := s1[1:]
        fmt.Println(len(s2), cap(s2), &s2[0])
}

Use gdb to step into the code:

5       func main() {
(gdb) n
6               s1 := make([]int, 3, 5)
(gdb)
7               copy(s1, []int{1, 2, 3})
(gdb)
8               fmt.Println(len(s1), cap(s1), &s1[0])
(gdb)
3 5 0xc820010240

Before executing “s1 = append(s1, 4)”, fmt.Println outputs the length (3), the capacity (5) and the starting element address (0xc820010240) of the slice. Let’s check the memory layout of s1:

10              s1 = append(s1, 4)
(gdb) p &s1
$1 = (struct []int *) 0xc82003fe40
(gdb) x/24xb 0xc82003fe40
0xc82003fe40:   0x40    0x02    0x01    0x20    0xc8    0x00    0x00    0x00
0xc82003fe48:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82003fe50:   0x05    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb)

Examining the memory of s1 (its slice header starts at 0xc82003fe40), we can see that its content matches the output of fmt.Println.

Continue executing, and check the result before “s2 := s1[1:]”:

(gdb) n
11              fmt.Println(len(s1), cap(s1), &s1[0])
(gdb)
4 5 0xc820010240
13              s2 := s1[1:]
(gdb) x/24xb 0xc82003fe40
0xc82003fe40:   0x40    0x02    0x01    0x20    0xc8    0x00    0x00    0x00
0xc82003fe48:   0x04    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82003fe50:   0x05    0x00    0x00    0x00    0x00    0x00    0x00    0x00

We can see that after appending a new element (s1 = append(s1, 4)), the length of s1 changes to 4, but the capacity keeps its original value.

Let’s check the internals of s2:

(gdb) n
14              fmt.Println(len(s2), cap(s2), &s2[0])
(gdb)
3 4 0xc820010248
15      }
(gdb) p &s2
$3 = (struct []int *) 0xc82003fe28
(gdb) x/24hb 0xc82003fe28
0xc82003fe28:   0x48    0x02    0x01    0x20    0xc8    0x00    0x00    0x00
0xc82003fe30:   0x03    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xc82003fe38:   0x04    0x00    0x00    0x00    0x00    0x00    0x00    0x00

The element start address of s2 is 0xc820010248, which is the address of the second element of s1 (8 bytes after s1’s first element at 0xc820010240), and the length (3) and capacity (4) are both one less than their counterparts in s1 (4 and 5 respectively).
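
The relationship between s1 and s2 can be confirmed in plain Go as well. This is a small sketch with a hypothetical helper, reslice, which returns s[from:] together with a flag telling whether its first element is the very same memory location as s[from].

```go
package main

import "fmt"

// reslice is a hypothetical helper: it returns s[from:] and reports
// whether the result starts at the address of s[from].
func reslice(s []int, from int) ([]int, bool) {
	t := s[from:]
	return t, &t[0] == &s[from]
}

func main() {
	s1 := make([]int, 3, 5)
	copy(s1, []int{1, 2, 3})
	s1 = append(s1, 4)

	s2, same := reslice(s1, 1)
	fmt.Println(len(s1), cap(s1)) // 4 5
	fmt.Println(len(s2), cap(s2)) // 3 4
	fmt.Println(same)             // true

	s2[0] = 42          // s2[0] and s1[1] are the same slot
	fmt.Println(s1[1]) // 42
}
```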

Golang 101 hacks (8) — String

In Golang, a string is an immutable sequence of bytes. Once it is created, we can’t change its value. E.g.:

package main

func main()  {
    s := "Hello"
    s[0] = 'h'
}

The compiler will complain:

cannot assign to s[0]

To modify the content of a string, you can convert it to a byte slice. But in fact, you are not operating on the original string, just on a copy:

package main

import "fmt"

func main()  {
    s := "Hello"
    b := []byte(s)
    b[0] = 'h'
    fmt.Printf("%s\n", b)
}

The result is like this:

hello

Since Golang uses UTF-8 encoding, you must remember that the len function returns the number of bytes in a string, not the number of characters:

package main

import "fmt"

func main()  {
    s := "日志log"
    fmt.Println(len(s))
}

The result is:

9

Because each Chinese character occupies 3 bytes in UTF-8, s in the above example contains 5 characters and 9 bytes.

If you want to access every character, a for ... range loop can help:
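
When the character count itself is needed, the standard unicode/utf8 package provides RuneCountInString. A minimal sketch follows; counts is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper returning the byte count and the
// character (rune) count of s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	nBytes, nRunes := counts("日志log")
	fmt.Println(nBytes, nRunes) // 9 5
}
```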

package main
import "fmt"

func main() {
    s := "日志log"
    for index, runeValue := range s {
        fmt.Printf("%#U starts at byte position %d\n", runeValue, index)
    }
}

The result is:

U+65E5 '日' starts at byte position 0
U+5FD7 '志' starts at byte position 3
U+006C 'l' starts at byte position 6
U+006F 'o' starts at byte position 7
U+0067 'g' starts at byte position 8
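
Indexing behaves differently from ranging: s[i] yields a single byte, not a character. If you need the n-th character, one simple (though allocating) approach is to convert the string to []rune first. The sketch below uses a hypothetical helper, runeAt, to show the difference.

```go
package main

import "fmt"

// runeAt is a hypothetical helper returning the n-th character of s.
// Converting to []rune decodes the whole string, so it costs a copy.
func runeAt(s string, n int) rune {
	return []rune(s)[n]
}

func main() {
	s := "日志log"
	fmt.Printf("%#x\n", s[0])        // 0xe6: first byte of the encoding of 日
	fmt.Printf("%c\n", runeAt(s, 0)) // 日
	fmt.Printf("%c\n", runeAt(s, 2)) // l
}
```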

Reference:
Strings, bytes, runes and characters in Go;
The Go Programming Language.