Robust generic functions on slices - The Go Programming Language

nội dung

The slices package provides functions that work for slices of any type. In this blog post we’ll discuss how you can use these functions more effectively by understanding how slices are represented in memory and how that affects the garbage collector, and we’ll cover how we recently adjusted these functions to make them less surprising.

With Type parameters we can write functions like slices.Index once for all types of slices of comparable elements:

// Index returns the index of the first occurrence of v in s,
// or -1 if not present.
func Index[S ~[]E, E comparable](s S, v E) int {
    for i := range s {
        if v == s[i] {
            return i
        }
    }
    return -1
}

It is no longer necessary to implement Index again for each different type of element.

The slices package contains many such helpers to perform common operations on slices:

    s := []string{"Bat", "Fox", "Owl", "Fox"}
    s2 := slices.Clone(s)
    slices.Sort(s2)
    fmt.Println(s2) // [Bat Fox Fox Owl]
    s2 = slices.Compact(s2)
    fmt.Println(s2)                  // [Bat Fox Owl]
    fmt.Println(slices.Equal(s, s2)) // false

Several new functions (Insert, Replace, Delete, etc.) modify the slice. To understand how they work, and how to properly use them, we need to examine the underlying structure of slices.

A slice is a view of a portion of an array. Internally, a slice contains a pointer, a length, and a capacity. Two slices can have the same underlying array, and can view overlapping portions.

For example, this slice s is a view on 4 elements of an array of size 6:

If a function changes the length of a slice passed as a parameter, then it needs to return a new slice to the caller. The underlying array may remain the same if it doesn’t have to grow. This explains why append and slices.Compact return a value, but slices.Sort, which merely reorders the elements, does not.

Consider the task of deleting a portion of a slice. Prior to generics, the standard way to delete the portion s[2:5] from the slice s was to call the append function to copy the end portion over the middle portion:

s = append(s[:2], s[5:]...)

The syntax was complex and error-prone, involving subslices and a variadic parameter. We added slice.Delete to make it easier to delete elements:

func Delete[S ~[]E, E any](s S, i, j int) S {
       return append(s[:i], s[j:]...)
}

The one-line function Delete more clearly expresses the programmer’s intent. Let’s consider a slice s of length 6 and capacity 8, containing pointers:

This call deletes the elements at s[2], s[3], s[4] from the slice s:

s = slices.Delete(s, 2, 5)

The gap at the indices 2, 3, 4 is filled by shifting the element s[5] to the left, and setting the new length to 3.

Delete need not allocate a new array, as it shifts the elements in place. Like append, it returns a new slice. Many other functions in the slices package follow this pattern, including Compact, CompactFunc, DeleteFunc, Grow, Insert, and Replace.

When calling these functions we must consider the original slice invalid, because the underlying array has been modified. It would be a mistake to call the function but ignore the return value:

    slices.Delete(s, 2, 5) // incorrect!
    // s still has the same length, but modified contents

A problem of unwanted liveness

Before Go 1.22, slices.Delete didn’t modify the elements between the new and original lengths of the slice. While the returned slice wouldn’t include these elements, the “gap” created at the end of the original, now-invalidated slice continued to hold onto them. These elements could contain pointers to large objects (a 20MB image), and the garbage collector would not release the memory associated with these objects. This resulted in a memory leak that could lead to significant performance issues.

In this above example, we’re successfully deleting the pointers p2, p3, p4 from s[2:5], by shifting one element to the left. But p3 and p4 are still present in the underlying array, beyond the new length of s. The garbage collector won’t reclaim them. Less obviously, p5 is not one of the deleted elements, but its memory may still leak because of the p5 pointer kept in the gray part of the array.

This could be confusing for developers, if they were not aware that “invisible” elements were still using memory.

So we had two options:

  • Either keep the efficient implementation of Delete. Let users set obsolete pointers to nil themselves, if they want to make sure the values pointed to can be freed.
  • Or change Delete to always set the obsolete elements to zero. This is extra work, making Delete slightly less efficient. Zeroing pointers (setting them to nil) enables the garbage collection of the objects, when they become otherwise unreachable.

It was not obvious which option was best. The first one provided performance by default, and the second one provided memory frugality by default.

The fix

A key observation is that “setting the obsolete pointers to nil” is not as easy as it seems. In fact, this task is so error-prone that we should not put the burden on the user to write it. Out of pragmatism, we chose to modify the implementation of the five functions Compact, CompactFunc, Delete, DeleteFunc, Replace to “clear the tail”. As a nice side effect, the cognitive load is reduced and users now don’t need to worry about these memory leaks.

In Go 1.22, this is what the memory looks like after calling Delete:

The code changed in the five functions uses the new built-in function clear (Go 1.21) to set the obsolete elements to the zero value of the element type of s:

The zero value of E is nil when E is a type of pointer, slice, map, chan, or interface.

Tests failing

This change has led to some tests that passed in Go 1.21 now failing in Go 1.22, when the slices functions are used incorrectly. This is good news. When you have a bug, tests should let you know.

If you ignore the return value of Delete:

slices.Delete(s, 2, 3)  // !! INCORRECT !!

then you may incorrectly assume that s does not contain any nil pointer. Example in the Go Playground.

If you ignore the return value of Compact:

slices.Sort(s) // correct
slices.Compact(s) // !! INCORRECT !!

then you may incorrectly assume that s is properly sorted and compacted. Example.

If you assign the return value of Delete to another variable, and keep using the original slice:

u := slices.Delete(s, 2, 3)  // !! INCORRECT, if you keep using s !!

then you may incorrectly assume that s does not contain any nil pointer. Example.

If you accidentally shadow the slice variable, and keep using the original slice:

s := slices.Delete(s, 2, 3)  // !! INCORRECT, using := instead of = !!

then you may incorrectly assume that s does not contain any nil pointer. Example.

Conclusion

The API of the slices package is a net improvement over the traditional pre-generics syntax to delete or insert elements.

We encourage developers to use the new functions, while avoiding the “gotchas” listed above.

Thanks to the recent changes in the implementation, a class of memory leaks is automatically avoided, without any change to the API, and with no extra work for the developers.

Further reading

The signature of the functions in the slices package is heavily influenced by the specifics of the representation of slices in memory. We recommend reading

The original proposal about zeroing obsolete elements contains many details and comments.

Tóm tắt
The slices package in Go provides functions that work for slices of any type, allowing for more efficient use by understanding how slices are represented in memory. With type parameters, functions like Index can be written once for all types of slices of comparable elements. The package contains helpers for common operations on slices like Clone, Sort, and Compact. New functions like Insert, Replace, and Delete modify slices, requiring a new slice to be returned if the length changes. Prior to Go 1.22, the Delete function did not clear obsolete pointers, leading to memory leaks. To address this, functions in the slices package were modified to clear obsolete elements, reducing the burden on users. This change has caused some tests to fail when functions are used incorrectly, highlighting the importance of proper usage. Overall, the slices package API is an improvement over traditional syntax, and developers are encouraged to use the new functions while being mindful of potential issues.