Description
I propose a new runtime.TaintOSThread function. This would mark the current thread that the calling goroutine is running on as being "tainted": not safe for use by other goroutines. When the goroutine exits or the goroutine is unpinned from the (previously pinned) OS thread, the thread is killed.
This is needed when goroutines are pinned to OS threads and the state of the thread is modified in a way that is unsafe for other goroutines, such as unsharing and then modifying one of the Linux namespaces. Without this change, it is impossible to clean up OS threads: the pinning goroutines and OS threads need to stay until the programme exits, leading to unfixable goroutine and OS thread leaks.
The expected calling pattern is:
runtime.LockOSThread()
runtime.TaintOSThread()
syscall.Unshare(syscall.CLONE_NEWNS) // Example mutation of OS thread state.
Activity
bradfitz commented on May 18, 2017
Could you instead just LockOSThread + Unshare + then kill that thread yourself?
rgooch commented on May 18, 2017
How can I kill the thread safely? I don't see a function in the runtime package to do this. If I can safely kill the thread on which a goroutine is running, then that would work too, I just didn't think that would be as palatable as marking a thread as tainted.
There's actually no need to unshare (and that wouldn't help anyway, since the new namespace inherits from its ancestor).
bradfitz commented on May 18, 2017
syscall.Syscall(syscall.SYS_EXIT, 0, 0, 0)? Haven't tried it, though.
bradfitz commented on May 18, 2017
Works for me:
bradfitz commented on May 18, 2017
Or, better: (showing it actually went away)
randall77 commented on May 18, 2017
I worry that the runtime would be upset at an OS thread permanently vanishing. The runtime may want notification to ensure that anything the m is holding is freed first.
Maybe the existing enterSysCall already does everything that might matter?
At the very least, the allm list will grow without bound.
rgooch commented on May 18, 2017
Yes, that was one of my worries too. Can we get some kind of guidance on whether it's considered safe to kill a thread? What happens to the goroutine that's running on it?
randall77 commented on May 18, 2017
@aclements
ianlancetaylor commented on May 18, 2017
I do think we need to let the runtime know that the thread can not be used. The simple approach is
That will ensure that nothing else uses the thread, but it won't actually kill the thread. It would be an acceptable approach if you only need to do this a few times, but of course would be unsatisfactory if you need to do it continually.
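A minimal sketch of this lock-and-never-release approach, not necessarily the snippet originally posted. The helper name runAndQuarantine and the work callback are hypothetical. The goroutine locks itself to its thread, does its work, and then parks forever, so the runtime never schedules anything else on that thread:

```go
package main

import (
	"fmt"
	"runtime"
)

// runAndQuarantine (hypothetical helper) runs work on a goroutine that is
// locked to its OS thread and then parks that goroutine forever. Because the
// goroutine never unlocks and never exits, no other goroutine can be
// scheduled onto the thread. The thread itself is NOT killed; it stays
// reserved until the program exits.
// The returned channel is closed once work has finished.
func runAndQuarantine(work func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		runtime.LockOSThread()
		work() // e.g. unshare a namespace and modify it
		close(done)
		select {} // park forever while still locked to the thread
	}()
	return done
}

func main() {
	done := runAndQuarantine(func() {
		fmt.Println("mutating thread state (placeholder)")
	})
	<-done
	fmt.Println("work finished; the thread stays reserved until exit")
}
```

Each call costs one parked goroutine and one OS thread for the life of the process, which is why this only suits programs that need it a handful of times.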
Another approach would be to go through C. Have C code create a new thread, which could then call into Go. That gives you a thread that the Go runtime won't use itself, so you can simply call pthread_exit when done. Of course this is not ideal if you otherwise have a pure Go program.
rgooch commented on May 18, 2017
All my Go code so far is pure Go. I was hoping to keep it that way.
It seems to me that we need some way to cleanly kill a thread. The hack shown above is, well, a hack :-)
ianlancetaylor commented on May 18, 2017
Is runtime.ThreadExit sufficient for your purposes? I'm thinking of a function that would simply cause the thread to exit and not return. The goroutine could continue running on a different thread.
I'll note that I appreciate that you need this, but it seems awfully special purpose.
rgooch commented on May 18, 2017
Yes, runtime.ThreadExit would work fine. I'm not special, I'm just ahead of the curve :-)
bradfitz commented on May 18, 2017
@ianlancetaylor, or instead of introducing new API: just consider all threads "tainted" if they ever used LockOSThread and when the goroutine exits (implicitly or via the existing runtime.Goexit()), then kill the thread.
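For readers on Go 1.10 or later, the runtime.LockOSThread documentation states that if the calling goroutine exits without unlocking the thread, the thread will be terminated, which matches the idea above. A minimal sketch of relying on that follows; runOnDisposableThread and errDemo are hypothetical names used only for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// errDemo is a placeholder error used only by this example.
var errDemo = errors.New("demo failure")

// runOnDisposableThread (hypothetical helper) runs fn on a goroutine that is
// locked to its OS thread and returns fn's error. The goroutine deliberately
// never calls runtime.UnlockOSThread, so when it exits the runtime discards
// the thread instead of reusing it for other goroutines (documented
// behaviour since Go 1.10).
func runOnDisposableThread(fn func() error) error {
	errc := make(chan error, 1)
	go func() {
		runtime.LockOSThread()
		// fn may change per-thread state, e.g. unshare a namespace.
		errc <- fn()
		// No UnlockOSThread: returning here ends the goroutine and,
		// with it, the thread it was locked to.
	}()
	return <-errc
}

func main() {
	fmt.Println("first call:", runOnDisposableThread(func() error { return nil }))
	fmt.Println("second call:", runOnDisposableThread(func() error { return errDemo }))
}
```

The caller only sees the result. Any state that fn changed on the thread goes away with the thread and is not handed to an unrelated goroutine.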
gopherbot commented on Oct 6, 2017
Change https://golang.org/cl/68750 mentions this issue:
runtime: rename sched.mcount to sched.mnext
runtime: replace sched.mcount int32 with sched.mnext int64
runtime: make it possible to exit Go-created threads
gopherbot commented on Jan 3, 2018
Change https://golang.org/cl/85662 mentions this issue:
runtime: remove special handling of g0 stack
rgooch commented on Jan 27, 2018
@aclements: I'm reading the release notes for go1.10rc1 about locked OS threads. This part in particular has me concerned:
I have code which locks the thread, unshares the mount namespace, mounts a file-system (which is now visible only in that thread) and calls os/exec to run commands which need access to the mounted file-system. The code relies on the forked process being in the same mount namespace as the thread.
Will this code break with the new behaviour? It's unclear from the release notes.
aclements commented on Jan 27, 2018
Hi @rgooch. This change is specifically to make things like that better. Prior to this change, random goroutines could wind up running in your new mount namespace because the runtime spawned new internal threads off your locked thread. Now that will no longer happen. But this doesn't affect the behavior of os/exec.
Maybe I misunderstand your concern? If the release notes aren't clear, I'd like to understand how to improve them. Is your specific interpretation of the release notes that this would affect the fork performed by os/exec because that, in some sense, creates a new thread? What we really mean is that the runtime will never clone another thread off of a locked thread for the purposes of scheduling goroutines.
rgooch commented on Jan 27, 2018
Hi, @aclements. Correct, the release notes are not clear to me. From your explanation above, it seems that when using os/exec, the thread state of the calling goroutine is inherited, regardless of whether the goroutine called runtime.LockOSThread (that's the behaviour I rely on). It would help if this were made clear.
aclements commented on Jan 27, 2018
Yes, with os/exec, the thread state of the caller is inherited by the new process. Though it's still important to call runtime.LockOSThread because otherwise you have no control over which thread's state will be inherited.
rgooch commented on Jan 27, 2018
OK, great. If you can tweak the documentation and release notes to make that clear, that will be helpful, thanks. The behaviour you've described is the one I rely on, and it's good to know I can keep relying on it :-)
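The pattern discussed in the last few comments can be sketched roughly as follows. The helper runInThreadState and its setup callback are hypothetical. In real use, setup would be something like unsharing the mount namespace and mounting a file-system, which needs privileges, so the example passes a no-op instead:

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
)

// runInThreadState (hypothetical helper) locks a fresh goroutine to its OS
// thread, applies setup to that thread, and then starts the command from the
// same goroutine. Because the goroutine is locked, the child process is
// forked from the thread whose state setup just changed.
func runInThreadState(setup func() error, name string, args ...string) ([]byte, error) {
	type result struct {
		out []byte
		err error
	}
	resc := make(chan result, 1)
	go func() {
		// Lock and never unlock: the thread may carry modified state,
		// so it should not be reused once this goroutine exits.
		runtime.LockOSThread()
		if err := setup(); err != nil {
			resc <- result{nil, fmt.Errorf("setup: %w", err)}
			return
		}
		out, err := exec.Command(name, args...).CombinedOutput()
		resc <- result{out, err}
	}()
	r := <-resc
	return r.out, r.err
}

func main() {
	// Placeholder setup; a real caller might unshare a namespace here.
	noop := func() error { return nil }
	out, err := runInThreadState(noop, "echo", "hello")
	if err != nil {
		fmt.Println("command failed:", err)
		return
	}
	fmt.Printf("%s", out)
}
```

Without the LockOSThread call, the goroutine could be moved to another thread between setup and the exec, and the child would inherit that other thread's state.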