Platform Differences in Python Signal Handling
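In the earlier post "Why Ctrl+C Cannot Terminate a Python Process on Windows", I explained what causes that behavior, but not why the very same code can be interrupted with Ctrl+C on Linux. The root cause is that Linux and Windows handle SIGINT differently at the operating-system level.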
How Python implements it under the hood
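Python refers to the signal handlers provided by the operating system or the C standard library as low-level signal handlers, and the built-in signal module is a wrapper around them. The PyOS_setsig() function can be found in Python/pylifecycle.c: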
/*
* All of the code in this function must only use async-signal-safe functions,
* listed at `man 7 signal` or
* http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html.
*/
PyOS_sighandler_t
PyOS_setsig(int sig, PyOS_sighandler_t handler)
{
#ifdef HAVE_SIGACTION
/* Some code in Modules/signalmodule.c depends on sigaction() being
* used here if HAVE_SIGACTION is defined. Fix that if this code
* changes to invalidate that assumption.
*/
struct sigaction context, ocontext;
context.sa_handler = handler;
sigemptyset(&context.sa_mask);
/* Using SA_ONSTACK is friendlier to other C/C++/Golang-VM code that
* extension module or embedding code may use where tiny thread stacks
* are used. https://bugs.python.org/issue43390 */
context.sa_flags = SA_ONSTACK;
if (sigaction(sig, &context, &ocontext) == -1)
return SIG_ERR;
return ocontext.sa_handler;
#else
PyOS_sighandler_t oldhandler;
oldhandler = signal(sig, handler);
#ifdef HAVE_SIGINTERRUPT
siginterrupt(sig, 1);
#endif
return oldhandler;
#endif
}
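If the HAVE_SIGACTION macro is defined at compile time, the handler is registered with the POSIX sigaction() function; otherwise the signal() function from the ANSI C standard library is used. The <signal.h> header on Windows does not provide sigaction(), so on Windows Python registers its handlers with signal(). Linux follows the POSIX standard, so on Linux Python uses sigaction(). The rest of Python's signal-handling wrapper code can be found in Modules/signalmodule.c.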
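From the Python side, all of this is hidden behind signal.signal(). The following is a minimal sketch of that wrapper in use, assuming Python 3.8 or later for signal.raise_signal(); the names on_sigint and calls are made up for illustration. It behaves the same whether the interpreter was built on top of sigaction() or signal().

```python
import signal

calls = []

def on_sigint(signum, frame):
    # Ordinary Python code; it runs once the interpreter notices the pending signal.
    calls.append(signal.Signals(signum).name)

# signal.signal() installs the handler and returns the one it replaced.
previous = signal.signal(signal.SIGINT, on_sigint)
try:
    # getsignal() reports the Python-level handler, not the C-level one.
    registered = signal.getsignal(signal.SIGINT) is on_sigint
    # Deliver SIGINT to ourselves instead of waiting for a real Ctrl+C.
    signal.raise_signal(signal.SIGINT)
finally:
    # Put the original handler back so Ctrl+C behaves normally afterwards.
    signal.signal(signal.SIGINT, previous)
```

Restoring the previous handler in a finally block keeps the experiment from leaking into the rest of the program.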
How it works on Windows
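Let's first look at how signal() is implemented on Windows. If the Windows SDK is installed, the implementation can be found locally at C:\Program Files (x86)\Windows Kits\10\Source\10.0.22621.0\ucrt\misc\signal.cpp (version 10.0.22621 is used here as an example; the path differs between versions):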
extern "C" __crt_signal_handler_t __cdecl signal(int signum, __crt_signal_handler_t sigact)
{
// Check for signal actions that are supported on other platforms but not on
// this one, and make sure the action is not SIG_DIE:
if (is_unsupported_signal(signum, sigact))
return signal_failed(signum);
// First, handle the case where the signal does not correspond to an
// exception in the host OS:
if (signum == SIGINT ||
signum == SIGBREAK ||
signum == SIGABRT ||
signum == SIGABRT_COMPAT ||
signum == SIGTERM)
{
bool set_console_ctrl_error = false;
__crt_signal_handler_t old_action = nullptr;
__acrt_lock(__acrt_signal_lock);
__try
{
// If the signal is SIGINT or SIGBREAK make sure the handler is
// installed to capture ^C and ^Break events:
// C4127: conditional expression is constant
#pragma warning( suppress: 4127 )
if (is_console_signal(signum) && !console_ctrl_handler_installed)
{
if (SetConsoleCtrlHandler(ctrlevent_capture, TRUE))
{
console_ctrl_handler_installed = true;
}
else
{
_doserrno = GetLastError();
set_console_ctrl_error = true;
}
}
__crt_signal_handler_t* const action_pointer = get_global_action_nolock(signum);
if (action_pointer != nullptr)
{
old_action = __crt_fast_decode_pointer(*action_pointer);
if (sigact != SIG_GET)
*action_pointer = __crt_fast_encode_pointer(sigact);
}
}
__finally
{
__acrt_unlock(__acrt_signal_lock);
}
if (set_console_ctrl_error)
return signal_failed(signum);
return old_action;
}
// If we reach here, signum is supposed to be one of the signals which
// correspond to exceptions on the host OS. If it's not one of these,
// fail and return immediately:
if (signum != SIGFPE && signum != SIGILL && signum != SIGSEGV)
return signal_failed(signum);
__acrt_ptd* const ptd = __acrt_getptd_noexit();
if (ptd == nullptr)
return signal_failed(signum);
// Check that there is a per-thread instance of the exception-action table
// for this thread. If there isn't, create one:
if (ptd->_pxcptacttab == __acrt_exception_action_table)
{
// Allocate space for an exception-action table:
ptd->_pxcptacttab = static_cast<__crt_signal_action_t*>(_malloc_crt(__acrt_signal_action_table_size));
if (ptd->_pxcptacttab == nullptr)
return signal_failed(signum);
// Initialize the table by copying the contents of __acrt_exception_action_table:
memcpy(ptd->_pxcptacttab, __acrt_exception_action_table, __acrt_signal_action_table_size);
}
// Look up the proper entry in the exception-action table. Note that if
// several exceptions are mapped to the same signal, this returns the
// pointer to first such entry in the exception action table. It is assumed
// that the other entries immediately follow this one.
__crt_signal_action_t* const xcpt_action = siglookup(signum, ptd->_pxcptacttab);
if (xcpt_action == nullptr)
return signal_failed(signum);
// SIGSEGV, SIGILL and SIGFPE all have more than one exception mapped to
// them. The code below depends on the exceptions corresponding to the same
// signal being grouped together in the exception-action table.
__crt_signal_handler_t const old_action = xcpt_action->_action;
// If we are not just getting the currently installed action, loop through
// all the entries corresponding to the given signal and update them as
// appropriate:
if (sigact != SIG_GET)
{
__crt_signal_action_t* const last = ptd->_pxcptacttab + __acrt_signal_action_table_count;
// Iterate until we reach the end of the table or we reach the end of
// the range of actions for this signal, whichever comes first:
for (__crt_signal_action_t* p = xcpt_action; p != last && p->_signal_number == signum; ++p)
{
p->_action = sigact;
}
}
return old_action;
}
The code shows that on Windows the signum parameter of signal() accepts only the following signals:
SIGINT
SIGBREAK
SIGABRT
SIGABRT_COMPAT
SIGTERM
SIGFPE
SIGILL
SIGSEGV
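The same restriction is visible from Python. Here is a minimal sketch, assuming Python 3.8 or later for signal.valid_signals(); the names portable, available and missing are made up for illustration. It checks that the six signals from the list above that also exist in standard C are accepted on the current platform (SIGBREAK and SIGABRT_COMPAT are Windows-specific, so they are left out).

```python
import signal

# Signals from the list above that are also defined by standard C.
portable = {"SIGINT", "SIGABRT", "SIGTERM", "SIGFPE", "SIGILL", "SIGSEGV"}

# valid_signals() may also contain plain ints (e.g. real-time signals on Linux),
# so keep only the members that have a symbolic name.
available = {
    s.name for s in signal.valid_signals() if isinstance(s, signal.Signals)
}

missing = portable - available
print(sorted(available))
```

On Windows the printed list should be short and match the list above, while on Linux it is much longer.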
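Because the Windows kernel has no native notion of signals, the CRT emulates them, and not every signal is handled the same way. As the code above shows, SIGINT and SIGBREAK are hooked up to console control events by calling the Windows API SetConsoleCtrlHandler(); SIGABRT, SIGABRT_COMPAT and SIGTERM are only recorded in a process-wide action table inside the CRT; and SIGFPE, SIGILL and SIGSEGV are tracked by the CRT in a per-thread exception-action table. Once a callback of type HandlerRoutine has been registered with SetConsoleCtrlHandler(), the system creates a new thread in the process to run that callback whenever the console receives a control event (an unverified guess: this step may be carried out by ConHost or OpenConsole). The default callback responds to SIGINT by calling ExitProcess() directly to terminate the current process, which is roughly equivalent to the following code: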
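#include <windows.h>
#include <stdio.h>
BOOL WINAPI CtrlHandler(DWORD fdwCtrlType)
{
    if (fdwCtrlType == CTRL_C_EVENT) {
        /* ExitProcess() requires an exit code; this is the status a process
         * killed by Ctrl+C normally reports. */
        ExitProcess(STATUS_CONTROL_C_EXIT);
    }
    /* Not handled here: pass the event on to the next handler in the chain. */
    return FALSE;
}
int main(void)
{
    if (!SetConsoleCtrlHandler(CtrlHandler, TRUE)) {
        fprintf(stderr, "SetConsoleCtrlHandler failed: %lu\n", GetLastError());
        return 1;
    }
    while (1) {}  /* keep the main thread busy; only Ctrl+C ends the process */
    return 0;
}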
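Because CtrlHandler() is called on a new thread, the process can exit immediately even while its main thread is blocked. Python on Windows overrides this behavior, however: the low-level signal handler it registers merely sets a flag in the VM to indicate that a signal is pending. A signal handler registered from Python code only gets a chance to run once the main thread returns to the VM. If the main thread is blocked in code outside the VM at that moment, such as a Windows API call or other native code, it cannot respond to the signal. This is why, in some situations, a Python program on Windows does not respond to Ctrl+C.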
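That Python-level handlers are deferred to the main thread can be checked with a small experiment. The following is a minimal sketch, assuming Python 3.8 or later for signal.raise_signal(); the names on_sigint and ran_in_main are made up for illustration. A worker thread raises SIGINT, so the low-level handler only marks the signal as pending, and the Python handler then runs on the main thread once it is back to executing bytecode.

```python
import signal
import threading

ran_in_main = []

def on_sigint(signum, frame):
    # Record which thread the Python-level handler actually runs on.
    ran_in_main.append(threading.current_thread() is threading.main_thread())

previous = signal.signal(signal.SIGINT, on_sigint)
try:
    # Raise the signal from a worker thread, not from the main thread.
    worker = threading.Thread(target=signal.raise_signal, args=(signal.SIGINT,))
    worker.start()
    worker.join()
    # Give the main thread some bytecode to execute so that the
    # interpreter gets a chance to run the pending handler.
    for _ in range(100_000):
        if ran_in_main:
            break
finally:
    signal.signal(signal.SIGINT, previous)

print(ran_in_main)
```

If the main thread were stuck inside a native call instead of that loop, the handler would simply stay pending, which is the situation described above.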
Special handling of time.sleep()
You may wonder why the following code works perfectly well on Windows and responds to SIGINT immediately:
import time
try:
    while True:
        time.sleep(100000)
except KeyboardInterrupt:
    print('keyboard interrupt received')
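That is because Python gives time.sleep() special treatment on Windows: for a non-zero duration it does not put the thread to sleep by calling the Windows API Sleep(). Here is the Windows implementation of time.sleep() in Modules/timemodule.c: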
#else // MS_WINDOWS
_PyTime_t timeout_100ns = _PyTime_As100Nanoseconds(timeout,
_PyTime_ROUND_CEILING);
// Maintain Windows Sleep() semantics for time.sleep(0)
if (timeout_100ns == 0) {
Py_BEGIN_ALLOW_THREADS
// A value of zero causes the thread to relinquish the remainder of its
// time slice to any other thread that is ready to run. If there are no
// other threads ready to run, the function returns immediately, and
// the thread continues execution.
Sleep(0);
Py_END_ALLOW_THREADS
return 0;
}
LARGE_INTEGER relative_timeout;
// No need to check for integer overflow, both types are signed
assert(sizeof(relative_timeout) == sizeof(timeout_100ns));
// SetWaitableTimer(): a negative due time indicates relative time
relative_timeout.QuadPart = -timeout_100ns;
HANDLE timer = CreateWaitableTimerExW(NULL, NULL, timer_flags,
TIMER_ALL_ACCESS);
if (timer == NULL) {
PyErr_SetFromWindowsErr(0);
return -1;
}
if (!SetWaitableTimerEx(timer, &relative_timeout,
0, // no period; the timer is signaled once
NULL, NULL, // no completion routine
NULL, // no wake context; do not resume from suspend
0)) // no tolerable delay for timer coalescing
{
PyErr_SetFromWindowsErr(0);
goto error;
}
// Only the main thread can be interrupted by SIGINT.
// Signal handlers are only executed in the main thread.
if (_PyOS_IsMainThread()) {
HANDLE sigint_event = _PyOS_SigintEvent();
while (1) {
// Check for pending SIGINT signal before resetting the event
if (PyErr_CheckSignals()) {
goto error;
}
ResetEvent(sigint_event);
HANDLE events[] = {timer, sigint_event};
DWORD rc;
Py_BEGIN_ALLOW_THREADS
rc = WaitForMultipleObjects(Py_ARRAY_LENGTH(events), events,
// bWaitAll
FALSE,
// No wait timeout
INFINITE);
Py_END_ALLOW_THREADS
if (rc == WAIT_FAILED) {
PyErr_SetFromWindowsErr(0);
goto error;
}
if (rc == WAIT_OBJECT_0) {
// Timer signaled: we are done
break;
}
assert(rc == (WAIT_OBJECT_0 + 1));
// The sleep was interrupted by SIGINT: restart sleeping
}
}
else {
DWORD rc;
Py_BEGIN_ALLOW_THREADS
rc = WaitForSingleObject(timer, INFINITE);
Py_END_ALLOW_THREADS
if (rc == WAIT_FAILED) {
PyErr_SetFromWindowsErr(0);
goto error;
}
assert(rc == WAIT_OBJECT_0);
// Timer signaled: we are done
}
CloseHandle(timer);
return 0;
error:
CloseHandle(timer);
return -1;
#endif
Python on Windows creates a waitable timer object with CreateWaitableTimerExW() and puts the thread to sleep by waiting on it with WaitForMultipleObjects(). Meanwhile, in Modules/signalmodule.c, Python's low-level signal_handler() calls SetEvent() when it sees SIGINT, which makes WaitForMultipleObjects() stop waiting:
#ifdef MS_WINDOWS
if (sig_num == SIGINT) {
signal_state_t *state = &signal_global_state;
SetEvent(state->sigint_event);
}
#endif
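The idea of waiting on a timer and an event at once, and waking up on whichever comes first, can be modeled in a few lines of pure Python. This is only an illustrative sketch, not CPython's implementation: threading.Event stands in for the SIGINT event, the wait timeout stands in for the waitable timer, and a threading.Timer plays the role of the thread that calls SetEvent(). The name interruptible_sleep is hypothetical, and the restart loop from the C code is left out.

```python
import threading
import time

# Stands in for the event object returned by _PyOS_SigintEvent().
sigint_event = threading.Event()

def interruptible_sleep(seconds):
    """Return True if the event woke us up, False if the timeout elapsed."""
    sigint_event.clear()               # plays the role of ResetEvent()
    return sigint_event.wait(seconds)  # wait on "timer or event", whichever first

# Simulate another thread calling SetEvent() shortly after the sleep starts.
threading.Timer(0.05, sigint_event.set).start()

start = time.monotonic()
interrupted = interruptible_sleep(5.0)
elapsed = time.monotonic() - start

print(interrupted, round(elapsed, 2))
```

The sleep ends long before its five seconds are up because the second wake-up source fired, which is what the sigint_event handle achieves for time.sleep() on Windows.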
How it works on Linux
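On Linux, Python implements signals with the POSIX sigaction() function. Because POSIX signal semantics on Linux are provided by the kernel, the mechanism is completely different from the one on Windows. The essential difference is that Linux signals behave like genuine software interrupts whose handling is triggered by the kernel. In principle this is somewhat similar to asynchronous procedure calls (APCs) on Windows, with one distinction: a Linux signal handler is invoked asynchronously as soon as the signal is delivered, whereas QueueUserAPC() only adds the function to the target thread's APC queue, and the function runs later, when that thread reaches certain API functions (an alertable wait). In that respect APCs are actually closer to Python's own signal mechanism.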
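On Linux, signals are handled by the kernel. When the kernel receives a signal, it places it in the target process's signal queue and interrupts the target process so that it enters kernel mode. If the main thread of the target process is blocked in an I/O operation at that point, which amounts to being asleep, it is woken up to handle the signal. The kernel then copies the data the handler needs into the process's user space and points the instruction pointer (EIP on x86) at the signal handler's address. Control then returns to user mode, where the signal handler runs. As a result, a Python process on Linux always responds to SIGINT.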
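The wake-up of a blocked main thread can be observed directly. The following is a minimal, POSIX-only sketch (signal.setitimer() and SIGALRM are not available on Windows). It uses SIGALRM as a stand-in for SIGINT so that no key press is needed; the Interrupted exception and the other names are made up for illustration. The main thread blocks in os.read() on a pipe that never receives data, and the signal still gets it out of the call.

```python
import os
import signal

class Interrupted(Exception):
    """Raised by the handler to abandon the blocking call."""

def on_alarm(signum, frame):
    # Raising here makes the interrupted os.read() propagate the exception
    # instead of being retried automatically.
    raise Interrupted

read_fd, write_fd = os.pipe()
previous = signal.signal(signal.SIGALRM, on_alarm)
signal.setitimer(signal.ITIMER_REAL, 0.1)  # deliver SIGALRM in 0.1 s
try:
    os.read(read_fd, 1)   # blocks: nothing is ever written to the pipe
    woke_up = False
except Interrupted:
    woke_up = True
finally:
    signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
    signal.signal(signal.SIGALRM, previous)
    os.close(read_fd)
    os.close(write_fd)

print(woke_up)
```

If the handler returned normally instead of raising, Python would restart the interrupted read, so the exception is what turns the wake-up into a visible result here.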