Tuesday, July 31, 2007

python forking after threading?

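I need to unit test mirrord, but mirrord performs a daemon switch: it forks a child process and the parent process exits. In a unit test, the test program can only run the work code in a child process or a child thread (otherwise, once the code reaches the daemon state it blocks in an infinite loop and the test code cannot proceed). To inspect the state of the work code, the test needs access to some of its variables, so the simplest choice is a thread.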

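The problem is this: if a fork happens after threading has started, and the parent process exits after the fork, does the thread exit too? Here is a prototype to find out: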
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

import os
import sys
import time
import threading

def run_thread():
    pid = os.fork()
    if pid == 0:
        # Child process: loop forever, like a daemon would.
        print("child process ...")
        while True:
            time.sleep(5)
            print("child process is running ...")
    # Parent process: leave the thread code right after forking.
    print("main process exiting ...")
    sys.exit(0)

that = threading.Thread(target=run_thread, name="child")
that.daemon = True
that.start()
# #1:
that.join()
print("main thread end")
The output is as follows:
sh$ python mirrord_5_ut.py
child process ...
main process exiting ...
main thread end
[Ctrl^C]
sh$ child process is running ...
child process is running ...
child process is running ...
[Ctrl^C]
sh$ child process is running ...
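This shows that the forked child process keeps running, while the main process has exited and the thread that performed the fork has ended as well. Strictly speaking, sys.exit() called inside a thread only raises SystemExit in that thread, so it ends the thread rather than the whole process; the main thread then returns from that.join(), prints "main thread end", and the parent process exits normally.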
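The behaviour of sys.exit() inside a thread can be checked in isolation, without any fork. The following is a minimal sketch written for Python 3 (an assumption; the listings in this post date from Python 2), and the `events` list and `worker` function are names introduced only for this illustration:

```python
import sys
import threading

events = []  # records what actually ran, in order

def worker():
    events.append("worker start")
    sys.exit(0)                   # raises SystemExit in this thread only
    events.append("unreachable")  # never executed

t = threading.Thread(target=worker, name="child")
t.start()
t.join()                          # returns once the worker thread has ended
events.append("main continues")   # the main thread is still alive here
print(events)
```

The main thread keeps going after the worker calls sys.exit(), which matches the "main thread end" line in the transcript above.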

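There is another question, though: if the child process forked in the thread code also exits, will the code after #1 above be executed in it? After all, a forked process gets a copy of the address space. For example: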
#!/usr/bin/env python
# -*- encoding: utf-8 -*-

"""
If the main process exits after forking,
while the forking is executed by a child thread,
will the main/child threads end? -- Yes
Author: Roc Zhou
Date: 2007-07-31
Email: chowroc.z@gmail.com
"""

import os
import sys
import time
import threading

var = 0

def run_thread():
    global var
    var = 1
    print("var is '%d' in thread" % var)
    pid = os.fork()
    if pid == 0:
        # Child process: var takes even values.
        print("child process ...")
        i = 0
        while i < 10:
            var = i * 2
            time.sleep(1)
            print("child process is running %d and var is '%d' ..." % (i, var))
            i += 1
    else:
        # Parent process: var takes odd values.
        print("main process ...")
        j = 0
        while j < 10:
            var = j * 2 + 1
            time.sleep(1)
            print("main process is running %d and var is '%d' ..." % (j, var))
            j += 1
        # sys.exit(0)
    print("var is '%d' at the end of thread code" % var)

print("var is '%d' at the beginning" % var)
# #a: call the function directly, without a thread
# run_thread()
# ------------
# #b: run the function in a daemon thread
that = threading.Thread(target=run_thread, name="child")
that.daemon = True
that.start()
time.sleep(5)
# #1:
that.join()
# ------------
print("var is '%d' at the main thread end" % var)
When run as #a, no thread is started, and the output is:
sh$ python mirrord_5_ut.py
var is '0' at the beginning
var is '1' in thread
child process ...
main process ...
main process is running 0 and var is '1' ...
child process is running 0 and var is '0' ...
main process is running 1 and var is '3' ...
child process is running 1 and var is '2' ...
main process is running 2 and var is '5' ...
child process is running 2 and var is '4' ...
main process is running 3 and var is '7' ...
child process is running 3 and var is '6' ...
main process is running 4 and var is '9' ...
child process is running 4 and var is '8' ...
main process is running 5 and var is '11' ...
child process is running 5 and var is '10' ...
main process is running 6 and var is '13' ...
child process is running 6 and var is '12' ...
main process is running 7 and var is '15' ...
child process is running 7 and var is '14' ...
main process is running 8 and var is '17' ...
child process is running 8 and var is '16' ...
main process is running 9 and var is '19' ...
var is '19' at the end of thread code
var is '19' at the main thread end
sh$ child process is running 9 and var is '18' ...
var is '18' at the end of thread code
var is '18' at the main thread end
As you can see, the final code ran twice! That is not the case once a thread is involved. Run as #b, the output is:
sh$ python mirrord_5_ut.py
var is '0' at the beginning
var is '1' in thread
child process ...
main process ...
child process is running 0 and var is '0' ...
main process is running 0 and var is '1' ...
child process is running 1 and var is '2' ...
main process is running 1 and var is '3' ...
child process is running 2 and var is '4' ...
main process is running 2 and var is '5' ...
child process is running 3 and var is '6' ...
main process is running 3 and var is '7' ...
child process is running 4 and var is '8' ...
main process is running 4 and var is '9' ...
child process is running 5 and var is '10' ...
main process is running 5 and var is '11' ...
child process is running 6 and var is '12' ...
main process is running 6 and var is '13' ...
child process is running 7 and var is '14' ...
main process is running 7 and var is '15' ...
child process is running 8 and var is '16' ...
main process is running 8 and var is '17' ...
child process is running 9 and var is '18' ...
var is '18' at the end of thread code
main process is running 9 and var is '19' ...
var is '19' at the end of thread code
var is '19' at the main thread end
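Here the final code ran only once! The reason is that fork() copies only the calling thread into the child process: the child has no copy of the main thread, so the code after that.join() never runs there, and the child simply ends when run_thread() returns.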
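One way to see that the child contains only the forking thread is to have it report its own thread count back through a pipe. This is a rough sketch for Python 3 on a Unix system (both assumptions); `fork_and_report` and `result` are hypothetical names used only here, and recent Python 3 releases may print a DeprecationWarning about forking in a multi-threaded process:

```python
import os
import threading

def fork_and_report(result):
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report how many threads exist here, then leave immediately.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: read the child's answer and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        result["child_threads"] = int(reader.read())
    os.waitpid(pid, 0)
    result["parent_threads"] = threading.active_count()

result = {}
t = threading.Thread(target=fork_and_report, args=(result,), name="child")
t.start()
t.join()
print(result)
```

The parent sees at least two threads (the main thread plus the one that forked), while the child counts only the thread that called os.fork().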

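In addition, whether or not that.join() at #1 is called also makes a difference! With that.join() removed, the main thread ends after its time.sleep(5) without waiting for the daemon thread, while the forked child keeps running: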
sh$ python mirrord_5_ut.py
var is '0' at the beginning
var is '1' in thread
child process ...
main process ...
child process is running 0 and var is '0' ...
main process is running 0 and var is '1' ...
child process is running 1 and var is '2' ...
main process is running 1 and var is '3' ...
child process is running 2 and var is '4' ...
main process is running 2 and var is '5' ...
child process is running 3 and var is '6' ...
main process is running 3 and var is '7' ...
var is '9' at the main thread end
sh$ child process is running 4 and var is '8' ...
child process is running 5 and var is '10' ...
child process is running 6 and var is '12' ...
child process is running 7 and var is '14' ...
child process is running 8 and var is '16' ...
child process is running 9 and var is '18' ...
var is '18' at the end of thread code
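The effect of join() on a daemon thread can be reproduced without fork. The sketch below (Python 3 assumed) runs a small hypothetical script twice in a subprocess, once with join() and once without; the `SCRIPT` text and the `run` helper are illustration-only names:

```python
import subprocess
import sys
import textwrap

# A tiny program whose daemon thread needs longer than the main thread.
SCRIPT = textwrap.dedent("""
    import sys, threading, time

    def work():
        time.sleep(0.5)
        print("thread finished")

    t = threading.Thread(target=work)
    t.daemon = True
    t.start()
    if sys.argv[1] == "join":
        t.join()
    print("main thread end")
""")

def run(mode):
    # Run SCRIPT in a fresh interpreter and return its output lines.
    proc = subprocess.run([sys.executable, "-c", SCRIPT, mode],
                          capture_output=True, text=True)
    return proc.stdout.splitlines()

with_join = run("join")
without_join = run("nojoin")
print(with_join)
print(without_join)
```

Without join(), the interpreter exits as soon as the main thread ends, and the daemon thread is stopped before it can print anything.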
So, in the end, I moved the daemon code out of the mirrord.py module and into the mirrord script!
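A rough sketch of what this split might look like follows. It is not the real mirrord code: `Worker`, `daemonize`, and `script_main` are hypothetical stand-ins, and Python 3 on Unix is assumed. The point is that the module's work code never forks, so a test can run it in a thread and read its state directly:

```python
import os
import threading
import time

class Worker(object):
    """Hypothetical stand-in for the work code kept in the module."""
    def __init__(self):
        self.iterations = 0
        self._stop = threading.Event()

    def run(self):
        # Loop until asked to stop; no fork happens here.
        while not self._stop.is_set():
            self.iterations += 1
            self._stop.wait(0.01)

    def stop(self):
        self._stop.set()

def daemonize():
    """Hypothetical daemon switch; it would live in the script only."""
    if os.fork() > 0:
        os._exit(0)  # parent exits, the child carries on as the daemon
    os.setsid()

def script_main():
    # What the script would do: switch to daemon mode, then run the work code.
    daemonize()
    Worker().run()

# What a unit test would do: run the work code in a thread, no fork involved.
worker = Worker()
t = threading.Thread(target=worker.run, name="child")
t.daemon = True
t.start()
deadline = time.time() + 2.0
while worker.iterations == 0 and time.time() < deadline:
    time.sleep(0.01)
worker.stop()
t.join(2.0)
print(worker.iterations > 0, t.is_alive())
```

Because the test never calls daemonize(), the thread and the main thread stay in the same process, and the questions about forking after threading do not arise.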
