弹剑而歌: February 2007

Tuesday, February 27, 2007

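vsftpd: different settings for different users

Sometimes different vsftpd users need different settings. For example, the web site logs in with the httpd account, whose DocumentRoot is /data/httpd, while a second account, sample, should land in /home/backup/shopex. For httpd the following was set: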

local_root=/data/httpd

With this global setting, logging in as sample also lands in /data/httpd instead of the intended /home/backup/shopex. vsftpd's user_config_dir parameter solves this:

sh$ vi /etc/vsftpd/vsftpd.conf
user_config_dir=/etc/vsftpd/users
sh$ vi /etc/vsftpd/users/sample
# only this one line is needed
local_root=/home/backup/shopex

That is all.

Reference: official vsftpd FAQ

Note that you may also need to create a config file for the anonymous user (ftp), /etc/vsftpd/users/ftp:
local_root=/var/ftp
Otherwise login fails with:
Login failed: 500 OOPS: reading non-root config file
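
When there are more than a couple of accounts, the per-user files can be generated rather than typed. The following is only a sketch in Python: the function name write_user_configs and the temporary directory are my own choices for illustration, and on a real server the target would be the user_config_dir set above.

```python
import os
import tempfile


def write_user_configs(config_dir, roots):
    """Write one vsftpd per-user file per entry in `roots` (user -> local_root)."""
    os.makedirs(config_dir, exist_ok=True)
    written = []
    for user, local_root in sorted(roots.items()):
        path = os.path.join(config_dir, user)
        with open(path, "w") as fh:
            # vsftpd expects option=value with no spaces around '='
            fh.write("local_root=%s\n" % local_root)
        written.append(path)
    return written


if __name__ == "__main__":
    # A throwaway directory stands in for /etc/vsftpd/users.
    demo_dir = os.path.join(tempfile.mkdtemp(), "users")
    roots = {"sample": "/home/backup/shopex", "ftp": "/var/ftp"}
    for path in write_user_configs(demo_dir, roots):
        with open(path) as fh:
            print(path, "->", fh.read().strip())
```

Each file is named after the login name, which is how vsftpd finds it inside user_config_dir.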

bash text line by line process

Given a file like this:

sh$ head sites.txt
e.72h.net /home/httpd/72h.net/e.72h.net
syssite /home/httpd/72h.net/ec.72h.net/syssite
mall.72h.net /home/httpd/72h.net/mall.72h.net
shop.72h.net /home/httpd/72h.net/shop.72h.net
aws /home/httpd/aws
ftputil /home/httpd/ftputil
is-me.com /home/httpd/is-me.com
mail.phpedu.org /home/httpd/mail.phpedu.org
payex /home/httpd/payex
phpadm /home/httpd/phpadm

Now a command has to be generated for each line, for example:
fs_backup -a t:$2 $1
# $2--path, $1--identity
To do this in bash, every line has to be split correctly. Try the following:

sh$ for line in "`cat sites.txt`"; do echo $line; done
e.72h.net /home/httpd/72h.net/e.72h.net syssite /home/httpd/72h.net/ec.72h.net/syssite mall.72h.net /home/httpd/72h.net/mall.72h.net shop.72h.net /home/httpd/72h.net/shop.72h.net aws /home/httpd/aws ftputil /home/httpd/ftputil is-me.com /home/httpd/is-me.com mail.phpedu.org /home/httpd/mail.phpedu.org payex /home/httpd/payex phpadm /home/httpd/phpadm phpedu.org /home/httpd/phpedu.org 46.shopex.com.cn /home/httpd/shopex/46.shopex.com.cn 99bill.shopex.com.cn /home/httpd/shopex/99bill.shopex.com.cn acco.shopex.com.cn /home/httpd/shopex/acco.shopex.com.cn alipay.shopex.com.cn /home/httpd/shopex/alipay.shopex.com.cn book.shopex.cn /home/httpd/shopex/book.shopex.cn card.shopex.cn /home/httpd/shopex/card.shopex.cn cd.shopex.cn /home/httpd/shopex/cd.shopex.cn clothes.shopex.com.cn /home/httpd/shopex/clothes.shopex.com.cn com.shopex.com.cn /home/httpd/shopex/com.shopex.com.cn digi.shopex.com.cn /home/httpd/shopex/digi.shopex.com.cn free.shopex.com.cn /home/httpd/shopex/free.shopex.com.cn game.shopex.com.cn /home/httpd/shopex/game.shopex.com.cn gift.shopex.com.cn /home/httpd/shopex/gift.shopex.com.cn health.shopex.cn /home/httpd/shopex/health.shopex.cn home.shopex.cn /home/httpd/shopex/home.shopex.cn magicisland.shopex.com.cn /home/httpd/shopex/magicisland.shopex.com.cn makeup.shopex.cn /home/httpd/shopex/makeup.shopex.cn mall.shopex.com.cn /home/httpd/shopex/mall.shopex.com.cn mobile.shopex.cn /home/httpd/shopex/mobile.shopex.cn paypal.shopex.com.cn /home/httpd/shopex/paypal.shopex.com.cn pc.shopex.cn /home/httpd/shopex/pc.shopex.cn pifa.shopex.com.cn /home/httpd/shopex/pifa.shopex.com.cn syssite /home/httpd/shopex/platform.shopex.com.cn/syssite shopex.cn /home/httpd/shopex/shopex.cn blog /home/httpd/shopex/shopex.cn/blog help /home/httpd/shopex/shopex.cn/help store.shopex.cn /home/httpd/shopex/store.shopex.cn tex.shopex.com.cn /home/httpd/shopex/tex.shopex.com.cn top.shopex.cn /home/httpd/shopex/top.shopex.cn update.shopex.com.cn /home/httpd/shopex/update.shopex.com.cn 
store.verycd.com /home/httpd/store.verycd.com zovamailredir /home/httpd/zovamailredir zovatech /home/httpd/zovatech

sh$ for line in `cat sites.txt`; do echo $line; done
e.72h.net
/home/httpd/72h.net/e.72h.net
syssite
/home/httpd/72h.net/ec.72h.net/syssite
mall.72h.net
/home/httpd/72h.net/mall.72h.net
shop.72h.net
/home/httpd/72h.net/shop.72h.net
aws
/home/httpd/aws
......

Neither is the desired result!

The only form that works is this one:

sh$ cat sites.txt | while read id path; do echo $id, $path; done
e.72h.net, /home/httpd/72h.net/e.72h.net
syssite, /home/httpd/72h.net/ec.72h.net/syssite
mall.72h.net, /home/httpd/72h.net/mall.72h.net
shop.72h.net, /home/httpd/72h.net/shop.72h.net
aws, /home/httpd/aws
ftputil, /home/httpd/ftputil
is-me.com, /home/httpd/is-me.com
mail.phpedu.org, /home/httpd/mail.phpedu.org
payex, /home/httpd/payex
phpadm, /home/httpd/phpadm
......

So the final command is:

sh$ cat sites.txt | while read id path; do fs_backup -a t:$path $id; done
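
For comparison, the same line-by-line split can be sketched in Python. The helper name backup_commands is made up for this example; it only builds the command strings and does not run fs_backup. The shlex.quote calls are an addition of mine so that a path containing spaces would survive the shell.

```python
import shlex


def backup_commands(lines):
    """Yield one fs_backup command string per non-empty 'id path' line."""
    for line in lines:
        # Like `read id path`: the first word is the id, the rest is the path.
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # skip blank or incomplete lines
        ident, path = fields[0], fields[1].strip()
        yield "fs_backup -a %s %s" % (shlex.quote("t:" + path), shlex.quote(ident))


if __name__ == "__main__":
    sample = [
        "e.72h.net /home/httpd/72h.net/e.72h.net\n",
        "aws /home/httpd/aws\n",
    ]
    for command in backup_commands(sample):
        print(command)
```

In real use the lines would come from open('sites.txt') instead of the in-memory sample list.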

svn over http

sh$ cat /usr/src/apache/.config
command = "./configure --with-apr=/usr/local/apr --with-apr-util=/usr/local/apr --with-mpm=prefork --enable-so --enable-rewrite=static --enable-track-vars --enable-dav=shared --enable-dav-fs=shared --enable-dav-lock=shared";

sh$ cat /usr/src/subversion/.config
command = "./configure --with-apr=/usr/local/apr --with-apr-util=/usr/local/apr";
command = "make";
command = "sed -i 's@MKDIR = /usr/bin/install -c -d@MKDIR = mkdir -p@g' Makefile";
command = "make install";

sh$ cat httpd.conf
......
NameVirtualHost *:80

<VirtualHost *:80>
        ServerName docs.shopex.cn
        DocumentRoot "/var/www/html/docs"
        <Directory "/var/www/html/docs">
            Options Indexes FollowSymLinks
            AllowOverride None

            Order allow,deny
            Allow from all
        </Directory>
        <Location /repos>
                DAV svn
                SVNParentPath /var/www/html/docs/repos
        </Location>
</VirtualHost>

sh$ svnadmin create --fs-type fsfs http://docs.shopex.cn/
svnadmin: 'http://docs.shopex.cn' is an URL when it should be a path

svnadmin accesses the repository directly (so it can only be used on the machine that holds the repository); it takes a filesystem path, not a URL.

sh$ svnadmin create --fs-type fsfs /var/www/html/docs/repos/sysadm

sh$ svn list http://docs.shopex.cn/docs
svn: Unrecognized URL scheme for 'http://docs.shopex.cn/docs'

This is because the ra_dav library is missing.

sh$ upm files subversion-1.4.3 | grep ra_dav

Troubleshooting
The build did not include ra_dav because neon is either missing or too old. The ./configure output says so.

sh$ rpm -qi neon

neon is an HTTP and WebDAV client library, with a C interface;
providing a high-level interface to HTTP and WebDAV methods along
with a low-level interface for HTTP request handling. neon
supports persistent connections, proxy servers, basic, digest and
Kerberos authentication, and has complete SSL support.

Switch to an older subversion release that is known to build ra_dav, then run:

sh$ svn list file:///var/www/html/docs/repos/sysadm/
sh$ svn list http://docs.shopex.cn/repos/sysadm
svn: PROPFIND request failed on '/repos/sysadm'
svn: PROPFIND of '/repos/sysadm': 301 Moved Permanently (http://docs.shopex.cn)

Opening docs.shopex.cn/repos in a browser gives:
Forbidden
You don't have permission to access /repos/ on this server.

Then opening http://docs.shopex.cn/repos/index.html gives:

<D:error>
<C:error/>
<m:human-readable errcode="2">
Could not open the requested SVN filesystem
</m:human-readable>
</D:error>

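For this problem, see these pages:
Subversion 出現 301 Moved Permanently 的解決方法 (fixing Subversion's 301 Moved Permanently)
subversion 301-error
SVN 基本的 Apache 配置 (basic Apache configuration for SVN)
In other words, the DocumentRoot and the SVN repos Location must not map onto the same path. For a request URI such as /repos/foo.py, Apache cannot tell whether to serve the file repos/foo.py directly or to let mod_dav_svn return foo.py from the Subversion repository. Here, commenting out the DocumentRoot above is all it takes!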

Monday, February 26, 2007

apache compile so

To build a module as a shared object (so), use these options:
--enable-so --enable-foo=shared
For example:
--enable-so --enable-dav=shared --enable-dav-fs=shared --enable-dav-lock=shared

With only --enable-so --enable-foo, the foo module is linked statically into httpd, and httpd -l lists it.

Sunday, February 25, 2007

docbook xml (3): xsl

I don't yet know how to post XML markup in the blog, nor how to post straight from google docs to the blog, so for now this one stays in google docs.

Thursday, February 15, 2007

sudo timeout Defaults

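sudo was installed from an rpm package and cannot be recompiled, so the timeout can only be redefined with the Defaults syntax in /etc/sudoers. First, though, you need to know which Defaults exist. This command lists them: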

sh$ sudo -L

That gives the timeout-related parameters:

sh$ sudo -L | grep timeout
timestamp_timeout: Authentication timestamp timeout
passwd_timeout: Password prompt timeout

Then edit /etc/sudoers:

sh# visudo
......
Defaults:%admins timestamp_timeout=30
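# set to 30 minutes; -1 means no limit
......

The other one, passwd_timeout, presumably means that the password prompt expires if nothing is typed within that period after it appears.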

python digit map in 2 ranges

def digit_map_by_range(d, rx, ry):
    """Map digit 'd' in range 'rx' to a number in range 'ry'.
    For example: 9, 10, 21 --> 21*(9+1)/10-1 = 20;
    L[20] is the last element of the sequence."""
    # multiply before dividing so integer division does not truncate too early
    return ry * (d + 1) // rx - 1

sudo (1)

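sudo has many advantages:
1. An administrator can grant users specific kinds of superuser privileges without telling them the root password, which is exactly what many system administrators dream of.
2. This also makes it possible to define several system administrators who carry out different tasks.
3. Revoking someone's administrative rights (for example when they leave) is comparatively simple: just lock the account (usermod -L $user, or insert "!" at the start of the password field in /etc/passwd, or in /etc/shadow when shadow passwords are in use; preferably don't use userdel -r $user). Of course, without centralized authentication such as Kerberos this still has to be done on every host.
4. It reduces both the chance of mistakes and the damage they cause.

Ubuntu's default /etc/sudoers (a typical example, since Ubuntu uses sudo by default and disables root login):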

# User privilege specification 
root ALL=(ALL) ALL
# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL

That is too simple; here is a more elaborate one suited to server administration:

......
# Cmnd alias specification
Cmnd_Alias      SYSCMD=/bin/*,/sbin/*,/usr/bin/*,/usr/sbin/*,/usr/local/bin/*,/usr/local/sbin/*
Cmnd_Alias      COPY_BIN=/*/cp */bin/* *,/*/cp * */bin/*,/*/cp */sbin/* *,/*/cp * */sbin/*
Cmnd_Alias      MOVE_BIN=/*/mv */bin/* *,/*/mv * */bin/*,/*/mv */sbin/* *,/*/mv * */sbin/*
Cmnd_Alias      LINK_BIN=/*/ln */bin/* *,/*/ln * */bin/*,/*/ln */sbin/* *,/*/ln * */sbin/*
Cmnd_Alias      RENAME=/*/rename
Cmnd_Alias      VISUDO=/usr/sbin/visudo,/*/vi* /etc/sudoers
# Cmnd_Alias    VISUDO=/*/visudo
Cmnd_Alias      CHOWN=/*/chown *
Cmnd_Alias      CHMOD=/*/chmod *
Cmnd_Alias      USER_CMD=/usr/bin/passwd,\
/usr/sbin/useradd,\
/usr/sbin/usermod,\
/usr/sbin/groupadd,\
/usr/sbin/groupmod,\
/usr/bin/chage -m ? -W [3-7] * ?*

# Defaults specification

# User privilege specification
root            ALL=(ALL) ALL
%admins         ALL=(%root) SYSCMD,!COPY_BIN,!MOVE_BIN,!LINK_BIN,!VISUDO,!CHOWN,!CHMOD,!USER_CMD
sysadm          ALL=(%root) NOPASSWD:ALL

......

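The key part here is the Cmnd_Alias section, because it touches on many security issues. First, SYSCMD defines all system commands, that is, every program under the bin and sbin directories; the %admins group may run these commands and only these commands.

The purpose is to stop these users from using sudo to run, as root, the sensitive operations listed after SYSCMD, including visudo, chown, chmod, passwd, useradd, groupadd, usermod, groupmod and chage. (What I don't understand is why /*/visudo fails to match.) Forbidding these commands with "!" alone is not enough, because they can be copied to /tmp or $HOME and run under another name; that is why users must be restricted to commands within SYSCMD.

This can also nudge people toward writing better scripts. If a script isn't good enough to live in a bin directory, it is best not to use it at all; a pile of ugly scripts under /root greatly reduces the reusability of the system environment and increases entropy.

But this is still not enough, because a user could copy /usr/sbin/visudo to /usr/sbin/VISUDO and run that, so every copy or rename of anything in SYSCMD has to be forbidden. COPY_BIN, MOVE_BIN, LINK_BIN and RENAME are meant to do this; they also block copying into the SYSCMD directories, so that a hacked version cannot overwrite an original file.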

Then add the users and the group:

sh# useradd -u 123 sysadm
sh# groupadd -g 124 admins
sh# useradd sysadm01
sh# usermod -G admins sysadm01

This looks watertight:

sh$ sudo /bin/cp /bin/cp /tmp
Sorry, user sysadm01 is not allowed to execute '/bin/cp /bin/cp /tmp' as root on stor.shopex.cn.
sh$ sudo /bin/cp /usr/sbin/visudo /tmp/
Sorry, user sysadm01 is not allowed to execute '/bin/cp /usr/sbin/visudo /tmp/' as root on stor.shopex.cn.
sh$ sudo cp /usr/sbin/visudo /tmp/
Sorry, user sysadm01 is not allowed to execute '/bin/cp /usr/sbin/visudo /tmp/' as root on stor.shopex.cn.
sh$ sudo cp -r /usr/sbin/visudo /tmp/
# see whether an extra option in the middle makes a difference
Sorry, user sysadm01 is not allowed to execute '/bin/cp -r /usr/sbin/visudo /tmp/' as root on stor.shopex.cn.

But in fact it is still not enough. Try these commands:

sh$ cd /usr/sbin
sh$ sudo cp visudo /tmp  
sh$ sudo cp visudo VISUDO     # [1]    
sh$ sudo cp /tmp/VISUDO ./VISUDO    # [2]

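Clearly sudo cannot resolve the paths it matches, so the earlier security settings are bypassed completely. sudo only matches against the command string, and it does not support regular expressions. Case [1] can still be handled: even without regular expressions, a few more rules would cover it. Case [2] is hopeless, though, because a VISUDO binary can be uploaded from another system to /tmp and then copied in after changing to /usr/sbin, which sidesteps the !COPY_BIN restriction!

I think the only remedy is tripwire + sudo.log, but what if sudo.log gets wiped? And my sudo.log contains no records at all!?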
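
The string matching can be approximated in a few lines of Python to see why the relative-path commands slip through. This is only a rough model built on Python's fnmatch, not sudo's actual code, and the function names are invented for the sketch. It assumes what the sudoers manual describes: wildcards in the command path do not match "/", while wildcards in the argument part are matched against the whole argument string.

```python
from fnmatch import fnmatchcase

# The COPY_BIN alias from the sudoers file above.
COPY_BIN = [
    "/*/cp */bin/* *",
    "/*/cp * */bin/*",
    "/*/cp */sbin/* *",
    "/*/cp * */sbin/*",
]


def path_matches(pattern, path):
    """Match a command path component by component, so '*' never crosses a '/'."""
    p_parts = pattern.split("/")
    c_parts = path.split("/")
    if len(p_parts) != len(c_parts):
        return False
    return all(fnmatchcase(c, p) for p, c in zip(p_parts, c_parts))


def rule_matches(rule, command, args):
    """`rule` is 'path-pattern [arg-pattern]'; `args` is the joined argument string."""
    cmd_pat, _, arg_pat = rule.partition(" ")
    if not path_matches(cmd_pat, command):
        return False
    # In the argument part '*' may match anything, including '/'.
    return arg_pat == "" or fnmatchcase(args, arg_pat)


def is_denied(command, args, rules=COPY_BIN):
    return any(rule_matches(rule, command, args) for rule in rules)


if __name__ == "__main__":
    print(is_denied("/bin/cp", "/usr/sbin/visudo /tmp/"))   # absolute path: caught
    print(is_denied("/bin/cp", "visudo /tmp"))              # after cd /usr/sbin: not caught
    print(is_denied("/bin/cp", "/tmp/VISUDO ./VISUDO"))     # case [2]: not caught
    print(path_matches("/*/visudo", "/usr/sbin/visudo"))    # one '*' cannot span usr/sbin
```

Under the same assumption the last line may also explain the /*/visudo puzzle: the pattern has one directory level while /usr/sbin/visudo has two.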

Sunday, February 11, 2007

python flat file db && table 'type'

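In the earlier post "数据结构与配置设想" (ideas on data structures and configuration) I proposed three forms of configuration: tree, table and list. Tree configuration was discussed there in some detail; this post looks at table configuration.

Table configuration is really relational, but since readability matters a great deal for configuration, a text form is the first choice. For simple applications in particular, setting up a full client/server SQL database is too much trouble; at the very least the data should live on the local file system.

That last requirement, working on the local file system, can be met with SQLite3, whose Python interface is pysqlite.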

sh$ upm packs
sqlite-3.3.12
pysqlite-2.3.2

sh$ python
>>> import pysqlite2
>>> dir(pysqlite2)
['__builtins__', '__doc__', '__file__', '__name__', '__path__']
>>> from pysqlite2 import dbapi2 as sqlite
>>> dir(pysqlite2)
['__builtins__', '__doc__', '__file__', '__name__', '__path__', '_sqlite', 'dbapi2']
>>> dbconn = sqlite.connect('/tmp/filedb')
>>> dbcurs = dbconn.cursor()
>>> dbcurs.execute('''create table stocks (date text, trans text, symbol text, qty real, price real)''')
<pysqlite2.dbapi2.Cursor object at 0xb7c20ce0>
>>> dbcurs.execute("""insert into stocks values ('2006-01-05','BUY','RHAT',100,35.14)""")
<pysqlite2.dbapi2.Cursor object at 0xb7c20ce0>
>>> dbconn.commit()    # without a commit the insert is discarded on close
>>> dbcurs.close()
>>> dbconn.close()
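
On later Python versions the same interface ships in the standard library as sqlite3, so the session can be rerun without installing pysqlite. A minimal sketch, assuming Python 3; the function name demo and the in-memory database are just for illustration (pass a file path such as /tmp/filedb to keep the data).

```python
import sqlite3


def demo(db_path=":memory:"):
    """Create the stocks table, insert one row, and read it back."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.cursor()
        cur.execute(
            "create table stocks "
            "(date text, trans text, symbol text, qty real, price real)"
        )
        # Placeholders keep the values out of the SQL string.
        cur.execute(
            "insert into stocks values (?, ?, ?, ?, ?)",
            ("2006-01-05", "BUY", "RHAT", 100, 35.14),
        )
        conn.commit()  # make the insert permanent
        cur.execute("select symbol, qty, price from stocks")
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```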

Saturday, February 10, 2007

python name declaration order

import logger

class CTreeEx:
    def __init__(self, ...):
        ......
    def logError(self, logerr=outlog):
        print >> sys.stderr, self.strerr
        if self.errno == 1:  logerr.error(self.strerr)
        elif self.errno == 127:  logerr.critical(self.strerr)

class CTree:
    ......
    def func():
        ......
            raise CTreeEx(...)

def main():
    ......
        ctobj = CTree()
        ctobj.func()

if __name__ == '__main__':
    outlog = logger.get(program)
    usage = "program usage ..."
    main()
else:
    outlog = logger.get(__name__)
    usage = "module usage ..."

sh$ ./ctree.py
Traceback (most recent call last):
  File "./ctree.py", line 78, in ?
    class CTreeEx:
  File "./ctree.py", line 84, in CTreeEx
    def logError(self, logerr=outlog):
NameError: name 'outlog' is not defined

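This raises a NameError. Python binds names dynamically, so there is no C-style declaration that could introduce a name before it is assigned; a name problem like this can only be solved by changing the order in which things appear. So the order should become: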

def main():
if __name__ == '__main__':
class CTreeEx:
class CTree:

Note that main() also has to move up, before if __name__ == '__main__':, otherwise the result is "NameError: name 'main' is not defined".

There is a global statement, but it cannot be used like this either:

if __name__ == '__main__':
    outlog = logger.get(program)
    global outlog

sh$ ./ctree.py
./ctree.py:0: SyntaxWarning: name 'outlog' is assigned to before global declaration
Traceback (most recent call last):
  File "./ctree.py", line 78, in ?
    class CTreeEx:
  File "./ctree.py", line 84, in CTreeEx
    def logError(self, logerr=outlog):
NameError: name 'outlog' is not defined

Admittedly this is only a warning, but it does show that something is wrong; Python is already quite strict about structure. Putting global at the very top instead:

global outlog
class CTreeEx:
    ......

has no effect at all; the NameError is still raised.

This deserves a closer look. Suppose that, with the original ordering at the top, I do this:

class CTreeEx:
    def __init__(self, ...):
        print usage

Here usage appears to be used before outlog, so a NameError for 'usage' would be expected, yet none is raised! A different variant does raise one:

class CTreeEx:
    def __init__(self, ..., test=usage):

NameError: name 'usage' is not defined

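So only definitions, that is, statements that amount to an assignment, are affected. Because Python is dynamically typed, every binding between a name and an object is like a pointer assignment in C, so everything is an assignment: class name: and def name(): merely bind a class or function object to the name, and even import name binds a module (a file) to a name.

What about this, then?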

class CTreeEx:
    def __init__(self, errno=1, strerr='CTree,Exception'):
        test = usage

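This causes no problem, so it all depends on when the assignment happens. The class/def definitions here sit at module top level and run immediately when the file is invoked from the command line or imported; the def __init__() line, default arguments included, is evaluated as part of creating the CTreeEx class object, hence the error. The body of __init__(), however, only runs when a class instance is created, so it causes no problem. Look at the effect of this: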

class CTreeEx:
    test = usage
    def __init__(self, ...):

and it all becomes clear.
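
The timing difference can be checked in isolation. The sketch below (Python 3; the helper run and the two source strings are invented for the demonstration) executes two tiny modules: one uses outlog as a default argument before it is bound, the other only looks it up inside the method body.

```python
def run(source):
    """Execute `source` in a fresh namespace; return the NameError text, or None."""
    namespace = {}
    try:
        exec(source, namespace)
    except NameError as exc:
        return str(exc)
    return None


# Default argument: evaluated while the class body runs, before outlog exists.
DEFAULT_ARG = (
    "class CTreeEx:\n"
    "    def log_error(self, logerr=outlog):\n"
    "        return logerr\n"
    "outlog = 'logger'\n"
)

# Body lookup: outlog is only looked up when the method is called.
BODY_LOOKUP = (
    "class CTreeEx:\n"
    "    def log_error(self):\n"
    "        return outlog\n"
    "outlog = 'logger'\n"
    "result = CTreeEx().log_error()\n"
)

if __name__ == "__main__":
    print(run(DEFAULT_ARG))   # the NameError message
    print(run(BODY_LOOKUP))   # None: no error
```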

python map [] to empty strings

>>> L1 = ['a', 'b', 'c']
>>> L2 = [1, 2, 3]
>>> zip(L1, L2)
[('a', 1), ('b', 2), ('c', 3)]
>>> map(L1, L2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: 'list' object is not callable
>>> map(None, L1, L2)
[('a', 1), ('b', 2), ('c', 3)]
>>> dict(zip(L1, L2))
{'a': 1, 'c': 3, 'b': 2}
>>> dict(map(None, L1, L2))
{'a': 1, 'c': 3, 'b': 2}

>>> L3 = []
>>> zip(L1, L3)
[]
>>> map(L1, L3)
[]
>>> map(None, L1, L3)
[('a', None), ('b', None), ('c', None)]
>>> dict(zip(L1, L3))
{}
>>> dict(map(None, L1, L3))
{'a': None, 'c': None, 'b': None}

>>> def fun(A, B):
...     if B is None: B = ''
...     return A, B
...
>>> map(fun, L1, L2)
[('a', 1), ('b', 2), ('c', 3)]
>>> map(fun, L1, L3)
[('a', ''), ('b', ''), ('c', '')]
>>> dict(map(fun, L1, L3))
{'a': '', 'c': '', 'b': ''}
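
In Python 3 map() stops at the shortest input and map(None, ...) is gone, so the padding trick above no longer works as written. A sketch of the nearest equivalent uses itertools.zip_longest with a fill value; the function name pair_with_blanks is mine, and it assumes the value list is never longer than the key list.

```python
from itertools import zip_longest


def pair_with_blanks(keys, values):
    """Pair keys with values, padding missing values with ''.

    Assumes len(values) <= len(keys); extra values would get '' as their key.
    """
    return dict(zip_longest(keys, values, fillvalue=""))


if __name__ == "__main__":
    print(pair_with_blanks(["a", "b", "c"], [1, 2, 3]))
    print(pair_with_blanks(["a", "b", "c"], []))
```

Unlike fun() above, this only fills in positions that are missing; a None that is actually present in the value list is kept.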

Friday, February 09, 2007

find -path dir/* -prune

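Sometimes a backup should leave out the files and subdirectories under a directory while keeping the directory itself. fs_backup uses find to locate files, so if find can handle this by itself, no extra code is needed. Fortunately, it can: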

sh$ find . -path ./linux-2.6.16.28 -prune -o -print | grep linux-2.6.16.28
./linux-2.6.16.28.SMP
./linux-2.6.16.28.tar.bz2
sh$ find . -path ./linux-2.6.16.28/* -prune -o -print | grep linux-2.6.16.28
find: paths must precede expression
Usage: find [path...] [expression]
sh$ find . -path './linux-2.6.16.28/*' -prune -o -print | grep linux-2.6.16.28
./linux-2.6.16.28
./linux-2.6.16.28.SMP
./linux-2.6.16.28.tar.bz2

So fs_backup can include or exclude the contents of a directory like this:

sh$ fs_backup -a t:/path/to/dir/*
sh$ fs_backup -a x:/path/to/dir/*

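The only change needed is in how the fs_backup script builds the command (the find command line) that it passes to os.popen3(command): the generated code has to put quotes around the pattern, as in -path 'dir/*'.

Going one step further, does find -path dir/*/* work as well?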
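
As an illustration of the quoting step, here is a sketch of how such a command line could be assembled. It is not the actual fs_backup code: build_find_command is a hypothetical helper, written for Python 3, and it leans on shlex.quote to add quotes only where the shell would otherwise expand something.

```python
import shlex


def build_find_command(top, excludes):
    """Return a find command line that prunes every pattern in `excludes`."""
    parts = ["find", shlex.quote(top)]
    if excludes:
        # Quote each pattern so the shell passes 'dir/*' through to find unchanged.
        tests = " -o ".join("-path %s" % shlex.quote(p) for p in excludes)
        parts.append("\\( %s \\) -prune -o" % tests)
    parts.append("! -type d -print")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_find_command("/opt", ["/opt/backup", "/opt/profiles/*"]))
```

Another option would be to skip the shell entirely and hand find an argument list through the subprocess module, in which case no quoting is needed at all.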

sh$ find . | grep subversion
./.subversion
./.subversion/README.txt
./.subversion/config
./.subversion/servers
./.subversion/auth
./.subversion/auth/svn.username
./.subversion/auth/svn.simple
./.subversion/auth/svn.ssl.server
sh$ find . -path ./.subversion -prune -o -print | grep subversion
sh$ find . -path './.subversion/*' -prune -o -print | grep subversion
./.subversion
sh$ find . -path './.subversion/*/*' -prune -o -print | grep subversion
./.subversion
./.subversion/README.txt
./.subversion/config
./.subversion/servers
./.subversion/auth

OK, that works too!

But look at what actually happens:

sh# cat /var/fs_backup/myopt/.t_files
/opt
sh# cat /var/fs_backup/myopt/.x_files
/opt/backup
/opt/profiles/*
sh# ./fs_backup myopt
/opt/profiles/* is not a valid file or directory
find /opt \( -path '/opt/backup' \) -prune ! -type d -print >>/tmp/fs_backup.myopt.full.1171197285.0.list

The problem is in the code that finds the top-level files and directories, which does not support the * wildcard. It is changed as follows:

import glob
import os
import sys

def find_top_files(tmpl):
    tmpl.sort()
    topfiles = []
    invalids = []
    file1 = ''
    for item in tmpl:
        item = item.strip()
        # A 'dir/*' style string does not exist literally, so also try glob;
        # the item is invalid only when neither check finds anything.
        if not os.path.exists(item) and len(glob.glob(item)) == 0:
            strerr = "\033[31m%s is not a valid file or directory\033[00m" % item
            print >> sys.stderr, strerr
            invalids.append(item)
            continue
        if not file1:
            if item == '': continue
            file1 = item
            topfiles.append(file1)
        elif item.startswith(file1):
            # 'file1' is the parent directory of 'item'
            continue
        elif file1.startswith(item):
            file1 = item
            topfiles[-1] = file1
        else:
            file1 = item
            topfiles.append(file1)
    return topfiles, invalids

Wednesday, February 07, 2007

SNMP access control (1)

Earlier posts introduced SNMP and its MIBs and used a few tools to look at the MIB tree. The problem is that I can only see the system branch, that is:

snmpwalk -v2c -c public localhost system

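but not interfaces, so how can network traffic be monitored? snmpget cannot fetch the IF-MIB:: information I need either, and what snmpgetnext returns is not right. snmpgetnext is supposed to return the next (NEXT) node, for example: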

sh$ snmpwalk -v2c -c demo 192.168.0.98 system | head -n 2
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 2.6.14.2 #1 SMP Thu Jan 11 15:39:36 EST 2007 i686
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
sh$ snmpget -v2c -c demo 192.168.0.98 SNMPv2-MIB::sysDescr.0
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 2.6.14.2 #1 SMP Thu Jan 11 15:39:36 EST 2007 i686
sh$ snmpgetnext -v2c -c demo 192.168.0.98 SNMPv2-MIB::sysDescr.0
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
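
What "next" means can be pictured with a sorted table of OIDs. The following is a toy model in Python, not how an agent is implemented: get_next and the two-entry table are invented for the example, with the numeric OIDs of sysDescr.0 and sysObjectID.0 filled in.

```python
from bisect import bisect_right


def oid(text):
    """Turn '.1.3.6.1' into the tuple (1, 3, 6, 1); tuples sort in OID order."""
    return tuple(int(part) for part in text.strip(".").split("."))


def get_next(table, requested):
    """Return (oid, value) of the first entry sorted after `requested`, or None."""
    keys = sorted(table)
    index = bisect_right(keys, requested)
    if index == len(keys):
        return None  # past the end of the visible tree
    return keys[index], table[keys[index]]


TABLE = {
    oid(".1.3.6.1.2.1.1.1.0"): "sysDescr.0",
    oid(".1.3.6.1.2.1.1.2.0"): "sysObjectID.0",
}

if __name__ == "__main__":
    print(get_next(TABLE, oid(".1.3.6.1.2.1.1.1.0")))  # the entry after sysDescr.0
    print(get_next(TABLE, oid(".1.3.6.1.2.1.1")))      # first entry under system
    print(get_next(TABLE, oid(".1.3.6.1.2.1.1.2.0")))  # nothing left
```

The requested OID does not have to exist; the answer is simply whatever comes after it in the ordering, which is also how snmpwalk moves through a subtree.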

Here -c demo is a community name, and the address is 192.168.0.98 rather than localhost, because snmpd.conf has been changed. With the default, unmodified snmpd.conf only -c public localhost works; anything else just gets a message like "Timeout: No Response from 192.168.0.98." This is covered below.

According to the net-snmp FAQ entry "I can see the system group, but nothing else. Why?", the interfaces subtree is unavailable because of the agent's access control. The two entries
net-snmp FAQ "How do I configure access control?"
and
net-snmp FAQ "I don't understand the new access control stuff - what does it mean?"
explain how to configure the agent's access control.

Only SNMPv2 is considered here, not SNMPv3. The question access control answers is which people (who) may read which subtrees (what). The directives involved are com2sec, group, view and access.

Start with the access directive, which is the one that says who may read which subtrees. Its syntax is:

access {group} "" any noauth exact {read-tree} {write-tree} {notify-tree}

Here {group} is a group to be defined with the group directive, and {read-tree} {write-tree} {notify-tree} are subtrees to be defined with view. So group is the who and view is the what.

So group defines the who:

# com2sec notConfigUser  default       public
# group   notConfigGroup  v1           notConfigUser
# group   notConfigGroup  v2c          notConfigUser
com2sec mynet     192.168.0.0/24  demo
group   gmynet     v1              mynet
group   gmynet     v2c             mynet

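To make things clearer, the original lines are commented out. v1/v2c is the securityModel, the value passed to snmpwalk/snmpget as -v2c (-v 2c). Our group is gmynet, which maps to the name mynet (the security name); for simplicity the group can also be called mynet directly, without the extra indirection: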

group   mynet     v1              mynet
group   mynet     v2c             mynet

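com2sec, i.e. community to security, effectively defines address-based access control. It also seems to map the SNMPv1/SNMPv2 community name, such as demo above, to a security name, so that -c demo can be used together with -v2c in snmpwalk/snmpget. Once defined as above, only the form snmpwalk/snmpget -v2c -c demo 192.168.0.98 works; -c public localhost no longer does and fails with "Timeout: No Response from localhost".

Then view defines which subtrees can be read: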

view    interface included       .1.3.6.1.2.1.2
view    system    included       .1.3.6.1.2.1.1
view    system    included       .1.3.6.1.2.1.25.1.1

snmptranslate gives the numeric form of a subtree:

sh$ snmptranslate -On IF-MIB::interfaces
.1.3.6.1.2.1.2
sh$ snmptranslate -On SNMPv2-MIB::system
.1.3.6.1.2.1.1

The MIB names can also be used directly.

The access definitions should then be:

access  mynet ""  any  noauth  exact  system  none  none
access  mynet ""  any  noauth  exact  interface  none  none

In principle the interfaces values should now be available. Remember to make the agent reread its configuration; on RHEL4, /etc/init.d/snmpd restart does it.

But in practice it does not work:

sh$ snmpwalk -v2c -c demo 192.168.0.98 interfaces
IF-MIB::interfaces = No Such Object available on this agent at this OID
sh$ snmpget -v2c -c demo 192.168.0.98 IF-MIB::ifDescr.1
IF-MIB::ifDescr.1 = No Such Object available on this agent at this OID

Yet with the following settings it does:

view    all       included       .1
access  mynet "" any  noauth  exact  all  none  none

sh$ snmpwalk -v2c -c demo 192.168.0.98 interfaces
IF-MIB::ifNumber.0 = INTEGER: 4
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
IF-MIB::ifIndex.4 = INTEGER: 4
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
IF-MIB::ifDescr.4 = STRING: sit0
IF-MIB::ifType.1 = INTEGER: softwareLoopback(24)
IF-MIB::ifType.2 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.3 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.4 = INTEGER: tunnel(131)
IF-MIB::ifMtu.1 = INTEGER: 16436
IF-MIB::ifMtu.2 = INTEGER: 1500
IF-MIB::ifMtu.3 = INTEGER: 1500
IF-MIB::ifMtu.4 = INTEGER: 1480
IF-MIB::ifSpeed.1 = Gauge32: 10000000
IF-MIB::ifSpeed.2 = Gauge32: 100000000
IF-MIB::ifSpeed.3 = Gauge32: 10000000
IF-MIB::ifSpeed.4 = Gauge32: 0
IF-MIB::ifPhysAddress.1 = STRING:
IF-MIB::ifPhysAddress.2 = STRING: 0:2:b3:b0:59:36
IF-MIB::ifPhysAddress.3 = STRING: 0:2:b3:b0:59:4a
IF-MIB::ifPhysAddress.4 = STRING: 0:0:0:0:59:4a
IF-MIB::ifAdminStatus.1 = INTEGER: up(1)
IF-MIB::ifAdminStatus.2 = INTEGER: up(1)
IF-MIB::ifAdminStatus.3 = INTEGER: down(2)
IF-MIB::ifAdminStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifInOctets.1 = Counter32: 381118
IF-MIB::ifInOctets.2 = Counter32: 125019173
IF-MIB::ifInOctets.3 = Counter32: 0
IF-MIB::ifInOctets.4 = Counter32: 0
IF-MIB::ifInUcastPkts.1 = Counter32: 4308
IF-MIB::ifInUcastPkts.2 = Counter32: 1069602
IF-MIB::ifInUcastPkts.3 = Counter32: 0
IF-MIB::ifInUcastPkts.4 = Counter32: 0
IF-MIB::ifInDiscards.1 = Counter32: 0
IF-MIB::ifInDiscards.2 = Counter32: 0
IF-MIB::ifInDiscards.3 = Counter32: 0
IF-MIB::ifInDiscards.4 = Counter32: 0
IF-MIB::ifInErrors.1 = Counter32: 0
IF-MIB::ifInErrors.2 = Counter32: 0
IF-MIB::ifInErrors.3 = Counter32: 0
IF-MIB::ifInErrors.4 = Counter32: 0
IF-MIB::ifOutOctets.1 = Counter32: 383414
IF-MIB::ifOutOctets.2 = Counter32: 1770179210
IF-MIB::ifOutOctets.3 = Counter32: 0
IF-MIB::ifOutOctets.4 = Counter32: 0
IF-MIB::ifOutUcastPkts.1 = Counter32: 4340
IF-MIB::ifOutUcastPkts.2 = Counter32: 1319881
IF-MIB::ifOutUcastPkts.3 = Counter32: 0
IF-MIB::ifOutUcastPkts.4 = Counter32: 0
IF-MIB::ifOutDiscards.1 = Counter32: 0
IF-MIB::ifOutDiscards.2 = Counter32: 0
IF-MIB::ifOutDiscards.3 = Counter32: 0
IF-MIB::ifOutDiscards.4 = Counter32: 0
IF-MIB::ifOutErrors.1 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 0
IF-MIB::ifOutErrors.3 = Counter32: 0
IF-MIB::ifOutErrors.4 = Counter32: 0
IF-MIB::ifOutQLen.1 = Gauge32: 0
IF-MIB::ifOutQLen.2 = Gauge32: 0
IF-MIB::ifOutQLen.3 = Gauge32: 0
IF-MIB::ifOutQLen.4 = Gauge32: 0
IF-MIB::ifSpecific.1 = OID: SNMPv2-SMI::zeroDotZero
IF-MIB::ifSpecific.2 = OID: SNMPv2-SMI::zeroDotZero
IF-MIB::ifSpecific.3 = OID: SNMPv2-SMI::zeroDotZero
IF-MIB::ifSpecific.4 = OID: SNMPv2-SMI::zeroDotZero

sh$ snmpget -v2c -c demo 192.168.0.98 IF-MIB::ifDescr.1
IF-MIB::ifDescr.1 = STRING: lo
sh$ snmpget -v2c -c demo 192.168.0.98 IF-MIB::ifDescr.2
IF-MIB::ifDescr.2 = STRING: eth0
sh$ snmpgetnext -v2c -c demo 192.168.0.98 IF-MIB::ifDescr.2
IF-MIB::ifDescr.3 = STRING: eth1

So what was wrong with the initial configuration?

In any case, to be on the safe side, only the following access is granted:

sh$ snmptranslate .1.3.6.1.2.1
SNMPv2-SMI::mib-2
sh$ snmptranslate -Of .1.3.6.1.2.1
.iso.org.dod.internet.mgmt.mib-2

sh$ cat /etc/snmp/snmpd.conf
view    system       included       .1.3.6.1.2.1
access  mynet "" any  noauth  exact  system  none  none
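
Why a single view rooted at .1.3.6.1.2.1 is enough for both branches can be seen from a simple prefix test. The sketch below is a simplified model in Python (in_view is a made-up name, and it ignores excluded subtrees and masks, which real views also support): an OID is visible when it lies under one of the view's included subtrees.

```python
def oid(text):
    """Turn '.1.3.6.1' into the tuple (1, 3, 6, 1)."""
    return tuple(int(part) for part in text.strip(".").split("."))


def in_view(view, name):
    """True when `name` lies under any subtree included in `view`."""
    target = oid(name)
    return any(target[:len(sub)] == sub for sub in (oid(s) for s in view))


SYS_DESCR_0 = ".1.3.6.1.2.1.1.1.0"        # SNMPv2-MIB::sysDescr.0
IF_DESCR_1 = ".1.3.6.1.2.1.2.2.1.2.1"     # IF-MIB::ifDescr.1

if __name__ == "__main__":
    mib2_view = [".1.3.6.1.2.1"]            # the view in the final snmpd.conf
    system_only = [".1.3.6.1.2.1.1"]        # a view holding just the system subtree
    print(in_view(mib2_view, SYS_DESCR_0), in_view(mib2_view, IF_DESCR_1))
    print(in_view(system_only, SYS_DESCR_0), in_view(system_only, IF_DESCR_1))
```

A view may hold several subtrees, so listing both .1.3.6.1.2.1.1 and .1.3.6.1.2.1.2 under one view name is another way to express the same intent.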

Tuesday, February 06, 2007

SNMP MIBs base && some utils

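The MIB (Management Information Base) is a tree-shaped database organized by numbers: every node (OID) is numeric. There is therefore a mapping between names and numbers. Names such as system and interfaces have to be mapped to the actual device nodes on each managed host, and in the other direction the numbers have to be mapped back to names. All of these mappings are defined in MIB files, under /usr/share/snmp/mibs (the location depends on the installation). For example: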

sh$ grep 'system' /usr/share/snmp/mibs/SNMPv2-MIB.txt
system   OBJECT IDENTIFIER ::= { mib-2 1 }
......

The snmptranslate command shows the mapping:

sh$ snmptranslate .1.3.6.1.2.1.1.3.0
SNMPv2-MIB::sysUpTime.0
sh$ snmptranslate -On SNMPv2-MIB::system.sysUpTime.0
.1.3.6.1.2.1.1.3.0

So SNMPv2-MIB is simply /usr/share/snmp/mibs/SNMPv2-MIB.txt.

To use custom local MIBs, see: NET-SNMP Tutorial -- Using local MIBs

snmpwalk retrieves a whole subtree:

sh$ snmpwalk -v2c -c public localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 2.6.14.2 #1 SMP Thu Jan 11 15:39:36 EST 2007 i686
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
SNMPv2-MIB::sysUpTime.0 = Timeticks: (687617) 1:54:36.17
SNMPv2-MIB::sysContact.0 = STRING: zhoupeng@zovatech.com
SNMPv2-MIB::sysName.0 = STRING: localhost.localdomain
SNMPv2-MIB::sysLocation.0 = STRING: Unknown (edit /etc/snmp/snmpd.conf)
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORID.1 = OID: IF-MIB::ifMIB
SNMPv2-MIB::sysORID.2 = OID: SNMPv2-MIB::snmpMIB
SNMPv2-MIB::sysORID.3 = OID: TCP-MIB::tcpMIB
SNMPv2-MIB::sysORID.4 = OID: IP-MIB::ip
SNMPv2-MIB::sysORID.5 = OID: UDP-MIB::udpMIB
SNMPv2-MIB::sysORID.6 = OID: SNMP-VIEW-BASED-ACM-MIB::vacmBasicGroup
SNMPv2-MIB::sysORID.7 = OID: SNMP-FRAMEWORK-MIB::snmpFrameworkMIBCompliance
SNMPv2-MIB::sysORID.8 = OID: SNMP-MPD-MIB::snmpMPDCompliance
SNMPv2-MIB::sysORID.9 = OID: SNMP-USER-BASED-SM-MIB::usmMIBCompliance
SNMPv2-MIB::sysORDescr.1 = STRING: The MIB module to describe generic objects for network interface sub-layers
SNMPv2-MIB::sysORDescr.2 = STRING: The MIB module for SNMPv2 entities
SNMPv2-MIB::sysORDescr.3 = STRING: The MIB module for managing TCP implementations
SNMPv2-MIB::sysORDescr.4 = STRING: The MIB module for managing IP and ICMP implementations
SNMPv2-MIB::sysORDescr.5 = STRING: The MIB module for managing UDP implementations
SNMPv2-MIB::sysORDescr.6 = STRING: View-based Access Control Model for SNMP.
SNMPv2-MIB::sysORDescr.7 = STRING: The SNMP Management Architecture MIB.
SNMPv2-MIB::sysORDescr.8 = STRING: The MIB for Message Processing and Dispatching.
SNMPv2-MIB::sysORDescr.9 = STRING: The management information definitions for the SNMP User-based Security Model.
SNMPv2-MIB::sysORUpTime.1 = Timeticks: (0) 0:00:00.00
SNMPv2-MIB::sysORUpTime.2 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.3 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.4 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.5 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.6 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.7 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.8 = Timeticks: (1) 0:00:00.01
SNMPv2-MIB::sysORUpTime.9 = Timeticks: (1) 0:00:00.01

sh$ snmpwalk -v2c -c public localhost interfaces
IF-MIB::interfaces = No Such Object available on this agent at this OID

Adding the -Of option gives the full tree path. I don't quite understand how system and interfaces are defined and recognized here, because the same thing fails with snmpget:

sh$ snmpget -v2c -c public localhost system
SNMPv2-MIB::system = No Such Object available on this agent at this OID
sh$ snmpget -v2c -c public localhost SNMPv2-MIB::system
SNMPv2-MIB::system = No Such Object available on this agent at this OID
sh$ snmpget -v2c -c public localhost SNMPv2-MIB::sysDescr.0
SNMPv2-MIB::sysDescr.0 = STRING: Linux localhost.localdomain 2.6.14.2 #1 SMP Thu Jan 11 15:39:36 EST 2007 i686

I now know some of the interfaces MIB objects (from Linux Server Hacks, Volume 2). snmpwalk returned nothing for them earlier; what about snmpget?

sh$ snmpget -v2c -c public localhost IF-MIB::ifDescr.1
IF-MIB::ifDescr.1 = No Such Object available on this agent at this OID

Following the net-snmp FAQ again, try snmpgetnext:

sh$ snmpgetnext -v2c -c public localhost IF-MIB::ifDescr.1
HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (70364798) 8 days, 3:27:27.98
sh$ snmpgetnext -v2c -c public localhost HOST-RESOURCES-MIB::hrSystemUptime.0
HOST-RESOURCES-MIB::hrSystemUptime.0 = No more variables left in this MIB View (It is past the end of the MIB tree)

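According to that FAQ entry this indicates a problem: the reply comes from a different MIB, or is something like "end of MIB", and the configuration needs to be changed. But how? Linux Server Hacks, Volume 2 shows this output: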

IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
...

SNMP base

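SNMP (Simple Network Management Protocol) will play an important role in the monitoring subsystem of the architecture. Roughly, it works like this: every monitored host or node (a switch, for example) runs an agent that collects all relevant information about the node, listens on the SNMP port, UDP 161, and receives commands (queries and sets) from the monitoring host on that port.

With RHEL4's net-snmp, monitored hosts need the net-snmp package (which contains the snmpd agent) and the monitoring side needs net-snmp-utils. Building from source requires the beecrypt (libbeecrypt) and elf (libelf) libraries.

Every agent maintains a tree-shaped database called the MIB (Management Information Base). Each of its nodes is an Object Identifier (OID), which is what the net-snmp-utils tools work with. These nodes describe the host's devices, for example a network interface's description (eth0 etc.), physical (MAC) address and interface type; they may also hold system information, processes to be monitored, and so on.

The options shared by the net-snmp-utils tools are not listed in each tool's own man page; look in man snmpcmd instead. There is no actual snmpcmd command; the page just documents the options common to all the utilities.

Monday, February 05, 2007

python: checking whether a path is a parent directory

My project needs to operate on a list (the list form of configuration). A very common kind of list configuration is a list of file path names, preferably absolute paths (although the function below does not require that). I may want to find all the top-level directories and files, which this function does: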

import os
import sys

def find_top_files(tmpl):
    tmpl.sort()
    topfiles = []
    invalids = []
    file1 = ''
    for item in tmpl:
        item = item.strip()
        if not os.path.exists(item):
            strerr = "\033[31m%s is not a valid file or directory\033[00m" % item
            print >> sys.stderr, strerr
            invalids.append(item)
            continue
        if not file1:
            if item == '': continue
            file1 = item
            topfiles.append(file1)
        elif item.startswith(file1):
            # 'file1' is the parent directory of 'item'
            continue
        elif file1.startswith(item):
            file1 = item
            topfiles[-1] = file1
        else:
            file1 = item
            topfiles.append(file1)
    return topfiles, invalids

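The list is sorted first, which makes it easier to work with. The main question after that is how to tell whether one pathname is the parent directory of another (or a subdirectory or file below it). Pathnames are strings, but str1 in str2 cannot be used directly: '/etc' in '/usr/etc' is obviously true, yet the two have no parent/child relationship. So the function uses the startswith tests shown above, item.startswith(file1) and file1.startswith(item), i.e. str.startswith(prefix). I think this should work, and it is much simpler than doing the check with the os.path.* functions.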
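
One case a plain startswith does not cover is a sibling that shares a prefix: '/etc2/hosts'.startswith('/etc') is true even though /etc is not its parent. A small sketch of a stricter test follows; is_parent is a name chosen here, it works on strings only (no symlink resolution), and it uses posixpath because the paths in question are Unix paths.

```python
import posixpath


def is_parent(parent, child):
    """True when `child` is `parent` itself or lies somewhere below it."""
    parent = posixpath.normpath(parent)
    child = posixpath.normpath(child)
    if child == parent:
        return True
    # Compare against 'parent/' so that /etc does not claim /etc2.
    return child.startswith(parent.rstrip("/") + "/")


if __name__ == "__main__":
    for parent, child in [("/etc", "/etc/passwd"),
                          ("/etc", "/etc2/hosts"),
                          ("/etc", "/usr/etc"),
                          ("/", "/usr")]:
        print(parent, child, is_parent(parent, child))
```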

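A few tests of find options

find and tar are going to be combined into a scheme for full, differential and incremental backups, so find's behaviour has to be understood thoroughly. find's arguments are not GNU getopt style options; combinations of them can be puzzling, and a different order sometimes gives a completely different result. For example: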

sh# find . -type f \( -path ./docbook \) -prune -o -print | grep 'docbook'
./blfs-book-6.1-html/pst/docbook-utils.html
./blfs-book-6.1-html/pst/docbook-dsssl.html
./blfs-book-6.1-html/pst/docbook-xsl.html
./docbook
./docbook/index.xml

This does not have the desired effect, which is to exclude the ./docbook directory and everything under it. But with:

sh# find . \( -path ./docbook \) -prune -o -print -type f | grep 'docbook'
./blfs-book-6.1-html/pst/docbook-utils.html
./blfs-book-6.1-html/pst/docbook-dsssl.html
./blfs-book-6.1-html/pst/docbook-xsl.html

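it works. This is consistent with the evaluation order of -o (OR) and -a (AND): -print -type f is equivalent to -print -a -type f. In the first command, -type f comes before -path ./docbook; since ./docbook is a directory, that test fails and -prune is never reached.

Now look at the timestamp tests: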
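The left-to-right evaluation can be modelled with a small Python sketch. This is only a toy model of an implicit -a chain, not find's real implementation; run_chain, type_f and do_print are hypothetical names.

```python
def run_chain(chain, entry):
    """Evaluate an implicit -a chain left to right, stopping at the first false."""
    printed = []
    for primary in chain:
        if not primary(entry, printed):
            break
    return printed

def type_f(entry, printed):
    # Stand-in for "-type f": true only for regular files.
    return entry["type"] == "f"

def do_print(entry, printed):
    # Stand-in for "-print": an action that always succeeds.
    printed.append(entry["path"])
    return True

directory = {"path": "./pst", "type": "d"}
regular = {"path": "./pst/a.html", "type": "f"}
```

With run_chain([type_f, do_print], directory) nothing is printed, while run_chain([do_print, type_f], directory) prints the directory, because the action has already run by the time the test is evaluated.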

sh# ls ts -l
-rw-r--r-- 1 root root 0 2007-01-01 00:00 ts
sh# ls blfs-book-6.1-html/pst/docbook-* -l
-rw-r--r-- 1 roc roc 11619 2005-08-21 05:14 blfs-book-6.1-html/pst/docbook-dsssl.html
-rw-r--r-- 1 roc roc 13821 2005-08-21 05:14 blfs-book-6.1-html/pst/docbook-utils.html
-rw-r--r-- 1 roc roc 11979 2005-08-21 05:14 blfs-book-6.1-html/pst/docbook-xsl.html
sh# find . \( -path ./docbook \) -prune -o -type f -print | grep 'docbook'
./blfs-book-6.1-html/pst/docbook-utils.html
./blfs-book-6.1-html/pst/docbook-dsssl.html
./blfs-book-6.1-html/pst/docbook-xsl.html
sh# find . \( -path ./docbook \) -prune -o -type f -print -cnewer ts | grep 'docbook'
./blfs-book-6.1-html/pst/docbook-utils.html
./blfs-book-6.1-html/pst/docbook-dsssl.html
./blfs-book-6.1-html/pst/docbook-xsl.html
sh# find . \( -path ./docbook \) -prune -o -cnewer ts -type f -print | grep 'docbook'

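As you can see, the position of -cnewer ts matters: placed at the end, after -print, it has no effect at all!

"Abnormal" DNS

Interconnection between the northern and southern networks has become a big problem these days. I suppose it counts as a case of administrative inaction. Since the authorities offer no solution, we have to find one ourselves, and of course it is ordinary people who end up footing the bill :P

The workaround is to put a squid cache on the China Netcom (CNC) side. That requires some changes to how the domain is resolved: users coming from China Telecom should take the Telecom path, and users coming from CNC should be sent to the CNC cache.

Note that "Telecom users" and "CNC users" here refers not only to the IP addresses of the clients' PCs but also to the IP ranges of the DNS servers they use, because the actual query is submitted to this name server by the DNS server they have configured, which then returns the result to the client host.

So BIND9 is configured as follows: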

options {
        directory "/var/named";
        allow-query { any; };
        recursion yes;
        allow-transfer { none; };
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        listen-on { 192.168.0.222; };
};
logging {
        channel "querylog" {
                file "/var/log/named.log" versions 3 size 100m;
                print-time yes;
                print-severity yes;
        };
        category queries { querylog; };
};

acl cnc_ip {
192.168.0.222;
58.16.0.0/13;
58.100.0.0/15;
58.211.0.0/16;
58.240.0.0/12;
60.0.0.0/11;
60.52.145.0/24;
60.55.0.0/24;
60.194.192.0/24;
60.208.0.0/12;
61.4.64/20;
61.14.128.0/23;
61.29.146.0/23;
61.48.0.0/13;
......
};

view "from_cnc" {
    match-clients { cnc_ip; };
    zone "." {
        type hint;
        file "named.ca";
    };
    zone "0.0.127.in-addr.arpa" {
        type master;
        file "named.local";
    };

    zone "example.com" {
        type master;
        file "example.com.cnc_zone";
        allow-update { none; };
    };
};

view "from_telcom" {
    match-clients { any; };
    zone "." {
        type hint;
        file "named.ca";
    };
    zone "0.0.127.in-addr.arpa" {
        type master;
        file "named.local";
    };

    zone "example.com" {
        type master;
        file "example.com.zone";
        allow-update { none; };
    };
};

sh# cat example.com.zone
;
; BIND data file for local loopback interface
;
$TTL    10
$ORIGIN example.com.
@       IN      SOA     example.com. admin.example.com. (
        2006120602      ; Serial
        10s             ; Refresh
        10              ; Retry
        10              ; Expire
        10 )            ; Negative Cache TTL
        IN      NS      ns1.example.com.
ns1     IN      A       192.168.0.222
@       IN      A       124.74.193.210
www     IN      A       192.168.0.210
bbs     IN      CNAME   www
test    IN      A       124.74.193.210

sh# cat example.com.cnc_zone
;
; BIND data file for local loopback interface
;
$TTL    10
$ORIGIN example.com.
@       IN      SOA     example.com. admin.example.com. (
        2006120602      ; Serial
        10s             ; Refresh
        10              ; Retry
        10              ; Expire
        10 )            ; Negative Cache TTL
        IN      NS      ns1.example.com.
ns1     IN      A       192.168.0.222
@       IN      A       210.51.46.227
www     IN      A       192.168.0.210
bbs     IN      CNAME   www
test    IN      A       210.51.46.227

The result can then be checked with dig. First, with 192.168.0.222 (this host) left out of the acl cnc_ip{} block, run dig:

sh# dig test.example.com @192.168.0.222
;; ANSWER SECTION:
test.example.com.       10      IN      A       124.74.193.210

And after adding this host's address to the acl cnc_ip{} block:

sh# dig test.example.com @192.168.0.222
;; ANSWER SECTION:
test.example.com.       10      IN      A       210.51.46.227

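This approach resolves a name to different IPs depending on where the client comes from. It differs from the usual DDNS (dynamic DNS), yet it is not really that "smart" either, so I could only call it "abnormal" DNS !^_^

This may not be a very good approach, but I do not know of any other; I picked it up from someone else. Personally, I would rather not spend too much time on problems like this. "No amount of genius can overcome a preoccupation with detail": problems keep coming and can never all be solved, but a good design can avoid many of them, especially the mechanical, repetitive ones that machines ought to handle.

If a customer, say a CNC user, gets the wrong answer, that is, a result that sends him to Telecom, first check whether his own IP address is within the cnc_ip ranges. If it is, check whether the IP address of the DNS server he has configured is within those ranges. If it is not, you can estimate the range: for an address such as 202.99.96.68, look up 202.99.1.68 and 202.99.255.68 on http://www.ip138.com/ to see whether both belong to CNC. If they do, you can conclude that the whole 202.99.0.0/16 block belongs to CNC, and then widen the search to 202.98.*, 202.100.* and so on.

RHEL4 kernel: 4G MEM && SMP "conflict"

A Powerleader (宝德) server running RHEL4; the kernel is built with large-memory support (64G) and SMP (symmetric multiprocessing). After fitting four 1G Samsung memory modules, the system failed to boot: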
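The first troubleshooting step, checking whether an address falls inside the cnc_ip ranges, can be sketched with Python's ipaddress module. Only four of the networks from the acl above are included (the full list is elided there), and in_cnc_acl and pick_view are hypothetical helper names.

```python
import ipaddress

# A few of the networks from the cnc_ip acl; the full list is elided in the post.
CNC_NETWORKS = [ipaddress.ip_network(net) for net in (
    "58.16.0.0/13",
    "58.100.0.0/15",
    "60.0.0.0/11",
    "61.48.0.0/13",
)]

def in_cnc_acl(address):
    """Return True if `address` falls inside one of the listed CNC networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in CNC_NETWORKS)

def pick_view(address):
    # BIND uses the first view whose match-clients matches the client,
    # so "from_cnc" is tried before the catch-all "from_telcom".
    return "from_cnc" if in_cnc_acl(address) else "from_telcom"
```

Remember to test both the client's own address and the address of the DNS server it uses, since it is the latter that actually sends the query.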

target 0:0:0: FAST-80 WIDE SCSI 168.0MB/s DT(12.5ns, offset 62)
sym0:0: ERROR(20:0) (d-2d-2f) (3e/18/80) @ (scripta 7f0:150003d8)
sym0: script cmd = 88080000
sym0: regdump: ...
sym0: RCI STATUS = 0x2000
sym0: SCSI BUS reset detected
sym0: SCSI BUS has been reset
......

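These messages repeat over and over and the system cannot boot. Yet the uniprocessor kernel boots fine! Windows also boots, and the BIOS recognizes the memory correctly.

So the problem can only be in the system itself. I therefore looked at the kernel config file: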

sh# grep -i 'MEM' config
......
# CONFIG_HIGHMEM4G is not set
CONFIG_HIGHMEM64G=y
CONFIG_HIGHMEM=y
......

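There are both HIGHMEM4G and HIGHMEM64G, which also appear under "Processor type and features" -> "High Memory Support" in make menuconfig. If 64G is already supported, why would 4G be needed as well? This was the suspicious point and also the way in. From the help text, roughly: with between 1G and 4G of memory it is best to use the 4G option, and with more than 4G the 64G option should be used.

After rebuilding the kernel along these lines, everything was OK.

Thursday, February 01, 2007

Resetting a lost MySQL root password

If you have forgotten the MySQL root password, it can be reset as follows: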
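The check described above can be sketched in Python: read the config, see which HIGHMEM option is enabled, and compare it with the rule of thumb from the help text. Both function names are hypothetical, and the below-1G branch is my own assumption rather than something stated in this post.

```python
def enabled_highmem(config_text):
    """Return the CONFIG_HIGHMEM* options that are set to y in a kernel config."""
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        # Lines like "# CONFIG_HIGHMEM4G is not set" start with '#' and are skipped.
        if line.startswith("CONFIG_HIGHMEM") and line.endswith("=y"):
            enabled.append(line[:-2])
    return enabled

def suggested_highmem(ram_gib):
    """Rule of thumb as summarized above: 1G to 4G -> 4G option, more -> 64G."""
    if ram_gib > 4:
        return "CONFIG_HIGHMEM64G"
    if ram_gib >= 1:
        return "CONFIG_HIGHMEM4G"
    # Assumption: below 1G no high memory support is needed.
    return "CONFIG_NOHIGHMEM"
```

For the machine in this post, enabled_highmem reports CONFIG_HIGHMEM64G while suggested_highmem(4) gives CONFIG_HIGHMEM4G, which is the mismatch that the rebuild resolved.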

sh# /etc/init.d/mysqld stop
sh# mysqld_safe --defaults-file=/etc/my.cnf --skip-grant-tables &
sh# mysql -u root
mysql> update mysql.user set password=PASSWORD('********') where User='root';
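mysql> flush privileges;
OR (SET PASSWORD is refused under --skip-grant-tables until flush privileges has reloaded the grant tables)
mysql> flush privileges;
mysql> set password for 'root'@'localhost' = PASSWORD('********');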
mysql> quit
sh# /etc/init.d/mysqld stop
sh# /etc/init.d/mysqld start
