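IBM Softwares: December 2009

Saturday, December 26, 2009

Installing gcc and the kernel source on Ubuntu

On Ubuntu, apt-get is the handy tool for installing almost anything. For example, to install gcc, run the following command: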

sudo apt-get install gcc

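To install the kernel source, use:

sudo apt-get install linux-source

To install the kernel header files, use:

sudo apt-get install linux-headers

This command returns a list of packages for different releases. Run uname -r to find the release of the running kernel, then rerun the command with the matching package name: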

sudo apt-get install linux-headers-2.6.31-14-generic
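
The lookup-then-install step can be scripted. Below is a minimal sketch in Python, assuming the header package always follows the linux-headers-<release> naming seen above. headers_package is a hypothetical helper name, and the command is only printed, not executed.

```python
import platform


def headers_package(release: str) -> str:
    """Build the header package name for a kernel release string."""
    return f"linux-headers-{release}"


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r` on Linux.
    release = platform.release()
    print("sudo apt-get install", headers_package(release))
```

Printing the command rather than running it keeps the sketch safe to try on any machine.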

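Tuesday, December 15, 2009

DB2 Process Model

From the moment it receives an application's request, DB2 completes the request through a chain of cooperating agents (a kind of EDU, or Engine Dispatchable Unit). Before version 9.5, only Windows used a thread model; UNIX and Linux created these agents under a process model, which consumed more system resources. From version 9.5 on, all agents use the thread model.

As the figure below shows, each instance has one listener per protocol in use, waiting for client requests. There are three listeners:

db2ipccm: handles requests from local clients. When the client and server run in the same OS image, the client can talk to this listener over the IPC protocol.
db2tcpcm: handles requests from remote clients. When the client and server run in different OS images, the client talks to this listener over TCP/IP.
db2tcpdm: handles requests from the DB2 Discovery tool, which finds out which databases are available in a remote instance.

Once a listener accepts a request, it assigns a coordinator agent (db2agent) to the application connection. The assigned db2agent acts as the application's proxy inside DB2: it communicates with the application and processes its requests. In a partitioned database environment, or when the INTRA_PARALLEL parameter is set to YES, the coordinator agent hands the work to subagents, and its role becomes coordinating those subagents until the work is done.

If the access plan produced by the Access Plan Manager calls for prefetching, the coordinator agent sends prefetch requests to the prefetch queue. Each database has its own prefetchers, which take requests from the queue and read the data from disk into the buffer pool. Once the data is in the buffer pool, the coordinator agent can update it as the application requires. Updated data stays in the buffer pool until a page cleaner writes it back to disk.

In addition, every transaction issued by an application is written to the transaction log by the logger agent, so that it is available for recovery.

The figure below shows the scope in which each agent operates.

Next, the DB2-related processes. As noted above, from version 9.5 on agents use the thread model: every agent becomes a thread attached to the main process, which greatly reduces system resource usage.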

Process name	Description	Platforms
db2acd	The autonomic computing daemon. It handles client-side autonomic tasks such as the health monitor and the automatic maintenance utilities. The health-monitor process runs as a DB2 fenced-mode process; on Windows it appears as db2fmp.	Linux and UNIX only
db2ckpwd	Verifies user IDs and passwords on the DB2 server.	Linux and UNIX only
db2fmcd	The default fault monitor coordinator daemon; there is one per physical machine.	UNIX only
db2fmd	The default fault monitor daemon; every DB2 instance has one fault monitor. It is watched by db2fmcd: if this process is killed, db2fmcd starts it again.	UNIX only
db2fmp	Runs user code on the DB2 server, outside the DB2 firewall. Each db2fmp is normally a separate process, although some runtime types can run multithreaded.	All LUW platforms
db2sysc	The DB2 system controller (engine) process. In DB2 9.5 each partition has exactly one db2sysc engine process, and every EDU (Engine Dispatchable Unit) runs as a thread under it. If this process stops, the DB2 server cannot function. (On Windows the process is named db2syscs.)	All LUW platforms
db2vend	The fenced vendor process, introduced in DB2 9.5. Under the threaded engine, DB2 cannot allow vendor code to change its signal mask, start new threads, or corrupt an agent's stack, so all vendor code runs in this process, outside the DB2 engine.	Linux and UNIX only
db2wdog	The DB2 watchdog, the parent process of db2sysc. It handles these internal tasks: (1) cleaning up IPC resources when db2sysc terminates abnormally; (2) spawning db2fmp and the health-monitor process, cleaning up the system resources held by a db2fmp that terminates abnormally, and restarting the health monitor when needed.	Linux and UNIX only

Saturday, December 12, 2009

Clearing the Linux memory cache

There are many ways to check memory usage on Linux, such as vmstat, free, and nmon. This post uses the free command as the example.

[root@db01 ~]# free
             total       used       free     shared    buffers     cached
Mem:       5144296    3008448    2135848          0       8600    2819464
-/+ buffers/cache:     180384    4963912
Swap:      6160376      18484    6141892

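Here, buffers is the buffer area holding data that is about to be written to disk, while cache is the area holding data that has been read from disk and kept in memory. Linux tends to keep data cached in memory, and it does not clear the cached data right away even after the application exits.

The following, from the kernel documentation for /proc/sys/vm/drop_caches, explains how to clear the cache: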

Writing to this will cause the kernel to drop clean caches, dentries and inodes from memory, causing that memory to become free.

To free pagecache:

echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes:

echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries and inodes:

echo 3 > /proc/sys/vm/drop_caches

As this is a non-destructive operation, and dirty objects are not freeable, the user should run "sync" first in order to make sure all cached objects are freed.
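
The sync-then-write sequence can be wrapped in a small helper. This is only a sketch in Python: drop_caches and its path parameter are hypothetical names introduced here, and writing to the real /proc file requires root.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_caches(mode: int, path: str = DROP_CACHES) -> None:
    """Flush dirty pages, then ask the kernel to drop clean caches.

    mode: 1 = pagecache, 2 = dentries and inodes, 3 = both.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    os.sync()  # same effect as running `sync` first
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Calling drop_caches(3) as root corresponds to running sync followed by echo 3 > /proc/sys/vm/drop_caches. The path parameter exists only so the helper can be tried against an ordinary file.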

Linux memory usage information

Reposted from http://ssorc.tw/rewrite.php/read-599.html

free

             total       used       free     shared    buffers     cached
Mem:       1023916     975816      48100          0      26376     465844
-/+ buffers/cache:     483596     540320
Swap:      2096440     105564    1990876

How the values are calculated

             total       used       free     shared    buffers     cached
Mem:             a          b          c          d          e          f
-/+ buffers/cache:          g          h
Swap:            i          j          k

a = total memory
b = memory in use, including the memory given to buffers and cache
c = memory left over
e = memory given to buffers (reclaimable)
f = memory given to cache (reclaimable)
g = memory in use after subtracting buffers and cache, that is, what applications really consume
h = memory that is really available
a = b + c
a = g + h
g = b - e - f
h = c + e + f

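From these formulas: the Mem: free column is memory that is truly unused (buffers and cache are already excluded); the Mem: used column is memory used by programs plus buffers and cache; and the used column of -/+ buffers/cache is the memory that programs really use (Mem: used - Mem: buffers - Mem: cached).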
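
The formulas for g and h can be checked against the sample output. The sketch below is in Python and assumes the older free layout shown above (newer versions of free print different columns); buffers_cache_row is a hypothetical helper name.

```python
from typing import Tuple


def buffers_cache_row(mem_line: str) -> Tuple[int, int]:
    """Return (g, h) computed from the 'Mem:' line of old-style free output."""
    fields = mem_line.split()
    # fields: ['Mem:', total, used, free, shared, buffers, cached]
    total, used, free, shared, buffers, cached = (int(x) for x in fields[1:7])
    g = used - buffers - cached  # memory really used by applications
    h = free + buffers + cached  # memory really available
    return g, h


if __name__ == "__main__":
    line = "Mem:       1023916     975816      48100          0      26376     465844"
    print(buffers_cache_row(line))
```

For the sample line this reproduces the -/+ buffers/cache row, 483596 and 540320.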

The difference between buffer and cache:
A buffer is something that has yet to be "written" to disk.
A cache is something that has been "read" from the disk and stored for later use.
Quote: http://www.ubuntu.org.tw/modules/newbb/viewtopic.php?topic_id=6132

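Normally the Linux kernel uses as much free RAM as it can for cache/buffer, to get the most performance out of the system. When running applications need more RAM, the kernel releases the space held by cache/buffer and hands it over to them.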

Quote: http://web.mit.edu/rhel-doc/4/RH-DOCS/rhel-isa-zh_tw-4/s1-resource-rhlspec.html

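Mem: shows the utilization of physical memory;
Swap: shows the utilization of the system's swap space;
-/+ buffers/cache: shows the amount of physical memory currently devoted to system buffers.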

Quote: http://wiki.gentoo.tw/mediawiki/index.php/FAQ_LINUXMEMORY

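Overview of memory management
After a system has been up for a while, traditional Unix tools such as top often report a pitifully small amount of free memory. On the system where I am writing this, even though I have 512 MB of memory in total, about three hours after boot I have only 60 MB free. Where did all that memory go?

The biggest consumer is the disk cache, which currently uses more than 290 MB (the "cached" item in top). Cached memory is essentially free: when a new or running program needs memory, it is reclaimed quickly.

Why does Linux use so much memory for the disk cache? Mainly because RAM that is not used is simply wasted. Data held in RAM can be accessed roughly 1000 times faster than data read from the hard disk. If the data is not found in the cache, it has to be read from disk anyway, so little or no time is lost by checking the cache first.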

Quote: http://7cc.tw/bencandy.php?fid=18&id=3280

buffers and cache in the free output (both occupy memory):

buffer: memory used as the buffer cache, the read/write buffer for block devices

cache: memory used as the page cache, the file system's cache

A large cache value means many files are cached. If the frequently accessed files all fit in the cache, disk read I/O (bi) stays very small.

Explanation of /proc/meminfo: http://www.redhat.com/advice/tips/meminfo.html
ref: http://www.wujianrong.com/archives/linux/2.html
ref: http://linux.chinaunix.net/bbs/viewthread.php?tid=887896
Detailed explanation of memory usage, and what to do when the system runs out of memory: http://rimuhosting.com/howto/memory.jsp
Out of Memory
more /var/log/messages

kernel: Mem-info:
kernel: Zone:DMA freepages: 2916 min:     0 low:     0 high:     0
kernel: Zone:Normal freepages:   758 min:   766 low: 4031 high: 5791
kernel: Zone:HighMem freepages:   125 min:   253 low:   506 high:   759
kernel: Free pages:        3799 (   125 HighMem)
kernel: ( Active: 237940/1195, inactive_laundry: 1, inactive_clean: 0, free: 3799 )
kernel:   aa:0 ac:0 id:0 il:0 ic:0 fr:2916
kernel:   aa:209592 ac:2087 id:1190 il:1 ic:0 fr:758
kernel:   aa:26033 ac:232 id:0 il:0 ic:0 fr:125
kernel: 0*4kB 2*8kB 4*16kB 2*32kB 4*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11664kB)
kernel: 8*4kB 1*8kB 175*16kB 6*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3032kB)
kernel: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 500kB)
kernel: Swap cache: add 5156731, delete 5156620, find 19439534/21006374, race 0+19
kernel: 4115 pages of slabcache
kernel: 1566 pages of kernel stacks
kernel: 13 lowmem pagetables, 5378 highmem pagetables
kernel: 32 bounce buffer pages, 32 are on the emergency list
kernel: Free swap:            0kB
kernel: 261760 pages of RAM
kernel: 32384 pages of HIGHMEM
kernel: 5781 reserved pages
kernel: 14591 pages shared
kernel: 116 pages swap cached
kernel: Out of Memory: Killed process 25580 (java).

Quote: http://rimuhosting.com/howto/memory.jsp Resolving: High Java Memory Usage

To determine how much memory you can spare for Java, try this: stop your Java process; run free -m; subtract the 'used' value from the "-/+ cache" row from the total memory allocated to your server and then subtract another 'just in case' margin of about 10% of your total server memory. The number you come up with is a rough indicator of the largest -Xmx setting you can use on your server.

In other words, the -Xmx value in MB is roughly a - g - a x 10%, if I am reading this correctly.
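
As a worked example of that rule of thumb, here is a small Python sketch. max_heap_mb is a hypothetical helper name; the figures come from the free output earlier in this post, which was not necessarily taken with Java stopped, so the result is illustrative only.

```python
def max_heap_mb(total_mb: float, app_used_mb: float, margin: float = 0.10) -> float:
    """Estimate the largest -Xmx value in MB: a - g - a * margin."""
    return total_mb - app_used_mb - total_mb * margin


if __name__ == "__main__":
    # a and g from the free output above, converted from KB to MB.
    a = 1023916 / 1024
    g = 483596 / 1024
    print(f"-Xmx{int(max_heap_mb(a, g))}m")
```

The 10% margin is the "just in case" allowance from the quoted guide and can be changed through the margin parameter.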

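DB2 Database Performance Tuning: a worked example

I am running a performance test on an application whose transactions involve heavy database SELECT/INSERT/UPDATE activity; a single transaction executes roughly three to four thousand dynamic SQL statements. The results were very poor: with 50 concurrent users, the transactions took 32 minutes to finish, and nmon showed the CPU fully loaded for the whole 30-odd minutes.

That was not acceptable, so I started tuning DB2 and the application. This was my first chance to do this kind of test, so I learned as I went. Fortunately the DB2 Information Center is quite detailed and let me get there step by step. The rest of this post walks through the tuning process.

First, to find the cause of the poor database performance, we need to turn on DB2's monitor switches to collect statistics. The DB2 GET MONITOR SWITCHES command shows the current state of the switches.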

[db2inst1@db01 source_db]$ db2 get monitor switches

Monitor Recording Switches

Switch list for db partition number 0
Buffer Pool Activity Information (BUFFERPOOL) = OFF
Lock Information                        (LOCK) = OFF
Sorting Information                     (SORT) = OFF
SQL Statement Information          (STATEMENT) = OFF
Table Activity Information             (TABLE) = OFF
Take Timestamp Information         (TIMESTAMP) = ON 12/12/2009 15:13:18.256534
Unit of Work Information                 (UOW) = OFF

As shown above, every switch is off except TIMESTAMP, which is on by default. To turn a switch on, run db2 UPDATE MONITOR SWITCHES USING XXX ON. For example, to enable the LOCK switch, run db2 UPDATE MONITOR SWITCHES USING LOCK ON, with this result:

[db2inst1@db01 source_db]$ db2 update monitor switches using lock on
DB20000I The UPDATE MONITOR SWITCHES command completed successfully.

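Back to the performance investigation. Because DB2's CPU utilization was already at 100%, excessive locking seemed unlikely (lock waits would leave the CPU idle), so I set that direction aside for now. And since DB2 v9, many database configuration parameters, such as heap sizes and buffer pools, can be adjusted dynamically, so I did not pursue database configuration either. With those factors ruled out, I decided to look directly at how the SQL statements perform.

To examine SQL performance, first turn on the STATEMENT monitor switch: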

[db2inst1@db01 source_db]$ db2 update monitor switches using statement on
DB20000I The UPDATE MONITOR SWITCHES command completed successfully.

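Next, simulate a single user running the transaction under test. When it finishes, use GET SNAPSHOT FOR DYNAMIC SQL ON database_name to collect statistics for every dynamic SQL statement executed by one run of the transaction, and redirect the output to a file: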

[db2inst1@db01 source_db]$ db2 get snapshot for dynamic sql on test_db > executed_sql

Each dynamic SQL entry recorded in that file has the format below. The fields used here are Total execution time, the total CPU times, and Statement text.

Number of executions               = 1
Number of compilations             = 1
Worst preparation time (ms)        = 7
Best preparation time (ms)         = 7
Internal rows deleted              = 0
Internal rows inserted             = 0
Rows read                          = 1590
Internal rows updated              = 0
Rows written                       = 0
Statement sorts                    = 0
Statement sort overflows           = 0
Total sort time                    = 0
Buffer pool data logical reads     = Not Collected
Buffer pool data physical reads    = Not Collected
Buffer pool temporary data logical reads   = Not Collected
Buffer pool temporary data physical reads = Not Collected
Buffer pool index logical reads    = Not Collected
Buffer pool index physical reads   = Not Collected
Buffer pool temporary index logical reads = Not Collected
Buffer pool temporary index physical reads = Not Collected
Buffer pool xda logical reads      = Not Collected
Buffer pool xda physical reads     = Not Collected
Buffer pool temporary xda logical reads    = Not Collected
Buffer pool temporary xda physical reads   = Not Collected
Total execution time (sec.microsec)= 25.435582
Total user cpu time (sec.microsec) = 25.377444
Total system cpu time (sec.microsec)= 0.000000
Total statistic fabrication time (milliseconds) = 0
Total synchronous runstats time (milliseconds) = 0
Statement text                     = SELECT CURR_NO,   DEPT_NO,        GL_ACCT_NO,     SUM(CASH_DR_NO_OF_VOH) CSHDRVOH,        SUM(TRANSFER_DR_NO_OF_VOH) TRNDRVOH,    SUM(CASH_CR_NO_OF_VOH) CSHCRVOH,        SUM(TRANSFER_CR_NO_OF_VOH) TRNCRVOH   FROM ACCT01 T1 WHERE VOH_SEQ_NO = (SELECT MIN(VOH_SEQ_NO) FROM ACCT01                   WHERE T1.VOH_NO = VOH_NO                        AND T1.BRANCH_NO = BRANCH_NO)   AND VOH_DATE = '20090518'       AND VOH_DATE = ORG_VOH_DATE     AND (VOH_SPLIT_NO = '00' OR VOH_SPLIT_NO = '99')        AND BRANCH_NO = '9964' GROUP BY CURR_NO, DEPT_NO, GL_ACCT_NO

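The number of recorded statements is far too large to spot the biggest CPU consumers by eye, so I wrote a small program to filter the records. The program below prints every statement whose execution time exceeds 0.5 seconds; to use a different threshold, change the value in the if statement.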

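#!/usr/bin/perl
# Print every dynamic SQL snapshot record whose total execution time exceeds a threshold.
use strict;
use warnings;

undef $/;                       # slurp mode: read the whole input at once
my $data = <>;
my $long_execution_sql_count = 0;

# $1 = the execution-time line plus the five lines after it (through "Statement text"),
# $2 = the execution time in seconds.
while ($data =~ /(Total execution time \(sec\.microsec\)= (.+?)\n.+?\n.+?\n.+?\n.+?\n.+?\n)/g)
{
        if ($2 > 0.5)          # threshold in seconds; 0.5 selects statements slower than 0.5 s
        {
                print $1, "\n";
                $long_execution_sql_count++;
        }
}

print "Total Long Execution SQLs: ", $long_execution_sql_count, "\n";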
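
The same filter can be sketched in Python, sorted so that the slowest statements come first. This is an illustrative alternative, not the script used in the test: slow_statements and RECORD are hypothetical names, and the pattern assumes the snapshot layout shown above, with each statement text on a single line.

```python
import re
from typing import List, Tuple

# One record: the execution-time line, any lines in between, then the statement text.
RECORD = re.compile(
    r"Total execution time \(sec\.microsec\)\s*=\s*(?P<secs>[\d.]+)[ \t]*\n"
    r"(?:.*\n)*?"
    r"\s*Statement text\s*=\s*(?P<sql>.*)"
)


def slow_statements(snapshot: str, threshold: float = 0.5) -> List[Tuple[float, str]]:
    """Return (seconds, sql) pairs above the threshold, slowest first."""
    hits = [
        (float(m.group("secs")), m.group("sql").strip())
        for m in RECORD.finditer(snapshot)
    ]
    return sorted((h for h in hits if h[0] > threshold), reverse=True)
```

A typical call would read the snapshot file produced earlier, for example slow_statements(open("executed_sql").read()), and print the first few entries.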

Filtering the dynamic SQL snapshot with this program shows that the most time-consuming statement in the application is the following:

Total execution time (sec.microsec)= 27.081611
Total user cpu time (sec.microsec) = 27.067162
Total system cpu time (sec.microsec)= 0.000000
Total statistic fabrication time (milliseconds) = 0
Total synchronous runstats time (milliseconds) = 0
Statement text                     = SELECT CURR_NO, DEPT_NO, GL_ACCT_NO, SUM(CASH_DR_NO_OF_VOH)CSHDRVOH,
          SUM(TRANSFER_DR_NO_OF_VOH)TRNDRVOH, SUM(CASH_CR_NO_OF_VOH)
          CSHCRVOH, SUM(TRANSFER_CR_NO_OF_VOH)TRNCRVOH
FROM ACCT01 T1
WHERE VOH_SEQ_NO =
     (SELECT MIN(VOH_SEQ_NO)
     FROM ACCT01
     WHERE T1.VOH_NO =VOH_NO AND T1.BRANCH_NO =BRANCH_NO)AND VOH_DATE
          ='20090518' AND ORG_VOH_DATE= '20090518' AND (VOH_SPLIT_NO = '00' OR VOH_SPLIT_NO = '99')
          AND BRANCH_NO ='9964'
GROUP BY CURR_NO, DEPT_NO, GL_ACCT_NO

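Yes, a single execution of this statement takes 27 seconds, so it is clearly the one to tune. Next, use the db2expln command to see how the DB2 optimizer actually executes it. The example below writes the explain result to the file sql_explain. First save the statement to be explained in the file input_sql, then run: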

[db2inst1@db01 ~]$ db2expln -database test_db -stmtfile input_sql -terminator ';' -o sql_explain -graph

DB2 Universal Database Version 9.5, 5622-044 (c) Copyright IBM Corp. 1991, 2007
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool

Output is available in "sql_explain".

Next, look at the sql_explain output file. The excerpt below is the Optimizer Plan tree. It shows that the statement already uses index scans, so a missing index is probably not the problem.

                   Rows
                 Operator
                   (ID)
                   Cost

                 1.03085
                 RETURN
                  ( 1)
                 4517.06
                    |
                 1.03085
                  GRPBY
                  ( 2)
                 4517.06
                    |
                 1.03085
                 TBSCAN
                  ( 3)
                 4517.06
                    |
                 1.03085
                  SORT
                  ( 4)
                 4517.06
                    |
                 3.43618
                 NLJOIN
                  ( 5)
                 4517.06
                /       \
          6.87235          0.5
           FETCH          FILTER
           ( 6)            ( 8)
          556.13          3547.89
         /      \            |
   171.809   5.10504e+06     1
   IXSCAN    Table:        GRPBY
    ( 7)     DB2INST1      ( 9)
   93.3323   ACCT01       3547.89
      |                      |
 5.10504e+06              0.531085
 Index:                    IXSCAN
 DB2ADMIN                   (10)
 ACCT01                   3547.89
                             |
                         5.10504e+06
                         Index:
                         DB2ADMIN
                         ACCT01

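The tree shows that the largest cost comes from the index scan, probably because of the sheer amount of data. Looking at the slow statement again, it consists of a main SELECT whose WHERE clause compares against a sub-select, and both read the same table. The predicates inside the sub-select look odd: T1.VOH_NO is the same column as VOH_NO, and T1.BRANCH_NO is the same column as BRANCH_NO, only taken from the outer reference to the table. ACCT01 holds a lot of records; with 5 million rows, a statement written this way amounts to matching 5 million rows against 5 million rows, so it is no wonder that it consumes so much CPU time.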

SELECT CURR_NO, DEPT_NO, GL_ACCT_NO, SUM(CASH_DR_NO_OF_VOH)CSHDRVOH,
          SUM(TRANSFER_DR_NO_OF_VOH)TRNDRVOH, SUM(CASH_CR_NO_OF_VOH)
          CSHCRVOH, SUM(TRANSFER_CR_NO_OF_VOH)TRNCRVOH
FROM ACCT01 T1
WHERE VOH_SEQ_NO =
     (SELECT MIN(VOH_SEQ_NO)
     FROM ACCT01
     WHERE T1.VOH_NO =VOH_NO AND T1.BRANCH_NO =BRANCH_NO)AND VOH_DATE
          ='20090518' AND ORG_VOH_DATE= '20090518' AND (VOH_SPLIT_NO = '00' OR VOH_SPLIT_NO = '99')
          AND BRANCH_NO ='9964'
GROUP BY CURR_NO, DEPT_NO, GL_ACCT_NO

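After checking with the developer, the original sub-select does serve a purpose and has to stay. Fortunately, he thought of adding one more condition to the sub-select to narrow the range of data. The modified sub-select is: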

SELECT MIN(VOH_SEQ_NO) FROM ACCT01
WHERE T1.VOH_NO = VOH_NO AND VOH_DATE = '20090518'
AND T1.BRANCH_NO = BRANCH_NO

Running the statement again showed a dramatic drop in CPU time:

Total execution time (sec.microsec)= 0.051433
Total user cpu time (sec.microsec) = 0.037532
Total system cpu time (sec.microsec)= 0.000000
Total statistic fabrication time (milliseconds) = 0
Total synchronous runstats time (milliseconds) = 0
Statement text                     = SELECT CURR_NO, DEPT_NO, GL_ACCT_NO, SUM(CASH_DR_NO_OF_VOH)CSHDRVOH,
          SUM(TRANSFER_DR_NO_OF_VOH)TRNDRVOH, SUM(CASH_CR_NO_OF_VOH)
          CSHCRVOH, SUM(TRANSFER_CR_NO_OF_VOH)TRNCRVOH
FROM ACCT01 T1
WHERE VOH_SEQ_NO =
     (SELECT MIN(VOH_SEQ_NO) FROM ACCT01
            WHERE T1.VOH_NO = VOH_NO AND VOH_DATE = '20090518'
             AND T1.BRANCH_NO = BRANCH_NO)AND VOH_DATE
          ='20090518' AND ORG_VOH_DATE= '20090518' AND (VOH_SPLIT_NO = '00' OR VOH_SPLIT_NO = '99')
          AND BRANCH_NO ='9964'
GROUP BY CURR_NO, DEPT_NO, GL_ACCT_NO

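Going back to the 50-user test, the run time did improve, from the original 32 minutes to 27 minutes, but that is still not satisfactory.

I took another dynamic SQL snapshot and found that most statements now finish within 3 seconds, yet use only about 0.07 seconds of CPU time. The statements spend nearly all of their elapsed time waiting rather than computing, which is probably why the transaction still takes so long.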

Total execution time (sec.microsec)= 2.891382
Total user cpu time (sec.microsec) = 0.076323
Total system cpu time (sec.microsec)= 0.000000
Total statistic fabrication time (milliseconds) = 0
Total synchronous runstats time (milliseconds) = 0
Statement text = UPDATE ACCT04 SET CASH_DR_AMT = 0, TRANSFER_DR_AMT = 0, CASH_CR_AMT = 0, TRANSFER_CR_AMT = 0, CASH_DR_NO_OF_VOH = 0, TRANSFER_DR_NO_OF_VOH = 0, CASH_CR_NO_OF_VOH = 0, TRANSFER_CR_NO_OF_VOH = 0, DR_ACCT_CODE_CURR_BAL = DR_ACCT_CODE_PREV_BAL, CR_ACCT_CODE_CURR_BAL = CR_ACCT_CODE_PREV_BAL WHERE TX_DATE = '20090518' AND BRANCH_NO = '9963'

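The ACCT04 table occupies a full 5 GB, and the transaction runs this statement many times. If the time goes into finding and updating rows in this table, the transaction slows down considerably. To avoid this, I decided to put the table in its own table space and assign that table space a dedicated buffer pool.

The command to create the dedicated buffer pool:
CREATE BUFFERPOOL ACCT04 IMMEDIATE SIZE 100000 PAGESIZE 4 K ;
The command to create the dedicated table space, using the buffer pool created above:
CREATE LARGE TABLESPACE ACCT04_TABSPACE PAGESIZE 4 K MANAGED BY AUTOMATIC STORAGE EXTENTSIZE 64 OVERHEAD 10.5 PREFETCHSIZE 64 TRANSFERRATE 0.14 BUFFERPOOL ACCT04 ;
Finally, re-create the ACCT04 table in the new table space. Rerunning the 50-user case, the run time dropped from the original 32 minutes to 13 minutes, and DB2's CPU utilization no longer stays at 100% for long (see the figure below). Tuning is done for now.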

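Wednesday, December 9, 2009

DB2 List Applications -- listing all applications currently connected to a database

The LIST APPLICATIONS command lists information about all applications currently connected to a database. If no database is specified, connections to all databases are listed.

The following is the output of DB2 LIST APPLICATIONS FOR SAMPLE SHOW DETAIL: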

DB2INST1 db2jcc_application 3638 9.181.158.212.38221.09120917185 03285 1 0 273 UOW Waiting 12/10/2009 07:47:53.099138 xxxDB /mnt/db_data/db2inst1/NODE0000/SQL00001/

The meaning of each field:

DB2INST1: Connection Authentication ID
db2jcc_application: Application Name
3638: Application Handle
9.181.158.212.38221.09120917185: Application ID
03285: Sequence Number
1: Number of agents
0: Database partition number
273: Coordinator PID
UOW Waiting: Status
12/10/2009 07:47:53.099138: Status change time
xxxDB: DB Name
/mnt/db_data/db2inst1/NODE0000/SQL00001/: DB Path

DB2 Update Monitor Switches

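Whether you use the event monitor or the snapshot monitor, you must turn on the relevant monitor switches before the corresponding information is collected. Use get monitor switches to see the current state of the switches.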

[db2inst1@db01 ~]$ db2 get monitor switches

Monitor Recording Switches

Switch list for db partition number 0
Buffer Pool Activity Information (BUFFERPOOL) = OFF
Lock Information                        (LOCK) = OFF
Sorting Information                     (SORT) = OFF
SQL Statement Information          (STATEMENT) = OFF
Table Activity Information             (TABLE) = OFF
Take Timestamp Information         (TIMESTAMP) = ON 12/08/2009 10:20:24.309342
Unit of Work Information                 (UOW) = OFF

To turn a monitor switch on, use the command update monitor switches using xxx on. For example, the following command turns on the bufferpool switch:

[db2inst1@db01 ~]$ db2 update monitor switches using bufferpool on

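Note that a monitor switch change applies only to the current session attached to the instance. After you disconnect and connect again, the monitor switches are back at their default values.

Resetting monitor switch values

The values collected by a monitor switch accumulate from the moment it is turned on. To reset the counters, use the RESET MONITOR command, for example db2 reset monitor all.

Output of DB2 GET SNAPSHOT FOR APPLICATION APPLID application_id

With all monitor switches turned on, the command db2 get snapshot for application applid application_id returns the output below. Selected fields are annotated with their meaning after "=>".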

Application handle = 3302
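Application status = UOW Waiting => Application status shows the application's current state. UOW Waiting means the unit of work is waiting while the program identified by this application ID runs its own logic. UOWEXEC (UOW Executing) means the database manager is executing a request on the application's behalf. The DB2 Information Center lists the other possible values.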
Status change time = 12/09/2009 21:59:47.610770
Application code page = 1208
Application country/region code = 0
DUOW correlation token = xx.xx.xx.xx.xxxx.09120913583
Application name = db2jcc_application
Application ID = xx.xx.xx.xx.xxxx.09120913583
Sequence number = 02139
TP Monitor client user ID =
TP Monitor client workstation name = was03.csc.ibm.com
TP Monitor client application name =
TP Monitor client accounting string =

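Connection request start timestamp = 12/09/2009 21:58:28.648173 => The time the application issued its request to connect to the database
Connect request completion timestamp = 12/09/2009 21:58:28.658785 => The time the database accepted the connection request
Application idle time = 1 second => The time elapsed since the application last issued a request; this value can be used to force off connections that have been idle too long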
CONNECT Authorization ID = DB2INST1
Client login ID = DB2INST1
Configuration NNAME of client = was03.csc.ibm.com
Client database manager product ID = JCC03530
Process ID of client application = 0
Platform of client application = Unknown via DRDA
Communication protocol of client = TCP/IP

Inbound communication address = xx.xx.xx.xx xxxxx

Database name = xxxxDB
Database path = /mnt/db_data/db2inst1/NODE0000/SQL00001/
Client database alias = xxxxDB
Input database alias =
Last reset timestamp =
Snapshot timestamp = 12/09/2009 21:59:48.269429
Authorization level granted =
User authority:
DBADM authority
CREATETAB authority
BINDADD authority
CONNECT authority
CREATE_NOT_FENC authority
LOAD authority
IMPLICIT_SCHEMA authority
CREATE_EXT_RT authority
QUIESCE_CONN authority
Group authority:
SYSADM authority
CREATETAB authority
BINDADD authority
CONNECT authority
IMPLICIT_SCHEMA authority
Coordinating database partition number = 0
Current database partition number = 0
Coordinator agent process or thread ID = 255
Current Workload ID = 1
Agents stolen = 0
Agents waiting on locks = 0
Maximum associated agents = 1
Priority at which application agents work = 0
Priority type = Dynamic

Lock timeout (seconds) = -1
Locks held by application = 0
Lock waits since connect = 2
Time application waited on locks (ms) = 2
Deadlocks detected = 0
Lock escalations = 0
Exclusive lock escalations = 0
Number of Lock Timeouts since connected = 0
Total time UOW waited on locks (ms) = 0

Total sorts = 920
Total sort time (ms) = 74
Total sort overflows = 2

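Buffer pool data logical reads = 391625 => The total number of data page reads: reads satisfied directly from the buffer pool plus reads where the data was not in the buffer pool and had to be brought in by I/O before the CPU could use it
Buffer pool data physical reads = 128 => The number of reads where the data was not in the buffer pool and first had to be read into it by I/O. 1 - (Buffer pool data physical reads / Buffer pool data logical reads) is the buffer pool hit ratio. The higher the hit ratio, the fewer reads need real I/O and the less time is spent reading data
Buffer pool temporary data logical reads = 8614 => Same as the two values above, but for buffer pool usage by temporary table spaces
Buffer pool temporary data physical reads = 0 => Same as the two values above, but for buffer pool usage by temporary table spaces
Buffer pool data writes = 0 => The number of times buffer pool data was physically written to disk
Buffer pool index logical reads = 1365796 => The total number of index page reads: reads satisfied directly from the buffer pool plus reads where the index page was not in the buffer pool and had to be brought in by I/O before the CPU could use it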
Buffer pool index physical reads = 934
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool index writes = 0
Buffer pool xda logical reads = 0
Buffer pool xda physical reads = 0
Buffer pool temporary xda logical reads = 0
Buffer pool temporary xda physical reads = 0
Buffer pool xda writes = 0
Total buffer pool read time (milliseconds) = 32
Total buffer pool write time (milliseconds)= 0
Time waited for prefetch (ms) = 169
Unread prefetch pages = 0
Direct reads = 56
Direct writes = 0
Direct read requests = 8
Direct write requests = 0
Direct reads elapsed time (ms) = 1
Direct write elapsed time (ms) = 0

Number of SQL requests since last commit = 0
Commit statements = 2138
Rollback statements = 0
Dynamic SQL statements attempted = 3426
Static SQL statements attempted = 2138
Failed statement operations = 0
Select SQL statements executed = 631
Xquery statements executed = 0
Update/Insert/Delete statements executed = 1507
DDL statements executed = 0
Inactive stmt history memory usage (bytes) = 0
Internal automatic rebinds = 0
Internal rows deleted = 0
Internal rows inserted = 0
Internal rows updated = 0
Internal commits = 1
Internal rollbacks = 0
Internal rollbacks due to deadlock = 0
Binds/precompiles attempted = 0
Rows deleted = 0
Rows inserted = 0
Rows updated = 3816
Rows selected = 2196
Rows read = 6715606
Rows written = 11779

UOW log space used (Bytes) = 0
Previous UOW completion timestamp = 12/09/2009 21:59:47.307801
Elapsed time of last completed uow (sec.ms)= 0.001300
UOW start timestamp = 12/09/2009 21:59:47.609463
UOW stop timestamp = 12/09/2009 21:59:47.610763
UOW completion status = Committed - Commit Statement

Open remote cursors = 0
Open remote cursors with blocking = 0
Rejected Block Remote Cursor requests = 0
Accepted Block Remote Cursor requests = 631
Open local cursors = 0
Open local cursors with blocking = 0
Total User CPU Time used by agent (s) = 38.172826
Total System CPU Time used by agent (s) = 0.000000
Host execution elapsed time = 40.473861

Package cache lookups = 2138
Package cache inserts = 2097
Application section lookups = 3426
Application section inserts = 2129
Catalog cache lookups = 4257
Catalog cache inserts = 11
Catalog cache overflows = 0
Catalog cache high water mark = 0

Workspace Information

Shared high water mark = 0
Total shared overflows = 0
Total shared section inserts = 0
Total shared section lookups = 0
Private high water mark = 0
Total private overflows = 0
Total private section inserts = 0
Total private section lookups = 0

Most recent operation = Static Commit
Most recent operation start timestamp = 12/09/2009 21:59:47.610752
Most recent operation stop timestamp = 12/09/2009 21:59:47.610763
Agents associated with the application = 1
Number of hash joins = 9
Number of hash loops = 0
Number of hash join overflows = 0
Number of small hash join overflows = 0
Number of OLAP functions = 0
Number of OLAP function overflows = 0

Statement type = Static SQL Statement
Statement = Static Commit
Section number = 0
Application creator =
Package name =
Consistency Token =
Cursor name =
Statement database partition number = 0
Statement start timestamp = 12/09/2009 21:59:47.610752
Statement stop timestamp = 12/09/2009 21:59:47.610763
Elapsed time of last completed stmt(sec.ms)= 0.000011
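Total Statement user CPU time = 0.000009 => The user CPU time the database manager agent spent processing the application's request; if the application calls a stored procedure, the procedure's execution time is included
Total Statement system CPU time = 0.000000 => The CPU time spent in system calls that the database manager agent issued to process the application's request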
SQL compiler cost estimate in timerons = 0
SQL compiler cardinality estimate = 0
Degree of parallelism requested = 1
Number of agents working on statement = 1
Number of subagents created for statement = 1
Statement sorts = 0
Total sort time = 0
Sort overflows = 0
Rows read = 0
Rows written = 0
Rows deleted = 0
Rows updated = 0
Rows inserted = 0
Rows fetched = 0
Buffer pool data logical reads = 0
Buffer pool data physical reads = 0
Buffer pool temporary data logical reads = 0
Buffer pool temporary data physical reads = 0
Buffer pool index logical reads = 0
Buffer pool index physical reads = 0
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool xda logical reads = 0
Buffer pool xda physical reads = 0
Buffer pool temporary xda logical reads = 0
Buffer pool temporary xda physical reads = 0
Blocking cursor = NO

Memory usage for application:

Memory Pool Type = Application Heap
Current size (bytes) = 65536
High water mark (bytes) = 65536
Configured size (bytes) = 1048576

Agent process/thread ID = 255
Agent Lock timeout (seconds) = -1
Memory usage for agent:

Memory Pool Type = Other Memory
Current size (bytes) = 589824
High water mark (bytes) = 786432
Configured size (bytes) = 5267759104
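
The hit ratio formula given in the buffer pool annotations above can be applied to the counters in this snapshot. A minimal sketch in Python follows; hit_ratio is a hypothetical helper name, and treating zero logical reads as a ratio of 0.0 is an arbitrary choice made here.

```python
def hit_ratio(logical_reads: int, physical_reads: int) -> float:
    """Buffer pool hit ratio: 1 - physical / logical (0.0 if nothing was read)."""
    if logical_reads == 0:
        return 0.0
    return 1.0 - physical_reads / logical_reads


if __name__ == "__main__":
    # Counters copied from the snapshot above.
    print(f"data : {hit_ratio(391625, 128):.4%}")
    print(f"index: {hit_ratio(1365796, 934):.4%}")
```

Both ratios come out above 99.9% for this application, which suggests that physical reads were not its bottleneck at the time of the snapshot.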

Using the DB2 Get Snapshot command

Reference: http://publib.boulder.ibm.com/infocenter/db2luw/v9/topic/com.ibm.db2.udb.admin.doc/doc/r0001945.htm?resultof=%22%67%65%74%22%20%22%73%6e%61%70%73%68%6f%74%22%20

A snapshot captures the state of DB2 at a point in time. Once the corresponding monitor switches are on, use the GET SNAPSHOT command to retrieve the data.

Authorization:

Only users with one of the following authorities can run GET SNAPSHOT:

SYSADM
SYSCTRL
SYSMAINT
SYSMON

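Required connection:

Snapshots work at the instance level, so you must at least be attached to an instance when you run this command.

Command syntax:

As the figure below shows, the basic form of the GET SNAPSHOT command is

GET SNAPSHOT FOR XXX

where XXX is the information to capture. If XXX is one of the following values, you do not need to append on database_alias: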

DBM
ALL DATABASES
ALL APPLICATIONS
ALL BUFFERPOOLS
APPLICATION APPLID application_id
FCM FOR ALL DBPARTITIONNUMS
LOCKS FOR APPLICATION APPLID application_id
ALL REMOTE DATABASES
ALL REMOTE APPLICATIONS

If XXX is one of the following values, the information is database-level, so you must append on database_alias to say which database to monitor:

DATABASE
APPLICATIONS
TABLES
LOCKS
BUFFERPOOLS
REMOTE DATABASES
REMOTE APPLICATIONS
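
The rule about when on database_alias is needed can be expressed as a small command builder. This Python sketch covers only some of the values listed above (the APPLID, FCM and REMOTE forms are left out); snapshot_command and the two set names are hypothetical, and the function only builds the command text, it does not run DB2.

```python
from typing import Optional

# Instance-level scopes: no database alias is appended.
INSTANCE_SCOPES = {"DBM", "ALL DATABASES", "ALL APPLICATIONS", "ALL BUFFERPOOLS"}
# Database-level scopes: "ON database_alias" is required.
DATABASE_SCOPES = {"DATABASE", "APPLICATIONS", "TABLES", "LOCKS", "BUFFERPOOLS"}


def snapshot_command(scope: str, alias: Optional[str] = None) -> str:
    """Build the text of a GET SNAPSHOT command for the given scope."""
    scope = scope.upper()
    if scope in INSTANCE_SCOPES:
        return f"GET SNAPSHOT FOR {scope}"
    if scope in DATABASE_SCOPES:
        if not alias:
            raise ValueError(f"{scope} needs a database alias")
        return f"GET SNAPSHOT FOR {scope} ON {alias}"
    raise ValueError(f"scope not covered by this sketch: {scope}")


if __name__ == "__main__":
    print(snapshot_command("dbm"))
    print(snapshot_command("bufferpools", "test_db"))
```

Raising an error for a missing alias mirrors the rule in the text: database-level information cannot be requested without naming the database.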

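The monitoring options above are described briefly below.

DATABASE MANAGER: returns statistics for the active database manager instance you are currently attached to

ALL DATABASES: returns basic statistics for the databases that are active on the current database partition

ALL APPLICATIONS: returns statistics for the active applications currently connected to a database

ALL BUFFERPOOLS: returns information about all buffer pools of all currently active databases

APPLICATION APPLID application_id: returns statistics for the application identified by application_id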

…….


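Tuesday, December 8, 2009

WAS Connection Pool Settings

The fields in the figure below mean the following.

Connection timeout: how long a request waits for a connection before the connection pool manager throws a ConnectionWaitTimeout exception. This usually happens when the number of connections has reached Maximum connections: no new connection can be created, so the request must wait for an existing connection to be released.

Maximum connections: the largest number of connections that can exist in the connection pool at the same time.

Minimum connections: the smallest number of connections to maintain. When the current number of connections is above this value, the pool manager closes connections that have been idle too long, and stops closing once the count is down to this value. Conversely, if the total is below this value, the pool manager does not create new connections just to reach it.

Reap time: how often the connection pool manager checks for idle connections; set it to 0 to disable the check. At each interval the pool manager looks for connections that have been idle longer than Unused timeout or alive longer than Aged timeout, and closes them until the number of connections equals Minimum connections.

Unused timeout: how long a connection may sit idle before it becomes a candidate for closing by the pool manager.

Aged timeout: how long a connection may live before it becomes a candidate for closing by the pool manager. If a connection that has reached its aged timeout is in the middle of a transaction, it is closed only after the transaction finishes. A value of 0 means connections never time out because of age.

Purge policy: how connections are purged when a fatal error occurs.

EntirePool: all connections in the pool are marked stale, and any connection not in use is closed immediately. In most cases this is the best choice.

FailingConnectionOnly: only the connection that caused the error is closed.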

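Sunday, December 6, 2009

DB2: changing the buffer pool size

The Bufferpool value reported by db2 get db cfg is the default size used when a buffer pool is created, as in the figure below.

After a buffer pool has been created, its size can be changed with a command, as described here. First find out which buffer pool to change and how large it is now. This SQL lists the buffer pools in the system:

SELECT * FROM SYSCAT.BUFFERPOOLS

In the result, an NPAGES value of -2 means the buffer pool is currently set to automatic, and BPNAME is the buffer pool's name. With that information, use the following command to change the buffer pool size: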

db2 alter bufferpool ibmdefaultbp immediate size 4

The command above sets the buffer pool to four 4 KB pages. To make it automatic instead, use:

db2 alter bufferpool ibmdefaultbp immediate size automatic
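
To relate page counts to actual memory, the NPAGES and PAGESIZE columns of SYSCAT.BUFFERPOOLS can be multiplied. A minimal Python sketch follows, assuming PAGESIZE is given in bytes; bufferpool_bytes is a hypothetical helper name, and treating every negative NPAGES like -2 is an assumption made here.

```python
from typing import Optional


def bufferpool_bytes(npages: int, pagesize: int) -> Optional[int]:
    """Return the buffer pool size in bytes, or None when it is not fixed.

    NPAGES = -2 means automatic sizing (per the text above); other negative
    values are treated the same way in this sketch.
    """
    if npages < 0:
        return None
    return npages * pagesize


if __name__ == "__main__":
    print(bufferpool_bytes(4, 4096))       # the "size 4" example above
    print(bufferpool_bytes(100000, 4096))  # the ACCT04 pool from the tuning post
    print(bufferpool_bytes(-2, 4096))      # automatic
```

By this arithmetic, 100000 pages of 4 KB come to roughly 390 MB.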

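Once the buffer pool is set to automatic, use the snapshot monitor to see its size at that moment:

db2 get snapshot for bufferpools on db_name ## the result follows (the red box marks the buffer pool's current actual size)

…… (middle part omitted) ……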

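Tuesday, December 1, 2009

Security classes defined in the US Department of Defense Orange Book

The US Department of Defense Orange Book defines a rating scheme for trusted information systems. Because zSecure provides security policy settings for B1, C1 and C2, only those three classes are covered here. To begin with, division B is stricter than division C, and C2 is stricter than C1.

Division C:
This division requires the basic protection that a secure information system (Trusted Computing Base) must provide, together with an audit mechanism to track what users have done.

C1: basic security protection

A system that meets C1 must provide a mechanism that separates users from data, giving basic security protection. This includes enforcing different access limits for different users, for example letting users protect their private data from being read or destroyed by others. The minimum requirements for a C1 environment are as follows.

Security Policy

    Basic access control

    The system must be able to define and control access between users and the objects in the system. There are many ways to implement this control, such as separate permissions for self/group/public or access control lists, so that users can decide which individuals or groups their information is shared with.

Accountability

Identification and Authentication

   Users must be authenticated before they can do anything on the system. The system must provide a mechanism (such as passwords) for users to prove their identity, and it must protect that authentication data from access by anyone without authorization.

Assurance

Operational Assurance

       System architecture

       The system must maintain a domain for its own execution and keep it secure against outside interference (such as modification of its code or data structures). System resources must be divided into subsets by purpose, so that a user is granted only the subset of resources that is needed.

       System integrity

       The system must provide hardware and software features that can periodically verify that its hardware and firmware are operating correctly.

Life-Cycle Assurance

System testing

The system's security mechanisms must be testable, to confirm that they meet the security level claimed in the system documentation. Security testing must ensure that there are no obvious flaws that would let unauthorized users bypass the security mechanisms.

Documentation

Security Features User's Guide

The organization must have a guide that describes the system's protection mechanisms, how to use them, and how they work.

Trusted Facility Manual

For security-related functions (facilities), there must be a manual written for system administrators that describes the cautions to observe when using these administrative privileges.

Test Documentation

The system developer must give the evaluators a document on security testing that describes the test plan, the test procedures, and the test results.

Design Documentation

The system must come with a further document that explains the design philosophy of its security mechanisms and how that philosophy is translated into the system design. If the security mechanisms consist of several components, the interfaces between those components must also be described.

C2 Controlled Access Protection

Systems at this level enforce finer-grained security control. Through logon procedures, auditing of security-relevant events, and resource isolation, each user is made accountable for the actions they perform. The basic requirements for the C2 level are described next.

Security Policy

Basic access control

Beyond the C1 requirement to define the relationship between people and the objects in the system, C2 also requires control over the propagation of access rights, so that access rights cannot be handed out freely to many users.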
