

Redirecting syslogd messages

From time to time we have to deal with syslogd and its messages. How do we select the topics of these messages? Where do we send them? Will they be sent to a local or a remote file, or to several files? There are decisions to be made here. These decisions may warn you about an upcoming hardware failure, and they may help you keep your weekend for you and yours instead of spending it in front of a keyboard…
As we implement a global syslog policy, we may encounter problems: the syslog server may not be receiving the messages sent by the other machines. Are the messages being sent out? Why don’t you check lesson 16?

Posted in AIX, AIX Classes, Real life AIX.


File Systems details on HMC

My IBM engineer called me to tell me that one of my HMCs “called home” complaining about “excessive” usage of one of its file systems. So how do you list HMC file system details? The usual “df” command will not work here: hscroot (you) is working in a restricted shell, remember?

Without becoming a privileged user, you can obtain your HMC file system details by executing:


hscroot@hmci1> monhmc -r disk
Every 4.0s: MONHmc disk Mon Mar 8 14:08:34 2010

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 16184388 7465540 7896724 85% /
udev 517304 148 517156 1% /dev
/dev/sda3 5929732 760692 4867824 14% /var
/dev/sda7 8088148 152132 7525156 2% /dump
/dev/sda8 38686028 203056 36517824 1% /extra
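
A listing like this is easy to filter for file systems that are filling up. The sketch below is only an illustration: flag_full_filesystems is a hypothetical helper name, the default threshold of 80 is an arbitrary choice, and it is meant to run on a workstation that captures the HMC output (for example over ssh), not inside the restricted shell itself.

```shell
# Print mount points whose Use% is at or above a threshold (default 80).
# Reads df-style text on stdin. Banner and header lines are skipped
# automatically because their fifth field is not a number followed by "%".
flag_full_filesystems() {
    threshold="${1:-80}"
    awk -v limit="$threshold" '
        $5 ~ /^[0-9]+%$/ {
            pct = $5
            sub(/%/, "", pct)            # strip the percent sign
            if (pct + 0 >= limit) print $6, $5
        }'
}
```

Fed the listing above, it reports only the root file system. Assuming your HMC level supports one-shot output from monhmc, the captured text could be piped straight into the function.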



Not being a privileged user (hscroot is not a privileged user), you can also zero out log files by executing chhmcfs -o f -d 0, and this is pretty much all you can do.
To clean a file system (for example /, which is 85% used) you need full access to the HMC, and to get it you have to contact IBM to generate the privileged account password based on your HMC serial number.

Posted in AIX, HMC, Real life AIX.


Your Network is Slow…

The other day, the gentleman installing a newer version of an application (on new hardware) sent me (and many other people too) an email asking why the network is so much slower between the new host and the host with the “source” data.
As suggested, I start with the network. To learn which logical adapter does the “talking”, I execute ifconfig -a. Next, let's see the state of its physical adapter: entstat -d ent2. The output shows the expected speed and mode. Dropped packets or errors that could indicate issues with either the NIC or the LAN would show up in the output of netstat -in. There are none, zero. The last step here (or maybe it should be the first one): I execute errpt -a | more and look for any obvious entries in the error log. There are none.
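
On a host with many interfaces, eyeballing the error columns gets tedious. Here is a small sketch that picks out interfaces with non-zero error counters; ifaces_with_errors is a hypothetical helper name, and it assumes the usual column layout in which the last five fields are Ipkts, Ierrs, Opkts, Oerrs and Coll.

```shell
# List interfaces whose input or output error counters are non-zero.
# Reads `netstat -in` text on stdin. Counters are addressed from the end of
# the line because the Address column can be empty (for example on lo0).
ifaces_with_errors() {
    awk 'NF >= 8 {
        ierrs = $(NF - 3); oerrs = $(NF - 1)
        # Report each interface once, even if it has several address lines.
        if ((ierrs + 0 > 0 || oerrs + 0 > 0) && !seen[$1]++)
            print $1, "Ierrs=" ierrs, "Oerrs=" oerrs
    }'
}

# Usage: netstat -in | ifaces_with_errors
```

No output means every counter is zero, which matches what I saw on this host.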

The installer complains about the speed of data moving from the data server to the new application server. I have to see it with my own eyes. How fast does the data move between these machines? I execute the same series of steps from the data server and from both application servers.

From the data server I open ftp sessions to both application servers and send the same amount of data in binary mode. I look at the time it takes to send the data. Next, I reverse the direction: from each application server I open an ftp session and send the same amount of data to the data server. These tests do not involve any disk, file system, or memory buffers, because the transfers are made between the zero and the null devices in binary mode. Look below for details:

From the “existing” application server to data server

hades:/home/duszyk: > ftp datavault
Connected to datavault.
220 datavault FTP server (Version 4.1 Thu Dec 7 16:20:00 CST 2006) ready.
Name (datavault:duszyk):
331 Password required for duszyk.
Password:
230-Last unsuccessful login: Wed Jan 20 09:19:47 EST 2010 on ssh from 000e7f6d9087.hell.org
230-Last login: Wed Feb 10 16:59:25 EST 2010 on ftp from champ1
230 User duszyk logged in.
ftp> bin
200 Type set to I.
ftp> put "|dd if=/dev/zero bs=32k count=10000" /dev/null
200 PORT command successful.
150 Opening data connection for /dev/null.
10000+0 records in.
10000+0 records out.
226 Transfer complete.
327680000 bytes sent in 30.3 seconds (1.056e+04 Kbytes/s)
local: dd if=/dev/zero bs=32k count=10000 remote: /dev/null
ftp> quit
221 Goodbye.
hades:/home/duszyk>

From the new application server to the data server.

champ1:/> ftp datavault
Connected to datavault.CHOP.EDU.
220 datavault FTP server (Version 4.1 Thu Dec 7 16:20:00 CST 2006) ready.
Name (datavault:duszyk): duszyk
331 Password required for duszyk.
Password:
230-Last unsuccessful login: Wed Jan 20 09:19:47 EST 2010 on ssh from 000e7f6d9087.hell.org
230-Last login: Thu Feb 4 11:57:33 EST 2010 on ssh from 000e7f6d9087.hell.org
230 User duszyk logged in.
ftp> bin
200 Type set to I.
ftp> put "|dd if=/dev/zero bs=32k count=10000" /dev/null
200 PORT command successful.
150 Opening data connection for /dev/null.
10000+0 records in
10000+0 records out
226 Transfer complete.
327680000 bytes sent in 3.326 seconds (9.62e+04 Kbytes/s)
local: dd if=/dev/zero bs=32k count=10000 remote: /dev/null
ftp> quit
221 Goodbye.
champ1:/>
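
The two ftp summary lines can be reduced to directly comparable numbers. The sketch below recomputes the throughput from the byte count and the elapsed time; ftp_kbps is a hypothetical helper name, and it assumes the summary line has the shape shown in the transcripts above, with 1 Kbyte taken as 1024 bytes.

```shell
# Print throughput in Kbytes/s from an ftp "bytes sent in" summary line.
# Field 1 is the byte count and field 5 is the elapsed time in seconds.
ftp_kbps() {
    awk '/bytes sent in/ { printf "%.0f\n", $1 / 1024 / $5 }'
}

# Usage: paste a saved ftp session into the function, for example
#   ftp_kbps < ftp_session.log
```

For the two transcripts above this gives roughly 10561 and 96212 Kbytes/s, in line with the figures ftp itself printed.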

As you can see, the new server moves the data about nine times faster than the “old” application server (3.3 seconds versus 30.3 seconds for the same 327,680,000 bytes). So it is not the network but something else that is responsible for the delay. What is it?

Oracle, like any other database, uses asynchronous I/O. For quite a while now, AIX has come equipped with two types of AIO: Legacy and POSIX. What is AIO? Simply put, asynchronous I/O allows the database to perform useful work while its data is being written to disk (the application does not need to wait for the I/O to complete). The application, here the database, puts its I/O requests on a queue, and the AIX asynchronous I/O servers do the rest (they process the requests from the queue). This simple explanation implies that the performance of a database server depends, to a certain degree, on the number of AIO servers. I am not sure if Oracle still uses only the Legacy AIO, so I will check both and compare the results with the existing application server.
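
The hand-off idea can be illustrated with a toy shell analogy. This is not how AIX implements AIO: a background job merely stands in for an AIO server, the one-second sleep stands in for a slow disk write, and async_demo is a hypothetical name.

```shell
# Toy illustration only: a background job plays the role of an AIO server.
async_demo() {
    out=$(mktemp)
    # Hand the slow "write" to a background worker and return immediately.
    ( sleep 1; echo "data written" > "$out" ) &
    worker=$!
    # The caller keeps doing useful work while the write is in flight.
    echo "caller: doing other work"
    # Block only at the point where the result is actually needed.
    wait "$worker"
    echo "worker: $(cat "$out")"
    rm -f "$out"
}
```

With only one worker, requests queue up behind each other; with more workers, more requests are serviced in parallel, which is why the server count matters.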

On the “old” application server:
root@hades:/root > lsattr -El posix_aio0
autoconfig defined STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 4096 Maximum number of REQUESTS True
maxservers 10 MAXIMUM number of servers per cpu True
minservers 1 MINIMUM number of servers True

root@hades:/root> lsattr -El aio0
autoconfig available STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 12288 Maximum number of REQUESTS True
maxservers 200 MAXIMUM number of servers per cpu True
minservers 1 MINIMUM number of servers True
root@hades:/root>

On the new application server:
champ1:/root>lsattr -El posix_aio0
autoconfig available STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 4096 Maximum number of REQUESTS True
maxservers 10 MAXIMUM number of servers per cpu True
minservers 2 MINIMUM number of servers True

champ1:/root> lsattr -El aio0
autoconfig available STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 4096 Maximum number of REQUESTS True
maxservers 10 MAXIMUM number of servers per cpu True
minservers 1 MINIMUM number of servers True
champ1:/root>

The new server is allowed no more than 10 legacy AIO servers per CPU, while the old server may have up to 200 of them. So here is the first indication of why this server is slow. By the way, to see how many AIO servers are currently running, execute:

pstat -a | grep aioserver | wc -l
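
Comparing lsattr listings by eye works for two hosts but does not scale. A minimal sketch for pulling a single value out of the listing follows; lsattr_value is a hypothetical helper name, and it assumes the attribute name is the first field and its value the second, as in the listings above.

```shell
# Print the value of one attribute from `lsattr -El <device>` text on stdin.
lsattr_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage on each host:
#   lsattr -El aio0 | lsattr_value maxservers
```

Run on both machines, it would return 200 on hades and 10 on champ1 for the legacy aio0 device.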

Another factor that can slow an application down is mirroring. Mirroring offers protection, which is why we mirror, but there is a price to pay for it. If mirroring is set up without much thought, AIX configures it very conservatively (for the best level of protection), and this can slow the machine down a lot. Let me check the logical volumes: am I paying an additional performance penalty because MWC (Mirror Write Consistency) and Write Verify are enabled?

# Show the Write Verify and Mirror Write Consistency settings of every jfs2 LV
for lv in `lsvg -l my_vg_name | grep jfs2 | awk '{print $1}'`
do
lslv $lv | egrep 'CONSISTENCY|VERIFY'
echo
done

The answers consistently look like this one:

TYPE jfs2 WRITE VERIFY: off
MIRROR WRITE CONSISTENCY: off

The lines above show me that the logical volumes are set to minimize the cost of mirroring; all is OK here. With no entries in the error log pointing to SAN/storage issues, I am sending an email to the installer to let him know that I am about to set the number of AIO servers on the new machine to the same number as on the existing host. This can be done by executing smitty aio for the Legacy AIO and smitty posixaio for the POSIX AIO.
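
With many logical volumes, the mirroring check can be condensed into a one-word verdict per volume. This is only a sketch: mirror_write_check is a hypothetical helper name, and it assumes the setting is the last word on the WRITE VERIFY and MIRROR WRITE CONSISTENCY lines, as in the output shown earlier.

```shell
# Read `lslv <lvname>` text on stdin and report whether Write Verify or
# Mirror Write Consistency is set to anything other than "off".
mirror_write_check() {
    awk '/WRITE VERIFY:/ || /MIRROR WRITE CONSISTENCY:/ {
             if ($NF != "off") costly = 1
         }
         END { print (costly ? "extra write cost" : "ok") }'
}

# Usage: lslv my_lv_name | mirror_write_check
```

For the settings found on this machine the verdict is "ok" for every logical volume.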

So the moral of this story is? Do not focus your attention only on what other people believe is wrong. Take it into account, but keep your eyes and your focus wide open.

Posted in Real life AIX.


Few words about paging

The latest addition to AIX classes: a short lesson on paging.

Posted in AIX, AIX Classes.


© 2008-2010 www.wmduszyk.com All Rights Reserved - Wszystkie Prawa Zastrzeżone -- Copyright notice by Blog Copyright