
heartbeat networks and PowerHA

HACMP 5.4.1 introduced a new type of heartbeat network: the multi-node one. That raises two questions: how do the traditional and the multi-node heartbeat networks differ, and why do we now have a choice between the two?

The following illustration answers both questions.

Take a cluster with, say, four nodes in an environment with two SAN fabrics. In the traditional heartbeat network, each node needs read/write access to four disks to communicate with its two neighbors: two disks per neighbor (one per fabric, for redundancy), so four disks for two neighbors.
The multi-node heartbeat requires only two disks (one from each fabric), shared by all the nodes in the cluster. This network definitely takes less administrative effort and fewer physical resources. There is one important point to consider before choosing it, though: the utilization of the disk sets underlying the LUNs created for the multi-node network. If these disk sets are heavily used, the LUNs could be slow, and the resulting lost heartbeats undermine the whole reason for this network. In the traditional heartbeat network each disk (LUN) is used by only two nodes. In the multi-node network each heartbeat disk is used by all the nodes, which means more traffic on these disks.
The multi-node heartbeat network requires a disk of at least 32MB configured in an enhanced concurrent volume group with one uniquely named logical volume. The traditional network requires only disks; no logical volumes need to be present. Finally, neither type requires dedicated disks, although dedicated disks are still the preferred way.
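
The disk counts above can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes every node heartbeats with exactly two neighbors, as in the four-node example. The cluster-wide total for the traditional network is derived from that assumption, not taken from the HACMP documentation.

```shell
#!/bin/sh
# Illustrative arithmetic only. Assumption: each node heartbeats with
# exactly two neighbors, as in the four-node example above.
nodes=4
fabrics=2
neighbors=2

# Traditional: one disk per fabric for every neighbor a node talks to.
traditional_per_node=$((neighbors * fabrics))

# Each traditional disk is shared by two nodes, so counting disks
# node by node would count every disk twice; divide by two.
traditional_total=$((nodes * neighbors * fabrics / 2))

# Multi-node: one disk per fabric, shared by all the nodes.
multinode_total=$fabrics

echo "traditional: $traditional_per_node disks per node, $traditional_total in the cluster"
echo "multi-node:  $multinode_total disks in the cluster"
```

Changing nodes shows how the traditional total grows with the cluster while the multi-node total stays at one disk per fabric.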


Posted in HACMP, Real life AIX.


hints, tips and usage of the instfix command

I found this “gem” a few days ago, and for my own good I decided to copy and re-post it here. This is a re-post of IBM TechNote Ref#T1011859, definitely worth reading, especially during the yearly OS-upgrade cycle. For your convenience and mine:

usage of the instfix command
Hints, Tips and usage of the ‘instfix’ command
This document will describe many of the various and most common uses of the ‘instfix’ command.

The main topics covered will include:

– TL versus ML – Which is correct?
– Usage of the ‘instfix’ command to check for APARs
– Usage of the ‘instfix’ command to install APARs
– Adding missing APAR information to the ‘fix’ object class of the ODM

Posted in Real life AIX.


creating multi-node disk heartbeats with smitty cl_manage_mndhb

This post shows how to set up a multi-node disk heartbeat with smitty cl_manage_mndhb. What is the difference between the traditional and the multi-node disk heartbeat? The traditional one is a “network” between two nodes: one disk is shared by exactly two nodes, and no logical volume is needed. The multi-node one allows one disk to be shared by multiple nodes, and it requires the creation of a logical volume. The “single point of failure” principle still applies: it is not a good idea to have only one multi-node heartbeat disk in a cluster.

Executing this shortcut presents the administrator with a screen offering the following choices:

Create a new Volume Group and Logical Volume for Multi-Node Disk Heartbeat
Add a Concurrent Logical Volume for Multi-Node Disk Heartbeat
Show Volume Groups in use for Multi-Node Disk Heartbeat
Stop using a Volume Group for Multi-Node Disk Heartbeat
Configure failure action for Multi-Node Disk Heartbeat Volume Groups

It is peculiar that the first two options imply the creation of a logical volume, since setting up a heartbeat disk the standard way does not require you to create one.

I tried this option today, and I have to say that I really like it: it is simpler and does everything in a single step. But my very first attempt failed. Looking at the output from smitty, it quickly became apparent why. See the error message for yourself:

Error executing mklv -y mndhb_lv_01 -u 1 -c 1 -e m -t jfs -v n -w n -r n mndhb_vg_01 32 hdisk2 on node #####

The command in that message requests a logical volume whose size is 1 x 32 = 32MB. What size is the disk (hdisk2) I specified for this action?

bootinfo -s hdisk2

This disk is 20MB. How is it possible to fit a 32MB logical volume on a 20MB physical disk? It is not! So to get this show running, I had to ask the SAN administrator to “expand” this LUN and the LUN from the other SAN controller (the other fabric) to 40MB (I like this “round” number). Then, after running chvg -g vg_name to make AIX aware of the new disk size, I could finish what I intended to do.
From now on, I have to remember to always ask for 40MB LUNs if they are intended for use as “multi-node” heartbeat disks.
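
The size check that would have saved me the failed attempt can be sketched in a few lines of shell. The function name lv_fits is hypothetical, and the sketch ignores the space a volume group keeps for its own bookkeeping, so treat a “fits” answer as necessary but not sufficient (one more reason to ask for 40MB rather than exactly 32MB).

```shell
#!/bin/sh
# Hypothetical helper: is a disk of disk_mb megabytes large enough for
# lps logical partitions of pp_mb megabytes each?
# Assumption: volume group overhead on the disk is ignored.
lv_fits() {
    disk_mb=$1
    lps=$2
    pp_mb=$3
    [ "$disk_mb" -ge $((lps * pp_mb)) ]
}

# On AIX the first argument could come from: bootinfo -s hdisk2
if lv_fits 20 32 1; then
    echo "20MB disk: large enough"
else
    echo "20MB disk: too small for a 32MB logical volume"
fi
```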

Posted in HACMP, Real life AIX.


working with “strangely” named files

Today, I decided to put all my JPEG files onto one flash drive: I got a “picture frame”! There was just one problem. My camera names files using the same scheme in every sub-directory, and each sub-directory is named after the current date. So today I finally have to spend some time renaming all these files so that they are uniquely named. To cut the suspense, look below and you will see what I mean.

-rw-r--r--    1 mduszyk  staff 2706165 Mar 19 2011  Picture 201.jpg
-rw-r--r--    1 mduszyk  staff 2783445 Mar 19 2011  Picture 202.jpg
-rw-r--r--    1 mduszyk  staff 2480088 Mar 19 2011  Picture 203.jpg
-rw-r--r--    1 mduszyk  staff 2553840 Mar 19 2011  Picture 204.jpg
-rw-r--r--    1 mduszyk  staff 2476612 Mar 19 2011  Picture 205.jpg
-rw-r--r--    1 mduszyk  staff 2572827 Mar 19 2011  Picture 206.jpg

It could be because of the early Sunday morning. I mean a really early one; I have just poured my first cup of coffee. Without much thinking (the body is at the keyboard but the brain is still in bed) I type:

for file in `ls | awk '{print $1}'`
do
        mv $file aaa$file
done

Before the hand that just hit the ENTER key can move to grab the coffee cup, my eyes catch AIX spitting back garbage in pretty much the following format.

ls: 0653-341 The file Picture does not exist.
ls: 0653-341 The file 202.jpg does not exist.

Of course, the rename does not work and nothing happens. So what is going on here? Apparently there is some character between Picture and the number that follows it. To the ls command it looks like two objects, not one: the first is called Picture and the second is a number with the extension jpg. Both awk and the unquoted shell variable split the name on whitespace, so it never reaches the command as a single word.

I scratch my head, still no coffee for me. Mickey the Cat has just jumped onto my table and looks at me with a look in his eyes that tells me “FEED ME!!!!”. I obey without a word.

Getting Mickey’s food, I get my first sinister idea! Let’s use ls -i to get the inode associated with each file and process the inodes with find -inum to rename the files. I serve Mickey his food and I feel empowered; life is so great!

Back at the keyboard, I execute:

for file in `ls -i | grep Picture | awk '{print $1}'`
do
     find . -inum $file -exec mv aaa$file {} \;
done

Pretty much as before, it does not work. Two times down for me.

It is obvious that what I just entered does not get the file name to rename; aaa$file acts as the first argument to mv, not the second. I drink my coffee, stare at the screen and think. I get an idea and I type it:

for file in `ls -i | grep Picture | awk '{print $1}'`
do
    find . -inum $file -exec mv {} aazaa$file.jpg \;
done

In the last find snippet, the {} takes the place of the first argument of mv: the file matching the inode number delivered by ls -i. The second argument is built as aazaa$file.jpg, and now mv works as designed. Since $file holds the inode number, Picture 201.jpg is no more; it is replaced by a file named aazaa followed by its inode number and the .jpg extension. After I am done renaming this set of files and they are moved to the flash drive, the next batch will get a different prefix, and so forth until all of the images are processed and safely stored on my new flash drive in the new picture frame.

Of course, there are other ways to handle this situation. For example, one could figure out which character separates Picture from the number and use sed to either remove it or replace it with something else, resulting in one “solid” file name to process. By the way, how do you delete a file with a blank character at the end of its name? Inodes and find are my bet.
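
One of those other ways is to let the shell expand the names and to quote the variable, so that a name containing a blank stays in one piece. The sketch below is not what I ran that morning; the function name rename_pictures and the prefix are made up for illustration, and it assumes the separating character is an ordinary space.

```shell
#!/bin/sh
# Hypothetical helper: prefix every .jpg file in a directory and drop the
# spaces from its name. Quoting "$f" keeps "Picture 201.jpg" as one word.
# Assumption: the separator in the file names is an ordinary space.
rename_pictures() {
    dir=$1
    prefix=$2
    for f in "$dir"/*.jpg; do
        [ -e "$f" ] || continue          # nothing matched the pattern
        base=${f##*/}                    # strip the directory part
        new=$(printf '%s' "$base" | tr -d ' ')
        mv "$f" "$dir/$prefix$new"
    done
}

# Example call (the path is a placeholder):
#   rename_pictures /path/to/batch1 aaa_
```

Unlike the inode approach, this keeps the original picture number in the new name, so Picture 201.jpg would become aaa_Picture201.jpg.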

Posted in Real life AIX.


If I were to give you a gift, what would it be?

In your own place, at your own time – I hope you will enjoy it. Follow this link with your eyes wide open 🙂

Posted in Real life AIX.


What do they mean when they say “stanza”?

and why you should never manually edit files in /etc/security…

A lot of AIX configuration files use the “stanza” format. Look at /etc/qconfig or almost any file in /etc/security to see what I mean. So what is a “stanza”?
It is a block of ASCII text that starts with a token (a word) followed by a colon and ends with at least one blank line.
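
That definition is simple enough to express in a few lines of awk. The sketch below prints one stanza from a file: everything from the “name:” line up to the next blank line. The function name show_stanza is hypothetical, and the sketch assumes the stanza name sits alone on its line with nothing before or after it.

```shell
#!/bin/sh
# Hypothetical helper: print the stanza named $1 from the file $2.
# A stanza runs from its "name:" line to the next blank line.
# Assumption: the header line contains nothing but "name:".
show_stanza() {
    awk -v name="$1" '
        $0 == (name ":") { found = 1 }   # header line, for example "brownh:"
        found && NF == 0 { exit }        # a blank line ends the stanza
        found { print }
    ' "$2"
}

# Example call:
#   show_stanza brownh /etc/security/user
```

If the blank line after a stanza is missing, the function keeps printing into the next stanza, which is exactly the problem described in the rest of this post.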

Why do I write about it today? Well, yesterday I asked my colleague Jon (the Tivoli Management Framework administrator, among other things) to execute on all our AIX hosts (he can do it with a single keystroke) one “small” script that I put together to enable LDAP authentication for two specific users. Here are the contents of this script (the echo command is one long line, not several as your browser may show):


rmuser -p svcvulscan 
rmuser -p svcvulscan2

echo "svcvulscan:\n\tSYSTEM = LDAP\n\tregistry = LDAP\n\nsvcvulscan2:\n\tSYSTEM = LDAP\n\tregistry = LDAP\n" >> /etc/security/user

In a perfect world this should work like a charm, but I forgot that this is the real world. What happened? On some AIX hosts, the user whose stanza was last in the file before the script ran could no longer log in. Why? If you look above at the line starting with the echo statement, you will notice that the entry svcvulscan: just gets appended to the file. Plain and simple.
But what happens if the last line of /etc/security/user is not a “blank” line? In that case, the last stanza in the file extends, “swallowing” the svcvulscan entry and, as a result, turning the last user in the file into an LDAP user. The following illustrates what I mean.

brownh:
        admin = false
svcvulscan:
        SYSTEM = LDAP
        registry = LDAP

svcvulscan2:
        SYSTEM = LDAP
        registry = LDAP

To really make the point and to clear any doubts, look at the following:

# grep -p brownh /etc/security/user
brownh:
        admin = false
svcvulscan:
        SYSTEM = LDAP
        registry = LDAP

At this moment, AIX will not allow brownh to log in; AIX cannot make sense of this user’s stanza in /etc/security/user! And it is not just this user: svcvulscan will not be able to function either.
To fix it, yours truly had to insert a blank line above svcvulscan to mark the end of the stanza defining brownh.

Could this have been avoided? Sure. Look below.


rmuser -p svcvulscan
rmuser -p svcvulscan2

echo "\nsvcvulscan:\n\tSYSTEM = LDAP\n\tregistry = LDAP\n\nsvcvulscan2:\n\tSYSTEM = LDAP\n\tregistry = LDAP\n" >> /etc/security/user

Do you see that the script now emits a blank line before inserting the stanzas (the \n in front of svcvulscan:)? It does not really matter how many blank lines separate the stanzas, but there must be at least one for a stanza to be a stanza. 🙂

What I have described in this post would not have happened if, on some machines, at one point or another and for some “then” valid reason, some AIX administrator (it could have been me) had not manually edited /etc/security/user and forgotten to leave at least one blank line at the end of the file. Have a good day!
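
For anyone who still wants to append stanzas from a script, here is a more defensive sketch than a bare echo. The function name append_stanza is hypothetical. It inspects the end of the file first, adds a separating blank line only when one is missing, and leaves a blank line behind for the next person. On AIX itself, letting commands such as chsec or chuser edit these files is probably the safer route; that is outside this sketch.

```shell
#!/bin/sh
# Hypothetical helper: append stanza text ($2) to a stanza file ($1).
# A separating blank line is added first unless the file is empty or
# already ends with one; a blank line is also left after the new text.
append_stanza() {
    file=$1
    text=$2
    if [ -s "$file" ]; then
        # Last byte is not a newline: finish the incomplete line first.
        [ -z "$(tail -c 1 "$file")" ] || printf '\n' >> "$file"
        # Last line is not blank: add the blank line that ends the stanza.
        [ -z "$(tail -n 1 "$file")" ] || printf '\n' >> "$file"
    fi
    printf '%s\n\n' "$text" >> "$file"
}

# Example call (try it on a copy of the file first):
#   append_stanza /tmp/user.copy "$(printf 'svcvulscan:\n\tSYSTEM = LDAP\n\tregistry = LDAP')"
```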

Posted in Real life AIX.


Improving PowerVM Environment

There is no question about it: PowerVM is here to stay. Its flexibility, the ease of deploying new “partitions” combined with the ease of modifying existing ones, has transformed PowerVM from a novelty into something expected in every data center housing AIX. Earlier, when building PowerVM environments (VIOS + partitions) and, to be precise, when configuring the networking side of these environments, I noticed that the network adapters of “my” partitions were all attached to one virtual switch (Ethernet 0).

So how did this Ethernet 0 switch come to be? And if the digit 0 following Ethernet hints at the possibility of additional switches (for example Ethernet 1, 2, ...), how does one create and use them? Are there any advantages or disadvantages to building and employing multiple Ethernet switches with PowerVM? For those interested in this subject, I recommend studying this document: “Using Virtual Switches in PowerVM to Drive Maximum Value of 10Gb Ethernet”. Thanks, Rob, for locating it!

Usually, if one builds two VIO servers in a frame, one does it for redundancy, to protect partitions against the failure of one of the VIO servers delivering resources to the frame’s partitions. If this is the case, then a single Ethernet switch is a single point of failure, right? This could be one more reason for you to get acquainted with the above document.

Posted in Real life AIX, VIO.


VIOS Advisor Explained

“The goal of the VIOS advisor is not to provide another monitoring tool, but instead have an expert system view performance metrics already available to the customer and make assessments and recommendations based on the expertise and experience available within the IBM systems performance group.”

Sounds interesting? It does? Follow this link to the latest article by Rob McNelly in the “IBM Systems Magazine“, AIX edition.

Posted in AIX, Real life AIX.


LDAP users can log into AIX with no or invalid password

Apparently it is nice to be liked. Today, I installed the LDAP client on a set of Oracle test machines, and shortly afterwards Adi told me that he could ssh to other hosts without a password or with a wrong one. Oops, a big oops indeed.

These two machines are running AIX and all other ones that I have switched to TDS/AD authentication are AIX or and they do not demonstrate these dangerous “abilities”. The few hosts with the LDAP client running AIX also do not show this behavior.

This dangerous issue apparently is specific to which explains why earlier or later OS versions do not show it. IBM has an emergency fix that neutralizes this problem, known as IZ97416. To install it, execute:

emgr -e IZ97416.110329.epkg.Z

Now, do verify that the previous password-less logins from the AIX LDAP client to other AIX hosts are no longer possible.
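
Before re-testing the logins, it may help to confirm that the fix is really in place. On AIX, emgr -l lists the installed interim fixes; the small wrapper below, with the hypothetical name efix_listed, only searches that listing for a label. It is a sketch, and it does not replace the actual login test described above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the text on standard input mentions the
# given fix label. On AIX the input would come from "emgr -l".
efix_listed() {
    grep "$1" > /dev/null
}

# Usage on an AIX host (not run here):
#   emgr -l | efix_listed IZ97416 && echo "IZ97416 is installed"
```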

Posted in ldap, Real life AIX.


Copyright © 2016 - 2017 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.