

/etc/fstab, UUIDs, RedHat and VMware

Learn through suffering, really!

I started to build VMware RedHat guests with no experience whatsoever, and while doing so I followed nothing but what I had already learnt and what I do know – AIX. While building, I used the LINUX Logical Volume Manager to create volume groups and logical volumes, which I topped with file systems. Following best practice, I populated /etc/fstab with the UUIDs of the disks. Why? Because the literature says that the universally unique identifier (whose AIX equivalent is the PVID) is the preferred way to associate disks with their file systems: once a UUID is assigned to a disk it should never change (a device name like /dev/sda may change to something else if more disks are added to the host).

To see the currently used UUIDs, execute the blkid command.

#  blkid | sort
/dev/mapper/oracle_vg-u01_lv: UUID="7a9e4e58-174b-4567-93a7-9a479d4ce999" TYPE="ext4"
/dev/mapper/vg_sys-lv_home: UUID="8e393748-4432-4752-900d-dbdd71a4f7bb" TYPE="ext4"
/dev/mapper/vg_sys-lv_root: UUID="c2c612e2-ccf0-4311-b705-c1788a8afbbf" TYPE="ext4"
/dev/mapper/vg_sys-lv_swap: UUID="03f634ac-28d6-44fb-9742-dfb1e621e358" TYPE="swap"
/dev/mapper/vg_sys-lv_temp: UUID="0e830c56-c697-403f-993e-ae442e6827f8" TYPE="ext4"
/dev/mapper/vg_sys-lv_usr: UUID="cbda1389-a042-4059-b43a-bb0920e02d2d" TYPE="ext4"
/dev/mapper/vg_sys-lv_var: UUID="48e1864c-1e09-4204-88ad-7ca16429c8cd" TYPE="ext4"
/dev/sda1: UUID="e7d8928a-fc04-40fd-b625-e9da99732c3b" TYPE="ext4"
/dev/sda2: UUID="6wZrl9-XhOr-RALD-9Sbp-Lhoi-GRYe-EIs4LH" TYPE="LVM2_member"
/dev/sdb: UUID="1mddKd-lDw7-YjDe-BVt9-4pKO-j8Oh-w3x6UA" TYPE="LVM2_member"

From the output above, we can all see that the logical volume named u01_lv (a member of the oracle_vg volume group) is assigned UUID=7a9e4e58-174b-4567-93a7-9a479d4ce999.

The next listing shows the /u01 entry of the host's /etc/fstab file, using the UUID of u01_lv – as I originally entered it.

UUID=7a9e4e58-174b-4567-93a7-9a479d4ce999  /u01 ext4 defaults   1 2

Then I noticed that the "other" logical volumes in this file had 1 and 2 in the last two columns, so without much thinking I followed the already established pattern and used them too. By the way, these other logical volumes belonged to the guest's "rootvg".

Days became weeks, weeks became months. Everything worked like a charm. About a year later, a guest had to be rebooted and it did not come back. The console "said" that the UUID of the disk holding u01_lv had changed…… Isn't this just peachy?

At this point, I recognized the mistake I had made without even knowing it. For as long as the UUID of the sdb disk stayed the same, my mistake remained hidden. But eventually…… For reasons beyond our knowledge, one disk UUID (which is supposed to be "unique" and "constant") changed, and its corresponding entry in the /etc/fstab file was no longer true, which resulted in the following. First, the LINUX kernel "sees" that /dev/sdb has a different UUID and throws a message about it to the console. Next, the kernel wants to mount /u01, which is not there, so it decides to fsck the logical volume – and the volume is not there either. The kernel has no idea what is going on and it surrenders, delivering us to "Maintenance Mode" – please help me! Here, nothing can be done, because /etc/fstab cannot be edited while / is mounted read-only!

It took a few minutes and the guest was booted in rescue mode – in this mode you can modify the contents of the / file system. What contents? Like, for example, /etc/fstab. Inside this file, both the start and the end of the line describing the /u01 file system had to be modified. To obtain the new UUID, one needs to execute the blkid command. Next comes the end of the line – the last two digits to be exact. As per Mike's advice, I replaced the existing numbers with 0 (zero). So now the line reads:

UUID=fa9e4e58-174b-4567-93a7-9a479d4ce341  /u01 ext4 defaults   0 0
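For the record, the whole repair from the rescue shell boils down to just a few commands – a minimal sketch, assuming the RedHat rescue environment mounts the guest's root file system under /mnt/sysimage (adjust the paths if yours differs):

# from the rescue shell: read the new UUID of the logical volume backing /u01
blkid /dev/mapper/oracle_vg-u01_lv

# the guest's root file system is typically mounted under /mnt/sysimage here;
# put the new UUID into its fstab and zero out the last two fields
vi /mnt/sysimage/etc/fstab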

Why the two zeros? After the host was back on-line, I spent some time and actually read – you guessed it – the man pages. Actually, I got this info from www.linfo.org, where I found a very detailed description of the /etc/fstab contents. Below are the two sections dealing with the fifth and the sixth field.

(5) The fifth column is used to determine whether the dump command will backup the file system. This column is rarely used and has two options: 0, do not dump, which is used for most partitions, and 1, dump, which is used for the root partition.
(6) The sixth column is used by the fsck program to determine the order in which the computer checks the file systems when it boots. The three possible values for the column are: 0, do not check, 1, check first (only the root partition should have this setting) and 2, check after the root partition has been checked. Most Linux distributions set all the partitions to 0, except for the root partition. If maintenance is important, 2 should be used, although this can increase the amount of time required for booting.
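Putting the pieces together, here is the anatomy of the corrected /u01 line, with the six fields labeled (the labels are mine, added for clarity):

# <device>                                  <mount>  <type>  <options>  <dump>  <fsck-order>
UUID=fa9e4e58-174b-4567-93a7-9a479d4ce341   /u01     ext4    defaults   0       0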

Going back to the changed UUID – it could be that this is the reason why: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1026710

During this "difficult" time, our new LINUX/TIVOLI administrator Mike "Ski" Swierczynski (also an ex-Marine) walked me through the recovery process, and he was the one who pointed out the wrong options (the last two columns) in /etc/fstab – thanks Mike and Semper Fi!

Posted in Real life AIX.


recovering rootvg missing vSCSI disks

Getting ready for an AIX upgrade, it became apparent that "something" had happened to one of the two VIO servers of this managed system (frame). All sixteen guests (lpars) had a missing disk. The missing disk was always hdisk0, which (in our case) points to vios1 (the hdisk1's are delivered from the disk pool of vios2).

At this time, both VIO servers are fully operational, so the current question is: how do we quickly recover and restore the rootvg mirroring on the affected partitions?

Start with listing the host dump devices.

# sysdumpdev -l
primary              /dev/dump0
secondary            /dev/dump1
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    FALSE
dump compression     ON
type of dump         fw-assisted
full memory dump     disallow

Temporarily disable them.

# sysdumpdev -P -p /dev/sysdumpnull
# sysdumpdev -P -s /dev/sysdumpnull

Verify the last two steps:

# sysdumpdev -l
primary              /dev/sysdumpnull
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     ON

After LVM detects issues with a disk lasting longer than a "certain" length of time, it declares the disk missing and is no longer interested in this device. You have to make LVM "re-analyze" the rootvg disks by executing the varyonvg rootvg command. LVM will change the state of the previously missing disk to active and, since in this case the volume group is mirrored, this will also automatically trigger the syncvg command, resulting in the gradual disappearance of stale partitions.

# varyonvg rootvg

Verify that both disks are active

# lsvg -p rootvg

Check the logical volume synchronization by executing the ps command; if lvsync is not running, start it by executing syncvg -P 32 -v rootvg.
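In practice it looks something like this (a quick sketch; the -P 32 asks for 32 logical partitions to be synchronized in parallel):

# is a synchronization already running?
ps -ef | grep -w lvsync | grep -v grep

# if not, start one and watch the stale partitions drain away
syncvg -P 32 -v rootvg
lsvg rootvg | grep -i stale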

Finally, activate both dumps 🙂

# sysdumpdev -P -p /dev/dump0
# sysdumpdev -P -s /dev/dump1

Copy/paste/reuse on all remaining partitions.
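If you have password-less ssh from a management host, the same sequence can be pushed to every affected LPAR in one loop – a rough sketch only, and the lpars.txt host list is hypothetical:

# disable the dump devices and re-activate rootvg on every affected LPAR
for h in $(cat lpars.txt); do
    echo "### $h"
    ssh root@$h 'sysdumpdev -P -p /dev/sysdumpnull
                 sysdumpdev -P -s /dev/sysdumpnull
                 varyonvg rootvg
                 lsvg -p rootvg'
done
# once the syncing is done, re-enable /dev/dump0 and /dev/dump1 the same way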

Posted in Real life AIX.


extending file system in LINUX

This morning there is a ticket in my queue to extend the /tmp file system on one of our RedHat 6.2 hosts to 6GB. A few weeks earlier, when I needed to do that, I used two commands: first lvextend, to enlarge the underlying logical volume, and then resize2fs, to extend the file system to use the additional capacity of its logical volume.
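For comparison, that two-step route looks roughly like this (a sketch against the same logical volume used below):

# step 1: grow the logical volume to 6 GiB
lvextend -L 6G /dev/mapper/vg_sys-lv_temp
# step 2: grow the ext4 file system on-line to fill the new space
resize2fs /dev/mapper/vg_sys-lv_temp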

Today, I actually took the time to read man lvextend and realized that, just like in AIX, this operation can be done in a single step. The objective is to make /tmp 6GB big.

# df -h /tmp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_sys-lv_temp
                        4.0G  137M  3.7G   4% /tmp

# lvextend -r -L 6G /dev/mapper/vg_sys-lv_temp
  Extending logical volume lv_temp to 6.00 GiB
  Logical volume lv_temp successfully resized
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/vg_sys-lv_temp is mounted on /tmp; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 1
Performing an on-line resize of /dev/mapper/vg_sys-lv_temp to 1572864 (4k) blocks.
The filesystem on /dev/mapper/vg_sys-lv_temp is now 1572864 blocks long.

# df -h /tmp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_sys-lv_temp
                      6.0G  137M  5.5G   3% /tmp

Note: It is the -r flag that makes lvextend extend both the logical volume and the file system associated with it. Using -L 6G sets the target size to 6GB, while using -L +6G would grow /tmp by 6GB, to 10GB.
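In other words, the relative form grows the volume by the given amount rather than to it – for example (a sketch against the same volume):

# grow /tmp by another 2 GiB, i.e. from 6 GiB to 8 GiB, file system included
lvextend -r -L +2G /dev/mapper/vg_sys-lv_temp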

Posted in Real life AIX.


the cost of distraction …… restoring permissions on a filesystem and its contents

Today is a special day for us. We screwed up and now we need to change the ownership of every file and directory in a file system. How did we get there? We needed to change the owners of file systems whose names followed the pattern /u01 through /u60. So what we did was chown oracle.dba /u* instead of chown oracle.dba /u[0-9][0-9]!
As a result, the /usr file system and all of its contents got a new "mommy" and "daddy", known from now on as oracle.dba ….. Now the original owners have to be restored, and they may well not be root.system …..
So what did we do? We located a host with an identical version of AIX as the two hosts we had just messed up. After logging in, we executed the following script:

#!/bin/ksh
# Walk /usr on a healthy host of the same AIX level and emit one
# "chown owner.group file" plus one "chmod NNNN file" line per entry.
# The result (/tmp/reset.perms.out) is later run on the damaged host.
cd /tmp
rm reset.perms.out 2>/dev/null
find /usr -ls | awk '{print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' |
awk '{
  if ( NF == 9 ) {
    # $1=mode string, $3=owner, $4=group, $9=full path
    printf ("chown %s.%s %s\n", $3, $4, $9)
    # translate the symbolic mode (e.g. -rwsr-xr-x) into an octal number
    perms = 0
    if (substr($1,2,1)  == "r") perms = perms + 400
    if (substr($1,3,1)  == "w") perms = perms + 200
    if (substr($1,4,1)  == "x") perms = perms + 100
    if (substr($1,4,1)  == "S") perms = perms + 4000
    if (substr($1,4,1)  == "s") perms = perms + 4100
    if (substr($1,5,1)  == "r") perms = perms + 40
    if (substr($1,6,1)  == "w") perms = perms + 20
    if (substr($1,7,1)  == "x") perms = perms + 10
    if (substr($1,7,1)  == "S") perms = perms + 2000
    if (substr($1,7,1)  == "s") perms = perms + 2010
    if (substr($1,8,1)  == "r") perms = perms + 4
    if (substr($1,9,1)  == "w") perms = perms + 2
    if (substr($1,10,1) == "x") perms = perms + 1
    if (substr($1,10,1) == "T") perms = perms + 1000
    if (substr($1,10,1) == "t") perms = perms + 1001
    printf ("chmod %d %s # %s\n", perms, $9, $1)
  }
}' > reset.perms.out

This script scans /usr and records the owners and permissions of everything it finds as ready-to-run chown and chmod commands. This information is stored in the file /tmp/reset.perms.out, which was then copied with scp to each host that needed its /usr ownership restored. Next, reset.perms.out was made "executable" with chmod 700 reset.perms.out and executed. Nice!
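The last leg looks more or less like this (a sketch; the host name is hypothetical and you obviously need root on the damaged box):

# copy the generated command list to the damaged host
scp /tmp/reset.perms.out root@brokenhost:/tmp

# on the damaged host: make it executable and let it restore /usr
chmod 700 /tmp/reset.perms.out
/tmp/reset.perms.out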

You do know what to change if you need to use this script on a different file system, right? Yes, just replace the /usr above with the file system of your choice.

Posted in Real life AIX.



adventures with npiv, xiv and brocade switches

Something really strange happened to me today ….. A year or so ago, I built two "lpars" using the standard (for this site) approach – the rootvg disks as vscsi devices and the SAN disks via two virtual FC adapters from each VIO server (four FC adapters per LPAR). Later, each partition received two SAN disks and everything went dormant for almost six months. Last Friday, I asked for more storage, got it, created volume groups, logical volumes and file systems, and at the end of the day I rebooted both hosts and went home.

On Monday, I was back in my cube to continue where I had left off, and here I received my surprise – one of the two hosts had no SAN storage! Executing lspv showed only the two vSCSI disks defining the rootvg. Executing lsdev -Cc disk showed the "missing" disks, but in the "Defined" state! I spent a few minutes trying to "resurrect" them with the usual rmdev -dl hdisk# / cfgmgr, to no avail. I gave up and took a moment to think about it.

I know that if resources (virtual adapters, memory, CPU, and so forth) are added "dynamically" via the HMC with the "Dynamic Logical Partitioning" option and the host is later rebooted, the added resources will be "gone" – they will disappear. By the way, this is something I cannot get used to (I think DLPAR changes should be permanent). I also know that to make a DLPAR "modification" permanent, I have to modify the partition PROFILE and, if a reboot is required, power the host off and then ACTIVATE it with its profile on POWER ON!

So what went wrong this time? Is this an AIX/VIOS/HMC error, or mine, or simply some kind of witchcraft? I have no idea. I called for help, and the IBM engineer informed me that they do not have any other customers reporting something like this …… This leaves me and the witchcraft – let's spread the guilt equally 🙂

But there is a lesson to be learnt from this experience, and if you are new to VIOS, NPIV, AIX and everything in between, this post could be your lesson too. Flip to the next page and you will find valuable and interesting material (straight from IBM customer support) showing how to zone an XIV LUN (via a Brocade switch) to an AIX partition with virtual FC adapters. Enjoy it!

Posted in AIX, Real life AIX.


rpm for the AIX administrator

As I go deeper and deeper into the LINUX woods, it becomes apparent that I had better learn something about them both – rpm and yum. So what are they? The first one was originally developed to install software packages from local repositories, meaning someone downloaded the package or packages to the host prior to executing rpm. The second one (the Yellowdog Updater, Modified) was created with remote software repositories in mind. You could say that rpm is like installp and yum is sort of like SUMA (both in AIX), except that rpm was later modified and you can now use it to install a software package from a remote source – the plot thickens …..

What follows is a listing of examples of rpm in action.
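To start, a few typical invocations as a sketch (the lsof package and the .rpm file name are just examples):

# list every package installed on the host
rpm -qa

# install (or, with -Uvh, upgrade) a locally downloaded package, with progress hashes
rpm -ivh lsof-4.87-1.aix6.1.ppc.rpm

# which installed package owns a given file?
rpm -qf /usr/bin/lsof

# list the files delivered by an installed package
rpm -ql lsof

# remove a package
rpm -e lsof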

Posted in LINUX.



enabling ftp on a LINUX (RedHat) host

Someone asked me to enable ftp on a LINUX host – what a lesson in humility!!!! I had no clue what to do! I had to learn! So, if you are like me – an otherwise "experienced" AIX administrator who is learning LINUX – you may benefit from this post.

Remember that in RedHat the ftp server package is called vsftpd …… on other LINUX distributions this name may vary(?). Also, the package may be installed while the ftp service is not activated …..

To start, check if ftp is set up to run on your box. Execute the following command to see if ftp has been installed and set to start at boot.

$ chkconfig --list | grep -i vsftpd
vsftpd     0:off  1:off  2:off  3:off  4:off  5:off  6:off

It looks like it is installed but not activated. Make it run at runlevels 3, 4 and 5.

$ chkconfig --level 345 vsftpd on

$ chkconfig --list | grep -i vsftpd
vsftpd     0:off  1:off  2:off  3:on  4:on  5:on  6:off

Either bounce the box or start the service by hand.

$ service vsftpd start

If the package is not installed (rpm -qa | grep -i ftp returns nothing), then install it with yum install vsftpd, which downloads it directly from the configured RedHat online repository.
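In other words, something along these lines (a sketch, assuming the host can reach a configured yum repository):

# is any vsftpd package present?
rpm -qa | grep -i vsftpd

# if not, pull it from the repository
yum install vsftpd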

To identify the users prohibited from using this service, view the file shown below:

$ cat /etc/vsftpd/ftpusers
# Users that are not allowed to login via ftp
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
news
uucp
operator
games
nobody

If you want to allow root to use ftp, remove the appropriate entry from /etc/vsftpd/ftpusers.

By the way, LINUX has more than one ftp "package" to choose from.

Posted in LINUX.

Tagged with , .


entstat equivalent for FC adapters in AIX.

The entstat -d ent# command delivers a lot of useful information about a network adapter. We often use it to determine if the adapter has "LINK" and at what speed it moves data to and from the host. Its equivalent for FC adapters is called fcstat, and it works for both physical and virtual FC adapters.

# fcstat -e fcs0 | grep -E "Type|fcs|Port Name|Port Speed"
FIBRE CHANNEL STATISTICS REPORT: fcs0
Device Type: 8Gb PCI Express Dual Port FC Adapter (df1000f333108a03) (adapter/pciex/df1000f114108a0)
World Wide Port Name: 0x10000220C985EA6B
Port Speed (supported): 8 GBIT
Port Speed (running):   8 GBIT
Port Type: Fabric
# fcstat -e fcs0 | grep -E "Type|fcs|Port Name|Port Speed"
FIBRE CHANNEL STATISTICS REPORT: fcs0
Device Type: Virtual Fibre Channel Client Adapter (adapter/vdevice/IBM,vfc-client)
World Wide Port Name: 0xC05076039B790032
Port Speed (supported): UNKNOWN
Port Speed (running):   8 GBIT
Port Type: Fabric

Even more FC adapter statistics can be obtained using the -D option, for example fcstat -D fcs0. The last command generates almost twice as much information as fcstat -e fcs0.
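To check the link state and speed of every FC adapter on a host in one go, a small loop like this can help (a sketch only, assuming the usual fcs# device naming):

# walk every FC adapter (physical or virtual) and show its type, WWPN and speed
for a in $(lsdev -Cc adapter -F name | grep "^fcs"); do
    echo "### $a"
    fcstat -e $a | grep -E "Type|Port Name|Port Speed|Port Type"
done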

Posted in Real life AIX.



How to capture boot debug of a SAN boot PowerVM Virtual I/O Server or AIX/NPIV client partition that is failing to boot?

Does it interest you? Go to page number 2.

Posted in Real life AIX.


links to various performance tools for AIX

You will find them all in one place by following this link: https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/Power%20Systems/page/Other%20Performance%20Tools

Posted in AIX.




Copyright © 2016 - 2017 Waldemar Mark Duszyk. All Rights Reserved.