an empty file system with no free capacity ….?

It sounds like an oxymoron, doesn’t it? I saw it once many years ago. Today, a colleague of mine noticed it on one of his machines.
Look below: /epic/sup07 has very little free space left.

/epic/sup07/ifc/stream> df -g .
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/sup07_lv      5.00      0.01  100%        9     1% /epic/sup07

After a closer inspection, we can locate only a pair of sub-directories, all empty.

/epic/sup07/ifc/stream> cd ../../
/epic/sup07> ls -l
total 0
drwxr-sr-x    3 epicadm  cachegrp        256 May 11 09:24 dcifc
drwxr-sr-x    3 epicadm  cachegrp        256 May 11 09:24 ifc
drwxr-xr-x    2 root     system          256 Nov 14 10:44 lost+found
/epic/sup07> du -ak . | sort -nr | more

Yes, there is nothing here for us to see, but something may still be there, for example open files ….
We leave the file system so that our presence does not distort the picture, and run the lsof command against it.

 /epic/sup07> cd

/root> lsof /epic/sup07
In while loop:256
Value of I :61   np:256
COMMAND   PID  USER   FD   TYPE DEVICE   SIZE/OFF NODE NAME
cache 10289344 lyko cwd VDIR 37,39 256 2 /epic/sup07(/dev/sup07_lv)
ksh 11993330 epicadm cwd VDIR 37,39 256 2 /epic/sup07(/dev/sup07_lv)
dsmc 146 root 10 VREG 37,39 5360353284 129 /epic/sup07(/dev/sup07_lv)
ksh 17694934 lyko cwd VDIR 37,39 256 2 /epic/sup07(/dev/sup07_lv)

Yes, there is a huge file in /epic/sup07 created by the dsmc command. ls does not show it because its directory entry is gone, but its blocks stay allocated for as long as the process keeps it open. After further investigation, and after determining that no backup or restore is active, we kill the offending process. By the way, in the last output we shortened the PID of dsmc so that the output fits the screen. Here we go, killing the process.

/root> kill -9 14618974

Now, is the free space back or not?

/root> df -g /epic/sup07
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/sup07_lv      5.00      4.98    1%        8     1% /epic/sup07

Well, we got back what was/is rightfully ours :-)
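
The effect is easy to reproduce on a scratch directory. The sketch below is only an illustration, not what dsmc actually did: it assumes a POSIX shell with mktemp available, and the file name hidden.dat is made up. It opens a file on a descriptor, removes its name, and shows that the directory looks empty while the descriptor still works.

```shell
# Reproduce "empty but full" in a throwaway directory (hypothetical names).
scratch=$(mktemp -d)
exec 9>"$scratch/hidden.dat"      # open file descriptor 9 on a new file
printf 'some payload\n' >&9       # write through the descriptor
rm "$scratch/hidden.dat"          # remove the name; the data stays allocated
visible=$(ls -A "$scratch")       # the directory now lists nothing
printf 'still writable\n' >&9     # yet the open descriptor keeps working
write_status=$?
exec 9>&-                         # closing the last descriptor frees the blocks
rmdir "$scratch"
echo "visible entries: '${visible}', write status: ${write_status}"
```

Killing the process that holds the descriptor, as we did above, has the same effect as the final close: the space comes back only then.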

Posted in Real life AIX.



10 GB Ethernet adapters – the missing details

Today I was asked to verify the firmware level of the new 10 Gb Ethernet adapters we recently installed in our TSM servers. Without much thinking, I ran the following command:

lscfg -vl ent4
  ent4             U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)

Whoa, a lot of information is missing, and the firmware version is part of it.
To get the missing information, the last command has to be modified slightly:

lscfg -vpl ent4
  ent4             U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)

  PLATFORM SPECIFIC

  Name:  ethernet
    Node:  ethernet@0
    Device Type:  network
    Physical Location: U78A0.001.DNWHPY1-P1-C2-T1

Well, the required information is still missing. Let’s try something else:

lscfg -vp | grep -p "10 Gigabit"
  hba0             U78A0.001.DNWHPY1-P1-C2-T1                                    10 Gigabit Ethernet-SR PCI-Express Host Bus Adapter (2514300014108c03)

      10 Gigabit Ethernet-SR PCI Express Adapter:
        EC Level....................D76809
        FRU Number..................46K7897
        Part Number.................46K7897
        Manufacture ID..............1037
        Feature Code/Marketing ID...5769
        Serial Number...............YL11212300B0
        Network Address.............00145E9952AE
        ROM Level.(alterable).......RR0120
        Hardware Location Code......U78A0.001.DNWHPY1-P1-C2-T1

  ent4             U78A0.001.DNWHPY1-P1-C2-T1                                    10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)

Finally, we got it all!
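
If only the firmware string is wanted, the stanza can be filtered further. This is a small sketch: rom_level is a helper name I made up, and it assumes the dotted "label....value" layout shown above. The sample text is copied from the hba0 stanza so the snippet runs anywhere.

```shell
# Print the value of the "ROM Level" field from lscfg -vp style text on stdin.
# Label and value are separated by a run of two or more dots, so split on that.
rom_level() {
    awk -F'[.][.]+' '/ROM Level/ { print $NF }'
}

# Sample input copied from the hba0 stanza shown in the post.
sample='        EC Level....................D76809
        ROM Level.(alterable).......RR0120
        Hardware Location Code......U78A0.001.DNWHPY1-P1-C2-T1'

printf '%s\n' "$sample" | rom_level   # prints RR0120
```

On the AIX host itself the same helper could be fed from the real command, for example lscfg -vp | grep -p "10 Gigabit" | rom_level.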

Posted in AIX, Real life AIX.



command line editing pains cont’d ……

For example, if a previous command line had the following shape:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

how would you change all 5s into 6s so that we could re-execute it in the following way:

crfs -v jfs2 -d u60_lv -A yes -a log=INLINE -m /u60

Assuming that we are working in the ksh shell and the history mechanism has already been activated (set -o vi), we recall the first line by typing (for example) Esc/u50, so the command line reads:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

Next, we have to enter what I call the vi file edit mode by hitting the Esc-v key combination, which automatically opens the vi editor on a temporary file whose contents are the recalled entry:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

To change every 5 into 6, we execute

Esc:%s/5/6/g

followed by

Esc:wq!

and that is all.

Esc represents the Escape key on the keyboard.
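
To preview what such a global substitution does to the line without touching the history, the same replacement can be run through sed. This is only an illustration of the substitution itself, not a replacement for the vi edit mode described above.

```shell
# The recalled command line, kept as plain text (it is not executed here).
old_cmd='crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50'

# s/5/6/g replaces every 5 on the line, exactly like the global change in vi.
new_cmd=$(printf '%s\n' "$old_cmd" | sed 's/5/6/g')

printf '%s\n' "$new_cmd"
# prints: crfs -v jfs2 -d u60_lv -A yes -a log=INLINE -m /u60
```

Note that the substitution is blind: any other 5 on the line (a size, a port number) would be changed as well, so the result deserves a look before it is executed.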

Posted in Real life AIX.


AIX hosts access control with TDS LDAP

An LDAP-defined user is “visible” on every AIX host that runs the secldapclntd daemon, which is not always “good” news: often, in the name of security, the user population is confined to groups of selected hosts, primarily along application boundaries. So, for example, users of application “A” cannot log in to the hosts serving application “B”.

Tivoli Directory Server has two attributes that provide host access control based on two opposite principles. One is called hostsallowedlogin and its opposite is called hostsdeniedlogin. So, if the number of hosts a user is not allowed to log in to is larger than the number of hosts he/she can log in to, then the administrator may use hostsallowedlogin to set this user’s access policy. In the reverse situation the second attribute makes more sense.
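
The choice between the two attributes boils down to "maintain the shorter list". A minimal sketch of that rule follows; pick_attr, the user appuser and the hosts hosta and hostb are all hypothetical. The chuser -R LDAP form is my assumption of how the attribute would be set from an AIX client, so it is only printed here, not executed, and should be checked against the AIX documentation first.

```shell
# pick_attr <allowed_count> <denied_count>
# Print the attribute whose host list would be shorter to maintain.
pick_attr() {
    if [ "$1" -le "$2" ]; then
        echo hostsallowedlogin
    else
        echo hostsdeniedlogin
    fi
}

# Hypothetical user "appuser": may log in to 2 hosts, must stay off 40.
attr=$(pick_attr 2 40)

# Dry run only: print the command instead of running it (assumed syntax).
echo "chuser -R LDAP ${attr}=hosta,hostb appuser"
```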

Well, this is all nice and peachy, but these attributes are not “standard” AIX attributes; they are “additional” ones provided by Tivoli Directory Server, or TDS for short. In order to use them for AIX host authorization, the TDS administrator has to “enable” their usage: they have to be “TURNED ON”. How?

Posted in Real life AIX, ldap.



installing RedHat LINUX on Pseries…..

With a virtual CD created on a VIO server and loaded with version 6.2 of RedHat, I started the brand new chapter in my life called LINUX……
I wasted two days trying to install this version and currently have an open case with RedHat support. The error message is:

..................................................
Trying to unpack rootfs image as initramfs...
BUG: soft lockup - CPU#0 stuck for 67s! [swapper:1]
..................................................

The support engineer suggested trying the previous version, so I downloaded RH6.1 and booted the same partition (all resources virtualized). The boot failed as before, but this time with a new message:

RAMDISK: incomplete write (4318 != 32768)
write error
List of all partitions:
No filesystem could mount root, tried:  iso9660
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)

The call to IBM support proved almost a waste of time: we bought the media and support from RedHat, not IBM. A lesson learnt; from now on we need to buy both from IBM. Why almost? Well, the IBM engineer still helped. He mentioned this post on IBM developerWorks: https://www.ibm.com/developerworks/mydeveloperworks/blogs/powermeup/entry/rhel_kernel_image_to_large?lang=en.
Some of you may not have an account there and will not be able to log in to see the solution, so you should continue reading this post.

The partition was booted to the OK prompt by selecting the Open Firmware OK Prompt option from the Activate -> Partition menu. But after the partition boots in this mode you will not see the OK prompt, nope. What you will see is the 0 > prompt instead; well, not a big deal. I have not yet investigated what real-base means, but I understand that it is the key to a stress-free installation of RedHat 6.1. The value of this variable has to be set to 1000000.
Let’s see what it is now:

0 > printenv real-base
-------------- Partition: common -------- Signature: 0x70 ---------
real-base                2000000             2000000 

The change is required:

0 > setenv real-base 1000000
0 > printenv real-base
-------------- Partition: common -------- Signature: 0x70 ---------
real-base                1000000              2000000
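
A side note on the numbers, under the assumption that these values are hexadecimal byte addresses (Open Firmware normally works in hex): converted to MiB, the change looks like this.

```shell
# Assumption: real-base values are hexadecimal byte addresses.
old_mib=$(( 0x2000000 / 1048576 ))   # value before the change
new_mib=$(( 0x1000000 / 1048576 ))   # value after setenv
echo "real-base before: ${old_mib} MiB, after: ${new_mib} MiB"
# prints: real-base before: 32 MiB, after: 16 MiB
```

So, if the assumption holds, the setenv above lowers real-base from the 32 MiB mark to the 16 MiB mark.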

Now, we have to boot to SMS, set the boot device to the virtual CD and start the installation. It works like a charm, so what remains is to learn LINUX …. :-)

At this stage the OS is loaded, but except for the root account and the very bare necessities nothing else is configured. That has to be done by a trained LINUX administrator, so let’s learn!!!!

In a following post, I will explain how to actually download the RedHat ISO image, create a virtual CD (DVD), load it with the RedHat ISO image, and serve all of that to a partition for installation.

Posted in AIX, Linux, Real life AIX.



illegal root access

Suddenly, out of the blue, one particular environment became a source of issues for us, the loyal servants of our user community, a.k.a. yours truly, the system administrators.
First, from being OK the environment became sluggish, slow, unresponsive. IBM asked for snaps, which after verification proved that there was nothing wrong with the application server. Its memory, network, CPU and disk I/O did not show any stress; on the contrary, there was/is an abundance of resources. So the Oracle DBA in charge of the database associated with this application was asked to “look” into the database server, and he went looking all the way.

A day went by with nothing, but the following morning the same application owner declared that some cron jobs and some of his scripts had gone missing and that group memberships had been modified. Who dares to do these things? To add more importance, his statement included the following line: “It does not look good if/when I have to explain this to managerA or ManagerB”. Since it is entirely possible that a virgin mind may read this post, I will refrain from announcing here what exactly crossed my mind in response to this guy’s last statement.

So the pressure is on, every day in our mailboxes we find a new email with the same question – “have you found anything yet?”

Posted in Real life AIX.



tuning AIX for XIV ….

or how to get out of a bad situation…….. Life is, and always will be, life, and as such it is neither good nor bad. It is LIFE, and we have got to live it and have fun doing it. Having satisfied the philosophical side of my nature, I will proceed to the task at hand.

This post describes a procedure that allows changing SAN attributes such as num_cmd_elems (in the case of FC adapters) and queue_depth (for SAN disks) without rebooting the host.

Posted in AIX, Real life AIX.



timeout_policy the new PCM attribute

Straight from IBM AIX Support:

The new PCM attribute for default AIX PCM, called timeout_policy.

http://www-01.ibm.com/support/docview.wss?uid=isg1IZ96396

timeout_policy adjusts the behavior of the PCM (Path Control Module) related to command timeouts, and transport errors. Setting timeout_policy to either fail_path or disable_path may decrease performance degradation when a MPIO device encounters intermittent SAN fabric issues on some, but not all the paths to the device.

retry_path = First occurrence of command timeout on path will not cause immediate path failure. If a path that failed due to transport issues is recovered by a health check, then that path may be used immediately.

fail_path = Path will be failed on first occurrence of a command timeout (assuming it is not the last path in the path group). If a path that failed due to transport issues recovers, the path will not be used for read/write I/O until a period of time has expired with no failures on that path. Enabling this feature may add a delay before read/write I/O is routed to paths that have just recovered from a transport error.

disable_path = Path will be failed on first occurrence of a command timeout (assuming it is not the last path in the path group). If a path that failed due to transport issues recovers, the path will not be used for read/write I/O until a period of time has expired with no failures on that path. If this path continues to experience multiple command timeouts during a period of time, then it may be disabled. Disabled paths remain disabled (and not usable), until a user specifically runs the chpath command to enable the disabled path (or the affected disk is reconfigured or system rebooted). This option is not recommended for most users, since it may require manual intervention to recover paths. Refer to chpath and lspath man pages for more details regarding uses of these commands.
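
For illustration only, here is a small helper that validates a policy name against the three values listed above and prints the matching chdev command instead of running it. The function name timeout_policy_cmd and the disk hdisk4 are made up. The chdev -l ... -a attribute=value form is the usual AIX way of changing a device attribute, but whether a given disk accepts the change while it is open (or needs -P and a later reconfiguration) is something to verify on the host.

```shell
# timeout_policy_cmd <hdisk> <policy>
# Print (not run) the chdev command for a valid timeout_policy value.
timeout_policy_cmd() {
    case "$2" in
        retry_path|fail_path|disable_path)
            printf 'chdev -l %s -a timeout_policy=%s\n' "$1" "$2" ;;
        *)
            echo "unknown timeout_policy: $2" >&2
            return 1 ;;
    esac
}

timeout_policy_cmd hdisk4 fail_path
# prints: chdev -l hdisk4 -a timeout_policy=fail_path
```

The current setting of a disk can be inspected beforehand with lsattr -El hdisk4 -a timeout_policy, provided the installed AIX level already includes this attribute.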

Posted in Real life AIX.



removing ShadowImage from an AIX host

In the past, an HDS ShadowImage environment was installed and configured with its P-VOLs (the source disks) on one AIX host and its S-VOLs (the target disks) on a second AIX host.
For convenience, all VOLs were grouped into a consistency group (a named collection of P/S-VOL pairs). All control over this environment (HORCM) was set up on the first host.
After all pairs in the consistency group were created with the paircreate command, the data “inside” the S-VOLs of the consistency group is accessed by “splitting” the S-VOLs with the pairsplit command. Following the split, the recreatevg command, executed against the hdisks identified in horcm0.conf and horcm1.conf as the S-VOLs, recreates the volume group and its contents. If there is no more need for this data, this volume group is destroyed. With a new need for the data, the consistency group is re-synchronized, split and so forth, all over again……

One day, the need for ShadowImage disappeared completely. This post shows how to safely remove ShadowImage from a host with a running HORCM instance or instances, and how to return all VOLs to their original (SMPL) state, as they were before the paircreate command was executed for the very first time.

Posted in Real life AIX.



command line editing pains ….. .

Hi,

It has been bugging me for some time already.
Do you know how to do this: we need to change u50_lv into u60_lv and /u50 into /u60 in a “single step”, kind of like a global change in vi?

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

I know how to turn each 5 into 6, but this requires two steps and I want to do it with just one (is this possible at all?).
Please, someone show me how this is done.

How to do it as a command line edit….. Not as an edit of a line in a file…..

Thanks,

MarkD :-)

Posted in Real life AIX.




© 2008-2012 www.wmduszyk.com - best viewed with your eyes.