

an empty file system with no free capacity ….?

It sounds like an oxymoron, doesn’t it? I saw it once many years ago. Today, a colleague of mine noticed it on one of his machines.
Look below: /epic/sup07 has very little free space left.

/epic/sup07/ifc/stream> df -g .
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/sup07_lv      5.00      0.01  100%        9     1% /epic/sup07

On closer inspection, we find only a couple of sub-directories, all of them empty.

/epic/sup07/ifc/stream> cd ../../
/epic/sup07> ls -l
total 0
drwxr-sr-x    3 epicadm  cachegrp        256 May 11 09:24 dcifc
drwxr-sr-x    3 epicadm  cachegrp        256 May 11 09:24 ifc
drwxr-xr-x    2 root     system          256 Nov 14 10:44 lost+found
/epic/sup07> du -ak . | sort -nr | more

Yes, there is nothing here for us to see, but there may still be something there, for example files that are open but no longer have a name in any directory.
We leave the file system, so that our own presence does not distort the picture, and run the lsof command against it.

 /epic/sup07> cd

/root> lsof /epic/sup07
In while loop:256
Value of I :61   np:256
COMMAND      PID    USER  FD TYPE DEVICE   SIZE/OFF NODE NAME
cache   10289344    lyko cwd VDIR  37,39        256    2 /epic/sup07(/dev/sup07_lv)
ksh     11993330 epicadm cwd VDIR  37,39        256    2 /epic/sup07(/dev/sup07_lv)
dsmc         146    root  10 VREG  37,39 5360353284  129 /epic/sup07(/dev/sup07_lv)
ksh     17694934    lyko cwd VDIR  37,39        256    2 /epic/sup07(/dev/sup07_lv)

Yes, there is a huge file in /epic/sup07, held open by a dsmc process (the TSM backup client). After further investigation, having determined that no backup or restore is active, we kill the offending process. By the way, in the lsof output above the PID of dsmc was shortened so that the line fits the screen; the full PID is used below. Here we go, killing the process.

/root> kill -9 14618974

Now, is the free space back or not?

/root> df -g /epic/sup07
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/sup07_lv      5.00      4.98    1%        8     1% /epic/sup07

Well, we got back what was/is rightfully ours 🙂
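
The effect seen above, space in use with nothing visible in the directory, is typical of a file that was removed while a process still had it open. Here is a minimal sketch that reproduces it in a scratch directory. The function, directory and file names are made up for illustration, and it assumes a POSIX shell with mktemp available.

```shell
#!/bin/sh
# Reproduce "used space without a visible file" in a scratch directory.
# All names below are hypothetical.
demo_open_deleted() {
    dir=$(mktemp -d) || return 1
    # Open file descriptor 3 for writing on a new file, then remove its name.
    exec 3> "$dir/big.tmp"
    rm "$dir/big.tmp"
    # The directory now looks empty ...
    visible=$(ls -A "$dir" | wc -l | tr -d ' ')
    # ... yet the descriptor is still usable, so the file still exists on disk.
    if echo "still writable" >&3; then
        writable=yes
    else
        writable=no
    fi
    # Closing the descriptor (or killing the owning process) releases the file.
    exec 3>&-
    rmdir "$dir"
    echo "visible_entries=$visible writable=$writable"
}

demo_open_deleted
```

The function reports how many entries the directory shows after the rm and whether the removed file could still be written to. In the real case above, dsmc played the role of file descriptor 3.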

Posted in Real life AIX.



10 GB Ethernet adapters – the missing details

Today, I was asked to verify the firmware level of the new 10 Gb Ethernet adapters we recently installed in our TSM servers. Without much thinking, I executed the following command:

lscfg -vl ent4
  ent4             U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)

Whoa, a lot of information is missing, the firmware version included.
To get the missing information, the last command has to be modified slightly:

lscfg -vpl ent4
  ent4             U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)


  PLATFORM SPECIFIC

  Name:  ethernet
    Node:  ethernet@0
    Device Type:  network
    Physical Location: U78A0.001.DNWHPY1-P1-C2-T1

Well, the required information is still missing. Let’s try something else:

lscfg -vp | grep -p "10 Gigabit"
  hba0             U78A0.001.DNWHPY1-P1-C2-T1                                    10 Gigabit Ethernet-SR PCI-Express Host Bus Adapter (2514300014108c03)

      10 Gigabit Ethernet-SR PCI Express Adapter:
        EC Level....................D76809
        FRU Number..................46K7897
        Part Number.................46K7897
        Manufacture ID..............1037
        Feature Code/Marketing ID...5769
        Serial Number...............YL11212300B0
        Network Address.............00145E9952AE
        ROM Level.(alterable).......RR0120
        Hardware Location Code......U78A0.001.DNWHPY1-P1-C2-T1

  ent4             U78A0.001.DNWHPY1-P1-C2-T1                                    10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
        Displayable Message.........10 Gigabit Ethernet Adapter (ct3)

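Finally, we got it all! The firmware level is the ROM Level field (RR0120), and it is reported under the parent device hba0, not under ent4.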
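
The -p (paragraph) option of grep is specific to AIX. Where it is not available, awk's paragraph mode gives a similar result. Below is a small sketch that uses a shortened copy of the output above as sample input. The function names are made up for illustration, and the field extraction assumes that the value itself contains no dots.

```shell
#!/bin/sh
# Print every blank-line-separated paragraph that matches a pattern,
# similar in spirit to AIX "grep -p". RS="" puts awk into paragraph mode.
para_grep() {
    awk -v pat="$1" 'BEGIN { RS = ""; ORS = "\n\n" } $0 ~ pat'
}

# Print the value of the "ROM Level" field: split the line on runs of
# dots and keep the last piece (assumes the value has no dots in it).
rom_level() {
    awk '/ROM Level/ { n = split($0, f, /\.+/); print f[n] }'
}

# Shortened sample of lscfg -vp output, taken from the listing above.
sample='  hba0   U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet-SR PCI-Express Host Bus Adapter

      10 Gigabit Ethernet-SR PCI Express Adapter:
        EC Level....................D76809
        ROM Level.(alterable).......RR0120

  ent4   U78A0.001.DNWHPY1-P1-C2-T1  10 Gigabit Ethernet Adapter (ct3)

        Network Address.............00145E9952AE
'

printf '%s' "$sample" | para_grep "10 Gigabit" | rom_level
```

On a live system the sample would be replaced by the real command, as in lscfg -vp | para_grep "10 Gigabit" | rom_level.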

Posted in AIX, Real life AIX.



command line editing pains cont’d ……

For example, if a previous command line had the following shape:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

how would you change all 5’s into 6’s so we could re-execute it in the following way:

crfs -v jfs2 -d u60_lv -A yes -a log=INLINE -m /u60

Assuming that we are working in the ksh shell and vi-style command line editing has already been activated (set -o vi), we recall the first line by typing (for example) Esc /u50 and pressing Enter, so the command line reads:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

Next, we have to enter what I call the vi file edit mode by pressing Esc and then v, which opens the vi editor on a temporary file whose content is the recalled entry:

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

To change every 5 into a 6, we execute

Esc:%s/5/6/g

followed by

Esc:wq!

and that is all. When the editor exits, ksh executes the edited line.

Esc represents the Escape key on the keyboard.
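
The same global substitution can be previewed outside the editor with sed, which makes it easy to check the result before anything is run. A small sketch follows; the variable names are arbitrary and the command is only printed, not executed.

```shell
#!/bin/sh
# Preview the edited command line; nothing is executed here.
old_cmd='crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50'
# Same idea as the vi substitute command: replace every 5 with a 6.
new_cmd=$(printf '%s\n' "$old_cmd" | sed 's/5/6/g')
printf '%s\n' "$new_cmd"
```

Note that every 5 on the line is changed. If the command contained other 5's, a narrower pattern such as s/u50/u60/g would be the safer choice.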

Posted in Real life AIX.


AIX hosts access control with TDS LDAP

An LDAP-defined user is “visible” on every AIX host that runs the secldapclntd daemon. This is not always “good” news, because often, in the name of security, the user population is confined to groups of selected hosts, primarily along application boundaries. So, for example, users of application “A” cannot log in to the hosts serving application “B”.

Tivoli Directory Server has two attributes that provide host access control based on two opposite principles. One is called hostsallowedlogin and its opposite is called hostsdeniedlogin. So, if the number of hosts a user is not allowed to log in to is larger than the number of hosts he/she may log in to, the administrator can use hostsallowedlogin to set this user’s access policy. In the reverse situation the second attribute makes more sense.

Well, this is all nice and peachy, but these are not “standard” AIX attributes; they are “additional” ones, there by virtue of the Tivoli Directory Server, or TDS for short. In order to use them for AIX host authorization, the TDS administrator has to “enable” their usage: they have to be “turned on”. How?

Posted in ldap, Real life AIX.



illegal root access

Suddenly, out of the blue, one particular environment became a source of issues for us, the loyal servants of our user community, aka yours truly, the system administrators.
First, from being OK, the environment became sluggish, slow, unresponsive. IBM asked for snaps, which after verification proved that there was nothing wrong with the application server. Its memory, network, CPU and disk I/O did not show any stress; on the contrary, there was (and is) an abundance of resources. So the Oracle DBA in charge of the database associated with this application was asked to “look” into the database server, and he went looking all the way.

A day went by with nothing, but the following morning the owner of this application declared that some cron jobs and some of his scripts had gone missing and that group memberships had been modified. Who dared to do these things? To add more weight, his statement included the following line: “It does not look good if/when I have to explain this to managerA or ManagerB”. Since it is entirely possible that a virgin mind may read this post, I will refrain from announcing here what exactly crossed my mind in response to this guy’s last statement.

So the pressure is on; every day we find a new email in our mailboxes with the same question: “have you found anything yet?”

Posted in Real life AIX.



tuning AIX for XIV ….

or how to get out of a bad situation… Life is, and always will be, life, and as such it is neither good nor bad. It is LIFE, and we have got to live it and have fun doing it. Having satisfied the philosophical side of my nature, I will proceed to the task at hand.

This post describes a procedure that allows changing SAN attributes such as num_cmd_elems (in the case of FC adapters) and queue_depth (for SAN disks) without rebooting the host.

Posted in AIX, Real life AIX.



timeout_policy the new PCM attribute

Straight from IBM AIX Support:

The new PCM attribute for default AIX PCM, called timeout_policy.

http://www-01.ibm.com/support/docview.wss?uid=isg1IZ96396

timeout_policy adjusts the behavior of the PCM (Path Control Module) related to command timeouts, and transport errors. Setting timeout_policy to either fail_path or disable_path may decrease performance degradation when a MPIO device encounters intermittent SAN fabric issues on some, but not all the paths to the device.

retry_path = First occurrence of command timeout on path will not cause immediate path failure. If a path that failed due to transport issues is recovered by a health check, then that path may be used immediately.

fail_path = Path will be failed on first occurrence of a command timeout (assuming it is not the last path in the path group). If a path that failed due to transport issues recovers, the path will not be used for read/write I/O until a period of time has expired with no failures on that path. Enabling this feature may add a delay before read/write I/O is routed to paths that have just recovered from a transport error.

disable_path = Path will be failed on first occurrence of a command timeout (assuming it is not the last path in the path group). If a path that failed due to transport issues recovers, the path will not be used for read/write I/O until a period of time has expired with no failures on that path. If this path continues to experience multiple command timeouts during a period of time, then it may be disabled. Disabled paths remain disabled (and not usable), until a user specifically runs the chpath command to enable the disabled path (or the affected disk is reconfigured or system rebooted). This option is not recommended for most users, since it may require manual intervention to recover paths. Refer to chpath and lspath man pages for more details regarding uses of these commands.
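
The attribute is inspected and changed with the usual lsattr and chdev commands. The sketch below only builds and prints the chdev command, after checking the policy name against the three values listed above. The function name and the disk name hdisk4 are made up for illustration, and whether the change can be applied to a disk that is in use depends on the AIX level.

```shell
#!/bin/sh
# Print (do not run) the chdev command that sets timeout_policy on one disk.
# hdisk4 and the function name are hypothetical examples.
timeout_policy_cmd() {
    disk=$1
    policy=$2
    case $policy in
        retry_path|fail_path|disable_path)
            echo "chdev -l $disk -a timeout_policy=$policy"
            ;;
        *)
            echo "unknown timeout_policy value: $policy" >&2
            return 1
            ;;
    esac
}

# To see the current value first (AIX only): lsattr -El hdisk4 -a timeout_policy
timeout_policy_cmd hdisk4 fail_path
```

Printing the command instead of running it leaves room to review it, or to collect the commands for many disks into a script, before anything is changed.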

Posted in Real life AIX.



removing ShadowImage from an AIX host

In the past, an HDS ShadowImage environment was installed and configured with its P-VOLs (the source disks) on one AIX host and its S-VOLs (the target disks) on a second AIX host.
For convenience, all VOLs were grouped into a consistency group (a named collection of P-VOL/S-VOL pairs). All control over this environment (HORCM) was set up on the first host.
After all pairs in the consistency group have been created with the paircreate command, the data “inside” the S-VOLs of the consistency group is made accessible by “splitting” the pairs with the pairsplit command. Following the split, the recreatevg command, executed against the hdisks identified in horcm0.conf and horcm1.conf as the S-VOLs, recreates the volume group and its contents. When there is no more need for this data, the volume group is destroyed. When the data is needed again, the consistency group is re-synchronized, split, and so forth, all over again.

One day, the need for ShadowImage disappeared completely. This post shows how to safely remove ShadowImage from a host with a running HORCM instance (or instances) and how to return all VOLs to their original (SMPL) state, as they were before the paircreate command was executed for the very first time.

Posted in Real life AIX.



command line editing pains ….. .

Hi,

this has been bugging me for some time already.
Do you know how to do this: we need to change u50_lv into u60_lv and /u50 into /u60 in a “single step”, kind of like a global change in vi?

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

I know how to turn each 5 into a 6, but this requires two steps and I want to do it in just one (is this possible at all?).
Could someone please show me how this is done?

How to do it as a command line edit, not as an edit of a line in a file.

Thanks,

MarkD 🙂

Posted in Real life AIX.


to remove a file with special characters in its name

-rw-r-----  1 root   system 1582339  Apr 05 20:24 smit.log
-rw-------  1 root   system  14751   Apr 06 07:36 .lsof_lawaptpu001
-rw-r-----  1 root   system      0   Apr 06 08:02 *
-rw-r-----  1 root   system 136792   Apr 10 05:25 dsmerror.log
-rw-r-----  1 root   system 37641208 Apr 10 05:25 dsmsched.log

Looking at the output above, you may find a surprise and if you have not dealt with such surprises before you may already be thinking how to remove the “offending” file. Executing rm * will quickly decrease the number of files in this location and most likely it is not what one intends to do. We have a few options.

One could run the rm command with the file name quoted, which suppresses the special meaning of the character or characters in it.

# rm '*'

Above, the quotes tell the shell not to “expand” the * into the names of all files in the directory (all those that do not start with a dot), but to pass it to rm as a plain literal character, so that only the one file whose name is a single * is removed.

What to do if one finds a file whose name also contains spaces, something like “* ?”? In that case it is often difficult to establish the number of spaces in the file name. In this situation, you can use the ls -i command to find the inode number associated with the file.

# ls -i
   52 *
   36 *    ? 
   11.java
   40 .lsof_lawaptpu001
   47 .sh_history
   64 .ssh
   31 .toc
16384 .topasrecrc
    6 .vi_history

Next, the find command is instructed to locate the file by the inode number identified earlier and to hand its name to the rm command for prompt removal.

# find . -inum 36 -exec rm '{}' \;

If, instead of removing the file, you are interested in renaming it, you could proceed as follows:

# find . -inum 36 -exec mv '{}' new_file_name \;

where the new_file_name is the new name.
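
The whole sequence can be tried safely in a scratch directory. The sketch below creates two awkwardly named files, looks up the inode of one of them with ls -i, and removes it with find. The directory, the file names and the function name are invented for the example, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Remove a file by inode number inside a throwaway directory.
remove_by_inode_demo() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        : > '*'            # a file literally named *
        : > '*    ? '      # a name with a star, spaces and a question mark
        : > keep.log       # an ordinary file that must survive
        # ls -i prints "inode name"; keep only the inode number.
        inum=$(ls -i '*    ? ' | awk '{ print $1 }')
        # Let find match on the inode and pass the exact name to rm.
        find . -inum "$inum" -exec rm -- '{}' \;
        if [ -e './*    ? ' ]; then
            echo "target still present"
        else
            echo "target removed"
        fi
        echo "files left: $(ls -A | wc -l | tr -d ' ')"
    )
    rm -rf "$dir"
}

remove_by_inode_demo
```

On real data it is prudent to run the find with -print first, or to add -xdev, since inode numbers are unique only within a single file system.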

Posted in Real life AIX.





Copyright © 2016 - 2017 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.