dkms issue with "ps -o lstart" under SLES9

Silacci, Lucas Lucas.Silacci at Teradata.Com
Mon Aug 20 19:32:58 CDT 2007


Hello,
 
I just ran into an issue during an rpm upgrade of my dkms driver package
and I was wondering if anybody else had seen this. During the upgrade, the
"dkms remove" in my package's "%preun" was actually carried out even though
it was called with "--rpm_safe_upgrade".
 
For background, I discovered this on SLES9 SP3 with dkms-2.0.13-1
installed (although it looks like the latest version would be
susceptible to the same issue).
 
I looked at the code and was able to get a debug trace of what happened.
Basically dkms is using the output from "ps -o lstart" to help determine
that a "dkms remove" command is coming from the same rpm process that
just did a "dkms add" command during an rpm upgrade. It puts the output
of the "ps -o lstart" into the lockfile and then references that
lockfile later to see if we are in an "rpm_safe_upgrade" situation.

However, it turns out that the output from that command is actually not
guaranteed to be identical for the same process from call to call under
SLES9. I'm not sure about other distros, but it's very obviously not
safe to use in this case. Here's an example of a random process that I
picked:

samoa2:~ # ps -o lstart 24135
                 STARTED
Thu Aug 16 10:13:20 2007
samoa2:~ # ps -o lstart 24135
                 STARTED
Thu Aug 16 10:13:19 2007
samoa2:~ # ps -o lstart 24135
                 STARTED
Thu Aug 16 10:13:19 2007
samoa2:~ # ps -o lstart 24135
                 STARTED
Thu Aug 16 10:13:19 2007
samoa2:~ # ps -o lstart 24135
                 STARTED
Thu Aug 16 10:13:20 2007
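
For anyone who wants to check another distro, something like the
following could be used to see whether the reported start time wobbles
for a given pid. This is only a rough sketch: the function names
count_distinct and sample_lstart are made up for illustration, and the
default of 10 samples and the one-second delay are arbitrary.

```shell
# Count the distinct lines read from stdin.
count_distinct() {
    sort -u | wc -l
}

# Print the "ps -o lstart" value for a pid several times, one second apart.
# $1 is the pid, $2 the number of samples (default 10, an arbitrary choice).
sample_lstart() {
    local pid="$1" samples="${2:-10}" i=0
    while [ "$i" -lt "$samples" ]; do
        ps -o lstart --no-headers -p "$pid" 2>/dev/null
        sleep 1
        i=$((i + 1))
    done
}

# Usage (a result greater than 1 means the start time is not stable):
#   sample_lstart 24135 | count_distinct
```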

So I was wondering if anyone else has seen this issue and whether
there are any plans to change dkms to address it.

Thanks,
-Lucas

Here are the dirty details...

There are two relevant pieces of code in dkms:
 
#1 (where the lock file gets created):

    # Do stuff for --rpm_safe_upgrade
    if [ -n "$rpm_safe_upgrade" ]; then
	local pppid=`sed -ne 's/PPid:[ \t]*//p' /proc/$PPID/status`
	local temp_dir_name=`mktemp $tmp_location/dkms_rpm_safe_upgrade_lock.$pppid.XXXXXX 2>/dev/null`
	echo "$module-$module_version" >> $temp_dir_name
	ps -o lstart --no-headers -p $pppid 2>/dev/null >> $temp_dir_name
    fi

#2 (where we jump out for a safe upgrade):

        # Do --rpm_safe_upgrade check (exit out and don't do remove if inter-release RPM upgrade scenario occurs)
	if [ -n "$rpm_safe_upgrade" ]; then
	    local pppid=`cat /proc/$PPID/status | grep PPid: | awk {'print $2'}`
	    local time_stamp=`ps -o lstart --no-headers -p $pppid 2>/dev/null`
	    for lock_file in `ls $tmp_location/dkms_rpm_safe_upgrade_lock.$pppid.* 2>/dev/null`; do
		lock_head=`head -n 1 $lock_file 2>/dev/null`
		lock_tail=`tail -n 1 $lock_file 2>/dev/null`
		if [ "$lock_head" == "$module-$module_version" ] && [ "$lock_tail" == "$time_stamp" ] && [ -n "$time_stamp" ]; then
		    echo $""
		    echo $"DKMS: Remove cancelled because --rpm_safe_upgrade scenario detected."
		    rm -f $lock_file
		    exit 0
		fi
	    done
	fi
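
One possible way around this, offered as an untested sketch rather than
a patch: on Linux the kernel records each process's start time in field
22 ("starttime") of /proc/<pid>/stat as a tick count since boot, and
that raw value does not change over the life of the process. Reading it
directly would avoid whatever conversion ps does to turn it into a
wall-clock string. The helper names stat_starttime and proc_start_ticks
below are hypothetical and are not part of dkms.

```shell
# Extract the starttime field (field 22) from one line of /proc/PID/stat.
# The command name in field 2 may contain spaces or parentheses, so strip
# everything up to the last ") " first. After that, the state (field 3)
# becomes $1 and starttime (field 22) becomes ${20}.
stat_starttime() {
    local rest="${1##*) }"
    set -- $rest
    echo "${20}"
}

# Print the start time of the given pid in ticks since boot.
# Returns non-zero if the process does not exist.
proc_start_ticks() {
    local line
    line=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    stat_starttime "$line"
}
```

The output of "proc_start_ticks $pppid" could then be written to the
lock file in #1 and compared in #2 in place of the "ps -o lstart"
output.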

You can see the problem with "ps -o lstart" in the following debug
output...

Debug output from a good run:

#1 (lstart is saved here):

+ '[' -n true ']'
++ sed -ne 's/PPid:[ \t]*//p' /proc/25854/status
+ local pppid=25852
++ mktemp /tmp/dkms_rpm_safe_upgrade_lock.25852.XXXXXX
+ local temp_dir_name=/tmp/dkms_rpm_safe_upgrade_lock.25852.R25991
+ echo e1000-7.5.5-1
+ ps -o lstart --no-headers -p 25852

#2 (lstart is compared here and everything is fine):

+ '[' -n true ']'
++ cat /proc/26683/status
++ grep PPid:
++ awk '{print $2}'
+ local pppid=25852
++ ps -o lstart --no-headers -p 25852
+ local 'time_stamp=Mon Aug 20 15:07:16 2007'
++ ls /tmp/dkms_rpm_safe_upgrade_lock.25852.R25991
++ head -n 1 /tmp/dkms_rpm_safe_upgrade_lock.25852.R25991
+ lock_head=e1000-7.5.5-1
++ tail -n 1 /tmp/dkms_rpm_safe_upgrade_lock.25852.R25991
+ lock_tail=Mon Aug 20 15:07:16 2007
+ '[' e1000-7.5.5-1 == e1000-7.5.5-1 ']'
+ '[' 'Mon Aug 20 15:07:16 2007' == 'Mon Aug 20 15:07:16 2007' ']'
+ '[' -n 'Mon Aug 20 15:07:16 2007' ']'
+ echo ''

+ echo 'DKMS: Remove cancelled because --rpm_safe_upgrade scenario detected.'
DKMS: Remove cancelled because --rpm_safe_upgrade scenario detected.
+ rm -f /tmp/dkms_rpm_safe_upgrade_lock.25852.R25991
+ exit 0

Debug output from a bad run:

#1 (grabs lstart here):

+ '[' -n true ']'
++ sed -ne 's/PPid:[ \t]*//p' /proc/28496/status
+ local pppid=28494
++ mktemp /tmp/dkms_rpm_safe_upgrade_lock.28494.XXXXXX
+ local temp_dir_name=/tmp/dkms_rpm_safe_upgrade_lock.28494.y28633
+ echo e1000-7.5.5-1
+ ps -o lstart --no-headers -p 28494

#2 (compares incorrectly here):

+ '[' -n true ']'
++ cat /proc/29326/status
++ grep PPid:
++ awk '{print $2}'
+ local pppid=28494
++ ps -o lstart --no-headers -p 28494
+ local 'time_stamp=Mon Aug 20 15:10:32 2007'
++ ls /tmp/dkms_rpm_safe_upgrade_lock.28494.y28633
++ head -n 1 /tmp/dkms_rpm_safe_upgrade_lock.28494.y28633
+ lock_head=e1000-7.5.5-1
++ tail -n 1 /tmp/dkms_rpm_safe_upgrade_lock.28494.y28633
+ lock_tail=Mon Aug 20 15:10:31 2007
+ '[' e1000-7.5.5-1 == e1000-7.5.5-1 ']'
+ '[' 'Mon Aug 20 15:10:31 2007' == 'Mon Aug 20 15:10:32 2007' ']'

Since the times don't match exactly, my dkms driver gets removed.
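
If the "ps -o lstart" output has to stay, another option would be to
compare the two timestamps with a small tolerance instead of as exact
strings. A minimal sketch, assuming GNU date (for "-d") is available;
the function name timestamps_close and the default tolerance of 2
seconds are chosen only for illustration and are not part of dkms.

```shell
# Succeed (return 0) if two date strings are within a tolerance of each
# other. $1 and $2 are the timestamps, $3 the tolerance in seconds
# (default 2, an arbitrary choice). Returns 2 if either timestamp is
# empty or cannot be parsed, and 1 if they are too far apart.
timestamps_close() {
    local a b diff tol="${3:-2}"
    [ -n "$1" ] && [ -n "$2" ] || return 2
    a=$(date -d "$1" +%s 2>/dev/null) || return 2
    b=$(date -d "$2" +%s 2>/dev/null) || return 2
    diff=$((a - b))
    # ${diff#-} drops a leading minus sign, giving the absolute value.
    [ "${diff#-}" -le "$tol" ]
}
```

With the values from the bad run above, "timestamps_close "$lock_tail"
"$time_stamp"" would succeed because the two strings are one second
apart. The trade-off is that the check becomes slightly weaker than an
exact match.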


