Alter RHEL3 low memory countermeasures?
dhubbard at dino.hostasaurus.com
Wed Jul 6 13:43:51 CDT 2005
Hi everyone, I have a number of RHEL 3 servers that
run a compute-intensive application that is spawned
on a per-request basis. When a request comes in, the
app consumes a limited amount of memory and a lot of
CPU doing the work, then exits. If the server gets
backed up with requests, no problem, they wait in
memory for their turn. However, if requests keep
arriving faster than the server can process them,
memory is eventually exhausted and we have a problem.
The problem is that the out-of-memory (OOM) killer in
Linux targets large memory consumers, but in this case
there are just hundreds of small memory users all
hammering the CPUs, making it impossible for the
server to ever recover with the standard OOM algorithm.
The server eventually exhausts memory completely
and crashes. Does anyone know whether the OOM killer
is tunable in any way? If I could tell it to always
start by killing processes with a given name, 99% of
the time the server would be able to dig itself out
of the problem and carry on.
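As a rough illustration of the kill-by-name idea done from user
space rather than inside the OOM killer, here is a minimal Python
sketch. It is not a tested tool: the function names, the MemFree
check, and the threshold are all hypothetical, and it assumes a
Python interpreter is available and that the script runs with
enough privilege to signal the target processes.

```python
import os
import signal


def list_processes(proc_root="/proc"):
    """Return (pid, command name) pairs read from /proc/<pid>/stat."""
    procs = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name sits between the first "(" and the last ")".
        # The kernel truncates it to 15 characters.
        name = stat[stat.find("(") + 1:stat.rfind(")")]
        procs.append((int(entry), name))
    return procs


def pick_victims(procs, target_name):
    """Return the sorted PIDs of every process named target_name."""
    return sorted(pid for pid, name in procs if name == target_name)


def parse_mem_free_kb(meminfo_text):
    """Extract the MemFree value (in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            return int(line.split()[1])
    return None


def kill_by_name(target_name):
    """Send SIGKILL to every process named target_name; return the count."""
    killed = 0
    for pid in pick_victims(list_processes(), target_name):
        try:
            os.kill(pid, signal.SIGKILL)
            killed += 1
        except OSError:
            pass  # already gone, or not permitted to signal it
    return killed


def relieve_pressure(target_name, min_free_kb):
    """Kill target_name processes if MemFree is below min_free_kb.

    min_free_kb is a hypothetical threshold; MemFree ignores page
    cache, so this is only a crude signal of memory pressure.
    """
    with open("/proc/meminfo") as f:
        free_kb = parse_mem_free_kb(f.read())
    if free_kb is not None and free_kb < min_free_kb:
        return kill_by_name(target_name)
    return 0
```

Something like relieve_pressure("badprogram", 20000) could be called
in a loop or from cron. Whether such a watchdog would still get
scheduled on a box that is already thrashing is an open question.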
If that's not possible, my next thought was to run
an instance of the tcpserver program from inittab
(tcpserver is a TCP listener that, on each connection,
invokes the program given to it as an argument), with
arguments that make it call /usr/bin/killall -9 badprogram
when accessed. I could have it listen on a firewalled,
unused low port and be niced to a negative value.
That way, if the server is overloaded and in the process
of dying, I could just "telnet badserver 12" from
somewhere else, and hopefully the high priority of this
process would allow it to sneak in, run, and fix
the problem without needing to power cycle the server.
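The listener idea can also be sketched without tcpserver. The
Python below is only an illustration under stated assumptions: the
function names are hypothetical, port 12 and the killall command
are taken from the description above, and raising priority with a
negative nice increment requires root. It stands in for tcpserver;
it does not reproduce tcpserver's own behavior or options.

```python
import os
import socket
import subprocess

# Command from the post; "badprogram" is the placeholder name used there.
KILL_COMMAND = ["/usr/bin/killall", "-9", "badprogram"]


def run_kill_command():
    """Run the kill command and return its exit status."""
    return subprocess.call(KILL_COMMAND)


def make_listener(host="0.0.0.0", port=12):
    """Create a listening TCP socket (binding a low port needs root)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(5)
    return srv


def serve(srv, action=run_kill_command, max_connections=None):
    """Run action once per incoming connection and report its status.

    max_connections=None means serve until the process is stopped.
    """
    handled = 0
    while max_connections is None or handled < max_connections:
        conn, _addr = srv.accept()
        try:
            status = action()
            conn.sendall(("exit status %s\n" % status).encode("ascii"))
        finally:
            conn.close()
        handled += 1


def main():
    try:
        os.nice(-10)  # negative increment raises priority; root only
    except OSError:
        pass  # not root: carry on at normal priority
    serve(make_listener())
```

Calling main() as root starts the listener. Anyone who can reach
the port can trigger the kill, so the firewall rule matters as much
as the listener itself.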