
17th Feb 2017

UNIX Concepts: Zombies

UNIX: Zombies, and the Killing of Parents and Children

Zombies!
     Zombies(5)          UNIX System V (Concepts)           Zombies(5)

     NAME
          Defunct, zombie and immortal processes

     DESCRIPTION
          When a process dies, it becomes a zombie (almost dead)
          process whose only remaining purpose is to hold its death
          certificate (the exit status data returned by the wait
          family of system calls).  When the death certificate has
          been collected, the process is finally removed from
          existence and from the system's process table.  Zombie
          processes are marked as <defunct> in ps listings.
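
          As a rough illustration of the paragraph above, the
          following Python sketch forks a child that exits at once
          and looks at it before collecting its status.  It assumes
          a Linux style /proc file system; the names proc_state and
          make_zombie are hypothetical and are not part of the
          original text.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/status (Linux assumption)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("State:"):
                return line.split()[1]
    return None


def make_zombie(timeout=5.0):
    """Fork a child that exits at once; return (state_before_wait, exit_code)."""
    pid = os.fork()
    if pid == 0:
        os._exit(42)                 # child: die, leaving only an exit status
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)             # give the child time to finish exiting
        state = proc_state(pid)
    _, status = os.waitpid(pid, 0)   # collect the status; the zombie is removed
    return state, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(make_zombie())
```

          On such a system the state seen before the wait should be
          Z, and the exit code collected by waitpid should be 42.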

          If the parent of a child has not disowned the child and the
          parent dies before collecting the child's death certificate,
          the child is sent to the state orphanage.  As long as the
          parent is alive and the child was not disowned, when a child
          tries to die, the zombie child remains around until the
          parent finally collects its death certificate.  The state
          orphanage, process 1 a.k.a. /etc/init, is the second process
          created after the system is booted and has several principal
          functions: starting and in some cases maintaining the system
          daemons and waiting for its children to die.  It is given
          the job of waiting for the deaths of orphaned children as
          well.  This allows zombie children to be put to rest.
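
          The hand-over to the orphanage can be observed with a
          short Python sketch, offered under assumptions: a parent
          forks a child and dies without waiting, and the child
          reports the parent process ID it sees afterwards.  The
          name orphan_new_parent is hypothetical.

```python
import os
import time


def orphan_new_parent(timeout=5.0):
    """Return (pid of the dead parent, PPID the orphan sees afterwards)."""
    r, w = os.pipe()
    mid = os.fork()
    if mid == 0:                          # the parent that will die early
        os.close(r)
        mid_pid = os.getpid()
        if os.fork() == 0:                # the child that will be orphaned
            deadline = time.monotonic() + timeout
            while os.getppid() == mid_pid and time.monotonic() < deadline:
                time.sleep(0.01)          # wait until the parent has died
            os.write(w, f"{mid_pid} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)                       # die without waiting for the child
    os.close(w)
    os.waitpid(mid, 0)                    # collect the early-dying parent
    data = os.read(r, 64)                 # report written by the orphan
    os.close(r)
    old, new = map(int, data.split())
    return old, new


if __name__ == "__main__":
    print(orphan_new_parent())
```

          On a traditional system the second number is 1.  Some
          Linux systems instead hand orphans to a designated
          "subreaper" process, so the sketch only demonstrates that
          the parent process ID changes.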

          As an aside, when the system is booted, the boot loader
          copies the kernel into memory, creates a stack and calls the
          kernel's main procedure which, in turn, makes itself into
          process 0, forks itself and that child, process 1, executes
          /etc/init.  In parallel to /etc/init starting the system and
          the system daemons, process 0 may continue to fork and
          execute portions of the kernel as asynchronous processes.
          Process 1 and other processes with process 0 as their
          parent may be protected from being given a KILL signal.

          Processes waiting at very high priorities can not be killed
          because the signal is first posted to the process kernel
          control data of the process; but the remaining processing
          and possible jump to process termination only occurs at
          lower priorities, that is, below PZERO.  In fact, the final
          processing of a signal within a process occurs just as the
          process is being readied for return to user state.  If the
          system is a multiple CPU system, the signaling process and
          the signaled process are running on different CPUs, and the
          signaled process is running in user state, then the
          signaling CPU interrupts the signaled CPU so that the signal
          can be processed for the signaled process.  If the process
          execution does not reach this point of return to user state,
          then the process can not be signaled (in the case of the
          KILL signal, killed).

          A zombie process, since it is already almost dead, can not
          be killed further.
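
          A small Python sketch of "can not be killed further",
          under assumptions: the child exits with a known code, the
          parent sends KILL to the corpse, and the status collected
          afterwards is still the original exit code.  The name
          kill_a_zombie is hypothetical; the pipe is only there so
          the parent knows the child has begun to exit.

```python
import os
import signal


def kill_a_zombie(exit_code=7):
    """Send SIGKILL to an already-exited child; return (exited_normally, code)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(r)
        os._exit(exit_code)           # exiting closes the child's copy of w
    os.close(w)
    os.read(r, 1)                     # returns b"" (EOF) once the child exits
    os.close(r)
    os.kill(pid, signal.SIGKILL)      # accepted by the kernel, but changes nothing
    _, status = os.waitpid(pid, 0)    # the original exit status is still there
    return os.WIFEXITED(status), os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(kill_a_zombie())
```

          The status reports a normal exit with the child's own
          code rather than death by signal.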

          Slightly more technical presentation of the above material:

          If the parent of a forked (or sproced) child did not have
          SIGCHLD set to the ignore signal condition and the parent
          exits or is terminated by the system before the parent
          process has issued one of the wait system calls and
          retrieved the ending status of the child, the parentage of
          the child process is reassigned to process 1 (/etc/init in
          most cases).  As long as the parent is alive and SIGCHLD is
          not set to the ignore signal condition, the process struct
          of the terminated process is retained in the kernel so that
          the ending status of the child can be returned if and when
          a wait system call is issued for the child process.
          Process 1 (/etc/init), after it has initialized the non
          kernel functions of the operating system, loops on the wait
          system call.  When an orphan dies, process 1 receives and
          ignores its ending status; this releases the process struct
          of the terminated process.
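
          The effect of the ignore signal condition can be sketched
          in Python, assuming POSIX.1-2001 style behaviour as found
          on Linux: with SIGCHLD ignored, no ending status is
          retained for the child, so a later wait finds nothing to
          collect.  The name wait_with_sigchld_ignored is
          hypothetical.

```python
import os
import signal


def wait_with_sigchld_ignored():
    """Return True if wait finds no ending status for the child."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(3)               # child: exit; no zombie should remain
        try:
            os.waitpid(pid, 0)        # blocks until the child is gone
        except ChildProcessError:     # ECHILD: nothing was retained to collect
            return True
        return False
    finally:
        # Restore the earlier disposition (None means it was not set from Python).
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(wait_with_sigchld_ignored())
```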

          /etc/init is also looking for the death of its own children
          so that it can start other processes dependent on that
          termination or so that it can restart another copy of the
          process that just terminated.  For example, the historical
          login processing is:  init forks itself and execs getty, the
          getty program (in the child) waits for the communications
          port to open, getty emits the ``login: '' prompt, getty
          execs login on top of itself, login authenticates the user,
          initializes the user uid, gid, current and root directories,
          etc. and execs the user's login shell on top of itself.
          When the login process terminates, /etc/init receives its
          ending status (it is the parent) and it forks itself and
          execs getty, ....
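
          The fork, exec, wait and restart cycle described above
          can be sketched in Python.  This is a bounded toy, not
          init itself: the real loop never ends and consults its
          configuration, and the name respawn is hypothetical.

```python
import os
import sys


def respawn(argv, times):
    """Run argv `times` times in succession; return the list of exit codes."""
    codes = []
    for _ in range(times):
        pid = os.fork()                       # the parent forks itself ...
        if pid == 0:
            try:
                os.execvp(argv[0], argv)      # ... and the child execs the program
            finally:
                os._exit(127)                 # reached only if the exec failed
        _, status = os.waitpid(pid, 0)        # the parent receives the ending status
        if os.WIFEXITED(status):
            codes.append(os.WEXITSTATUS(status))
        else:
            codes.append(-os.WTERMSIG(status))  # negative value: killed by a signal
    return codes


if __name__ == "__main__":
    print(respawn([sys.executable, "-c", "pass"], 3))
```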

          No, there is not a ``fix'' for zombies.  That is the way it
          is designed
          to work.  It has been this way back at least as far as
          Release 3 UNIX (and Release 6 was the first version to
          officially escape Bell Labs).

          Immortal processes (except those specifically protected by
          the kernel, that is, those processes whose parent is 0) are
          caused by the process waiting for an event (usually I/O
          related) at a very high priority (typically described as
          waiting above PZERO).  Since such processes usually have
          critical system resources locked, breaking the lock in a
          manner that does not release those resources could become a
          major disaster.

          A zombie is immortal.  An immortal process is not
          necessarily a zombie.  A zombie or defunct process is the
          death certificate of a process that has already terminated.
          The only system resource being consumed by it is the process
          block used to store its termination status until the parent
          process asks for the exit status with a wait(2) family
          system call.

          When the parent finally dies, any surviving children,
          including the zombies, are reassigned to the system
          orphanage---process 1.  Process 1, /etc/init, is the system
          reaper of orphaned children as well as its own.  The other
          purposes of init are system start up and shutdown and the
          respawning (restarting, if you wish) of system services
          such as gettys.
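
          The reaping duty can be sketched as a wait loop in
          Python.  This is only an illustration of the idea, with
          hypothetical names reap_all and spawn_children; a real
          init never runs out of work and so never leaves the loop.

```python
import os


def spawn_children(codes):
    """Fork one child per exit code; each child exits immediately."""
    pids = []
    for code in codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)
        pids.append(pid)
    return pids


def reap_all():
    """Collect the status of every remaining child; return {pid: exit_code}."""
    reaped = {}
    while True:
        try:
            pid, status = os.wait()       # block until some child dies
        except ChildProcessError:         # no children left to wait for
            return reaped
        reaped[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None


if __name__ == "__main__":
    spawn_children([0, 1, 2])
    print(reap_all())
```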

          The fact that zombied orphans survive long enough for you to
          observe them is cause for concern about init's health.

          Defunct processes are zombie processes; these can be deleted
          by killing the parent program.  Use the PPID value to locate
          the parent; if the PPID is 1, then rebooting is the only
          solution.
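
          Locating the parent by PPID can be sketched in Python,
          assuming a Linux style /proc file system (ps reports the
          same fields).  The name list_zombies is hypothetical, and
          the sketch only lists candidates; it signals nothing.

```python
import os


def list_zombies():
    """Return [(pid, ppid, command)] for every zombie visible in /proc."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue                      # not a process directory
        try:
            with open(f"/proc/{entry}/stat", errors="replace") as f:
                stat = f.read()
        except OSError:                   # process vanished while scanning
            continue
        # Layout: pid (command) state ppid ...; the command may contain ")".
        end = stat.rindex(")")
        command = stat[stat.index("(") + 1:end]
        state, ppid = stat[end + 2:].split()[:2]
        if state == "Z":
            zombies.append((int(entry), int(ppid), command))
    return zombies


if __name__ == "__main__":
    for pid, ppid, command in list_zombies():
        print(f"zombie {pid} ({command}) has parent {ppid}")
```

          Each reported PPID identifies the process that has yet to
          collect the ending status.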

          There are immortal processes which derive from another
          source. For these, the only practical solution is rebooting.
          If a process, while in the kernel, locks system critical
          resources, then the process raises its processing priority
          above or at the PZERO level.  Such processes will not be
          interrupted by the kernel.  If the event for which the
          process is waiting will never occur, then the process
          becomes immortal.  For example, if a tape drive is unpowered
          during an I/O operation, then it will never send an I/O
          complete signal.  The tape drive is a system critical
          resource and therefore the process is waiting above or at
          PZERO.  For another example, in SGI IRIX, kernel mode NFS
          network communications appear to be handled at or above
          PZERO.  In each case, an immortal process results.  Other
          examples are possible.

     AUTHOR
          Randolph J. Herber.


     Page 3                                         (printed 12/13/97)

 


Steve Parker - Linux / DevOps Consultant