Sunday, August 28, 2011

WebLogic: NodeManager for Linux Machines

I recently needed to set up Node Manager on a Linux machine.  I was surprised that I wasn’t able to find the documentation that I needed in order to configure it properly, and gave up after a few minutes of searching for the right scripts and articles on the web. 

When I had more time I decided to take another look.  After about an hour of fiddling and searching I got it working as an xinetd service on my Linux machine.  However, I still couldn’t find a good online reference, so I decided to create one.

What is the Node Manager?

The Node Manager is a process that can be configured to run on machines where WebLogic is installed.  It serves the purpose of starting and stopping Managed Servers of WebLogic domains.  The goal of this process is to ease management of WebLogic domains by allowing you to remotely start and stop Managed Servers.  With the Node Manager you can log into the Admin Console for the domain and select the instances you want to start and stop:

[Screenshot: starting and stopping server instances from the Admin Console]

 

One of the great things about the Node Manager is that you can change the classpath and startup (Java) options for your managed servers from the Admin Console as well.  This can be useful when you are troubleshooting a node and adding debug flags, temporary server options, etc:

[Screenshot: classpath and Java startup options for a managed server in the Admin Console]

How does it work?

The NodeManager for WebLogic operates in one of two ways:

  • Java Node Manager – This blog post is focused on the Java Node Manager.  It runs as a JVM process on the machine(s) where you have WebLogic Managed Servers or Coherence Cache Servers that you want to control with Node Manager.  By default it listens on port 5556 and receives requests from the Admin Server to start and stop Managed Servers and Coherence Cache Servers.  It should be set up to run as a system service so that it is restarted whenever the machine is rebooted.
  • Script-Based Node Manager – The script-based Node Manager uses SSH and shell scripts to start server processes on machines.  It is not the focus of this article.
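
Since the Java Node Manager is a process listening on a TCP port, a simple connection test can show whether anything is answering there. The following is only a sketch, not a WebLogic tool: the function name nm_port_open is made up for illustration, and it relies on bash's /dev/tcp redirection, so it will not work in a plain POSIX sh.

```shell
#!/bin/bash
# Hypothetical helper (not part of WebLogic): succeeds if a TCP connection
# to host:port is accepted. The defaults assume the setup described in this
# post: Node Manager on localhost, default port 5556.
nm_port_open() {
  local host="${1:-localhost}" port="${2:-5556}"
  # bash-only: opening /dev/tcp/<host>/<port> attempts a TCP connect.
  # The subshell keeps file descriptor 3 from leaking into the caller.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}
```

For example: nm_port_open localhost 5556 && echo "listening" || echo "not listening". A successful connection only shows that something is listening on the port, not that it is Node Manager.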

 

Making it work on Linux

I had a challenge making the Node Manager work on Linux.  It required a fair amount of troubleshooting and manual steps to get it to work.  Here’s what I did:

1. Investigate the Windows scripts to install the Java Node Manager as a service:

I had gotten the Node Manager to work on Windows several times in the past and it was easy & straightforward.  In order to get it to work on Linux I decided to see how it worked on Windows.

The Windows installation of WebLogic includes scripts to install the Node Manager as a service.  You can find them here: <WL_HOME>/server/bin (C:\Oracle\Middleware\wlserver_10.3\server\bin).  I observed these key lines:

[Screenshot: key lines from the Windows Node Manager service install script]

Here you can see the listen address & port being set as well as the Java startup class for the Node Manager. 

2. This seemed simple enough, so I translated this to a Linux/Unix shell script:

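#!/bin/sh
# Start the Java Node Manager in the background.
# WL_HOME must be set before running this script. JAVA_HOME is read below
# before commEnv.sh is sourced, so it should be set in the environment too.
export NODEMGR_HOME=${WL_HOME}/common/nodemanager
export NODEMGR_OUT=${NODEMGR_HOME}/nodemanager.out
# Uncomment to bind Node Manager to a specific listen address
#export NODEMGR_HOST=localhost
export NODEMGR_PORT=5556
export MEM_ARGS="-Xms32m -Xmx200m"
export JAVA_OPTIONS="-Djava.security.policy=${WL_HOME}/server/lib/weblogic.policy -Dweblogic.nodemanager.javaHome=${JAVA_HOME}"
# Load the common WebLogic environment (WEBLOGIC_CLASSPATH and JAVA_VENDOR are used below)
. ${WL_HOME}/common/bin/commEnv.sh
export CLASSPATH=${WEBLOGIC_CLASSPATH}
# Note: JAVA_VM is set here but is not added to COMMAND_LINE below
if [ "${JAVA_VENDOR}" = "BEA" ] ; then
  export JAVA_VM="-jrockit"
fi
if [ "${JAVA_VENDOR}" = "Sun" ] ; then
  export JAVA_VM="-server"
fi
# Pass the listen address and port only when they are set
if [ "${NODEMGR_HOST}" != "" ] ; then
  JAVA_OPTIONS="${JAVA_OPTIONS} -DListenAddress=${NODEMGR_HOST}"
fi
if [ "${NODEMGR_PORT}" != "" ] ; then
  JAVA_OPTIONS="${JAVA_OPTIONS} -DListenPort=${NODEMGR_PORT}"
fi
COMMAND_LINE="${JAVA_HOME}/bin/java ${MEM_ARGS} -classpath ${CLASSPATH} ${JAVA_OPTIONS} weblogic.NodeManager"
echo "Starting Node Manager in directory=${NODEMGR_HOME} with command line=[${COMMAND_LINE}]"
# Node Manager looks for nodemanager.properties and nodemanager.domains in its working directory
cd "${NODEMGR_HOME}" || exit 1
nohup ${COMMAND_LINE} >${NODEMGR_OUT}   2>&1 &
echo "Node Manager Output written to ${NODEMGR_OUT}"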

I ran this and Node Manager started, but it did not use <WL_HOME>/common/nodemanager as its home directory, so it didn’t find nodemanager.domains or nodemanager.properties.  I added the ‘cd ${NODEMGR_HOME}’ line and then it worked fine.  At this point I was able to test that it was working by starting and stopping my servers (WebLogic and Coherence) from the Admin Console.  Success!  Only one problem, though: it wasn’t configured as a service.
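
A quick way to catch the wrong-directory problem before starting Node Manager is to check that the two files mentioned above are where you expect them. This is a minimal sketch; the function name nm_home_check is hypothetical, and a missing file should be read as a hint that the directory may be wrong rather than as a definite error.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of WebLogic): reports whether the
# two configuration files named above exist in the given directory.
# Returns 0 if both are present, 1 otherwise.
nm_home_check() {
  nm_dir="${1:-${NODEMGR_HOME:-.}}"
  nm_missing=0
  for nm_file in nodemanager.properties nodemanager.domains; do
    if [ -f "${nm_dir}/${nm_file}" ]; then
      echo "found:   ${nm_dir}/${nm_file}"
    else
      echo "missing: ${nm_dir}/${nm_file}"
      nm_missing=1
    fi
  done
  return "${nm_missing}"
}
```

Called without arguments it checks ${NODEMGR_HOME}, falling back to the current directory if that variable is not set.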


3. Configure Node Manager as an xinetd service


You may be thinking “but xinetd doesn’t use scripts like the one you created”, and you’re correct!  After looking through the documentation I found this example on setting up an xinetd service:



[Screenshot: example xinetd service configuration from the documentation]


Not knowing anything about xinetd (clearly), I set out to learn how to make this work.  Since it’s in the documentation, this must be the recommended way to do it, right?  It was here that I noticed the ‘NodeManagerHome’ Java option, which lets me run my command from anywhere and still have Node Manager find the right configuration files.  It also makes the nodemanager.log file wind up in the Node Manager home directory.
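
With that option, a start script like the one in step 2 would not need to change directory first. Here is a rough sketch of how the command line could be assembled, reusing the variable names from that script. The function name nm_command_line is hypothetical, and the option values are simply the ones used elsewhere in this post.

```shell
#!/bin/sh
# Sketch only: prints a Node Manager command line that passes the home
# directory as -DNodeManagerHome instead of relying on the current directory.
# Assumes WL_HOME, JAVA_HOME and CLASSPATH are set as in the script from step 2.
nm_command_line() {
  nm_home="${NODEMGR_HOME:-${WL_HOME}/common/nodemanager}"
  echo "${JAVA_HOME}/bin/java -Xms32m -Xmx200m -classpath ${CLASSPATH}" \
       "-DNodeManagerHome=${nm_home}" \
       "-DListenPort=${NODEMGR_PORT:-5556}" \
       "-Djava.security.policy=${WL_HOME}/server/lib/weblogic.policy" \
       "-Dweblogic.nodemanager.javaHome=${JAVA_HOME}" \
       "weblogic.NodeManager"
}
```

The output can be inspected first and then run the same way the earlier script runs COMMAND_LINE. Because the result is a single space-separated string, it assumes none of the paths contain spaces.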


I started out replacing the things that seemed obvious. Clearly ‘server’ is referring to the Java executable and I knew what the CLASSPATH was supposed to be based on my investigation with the script I created above. I didn’t know why I needed the LD_LIBRARY_PATH so I just took that out (mistake, and we’ll see why later).  I created a file with the following path: /etc/xinetd.d/nodemgrsvc.  I configured it as follows, but it didn’t work:


# default: off
# description:nodemanager as a service
service nodemgrsvc
{
  type = UNLISTED
  disable = no
  socket_type = stream
  protocol = tcp
  wait = yes
  user = oracle
  port = 5556
  flags = NOLIBWRAP
  log_on_success += DURATION HOST USERID
  server = /labs/wls1035/jrockit_160_24_D1.1.2-4/jre/bin/java
  env = CLASSPATH=/labs/wls1035/patch_wls1035/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/labs/wls1035/patch_ocp360/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/labs/wls1035/jrockit_160_24_D1.1.2-4/lib/tools.jar:/labs/wls1035/wlserver_10.3/server/lib/weblogic_sp.jar:/labs/wls1035/wlserver_10.3/server/lib/weblogic.jar:/labs/wls1035/modules/features/weblogic.server.modules_10.3.5.0.jar:/labs/wls1035/wlserver_10.3/server/lib/webservices.jar:/labs/wls1035/modules/org.apache.ant_1.7.1/lib/ant-all.jar:/labs/wls1035/modules/net.sf.antcontrib_1.1.0.0_1-0b2/lib/ant-contrib.jar
  server_args = -DNodeManagerHome=/labs/wls1035/wlserver_10.3/common/nodemanager/ -Xms32m -Xmx200m -DListenPort=5556 -Djava.security.policy=/labs/wls1035/wlserver_10.3/server/lib/weblogic.policy -Dweblogic.nodemanager.javaHome=/labs/wls1035/jrockit_160_24_D1.1.2-4 weblogic.NodeManager -v
}

In order to see what was going on here, I looked at the <NODEMGR_HOME>/nodemanager.log file and found this exception:


<Aug 21, 2011 10:21:39 AM> <SEVERE> <Fatal error in node manager server>
weblogic.nodemanager.common.ConfigException: Native version is enabled but nodemanager native library could not be loaded
        at weblogic.nodemanager.server.NMServerConfig.initProcessControl(NMServerConfig.java:249)
        at weblogic.nodemanager.server.NMServerConfig.<init>(NMServerConfig.java:190)
        at weblogic.nodemanager.server.NMServer.init(NMServer.java:182)
        at weblogic.nodemanager.server.NMServer.<init>(NMServer.java:148)
        at weblogic.nodemanager.server.NMServer.main(NMServer.java:375)
        at weblogic.NodeManager.main(NodeManager.java:31)
Caused by: java.lang.UnsatisfiedLinkError: no nodemanager in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1737)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1029)
        at weblogic.nodemanager.util.UnixProcessControl.<init>(UnixProcessControl.java:25)
        at weblogic.nodemanager.util.ProcessControlFactory.getProcessControl(ProcessControlFactory.java:22)
        at weblogic.nodemanager.server.NMServerConfig.initProcessControl(NMServerConfig.java:247)
        ... 5 more

Hmmm… perhaps the LD_LIBRARY_PATH has something to do with this?  Not being a Linux guy, I had to do some investigation.  I Googled “LD_LIBRARY_PATH weblogic node manager” and found this old link to configuring Node Manager for WebLogic 8.1: http://download.oracle.com/docs/cd/E13222_01/wls/docs81/adminguide/confignodemgr.html.  Looking through my installation folder I found the .so libs in /labs/wls1035/wlserver_10.3/server/native/linux/i686, so I added this to my nodemgrsvc for xinetd giving me the following (final) configuration:


# default: off
# description:nodemanager as a service
service nodemgrsvc
{
  type = UNLISTED
  disable = no
  socket_type = stream
  protocol = tcp
  wait = yes
  user = oracle
  port = 5556
  flags = NOLIBWRAP
  log_on_success += DURATION HOST USERID
  server = /labs/wls1035/jrockit_160_24_D1.1.2-4/jre/bin/java
  env = LD_LIBRARY_PATH=/labs/wls1035/wlserver_10.3/server/native/linux/i686 CLASSPATH=/labs/wls1035/patch_wls1035/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/labs/wls1035/patch_ocp360/profiles/default/sys_manifest_classpath/weblogic_patch.jar:/labs/wls1035/jrockit_160_24_D1.1.2-4/lib/tools.jar:/labs/wls1035/wlserver_10.3/server/lib/weblogic_sp.jar:/labs/wls1035/wlserver_10.3/server/lib/weblogic.jar:/labs/wls1035/modules/features/weblogic.server.modules_10.3.5.0.jar:/labs/wls1035/wlserver_10.3/server/lib/webservices.jar:/labs/wls1035/modules/org.apache.ant_1.7.1/lib/ant-all.jar:/labs/wls1035/modules/net.sf.antcontrib_1.1.0.0_1-0b2/lib/ant-contrib.jar
  server_args = -DNodeManagerHome=/labs/wls1035/wlserver_10.3/common/nodemanager/ -Xms32m -Xmx200m -DListenPort=5556 -Djava.security.policy=/labs/wls1035/wlserver_10.3/server/lib/weblogic.policy -Dweblogic.nodemanager.javaHome=/labs/wls1035/jrockit_160_24_D1.1.2-4 weblogic.NodeManager -v
}
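
Since the only difference between the failing and the working configuration is one variable in the env attribute, it is easy to miss. The snippet below is a small sketch for checking a service file for it. The function name xinetd_env_has is made up, and it is a plain text match on the env line, not a real xinetd parser.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the "env" attribute of an xinetd service
# file sets the given variable. Plain text check only.
xinetd_env_has() {
  # $1 = path to the service file, $2 = variable name, e.g. LD_LIBRARY_PATH
  grep -E '^[[:space:]]*env[[:space:]]*[+]?=' "$1" \
    | grep -Eq "(=|[[:space:]])[[:space:]]*$2="
}
```

For example: xinetd_env_has /etc/xinetd.d/nodemgrsvc LD_LIBRARY_PATH || echo "LD_LIBRARY_PATH is not set for the service". Remember that xinetd has to be restarted or told to reload before it picks up an edited service file.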

Next it was time to test this configuration, so I logged in and started one of my servers.  Unfortunately, watching the STDOUT of the process I found this:



<Aug 21, 2011 10:39:15 AM HKT> <Warning> <NodeManager> <BEA-300043> <Node manager native library not found - server process id not saved.>


Knowing now that I needed to add the native libraries into what becomes the java.library.path I went into the ‘Server Start’ configuration for the managed server and added “-Djava.library.path=/labs/wls1035/wlserver_10.3/server/native/linux/i686” to the startup options.  After restarting the Managed Server, this error was gone.  Success! 
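
I found the native directory by browsing the installation folder, but it can also be located with find. The sketch below assumes the library file is named libnodemanager.so, which is the name the JVM derives on Linux from the "no nodemanager in java.library.path" error shown earlier. The function name nm_native_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints each directory under the given root that
# contains libnodemanager.so. Defaults to ${WL_HOME}/server/native.
nm_native_dirs() {
  nm_root="${1:-${WL_HOME:-.}/server/native}"
  find "${nm_root}" -type f -name 'libnodemanager.so' -exec dirname {} \; 2>/dev/null | sort -u
}
```

If more than one directory is printed, pick the one that matches the platform and JVM you are running on. On my machine that was the linux/i686 directory.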


You can also set LD_LIBRARY_PATH in the setDomainEnv.sh script or in the user environment so that it takes effect for all servers you start in the domain, which is likely the preferred method.
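
As a rough sketch of what that could look like, the function below prepends a directory to LD_LIBRARY_PATH only if it is not already listed. The function name is hypothetical, and where exactly to put it (setDomainEnv.sh or a user profile script) is an assumption that depends on your setup.

```shell
#!/bin/sh
# Sketch: prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# The function name is hypothetical.
prepend_ld_library_path() {
  nm_dir="$1"
  case ":${LD_LIBRARY_PATH:-}:" in
    *":${nm_dir}:"*)
      ;;  # already present, nothing to do
    *)
      LD_LIBRARY_PATH="${nm_dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
      ;;
  esac
  export LD_LIBRARY_PATH
}
```

With the paths from this post, the call would be: prepend_ld_library_path /labs/wls1035/wlserver_10.3/server/native/linux/i686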


Hope this helps getting the Node Manager to work on Linux!