[kepler-dev] Q: SGE command execution on remote SGE cluster started on local PC fails due to environment problems
Hoeftberger, Johann
Johann_Hoeftberger at DFCI.HARVARD.EDU
Fri Feb 6 12:31:01 PST 2015
Hello,
I am trying to create my own simple Kepler test workflow on my local PC and
run it via SSH on an SGE cluster.
For that I took the Kepler demo workflow
"Job_Submission_Using_JobManager", configured it for my local situation,
and tried to run it.
I know that the connection to the cluster, the login with the given
credentials, and the working directory settings work: the directories and
files for the cluster jobs to be executed are created as expected.
The execution of the qsub command on the cluster does not work. At first I
got the exception "qsub: unknown command". To narrow down the error, I
hardcoded the full path of the qsub command on my SGE cluster into the
initialization of the private String _sgeSubmitCmd in JobSupportSGE.java.
Then I got the next exception: "Unable to initialize environment because
of error: Please set the environment variable SGE_ROOT".
SGE_ROOT is properly set on the SGE cluster. I additionally tried setting
it on my local PC (and restarted Eclipse, where my Kepler instance runs,
afterwards) and in Eclipse under Run - Run Configurations - Java
Application - Environment. None of these attempts solved the problem; I
still get the same exception about the uninitialized environment.
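To illustrate what I suspect is happening, here is a quick local demonstration (the /opt/sge path is only a placeholder for my cluster's installation): a variable exported in the parent environment or in rc files is invisible to a command started with a clean environment, unless it is passed to that command explicitly.

```shell
# A command started with an empty environment does not see SGE_ROOT,
# no matter what the parent shell or its rc files export:
env -i bash -c 'echo "SGE_ROOT=[$SGE_ROOT]"'
# prints: SGE_ROOT=[]

# Passing the variable explicitly on the command line does work:
env -i SGE_ROOT=/opt/sge bash -c 'echo "SGE_ROOT=[$SGE_ROOT]"'
# prints: SGE_ROOT=[/opt/sge]
```

This looks like exactly the situation of the remote qsub invocation: whatever SGE_ROOT is set to on the cluster, the command executed over the SSH channel does not inherit it.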
In SshExec.java I found the following documentation of public int
executeCmd(String command, OutputStream streamOut, OutputStream streamErr,
String thirdPartyTarget):
/**
 * Execute a command on the remote machine and expect a password/passphrase
 * question from the command. The stream <i>streamOut</i> should be provided
 * to get the output and errors merged. <i>streamErr</i> is not used in this
 * method (it will be empty string finally).
 *
 * @return exit code of command if execution succeeded,
 * @throws ExecTimeoutException
 *             if the command failed because of timeout
 * @throws SshException
 *             if an error occurs for the ssh connection during the command
 *             execution Note: in this method, the SSH Channel is forcing a
 *             pseudo-terminal allocation {see setPty(true)} to allow remote
 *             commands to read something from their stdin (i.e. from us
 *             here), thus, (1) remote environment is not set from
 *             .bashrc/.cshrc and (2) stdout and stderr come back merged in
 *             one stream.
 */
so I guess my problem is caused by this forced pseudo-terminal allocation.
It seems the environment variables set on the systems involved (SGE
cluster, local PC) are not read, and the Kepler implementation itself does
not set proper values for the needed variables.
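As a possible workaround (untested; the settings.sh path assumes a standard SGE installation with the "default" cell, and the SGE_ROOT value and job script name are placeholders), I could wrap the submit command so that it initializes the SGE environment itself, independent of any rc files:

```shell
# Hypothetical wrapper: initialize SGE explicitly before qsub, so the
# command would work even in a pty session that skips .bashrc/.cshrc.
# SGE_ROOT value and job script name are placeholders for my cluster.
SGE_ROOT=/opt/sge
submit="qsub myjob.sh"
wrapped="export SGE_ROOT=$SGE_ROOT && . $SGE_ROOT/default/common/settings.sh && $submit"
echo "$wrapped"
# prints: export SGE_ROOT=/opt/sge && . /opt/sge/default/common/settings.sh && qsub myjob.sh
```

But this again would mean patching the Kepler sources rather than configuring the workflow, so I would prefer a supported solution.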
I have not found any documentation or configuration option for this issue
yet, so I suspect some kind of implementation flaw.
Can somebody give me a hint on how to solve this issue? I would like to
run Kepler locally on my PC but execute parts of my workflow remotely on
my SGE cluster.
Kind regards,
Johann Hoeftberger