[kepler-dev] A very thorny question..
glen at glenjarvis.com
Wed Mar 28 15:18:06 PDT 2007
I imagine this will not be an easy question to answer. And, I may get
different opinions from different people on how to approach the
problem. But, before I ask it, let me first take a few minutes to
frame the discussion. These questions are technical and are related to
infrastructure. I used the nightly build kepler20070325 in this
discussion. However, I saw the same problems on Beta3.
As a new user of Kepler, I've been going through the "Getting Started
Guide." I am trying to build an environment for the Biologists I work with
(and also as a project that I will need to write a classroom paper on)
that will be so easy to use it can easily replace BioPerl. After
spending a semester's work evaluating Wildfire, Infosense, the Apple
Automator, and even the Lego Mindstorms programmable block
environment, I found Kepler. It's *exactly* the framework I needed.
My purpose is to build an environment that is clean to use, has few
confusing messages, and would not intimidate a Biologist with little
programming experience. From the papers I read here, I know many
Biologists use Kepler as is. The Biologists I work with, however, feel
intimidated by confusing messages and non-intuitive interfaces. They
want to get on with the Biology and not get bogged down by the tools.
While going through the "Getting Started Guide," I found it to be well
written and easy for me to use. I thought, "Gosh, this is almost not a
draft." I have made many notes about small things like parallelism,
missed words, etc. But, then, I discovered that the basic problems I
had from the beginning were all related to the same core situation.
The only "real" problem with the Getting Started Guide was that many
of the examples didn't work. I thought that would be fixed once Kepler
was out of Beta. I now no longer believe this to be the case. I
believe the core problem is related to the interoperability between
systems and would like to know if several of the examples in the
Getting Started Guide could be rewritten to avoid the situation I will
The first of two examples to show this case is the third demo, the
"Image Display" example. On my first pass, I was naive and just thought that
"nothing happened." On a more recent pass of the guide, I dug into
more technical details and saw that when I replaced the ImageJ actor
with the Browser Display actor, the following was printed to standard output:
> Reading from the browser - val = false
> Error invoking browser, cmd=netscape -remote
Now, this makes perfect sense since I don't have Netscape installed (I
use Firefox). But the more fundamental question is: do we want to
use an example that depends upon a browser that will vary from system
to system? This will inevitably fail on some systems no matter how
well the Getting Started Guide is written. Ultimately, it's a great
demonstration. But, should it be in the first document that is seen
by new users?
The second of my two examples revolves around section 7.1 Sample
Workflow 1 - Simple Statistics.
On the first run, once again nothing appeared to happen. On a second
pass with a more technical mindset, I did some troubleshooting and saw
the following displayed on standard output:
> Problem with creating process in RExpression!
> Error in _exec()
> 54 ms. Memory: 142636K Free: 54674K (38%)
The process couldn't be created because R was not installed on the
system. After installing R with default settings, the workflow now
runs. However, there is an additional message "File Error: Could not
open the file." It doesn't stop the demonstration from working, but it
adds confusion to the situation. I'm sure I could resolve this as
well. But, the same question comes to mind. In an introduction to the
software, do we want to use something that involves other programs
outside of our normal control? Would we, in the future, include R as
part of the install and therefore avoid this issue? How important is
it for us to use an R example? Can we give just one example (instead
of many) that uses R and boldly stresses that it may fail if
R is not installed? If a new user doesn't know what R is, or doesn't care
to use it, many of the examples will fail.
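
To make the idea concrete, here is a rough sketch of the kind of
up-front check I have in mind: look for the external programs an
example needs and print a plain-language message before anything runs.
This is only an illustration under my own assumptions. The class and
method names (DependencyCheck, isOnPath, missing) are hypothetical and
are not part of Kepler, and searching PATH for "R" or "firefox" is just
one possible way to detect them.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kepler: checks whether the external
// programs an example depends on can be found on the user's PATH.
public class DependencyCheck {

    // True if an executable file called `name` (or `name`.exe) exists
    // in one of the directories listed in `pathValue`.
    public static boolean isOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return false;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File plain = new File(dir, name);
            File exe = new File(dir, name + ".exe"); // Windows-style name
            if ((plain.isFile() && plain.canExecute())
                    || (exe.isFile() && exe.canExecute())) {
                return true;
            }
        }
        return false;
    }

    // Returns the subset of `names` that could not be found on the path.
    public static List<String> missing(String pathValue, String... names) {
        List<String> notFound = new ArrayList<String>();
        for (String name : names) {
            if (!isOnPath(name, pathValue)) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        // "R" and "firefox" are example program names only.
        for (String name : missing(path, "R", "firefox")) {
            System.err.println("This example needs '" + name
                    + "', which was not found on your PATH."
                    + " Please install it or skip this example.");
        }
    }
}
```

A message like this at the start of an example would at least tell a
new user why "nothing happened" instead of leaving the explanation
buried in standard out.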
In summary, these are the impressions from a new set of eyes. Kepler
is impressive as all heck and the framework I want to use for the
project that will probably take the next few years of my life. If I
learned nothing else in my second year of studying Bioinformatics, it
is that when software looks too confusing, no matter how good it is,
my Biologists tell me they shy away from it. I'd like to see the
software work so well that it becomes the de facto standard like
P.S. Kirsten, I still have about a zillion notes I made on reading the
guide (like parallelism, some omitted words, etc.). But, they seem so
insignificant compared to the big issues seen in this email.
glen at glenjarvis.com
"You must be the change you wish to see in the world." -M. Gandhi