[kepler-dev] [Bug 5611] New: LinearModel actor doesn't handle input variables properly

bugzilla-daemon at ecoinformatics.org bugzilla-daemon at ecoinformatics.org
Tue May 22 14:50:39 PDT 2012


http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5611

             Bug #: 5611
           Summary: LinearModel actor doesn't handle input variables
                    properly
    Classification: Unclassified
           Product: Kepler
           Version: 2.3.0
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: actors
        AssignedTo: barseghian at nceas.ucsb.edu
        ReportedBy: regetz at nceas.ucsb.edu
         QAContact: kepler-dev at kepler-project.org


The R code in the RExpression-based LinearModel actor has several problems:
1. In a linear model, it's the _independent_ (predictor) variables that can be
either numeric or factor (categorical). The dependent (response) variable must
be numeric. The actor code gets this reversed, and fails with an error for
some inputs.
2. The conditional is.character() in the final 'if' statement always evaluates
to FALSE because of the conversion-to-factor in the first 'if' statement.
3. The intercept and slope aren't reported out in any useful way, and the
intercept isn't even printed to the console output (if displayed).
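
Problem 2 can be seen outside Kepler. The following is only a minimal sketch
in plain R: the variable name x and its values are made up for illustration
and are not taken from the actor's code.

```r
# Hypothetical character input, standing in for a value arriving on a port
x <- c("a", "b", "a")
was_character <- is.character(x)   # TRUE before any conversion

# The kind of conversion-to-factor done by an earlier 'if' statement
x <- factor(x)

# After conversion the vector is a factor, so is.character() is FALSE
still_character <- is.character(x)
now_factor <- is.factor(x)
print(c(was_character, still_character, now_factor))
```

Any later branch guarded by is.character() on the converted variable is
therefore never taken.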

I'd suggest the following replacement. It doesn't conform exactly to the
intended behavior of the original actor, but I think it provides a nice starter
template for doing a univariate linear model fit. I also changed it to emit the
fitted model object itself, which could be passed to another actor for
summarizing, generating an ANOVA table, extracting coefficient estimates, etc.

Note that I wouldn't usually explicitly store the model formula as an object,
but here I think it's useful to indicate the mapping of input ports to model
variables right at the top.

#--------------------------------------------------
# map input ports to model variables
model <- Dependent ~ Independent

# treat a character predictor as categorical
if (is.character(Independent)) {
    Independent <- factor(Independent)
}

# fit model; fitted lm object is available on output port
results <- lm(model)
print(summary(results))

# plot data, adding regression line if appropriate
plot(model)
if (is.numeric(Independent)) {
    abline(results, col="red")
}
#--------------------------------------------------
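
As a rough illustration of what a downstream actor might do with the emitted
model object, here is a sketch using standard R accessors. The data are
synthetic and the names est, intercept and slope are my own; none of this
is part of the actor itself.

```r
# Synthetic data standing in for the actor's input ports (assumption)
set.seed(1)
Independent <- 1:20
Dependent <- 2 + 0.5 * Independent + rnorm(20, sd = 0.1)

# Same fit as in the suggested actor code
results <- lm(Dependent ~ Independent)

# Coefficient estimates as a named numeric vector
est <- coef(results)
intercept <- est[["(Intercept)"]]
slope <- est[["Independent"]]

# ANOVA table for the fitted model
print(anova(results))
```

With a numeric predictor, intercept and slope could then be exposed on
their own output ports if that is wanted.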
