[kepler-dev] [Bug 3095] New: - add data mining actors (weka) and cheminformatics actors

bugzilla-daemon at ecoinformatics.org bugzilla-daemon at ecoinformatics.org
Wed Jan 23 08:46:59 PST 2008


http://bugzilla.ecoinformatics.org/show_bug.cgi?id=3095

           Summary: add data mining actors (weka) and cheminformatics actors
           Product: Kepler
           Version: 1.0.0beta3
          Platform: Other
        OS/Version: All
            Status: NEW
          Keywords: Kepler/CORE
          Severity: enhancement
          Priority: P2
         Component: actors
        AssignedTo: berkley at nceas.ucsb.edu
        ReportedBy: jones at nceas.ucsb.edu
         QAContact: kepler-dev at ecoinformatics.org


Josep Maria Campanera Alsina requested the incorporation of data mining actors
based on WEKA and cheminformatics actors based on CDK.  Email correspondence
between him and Tim McPhillips follows:

Tim McPhillips wrote:

I think this is an excellent idea. It would greatly increase the number of
data mining algorithms available under Kepler, and Kepler would benefit from
all current and future developments in Weka.

I am not a Weka specialist, but as far as I know, Weka uses standard interfaces
for most of its components and algorithms (e.g. one interface for all
classifiers, one for all filters, ...), so it should be possible to write some
fairly generic wrapper(s) to incorporate Weka functionality into Kepler.
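
As a rough illustration of the generic-wrapper idea, here is a minimal,
self-contained Java sketch. Every name in it (Learner, MajorityLearner,
NearestNeighbourLearner, LearnerActor, train, fire) is hypothetical and stands
in for the real thing: it uses neither Weka's classifier interface nor Kepler's
actor classes, and only shows that a single wrapper written against one shared
interface can host any algorithm that implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GenericWrapperSketch {

    /** Hypothetical shared interface; a stand-in, NOT Weka's actual API. */
    interface Learner {
        void build(List<double[]> features, List<String> labels);
        String classify(double[] features);
    }

    /** Toy algorithm 1: always predicts the most frequent training label. */
    static class MajorityLearner implements Learner {
        private String majority = null;

        public void build(List<double[]> features, List<String> labels) {
            Map<String, Integer> counts = new HashMap<>();
            int best = 0;
            for (String label : labels) {
                int count = counts.merge(label, 1, Integer::sum);
                if (count > best) {
                    best = count;
                    majority = label;
                }
            }
        }

        public String classify(double[] features) {
            return majority;
        }
    }

    /** Toy algorithm 2: predicts the label of the closest training point. */
    static class NearestNeighbourLearner implements Learner {
        private List<double[]> xs = new ArrayList<>();
        private List<String> ys = new ArrayList<>();

        public void build(List<double[]> features, List<String> labels) {
            xs = new ArrayList<>(features);
            ys = new ArrayList<>(labels);
        }

        public String classify(double[] query) {
            double bestDistance = Double.POSITIVE_INFINITY;
            String bestLabel = null;
            for (int i = 0; i < xs.size(); i++) {
                double distance = 0.0;
                for (int j = 0; j < query.length; j++) {
                    double diff = xs.get(i)[j] - query[j];
                    distance += diff * diff; // squared Euclidean distance
                }
                if (distance < bestDistance) {
                    bestDistance = distance;
                    bestLabel = ys.get(i);
                }
            }
            return bestLabel;
        }
    }

    /** Hypothetical actor-like wrapper: depends only on the Learner interface. */
    static class LearnerActor {
        private final Learner learner;

        LearnerActor(Learner learner) {
            this.learner = learner;
        }

        void train(List<double[]> features, List<String> labels) {
            learner.build(features, labels);
        }

        String fire(double[] input) {
            return learner.classify(input);
        }
    }

    public static void main(String[] args) {
        List<double[]> x = Arrays.asList(
                new double[] {0.0}, new double[] {0.1}, new double[] {5.0});
        List<String> y = Arrays.asList("low", "low", "high");
        List<Learner> learners = Arrays.asList(
                new MajorityLearner(), new NearestNeighbourLearner());
        // The same wrapper code runs unchanged for every algorithm.
        for (Learner learner : learners) {
            LearnerActor actor = new LearnerActor(learner);
            actor.train(x, y);
            System.out.println(learner.getClass().getSimpleName()
                    + " -> " + actor.fire(new double[] {4.5}));
        }
    }
}
```

In a real integration the wrapper would presumably be written against Weka's
own interfaces and expose its inputs and outputs as Kepler ports; how closely
that maps onto this sketch is an assumption that would need checking against
both code bases.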

There is another machine learning package, RapidMiner (formerly YALE;
http://rapid-i.com/, http://sourceforge.net/projects/yale), which extends Weka.
It might be useful to look into how they have incorporated Weka, or even to use
RapidMiner as a basis for incorporation into Kepler.

Kind regards!

Tim.



Josep Maria Campanera Alsina wrote:
> Hi all again,
> I'd like to know if there are any plans in the Kepler project related to two very useful Java open source tools:
>
> - WEKA, http://www.cs.waikato.ac.nz/ml/weka/ . The best-known and most popular Java library for data mining.
> - CDK, the Chemistry Development Kit, http://cdk.sourceforge.net . A Java library for structural chemo- and bioinformatics.
>
> In other words,
> (1) Are there any plans to integrate them into the Kepler core in the near future?
> (2) Are there any Kepler workflows available that already use these tools?
> (3) What would the strategy be for integrating them into the platform? Since they are written in Java, is there an "easy" standard procedure to implement or embed them in Kepler?
>
> Hopefully, we will see these useful tools embedded in Kepler soon!


