[seek-kr-sms] Re: Taxon/KR integration prototype proposal

Deana Pennington dpennington at lternet.edu
Wed Apr 28 10:30:14 PDT 2004


Good comments...  actually, this is a discussion about inductive vs 
deductive methods in science, and you are right, both have a place.  I 
cannot generate a hypothesis without looking at information first, and 
applying inductive reasoning.  The best hypotheses come from merging 
what is already known with new observations.  The trick is to understand 
how the new observations (perhaps from data mining) link with what is 
already known (formal ontologies).  I think ontological hypothesis work 
could include both 1) specifying a hypothesis in terms of existing 
ontologies in order to search for relevant data and analyses, and 2) 
IT-enabled hypothesis generation from a combination of ontologies and 
data mining.  Having worked on a data mining project with Tony Fountain, 
I can say that the tools that are out there are not adequate for the 
kinds of inductive reasoning that we need to do.  Most of them are 
statistical classification algorithms, or event detection algorithms.  
The spatiotemporal pattern searching that goes on conceptually in the 
mind of a scientist is not captured very well by these methods.


Nico M. Franz wrote:

> Hi there:
>    If I get what you're talking about it's both interesting and 
> important. Not being an ecologist proper, I've asked myself sometimes 
> if the mere access to methods and loads of information will make 
> someone switch their PhD thesis topic, or aim at a different thing in 
> their next research proposal. But making a strong judgment either way 
> (ecologists won't do it; they'll go for it and drop everything 
> else...) is probably risky in itself.
>    Other than that, I think that a data-driven approach may not always 
> be optimal for an "NSF-obsessed" person (sometimes the equation goes: 
> NSF grant = tenure). But it may be more attractive to someone more 
> curious and secure. Research grants and papers are written in a 
> certain way that doesn't always have to reflect the sequence of steps 
> that actually led there. Sometimes it's the reverse: an "anomaly" came 
> up and restructured the whole approach. The way it has developed, NSF 
> seems sometimes unable to directly fund the most exploratory, 
> experimental, undirected research - unless the promise is already 
> perceived as being huge.
>    I cannot at all speak for ecology, but in systematics, "NSF" (as if 
> that were an entity removed from actual scientists) has sometimes been 
> swept into funding research approaches that sounded really neat - in 
> terms of hypothesis testing - but were ultimately not as fruitful as 
> others may have been.
>    My current conclusion is that SCIENTISTS should shape what NSF 
> perceives as relevant. That sometimes takes the courage of individuals 
> to argue strongly for what they're doing if it's a little off 
> mainstream. If such individual ecologists can be attracted to the 
> data-driven side, that may be an important sign.
>    It really ought to be a cyclic interaction between observations and 
> theories driving research, and maybe so far it's just been much easier 
> having access to theories.
> Cheers,
> Nico
> Nico M. Franz
> National Center for Ecological Analysis and Synthesis
> 735 State Street, Suite 300
> Santa Barbara, CA 93101
> Phone: (805) 966-1677; Fax: (805) 892-2510; E-mail: franz at nceas.ucsb.edu
> Website: http://www.cals.cornell.edu/dept/entomology/wheeler/Franz/Nico.html
>>>> Deana Pennington wrote:
>>>>> Sorry so long to reply...I've been at a conference without e-mail...
>>>>> The entire scientific process is designed around testing 
>>>>> hypotheses.  You come up with a research question of interest, 
>>>>> then create an analysis to test it.  NSF funding (and other 
>>>>> funding sources) is completely based on the strength (scientific 
>>>>> merit) of the question and how well thought out the proposed 
>>>>> methodology is. 
>>>>> The idea of integrating data simply to see if anything comes out 
>>>>> of it is strongly resisted, as is the idea of tool-driven 
>>>>> science.  The general argument is that science should be directed 
>>>>> and focused along paths that have been rationally determined.  
>>>>> Occasionally a tool comes along that changes the way we can think 
>>>>> about science (like the microscope, for example), and for a short 
>>>>> time, some exploratory analysis is funded.  But that is the 
>>>>> exception, not the norm.  The synthetic work that is being 
>>>>> encouraged may depend on data integration, but it will have to be 
>>>>> proposed as a traditional research question to get funded.  It's 
>>>>> the difference between saying you want to put climate and 
>>>>> hydrology data together over time to look for interesting 
>>>>> patterns, and having a focused question that requires data 
>>>>> integration to do the analysis (hypothesis: drought in the western 
>>>>> US has resulted in reduced evapotranspiration in high elevation 
>>>>> forests, which should result in an increase in runoff for a given 
>>>>> increase in precipitation).
>>>>> Actually, this seems to me to be a fundamental difference in the 
>>>>> way CIS/IM and domain scientists approach problems.  I've been 
>>>>> having a long-term discussion about this with Samantha.  The RCN 
>>>>> classes have presented a data-centric view that works well with 
>>>>> information managers, but did not work well with the domain 
>>>>> scientists at the new fac/postdoc workshop.  They kept wondering 
>>>>> what the goals/objectives were of the information that was 
>>>>> presented early in the week (Why are we doing this?).  For the 
>>>>> distributed graduate seminar, we have intentionally changed that 
>>>>> order around to a research question focus.  We'll see what kind of 
>>>>> response we get, but I think it will resonate with them.  
>>>>> Formulating your ideas through knowledge representation, pulling 
>>>>> together concepts, creating approaches to workflows...those are 
>>>>> early in the seminar, and would occur early in the scientific 
>>>>> process, long before a scientist thinks about data models, 
>>>>> structures, or metadata.
>>>>> Deana


Deana D. Pennington, PhD
Long-term Ecological Research Network Office

UNM Biology Department
MSC03  2020
1 University of New Mexico
Albuquerque, NM  87131-0001

505-272-7288 (office)
505 272-7080 (fax)
