Improving the robustness of systems that help describe, classify and identify biological objects
Published 27 April 2009, updated 20 July 2009
Thesis by Noël Conruyt, Paris IX - Dauphine University, 24 May 1994.
Our approach to robustness in systems that help describe, classify and identify biological objects applies the scientific method of biology (experimenting and testing) to help naturalists understand their domain better, test their opinions, and transmit their knowledge. We have built user-friendly computer tools for constructing structured, pre-classified descriptions (the examples), learning inductive hypotheses (classifications), and testing them against new observations (by identification). The quality of the descriptions is fundamental to the learning process. They must also be comparable, and so must rely on a descriptive model that the expert explicitly represents and structures. To help the expert, we have highlighted some observational mechanisms drawn from monographs published in the scientific literature. The model corresponds to the objects that can be observed in the domain: they are represented in a description tree. A questionnaire matching the descriptive model is then built automatically; the biologist uses it as an observation guide to record observed descriptions and build a case base that is consistent with the model. Two different technologies can be used to process the case base, depending on the goal. For classification, a decision tree is built by inductive learning from examples in order to characterize the classes. For identification, a case-based reasoning strategy is used: it dynamically extracts the most efficient descriptors and produces better identifications than following a path through a decision tree. Nevertheless, inductive learning and repeated use of the questionnaire remain useful for detecting possible inconsistencies in the case base, which in turn allows the descriptive model to be validated.
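To make the contrast between a fixed decision-tree path and dynamic descriptor selection concrete, here is a minimal sketch. It assumes that "most efficient" means "highest information gain on the cases still compatible with the observation"; the abstract does not name the criterion, so this is an illustrative choice and not the thesis's actual method. The case base, the descriptor names (`shape`, `polyps`, `colour`) and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy case base: (description, class label) pairs.
CASES = [
    ({"shape": "branching", "polyps": "small", "colour": "brown"}, "A"),
    ({"shape": "branching", "polyps": "large", "colour": "brown"}, "B"),
    ({"shape": "massive", "polyps": "small", "colour": "brown"}, "C"),
    ({"shape": "massive", "polyps": "large", "colour": "green"}, "D"),
]

def entropy(cases):
    """Shannon entropy (in bits) of the class labels in `cases`."""
    counts = Counter(label for _, label in cases)
    total = len(cases)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def information_gain(cases, descriptor):
    """Entropy reduction obtained by splitting `cases` on `descriptor`."""
    groups = {}
    for description, label in cases:
        groups.setdefault(description.get(descriptor), []).append((description, label))
    remainder = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    return entropy(cases) - remainder

def best_descriptor(cases, candidates):
    """Pick the candidate with the highest gain (assumed efficiency criterion)."""
    return max(candidates, key=lambda d: information_gain(cases, d))

def identify(cases, observation):
    """Filter the case base using only the descriptors the observer could answer.

    The next descriptor is re-selected on the remaining cases at every step,
    so an unknown descriptor never blocks the identification.
    """
    remaining = list(cases)
    candidates = set(observation)
    asked = []
    while candidates and len({label for _, label in remaining}) > 1:
        d = best_descriptor(remaining, sorted(candidates))
        candidates.discard(d)
        asked.append(d)
        remaining = [(desc, label) for desc, label in remaining
                     if desc.get(d) == observation[d]]
    return sorted({label for _, label in remaining}), asked
```

For example, `identify(CASES, {"colour": "green", "shape": "massive"})` never needs the `polyps` descriptor: it uses `shape` first and then `colour`. A decision tree whose root tests `polyps` would have no path to follow for the same observation.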
The method proposed here lets the expert update the knowledge base according to the results obtained during classification and/or identification, and thus iteratively improve both the descriptive model and the case base. This process brings more robustness and leads to more powerful Computer-Assisted Taxonomy (CAT) tools.
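The kind of check that could feed this update loop can be sketched as follows. This is only an illustration under stated assumptions: the descriptive model is reduced to a flat mapping from descriptors to allowed values (the thesis uses a description tree), and the names `MODEL`, `model_violations` and `conflicting_cases` are hypothetical.

```python
from collections import defaultdict

# Hypothetical descriptive model: descriptor -> allowed values.
MODEL = {
    "shape": {"branching", "massive"},
    "polyps": {"small", "large"},
}

# Hypothetical case base: (case id, description, class label).
CASES = [
    (1, {"shape": "branching", "polyps": "small"}, "A"),
    (2, {"shape": "branching", "polyps": "small"}, "B"),  # same description as case 1
    (3, {"shape": "massive", "polyps": "tiny"}, "C"),     # "tiny" is not in the model
]

def model_violations(model, description):
    """Return the (descriptor, value) pairs that the model does not allow."""
    return [(d, v) for d, v in sorted(description.items())
            if d not in model or v not in model[d]]

def conflicting_cases(cases):
    """Return groups of case ids with identical descriptions but different classes."""
    by_description = defaultdict(list)
    for case_id, description, label in cases:
        key = tuple(sorted(description.items()))
        by_description[key].append((case_id, label))
    return [sorted(case_id for case_id, _ in group)
            for group in by_description.values()
            if len({label for _, label in group}) > 1]
```

Here `conflicting_cases(CASES)` flags cases 1 and 2, which the expert would resolve either by correcting a description or by adding a discriminating descriptor to the model, and `model_violations(MODEL, CASES[2][1])` flags the value `tiny`.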