Inductive Logic Programming applied to Bioinformatics
Inductive Logic Programming (ILP) is a Machine Learning approach with foundations in Logic Programming. Both the problem specification and the models discovered by ILP systems are represented as Prolog programs, which allows for great expressiveness and flexibility. However, this flexibility comes at a high computational cost, and ILP systems are known to be difficult to scale up. Constructing and evaluating complex concepts are two of the main problems that prevent ILP systems from tackling many of the most interesting learning problems. Large concepts cannot be constructed or evaluated simply by parallelizing existing top-down search algorithms or by improving the underlying Prolog engine; novel search strategies and cover algorithms are needed. This talk focuses on how to construct and evaluate such complex hypotheses efficiently in an ILP setting. In particular, we will present an efficient theta-subsumption algorithm that improves on Prolog’s SLD-resolution by several orders of magnitude. We will also show how a new bottom-up search strategy, coupled with this efficient subsumption algorithm, led to the discovery of a better model for a protein-binding application problem.
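For readers unfamiliar with the term, a clause C theta-subsumes a clause D if some substitution theta of C's variables makes every literal of C appear in D. The sketch below is a naive backtracking check of that definition, written only to illustrate what a coverage test has to decide; it is not the efficient algorithm presented in the talk. The literal encoding (predicate name plus a tuple of argument strings, with uppercase-initial strings as variables), the function names, and the example predicates bond and atom are all assumptions made for illustration. Literal signs and nested terms are ignored, and any variables in D are treated as constants.

```python
from typing import Dict, List, Optional, Tuple

# Assumed encoding: a literal is (predicate, args); an argument whose first
# character is uppercase is a variable, anything else is a constant.
Literal = Tuple[str, Tuple[str, ...]]
Substitution = Dict[str, str]


def is_variable(term: str) -> bool:
    return term[:1].isupper()


def match_literal(pattern: Literal, target: Literal,
                  theta: Substitution) -> Optional[Substitution]:
    """Extend theta so that pattern under theta equals target, or return None."""
    if pattern[0] != target[0] or len(pattern[1]) != len(target[1]):
        return None
    extended = dict(theta)
    for p, t in zip(pattern[1], target[1]):
        if is_variable(p):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def theta_subsumes(clause_c: List[Literal], clause_d: List[Literal],
                   theta: Optional[Substitution] = None) -> Optional[Substitution]:
    """Return a substitution mapping every literal of clause_c into clause_d,
    or None if no such substitution exists (naive depth-first backtracking)."""
    theta = {} if theta is None else theta
    if not clause_c:
        return theta
    first, rest = clause_c[0], clause_c[1:]
    for target in clause_d:
        extended = match_literal(first, target, theta)
        if extended is not None:
            result = theta_subsumes(rest, clause_d, extended)
            if result is not None:
                return result
    return None


# Hypothetical example: does "bond(X, Y), atom(X, c)" cover this small molecule?
hypothesis = [("bond", ("X", "Y")), ("atom", ("X", "c"))]
example = [("bond", ("a1", "a2")), ("bond", ("a2", "a3")),
           ("atom", ("a1", "n")), ("atom", ("a2", "c"))]
print(theta_subsumes(hypothesis, example))
```

In the worst case this search tries every combination of literals in D for the literals of C, which is exponential in the length of C. That blow-up on long clauses is the kind of cost that makes evaluating complex hypotheses expensive.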