Tutorial Program & Descriptions

Search Algorithms for Quantum Computers

Tad Hogg

Quantum computers factor integers in polynomial time, a problem thought to be intractable for conventional machines. More relevant for AI is how rapidly they solve NP-hard combinatorial searches. Although unlikely to efficiently solve all NP problems, heuristic algorithms for quantum computers may offer substantial improvement for many searches that arise in practice by operating on the entire search space at once. Furthermore, heuristics pose less stringent hardware requirements than algorithms ignoring problem structure, thereby reducing the formidable challenge of building these machines.

This tutorial will describe the capabilities of quantum computers, e.g., testing exponentially many search states in about the same time conventional machines test just one. Attendees will learn how to use these capabilities for search through a variety of examples including a heuristic for random 3-SAT near a phase transition in typical search cost. The tutorial will also cover theoretical and empirical techniques for evaluating such algorithms and a variety of open research questions that the AI community is well positioned to address.

The tutorial will assume some knowledge of combinatorial searches, such as SAT, and heuristic methods, such as hill-climbing and GSAT. Familiarity with quantum mechanics is not required. For further information see http://www.parc.xerox.com/hogg/ijcai01.html.

Tad Hogg is a member of the research staff at Xerox PARC. His research interests include distributed control with multiagent systems, ecommerce privacy mechanisms, search algorithms for quantum computers, and analogies with physical phase transitions found in combinatorial search problems. He holds a physics PhD from Stanford University.

MA2

Distributed Knowledge-Based Search

Jörg Denzinger

With the increasing availability of multiprocessor computers and networks of computers, the wish to use the massive computing power they provide has grown ever stronger. With the maturing of the field of multi-agent systems, we now have the conceptual and modeling tools to adequately describe and compare different approaches to solving AI problems while employing the additional “dimension” of teams of computers. Knowledge-based search is at the core of many AI systems and even of a number of systems that have found their way into “mainstream” computer science, such as scheduling systems and many standard optimization systems.

This tutorial will provide a unified view of different concepts used to distribute knowledge-based search. We will introduce distributed search systems as cooperative multi-agent systems and concentrate on the communication and organization requirements of such systems. The general ideas behind the known distributed search systems will be presented within this multi-agent framework, and the systems will be classified into different categories. For each category, we will present its basic idea independently of any particular application. We will present one typical homogeneous and one typical heterogeneous distribution concept for each category. Finally, we will discuss and compare the requirements, limitations, advantages, and disadvantages of the different categories.

Prerequisite knowledge: The tutorial is suitable for a general AI audience, both academic and industrial. Knowledge of some basic search algorithm schemes would be helpful, but it is not essential.

Jörg Denzinger is a professor of AI and multi-agent systems at the University of Calgary. He was head of several DFG projects on distributed knowledge-based search and on the use of previous experiences to improve search controls. His research interests also include learning cooperative behavior of agents. Homepage: http://www.cpsc.ucalgary.ca/~denzinge/.

MA3

Empirical Methods in CS and AI

Paul Cohen, Ian Gent, and Toby Walsh

This tutorial will cover the basic principles of empirical studies, and methods for exploratory data analysis, experiment design, hypothesis testing, and modeling. We will cover the entire lifecycle of empirical studies, including the exploratory phase, which is usually not reported, and the phase in which a research question (why you are running the study in the first place) is turned into an experiment design. While this is not a crash course in statistical methods, we will introduce hypothesis testing and computer-intensive statistical methods, a new family of tools particularly appropriate for AI research. Finally, we will address questions that arise when trying to publish empirical work. Throughout, we will use examples from our own research: positive examples of good practice, and negative examples to demonstrate what not to do!
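As an illustration of what a computer-intensive statistical method can look like, here is a minimal sketch of a two-sided randomization (permutation) test for a difference in means. It is not taken from the tutorial itself: the function names, the run-time figures, and the choice of 10,000 shuffles are all assumptions made for this example.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def randomization_test(a, b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (trials + 1)


# Hypothetical run times (seconds) of two algorithms on six problems each.
runtimes_a = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9]
runtimes_b = [10.2, 10.9, 10.5, 11.1, 10.4, 10.8]
p_value = randomization_test(runtimes_a, runtimes_b)
print(f"estimated p-value: {p_value:.4f}")
```

The test makes no distributional assumption about the run times, which is one reason resampling methods of this kind are often suggested for comparing algorithms on small benchmark sets.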

The tutorial will be suitable for a general AI audience, as very little background knowledge is assumed and the empirical methods discussed are generally useful. It builds upon the successful tutorial on the same theme presented at AAAI-2000.

Paul Cohen is a professor of computer science at the University of Massachusetts, where he works on planning, simulation, and learning. Cohen's Empirical Methods for Artificial Intelligence (The MIT Press) is a textbook on experiment design, data analysis, statistical modeling, and other empirical tools. Cohen is a Fellow of the American Association for Artificial Intelligence.

Ian Gent is a lecturer in computer science at the University of St. Andrews. His research has mainly been in combinatorial search in AI, in domains such as satisfiability and constraint satisfaction. He has done much empirical work, both with Toby Walsh and as part of the APES research group, http://apes.cs.strath.ac.uk.

Toby Walsh is an EPSRC Advanced Research Fellow at the Department of Computer Science (York). He has previously held postdoctoral posts at the Department of AI (Edinburgh), INRIA (Nancy), DIST (Genova), IRST (Trento), and the Department of Computer Science (Strathclyde). He too has done much empirical work with Ian Gent and as part of the APES research group, http://apes.cs.strath.ac.uk.

MA4

Integrating Lisp with the World

Vladimir A. Kulyukin

The rapid growth of diverse software technologies has made it hard and oftentimes impossible to develop sophisticated systems in one programming language. Lisp has been the language of choice for AI researchers and practitioners for over two decades. Yet the AI community has paid little attention to the integration of Lisp with mainstream development tools. As a consequence, these tools have started claiming the Lisp territory.

The purpose of this tutorial is to show that AI researchers and developers can and should use Lisp in conjunction with such mainstream languages as Java and C++. The tutorial will demonstrate how Lisp can utilize specific functionalities available through C++ and Java, and how C++ and Java can utilize software components written in Lisp. A special emphasis will be placed on CORBA, COM, and foreign function interfaces. Examples will include information retrieval, natural language processing, and robot control.

The tutorial will be of interest to researchers and developers who want to integrate Lisp-based solutions into applications written in mainstream languages. The tutorial will also be relevant to researchers and practitioners who want to master distributed computing with COM and CORBA.

Prerequisite knowledge: Familiarity with Common Lisp. Knowledge of the Allegro Common Lisp IDE is helpful but not essential.

Vladimir A. Kulyukin is an Assistant Professor of Computer Science at DePaul University. He has a Ph.D. in Computer Science from the University of Chicago. His research interests are robotics, information retrieval, and computer vision.

MA5

Machine Learning for Categorization of Text Documents and Web Pages

Fabrizio Sebastiani & Alessandro Sperduti

In this tutorial we look at the main approaches that have been taken towards automatic text categorization within the general machine learning paradigm. A general presentation of the basic issues in document categorization will be followed by the presentation of basic machine learning concepts and techniques (such as linear separators and decision trees) and advanced ones (such as boosting and support vector machines). Then issues pertaining to document indexing, classifier construction, and classifier evaluation will be discussed in detail, and a review of the most relevant current research in text categorization by machine learning tools will be presented. Finally, the special case of automatic classification of Web pages will be considered, and the concepts and techniques specifically devised for this case will be discussed.

We assume that attendees will have basic knowledge of linear algebra, calculus, and probability.

Fabrizio Sebastiani has been a research associate of the Italian National Council of Research since 1988. He has published several papers in international journals and conferences in the areas of natural language processing, logic-based knowledge representation, information retrieval, and automated text categorization. On the last two topics he has taught several tutorials at international conferences and summer schools. His research interests concern information retrieval, machine learning, and automated text categorization.

Alessandro Sperduti is Associate Professor at the Computer Science Department, University of Pisa. His research interests include pattern recognition, machine learning, and neural networks. He has co-organized several workshops in these areas and served on the program committees of neural networks conferences. More recently his research has focused on machine learning for the Web.

MP1

Ant Algorithms and Swarm Intelligence

Marco Dorigo

Ant colonies, and more generally social insect societies, are distributed systems that, in spite of the simplicity of their individuals, present a highly structured social organization. As a result of this organization, ant colonies can accomplish astonishingly complex tasks that in some cases far exceed the individual capacities of a single ant. The study of ant colony behavior and of ant colonies' self-organizing capacities is interesting for computer scientists because it provides models of distributed organization that are useful for solving difficult optimization and distributed control problems. This is particularly true in application environments in which rapid and autonomous adaptation to environmental changes, as well as robustness to system failures, are important features.

In this tutorial I will present some models derived from the observation of real ants and other insect societies, and I will explain how these models can be used to design multi-agent systems for the solution of problems like distributed and adaptive routing in Internet-like networks, combinatorial optimization, optimal allocation of resources, and distributed task allocation in a fleet of autonomous robots.

Marco Dorigo is a Senior Researcher for the Belgian FNRS. He is the inventor of the Ant Colony Optimization metaheuristic and author of the book Swarm Intelligence (Oxford University Press, 1999). He has published more than twenty papers on the tutorial subject in international journals and conferences in the last five years.

MP2

Integration of Operations Research and AI Constraint-Based Techniques for Combinatorial Optimization

Michela Milano

The tutorial provides an overview of recent directions in the integration of Mathematical Programming (MP) techniques, used in Operations Research (OR), and Artificial Intelligence Constraint Satisfaction (CS) techniques for facing Combinatorial Optimization Problems (COPs). The tutorial starts by reviewing basic concepts in COPs, CS, and Constraint Programming (CP). Participants are assumed to have some familiarity with these preliminaries. OR concepts are described in more detail since we do not require any prerequisite knowledge of the field. We describe (Mixed) Integer Programming and Linear Programming, their geometrical properties, and solving algorithms; cutting planes generation techniques are presented, branch-and-bound and branch-and-cut frameworks discussed; finally column generation approaches are introduced. The aim of this introduction is not to provide details on how these techniques are implemented, but rather to explain how results can be exploited from a software engineering viewpoint.

In the second part, we compare Constraint Satisfaction and Optimization Problems, underlining differences and similarities from a modeling and solving perspective. In the third part, we describe approaches toward integration that have been investigated to date, again from a modeling and solving viewpoint. Finally, we discuss open problems and research directions. We provide references to recent literature throughout.

Michela Milano received a PhD in 1998 from the University of Bologna, based on a thesis entitled Reasoning on Constraints in Constraint Logic Programming. Currently, she is a researcher at the same university. Her primary research interests are Constraint Programming, Constraint Satisfaction, and Optimization.

MP3

Knowledge Markup and Resource Semantics

Harold Boley, Stefan Decker, and Michael Sintek

Semi-formal and formal knowledge on the Web attracts increasing numbers of content providers, and much of the ongoing formalization can be done with AI knowledge representation techniques and by AI people. This tutorial introduces techniques for _knowledge markup_: how to map formal AI representations (e.g., logics and frames) to XML. It also explains _resource semantics_: how to describe heterogeneous Web resources using AI-inspired semantic RDF metadata.
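To make the idea of mapping a frame to XML concrete, the following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`frame`, `slot`, `name`) and the sample frame are hypothetical choices for illustration; they are not the vocabulary of RFML, RuleML, RDF, or any other language covered in the tutorial.

```python
import xml.etree.ElementTree as ET


def frame_to_xml(name, slots):
    """Serialize a frame (a name plus slot/value pairs) as an XML element.

    The <frame>/<slot> vocabulary is made up for this sketch.
    """
    frame = ET.Element("frame", {"name": name})
    for slot_name, value in slots.items():
        slot = ET.SubElement(frame, "slot", {"name": slot_name})
        slot.text = str(value)
    return frame


# Hypothetical frame describing a tutorial.
elem = frame_to_xml("Tutorial", {"code": "MP3", "topic": "knowledge markup"})
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

A flat slot list like this covers only the simplest frames; nested frames, typed values, or logical formulas would need a richer element structure than this sketch assumes.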

We survey existing XML and RDF applications for knowledge bases/ontologies, deal with the acquisition and processing of these representations, go into agent architectures built around XML or RDF, and detail selected applications. A special emphasis will be on current efforts to define a shared markup language for realizing Tim Berners-Lee’s “Semantic Web” vision (e.g., DAML and OIL).

After the tutorial, participants will have absorbed the theoretical foundations as well as the practical use of knowledge markup and resource semantics; they will also be able to assess proposed XML and RDF applications for AI and to identify further applications of these techniques.

Participants should have some previous experience with knowledge representations (logics or frames) and markup languages (e.g., HTML).

Harold Boley has used markup techniques for knowledge representation and KR for resource semantics. He developed the Relational-Functional Markup Language (RFML) and started the Rule Markup Initiative (RuleML). He wrote invited papers explaining relationships between Logic Programming and XML as well as RDF. He has developed AI-oriented XML and RDF courses. See http://www.dfki.uni-kl.de/~boley/.

Stefan Decker is a postdoctoral fellow at the department of computer science at Stanford University, where he leads the OntoAgents project in the DARPA DAML program. His research interests include knowledge representation, database systems for the Semantic Web, information integration and translation, and ontology articulation and merging. See http://www-db.stanford.edu/~stefan and http://www.SemanticWeb.org.

Michael Sintek is the project leader of the FRODO (http://www.dfki.uni-kl.de/frodo/) project at DFKI Kaiserslautern, where an XML/RDF-based framework for building distributed organizational memories is being developed. His research interests include logic programming, knowledge representation, ontologies, and web technologies. See http://www.dfki.uni-kl.de/~sintek/.

MP4

Practical Machine Learning for Software

Tim Menzies

Machine learning (ML) is not hard and should be a standard part of any software engineer’s toolkit. Software engineers can use machine learners to simplify systems development. This tutorial explains how to use ML to assist in the construction of systems that support classification, prediction, diagnosis, planning, monitoring, requirements engineering, validation, and maintenance. Case study material will be presented using examples from software fault estimation, software time estimation, software risk reduction, decision support systems for geologists, medical diagnostic systems, electrical diagnosis systems, and reverse engineering.

This tutorial is oriented toward industrial practitioners. For example, most of its material is suitable for the AI novice or the technical manager of software engineering projects. The tutorial also explores how to use machine learning in data-starved domains, that is, domains that lack the large data sets needed for traditional machine learning. Many software engineering companies operate in such data-starved domains, particularly the newer, smaller dot-com software companies. In such domains, learning must be preceded by a modeling process that produces a model we can use to generate data sets. Machine learning for software engineering is practical when both the modeling and learning stages are simple and inexpensive. This tutorial presents such simple and inexpensive techniques.

Dr. Tim Menzies developed this tutorial while working with NASA on using machine learning for software engineering. He holds a Ph.D. in AI and has worked for many years as an OO and expert systems consultant. Dr. Menzies is an assistant professor in Electrical and Computer Engineering at the University of British Columbia.

MP5

Tractability in Qualitative Spatial and Temporal Reasoning

Frank Anger, Hans Guesgen, & Gerard Ligozat

The field of qualitative temporal reasoning has been around in AI at least since Allen’s pioneering work 20 years ago. More recently, similar approaches have been introduced for reasoning about space. Some general threads have emerged, especially as far as complexity problems are concerned; hence an integrated presentation of the basic results becomes feasible in a systematic way. The potential applications of the field include natural language understanding, planning, GIS, robotics, automatic mail processing, and human-machine communication, among others.

This tutorial will guide practitioners by describing the main methods and results in the field. It will introduce researchers and graduate students to an area with exciting open problems and perspectives.

Prerequisite knowledge: The tutorial assumes only a basic knowledge of AI and Knowledge Representation techniques. The logical notions will be introduced when required.

Dr. Frank Anger is Program Director and Acting Deputy Division Director at NSF. He holds degrees from Princeton, Cornell and Florida and held professorships at several universities before joining NSF. He has published over 50 papers covering a wide range of topics and is a founding member of three professional organizations.

Dr. Hans Guesgen is an associate professor in computer science at the University of Auckland, NZ. His areas of research include spatio-temporal reasoning and constraint satisfaction, with more than 50 publications in these areas. He co-organized and co-chaired various workshops on spatial and temporal reasoning.

Dr. Gerard Ligozat is a professor of computer science at the University of Paris at Orsay, France. His fields of interest include temporal and spatial representation and reasoning in connection with formal and natural language issues. Along with many publications, he authored or co-authored two books on knowledge representation.