The Computing Innovation Fellows Project

Matchmaking Service for Mentors and CIFellows

* Post a Profile!
* Update a Profile

Click for Available Candidate Profiles

Tony Jebara

University/Research Lab: Columbia University
Location: (New York, NY)
Personal Research Web Page: http://www.cs.columbia.edu/~jebara

Keywords: machine learning, graphs, matchings, perfect graphs, networks, graphical models, inference, large margin methods, structured prediction, kernel methods, spatiotemporal data, generative and discriminative learning, maximum entropy, combinatorics

Posted on: Monday, May 16th, 2011
Broad Research Area: AI / Machine Learning / Robotics / Vision

Research Interests:

The following describes the projects I am currently most interested in.

The field of machine learning increasingly leverages graph representations and graph-based algorithms in a variety of applications and problem settings. For instance, semi-supervised learning, dimensionality reduction, and unsupervised clustering problems are frequently solved by casting data points as nodes in a graph and then applying a graph-theoretic algorithm. Similarly, probabilistic inference problems are solved by casting random variables as nodes in a graph and then applying graph algorithms such as message passing, tree approximations and other variants. Most of these learning and inference graph problems are NP-hard in general. Similarly, various combinatorial graph problems such as graph coloring, maximum stable set and maximum clique are NP-hard in general. However, in combinatorics, scientists have identified a large family of graphs known as perfect graphs where combinatorial problems become tractable and admit polynomial time algorithms. We hope to investiga te a variety of hard learning problems and, by compiling them into combinatorial problems on perfect graphs, identify situations in which they may be solved exactly or approximated well. We plan to investigate various algorithms for exact solution including linear programming, semidefinite programming and message passing.

The first type of learning setting involves inference in graphical models: graph-based representations for encoding a set of conditional independence properties that hold in a probability distribution p(x_1,\ldots,x_n) defined over a set of random variables. Two important canonical computational problems therein are MAP Estimation (finding the configuration that maximizes p(x_1,\ldots,x_n)) and Marginal Inference (finding the marginal configuration of variables, such as p(x_i)). Both of these are NP-hard in general. However, it is possible to compile the graphical model into another graph and solve an equivalent maximum stable set problem. If this graph happens to be a perfect graph, the inference problems are tractable. We hope to follow this general approach to identify sufficient conditions for tractable graphical model inference.

The second type of learning setting to investigate involves operations on data graphs: graph-based representations which encode pairwise similarity relationships between pairs of objects {x_1,\ldots, x_n}. Two canonical computational problems therein are Clustering (finding multiple disjoint subsets of the data with relatively high self-similarity) and Anomaly Detection (finding a single subset of data with relatively high self-similarity).

We aim to develop methods of solving these problems by ensuring that the data graph constructed from these samples is perfect. This allows the clustering problem to be mapped to a graph coloring problem and allows the anomaly detection problem to be mapped to a maximum clique problem. Both of these problems are solvable in polynomial time for perfect graphs using convex programming methods.

Please apply for a CI Fellow position with me and if you are in the New York area, please try to drop in for a visit.

Contact Information:

Send me an email or meet me in person at the upcoming machine learning conferences (ICML, COLT or UAI in Montreal June 14-21).

twitter-icon

Browse Mentor Posts in other Research Areas