Organizers: Qi Tian, Kay Robbins, Weining Zhang, Tom Bylander, Carola Wenk, Yufei Huang (EE),
Time: 11:00 am-12:00 pm, every Friday
Note: Due to conflicts with other scheduled events in the CS department, some Friday talks may be moved to Wednesday afternoons on a case-by-case basis.
Place: SB 4.01.20, CS Conference Room
Schedule for Spring 2010
Jan 15 Dr. Andre Schulz (MIT)
Jan 22 Dr. David P. Wagner (Korea University)
Jan 29 Dr. Pat Langley (Arizona State University)
Feb 5 Jie Xiao, cancelled
Feb 12 Chengwei Lei
Feb 19 Jamiul Jahid
Feb 26 Jessica Sherette
Mar 12 Lijie Zhang, cancelled
Mar 19 (spring break)
Mar 24 Shiliang Zhang (Wed, 3pm)
Mar 24 Xianming Liu (Wed, 3pm)
Mar 26 Huang's student, cancelled
Apr 2 Xia Li
Apr 9 Hongwei Tian
Apr 16 Jia Meng
Apr 23 Dr. Shouhuai Xu
01/15/10 Pseudo-Triangulations as Statically Rigid Graphs
Speaker: Andre Schulz
A pseudo-triangle is a polygon with exactly three acute angles. The decomposition of the convex hull of a 2d point set into pseudo-triangles is called a pseudo-triangulation. Pseudo-triangulations find applications as geometric data structures in algorithms for ray shooting, polygon guarding, collision detection, polygon unfolding and so on.
If every vertex in a pseudo-triangulation is incident to an angle greater than 180 degrees, the pseudo-triangulation is called pointed. Pointed pseudo-triangulations are known to be minimally rigid. This means that if we keep the edge lengths fixed, the graph, seen as a bar-and-joint framework, is (infinitesimally) rigid. On the other hand, the graph becomes a mechanism with one degree of freedom if any of its edges is removed. This property leads to an algorithm that transforms any polygon into a convex polygon while preserving edge lengths, and therefore answers the long-standing open carpenter's rule problem positively.
Infinitesimally rigid frameworks can also be understood as frameworks that can absorb any set of forces applied to their joints. This well-known dual concept is known as static rigidity. I show that pseudo-triangulations can be considered as special statically rigid frameworks. Under the restriction that only expansive tensions are allowed in the interior, there is exactly one pointed pseudo-triangulation that resolves a fixed set of forces.
The proof of this phenomenon uses linear programming duality. As the feasible region of the primal LP we use the polytope of pointed pseudo-triangulations. This high-dimensional polytope encodes all possible pointed pseudo-triangulations of a point set as its vertices. As a consequence of the special static rigidity we obtain a modified (more direct, compared to the original work of Rote, Santos and Streinu) proof of the correctness of the polytope of pointed pseudo-triangulations.
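For context, the equilibrium condition behind static rigidity can be written out explicitly; this is the standard textbook formulation, not notation taken from the talk:

```latex
% A planar bar-and-joint framework (G, p), with joint positions p_v in R^2,
% resolves a load (F_v)_{v in V} if there exist edge stresses omega_{uv} with
\sum_{u \,:\, uv \in E} \omega_{uv}\,(p_u - p_v) + F_v = 0
\qquad \text{for every vertex } v \in V .
```

The framework is statically rigid when every equilibrium load (zero total force and torque) can be resolved in this way; the talk's restriction additionally fixes the sign of the stresses on interior edges.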
Bio: Andre Schulz is a postdoctoral researcher at the Computer Science and Artificial Intelligence Lab at MIT. After completing his doctoral studies at the Free University Berlin, Germany, he spent one year as postdoc at Smith College. His research addresses problems in computational and discrete geometry. In particular, he is interested in realizing polytopes on the grid, graph drawing and rigidity theory.
- Guenter Rote and Andre Schulz. Resolving loads with positive interior stresses. In: Algorithms and Data Structures Symposium (WADS 2009), Banff, August 2009, eds. Frank Dehne, Ian Munro, Jörg-Rüdiger Sack and Roberto Tamassia, Lecture Notes in Computer Science 5664, Springer-Verlag, 2009, pp. 530-541. doi:10.1007/978-3-642-03367-4_46
- Guenter Rote, Francisco Santos and Ileana Streinu. Expansive motions and the polytope of pointed pseudo-triangulations. In: Discrete and Computational Geometry - The Goodman-Pollack Festschrift, eds. Boris Aronov, Saugata Basu, Janos Pach and Micha Sharir, Algorithms and Combinatorics 25, Springer-Verlag, Berlin 2003, pp. 699-736. arXiv:math/0206027 [math.CO].
- Guenter Rote, Francisco Santos and Ileana Streinu. Pseudo-triangulations - a survey. arXiv:math.CO/0612672, 2006.
- Ileana Streinu. Pseudo-Triangulations, Rigidity and Motion Planning. Discrete and Computational Geometry, vol. 34, no. 4, pp. 587-635, Nov. 2005.
01/22/10 An O(n^{5/2} log n) algorithm for the Rectilinear Minimum Link-Distance Problem in three dimensions
Speaker: Dr. David P. Wagner (Korea University)
In this paper we consider the Rectilinear Minimum Link Path Problem among rectilinear obstacles in three dimensions. The problem is well studied in two dimensions, but is relatively unexplored in higher dimensions. We solve the problem in O(β n log n) time, where n is the number of corners among all obstacles, and β is the size of a binary space partition (BSP) decomposition of the space containing the obstacles. There exist methods to find a BSP where in the worst case β = Θ(n^{3/2}), giving us an overall worst-case time of O(n^{5/2} log n). Previously known algorithms have had worst-case running times of Ω(n^3).
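To make the notion of link distance concrete, here is a toy 2-D grid version; the paper's 3-D BSP-based algorithm is far more involved, and everything below is an illustrative construction of mine. It computes the minimum number of axis-parallel segments (links) connecting two cells, via 0-1 BFS over (cell, direction) states:

```python
from collections import deque

def min_link_distance(grid, start, goal):
    """Minimum number of axis-parallel segments (links) in a rectilinear
    path from start to goal on a grid of cells (True = blocked).
    0-1 BFS over states (row, col, direction): continuing straight along
    the current segment costs 0, turning starts a new link and costs 1."""
    rows, cols = len(grid), len(grid[0])
    DIRS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    INF = float("inf")
    if start == goal:
        return 0
    dist = {}
    dq = deque()
    # The first move opens the first link (cost 1), in any free direction.
    for d, (dr, dc) in enumerate(DIRS):
        r, c = start[0] + dr, start[1] + dc
        if 0 <= r < rows and 0 <= c < cols and not grid[r][c]:
            dist[(r, c, d)] = 1
            dq.append((r, c, d))
    best = INF
    while dq:
        r, c, d = dq.popleft()
        cost = dist[(r, c, d)]
        if (r, c) == goal:
            best = min(best, cost)
            continue
        for nd, (dr, dc) in enumerate(DIRS):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
                continue
            ncost = cost + (0 if nd == d else 1)
            if ncost < dist.get((nr, nc, nd), INF):
                dist[(nr, nc, nd)] = ncost
                # Zero-cost edges go to the front, unit-cost edges to the back.
                if nd == d:
                    dq.appendleft((nr, nc, nd))
                else:
                    dq.append((nr, nc, nd))
    return best
```

A straight corridor gives 1 link, an L-shaped path gives 2, and obstacles force additional bends.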
Bio: Dr. David Wagner is currently an Assistant Professor at Korea University. He obtained his PhD from Dartmouth College in 2006 and his MS from SUNY - Stony Brook in 1995, both in Computer Science. His research interests include computational geometry, approximation algorithms, and path finding algorithms.
- David P. Wagner, Robert Scot Drysdale, and Clifford Stein. An O(n^{5/2} log n) algorithm for the Rectilinear Minimum Link-Distance Problem in three dimensions. Computational Geometry: Theory and Applications, 42(5):376-387, 2009.
01/29/10 An Interactive Environment for Constructing Ecological Models
Speaker: Dr. Pat Langley (Arizona State University)
The field of ecology has a long tradition of interactive software environments that aid model construction, visualization, and simulation, but these typically require hand crafting of model structures and parameters. In contrast, machine learning has developed techniques that automatically induce models from data, but these typically focus on shallow predictors that make little contact with concepts or notations from scientific disciplines. In this talk, I present progress on Prometheus, an interactive environment for the construction, evaluation, and discovery of explanatory models. The framework encodes ecological models as sets of processes that incorporate differential equations, simulates these models' behavior over time, incorporates background knowledge to constrain model construction, and induces the models from time-series data. I illustrate this approach on data and models from ecology and environmental science. In closing, I report on recent extensions that draw upon additional knowledge, stated as domain-specific constraints, to reduce search and make models more comprehensible.
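The idea of encoding a model as a set of processes that each contribute differential-equation terms can be sketched in a few lines. Everything below (the simulate loop, the two predator-prey processes, and all rate constants) is an illustrative construction of mine, not Prometheus code:

```python
def simulate(processes, state, dt=0.001, steps=5000):
    """Minimal sketch of simulating a process model: each process maps the
    current state to additive d(state)/dt contributions, and the summed
    derivative is integrated with forward Euler steps."""
    for _ in range(steps):
        deriv = {k: 0.0 for k in state}
        for proc in processes:
            for k, v in proc(state).items():
                deriv[k] += v
        for k in state:
            state[k] += dt * deriv[k]
    return state

# Two hypothetical ecological processes forming a predator-prey model:
def growth(s):
    # Logistic prey growth with an assumed carrying capacity of 100.
    return {"prey": 1.0 * s["prey"] * (1.0 - s["prey"] / 100.0)}

def predation(s):
    # Mass-action predation with assumed rate constants.
    return {"prey": -0.02 * s["prey"] * s["pred"],
            "pred": 0.01 * s["prey"] * s["pred"] - 0.4 * s["pred"]}
```

Composing a model is then just choosing which processes to include, which is what makes search over model structures possible.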
This talk describes joint work at Stanford University and ISLE with Kevin Arrigo, Stuart Borrett, Matt Bravo, Will Bridewell, and Ljupco Todorovski.
Bio: Prof. Pat Langley currently serves as Director for the Institute for the Study of Learning and Expertise, as Professor of Computing and Informatics at Arizona State University, and as Head of CSLI's Computational Learning Laboratory. He is also a Consulting Professor of Symbolic Systems and Computational Mathematics and Engineering at Stanford University. He has contributed to the fields of artificial intelligence and cognitive science for over 25 years, having published 200 papers and five books on these topics, including the text Elements of Machine Learning. His research has dealt with learning in planning, reasoning, language, vision, robotics, and scientific knowledge discovery, and he has contributed novel methods to a variety of paradigms, including logical, probabilistic, and case-based learning. His current research focuses on methods for constructing explanatory process models in scientific domains and on integrated architectures for intelligent physical agents. http://www.isle.org/~langley/
02/05/10 In Defense of Nearest-Neighbor Based Image Classification
Speaker: Jie Xiao
State-of-the-art image classification approaches require an intensive training/learning stage, which makes them outperform non-parametric Nearest-Neighbor (NN) methods. In this paper, the authors analyze the reasons for this inferior performance and propose an efficient and simple NN-based approach. It outperforms the state-of-the-art algorithms on standard testing datasets.
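The NN-based approach the paper defends (NBNN) reduces, in its simplest form, to image-to-class nearest-neighbor distances over local descriptors. A minimal sketch with all names my own:

```python
def nbnn_classify(query_descriptors, class_descriptors):
    """Naive-Bayes Nearest-Neighbor (NBNN) classification sketch.
    query_descriptors: list of feature vectors from the test image.
    class_descriptors: dict mapping class label -> descriptors pooled
    from all training images of that class.
    Each query descriptor contributes its squared distance to the nearest
    descriptor of each class; the class with the smallest total wins."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    scores = {}
    for label, descs in class_descriptors.items():
        scores[label] = sum(min(sq_dist(q, d) for d in descs)
                            for q in query_descriptors)
    return min(scores, key=scores.get)
```

The key design choice the paper argues for is computing image-to-class (not image-to-image) distances, which is what the pooling over each class's descriptors implements.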
Oren Boiman, Eli Shechtman, and Michal Irani, "In Defense of Nearest-Neighbor Based Image Classification", CVPR 2008.
02/12/10 Model-based analysis of ChIP-Seq (MACS)
Speaker: Chengwei Lei
We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.
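The dynamic Poisson idea can be illustrated in a few lines. This is a toy version in the spirit of MACS, not MACS code; the function names and the pure-Python tail computation are mine:

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the CDF."""
    term = math.exp(-lam)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)

def macs_style_pvalue(count, lam_bg, lam_1k, lam_5k, lam_10k):
    """Dynamic-Poisson p-value for a candidate peak: the local rate is the
    maximum of the genome-wide rate and rates estimated in surrounding
    1 kb / 5 kb / 10 kb windows, which captures local biases such as
    chromatin accessibility or copy-number variation."""
    lam_local = max(lam_bg, lam_1k, lam_5k, lam_10k)
    return poisson_tail(count, lam_local)
```

Taking the maximum of the candidate rates makes the test conservative in locally enriched regions, which is what yields the more robust predictions mentioned above.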
Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nussbaum C, Myers RM, Brown M, Li W, Liu XS, "Model-based analysis of ChIP-Seq (MACS)", Genome Biology. 2008;9(9):R137.
02/19/10 Learning on Weighted Hypergraphs to Integrate Protein Interactions and Gene Expressions for Cancer Outcome Prediction
Speaker: Jamiul Jahid
Building reliable predictive models from multiple complementary genomic data for cancer study is a crucial step towards successful cancer treatment and a full understanding of the underlying biological principles. To tackle this challenging data integration problem, we propose a hypergraph-based learning algorithm called HyperGene to integrate microarray gene expressions and protein-protein interactions for cancer outcome prediction and biomarker identification. HyperGene is a robust two-step iterative method that alternatively finds the optimal outcome prediction and the optimal weighting of the marker genes guided by a protein-protein interaction network. Under the hypothesis that cancer-related genes tend to interact with each other, the HyperGene algorithm uses a protein-protein interaction network as prior knowledge by imposing a consistent weighting of interacting genes. Our experimental results on two large-scale breast cancer gene expression datasets show that HyperGene utilizing a curated protein-protein interaction network achieves significantly improved cancer outcome prediction. Moreover, HyperGene can also retrieve many known cancer genes as highly weighted marker genes.
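HyperGene's full two-step iteration is beyond a short sketch, but one standard building block of such graph/hypergraph transductive methods is label propagation over a similarity matrix. A generic illustration (not the HyperGene algorithm itself; all names and constants are mine):

```python
def label_propagate(S, y, alpha=0.9, iters=200):
    """Generic transductive label propagation (Zhou et al.-style):
    iterate f <- alpha * S f + (1 - alpha) * y, where S is a row-normalized
    similarity matrix (e.g., derived from an interaction network) and y
    holds known labels (+1/-1, 0 for unlabeled samples). With alpha < 1
    and row-stochastic S the iteration converges."""
    n = len(y)
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f
```

In HyperGene this kind of propagation step alternates with re-weighting the marker genes, so that the network prior and the outcome labels inform each other.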
TH Hwang, Z Tian, R Kuang and JP Kocher, Learning on Weighted Hypergraphs to Integrate Protein Interactions and Gene Expressions for Cancer Outcome Prediction , Proceedings of the 2008 Eighth IEEE International Conference on Data Mining (ICDM-08)
02/26/10 Can we Compute the Similarity Between Surfaces?
Speaker: Jessica Sherette
Similarity metrics between surfaces are used in graphics and computer-aided manufacturing; in the latter, for example, they help to ensure quality control. The Fréchet distance in particular is a useful similarity metric because it takes the continuity of the given surfaces into account. A relaxed version, the weak Fréchet distance, has been shown to be computable between triangulated surfaces in polynomial time. In contrast, the Fréchet distance between triangulated surfaces is only known to be upper semi-computable.
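For curves (rather than surfaces) the analogous measure is easy to compute, which highlights why the surface case is the hard one. A standard dynamic program for the discrete Fréchet distance between polygonal curves, in the style of Eiter and Mannila (not code from the paper):

```python
import math

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between polygonal curves P and Q, given as
    lists of points. The DP value ca[i][j] is the smallest 'leash length'
    needed for two walkers to reach P[i] and Q[j], each only moving forward.
    Runs in O(len(P) * len(Q)) time."""
    n, m = len(P), len(Q)
    ca = [[-1.0] * m for _ in range(n)]
    def c(i, j):
        if ca[i][j] >= 0:
            return ca[i][j]
        if i == 0 and j == 0:
            ca[i][j] = math.dist(P[0], Q[0])
        elif i == 0:
            ca[i][j] = max(c(0, j - 1), math.dist(P[0], Q[j]))
        elif j == 0:
            ca[i][j] = max(c(i - 1, 0), math.dist(P[i], Q[0]))
        else:
            ca[i][j] = max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)),
                           math.dist(P[i], Q[j]))
        return ca[i][j]
    return c(n - 1, m - 1)
```

For surfaces there is no such simple ordering of the "walk", which is one intuition for why only the weak variant is known to be polynomial-time computable.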
Alt, Helmut, and Maike Buchin. "Can we Compute the Similarity Between Surfaces?" Discrete and Computational Geometry 43.1 (2009): 78-99.
03/24/10 Descriptive Visual Words and Visual Phrases for Image Applications
Speaker: Shiliang Zhang
Bag-of-visual-Words (BoW) representation has been applied to various problems in multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to the words in text. Notwithstanding its great success and wide adoption, a visual vocabulary created from single local image descriptors is often not as effective as desired. In this paper, Descriptive Visual Words (DVWs) and Descriptive Visual Phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed of the visual words and their combinations which are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for various applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive of certain scenes or objects are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive, and thus more comparable with text words than classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicate image retrieval, image search re-ranking, and object recognition. The combination of DVWs and DVPs performs better than the state-of-the-art feature in large-scale near-duplicate image retrieval in terms of accuracy, efficiency, and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art re-ranking algorithm by 12.4% in mean average precision and is about 11 times faster.
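The notion of a visual phrase as a frequently co-occurring visual word pair can be illustrated with a toy miner. Note that the paper additionally requires spatial proximity of the pair within the image; this sketch (all names mine) uses image-level co-occurrence only:

```python
from collections import Counter
from itertools import combinations

def mine_visual_phrases(images, min_support=2):
    """Toy visual-phrase miner: a 'phrase' is a pair of visual words that
    co-occurs in at least `min_support` images. `images` is a list of sets
    of visual-word ids (one set per image)."""
    pair_counts = Counter()
    for words in images:
        # Sorting makes each unordered pair canonical, e.g. always (1, 2).
        for pair in combinations(sorted(words), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}
```

The surviving pairs play the role text phrases play in document retrieval: they are rarer and more discriminative than single words.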
- S. Zhang, Q. Tian, G. Hua, Q. Huang, and S. Li. Descriptive Visual Words and Visual Phrases for Image Applications. ACM Multimedia, 2009.
- D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. Proc. CVPR, pp. 2161-2168, 2006.
- D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2): 91-110, Nov. 2004.
03/24/10 Properties for Tag Distribution of Flickr Images & Topic Formulation for Image Annotation
Speaker: Xianming Liu
Does there exist a compact set of visual topics, in the form of keyword clusters, capable of representing the visual content of all images within an acceptable error? In this talk, we answer this question by analyzing distribution laws for keywords from image descriptions and comparing them with traditional techniques in NLP, and thereby propose three assumptions: the Sparse Distribution Attribute, the Local Convergent Assumption, and the Global Convergent Conjecture. They are essential for keyword selection and image content understanding to overcome the semantic gap. Experiments are performed on 60,000 web-crawled images, and the correctness of the assumptions is validated by the performance.
Based on these three properties, a further application is developed that aims at finding topics and selecting keywords for image annotation and retrieval. In addition, considering the diversity of visual feature distributions, a spectral-clustering-based visual topic model is introduced in order to find more compact topics for images on the web.
- Xianming Liu, Hongxun Yao, Rongrong Ji, Pengfei Xu, Xiaoshuai Sun. What is a complete set of keywords for image description & annotation on the web? Proceedings of the seventeenth ACM International Conference on Multimedia, pp. 613-616, 2009.
- Xianming Liu, Hongxun Yao, Rongrong Ji, Exploring Statistical Properties for Semantic Annotation: Sparse Distributed and Convergent Assumptions for Keywords, Proc. ICASSP 2010
04/02/10 Query Expansion for Hash-based Image Object Retrieval
Speaker: Xia Li
With the exponential growth of photos and videos, hash-based indexing methods have been shown efficient and essential for content-based image retrieval. In this work, the authors identified the low-recall problem of hash-based image object search, and proposed two novel expansion strategies: intra-expansion and inter-expansion. The experiments on two consumer-photo benchmarks demonstrated the proposed expansion methods are complementary to each other and can collaboratively improve the performance significantly over the original hash-based method.
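The hashing underlying this line of work is the p-stable LSH scheme of the cited Datar et al. paper. A minimal sketch of one such hash function (the function name and parameter choices are illustrative, not from either paper):

```python
import random

def make_lsh_hash(dim, w, seed=None):
    """One p-stable LSH function (Datar et al. style):
    h(v) = floor((a . v + b) / w), with a drawn from a Gaussian (2-stable)
    distribution and b uniform in [0, w). Nearby vectors fall into the same
    width-w bucket with high probability; distant vectors rarely do."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)
    def h(v):
        return int((sum(ai * vi for ai, vi in zip(a, v)) + b) // w)
    return h
```

In practice many such functions are concatenated and repeated across tables; the low-recall problem the talk addresses arises because true neighbors can still miss every bucket, which is what the expansion strategies compensate for.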
- Y. Kuo, K. Chen, C. Chiang, W. H. Hsu. Query Expansion for Hash-based Image Object Retrieval. ACM MM 2009.
- M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. SoCG 2004.
- Q. Lv et al. Multi-probe LSH: efficient indexing for high-dimensional similarity search. VLDB 2007.
04/09/10 Distribution-based Microdata Anonymization
Speaker: Hongwei Tian
Before sharing to support ad hoc aggregate analyses, microdata often need to be anonymized to protect the privacy of individuals. A variety of privacy models have been proposed for microdata anonymization. Many of these models (e.g., t-closeness) essentially require that, after anonymization, groups of sensitive attribute values follow specified distributions. To support such models, this paper studies the problem of transforming a group of sensitive attribute values to follow a certain target distribution with minimal data distortion. Specifically, this paper develops and evaluates a novel methodology that combines the use of sensitive attribute permutation and generalization with the addition of fake sensitive attribute values to achieve this transformation. This paper identifies metrics related to accuracy of aggregate query answers over the transformed data, and develops efficient anonymization algorithms to optimize these accuracy metrics. Using a variety of data sets, this paper experimentally demonstrates the effectiveness of its techniques.
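Privacy models such as t-closeness measure how far a group's sensitive-value distribution is from the target using the Earth Mover's Distance, and for a one-dimensional ordered attribute this reduces to prefix sums. A small sketch (my own formulation, normalized by m-1 as t-closeness does for ordered attributes):

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two discrete distributions p and q over
    the same ordered domain of m >= 2 values (e.g., an ordered sensitive
    attribute). For adjacent-step ground distance this is the sum of absolute
    prefix-sum differences, normalized so the maximum distance is 1."""
    assert len(p) == len(q) >= 2
    total, carry = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi      # mass that must still be moved rightward
        total += abs(carry)
    return total / (len(p) - 1)
```

Transforming a group to follow a target distribution with minimal distortion, as the paper does, amounts to driving this kind of distance to zero while perturbing the data as little as possible.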
Reference: Nick Koudas, Divesh Srivastava, Ting Yu, and Qing Zhang. Distribution-based microdata anonymization. Proceedings of the VLDB Endowment, vol. 2, 2009.
04/16/10 Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
Speaker: Jia Meng
The problem of uncovering transcriptional regulation by transcription factors (TFs) based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM) is proposed that models the unknown TF protein-level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing databases regarding TF-regulated target genes through a sparse prior, and via a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on simulated systems, and the results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to breast cancer microarray data of patients with Estrogen Receptor positive (ER+) status and Estrogen Receptor negative (ER-) status, respectively.
Reference: J. Meng, J. Zhang, Y. Cheng, Y. Huang. Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model, submitted to Genomic Signal Processing.
04/23/10 In Quest of A Foundation for Cyber Security
Speaker: Shouhuai Xu
Questions and Comments?