Yanlei Diao
Location: Amherst, Massachusetts
Personal Research Web Page: http://www.cs.umass.edu/~yanlei/
Keywords: Information architectures, data management systems, data streams, uncertain data management, sensor databases, flash databases, data dissemination, XML query processing.
Posted on: Tuesday, June 2nd, 2009
Broad Research Area: Databases / Information Retrieval / Data Mining, Information Systems / Information Science
Research Interests:
Uncertain Data Streams.
The goal of this project is to design and develop a stream processing system that captures data uncertainty from data collection to query processing to final result generation. Such uncertain data stream processing is crucial to many real-world applications such as hazardous weather monitoring, object tracking and monitoring, and traffic monitoring. To achieve this goal, our project takes a principled approach grounded in probability and statistical theory, supporting uncertainty as a first-class citizen and integrating it efficiently into high-volume stream processing. The project addresses technical challenges that arise in capturing the uncertainty of raw data streams as well as in capturing uncertainty as data propagates through various query processing operators.
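As a rough illustration of uncertainty propagating through a query operator, the sketch below averages a sliding window of uncertain readings. It assumes each reading is an independent Gaussian described by a mean and a variance. That modeling choice, and the names UncertainValue, window_average, and sliding_average, are hypothetical and are not taken from the project itself.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, Iterator, Sequence


@dataclass(frozen=True)
class UncertainValue:
    """A reading modeled as a Gaussian (assumption made for this sketch)."""
    mean: float
    variance: float


def window_average(readings: Sequence[UncertainValue]) -> UncertainValue:
    """Average independent Gaussian readings.

    For independent X_1..X_n, the average has mean sum(mean_i) / n
    and variance sum(var_i) / n**2.
    """
    n = len(readings)
    if n == 0:
        raise ValueError("window must contain at least one reading")
    mean = sum(r.mean for r in readings) / n
    variance = sum(r.variance for r in readings) / (n * n)
    return UncertainValue(mean, variance)


def sliding_average(stream: Iterable[UncertainValue], size: int) -> Iterator[UncertainValue]:
    """Emit one uncertain average per full sliding window of `size` readings."""
    window = deque(maxlen=size)
    for reading in stream:
        window.append(reading)
        if len(window) == size:
            yield window_average(list(window))
```

Under the independence assumption the variance of the average shrinks as the window grows. Correlated readings would need covariance terms that this sketch leaves out.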
Complex Event Processing.
We study stream processing in the context of large-scale event-based systems that are gaining adoption in applications such as supply chain management, surveillance, network and application monitoring, and environmental monitoring. These systems create high volumes of events. End applications require these events to be filtered and correlated for complex pattern detection, aggregated on different temporal and geographic scales, and transformed into new events that reach a semantic level appropriate for the applications. We address issues involved in stream-based event processing ranging from the query language to computational complexity to fast implementation.
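To make "filtered and correlated for complex pattern detection" concrete, here is a minimal sketch of one common pattern: an event of one kind followed by an event of another kind, for the same key, within a time window. The Event fields, the detect_sequence function, and the event kinds in the usage example are hypothetical illustrations, not the query language or engine developed in this project. The sketch also assumes events arrive in timestamp order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str         # event type, e.g. "shelf_read" (hypothetical)
    key: str          # correlation key, e.g. an item id
    timestamp: float  # seconds


def detect_sequence(
    events: Iterable[Event],
    first_kind: str,
    second_kind: str,
    window: float,
) -> Iterator[Tuple[Event, Event]]:
    """Yield (first, second) pairs with equal keys where the second event
    occurs no more than `window` seconds after the first.

    Assumes `events` is ordered by timestamp.
    """
    pending: Dict[str, List[Event]] = {}
    for event in events:
        if event.kind == first_kind:
            # Filter: remember candidate starts of the pattern per key.
            pending.setdefault(event.key, []).append(event)
        elif event.kind == second_kind:
            # Correlate: keep only candidates still inside the window.
            live = [
                e for e in pending.get(event.key, [])
                if event.timestamp - e.timestamp <= window
            ]
            pending[event.key] = live
            for first in live:
                yield (first, event)
```

For example, calling detect_sequence(stream, "shelf_read", "exit_read", 10.0) would report items seen at an exit within ten seconds of a shelf reading. This sketch only discards expired candidates when a matching second event arrives, so a production engine would need more careful state management.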
RFID Data Management.
Radio Frequency Identification (RFID) technology is gaining acceptance in an increasing number of applications for tracking and monitoring purposes. Despite its promise to provide unprecedented visibility in various domains, RFID technology presents numerous challenges, including incomplete and noisy data, lack of information about inter-object relationships, and high data volumes. In this project, we design and develop an inference and query processing system over RFID streams. Our system offers accurate interpretation of incomplete and insufficient raw data by inferring the locations of unobserved objects and inter-object relationships such as collocation and containment. It further performs complex query processing over the inferred data streams and scales such processing to multiple data centers and numerous objects.
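One simple way to picture containment inference is a co-occurrence heuristic: an item is guessed to be inside the container whose tag is most often read together with it. The sketch below is only that heuristic, under the assumption that readings arrive as (reader id, time slot, tag id) tuples and that container tags are known in advance. It is not the inference technique used in this project, and infer_containment is a hypothetical name.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

Reading = Tuple[str, int, str]  # (reader_id, time_slot, tag_id), assumed layout


def infer_containment(readings: Iterable[Reading], containers: Set[str]) -> Dict[str, str]:
    """Guess each item's container as the container tag it co-occurs with most.

    Two tags co-occur when the same reader sees both in the same time slot.
    Items never seen together with any container are omitted from the result.
    """
    # Group tags by the (reader, time slot) in which they were observed.
    groups = defaultdict(set)
    for reader_id, time_slot, tag_id in readings:
        groups[(reader_id, time_slot)].add(tag_id)

    # Count how often each item appears alongside each container.
    counts: Dict[str, Counter] = defaultdict(Counter)
    for tags in groups.values():
        seen_containers = tags & containers
        for item in tags - containers:
            for container in seen_containers:
                counts[item][container] += 1

    return {item: c.most_common(1)[0][0] for item, c in counts.items()}
```

Because missed reads are common, a count-based guess like this can be wrong when evidence is sparse. A probabilistic model over time would handle noise more gracefully.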
Flash Databases.
Flash memories are in ubiquitous use for storage on sensor nodes and mobile devices, and are anticipated to soon play an important role in enterprise servers due to their superior read performance and energy efficiency. However, flash memories have read and write characteristics fundamentally different from those of magnetic disks. In this project, we examine the benefits and challenges of building flash databases in several contexts. We first design a new sensor database on flash that supports power-constrained processing and multi-resolution storage. We further consider the use of flash in high-end computing, such as in large data centers, and address issues ranging from index construction to query processing to data aging.
Contact Information:
Phone: 413.545.1135
Address:
Department of Computer Science Room 232
140 Governors Drive
Amherst, MA 01003-9264
