Conference on Intelligent Data Understanding 2010 (CIDU)

KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS

Shared by Elizabeth Foughty, updated on Oct 13, 2010

Summary

Abstract

BOLIN DING, YINTAO YU, BO ZHAO, CINDY XIDE LIN, JIAWEI HAN, AND CHENGXIANG ZHAI

Abstract. We study the problem of keyword search in a data cube with text-rich dimension(s)
(a so-called text cube). The text cube is built on a multidimensional text database, where each row
is associated with some text data (e.g., a document) and other structural dimensions (attributes).
A cell in the text cube aggregates a set of documents with matching attribute values in a subset
of dimensions. A cell document is the concatenation of all documents in a cell. Given a keyword
query, our goal is to find the top-k most relevant cells (ranked according to the relevance scores of
cell documents w.r.t. the given query) in the text cube.
We define a keyword-based query language and apply an IR-style relevance model for scoring and
ranking cell documents in the text cube. We propose two efficient approaches to find the top-k
answers. The proposed approaches support a general class of IR-style relevance scoring formulas
that satisfy certain basic and common properties. One approach spends more time on pre-processing
and less time on answering online queries; the other is more efficient in pre-processing but
takes more time on online queries. Experimental studies on the ASRS dataset are conducted
to verify the efficiency and effectiveness of the proposed approaches.
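
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch of the problem setting. It is not either of the two approaches proposed in the paper, which the abstract does not describe. The toy rows, the dimension names, the `*` marker for an aggregated dimension, and the length-normalized term-frequency score are all illustrative assumptions standing in for a real multidimensional text database and a real IR-style relevance formula.

```python
from collections import defaultdict
from itertools import combinations
import heapq

# Hypothetical toy database: each row has structural dimensions plus a text field.
ROWS = [
    {"phase": "landing", "weather": "fog", "text": "runway incursion during landing in fog"},
    {"phase": "landing", "weather": "clear", "text": "hard landing gear warning"},
    {"phase": "takeoff", "weather": "fog", "text": "takeoff delayed low visibility fog"},
    {"phase": "takeoff", "weather": "clear", "text": "engine warning light during takeoff"},
]
DIMS = ("phase", "weather")


def enumerate_cells(rows, dims):
    """Map each cell to the tokens of its cell document.

    A cell is a tuple with one entry per dimension; '*' means the
    dimension is aggregated over (an assumed notation).
    """
    cells = defaultdict(list)
    for row in rows:
        tokens = row["text"].lower().split()
        # Every subset of dimensions defines one cell that contains this row.
        for size in range(len(dims) + 1):
            for subset in combinations(dims, size):
                key = tuple(row[d] if d in subset else "*" for d in dims)
                cells[key].extend(tokens)  # concatenate documents in the cell
    return cells


def relevance(tokens, query_terms):
    """Illustrative stand-in score: query term count divided by document length."""
    if not tokens:
        return 0.0
    return sum(tokens.count(term) for term in query_terms) / len(tokens)


def top_k_cells(rows, dims, query, k):
    """Score every cell document and return the k best (score, cell) pairs."""
    query_terms = query.lower().split()
    cells = enumerate_cells(rows, dims)
    scored = ((relevance(tokens, query_terms), cell) for cell, tokens in cells.items())
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    for score, cell in top_k_cells(ROWS, DIMS, "fog", k=2):
        print(cell, round(score, 3))
```

This sketch materializes every cell, so its cost grows exponentially with the number of dimensions, which is presumably the kind of cost that motivates trading pre-processing time against online query time.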

Publication Name: N/A
Publication Location: N/A
Year Published: N/A

Files

Paper 12 .pdf: KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS

671.4 KB

548 downloads
