KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS
Shared by Elizabeth Foughty, updated on Oct 13, 2010
Summary
- Abstract
BOLIN DING, YINTAO YU, BO ZHAO, CINDY XIDE LIN, JIAWEI HAN, AND CHENGXIANG ZHAI
Abstract. We study the problem of keyword search in a data cube with text-rich dimension(s)
(a so-called text cube). The text cube is built on a multidimensional text database, where each row
is associated with some text data (e.g., a document) and other structural dimensions (attributes).
A cell in the text cube aggregates a set of documents with matching attribute values in a subset
of dimensions. A cell document is the concatenation of all documents in a cell. Given a keyword
query, our goal is to find the top-k most relevant cells (ranked according to the relevance scores of
cell documents w.r.t. the given query) in the text cube.
We define a keyword-based query language and apply an IR-style relevance model for scoring and
ranking cell documents in the text cube. We propose two efficient approaches to find the top-k
answers. The proposed approaches support a general class of IR-style relevance scoring formulas
that satisfy certain basic and common properties. One approach spends more time on pre-processing
and less on answering online queries; the other pre-processes more cheaply but takes longer to
answer online queries. Experimental studies on the ASRS dataset are conducted
to verify the efficiency and effectiveness of the proposed approaches.
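The definitions in the abstract (cell, cell document, top-k ranking) can be illustrated with a brute-force baseline. This is only a sketch under stated assumptions, not either of the two approaches the paper proposes: the toy rows, the dimension names, and the length-normalized term-frequency score are all hypothetical stand-ins, and the function names `build_cells`, `score`, and `top_k_cells` are invented for illustration.

```python
from collections import defaultdict
from heapq import nlargest
from itertools import product

# Hypothetical toy database: each row has structural attributes plus a document.
ROWS = [
    ({"phase": "landing", "weather": "fog"}, "runway incursion during landing in fog"),
    ({"phase": "landing", "weather": "clear"}, "hard landing reported on runway"),
    ({"phase": "takeoff", "weather": "fog"}, "takeoff delayed because of fog"),
]
DIMS = ("phase", "weather")


def build_cells(rows, dims):
    """Map each cell (a value or '*' per dimension) to the documents it aggregates."""
    cells = defaultdict(list)
    for attrs, text in rows:
        # A row falls into 2^len(dims) cells: each dimension is either kept or aggregated.
        for mask in product((True, False), repeat=len(dims)):
            cell = tuple(attrs[d] if keep else "*" for d, keep in zip(dims, mask))
            cells[cell].append(text)
    return cells


def score(cell_document, query_terms):
    """Assumed relevance model: query-term frequency divided by document length."""
    tokens = cell_document.lower().split()
    if not tokens:
        return 0.0
    return sum(tokens.count(t) for t in query_terms) / len(tokens)


def top_k_cells(rows, dims, query, k):
    """Score every cell document (the concatenation of a cell's documents), keep the best k."""
    query_terms = query.lower().split()
    cells = build_cells(rows, dims)
    scored = ((score(" ".join(docs), query_terms), cell) for cell, docs in cells.items())
    return nlargest(k, scored)


if __name__ == "__main__":
    for relevance, cell in top_k_cells(ROWS, DIMS, "fog", k=3):
        print(cell, round(relevance, 3))
```

Because this baseline materializes and scores every cell, its cost grows exponentially with the number of dimensions, which is the kind of cost that motivates trading pre-processing time against online query time.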
- Publication Name
- N/A
- Publication Location
- N/A
- Year Published
- N/A
Files
671.4 KB | 548 downloads