Semantic query optimization in an automata-algebra combined XQuery engine over XML streams

H Su, EA Rundensteiner, M Mani - … conference on Very large data bases …, 2004 - vldb.org
Proceedings of the Thirtieth international conference on Very large data bases …, 2004 - vldb.org
Our Raindrop framework [6, 9] aims to tackle the challenges of stream processing that are particular to XML. In contrast to tuple-based or object-based data streams, XML streams are usually modeled as a sequence of primitive tokens, such as a start tag, an end tag, or a PCDATA item. Unlike a self-contained tuple or object, whose semantics are completely determined by its own values, a token lacks semantics without the context provided by other tokens in the stream. This poses specific challenges for query processing over such XML streams.
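To make the token model concrete, the following is a minimal sketch in Python, not Raindrop's implementation. It flattens a small document into start-tag, end-tag, and PCDATA tokens using the standard `xml.sax` module, then recovers the context of each PCDATA token with a stack of open tags. The names `TokenCollector`, `tokenize`, and `with_context`, and the sample document, are assumptions introduced only for illustration.

```python
import xml.sax


class TokenCollector(xml.sax.ContentHandler):
    """Flattens an XML document into (kind, value) tokens (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def startElement(self, name, attrs):
        self.tokens.append(("start", name))

    def endElement(self, name):
        self.tokens.append(("end", name))

    def characters(self, content):
        # Skip whitespace-only text. A SAX parser may deliver long text in
        # several chunks; this sketch assumes short, single-chunk text.
        text = content.strip()
        if text:
            self.tokens.append(("pcdata", text))


def tokenize(xml_text):
    """Return the document as a flat list of primitive tokens."""
    handler = TokenCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.tokens


def with_context(tokens):
    """Yield (path, text) for each PCDATA token, tracking open tags on a stack."""
    open_tags = []
    for kind, value in tokens:
        if kind == "start":
            open_tags.append(value)
        elif kind == "end":
            open_tags.pop()
        else:
            # The PCDATA token alone carries no meaning; the stack supplies it.
            yield "/" + "/".join(open_tags), value


# Illustrative input, not taken from the paper.
doc = "<bib><book><title>Streams</title><year>2004</year></book></bib>"
tokens = tokenize(doc)
resolved = list(with_context(tokens))
```

Here the token `("pcdata", "2004")` says nothing by itself; only the enclosing start tags seen earlier in the stream identify it as the year of a book.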
State-of-the-Art. Since the automata model was originally designed for matching patterns over strings, it is a natural paradigm for structural pattern retrieval on XML token streams [7, 8, 4]. However, the automata model struggles to balance the expressive power of the queries it can handle against the manageability of its constructs. Either it provides limited “recognizer-like” query capabilities (e.g., [4] gives only boolean answers to XPath expressions rather than constructing the results), or it requires a huge number of states, actions, and transitions, resulting from the low-level description of the patches that provide more query capabilities [8, 7]. In contrast, the algebraic query processing paradigm has proven practical for query optimization because of (1) its modularity in composing a query from individual operators and (2) its support for iterative and thus manageable optimization decisions at several abstraction levels (e.g., logical and physical plans). However, the data model underlying this paradigm assumes sets of self-contained tuples. XML