Contextualized text OLAP based on information retrieval
International Journal of Data Warehousing and Mining (IJDWM), 2015•igi-global.com
Abstract
Current data warehousing and On-Line Analytical Processing (OLAP) systems are not yet well suited to textual data analysis. It is therefore crucial to develop a new data model and an OLAP system that provide the analyses textual data requires. To achieve this objective, this paper proposes a new approach based on information retrieval (IR) techniques. Moreover, several contextual factors may significantly affect which information is relevant to a decision-maker, so the paper proposes taking contextual factors into account in an OLAP system to provide relevant results. It provides a generalized approach to Text OLAP analysis that consists of two parts. The first is a context-based text cube model, denoted CXT-Cube, which is characterized by several contextual dimensions. During the OLAP analysis process, CXT-Cube exploits this contextual information to better capture the semantics of textual data. The work also associates with CXT-Cube a new text analysis measure based on an OLAP-adapted vector space model and a relevance propagation technique. The second part is an OLAP aggregation operator called ORank (OLAP-Rank), which aggregates textual data in an OLAP environment while considering relevant contextual factors. To account for the user context, the paper proposes a query expansion method based on a decision-maker profile. Using IR metrics, the paper evaluates the proposed aggregation operator in different cases with several data analysis queries. The evaluation shows that the precision of the system is significantly better than that of a Text OLAP system based on classical IR, which the authors attribute to the consideration of contextual factors.
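The abstract names three ingredients: a vector space model, profile-based query expansion, and ranking of documents within an OLAP cell. The following is a minimal Python sketch of how such pieces could fit together. It is not the paper's CXT-Cube measure or its ORank operator, and it omits relevance propagation entirely. The function names (`expand_query`, `rank_cell`), the smoothed TF-IDF weighting, the toy facts, and the use of a single "year" dimension are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would normalize further).
    return text.lower().split()


def build_idf(token_lists):
    # Smoothed inverse document frequency over all documents in the cube.
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    # Sparse TF-IDF vector; terms unseen in the collection are dropped.
    tf = Counter(tokens)
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def expand_query(query_tokens, profile_terms):
    # Hypothetical expansion: append profile terms not already in the query.
    return query_tokens + [t for t in profile_terms if t not in query_tokens]


def rank_cell(cell_docs, query_tokens, idf):
    # Rank the documents of one cube cell by cosine similarity to the query.
    q = tfidf(query_tokens, idf)
    scored = [(doc_id, cosine(q, tfidf(tokens, idf))) for doc_id, tokens in cell_docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy fact table: (document id, year dimension, text). Invented data.
facts = [
    ("d1", 2014, "olap cube aggregation for sales data"),
    ("d2", 2014, "text retrieval with vector space ranking"),
    ("d3", 2015, "football match results"),
]

# Group documents into cells along the single "year" dimension.
cells = defaultdict(list)
for doc_id, year, text in facts:
    cells[year].append((doc_id, tokenize(text)))

idf = build_idf([tokens for docs in cells.values() for _, tokens in docs])

# The decision-maker profile is a plain list of preferred terms (assumption).
expanded = expand_query(tokenize("retrieval"), ["ranking", "text"])
ranked = rank_cell(cells[2014], expanded, idf)
print(ranked)
```

In this sketch the profile only adds terms to the query. Weighting profile terms differently from the user's own terms, or using more contextual dimensions than a single year, would be natural extensions, but the abstract does not say how the paper handles either.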
IGI Global