Extracting logical structures from HTML tables

YS Kim, KH Lee - Computer Standards & Interfaces, 2008 - Elsevier
While HTML is designed mainly for the visual rendering of Web documents, XML is widely
accepted as a standard format for processing and managing information. In particular, it can
embed information about logical structures. To make use of XML, however, the logical
structures of HTML tables must first be extracted and transformed into XML
representations. This paper presents an efficient method for this task, which consists of
two phases: area segmentation and structure analysis. The area segmentation cleans up …