FlashProfile: Interactive Synthesis of Syntactic Profiles.
CoRR, 2017 • escholarship.org
Abstract
We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats. Thus any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify various formats is infeasible in standard big-data scenarios.
We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns. Our implementation, FlashProfile, shows a median profiling time of 0.7 s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allows for more accurate synthesis of programs, using fewer examples in programming-by-example workflows.
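As a rough illustration of what "a set of succinct patterns that describe the entire dataset" can look like, here is a minimal Python sketch. It is not FlashProfile's algorithm: the atom set `ATOMS`, the helper `pattern_of`, and the grouping function `toy_profile` are all hypothetical names introduced for this example, and the sketch simply groups strings by their sequence of character-class runs. It does not model user-defined patterns or the interactive refinement described in the abstract.

```python
import re
from collections import defaultdict

# Hypothetical atom set (an assumption for this sketch, not taken from the paper):
# each atom is a name plus a regex character class.
ATOMS = [
    ("Digit", r"[0-9]"),
    ("Upper", r"[A-Z]"),
    ("Lower", r"[a-z]"),
    ("Space", r"\s"),
]


def pattern_of(s):
    """Map a string to a coarse pattern.

    Each maximal run of one character class becomes "<Class>+";
    any other character is kept as a quoted literal.
    """
    tokens = []
    i = 0
    while i < len(s):
        for name, cls in ATOMS:
            m = re.match(f"{cls}+", s[i:])
            if m:
                tokens.append(f"{name}+")
                i += m.end()
                break
        else:
            # No atom matched: keep the character as a literal token.
            tokens.append(repr(s[i]))
            i += 1
    return " ".join(tokens)


def toy_profile(strings):
    """Group strings by pattern; the keys form a (toy) syntactic profile."""
    groups = defaultdict(list)
    for s in strings:
        groups[pattern_of(s)].append(s)
    return dict(groups)


if __name__ == "__main__":
    data = ["2017-01-02", "1999-12-31", "Jan 2, 2017"]
    for pattern, members in toy_profile(data).items():
        print(pattern, "<-", members)
```

In this toy version the number of patterns is fixed by the data. A refinement step in the spirit of the abstract would merge or split groups until a requested number of patterns is reached, which this sketch leaves out.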