Big data visualization tools

N Bikakis - arXiv preprint arXiv:1801.08336, 2018 - arxiv.org
Data visualization is the presentation of data in a pictorial or graphical format, and a data
visualization tool is the software that generates this presentation. Data visualization provides …

Exploring the role of machine learning in scientific workflows: Opportunities and challenges

A Nouri, PE Davis, P Subedi, M Parashar - arXiv preprint arXiv …, 2021 - arxiv.org
In this survey, we discuss the challenges of executing scientific workflows as well as existing
Machine Learning (ML) techniques to alleviate those challenges. We provide the context …

How to architect a query compiler, revisited

RY Tahboub, GM Essertel, T Rompf - Proceedings of the 2018 …, 2018 - dl.acm.org
To leverage modern hardware platforms to their fullest, more and more database systems
embrace compilation of query plans to native code. In the research community, there is an …

Filter before you parse: Faster analytics on raw data with sparser

S Palkar, F Abuzaid, P Bailis, M Zaharia - Proceedings of the VLDB …, 2018 - dl.acm.org
Exploratory big data applications often run on raw unstructured or semi-structured data
formats, such as JSON files or text logs. These applications can spend 80-90% of their …

Fast queries over heterogeneous data through engine customization

M Karpathiotakis, I Alagiannis, A Ailamaki - Proceedings of the VLDB …, 2016 - dl.acm.org
Industry and academia are continuously becoming more data-driven and data-intensive,
relying on the analysis of a wide variety of heterogeneous datasets to gain insights. The …

The case for heterogeneous HTAP

R Appuswamy, M Karpathiotakis… - … on Innovative Data …, 2017 - infoscience.epfl.ch
Modern database engines balance the demanding requirements of mixed, hybrid
transactional and analytical processing (HTAP) workloads by relying on i) global shared …

Architecting data lake-houses in the cloud: Best practices and future directions

A Nuthalapati - Int. J. Sci. Res. Arch, 2024 - repository-ijsra.com
As the volume of data has grown exponentially, what this means for organisations are
significant opportunities and challenges. Three significant challenges face traditional data …

Byteslice: Pushing the envelop of main memory data processing with a new storage layout

Z Feng, E Lo, B Kao, W Xu - Proceedings of the 2015 ACM SIGMOD …, 2015 - dl.acm.org
Scan and lookup are two core operations in main memory column stores. A scan operation
scans a column and returns a result bit vector that indicates which records satisfy a filter …

Just-in-time data virtualization: Lightweight data management with ViDa

M Karpathiotakis, I Alagiannis, T Heinis… - Proceedings of the …, 2015 - infoscience.epfl.ch
As the size of data and its heterogeneity increase, traditional database system architecture
becomes an obstacle to data analysis. Integrating and ingesting (loading) data into …

Adaptive partitioning and indexing for in situ query processing

M Olma, M Karpathiotakis, I Alagiannis… - The VLDB Journal, 2020 - Springer
The constant flux of data and queries alike has been pushing the boundaries of data
analysis systems. The increasing size of raw data files has made data loading an expensive …