Query processing on tensor computation runtimes
The huge demand for computation in artificial intelligence (AI) is driving unparalleled
investments in hardware and software systems for AI. This leads to an explosion in the …
GPU Database Systems Characterization and Optimization
GPUs offer massive parallelism and high-bandwidth memory access, making them an
attractive option for accelerating data analytics in database systems. However, while modern …
Auto-differentiation of relational computations for very large scale machine learning
The relational data model was designed to facilitate large-scale data management and
analytics. We consider the problem of how to differentiate computations expressed …
Joinboost: Grow trees over normalized data using only SQL
Although dominant for tabular data, ML libraries that train tree models over normalized
databases (e.g., LightGBM, XGBoost) require the data to be denormalized as a single table …
The tensor data platform: Towards an ai-centric database system
Database engines have historically absorbed many of the innovations in data processing,
adding features to process graph data, XML, object oriented, and text among many others. In …
MaskSearch: Querying Image Masks at Scale
Machine learning tasks over image databases often generate masks that annotate image
content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of …
Bullion: A Column Store for Machine Learning
The past two decades have witnessed columnar storage revolutionizing data warehousing
and analytics. However, the rapid growth of machine learning poses new challenges to this …
The Duck's Brain: Training and Inference of Neural Networks in Modern Database Engines
Although database systems perform well in data access and manipulation, their relational
model hinders data scientists from formulating machine learning algorithms in SQL …
Random Forests over normalized data in CPU-GPU DBMSes
This short paper studies query execution based on message passing on CPU-GPU systems,
using random forests training as the workload. We investigate different data placement and …
Teaching Blue Elephants the Maths for Machine Learning
C Ruck, ME Schüle - Proceedings of the Seventh Workshop on Data …, 2023 - dl.acm.org
Code-generation suits well for reverse mode automatic differentiation as it stores each
partial derivative as a virtual register. Since the introduction of just-in-time compilation in …