Tensor relational algebra for distributed machine learning system design

B Yuan, D Jankov, J Zou, Y Tang, D Bourgeois… - Proceedings of the …, 2021 - par.nsf.gov
We consider the question: what is the abstraction that should be implemented by the
computational engine of a machine learning system? Current machine learning systems …

Tensor relational algebra for machine learning system design

B Yuan, D Jankov, J Zou, Y Tang, D Bourgeois… - arXiv preprint arXiv …, 2020 - arxiv.org
We consider the question: what is the abstraction that should be implemented by the
computational engine of a machine learning system? Current machine learning systems …

Distributed numerical and machine learning computations via two-phase execution of aggregated join trees

D Jankov, B Yuan, S Luo, C Jermaine - Proceedings of the VLDB …, 2021 - par.nsf.gov
When numerical and machine learning (ML) computations are expressed relationally,
classical query execution strategies (hash-based joins and aggregations) can do a poor job …

Multidimensional array data management

F Rusu - Foundations and Trends® in Databases, 2023 - nowpublishers.com
Multidimensional arrays are a fundamental abstraction to represent data across scientific
domains ranging from astronomy to genetics, medicine, business intelligence, and …

FuseME: Distributed matrix computation engine based on cuboid-based fused operator and plan generation

D Han, J Lee, MS Kim - … of the 2022 International Conference on …, 2022 - dl.acm.org
Operator fusion is essentially and widely used in a large number of matrix computation
systems in science and industry. The existing distributed operator fusion methods focus on …

Fast matrix multiplication via compiler‐only layered data reorganization and intrinsic lowering

B Kuzma, I Korostelev, JPL de Carvalho… - Software: Practice …, 2023 - Wiley Online Library
The resurgence of machine learning has increased the demand for high‐performance basic
linear algebra subroutines (BLAS), which have long depended on libraries to achieve peak …

Redundancy elimination in distributed matrix computation

Z Chen, B Han, C Xu, W Qian, A Zhou - Proceedings of the 2022 …, 2022 - dl.acm.org
As matrix computation becomes increasingly prevalent in large-scale data analysis,
distributed matrix computation solutions have emerged. These solutions support query …

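Research progress on distributed matrix computation systems for big data analysis

陈梓浩, 徐辰, 钱卫宁, 周傲英 - Journal of Software (软件学报), 2022 - jos.org.cn
In big data governance applications, data analysis is an indispensable step, and it is
time-consuming and demands substantial computing resources, so optimizing its execution
efficiency is crucial. Early on, because data volumes were small, data analysts could use
traditional matrix computation tools …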

Efficient matrix computation for SGD-based algorithms on Apache Spark

B Han, Z Chen, C Xu, A Zhou - International Conference on Database …, 2022 - Springer
With the increasing of matrix size in large-scale data analysis, a series of Spark-based
distributed matrix computation systems have emerged. Typically, these systems split a matrix …

Hybrid evaluation for distributed iterative matrix computation

Z Chen, C Xu, J Soto, V Markl, W Qian… - Proceedings of the 2021 …, 2021 - dl.acm.org
Distributed matrix computation is common in large-scale data processing and machine
learning applications. Existing systems that support distributed matrix computation already …