NWChem: Past, present, and future E Apra, EJ Bylaska, WA De Jong, N Govind, K Kowalski, TP Straatsma, ... The Journal of chemical physics 152 (18), 2020 | 553 | 2020 |
MCM-GPU: Multi-chip-module GPUs for continued performance scalability A Arunkumar, E Bolotin, B Cho, U Milic, E Ebrahimi, O Villa, A Jaleel, ... ACM SIGARCH Computer Architecture News 45 (2), 320-332, 2017 | 241 | 2017 |
Dynamic load balancing on single- and multi-GPU systems L Chen, O Villa, S Krishnamoorthy, GR Gao 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 223 | 2010 |
NWChem E Apra, EJ Bylaska, WA de Jong, N Govind, K Kowalski, TP Straatsma, ... American Institute of Physics, 2020 | 215 | 2020 |
Scaling the power wall: a path to exascale O Villa, DR Johnson, M O'Connor, E Bolotin, D Nellans, J Luitjens, ... SC'14: Proceedings of the International Conference for High Performance …, 2014 | 170 | 2014 |
NVBit: A dynamic binary instrumentation framework for NVIDIA GPUs O Villa, M Stephenson, D Nellans, SW Keckler Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 149 | 2019 |
Efficient breadth-first search on the Cell/BE processor DP Scarpazza, O Villa, F Petrini IEEE Transactions on Parallel and Distributed Systems 19 (10), 1381-1395, 2008 | 100 | 2008 |
Exploration of distributed shared memory architectures for NoC-based multiprocessors M Monchiero, G Palermo, C Silvano, O Villa Journal of Systems Architecture 53 (10), 719-732, 2007 | 100 | 2007 |
GPU-based implementations of the noniterative regularized-CCSD(T) corrections: applications to strongly correlated systems W Ma, S Krishnamoorthy, O Villa, K Kowalski Journal of chemical theory and computation 7 (5), 1316-1327, 2011 | 91 | 2011 |
Beyond the socket: NUMA-aware GPUs U Milic, O Villa, E Bolotin, A Arunkumar, E Ebrahimi, A Jaleel, A Ramirez, ... Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 89 | 2017 |
Active-space completely-renormalized equation-of-motion coupled-cluster formalism: Excited-state studies of green fluorescent protein, free-base porphyrin, and oligoporphyrin dimer K Kowalski, S Krishnamoorthy, O Villa, JR Hammond, N Govind The Journal of chemical physics 132 (15), 2010 | 72 | 2010 |
Combining HW/SW mechanisms to improve NUMA performance of multi-GPU systems V Young, A Jaleel, E Bolotin, E Ebrahimi, D Nellans, O Villa 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture …, 2018 | 69 | 2018 |
NVBitFI: Dynamic fault injection for GPUs T Tsai, SKS Hari, M Sullivan, O Villa, SW Keckler 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems …, 2021 | 68 | 2021 |
Peak-performance DFA-based string matching on the Cell processor DP Scarpazza, O Villa, F Petrini 2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007 | 67 | 2007 |
Accelerating real-time string searching with multicore processors O Villa, DP Scarpazza, F Petrini Computer 41 (4), 42-50, 2008 | 65 | 2008 |
Efficient synchronization for embedded on-chip multiprocessors M Monchiero, G Palermo, C Silvano, O Villa IEEE Transactions on very large scale integration (VLSI) systems 14 (10 …, 2006 | 64 | 2006 |
Efficiency and scalability of barrier synchronization on NoC-based many-core architectures O Villa, G Palermo, C Silvano Proceedings of the 2008 international conference on Compilers, architectures …, 2008 | 57 | 2008 |
Efficient pattern matching on GPUs for intrusion detection systems A Tumeo, O Villa, D Sciuto Proceedings of the 7th ACM international conference on Computing frontiers …, 2010 | 56 | 2010 |
Aho-Corasick string matching on shared and distributed-memory parallel architectures A Tumeo, O Villa, DG Chavarría-Miranda IEEE Transactions on Parallel and Distributed Systems 23 (3), 436-443, 2011 | 53 | 2011 |
Optimizing tensor contraction expressions for hybrid CPU-GPU execution W Ma, S Krishnamoorthy, O Villa, K Kowalski, G Agrawal Cluster computing 16, 131-155, 2013 | 50 | 2013 |