Coalescing adjacent gather/scatter operations

AT Forsyth, BJ Hickmann, JC Hall… - US Patent 9,348,601, 2016 - Google Patents
According to one embodiment, a processor includes an instruction decoder to decode a first
instruction to gather data elements from memory, the first instruction having a first operand …

No-locality hint vector memory access processors, methods, systems, and instructions

CJ Hughes - US Patent 9,600,442, 2017 - Google Patents
A processor of an aspect includes a plurality of packed data registers, and
a decode unit to decode a no-locality hint vector memory access instruction. The no-locality …

Scatter using index array and finite state machine

Z Sperber, R Valentine, S Raikin… - US Patent …, 2017 - Google Patents
Methods and apparatus are disclosed using an index array and finite state machine for
scatter/gather operations. Embodiment of apparatus may comprise: decode logic to decode …

Scatter/gather accessing multiple cache lines in a single cache port

JC Hall, S Kottapalli, AT Forsyth - US Patent App. 13/250,223, 2012 - Google Patents
Modern processors often include instructions to provide operations that are
computationally intensive, but offer a high level of data parallelism that can be exploited …

Transposition operation device, integrated circuit for the same, and transposition method

T Nishimura, H Morishita - US Patent 9,201,899, 2015 - Google Patents
A transposition operation device includes: a register group storing a matrix
of data such that elements are readable one at a time; an output data rearrangement unit …

Hardware prefetcher for indirect access patterns

X Yu, CJ Hughes, NR Satish - US Patent 9,582,422, 2017 - Google Patents
Two techniques address bottlenecking in processors. The first is indirect prefetching. The
technique can be especially useful for graph analytics and sparse matrix applications. For …

Facilitating efficient prefetching for scatter/gather operations

S Kapil, DJ Gove - US Patent 9,817,762, 2017 - Google Patents
The disclosed embodiments relate to a computing system that facilitates performing
prefetching for scatter/gather operations. During operation, the system receives a …

Gathering and scattering multiple data elements

CJ Hughes, YK Chen, M Bomb, JW Brandt… - US Patent …, 2019 - Google Patents
According to a first aspect, efficient data transfer operations can be achieved by: decoding
by a processor device, a single instruction specifying a transfer operation for a plurality of …

No-locality hint vector memory access processors, methods, systems, and instructions

CJ Hughes - US Patent 10,210,091, 2019 - Google Patents
A processor of an aspect includes a plurality of packed data registers, and
a decode unit to decode a no-locality hint vector memory access instruction. The no-locality …

Systems, apparatuses, and methods for performing a conversion of a writemask register to a list of index values in a vector register

E Ould-Ahmed-Vall, T Willhalm, GT Drysdale - US Patent 9,454,507, 2016 - Google Patents
Embodiments of systems, apparatuses, and methods for performing in a computer processor
conversion of a mask register into a list of index values in response to a single vector packed …