MMX === Shift ----- XMM ~~~ _mm_sll_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m64 _mm_sll_pi16(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << count[63:0]) FI ENDFOR _mm_slli_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI16 a, IMM imm8 .. code-block:: C __m64 _mm_slli_pi16(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << imm8[7:0]) FI ENDFOR _mm_sll_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m64 _mm_sll_pi32(__m64 a, __m64 count); .. 
admonition:: Intel Description Shift packed 32-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << count[63:0]) FI ENDFOR _mm_slli_pi32 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI32 a, IMM imm8 .. code-block:: C __m64 _mm_slli_pi32(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << imm8[7:0]) FI ENDFOR _mm_sll_si64 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _mm_sll_si64(__m64 a, __m64 count); .. admonition:: Intel Description Shift 64-bit integer "a" left by "count" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. 
admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF count[63:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] << count[63:0]) FI _mm_slli_si64 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _mm_slli_si64(__m64 a, int imm8); .. admonition:: Intel Description Shift 64-bit integer "a" left by "imm8" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF imm8[7:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] << imm8[7:0]) FI _mm_sra_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m64 _mm_sra_pi16(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _mm_srai_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI16 a, IMM imm8 .. 
code-block:: C __m64 _mm_srai_pi16(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _mm_sra_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m64 _mm_sra_pi32(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := (a[i+31] ? 0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _mm_srai_pi32 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI32 a, IMM imm8 .. code-block:: C __m64 _mm_srai_pi32(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. 
deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := (a[i+31] ? 0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _mm_srl_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m64 _mm_srl_pi16(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _mm_srli_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI16 a, IMM imm8 .. code-block:: C __m64 _mm_srli_pi16(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _mm_srl_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m64 _mm_srl_pi32(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _mm_srli_pi32 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI32 a, IMM imm8 .. code-block:: C __m64 _mm_srli_pi32(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _mm_srl_si64 ^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _mm_srl_si64(__m64 a, __m64 count); .. admonition:: Intel Description Shift 64-bit integer "a" right by "count" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF count[63:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] >> count[63:0]) FI _mm_srli_si64 ^^^^^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _mm_srli_si64(__m64 a, int imm8); .. admonition:: Intel Description Shift 64-bit integer "a" right by "imm8" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF imm8[7:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] >> imm8[7:0]) FI MMX ~~~ _m_psllw ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psllw(__m64 a, __m64 count); .. 
admonition:: Intel Description Shift packed 16-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << count[63:0]) FI ENDFOR _m_psllwi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _m_psllwi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << imm8[7:0]) FI ENDFOR _m_pslld ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_pslld(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. 
admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << count[63:0]) FI ENDFOR _m_pslldi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _m_pslldi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << imm8[7:0]) FI ENDFOR _m_psllq ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psllq(__m64 a, __m64 count); .. admonition:: Intel Description Shift 64-bit integer "a" left by "count" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF count[63:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] << count[63:0]) FI _m_psllqi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. 
code-block:: C __m64 _m_psllqi(__m64 a, int imm8); .. admonition:: Intel Description Shift 64-bit integer "a" left by "imm8" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF imm8[7:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] << imm8[7:0]) FI _m_psraw ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psraw(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _m_psrawi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: SI64 a, IMM imm8 .. code-block:: C __m64 _m_psrawi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. 
Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _m_psrad ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: SI64 a, UI64 count .. code-block:: C __m64 _m_psrad(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := (a[i+31] ? 0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _m_psradi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: SI64 a, IMM imm8 .. code-block:: C __m64 _m_psradi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := (a[i+31] ? 
0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _m_psrlw ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psrlw(__m64 a, __m64 count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _m_psrlwi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _m_psrlwi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _m_psrld ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psrld(__m64 a, __m64 count); .. 
admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _m_psrldi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _m_psrldi(__m64 a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _m_psrlq ^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m64 _m_psrlq(__m64 a, __m64 count); .. admonition:: Intel Description Shift 64-bit integer "a" right by "count" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. 
admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF count[63:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] >> count[63:0]) FI _m_psrlqi ^^^^^^^^^ :Tech: MMX :Category: Shift :Header: mmintrin.h :Searchable: MMX-Shift-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m64 _m_psrlqi(__m64 a, int imm8); .. admonition:: Intel Description Shift 64-bit integer "a" right by "imm8" while shifting in zeros, and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text IF imm8[7:0] > 63 dst[63:0] := 0 ELSE dst[63:0] := ZeroExtend64(a[63:0] >> imm8[7:0]) FI General Support --------------- XMM ~~~ _mm_empty ^^^^^^^^^ :Tech: MMX :Category: General Support :Header: mmintrin.h :Searchable: MMX-General Support-XMM :Register: XMM 128 bit :Return Type: void .. code-block:: C void _mm_empty(void ); .. admonition:: Intel Description Empty the MMX state, which marks the x87 FPU registers as available for use by x87 instructions. This instruction must be used at the end of all MMX technology procedures. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. MMX ~~~ _m_empty ^^^^^^^^ :Tech: MMX :Category: General Support :Header: mmintrin.h :Searchable: MMX-General Support-MMX :Register: MMX 64 bit :Return Type: void .. code-block:: C void _m_empty(void ); .. admonition:: Intel Description Empty the MMX state, which marks the x87 FPU registers as available for use by x87 instructions. This instruction must be used at the end of all MMX technology procedures. .. 
deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. Logical ------- XMM ~~~ _mm_and_si64 ^^^^^^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _mm_and_si64(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise AND of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := (a[63:0] AND b[63:0]) _mm_andnot_si64 ^^^^^^^^^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _mm_andnot_si64(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise NOT of 64 bits (representing integer data) in "a" and then AND with "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := ((NOT a[63:0]) AND b[63:0]) _mm_or_si64 ^^^^^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _mm_or_si64(__m64 a, __m64 b); .. 
admonition:: Intel Description Compute the bitwise OR of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := (a[63:0] OR b[63:0]) _mm_xor_si64 ^^^^^^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _mm_xor_si64(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise XOR of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := (a[63:0] XOR b[63:0]) MMX ~~~ _m_pand ^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pand(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise AND of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text dst[63:0] := (a[63:0] AND b[63:0]) _m_pandn ^^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pandn(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise NOT of 64 bits (representing integer data) in "a" and then AND with "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := ((NOT a[63:0]) AND b[63:0]) _m_por ^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_por(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise OR of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := (a[63:0] OR b[63:0]) _m_pxor ^^^^^^^ :Tech: MMX :Category: Logical :Header: mmintrin.h :Searchable: MMX-Logical-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pxor(__m64 a, __m64 b); .. admonition:: Intel Description Compute the bitwise XOR of 64 bits (representing integer data) in "a" and "b", and store the result in "dst". .. 
deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text dst[63:0] := (a[63:0] XOR b[63:0]) Swizzle ------- XMM ~~~ _mm_unpackhi_pi8 ^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_unpackhi_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 8-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text DEFINE INTERLEAVE_HIGH_BYTES(src1[63:0], src2[63:0]) { dst[7:0] := src1[39:32] dst[15:8] := src2[39:32] dst[23:16] := src1[47:40] dst[31:24] := src2[47:40] dst[39:32] := src1[55:48] dst[47:40] := src2[55:48] dst[55:48] := src1[63:56] dst[63:56] := src2[63:56] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_HIGH_BYTES(a[63:0], b[63:0]) _mm_unpackhi_pi16 ^^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_unpackhi_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 16-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. ..
admonition:: Intel Implementation Pseudo-Code .. code-block:: text DEFINE INTERLEAVE_HIGH_WORDS(src1[63:0], src2[63:0]) { dst[15:0] := src1[47:32] dst[31:16] := src2[47:32] dst[47:32] := src1[63:48] dst[63:48] := src2[63:48] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_HIGH_WORDS(a[63:0], b[63:0]) _mm_unpackhi_pi32 ^^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI32 a, UI32 b .. code-block:: C __m64 _mm_unpackhi_pi32(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 32-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text dst[31:0] := a[63:32] dst[63:32] := b[63:32] _mm_unpacklo_pi8 ^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_unpacklo_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 8-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text DEFINE INTERLEAVE_BYTES(src1[63:0], src2[63:0]) { dst[7:0] := src1[7:0] dst[15:8] := src2[7:0] dst[23:16] := src1[15:8] dst[31:24] := src2[15:8] dst[39:32] := src1[23:16] dst[47:40] := src2[23:16] dst[55:48] := src1[31:24] dst[63:56] := src2[31:24] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_BYTES(a[63:0], b[63:0]) _mm_unpacklo_pi16 ^^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_unpacklo_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 16-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text DEFINE INTERLEAVE_WORDS(src1[63:0], src2[63:0]) { dst[15:0] := src1[15:0] dst[31:16] := src2[15:0] dst[47:32] := src1[31:16] dst[63:48] := src2[31:16] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_WORDS(a[63:0], b[63:0]) _mm_unpacklo_pi32 ^^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI32 a, UI32 b .. code-block:: C __m64 _mm_unpacklo_pi32(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 32-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text dst[31:0] := a[31:0] dst[63:32] := b[31:0] MMX ~~~ _m_punpckhbw ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpckhbw(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 8-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text DEFINE INTERLEAVE_HIGH_BYTES(src1[63:0], src2[63:0]) { dst[7:0] := src1[39:32] dst[15:8] := src2[39:32] dst[23:16] := src1[47:40] dst[31:24] := src2[47:40] dst[39:32] := src1[55:48] dst[47:40] := src2[55:48] dst[55:48] := src1[63:56] dst[63:56] := src2[63:56] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_HIGH_BYTES(a[63:0], b[63:0]) _m_punpckhwd ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpckhwd(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 16-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text DEFINE INTERLEAVE_HIGH_WORDS(src1[63:0], src2[63:0]) { dst[15:0] := src1[47:32] dst[31:16] := src2[47:32] dst[47:32] := src1[63:48] dst[63:48] := src2[63:48] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_HIGH_WORDS(a[63:0], b[63:0]) _m_punpckhdq ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpckhdq(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 32-bit integers from the high half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text dst[31:0] := a[63:32] dst[63:32] := b[63:32] _m_punpcklbw ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpcklbw(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 8-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text DEFINE INTERLEAVE_BYTES(src1[63:0], src2[63:0]) { dst[7:0] := src1[7:0] dst[15:8] := src2[7:0] dst[23:16] := src1[15:8] dst[31:24] := src2[15:8] dst[39:32] := src1[23:16] dst[47:40] := src2[23:16] dst[55:48] := src1[31:24] dst[63:56] := src2[31:24] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_BYTES(a[63:0], b[63:0]) _m_punpcklwd ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpcklwd(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 16-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text DEFINE INTERLEAVE_WORDS(src1[63:0], src2[63:0]) { dst[15:0] := src1[15:0] dst[31:16] := src2[15:0] dst[47:32] := src1[31:16] dst[63:48] := src2[31:16] RETURN dst[63:0] } dst[63:0] := INTERLEAVE_WORDS(a[63:0], b[63:0]) _m_punpckldq ^^^^^^^^^^^^ :Tech: MMX :Category: Swizzle :Header: mmintrin.h :Searchable: MMX-Swizzle-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_punpckldq(__m64 a, __m64 b); .. admonition:: Intel Description Unpack and interleave 32-bit integers from the low half of "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text dst[31:0] := a[31:0] dst[63:32] := b[31:0] Arithmetic ---------- XMM ~~~ _mm_add_pi8 ^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_add_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 8-bit integers in "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := a[i+7:i] + b[i+7:i] ENDFOR _mm_add_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_add_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 16-bit integers in "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := a[i+15:i] + b[i+15:i] ENDFOR _mm_add_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI32 a, UI32 b .. code-block:: C __m64 _mm_add_pi32(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 32-bit integers in "a" and "b", and store the results in "dst". ..
deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := a[i+31:i] + b[i+31:i] ENDFOR _mm_adds_pi8 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI8 a, SI8 b .. code-block:: C __m64 _mm_adds_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Add packed signed 8-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := Saturate8( a[i+7:i] + b[i+7:i] ) ENDFOR _mm_adds_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_adds_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Add packed signed 16-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := Saturate16( a[i+15:i] + b[i+15:i] ) ENDFOR _mm_adds_pu8 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_adds_pu8(__m64 a, __m64 b); .. admonition:: Intel Description Add packed unsigned 8-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := SaturateU8( a[i+7:i] + b[i+7:i] ) ENDFOR _mm_adds_pu16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_adds_pu16(__m64 a, __m64 b); .. admonition:: Intel Description Add packed unsigned 16-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := SaturateU16( a[i+15:i] + b[i+15:i] ) ENDFOR _mm_sub_pi8 ^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_sub_pi8(__m64 a, __m64 b); ..
admonition:: Intel Description Subtract packed 8-bit integers in "b" from packed 8-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := a[i+7:i] - b[i+7:i] ENDFOR _mm_sub_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_sub_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed 16-bit integers in "b" from packed 16-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := a[i+15:i] - b[i+15:i] ENDFOR _mm_sub_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI32 a, UI32 b .. code-block:: C __m64 _mm_sub_pi32(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed 32-bit integers in "b" from packed 32-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := a[i+31:i] - b[i+31:i] ENDFOR _mm_subs_pi8 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI8 a, SI8 b .. code-block:: C __m64 _mm_subs_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed signed 8-bit integers in "b" from packed 8-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := Saturate8(a[i+7:i] - b[i+7:i]) ENDFOR _mm_subs_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_subs_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed signed 16-bit integers in "b" from packed 16-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := Saturate16(a[i+15:i] - b[i+15:i]) ENDFOR _mm_subs_pu8 ^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_subs_pu8(__m64 a, __m64 b); ..
admonition:: Intel Description Subtract packed unsigned 8-bit integers in "b" from packed unsigned 8-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := SaturateU8(a[i+7:i] - b[i+7:i]) ENDFOR _mm_subs_pu16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_subs_pu16(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed unsigned 16-bit integers in "b" from packed unsigned 16-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := SaturateU16(a[i+15:i] - b[i+15:i]) ENDFOR _mm_madd_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_madd_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Multiply packed signed 16-bit integers in "a" and "b", producing intermediate signed 32-bit integers. Horizontally add adjacent pairs of intermediate 32-bit integers, and pack the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided.
Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := SignExtend32(a[i+31:i+16]*b[i+31:i+16]) + SignExtend32(a[i+15:i]*b[i+15:i]) ENDFOR _mm_mulhi_pi16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_mulhi_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Multiply the packed signed 16-bit integers in "a" and "b", producing intermediate 32-bit integers, and store the high 16 bits of the intermediate integers in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 tmp[31:0] := SignExtend32(a[i+15:i]) * SignExtend32(b[i+15:i]) dst[i+15:i] := tmp[31:16] ENDFOR _mm_mullo_pi16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_mullo_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Multiply the packed 16-bit integers in "a" and "b", producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text FOR j := 0 to 3 i := j*16 tmp[31:0] := a[i+15:i] * b[i+15:i] dst[i+15:i] := tmp[15:0] ENDFOR MMX ~~~ _m_paddb ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_paddb(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 8-bit integers in "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := a[i+7:i] + b[i+7:i] ENDFOR _m_paddw ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_paddw(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 16-bit integers in "a" and "b", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := a[i+15:i] + b[i+15:i] ENDFOR _m_paddd ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_paddd(__m64 a, __m64 b); .. admonition:: Intel Description Add packed 32-bit integers in "a" and "b", and store the results in "dst". ..
deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := a[i+31:i] + b[i+31:i] ENDFOR _m_paddsb ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_paddsb(__m64 a, __m64 b); .. admonition:: Intel Description Add packed signed 8-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := Saturate8( a[i+7:i] + b[i+7:i] ) ENDFOR _m_paddsw ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_paddsw(__m64 a, __m64 b); .. admonition:: Intel Description Add packed signed 16-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := Saturate16( a[i+15:i] + b[i+15:i] ) ENDFOR _m_paddusb ^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_paddusb(__m64 a, __m64 b); .. admonition:: Intel Description Add packed unsigned 8-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := SaturateU8( a[i+7:i] + b[i+7:i] ) ENDFOR _m_paddusw ^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_paddusw(__m64 a, __m64 b); .. admonition:: Intel Description Add packed unsigned 16-bit integers in "a" and "b" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := SaturateU16( a[i+15:i] + b[i+15:i] ) ENDFOR _m_psubb ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_psubb(__m64 a, __m64 b); ..
admonition:: Intel Description Subtract packed 8-bit integers in "b" from packed 8-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := a[i+7:i] - b[i+7:i] ENDFOR _m_psubw ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_psubw(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed 16-bit integers in "b" from packed 16-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := a[i+15:i] - b[i+15:i] ENDFOR _m_psubd ^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_psubd(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed 32-bit integers in "b" from packed 32-bit integers in "a", and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code ..
code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := a[i+31:i] - b[i+31:i] ENDFOR _m_psubsb ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_psubsb(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed signed 8-bit integers in "b" from packed 8-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := Saturate8(a[i+7:i] - b[i+7:i]) ENDFOR _m_psubsw ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_psubsw(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed signed 16-bit integers in "b" from packed 16-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := Saturate16(a[i+15:i] - b[i+15:i]) ENDFOR _m_psubusb ^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_psubusb(__m64 a, __m64 b); ..
admonition:: Intel Description Subtract packed unsigned 8-bit integers in "b" from packed unsigned 8-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := SaturateU8(a[i+7:i] - b[i+7:i]) ENDFOR _m_psubusw ^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_psubusw(__m64 a, __m64 b); .. admonition:: Intel Description Subtract packed unsigned 16-bit integers in "b" from packed unsigned 16-bit integers in "a" using saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := SaturateU16(a[i+15:i] - b[i+15:i]) ENDFOR _m_pmaddwd ^^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_pmaddwd(__m64 a, __m64 b); .. admonition:: Intel Description Multiply packed signed 16-bit integers in "a" and "b", producing intermediate signed 32-bit integers. Horizontally add adjacent pairs of intermediate 32-bit integers, and pack the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. 
Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := SignExtend32(a[i+31:i+16]*b[i+31:i+16]) + SignExtend32(a[i+15:i]*b[i+15:i]) ENDFOR _m_pmulhw ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_pmulhw(__m64 a, __m64 b); .. admonition:: Intel Description Multiply the packed signed 16-bit integers in "a" and "b", producing intermediate 32-bit integers, and store the high 16 bits of the intermediate integers in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 tmp[31:0] := SignExtend32(a[i+15:i]) * SignExtend32(b[i+15:i]) dst[i+15:i] := tmp[31:16] ENDFOR _m_pmullw ^^^^^^^^^ :Tech: MMX :Category: Arithmetic :Header: mmintrin.h :Searchable: MMX-Arithmetic-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pmullw(__m64 a, __m64 b); .. admonition:: Intel Description Multiply the packed 16-bit integers in "a" and "b", producing intermediate 32-bit integers, and store the low 16 bits of the intermediate integers in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 3 i := j*16 tmp[31:0] := a[i+15:i] * b[i+15:i] dst[i+15:i] := tmp[15:0] ENDFOR Compare ------- XMM ~~~ _mm_cmpeq_pi8 ^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI8 a, UI8 b .. code-block:: C __m64 _mm_cmpeq_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed 8-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := ( a[i+7:i] == b[i+7:i] ) ? 0xFF : 0 ENDFOR _mm_cmpeq_pi16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI16 a, UI16 b .. code-block:: C __m64 _mm_cmpeq_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed 16-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := ( a[i+15:i] == b[i+15:i] ) ? 0xFFFF : 0 ENDFOR _mm_cmpeq_pi32 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI32 a, UI32 b .. code-block:: C __m64 _mm_cmpeq_pi32(__m64 a, __m64 b); .. 
admonition:: Intel Description Compare packed 32-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := ( a[i+31:i] == b[i+31:i] ) ? 0xFFFFFFFF : 0 ENDFOR _mm_cmpgt_pi8 ^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI8 a, SI8 b .. code-block:: C __m64 _mm_cmpgt_pi8(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 8-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := ( a[i+7:i] > b[i+7:i] ) ? 0xFF : 0 ENDFOR _mm_cmpgt_pi16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_cmpgt_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 16-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := ( a[i+15:i] > b[i+15:i] ) ? 0xFFFF : 0 ENDFOR _mm_cmpgt_pi32 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI32 a, SI32 b .. code-block:: C __m64 _mm_cmpgt_pi32(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 32-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := ( a[i+31:i] > b[i+31:i] ) ? 0xFFFFFFFF : 0 ENDFOR MMX ~~~ _m_pcmpeqb ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pcmpeqb(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed 8-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := ( a[i+7:i] == b[i+7:i] ) ? 0xFF : 0 ENDFOR _m_pcmpeqw ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pcmpeqw(__m64 a, __m64 b); .. 
admonition:: Intel Description Compare packed 16-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := ( a[i+15:i] == b[i+15:i] ) ? 0xFFFF : 0 ENDFOR _m_pcmpeqd ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: UI64 a, UI64 b .. code-block:: C __m64 _m_pcmpeqd(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed 32-bit integers in "a" and "b" for equality, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := ( a[i+31:i] == b[i+31:i] ) ? 0xFFFFFFFF : 0 ENDFOR _m_pcmpgtb ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_pcmpgtb(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 8-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := ( a[i+7:i] > b[i+7:i] ) ? 
0xFF : 0 ENDFOR _m_pcmpgtw ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_pcmpgtw(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 16-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := ( a[i+15:i] > b[i+15:i] ) ? 0xFFFF : 0 ENDFOR _m_pcmpgtd ^^^^^^^^^^ :Tech: MMX :Category: Compare :Header: mmintrin.h :Searchable: MMX-Compare-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI64 a, SI64 b .. code-block:: C __m64 _m_pcmpgtd(__m64 a, __m64 b); .. admonition:: Intel Description Compare packed signed 32-bit integers in "a" and "b" for greater-than, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := ( a[i+31:i] > b[i+31:i] ) ? 0xFFFFFFFF : 0 ENDFOR Set --- XMM ~~~ _mm_setzero_si64 ^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 .. code-block:: C __m64 _mm_setzero_si64(void); .. admonition:: Intel Description Return vector of type __m64 with all elements set to zero. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. 
Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[MAX:0] := 0 _mm_set_pi32 ^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: int e1, int e0 :Param ETypes: UI32 e1, UI32 e0 .. code-block:: C __m64 _mm_set_pi32(int e1, int e0); .. admonition:: Intel Description Set packed 32-bit integers in "dst" with the supplied values. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := e0 dst[63:32] := e1 _mm_set_pi16 ^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: short e3, short e2, short e1, short e0 :Param ETypes: UI16 e3, UI16 e2, UI16 e1, UI16 e0 .. code-block:: C __m64 _mm_set_pi16(short e3, short e2, short e1, short e0); .. admonition:: Intel Description Set packed 16-bit integers in "dst" with the supplied values. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[15:0] := e0 dst[31:16] := e1 dst[47:32] := e2 dst[63:48] := e3 _mm_set_pi8 ^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0 :Param ETypes: UI8 e7, UI8 e6, UI8 e5, UI8 e4, UI8 e3, UI8 e2, UI8 e1, UI8 e0 .. 
code-block:: C __m64 _mm_set_pi8(char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0); .. admonition:: Intel Description Set packed 8-bit integers in "dst" with the supplied values. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := e0 dst[15:8] := e1 dst[23:16] := e2 dst[31:24] := e3 dst[39:32] := e4 dst[47:40] := e5 dst[55:48] := e6 dst[63:56] := e7 _mm_set1_pi32 ^^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: int a :Param ETypes: UI32 a .. code-block:: C __m64 _mm_set1_pi32(int a); .. admonition:: Intel Description Broadcast 32-bit integer "a" to all elements of "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*32 dst[i+31:i] := a[31:0] ENDFOR _mm_set1_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: short a :Param ETypes: UI16 a .. code-block:: C __m64 _mm_set1_pi16(short a); .. admonition:: Intel Description Broadcast 16-bit integer "a" to all elements of "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 3 i := j*16 dst[i+15:i] := a[15:0] ENDFOR _mm_set1_pi8 ^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: char a :Param ETypes: UI8 a .. code-block:: C __m64 _mm_set1_pi8(char a); .. admonition:: Intel Description Broadcast 8-bit integer "a" to all elements of "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*8 dst[i+7:i] := a[7:0] ENDFOR _mm_setr_pi32 ^^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: int e1, int e0 :Param ETypes: UI32 e1, UI32 e0 .. code-block:: C __m64 _mm_setr_pi32(int e1, int e0); .. admonition:: Intel Description Set packed 32-bit integers in "dst" with the supplied values in reverse order. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := e1 dst[63:32] := e0 _mm_setr_pi16 ^^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: short e3, short e2, short e1, short e0 :Param ETypes: UI16 e3, UI16 e2, UI16 e1, UI16 e0 .. code-block:: C __m64 _mm_setr_pi16(short e3, short e2, short e1, short e0); .. admonition:: Intel Description Set packed 16-bit integers in "dst" with the supplied values in reverse order. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. 
Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[15:0] := e3 dst[31:16] := e2 dst[47:32] := e1 dst[63:48] := e0 _mm_setr_pi8 ^^^^^^^^^^^^ :Tech: MMX :Category: Set :Header: mmintrin.h :Searchable: MMX-Set-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0 :Param ETypes: UI8 e7, UI8 e6, UI8 e5, UI8 e4, UI8 e3, UI8 e2, UI8 e1, UI8 e0 .. code-block:: C __m64 _mm_setr_pi8(char e7, char e6, char e5, char e4, char e3, char e2, char e1, char e0); .. admonition:: Intel Description Set packed 8-bit integers in "dst" with the supplied values in reverse order. .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := e7 dst[15:8] := e6 dst[23:16] := e5 dst[31:24] := e4 dst[39:32] := e3 dst[47:40] := e2 dst[55:48] := e1 dst[63:56] := e0 Convert ------- XMM ~~~ _mm_cvtsi32_si64 ^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: int a :Param ETypes: UI32 a .. code-block:: C __m64 _mm_cvtsi32_si64(int a); .. admonition:: Intel Description Copy 32-bit integer "a" to the lower elements of "dst", and zero the upper element of "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text dst[31:0] := a[31:0] dst[63:32] := 0 _mm_cvtsi64_si32 ^^^^^^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-XMM :Register: XMM 128 bit :Return Type: int :Param Types: __m64 a :Param ETypes: FP32 a .. code-block:: C int _mm_cvtsi64_si32(__m64 a); .. admonition:: Intel Description Copy the lower 32-bit integer in "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := a[31:0] _mm_cvtm64_si64 ^^^^^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-XMM :Register: XMM 128 bit :Return Type: __int64 :Param Types: __m64 a :Param ETypes: FP32 a .. code-block:: C __int64 _mm_cvtm64_si64(__m64 a); .. admonition:: Intel Description Copy 64-bit integer "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] _mm_cvtsi64_m64 ^^^^^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __int64 a :Param ETypes: UI64 a .. code-block:: C __m64 _mm_cvtsi64_m64(__int64 a); .. admonition:: Intel Description Copy 64-bit integer "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text dst[63:0] := a[63:0] MMX ~~~ _m_from_int64 ^^^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __int64 a :Param ETypes: UI64 a .. code-block:: C __m64 _m_from_int64(__int64 a); .. admonition:: Intel Description Copy 64-bit integer "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] _m_to_int64 ^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-MMX :Register: MMX 64 bit :Return Type: __int64 :Param Types: __m64 a :Param ETypes: FP32 a .. code-block:: C __int64 _m_to_int64(__m64 a); .. admonition:: Intel Description Copy 64-bit integer "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] _m_from_int ^^^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: int a :Param ETypes: UI32 a .. code-block:: C __m64 _m_from_int(int a); .. admonition:: Intel Description Copy 32-bit integer "a" to the lower elements of "dst", and zero the upper element of "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text dst[31:0] := a[31:0] dst[63:32] := 0 _m_to_int ^^^^^^^^^ :Tech: MMX :Category: Convert :Header: mmintrin.h :Searchable: MMX-Convert-MMX :Register: MMX 64 bit :Return Type: int :Param Types: __m64 a :Param ETypes: FP32 a .. code-block:: C int _m_to_int(__m64 a); .. admonition:: Intel Description Copy the lower 32-bit integer in "a" to "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := a[31:0] Miscellaneous ------------- XMM ~~~ _mm_packs_pi16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_packs_pi16(__m64 a, __m64 b); .. admonition:: Intel Description Convert packed signed 16-bit integers from "a" and "b" to packed 8-bit integers using signed saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := Saturate8(a[15:0]) dst[15:8] := Saturate8(a[31:16]) dst[23:16] := Saturate8(a[47:32]) dst[31:24] := Saturate8(a[63:48]) dst[39:32] := Saturate8(b[15:0]) dst[47:40] := Saturate8(b[31:16]) dst[55:48] := Saturate8(b[47:32]) dst[63:56] := Saturate8(b[63:48]) _mm_packs_pi32 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI32 a, SI32 b .. code-block:: C __m64 _mm_packs_pi32(__m64 a, __m64 b); .. 
admonition:: Intel Description Convert packed signed 32-bit integers from "a" and "b" to packed 16-bit integers using signed saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[15:0] := Saturate16(a[31:0]) dst[31:16] := Saturate16(a[63:32]) dst[47:32] := Saturate16(b[31:0]) dst[63:48] := Saturate16(b[63:32]) _mm_packs_pu16 ^^^^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-XMM :Register: XMM 128 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _mm_packs_pu16(__m64 a, __m64 b); .. admonition:: Intel Description Convert packed signed 16-bit integers from "a" and "b" to packed 8-bit integers using unsigned saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := SaturateU8(a[15:0]) dst[15:8] := SaturateU8(a[31:16]) dst[23:16] := SaturateU8(a[47:32]) dst[31:24] := SaturateU8(a[63:48]) dst[39:32] := SaturateU8(b[15:0]) dst[47:40] := SaturateU8(b[31:16]) dst[55:48] := SaturateU8(b[47:32]) dst[63:56] := SaturateU8(b[63:48]) MMX ~~~ _m_packsswb ^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _m_packsswb(__m64 a, __m64 b); .. 
admonition:: Intel Description Convert packed signed 16-bit integers from "a" and "b" to packed 8-bit integers using signed saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := Saturate8(a[15:0]) dst[15:8] := Saturate8(a[31:16]) dst[23:16] := Saturate8(a[47:32]) dst[31:24] := Saturate8(a[63:48]) dst[39:32] := Saturate8(b[15:0]) dst[47:40] := Saturate8(b[31:16]) dst[55:48] := Saturate8(b[47:32]) dst[63:56] := Saturate8(b[63:48]) _m_packssdw ^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI32 a, SI32 b .. code-block:: C __m64 _m_packssdw(__m64 a, __m64 b); .. admonition:: Intel Description Convert packed signed 32-bit integers from "a" and "b" to packed 16-bit integers using signed saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[15:0] := Saturate16(a[31:0]) dst[31:16] := Saturate16(a[63:32]) dst[47:32] := Saturate16(b[31:0]) dst[63:48] := Saturate16(b[63:32]) _m_packuswb ^^^^^^^^^^^ :Tech: MMX :Category: Miscellaneous :Header: mmintrin.h :Searchable: MMX-Miscellaneous-MMX :Register: MMX 64 bit :Return Type: __m64 :Param Types: __m64 a, __m64 b :Param ETypes: SI16 a, SI16 b .. code-block:: C __m64 _m_packuswb(__m64 a, __m64 b); .. 
admonition:: Intel Description Convert packed signed 16-bit integers from "a" and "b" to packed 8-bit integers using unsigned saturation, and store the results in "dst". .. deprecated:: X87 MMX technology intrinsics can cause issues on modern processors and should generally be avoided. Use SSE2, AVX, or later instruction sets instead, especially when targeting modern processors. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[7:0] := SaturateU8(a[15:0]) dst[15:8] := SaturateU8(a[31:16]) dst[23:16] := SaturateU8(a[47:32]) dst[31:24] := SaturateU8(a[63:48]) dst[39:32] := SaturateU8(b[15:0]) dst[47:40] := SaturateU8(b[31:16]) dst[55:48] := SaturateU8(b[47:32]) dst[63:56] := SaturateU8(b[63:48]) SSE_ALL ======= Shift ----- XMM ~~~ _mm_slli_si128 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: M128 a, IMM imm8 .. code-block:: C __m128i _mm_slli_si128(__m128i a, int imm8); .. admonition:: Intel Description Shift "a" left by "imm8" bytes while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text tmp := imm8[7:0] IF tmp > 15 tmp := 16 FI dst[127:0] := a[127:0] << (tmp*8) _mm_bslli_si128 ^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: M128 a, IMM imm8 .. code-block:: C __m128i _mm_bslli_si128(__m128i a, int imm8); .. admonition:: Intel Description Shift "a" left by "imm8" bytes while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text tmp := imm8[7:0] IF tmp > 15 tmp := 16 FI dst[127:0] := a[127:0] << (tmp*8) _mm_bsrli_si128 ^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: M128 a, IMM imm8 .. code-block:: C __m128i _mm_bsrli_si128(__m128i a, int imm8); .. admonition:: Intel Description Shift "a" right by "imm8" bytes while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text tmp := imm8[7:0] IF tmp > 15 tmp := 16 FI dst[127:0] := a[127:0] >> (tmp*8) _mm_slli_epi16 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI16 a, IMM imm8 .. code-block:: C __m128i _mm_slli_epi16(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << imm8[7:0]) FI ENDFOR _mm_sll_epi16 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m128i _mm_sll_epi16(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 7 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] << count[63:0]) FI ENDFOR _mm_slli_epi32 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI32 a, IMM imm8 .. code-block:: C __m128i _mm_slli_epi32(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << imm8[7:0]) FI ENDFOR _mm_sll_epi32 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m128i _mm_sll_epi32(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] << count[63:0]) FI ENDFOR _mm_slli_epi64 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m128i _mm_slli_epi64(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 64-bit integers in "a" left by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 1 i := j*64 IF imm8[7:0] > 63 dst[i+63:i] := 0 ELSE dst[i+63:i] := ZeroExtend64(a[i+63:i] << imm8[7:0]) FI ENDFOR _mm_sll_epi64 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m128i _mm_sll_epi64(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 64-bit integers in "a" left by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*64 IF count[63:0] > 63 dst[i+63:i] := 0 ELSE dst[i+63:i] := ZeroExtend64(a[i+63:i] << count[63:0]) FI ENDFOR _mm_srai_epi16 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: SI16 a, IMM imm8 .. code-block:: C __m128i _mm_srai_epi16(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _mm_sra_epi16 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m128i _mm_sra_epi16(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 7 i := j*16 IF count[63:0] > 15 dst[i+15:i] := (a[i+15] ? 0xFFFF : 0x0) ELSE dst[i+15:i] := SignExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _mm_srai_epi32 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: SI32 a, IMM imm8 .. code-block:: C __m128i _mm_srai_epi32(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in sign bits, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := (a[i+31] ? 0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _mm_sra_epi32 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m128i _mm_sra_epi32(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in sign bits, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*32 IF count[63:0] > 31 dst[i+31:i] := (a[i+31] ? 0xFFFFFFFF : 0x0) ELSE dst[i+31:i] := SignExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _mm_srli_si128 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: M128 a, IMM imm8 .. code-block:: C __m128i _mm_srli_si128(__m128i a, int imm8); .. admonition:: Intel Description Shift "a" right by "imm8" bytes while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text tmp := imm8[7:0] IF tmp > 15 tmp := 16 FI dst[127:0] := a[127:0] >> (tmp*8) _mm_srli_epi16 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI16 a, IMM imm8 .. code-block:: C __m128i _mm_srli_epi16(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*16 IF imm8[7:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> imm8[7:0]) FI ENDFOR _mm_srl_epi16 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI16 a, UI16 count .. code-block:: C __m128i _mm_srl_epi16(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 16-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 7 i := j*16 IF count[63:0] > 15 dst[i+15:i] := 0 ELSE dst[i+15:i] := ZeroExtend16(a[i+15:i] >> count[63:0]) FI ENDFOR _mm_srli_epi32 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI32 a, IMM imm8 .. code-block:: C __m128i _mm_srli_epi32(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 3 i := j*32 IF imm8[7:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> imm8[7:0]) FI ENDFOR _mm_srl_epi32 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI32 a, UI32 count .. code-block:: C __m128i _mm_srl_epi32(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 32-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 3 i := j*32 IF count[63:0] > 31 dst[i+31:i] := 0 ELSE dst[i+31:i] := ZeroExtend32(a[i+31:i] >> count[63:0]) FI ENDFOR _mm_srli_epi64 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, int imm8 :Param ETypes: UI64 a, IMM imm8 .. code-block:: C __m128i _mm_srli_epi64(__m128i a, int imm8); .. admonition:: Intel Description Shift packed 64-bit integers in "a" right by "imm8" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text FOR j := 0 to 1 i := j*64 IF imm8[7:0] > 63 dst[i+63:i] := 0 ELSE dst[i+63:i] := ZeroExtend64(a[i+63:i] >> imm8[7:0]) FI ENDFOR _mm_srl_epi64 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Shift :Header: emmintrin.h :Searchable: SSE_ALL-Shift-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i count :Param ETypes: UI64 a, UI64 count .. code-block:: C __m128i _mm_srl_epi64(__m128i a, __m128i count); .. admonition:: Intel Description Shift packed 64-bit integers in "a" right by "count" while shifting in zeros, and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text FOR j := 0 to 1 i := j*64 IF count[63:0] > 63 dst[i+63:i] := 0 ELSE dst[i+63:i] := ZeroExtend64(a[i+63:i] >> count[63:0]) FI ENDFOR Cryptography ------------ XMM ~~~ _mm_crc32_u8 ^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cryptography :Header: nmmintrin.h :Searchable: SSE_ALL-Cryptography-XMM :Register: XMM 128 bit :Return Type: unsigned int :Param Types: unsigned int crc, unsigned char v :Param ETypes: UI32 crc, UI8 v .. code-block:: C unsigned int _mm_crc32_u8(unsigned int crc, unsigned char v); .. admonition:: Intel Description Starting with the initial value in "crc", accumulates a CRC32 value for unsigned 8-bit integer "v", and stores the result in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text tmp1[7:0] := v[0:7] // bit reflection tmp2[31:0] := crc[0:31] // bit reflection tmp3[39:0] := tmp1[7:0] << 32 tmp4[39:0] := tmp2[31:0] << 8 tmp5[39:0] := tmp3[39:0] XOR tmp4[39:0] tmp6[31:0] := MOD2(tmp5[39:0], 0x11EDC6F41) // remainder from polynomial division modulus 2 dst[31:0] := tmp6[0:31] // bit reflection _mm_crc32_u16 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cryptography :Header: nmmintrin.h :Searchable: SSE_ALL-Cryptography-XMM :Register: XMM 128 bit :Return Type: unsigned int :Param Types: unsigned int crc, unsigned short v :Param ETypes: UI32 crc, UI16 v .. code-block:: C unsigned int _mm_crc32_u16(unsigned int crc, unsigned short v); .. admonition:: Intel Description Starting with the initial value in "crc", accumulates a CRC32 value for unsigned 16-bit integer "v", and stores the result in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text tmp1[15:0] := v[0:15] // bit reflection tmp2[31:0] := crc[0:31] // bit reflection tmp3[47:0] := tmp1[15:0] << 32 tmp4[47:0] := tmp2[31:0] << 16 tmp5[47:0] := tmp3[47:0] XOR tmp4[47:0] tmp6[31:0] := MOD2(tmp5[47:0], 0x11EDC6F41) // remainder from polynomial division modulus 2 dst[31:0] := tmp6[0:31] // bit reflection _mm_crc32_u32 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cryptography :Header: nmmintrin.h :Searchable: SSE_ALL-Cryptography-XMM :Register: XMM 128 bit :Return Type: unsigned int :Param Types: unsigned int crc, unsigned int v :Param ETypes: UI32 crc, UI32 v .. code-block:: C unsigned int _mm_crc32_u32(unsigned int crc, unsigned int v); .. admonition:: Intel Description Starting with the initial value in "crc", accumulates a CRC32 value for unsigned 32-bit integer "v", and stores the result in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text tmp1[31:0] := v[0:31] // bit reflection tmp2[31:0] := crc[0:31] // bit reflection tmp3[63:0] := tmp1[31:0] << 32 tmp4[63:0] := tmp2[31:0] << 32 tmp5[63:0] := tmp3[63:0] XOR tmp4[63:0] tmp6[31:0] := MOD2(tmp5[63:0], 0x11EDC6F41) // remainder from polynomial division modulus 2 dst[31:0] := tmp6[0:31] // bit reflection _mm_crc32_u64 ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cryptography :Header: nmmintrin.h :Searchable: SSE_ALL-Cryptography-XMM :Register: XMM 128 bit :Return Type: unsigned __int64 :Param Types: unsigned __int64 crc, unsigned __int64 v :Param ETypes: UI64 crc, UI64 v .. code-block:: C unsigned __int64 _mm_crc32_u64(unsigned __int64 crc, unsigned __int64 v); .. admonition:: Intel Description Starting with the initial value in "crc", accumulates a CRC32 value for unsigned 64-bit integer "v", and stores the result in "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text tmp1[63:0] := v[0:63] // bit reflection tmp2[31:0] := crc[0:31] // bit reflection tmp3[95:0] := tmp1[63:0] << 32 tmp4[95:0] := tmp2[31:0] << 64 tmp5[95:0] := tmp3[95:0] XOR tmp4[95:0] tmp6[31:0] := MOD2(tmp5[95:0], 0x11EDC6F41) // remainder from polynomial division modulus 2 dst[31:0] := tmp6[0:31] // bit reflection Move ---- XMM ~~~ _mm_move_ss ^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: xmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128 a, __m128 b :Param ETypes: FP32 a, FP32 b .. code-block:: C __m128 _mm_move_ss(__m128 a, __m128 b); .. admonition:: Intel Description Move the lower single-precision (32-bit) floating-point element from "b" to the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text dst[31:0] := b[31:0] dst[127:32] := a[127:32] _mm_movehl_ps ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: xmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128 a, __m128 b :Param ETypes: FP32 a, FP32 b .. code-block:: C __m128 _mm_movehl_ps(__m128 a, __m128 b); .. admonition:: Intel Description Move the upper 2 single-precision (32-bit) floating-point elements from "b" to the lower 2 elements of "dst", and copy the upper 2 elements from "a" to the upper 2 elements of "dst". .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text dst[31:0] := b[95:64] dst[63:32] := b[127:96] dst[95:64] := a[95:64] dst[127:96] := a[127:96] _mm_movelh_ps ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: xmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128 a, __m128 b :Param ETypes: FP32 a, FP32 b .. code-block:: C __m128 _mm_movelh_ps(__m128 a, __m128 b); ..
admonition:: Intel Description Move the lower 2 single-precision (32-bit) floating-point elements from "b" to the upper 2 elements of "dst", and copy the lower 2 elements from "a" to the lower 2 elements of "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := a[31:0] dst[63:32] := a[63:32] dst[95:64] := b[31:0] dst[127:96] := b[63:32] _mm_movpi64_epi64 ^^^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: emmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m64 a :Param ETypes: UI64 a .. code-block:: C __m128i _mm_movpi64_epi64(__m64 a); .. admonition:: Intel Description Copy the 64-bit integer "a" to the lower element of "dst", and zero the upper element. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] dst[127:64] := 0 _mm_move_epi64 ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: emmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a :Param ETypes: UI64 a .. code-block:: C __m128i _mm_move_epi64(__m128i a); .. admonition:: Intel Description Copy the lower 64-bit integer in "a" to the lower element of "dst", and zero the upper element. .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] dst[127:64] := 0 _mm_move_sd ^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: emmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128d :Param Types: __m128d a, __m128d b :Param ETypes: FP64 a, FP64 b .. code-block:: C __m128d _mm_move_sd(__m128d a, __m128d b); .. admonition:: Intel Description Move the lower double-precision (64-bit) floating-point element from "b" to the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". .. admonition:: Intel Implementation Psudeo-Code .. 
code-block:: text dst[63:0] := b[63:0] dst[127:64] := a[127:64] _mm_movedup_pd ^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: pmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128d :Param Types: __m128d a :Param ETypes: FP64 a .. code-block:: C __m128d _mm_movedup_pd(__m128d a); .. admonition:: Intel Description Duplicate the low double-precision (64-bit) floating-point element from "a", and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[63:0] := a[63:0] dst[127:64] := a[63:0] _mm_movehdup_ps ^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: pmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128 a :Param ETypes: FP32 a .. code-block:: C __m128 _mm_movehdup_ps(__m128 a); .. admonition:: Intel Description Duplicate odd-indexed single-precision (32-bit) floating-point elements from "a", and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := a[63:32] dst[63:32] := a[63:32] dst[95:64] := a[127:96] dst[127:96] := a[127:96] _mm_moveldup_ps ^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Move :Header: pmmintrin.h :Searchable: SSE_ALL-Move-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128 a :Param ETypes: FP32 a .. code-block:: C __m128 _mm_moveldup_ps(__m128 a); .. admonition:: Intel Description Duplicate even-indexed single-precision (32-bit) floating-point elements from "a", and store the results in "dst". .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text dst[31:0] := a[31:0] dst[63:32] := a[31:0] dst[95:64] := a[95:64] dst[127:96] := a[95:64] Cast ---- XMM ~~~ _mm_castpd_ps ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128d a :Param ETypes: FP64 a .. code-block:: C __m128 _mm_castpd_ps(__m128d a); .. 
admonition:: Intel Description Cast vector of type __m128d to type __m128. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. _mm_castpd_si128 ^^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128d a :Param ETypes: FP64 a .. code-block:: C __m128i _mm_castpd_si128(__m128d a); .. admonition:: Intel Description Cast vector of type __m128d to type __m128i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. _mm_castps_pd ^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128d :Param Types: __m128 a :Param ETypes: FP32 a .. code-block:: C __m128d _mm_castps_pd(__m128 a); .. admonition:: Intel Description Cast vector of type __m128 to type __m128d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. _mm_castps_si128 ^^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128 a :Param ETypes: FP32 a .. code-block:: C __m128i _mm_castps_si128(__m128 a); .. admonition:: Intel Description Cast vector of type __m128 to type __m128i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. _mm_castsi128_pd ^^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128d :Param Types: __m128i a :Param ETypes: UI64 a .. code-block:: C __m128d _mm_castsi128_pd(__m128i a); .. admonition:: Intel Description Cast vector of type __m128i to type __m128d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
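The cast intrinsics in this section reinterpret the 128 bits of a register under a different vector type; they do not convert values. The following is a minimal sketch of that difference, assuming an x86 target with SSE2 enabled. The helper names ``low_lane_bits``, ``low_lane_converted``, and ``abs_ps`` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <emmintrin.h> /* SSE2; also pulls in the SSE float intrinsics */

/* Reinterpret the floats in v as raw bit patterns and return lane 0.
   _mm_castps_si128 only changes the type; the 128 bits are untouched. */
uint32_t low_lane_bits(__m128 v) {
    __m128i bits = _mm_castps_si128(v);
    return (uint32_t)_mm_cvtsi128_si32(bits);
}

/* For contrast: convert lane 0 by value (float -> int32), which does
   emit a conversion instruction and changes the bit pattern. */
int32_t low_lane_converted(__m128 v) {
    return _mm_cvtsi128_si32(_mm_cvtps_epi32(v));
}

/* A common use of the casts: build an integer mask, view it as floats,
   and clear the sign bit of every lane to get the absolute value. */
__m128 abs_ps(__m128 v) {
    __m128i mask = _mm_set1_epi32(0x7FFFFFFF);
    return _mm_and_ps(v, _mm_castsi128_ps(mask));
}
```

Passing ``_mm_set_ss(1.0f)`` to ``low_lane_bits`` yields ``0x3F800000``, the IEEE-754 encoding of 1.0, while ``low_lane_converted`` yields the integer ``1`` for the same input.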
_mm_castsi128_ps ^^^^^^^^^^^^^^^^ :Tech: SSE_ALL :Category: Cast :Header: emmintrin.h :Searchable: SSE_ALL-Cast-XMM :Register: XMM 128 bit :Return Type: __m128 :Param Types: __m128i a :Param ETypes: UI32 a .. code-block:: C __m128 _mm_castsi128_ps(__m128i a); .. admonition:: Intel Description Cast vector of type __m128i to type __m128. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. String Compare -------------- XMM ~~~ _mm_cmpistrm ^^^^^^^^^^^^ :Tech: SSE_ALL :Category: String Compare :Header: nmmintrin.h :Searchable: SSE_ALL-String Compare-XMM :Register: XMM 128 bit :Return Type: __m128i :Param Types: __m128i a, __m128i b, const int imm8 :Param ETypes: M128 a, M128 b, IMM imm8 .. code-block:: C __m128i _mm_cmpistrm(__m128i a, __m128i b, const int imm8); .. admonition:: Intel Description Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and store the generated mask in "dst". [strcmp_note] .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters UpperBound := (128 / size) - 1 BoolRes := 0 // compare all characters aInvalid := 0 bInvalid := 0 FOR i := 0 to UpperBound m := i*size FOR j := 0 to UpperBound n := j*size BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 
1 : 0 // invalidate characters after EOS IF a[m+size-1:m] == 0 aInvalid := 1 FI IF b[n+size-1:n] == 0 bInvalid := 1 FI // override comparisons for invalid characters CASE (imm8[3:2]) OF 0: // equal any IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 FI 1: // ranges IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 FI 2: // equal each IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 1 FI 3: // equal ordered IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 1 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 1 FI ESAC ENDFOR ENDFOR // aggregate results CASE (imm8[3:2]) OF 0: // equal any IntRes1 := 0 FOR i := 0 to UpperBound FOR j := 0 to UpperBound IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j] ENDFOR ENDFOR 1: // ranges IntRes1 := 0 FOR i := 0 to UpperBound FOR j := 0 to UpperBound IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1]) j += 2 ENDFOR ENDFOR 2: // equal each IntRes1 := 0 FOR i := 0 to UpperBound IntRes1[i] := BoolRes.word[i].bit[i] ENDFOR 3: // equal ordered IntRes1 := (imm8[0] ? 
0xFF : 0xFFFF) FOR i := 0 to UpperBound k := i FOR j := 0 to UpperBound-i IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j] k := k+1 ENDFOR ENDFOR ESAC // optionally negate results bInvalid := 0 FOR i := 0 to UpperBound IF imm8[4] IF imm8[5] // only negate valid IF b[n+size-1:n] == 0 bInvalid := 1 FI IF bInvalid // invalid, don't negate IntRes2[i] := IntRes1[i] ELSE // valid, negate IntRes2[i] := -1 XOR IntRes1[i] FI ELSE // negate all IntRes2[i] := -1 XOR IntRes1[i] FI ELSE // don't negate IntRes2[i] := IntRes1[i] FI ENDFOR // output IF imm8[6] // byte / word mask FOR i := 0 to UpperBound j := i*size IF IntRes2[i] dst[j+size-1:j] := (imm8[0] ? 0xFF : 0xFFFF) ELSE dst[j+size-1:j] := 0 FI ENDFOR ELSE // bit mask dst[UpperBound:0] := IntRes2[UpperBound:0] dst[127:UpperBound+1] := 0 FI _mm_cmpistri ^^^^^^^^^^^^ :Tech: SSE_ALL :Category: String Compare :Header: nmmintrin.h :Searchable: SSE_ALL-String Compare-XMM :Register: XMM 128 bit :Return Type: int :Param Types: __m128i a, __m128i b, const int imm8 :Param ETypes: M128 a, M128 b, IMM imm8 .. code-block:: C int _mm_cmpistri(__m128i a, __m128i b, const int imm8); .. admonition:: Intel Description Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and store the generated index in "dst". [strcmp_note] .. admonition:: Intel Implementation Psudeo-Code .. code-block:: text size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters UpperBound := (128 / size) - 1 BoolRes := 0 // compare all characters aInvalid := 0 bInvalid := 0 FOR i := 0 to UpperBound m := i*size FOR j := 0 to UpperBound n := j*size BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 
1 : 0 // invalidate characters after EOS IF a[m+size-1:m] == 0 aInvalid := 1 FI IF b[n+size-1:n] == 0 bInvalid := 1 FI // override comparisons for invalid characters CASE (imm8[3:2]) OF 0: // equal any IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 FI 1: // ranges IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 FI 2: // equal each IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 1 FI 3: // equal ordered IF (!aInvalid && bInvalid) BoolRes.word[i].bit[j] := 0 ELSE IF (aInvalid && !bInvalid) BoolRes.word[i].bit[j] := 1 ELSE IF (aInvalid && bInvalid) BoolRes.word[i].bit[j] := 1 FI ESAC ENDFOR ENDFOR // aggregate results CASE (imm8[3:2]) OF 0: // equal any IntRes1 := 0 FOR i := 0 to UpperBound FOR j := 0 to UpperBound IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j] ENDFOR ENDFOR 1: // ranges IntRes1 := 0 FOR i := 0 to UpperBound FOR j := 0 to UpperBound IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1]) j += 2 ENDFOR ENDFOR 2: // equal each IntRes1 := 0 FOR i := 0 to UpperBound IntRes1[i] := BoolRes.word[i].bit[i] ENDFOR 3: // equal ordered IntRes1 := (imm8[0] ? 
0xFF : 0xFFFF) FOR i := 0 to UpperBound k := i FOR j := 0 to UpperBound-i IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j] k := k+1 ENDFOR ENDFOR ESAC // optionally negate results bInvalid := 0 FOR i := 0 to UpperBound IF imm8[4] IF imm8[5] // only negate valid IF b[n+size-1:n] == 0 bInvalid := 1 FI IF bInvalid // invalid, don't negate IntRes2[i] := IntRes1[i] ELSE // valid, negate IntRes2[i] := -1 XOR IntRes1[i] FI ELSE // negate all IntRes2[i] := -1 XOR IntRes1[i] FI ELSE // don't negate IntRes2[i] := IntRes1[i] FI ENDFOR // output IF imm8[6] // most significant bit tmp := UpperBound dst := tmp DO WHILE ((tmp >= 0) AND IntRes2[tmp] == 0) tmp := tmp - 1 dst := tmp OD ELSE // least significant bit tmp := 0 dst := tmp DO WHILE ((tmp <= UpperBound) AND IntRes2[tmp] == 0) tmp := tmp + 1 dst := tmp OD FI _mm_cmpistrz ^^^^^^^^^^^^ :Tech: SSE_ALL :Category: String Compare :Header: nmmintrin.h :Searchable: SSE_ALL-String Compare-XMM :Register: XMM 128 bit :Return Type: int :Param Types: __m128i a, __m128i b, const int imm8 :Param ETypes: M128 a, M128 b, IMM imm8 .. code-block:: C int _mm_cmpistrz(__m128i a, __m128i b, const int imm8); .. admonition:: Intel Description Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and returns 1 if any character in "b" was null, and 0 otherwise. [strcmp_note] .. admonition:: Intel Implementation Pseudo-Code .. code-block:: text size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters UpperBound := (128 / size) - 1 bInvalid := 0 FOR j := 0 to UpperBound n := j*size IF b[n+size-1:n] == 0 bInvalid := 1 FI ENDFOR dst := bInvalid _mm_cmpistrc ^^^^^^^^^^^^ :Tech: SSE_ALL :Category: String Compare :Header: nmmintrin.h :Searchable: SSE_ALL-String Compare-XMM :Register: XMM 128 bit :Return Type: int :Param Types: __m128i a, __m128i b, const int imm8 :Param ETypes: M128 a, M128 b, IMM imm8 .. code-block:: C int _mm_cmpistrc(__m128i a, __m128i b, const int imm8); ..
admonition:: Intel Description

    Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and returns 1 if the resulting mask was non-zero, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF a[m+size-1:m] == 0
                    aInvalid := 1
                FI
                IF b[n+size-1:n] == 0
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        bInvalid := 0
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF b[n+size-1:n] == 0
                        bInvalid := 1
                    FI
                    IF bInvalid // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := (IntRes2 != 0)

_mm_cmpistrs
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, __m128i b, const int imm8
:Param ETypes: M128 a, M128 b, IMM imm8

.. code-block:: C

    int _mm_cmpistrs(__m128i a, __m128i b, const int imm8);

.. admonition:: Intel Description

    Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and returns 1 if any character in "a" was null, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        aInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            IF a[m+size-1:m] == 0
                aInvalid := 1
            FI
        ENDFOR
        dst := aInvalid

_mm_cmpistro
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, __m128i b, const int imm8
:Param ETypes: M128 a, M128 b, IMM imm8

.. code-block:: C

    int _mm_cmpistro(__m128i a, __m128i b, const int imm8);

.. admonition:: Intel Description

    Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and returns bit 0 of the resulting bit mask. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF a[m+size-1:m] == 0
                    aInvalid := 1
                FI
                IF b[n+size-1:n] == 0
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        bInvalid := 0
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF b[n+size-1:n] == 0
                        bInvalid := 1
                    FI
                    IF bInvalid // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := IntRes2[0]

_mm_cmpistra
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, __m128i b, const int imm8
:Param ETypes: M128 a, M128 b, IMM imm8

.. code-block:: C

    int _mm_cmpistra(__m128i a, __m128i b, const int imm8);

.. admonition:: Intel Description

    Compare packed strings with implicit lengths in "a" and "b" using the control in "imm8", and returns 1 if "b" did not contain a null character and the resulting mask was zero, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF a[m+size-1:m] == 0
                    aInvalid := 1
                FI
                IF b[n+size-1:n] == 0
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        bInvalid := 0
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF b[n+size-1:n] == 0
                        bInvalid := 1
                    FI
                    IF bInvalid // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := (IntRes2 == 0) AND bInvalid

_mm_cmpestrm
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: __m128i
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    __m128i _mm_cmpestrm(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and store the generated mask in "dst". [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF i == la
                    aInvalid := 1
                FI
                IF j == lb
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF i >= lb // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        IF imm8[6] // byte / word mask
            FOR i := 0 to UpperBound
                j := i*size
                IF IntRes2[i]
                    dst[j+size-1:j] := (imm8[0] ? 0xFF : 0xFFFF)
                ELSE
                    dst[j+size-1:j] := 0
                FI
            ENDFOR
        ELSE // bit mask
            dst[UpperBound:0] := IntRes2[UpperBound:0]
            dst[127:UpperBound+1] := 0
        FI

_mm_cmpestri
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestri(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and store the generated index in "dst". [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF i == la
                    aInvalid := 1
                FI
                IF j == lb
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF i >= lb // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        IF imm8[6] // most significant bit
            tmp := UpperBound
            dst := tmp
            DO WHILE ((tmp >= 0) AND a[tmp] == 0)
                tmp := tmp - 1
                dst := tmp
            OD
        ELSE // least significant bit
            tmp := 0
            dst := tmp
            DO WHILE ((tmp <= UpperBound) AND a[tmp] == 0)
                tmp := tmp + 1
                dst := tmp
            OD
        FI

_mm_cmpestrz
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestrz(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and returns 1 if any character in "b" was null, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        dst := (lb <= UpperBound)

_mm_cmpestrc
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestrc(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and returns 1 if the resulting mask was non-zero, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF i == la
                    aInvalid := 1
                FI
                IF j == lb
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF i >= lb // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := (IntRes2 != 0)

_mm_cmpestrs
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestrs(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and returns 1 if any character in "a" was null, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        dst := (la <= UpperBound)

_mm_cmpestro
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestro(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and returns bit 0 of the resulting bit mask. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF i == la
                    aInvalid := 1
                FI
                IF j == lb
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF i >= lb // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := IntRes2[0]

_mm_cmpestra
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: String Compare
:Header: nmmintrin.h
:Searchable: SSE_ALL-String Compare-XMM
:Register: XMM 128 bit
:Return Type: int
:Param Types: __m128i a, int la, __m128i b, int lb, const int imm8
:Param ETypes: M128 a, UI32 la, M128 b, UI32 lb, IMM imm8

.. code-block:: C

    int _mm_cmpestra(__m128i a, int la, __m128i b, int lb, const int imm8);

.. admonition:: Intel Description

    Compare packed strings in "a" and "b" with lengths "la" and "lb" using the control in "imm8", and returns 1 if "b" did not contain a null character and the resulting mask was zero, and 0 otherwise. [strcmp_note]

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        size := (imm8[0] ? 16 : 8) // 8 or 16-bit characters
        UpperBound := (128 / size) - 1
        BoolRes := 0
        // compare all characters
        aInvalid := 0
        bInvalid := 0
        FOR i := 0 to UpperBound
            m := i*size
            FOR j := 0 to UpperBound
                n := j*size
                BoolRes.word[i].bit[j] := (a[m+size-1:m] == b[n+size-1:n]) ? 1 : 0
                // invalidate characters after EOS
                IF i == la
                    aInvalid := 1
                FI
                IF j == lb
                    bInvalid := 1
                FI
                // override comparisons for invalid characters
                CASE (imm8[3:2]) OF
                0: // equal any
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                1: // ranges
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    FI
                2: // equal each
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                3: // equal ordered
                    IF (!aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 0
                    ELSE IF (aInvalid && !bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    ELSE IF (aInvalid && bInvalid)
                        BoolRes.word[i].bit[j] := 1
                    FI
                ESAC
            ENDFOR
        ENDFOR
        // aggregate results
        CASE (imm8[3:2]) OF
        0: // equal any
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR BoolRes.word[i].bit[j]
                ENDFOR
            ENDFOR
        1: // ranges
            IntRes1 := 0
            FOR i := 0 to UpperBound
                FOR j := 0 to UpperBound
                    IntRes1[i] := IntRes1[i] OR (BoolRes.word[i].bit[j] AND BoolRes.word[i].bit[j+1])
                    j += 2
                ENDFOR
            ENDFOR
        2: // equal each
            IntRes1 := 0
            FOR i := 0 to UpperBound
                IntRes1[i] := BoolRes.word[i].bit[i]
            ENDFOR
        3: // equal ordered
            IntRes1 := (imm8[0] ? 0xFF : 0xFFFF)
            FOR i := 0 to UpperBound
                k := i
                FOR j := 0 to UpperBound-i
                    IntRes1[i] := IntRes1[i] AND BoolRes.word[k].bit[j]
                    k := k+1
                ENDFOR
            ENDFOR
        ESAC
        // optionally negate results
        FOR i := 0 to UpperBound
            IF imm8[4]
                IF imm8[5] // only negate valid
                    IF i >= lb // invalid, don't negate
                        IntRes2[i] := IntRes1[i]
                    ELSE // valid, negate
                        IntRes2[i] := -1 XOR IntRes1[i]
                    FI
                ELSE // negate all
                    IntRes2[i] := -1 XOR IntRes1[i]
                FI
            ELSE // don't negate
                IntRes2[i] := IntRes1[i]
            FI
        ENDFOR
        // output
        dst := (IntRes2 == 0) AND (lb > UpperBound)

General Support
---------------

XMM
~~~

_mm_getcsr
^^^^^^^^^^
:Tech: SSE_ALL
:Category: General Support
:Header: immintrin.h
:Searchable: SSE_ALL-General Support-XMM
:Register: XMM 128 bit
:Return Type: unsigned int

.. code-block:: C

    unsigned int _mm_getcsr(void);

.. admonition:: Intel Description

    Get the unsigned 32-bit value of the MXCSR control and status register.

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        dst[31:0] := MXCSR

_mm_setcsr
^^^^^^^^^^
:Tech: SSE_ALL
:Category: General Support
:Header: immintrin.h
:Searchable: SSE_ALL-General Support-XMM
:Register: XMM 128 bit
:Return Type: void
:Param Types: unsigned int a
:Param ETypes: UI32 a

.. code-block:: C

    void _mm_setcsr(unsigned int a);

.. admonition:: Intel Description

    Set the MXCSR control and status register with the value in unsigned 32-bit integer "a".

.. admonition:: Intel Implementation Pseudo-Code

    .. code-block:: text

        MXCSR := a[31:0]

_mm_prefetch
^^^^^^^^^^^^
:Tech: SSE_ALL
:Category: General Support
:Header: immintrin.h
:Searchable: SSE_ALL-General Support-XMM
:Register: XMM 128 bit
:Return Type: void
:Param Types: char const* p, int i
:Param ETypes: UI8 p, IMM i

.. code-block:: C

    void _mm_prefetch(char const* p, int i);

.. admonition:: Intel Description

    Fetch the line of data from memory that contains address "p" to a location in the cache hierarchy specified by the locality hint "i", which can be one of: