[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used

Don't shrink VOP3 instructions if there are any uses of a carry-out
operand, because the shrunken form of the instruction would write the
carry-out to vcc instead of to a virtual register.

Differential Revision: https://reviews.llvm.org/D100760
author: Jay Foad <jay.foad@amd.com> 2021-04-19 14:48:20 +0100
committer: Jay Foad <jay.foad@amd.com> 2021-04-20 09:17:52 +0100
commit: b22721f01a580ceed64923f528e1b7f3d66a12a9 (patch)
tree: a4ae9a5f0e2abe335652aeec73ba4f7dd1cf977a
parent: [X86][AMX] Verify illegal types or instructions for x86_amx. (diff)
2 files changed, 27 insertions, 0 deletions
diff --git a/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp b/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
index 53c84639abda..2bf365168048 100644
--- a/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
@@ -123,6 +123,13 @@ bool GCNDPPCombine::isShrinkable(MachineInstr &MI) const {
     LLVM_DEBUG(dbgs() << "  Inst hasn't e32 equivalent\n");
     return false;
   }
+  if (const auto *SDst = TII->getNamedOperand(MI, AMDGPU::OpName::sdst)) {
+    // Give up if there are any uses of the carry-out from instructions like
+    // V_ADD_CO_U32. The shrunken form of the instruction would write it to vcc
+    // instead of to a virtual register.
+    if (!MRI->use_nodbg_empty(SDst->getReg()))
+      return false;
+  }
   // check if other than abs|neg modifiers are set (opsel for example)
   const int64_t Mask = ~(SISrcMods::ABS | SISrcMods::NEG);
   if (!hasNoImmOrEqual(MI, AMDGPU::OpName::src0_modifiers, 0, Mask) ||
diff --git a/llvm/test/CodeGen/AMDGPU/dpp_combine.mir b/llvm/test/CodeGen/AMDGPU/dpp_combine.mir
index c00c71a0d538..1c896b44b3ac 100644
--- a/llvm/test/CodeGen/AMDGPU/dpp_combine.mir
+++ b/llvm/test/CodeGen/AMDGPU/dpp_combine.mir
@@ -354,6 +354,26 @@ body:             |
     %6:vgpr_32 = V_ADD_U32_e64 %5, %1, 1, implicit $exec
 ...
 
+# GCN-LABEL: name: add_co_u32_e64
+# GCN: %4:vgpr_32, %5:sreg_64_xexec = V_ADD_CO_U32_e64 %3, %1, 0, implicit $exec
+
+name: add_co_u32_e64
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+
+    %0:vgpr_32 = COPY $vgpr0
+    %1:vgpr_32 = COPY $vgpr1
+    %2:vgpr_32 = IMPLICIT_DEF
+
+    ; this shouldn't be combined as the carry-out is used
+    %3:vgpr_32 = V_MOV_B32_dpp undef %2, %0, 1, 15, 15, 1, implicit $exec
+    %4:vgpr_32, %5:sreg_64_xexec = V_ADD_CO_U32_e64 %3, %1, 0, implicit $exec
+
+    S_NOP 0, implicit %5
+...
+
 # tests on sequences of dpp consumers
 # GCN-LABEL: name: dpp_seq
 # GCN: %4:vgpr_32 = V_ADD_CO_U32_dpp %1, %0, %1, 1, 14, 15, 0, implicit-def $vcc, implicit $exec
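
The guard added to isShrinkable above can be modeled outside LLVM as a small standalone sketch. Everything here (`ToyInstr`, `ToyRegInfo`, `useNoDbgEmpty`, `carryOutAllowsShrink`) is a hypothetical stand-in invented for illustration, not an LLVM type or API; it only mirrors the shape of the check, assuming a carry-out register is "used" when it has at least one non-debug use.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
using Reg = std::uint32_t;

struct ToyInstr {
  Reg VDst;                // result register
  std::optional<Reg> SDst; // carry-out register, if the instruction has one
};

struct ToyRegInfo {
  // Number of non-debug uses recorded per register.
  std::unordered_map<Reg, unsigned> NonDebugUses;

  // True when the register has no non-debug uses at all.
  bool useNoDbgEmpty(Reg R) const {
    auto It = NonDebugUses.find(R);
    return It == NonDebugUses.end() || It->second == 0;
  }
};

// Mirrors the new guard: give up on shrinking when the carry-out is used,
// since the shrunken form would write it to vcc rather than to a virtual
// register.
bool carryOutAllowsShrink(const ToyInstr &MI, const ToyRegInfo &MRI) {
  if (MI.SDst && !MRI.useNoDbgEmpty(*MI.SDst))
    return false;
  return true;
}
```

In this model, an instruction whose carry-out register has a recorded use is rejected, while one with an unused carry-out, or with no carry-out operand at all, passes this particular guard. The real function applies further checks after this one.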