# CANN/catlass Copy Gm To L1 Base Templates
catlass is CANN's operator template library. It provides template samples for high-performance matrix multiplication and related fused operators on NPUs. Project address: https://gitcode.com/cann/catlass

Authoring note: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.

Code location

[TOC]

## CopyGmToL1

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ArchTag,       // architecture tag
    class GmType,        // Gemm type of the operand in GM
    class L1Type = void  // Gemm type of the operand in L1
>
struct CopyGmToL1;
```

#### Partial specializations

| template | ArchTag | GmType | L1Type |
| --- | --- | --- | --- |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::RowMajor>` | `Gemm::GemmType<Element, layout::zN, AscendC::TPosition::A1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::RowMajor>` | `Gemm::GemmType<Element, layout::zZ, AscendC::TPosition::B1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::ColumnMajor>` | `Gemm::GemmType<Element, layout::nN, AscendC::TPosition::A1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::ColumnMajor>` | `Gemm::GemmType<Element, layout::nZ, AscendC::TPosition::B1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::ColumnMajor>` | `Gemm::GemmType<Element, layout::nZ, AscendC::TPosition::A1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::VectorLayout>` | `Gemm::GemmType<Element, layout::zN, AscendC::TPosition::A1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::NDC1HWC0, AscendC::TPosition::GM>` | - |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::KDC1KHKWN1N0C0, AscendC::TPosition::GM>` | - |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::ColumnMajor>` | `Gemm::GemmType<Element, layout::nN, AscendC::TPosition::B1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::RowMajor>` | `Gemm::GemmType<Element, layout::zN, AscendC::TPosition::B1>` |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::RowMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::ColumnMajor>` | - |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::zN>` | - |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::nZ>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::PaddingRowMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::PaddingColumnMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::RowMajor>` | `Gemm::GemmType<Element, layout::RowMajor, AscendC::TPosition::A1>` |
| `class ArchTag, class Element` | `ArchTag` | `Gemm::GemmType<Element, layout::VectorLayout, AscendC::TPosition::GM>` | `Gemm::GemmType<Element, layout::VectorLayout, AscendC::TPosition::A1>` |

#### Invocation

```cpp
void operator()(
    AscendC::LocalTensor<Element> const &dstTensor,   // destination operand (LocalTensor)
    AscendC::GlobalTensor<Element> const &srcTensor,  // source operand (GlobalTensor)
    LayoutDst const &layoutDst,                       // layout of the destination operand
    LayoutSrc const &layoutSrc                        // layout of the source operand
)
```

## CopyGmToL1IntervalDataCopy

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ArchTag,       // architecture tag
    class GmType,        // Gemm type of the operand in GM
    class L1Type = void  // Gemm type of the operand in L1
>
struct CopyGmToL1IntervalDataCopy;
```

#### Partial specializations

| template | ArchTag | GmType | L1Type |
| --- | --- | --- | --- |
| - | `Arch::AtlasA2` | `Gemm::GemmType<half, layout::RowMajor>` | - |
| - | `Arch::AtlasA2` | `Gemm::GemmType<half, layout::PaddingRowMajor>` | - |
| - | `Arch::AtlasA2` | `Gemm::GemmType<half, layout::ColumnMajor>` | - |
| - | `Arch::AtlasA2` | `Gemm::GemmType<half, layout::PaddingColumnMajor>` | - |

## CopyGmToL1GMMPTD

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ArchTag,       // architecture tag
    class GmType,        // Gemm type of the operand in GM
    class L1Type = void  // Gemm type of the operand in L1
>
struct CopyGmToL1GMMPTD;
```

#### Partial specializations

| template | ArchTag | GmType | L1Type |
| --- | --- | --- | --- |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::RowMajor>` | - |

## CopyGmToL1DynamicOptimized

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ArchTag,       // architecture tag
    class GmType,        // Gemm type of the operand in GM
    class L1Type = void  // Gemm type of the operand in L1
>
struct CopyGmToL1DynamicOptimized;
```

#### Partial specializations

| template | ArchTag | GmType | L1Type |
| --- | --- | --- | --- |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::RowMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::ColumnMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::zN>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::nZ>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::PaddingRowMajor>` | - |
| `class Element` | `Arch::AtlasA2` | `Gemm::GemmType<Element, layout::PaddingColumnMajor>` | - |

## TileCopyTla

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ElementSrc,  // data type of the source operand
    class ElementDst,  // data type of the destination operand
    class LayoutSrc,   // layout of the source operand
    class LayoutDst,   // layout of the destination operand
    class CoordSrc,    // coordinate of the source operand within its tensor
    class CoordDst     // coordinate of the destination operand within its tensor
>
struct TileCopyTla<
    Arch::AtlasA2,  // architecture tag
    tla::Tensor<AscendC::GlobalTensor<ElementSrc>, LayoutSrc, CoordSrc,
                AscendC::TPosition::GM>,  // tensor type of the source operand
    tla::Tensor<AscendC::LocalTensor<ElementDst>, LayoutDst, CoordDst,
                AscendC::TPosition::A1>,  // tensor type of the destination operand
    std::enable_if_t<cond0 && cond1>      // cond0 and cond1 are listed under "Partial specializations"
>;
```

#### Partial specializations

| cond0 | cond1 |
| --- | --- |
| `tla::detail::isRowMajor<LayoutSrc>::value` | `tla::detail::iszN<ElementDst, LayoutDst>::value` |
| `tla::detail::isColumnMajor<LayoutSrc>::value` | `tla::detail::isnZ<ElementDst, LayoutDst>::value` |
| `tla::detail::iszN<LayoutSrc>::value` | `tla::detail::iszN<ElementDst, LayoutDst>::value` |
| `tla::detail::isnZ<LayoutSrc>::value` | `tla::detail::isnZ<ElementDst, LayoutDst>::value` |

## TileCopyTlaExt

### Function description

### Prototype

#### Struct template

```cpp
template <
    class ElementSrc,  // data type of the source operand
    class ElementDst,  // data type of the destination operand
    class LayoutSrc,   // layout of the source operand
    class LayoutDst,   // layout of the destination operand
    class CoordSrc,    // coordinate of the source operand within its tensor
    class CoordDst     // coordinate of the destination operand within its tensor
>
struct TileCopyTlaExt<
    Arch::AtlasA2,  // architecture tag
    tla::Tensor<AscendC::GlobalTensor<ElementSrc>, LayoutSrc, CoordSrc,
                AscendC::TPosition::GM>,  // tensor type of the source operand
    tla::Tensor<AscendC::LocalTensor<ElementDst>, LayoutDst, CoordDst,
                AscendC::TPosition::A1>,  // tensor type of the destination operand
    cond0,  // see "Partial specializations"
    cond1   // see "Partial specializations"
>;
```

#### Partial specializations

| cond0 | cond1 |
| --- | --- |
| `layout::RowMajor` | `layout::zN` |
| `layout::PaddingRowMajor` | `layout::zN` |
| `layout::ColumnMajor` | `layout::nZ` |
| `layout::PaddingColumnMajor` | `layout::nZ` |
| `layout::zN` | `layout::zN` |
| `layout::nZ` | `layout::nZ` |
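The partial-specialization tables on this page all rely on the same C++ mechanism: the compiler picks an implementation from the `(ArchTag, GmType, L1Type)` triple, or from an `std::enable_if_t<cond0 && cond1>` guard in the TLA variant. The following host-only sketch illustrates that selection. Everything in the `mock` namespace is a hypothetical stand-in written for this illustration (including the `Variant` enum, the `kVariant` and `kSupported` members, and the simplified `TileCopy` parameter list); it is not the catlass or AscendC source and it copies no data.

```cpp
#include <type_traits>

// Hypothetical stand-ins for catlass/AscendC names. Illustration only.
namespace mock {

namespace layout {
struct RowMajor {};
struct ColumnMajor {};
struct zN {};
}  // namespace layout

namespace Arch {
struct AtlasA2 {};
}  // namespace Arch

enum class TPosition { GM, A1, B1 };

// Mirrors the shape of Gemm::GemmType<Element, Layout, Position>.
template <class Element_, class Layout_, TPosition Position_ = TPosition::GM>
struct GemmType {
    using Element = Element_;
    using Layout = Layout_;
    static constexpr TPosition POSITION = Position_;
};

// Made-up labels so a test can observe which specialization was chosen.
enum class Variant { RowMajorToZnA1, AtlasA2RowMajorDefaultL1 };

// Primary template: declared only, so in this sketch an unlisted
// (ArchTag, GmType, L1Type) combination fails to compile.
template <class ArchTag, class GmType, class L1Type = void>
struct CopyGmToL1;

// Shape of the table row: any ArchTag, RowMajor in GM -> zN in L1 (A1).
template <class ArchTag, class Element>
struct CopyGmToL1<ArchTag,
                  GemmType<Element, layout::RowMajor>,
                  GemmType<Element, layout::zN, TPosition::A1>> {
    static constexpr Variant kVariant = Variant::RowMajorToZnA1;
};

// Shape of the table row: Arch::AtlasA2, RowMajor in GM, L1Type left as void.
template <class Element>
struct CopyGmToL1<Arch::AtlasA2, GemmType<Element, layout::RowMajor>> {
    static constexpr Variant kVariant = Variant::AtlasA2RowMajorDefaultL1;
};

// Simplified layout traits in the spirit of tla::detail::isRowMajor / iszN.
template <class Layout>
struct isRowMajor : std::false_type {};
template <>
struct isRowMajor<layout::RowMajor> : std::true_type {};

template <class Element, class Layout>
struct iszN : std::false_type {};
template <class Element>
struct iszN<Element, layout::zN> : std::true_type {};

// enable_if-guarded selection, as in the TileCopyTla prototype. The primary
// template is defined here (unlike above) so that support can be queried.
template <class ArchTag, class LayoutSrc, class ElementDst, class LayoutDst,
          class Enable = void>
struct TileCopy {
    static constexpr bool kSupported = false;
};

template <class LayoutSrc, class ElementDst, class LayoutDst>
struct TileCopy<Arch::AtlasA2, LayoutSrc, ElementDst, LayoutDst,
                std::enable_if_t<isRowMajor<LayoutSrc>::value &&
                                 iszN<ElementDst, LayoutDst>::value>> {
    static constexpr bool kSupported = true;
};

}  // namespace mock
```

In this sketch, `mock::CopyGmToL1<mock::Arch::AtlasA2, mock::GemmType<float, mock::layout::RowMajor>>` resolves to the second specialization because `L1Type` defaults to `void`, while supplying a `zN`/`A1` L1 type selects the first. That is why a `-` in the L1Type column of the tables corresponds to leaving the third argument out.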