Commit History
CANN: Add the basic supports of Flash Attention kernel (llama/13627)
112c144
Bizhao Shi
committed on
CANN: Support MUL_MAT_ID for q8_0 and q4_0 (llama/13705)
6a9f9dc
Chenguang Li
committed on
CANN: Support MOE Model MUL_MAT_ID (llama/13042)
f013e2d
Chenguang Li
committed on
CANN: Add support for async operator submission (llama/12864)
1b9d0f0
CANN: Add 310P operator support check (llama/12962)
14d0d7c
Chenguang Li
committed on
CANN: Add x86 build ci (llama/12950)
f4c9b36
CANN: Opt ROPE optimization (llama/12865)
3773a09
Chenguang Li
committed on
CANN: Optimize CANN buffer pool memory management (llama/12875)
66b93b3
CANN: Support more ops (llama/12841)
6aecea5
Chenguang Li
committed on
CANN: Support Opt CONV_TRANSPOSE_1D and ELU (llama/12786)
3b46fdc
Chenguang Li
committed on
ggml : add bilinear upscale support (ggml/1185)
4c5e449
Diego Devesa
committed on
CANN: fix typo in ggml-cann (llama/12733)
65ced74
CANN: Refactor to reduce duplicate code (llama/12731)
44ac81c
CANN: Support operator SIN COS ARGMAX (llama/12709)
904aaf5
CANN: Fix failed test cases (llama/12708)
7d5f3d4
get_rows and dup optimization (llama/12671)
ffa5f14
MUL_MAT optimization (llama/12382)
9dd08d5
Chenguang Li
committed on
ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
d6b6852
William Tambellini
slaren
committed on
ggml : refactor online repacking (llama/10446)
163128e
CANN: RoPE operator optimization (llama/10563)
3ad7b0a
CANN: ROPE operator optimization (llama/10540)
63ee002
CANN: Improve the Inferencing Performance for Ascend NPU Device (llama/10454)
f9fd6d6
Shanshan Shen
Frank Mai
committed on