Publications

Journal Papers

W. Li, G. Wang, D. Lyu, N. Xu, G. He, “Efficient Hardware Architecture Design for Rotary Position Embedding of Large Language Models,” in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, accepted, 2025.
D. Lyu, Z. Li, Y. Xu, G. Wang, W. Li, Y. Chen, L. Chen, W. He and G. He, “An Efficient Multi-View Cross-Attention Accelerator for Vision-Centric 3D Perception in Autonomous Driving,” in IEEE Transactions on Circuits and Systems I: Regular Papers, accepted, 2025.
G. Wang, S. Cai, W. Li, D. Lyu and G. He, “OFQ-LLM: Outlier-Flexing Quantization for Efficient Low-Bit Large Language Model Acceleration,” in IEEE Transactions on Circuits and Systems I: Regular Papers, accepted, 2025.
W. Li, D. Lyu, G. Wang, A. Hu, N. Xu, G. He, “Hardware-Oriented Algorithms for Softmax and Layer Normalization of Large Language Models,” in Science China Information Sciences, 2024, 67(10): 200404.
G. Wang, S. Cai, W. Li, D. Lyu, G. He, “BSViT: A Bit-Serial Vision Transformer Accelerator Exploiting Dynamic Patch and Weight Bit-Group Quantization,” in IEEE Transactions on Circuits and Systems I: Regular Papers, early access, 2024.
W. Li, A. Hu, N. Xu, G. He, “CoDA: A Co-Design Framework for Versatile and Efficient Attention Accelerators,” in IEEE Transactions on Computers, vol. 73, no. 8, pp. 1924-1938, Aug. 2024.
W. Li, A. Hu, N. Xu, G. He, “Quantization and Hardware Architecture Co-Design for Matrix-Vector Multiplications of Large Language Models,” in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 71, no. 6, pp. 2858-2871, June 2024.
A. Hu, W. Li, D. Lyu, G. He, “Efficient Parallel Stochastic Computing Multiply-Accumulate (MAC) Technique Using Pseudo-Sobol Bit-Streams,” in IEEE Transactions on Nanotechnology, vol. 23, pp. 170-179, 2024.
W. Li, A. Hu, N. Xu and G. He, “A Precision-Scalable Deep Neural Network Accelerator With Activation Sparsity Exploitation,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 1, pp. 263-276, Jan. 2024.
W. Li, A. Hu, G. Wang, N. Xu and G. He, “Low-Complexity Precision-Scalable Multiply-Accumulate Unit Architectures for Deep Neural Network Accelerators,” in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 70, no. 4, pp. 1610-1614, April 2023.
W. Li, N. Xu, R. Wang and G. He, “Efficient Compression Methods for Wire-Spread-Based Stochastic Computing Deep Neural Networks,” in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 69, no. 11, pp. 4538-4542, Nov. 2022.
W. Li, J. Lin and Z. Wang, “Multi-Layer Generalized Integrated Interleaved Codes,” in IEEE Communications Letters, vol. 24, no. 9, pp. 1880-1884, Sept. 2020.
W. Li, J. Lin and Z. Wang, “A 124-Gb/s Decoder for Generalized Integrated Interleaved Codes,” in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 8, pp. 3174-3187, Aug. 2019.
W. Li, J. Tian, J. Lin and Z. Wang, “Modified GII-BCH Codes for Low-Complexity and Low-Latency Encoders,” in IEEE Communications Letters, vol. 23, no. 5, pp. 785-788, May 2019.

Conference Papers

A. Hu, W. Li, D. Lv, G. He, “An Efficient Stochastic Convolution Accelerator based on Pseudo-Sobol Sequences,” 17th ACM International Symposium on Nanoscale Architectures (NANOARCH), 2022, pp. 1-6.
W. Li, J. Lin and Z. Wang, “Improved Soft-Assisted Iterative Bounded Distance Decoding for Product Codes,” 2019 IEEE 5th International Conference on Computer and Communications (ICCC), 2019, pp. 710-714.
Z. Yan, W. Li, J. Lin and Z. Wang, “A Low-Complexity Error-and-Erasure Decoding Algorithm for t=2 RS Codes,” 2019 IEEE International Workshop on Signal Processing Systems (SiPS), 2019, pp. 43-47.
Z. Yan, W. Li, J. Lin and Z. Wang, “Fast and Low-Complexity Decoding Algorithm and Architecture for Quadruple-Error-Correcting RS codes,” 2018 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), 2018, pp. 191-194.
W. Li, J. Lin and Z. Wang, “Comparison between Generalized Integrated Interleaved Codes and Generalized Error Location Codes,” 2018 IEEE 18th International Conference on Communication Technology (ICCT), 2018, pp. 21-25.
W. Li, J. Lin and Z. Wang, “An efficient post processing scheme to lower the error floor of LDPC decoders,” 2017 IEEE 17th International Conference on Communication Technology (ICCT), 2017, pp. 122-126.

You can also find my articles on my Google Scholar profile.

Wenjie Li (李文杰)
