Publications

Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment Shengyang Sun, Yian Zhang*, Alexander Bukharin*, David Mosallanezhad, Jiaqi Zeng, Soumye Singhal, Gerald Shen, Adithya Renduchintala, Tugrul Konuk, Yi Dong, Zhilin Wang, Dmitry Chichkov, Olivier Delalleau, Oleksii Kuchaiev RLHF-RPO-DPO
Nemotron-4 340B Technical Report Shengyang Sun et al. Technical Report. Nemotron-4-340B-Instruct Synthetic Data Generation
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev COLM 2024. PPO Pipeline in NeMo-Aligner
Information-theoretic Online Memory Selection for Continual Learning Shengyang Sun, Daniele Calandriello, Huiyi Hu, Ang Li, Michalis Titsias ICLR 2022. Online Memory Selection
Understanding the Variance Collapse of SVGD in High Dimensions Jimmy Ba, Murat A Erdogdu, Marzyeh Ghassemi, Shengyang Sun, Taiji Suzuki, Denny Wu, Tianzong Zhang ICLR 2022. MMD-Descent recovers the Ground-Truth HMC Posterior
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition Shengyang Sun, Jiaxin Shi, Andrew Gordon Wilson, Roger Grosse ICML 2021. Orthogonally Decomposed Gaussian Processes
Neural Networks as Inter-Domain Inducing Points Shengyang Sun*, Jiaxin Shi*, Roger Grosse AABI 2021. A two-layer neural network is equivalent to a sparse Gaussian process
Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations? [oral] Chaoqi Wang*, Shengyang Sun*, Roger Grosse AISTATS 2021. Predictive posterior correlations
Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes Jun Yang*, Shengyang Sun*, Daniel M. Roy NeurIPS 2019. Fast-Rate PAC-Bayes Generalization Bounds
Functional Variational Bayesian Neural Networks Shengyang Sun*, Guodong Zhang*, Jiaxin Shi*, Roger Grosse ICLR 2019. FBNN learns from implicit piecewise constant/linear function priors
Aggregated Momentum: Stability Through Passive Damping James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse ICLR 2019. Aggregated Momentum
Differentiable Compositional Kernel Learning for Gaussian Processes Shengyang Sun, Guodong Zhang, Chaoqi Wang, Wenyuan Zeng, Jiaman Li, Roger Grosse ICML 2018. Neural Kernel Network learns GP priors automatically from training data
Noisy Natural Gradient as Variational Inference Guodong Zhang*, Shengyang Sun*, David Duvenaud, Roger Grosse ICML 2018. Gaussian Variational Posteriors
A Spectral Approach to Gradient Estimation for Implicit Distributions Jiaxin Shi, Shengyang Sun, Jun Zhu ICML 2018. Image Generation using the SSGE
Kernel implicit variational inference Jiaxin Shi*, Shengyang Sun*, Jun Zhu ICLR 2018. Image Interpolation
ZhuSuan: A library for Bayesian deep learning Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, Yuhao Zhou ZhuSuan Implementation for Bayesian neural networks
Learning structured weight uncertainty in Bayesian neural networks Shengyang Sun, Changyou Chen, Lawrence Carin AISTATS 2017. Bayesian neural network weights with matrix-variate Gaussian posteriors
On the Spectral Efficiency of Massive MIMO Systems With Low-Resolution ADCs. Jiayi Zhang, Linglong Dai, Shengyang Sun, Zhaocheng Wang IEEE Communications Letters 2016. MIMO