I am currently an AI researcher working on embodied AI with Dr. Tao Kong at ByteDance Research.

I received my Master’s degree in Artificial Intelligence from Fudan University (Sep. 2021 - Jun. 2024), where I was advised by Prof. Tao Chen. I have been fortunate to work closely with Dr. Hongyuan Zhu from A*STAR, Singapore, and with Dr. Gang Yu, Dr. Xin Chen, and Dr. Chi Zhang from Tencent. Before that, I obtained my Bachelor’s degree in Data Science and Big Data Technology, also from Fudan University (Sep. 2017 - Jun. 2021).

My long-term research goal is to develop robust and generalizable multimodal systems that can perceive, understand, and interact with the physical world.

📣 If you are interested in my previous projects, feel free to check out my resume here.

🔥 News

Apr. 2025. 🎉🎉 We release OmniSVG, a family of VLMs that progressively generate SVGs spanning from simple icons to intricate anime characters.
Jan. 2025. 🎉🎉 Our MeshAnything is accepted to ICLR 2025.
Sep. 2024. 🎉🎉 Two papers are accepted to NeurIPS 2024: one focuses on foundational 3D generative models (MeshXL), and the other explores the Mamba architecture for 3D detection (3DET-Mamba).
Jul. 2024. 🎉🎉 Our M3DBench, a dataset querying 3D LLMs with multi-modal prompts, is accepted to ECCV 2024.
May. 2024. 🎉🎉 We release MeshXL, a family of generative 3D foundation models for 3D meshes.
May. 2024. 🎉🎉 I successfully defended my master’s thesis! [defense slides]
Apr. 2024. 🎉🎉 Our state-of-the-art 3D dense captioning method, Vote2Cap-DETR++, is accepted to T-PAMI 2024.
Feb. 2024. 🎉🎉 Our Large Language 3D Assistant, LL3DA, is accepted to CVPR 2024.
Jan. 2024. 🐧🐧 Joined Tencent as a research intern, working on 3D generation.
Oct. 2023. 🥇🥇 Won the Scan2Cap Challenge at ICCV 2023.
Feb. 2023. 🎉🎉 Our Vote2Cap-DETR is accepted to CVPR 2023.

📝 Selected Publications

I started my research by exploring how to use language for better 3D scene understanding (Vote2Cap-DETR and Vote2Cap-DETR++). Then, as large language models showed tremendous potential as generalists, I explored whether LLMs can understand 3D (LL3DA and M3DBench). After that, I spent a wonderful half year exploring whether LLMs can speak 3D (MeshXL). Currently, I am working on both embodied AI and AIGC.

ArXiv Pre-print

OmniSVG: A Unified Scalable Vector Graphics Generation Model
Yiying Yang$^\star$, Wei Cheng$^\star$, Sijin Chen, Xianfang Zeng, Jiaxu Zhang, Liao Wang, Gang Yu, Xinjun Ma, Yu-Gang Jiang

project | arXiv | github

OmniSVG progressively generates SVGs spanning from simple icons to intricate anime characters.

NeurIPS 2024

MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Sijin Chen, Xin Chen$^{\dagger}$, Anqi Pang, Xianfang Zeng, Wei Cheng, Yijun Fu, Fukun Yin, Yanru Wang, Zhibin Wang, Chi Zhang, Jingyi Yu, Gang Yu, Bin Fu, Tao Chen$^{\ddagger}$

project | arXiv | github

MeshXL turns a 3D mesh into one unique coordinate sequence, facilitating an end-to-end training pipeline for large-scale 3D mesh data.
🎉 Please also see our MeshAnything, a model that mimics human artists in extracting meshes from any 3D representation, which is accepted to ICLR 2025.

CVPR 2024

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen, Xin Chen$^{\dagger}$, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen$^{\ddagger}$

paper | project | arXiv | github | youtube

We propose a Large Language 3D Assistant that responds to both visual interactions and textual instructions in complex 3D environments.
🎉 Please also see our M3DBench, a dataset querying 3D LLMs with multi-modal prompts, which is accepted to ECCV 2024.

T-PAMI 2024

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
Sijin Chen, Hongyuan Zhu, Mingsheng Li, Xin Chen, Peng Guo, Yinjie Lei, Gang Yu, Taihao Li, Tao Chen$^{\dagger}$

paper | arXiv | github

Decoupled feature extraction and task decoding for 3D Dense Captioning.

CVPR 2023

End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Gang Yu, Tao Chen$^{\dagger}$

paper | arXiv | github | youtube

We address 3D Dense Captioning as a set prediction problem with parallel decoding.
The first non-“detect-then-describe” framework for 3D Dense Captioning.
🥇 Winner of the Scan2Cap Challenge in the 3rd Language for 3D Scene Workshop at ICCV 2023. [talk]

🥇 Awards and Scholarships

Apr. 2024. Award for Outstanding Graduate Student (rank 1/24).
Oct. 2023. 1st place in the Scan2Cap Challenge in the 3rd Language for 3D Scene Workshop at ICCV 2023.
Sep. 2023. National Scholarship (rank 1/46).
Sep. 2022. 2nd prize of the Scholarship for Outstanding Students of Master’s Degrees.
Sep. 2021. Award for the Scholarship for Outstanding Students of Master’s Degrees.
Jun. 2021. 2nd prize of the Scholarship for Outstanding Students.

📖 Education

Sep. 2021 - Jun. 2024. Master’s student at Fudan University.
Sep. 2017 - Jun. 2021. Bachelor’s student at Fudan University.

💬 Oral Presentations

Jul. 2024. “MeshXL: Neural Coordinate Field for Generative 3D Foundation Models”. MeshXL paves the way for scaling up training on large-scale 3D mesh data. Our mesh representation turns a 3D mesh into one unique coordinate sequence, which enables us to simplify our architecture design into a decoder-only transformer model, facilitating an end-to-end training pipeline for large-scale 3D mesh data. A technical report at miHoYo.
Oct. 2023. “Vote2Cap-DETR: A Set-to-Set Perspective Towards 3D Dense Captioning”. By treating 3D Dense Captioning as a translation task from a set of object queries into a set of “box-caption” pairs, we present a set-to-set perspective towards 3D Dense Captioning. A winner presentation for the Scan2Cap challenge at ICCV 2023. [talk | slides]
Jun. 2023. “End-to-End 3D Dense Captioning with Vote2Cap-DETR”. We present an end-to-end transformer model for localizing and describing objects in parallel within diverse 3D environments. A paper presentation at VALSE 2023, Wuxi, China.

Sijin Chen(思锦 陈)
