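Xiaoshuai Sun (孙晓帅)
Professor, Doctoral Supervisor, and Head of the Department of Artificial Intelligence, Xiamen University
Media Analytics and Computing Laboratory
Address: Room 110-2, Building 5, West Area, Xiang'an Campus, Xiamen University. Postal code: 361005
Email: xssun@xmu.edu.cn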
Code Repositories
■
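Name: External-Attention-pytorch
Link: https://github.com/xmu-xiaoma666/External-Attention-pytorch
Description:
External-Attention-pytorch is an open-source code library for the deep learning research and industrial communities. It provides a set of reusable deep learning modules so that researchers and engineers can quickly build their own models.
Key features of the project:
Easy to use: each component is concise and clear, and comes with detailed documentation and example code, so that newcomers to deep learning can get started quickly.
Highly customizable: components are designed flexibly and can be customized through parameter configuration and inheritance to suit different tasks and application scenarios.
Open source: the project is released under the MIT license, so anyone can freely use, modify, and distribute the code.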
■
Name: Clover
Link: https://github.com/LeeYN-43/Clover
Citation:
Jingjia Huang, Yinan Li, Jiashi Feng, Xinglong Wu, Xiaoshuai Sun (corresponding author), Rongrong Ji.
Clover: Towards A Unified Video-Language Alignment and Fusion Model.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
■
Name: RefCLIP
Link: https://github.com/kingthreestones/RefCLIP
Citation:
Lei Jin, Gen Luo, Yiyi Zhou, Xiaoshuai Sun (corresponding author), Guannan Jiang, Annan Shu, Rongrong Ji.
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
■
Name: LSTNet
Link: https://github.com/xmu-xiaoma666/LSTNet
Citation:
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Rongrong Ji.
Towards Local Visual Modeling for Image Captioning.
Pattern Recognition (PR), 2023.
■
Name: EPNG
Link: https://github.com/Mr-Neko/EPNG
Citation:
Haowei Wang, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Xiaoshuai Sun (corresponding author).
Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023.
■
Name: EoID
Link: https://github.com/mrwu-mac/EoID
Citation:
Mingrui Wu, Jiaxin Gu, Yunhang Shen, Mingbao Lin, Chao Chen, Xiaoshuai Sun (corresponding author).
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023.
■
Name: X-CLIP
Link: https://github.com/xuguohai/X-CLIP
Citation:
Yiwei Ma, Guohai Xu, Xiaoshuai Sun (corresponding author), Ming Yan, Ji Zhang, Rongrong Ji.
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval.
ACM International Conference on Multimedia (ACM MM), 2022.
■
Name: MDSANet
Link: https://github.com/Young499/image-captioning-MDSANet
Citation:
Jiayi Ji, Xiaoyang Huang, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Gen Luo, Liujuan Cao, Jianzhuang Liu.
Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.
IEEE Transactions on Multimedia (TMM), 2022.
■
Name: SDATR
Link: https://github.com/xmu-xiaoma666/SDATR
Citation:
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji.
Knowing What It Is: Semantic-Enhanced Dual Attention Transformer.
IEEE Transactions on Multimedia (TMM), 2022.
■
Name: MFM
Link: https://github.com/xmu-xiaoma666/MFM
Citation:
Jiayi Ji, Yiwei Ma, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Yongjian Wu, Rongrong Ji.
Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning.
IEEE Transactions on Image Processing (TIP), 2022.
■
Name: DIFNet
Link: https://github.com/mrwu-mac/DIFNet
Citation:
Mingrui Wu, Xuying Zhang, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Chao Chen, Jiaxin Gu, Xing Sun, Rongrong Ji.
DIFNet: Boosting Visual Information Flow for Image Captioning.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
■
Name: TRAR-VQA
Link: https://github.com/rentainhe/TRAR-VQA
Citation:
Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun (corresponding author), Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji.
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering.
International Conference on Computer Vision (ICCV), 2021.
■
Name: DLCT
Link: https://github.com/luo3300612/image-captioning-DLCT
Citation:
Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun (corresponding author), Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji.
Dual-Level Collaborative Transformer for Image Captioning.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021.
■
Name: RSTNet
Link: https://github.com/zhangxuying1004/RSTNet
Citation:
Xuying Zhang, Xiaoshuai Sun (corresponding author), Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji.
RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
■
Name: MSA
Link: https://github.com/zhangxuying1004/MSA
Citation:
Xiaoshuai Sun (first author), Xuying Zhang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji.
Exploring Language Prior for Mode-Sensitive Visual Attention Modeling.
ACM International Conference on Multimedia (ACM MM), 2020.