Institute of Artificial Intelligence, Xiamen University
Xiaoshuai Sun (孙晓帅)

Xiamen University    Professor, Doctoral Supervisor, Head of the Department of Artificial Intelligence

Media Analytics and Computing Laboratory

Address: Room 110-2, Building 5, West Area, Xiang'an Campus, Xiamen University    Postal code: 361005

Email: xssun@xmu.edu.cn

Code Repositories

  ■

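Name: External-Attention-pytorch

Link: https://github.com/xmu-xiaoma666/External-Attention-pytorch

Description: External-Attention-pytorch is an open-source code library for the deep learning research and industrial communities. It provides a collection of reusable deep learning modules so that researchers and engineers can quickly build their own models. Its main features are:
Easy to use: each component is kept simple and clear, and comes with detailed documentation and example code, so that deep learning beginners can get started quickly.
Highly customizable: the components are designed flexibly and can be tailored through parameter configuration and inheritance to suit different tasks and application scenarios.
Open source: the project is released under the MIT License, so anyone can use, modify, and distribute the code free of charge.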

  ■

Name: Clover

Link: https://github.com/LeeYN-43/Clover

Citation
Jingjia Huang, Yinan Li, Jiashi Feng, Xinglong Wu, Xiaoshuai Sun (corresponding author), Rongrong Ji.
Clover: Towards A Unified Video-Language Alignment and Fusion Model.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

  ■

Name: RefCLIP

Link: https://github.com/kingthreestones/RefCLIP

Citation
Lei Jin, Gen Luo, Yiyi Zhou, Xiaoshuai Sun (corresponding author), Guannan Jiang, Annan Shu, Rongrong Ji.
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

  ■

Name: LSTNet

Link: https://github.com/xmu-xiaoma666/LSTNet

Citation
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Rongrong Ji.
Towards Local Visual Modeling for Image Captioning.
Pattern Recognition (PR), 2023.

  ■

Name: EPNG

Link: https://github.com/Mr-Neko/EPNG

Citation
Haowei Wang, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Xiaoshuai Sun (corresponding author).
Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023.

  ■

Name: EoID

Link: https://github.com/mrwu-mac/EoID

Citation
Mingrui Wu, Jiaxin Gu, Yunhang Shen, Mingbao Lin, Chao Chen, Xiaoshuai Sun (corresponding author).
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023.

  ■

Name: X-CLIP

Link: https://github.com/xuguohai/X-CLIP

Citation
Yiwei Ma, Guohai Xu, Xiaoshuai Sun (corresponding author), Ming Yan, Ji Zhang, Rongrong Ji.
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval.
ACM International Conference on Multimedia (ACM MM), 2022.

  ■

Name: MDSANet

Link: https://github.com/Young499/image-captioning-MDSANet

Citation
Jiayi Ji, Xiaoyang Huang, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Gen Luo, Liujuan Cao, Jianzhuang Liu.
Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.
IEEE Transactions on Multimedia (TMM), 2022.

  ■

Name: SDATR

Link: https://github.com/xmu-xiaoma666/SDATR

Citation
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji.
Knowing What It Is: Semantic-Enhanced Dual Attention Transformer.
IEEE Transactions on Multimedia (TMM), 2022.

  ■

Name: MFM

Link: https://github.com/xmu-xiaoma666/MFM

Citation
Jiayi Ji, Yiwei Ma, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Yongjian Wu, Rongrong Ji.
Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning.
IEEE Transactions on Image Processing (TIP), 2022.

  ■

Name: DIFNet

Link: https://github.com/mrwu-mac/DIFNet

Citation
Mingrui Wu, Xuying Zhang, Xiaoshuai Sun (corresponding author), Yiyi Zhou, Chao Chen, Jiaxin Gu, Xing Sun, Rongrong Ji.
DIFNet: Boosting Visual Information Flow for Image Captioning.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

  ■

Name: TRAR-VQA

Link: https://github.com/rentainhe/TRAR-VQA

Citation
Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun (corresponding author), Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji.
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering.
International Conference on Computer Vision (ICCV), 2021.

  ■

Name: DLCT

Link: https://github.com/luo3300612/image-captioning-DLCT

Citation
Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun (corresponding author), Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji.
Dual-Level Collaborative Transformer for Image Captioning.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021.

  ■

Name: RSTNet

Link: https://github.com/zhangxuying1004/RSTNet

Citation
Xuying Zhang, Xiaoshuai Sun (corresponding author), Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji.
RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

  ■

Name: MSA

Link: https://github.com/zhangxuying1004/MSA

Citation
Xiaoshuai Sun (first author), Xuying Zhang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji.
Exploring Language Prior for Mode-Sensitive Visual Attention Modeling.
ACM International Conference on Multimedia (ACM MM), 2020.