Code Repositories - Media Analytics and Computing (MAC) Group, Xiamen University

■

Name: StoryWeaver

Link: https://github.com/Aria-Zhangjl/StoryWeaver

Citation: Jinlu Zhang, Jiji Tang, Rongsheng Zhang, Tangjie Lv, Xiaoshuai Sun*. StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our StoryWeaver for character-customized story visualization. It is built on PyTorch and contains the proposed benchmark, training code, and sample code.

■

Name: BoMS

Link: https://github.com/theFool32/BoMS

Citation: Hong Liu, Jie Li, Yongjian Wu, Rongrong Ji*. Learning Neural Bag-of-Matrix-Summarization with Riemannian Network. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our Bag-of-Matrix-Summarization method, which trains a more stable Riemannian network. It is built on PyTorch and contains training, validation, and test code. The proposed method can be used for many real-world tasks, e.g., face verification, person re-ID, and brain-interface signal processing.

■

Name: OCH

Link: https://github.com/LynnHongLiu/OCH

Citation: Hong Liu, Rongrong Ji*, Jingdong Wang, Chunhua Shen. Ordinal Constraint Binary Coding for Approximate Nearest Neighbor Search. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

Description: This package is the implementation of our Ordinal Constraint Hashing, which aims to learn a robust hash function. It is implemented in MATLAB and contains training code, test code, and an example dataset. The proposed method can be used for large-scale image, text, and music retrieval.

■

Name: CMGAT

Link: Person Re-Identification with Generative Adversarial Training.rar

Citation: Pingyang Dai, Rongrong Ji*, Haibin Wang, Qiong Wu, Yuyu Huang. Cross-Modality Person Re-Identification with Generative Adversarial Training. International Joint Conference on Artificial Intelligence (IJCAI).

Description: This package is the implementation of our person re-ID system, in which cross-modality generative adversarial training is adopted to achieve efficient training and cutting-edge accuracy.

■

Name: DGRL

Link: https://github.com/zhengxiawu/DGRL_OPFE

Citation: Xiawu Zheng, Rongrong Ji*, Xiaoshuai Sun, Baochang Zhang, Yongjian Wu, Feiyue Huang. Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale layer. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our fine-grained image retrieval (FGIR) method, which trains discriminative deep models in limited time. It also includes validation code for FGIR.

■

Name: UAP_retrieval

Link: https://github.com/theFool32/UAP_retrieval

Citation: Jie Li, Rongrong Ji*, Hong Liu, Xiaopeng Hong, Yue Gao, Qi Tian. Universal Perturbation Attack Against Image Retrieval. IEEE International Conference on Computer Vision (ICCV).

Description: This package is the implementation of our universal adversarial perturbation against image retrieval. It builds on a popular image retrieval system and can significantly reduce retrieval performance on the classical datasets, i.e., Paris6k and Oxford5k.

■

Name: ICP

Link: https://github.com/hujiecpp/InformationCompetingProcess

Citation: Jie Hu, Rongrong Ji*, Shengchuan Zhang, Xiaoshuai Sun, Qixiang Ye, Chia-Wen Lin, Qi Tian. Information Competing Process for Learning Diversified Representations. Neural Information Processing Systems (NIPS).

Description: This package is the implementation of our Information Competing Process for learning diversified representations (i.e., powerful and disentangled features).

■

Name: ControlMLLM

Link: https://github.com/mrwu-mac/ControlMLLM

Citation: Mingrui Wu, Xinyue Cai, Jiayi Ji*, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji. ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models. Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024).

Description: This package implements our ControlMLLM, which supports training-free integration of referring abilities into MLLMs. It is built on PyTorch and contains demo code and test code.

■

Name: X-Oscar

Link: https://github.com/LinZhekai/X-Oscar

Citation: Yiwei Ma, Zhekai Lin, Jiayi Ji, Yijun Fan, Xiaoshuai Sun*, Rongrong Ji. X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation. Forty-first International Conference on Machine Learning (ICML 2024).

Description: This package is the implementation of our 3D animatable avatar generation method. It is built on PyTorch and contains the complete code for avatar training, generation, and animation, enabling users to create detailed and realistic 3D avatars from textual descriptions and to animate them.

■

Name: R-Bench

Link: https://github.com/mrwu-mac/R-Bench

Citation: Mingrui Wu, Jiayi Ji, Oucheng Huang, Jiale Li, Yuhang Wu, Xiaoshuai Sun, Rongrong Ji. Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models. Forty-first International Conference on Machine Learning (ICML 2024).

Description: This package is the implementation of our R-Bench, which is used for evaluating relationship hallucinations in large vision-language models. It is built on PyTorch and contains evaluation code and the R-Bench data.

■

Name: 3D-STMN

Link: https://github.com/sosppxo/3D-STMN

Citation: Changli Wu, Yiwei Ma, Qi Chen, Haowei Wang, Gen Luo, Jiayi Ji*, Xiaoshuai Sun. 3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our Dependency-Driven Superpoint-Text Matching Network for end-to-end 3D referring expression segmentation. It is built on PyTorch and contains training, validation, and test code.

■

Name: OMPQ

Link: https://github.com/MAC-AutoML/OMPQ

Citation: Yuexiao Ma, Taisong Jin*, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei Zhang, Rongrong Ji. OMPQ: Orthogonal Mixed Precision Quantization. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our paper OMPQ: Orthogonal Mixed Precision Quantization. It includes mixed-bit configuration calculation, quantization-aware training (QAT), and post-training quantization (PTQ).

■

Name: MRECG

Link: https://github.com/bytedance/MRECG

Citation: Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Fei Chao, Rongrong Ji*. Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: This package is the implementation of our paper Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective. It includes several CNN models.

■

Name: AffineQuant

Link: https://github.com/bytedance/AffineQuant

Citation: Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng Ling, Xuefeng Xiao, Rui Wang, Shilei Wen, Fei Chao, Rongrong Ji. AffineQuant: Affine Transformation Quantization for Large Language Models. International Conference on Learning Representations (ICLR).

Description: This package is the implementation of our paper AffineQuant: Affine Transformation Quantization for Large Language Models. It includes block-wise optimization of LLMs and model validation.

■

Name: GDP

Link: https://github.com/ShaohuiLin/GDP

Citation: Shaohui Lin, Rongrong Ji*, Yuchao Li, Yongjian Wu, Feiyue Huang, Baochang Zhang. Accelerating Convolutional Networks via Global & Dynamic Filter Pruning. International Joint Conference on Artificial Intelligence (IJCAI).

Description: TensorFlow implementation of GDP, a novel global and dynamic pruning scheme that prunes redundant filters for CNN acceleration.

■

Name: LRDKT

Link: https://github.com/ShaohuiLin/LRDKT

Citation: Shaohui Lin, Rongrong Ji*, Yongjian Wu, and Xuelong Li. Towards Convolutional Neural Networks Compressing via Global Error Reconstruction. Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI).

Description: Caffe implementation of the CNN compression algorithm LRDKT.

■

Name: CapsAtt

Link: https://github.com/XMUVQA/CapsAtt

Citation: Yiyi Zhou, Rongrong Ji*, Jinsong Su, Xiaoshuai Sun, Weiqiu Chen. Dynamic Capsule Attention for Visual Question Answering. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of the proposed Dynamic Capsule Attention algorithm for visual question answering and image captioning, which helps a deep learning model perform multiple reasoning steps with fewer parameters.

■

Name: SSR

Link: https://github.com/ShaohuiLin/SSR

Citation: Shaohui Lin, Rongrong Ji*, Yuchao Li, Cheng Deng, Xuelong Li. Towards Compact ConvNets via Structure-sparsity Regularized Filter Pruning. IEEE Transactions on Neural Networks and Learning Systems (TNN).

Description: This package includes a novel filter pruning scheme, termed structured sparsity regularization (SSR), which simultaneously speeds up computation and reduces the memory overhead of CNNs, and which is well supported by various off-the-shelf deep learning libraries.

■

Name: GAL

Link: https://github.com/ShaohuiLin/GAL

Citation: Shaohui Lin, Rongrong Ji*, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, David Doermann. Towards Optimal Structured CNN Pruning via Generative Adversarial Learning. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: PyTorch implementation of GAL, which solves the compression problem with generative adversarial learning (GAL) by learning a sparse soft mask in a label-free, end-to-end manner.

■

Name: KSE

Link: https://github.com/yuchaoli/KSE

Citation: Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji*. Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: PyTorch implementation of KSE, which compresses the network using a kernel sparsity and entropy (KSE) indicator.

■

Name: MDE

Link: https://github.com/tanglang96/MDENAS

Citation: Xiawu Zheng, Rongrong Ji*, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian. Multinomial Distribution Learning for Effective Neural Architecture Search. IEEE International Conference on Computer Vision (ICCV).

Description: This package is the implementation of our paper Multinomial Distribution Learning for Effective Neural Architecture Search. It contains test code and pretrained models.

■

Name: SSPNG

Link: https://github.com/nini0919/SSPNG

Citation: Danni Yang, Jiayi Ji*, Xiaoshuai Sun, Haowei Wang, Yinan Li, Yiwei Ma, Rongrong Ji. Semi-Supervised Panoptic Narrative Grounding. Thirty-First ACM International Conference on Multimedia (ACM MM23).

Description: This package is the implementation of our paper, which addresses the Panoptic Narrative Grounding task in a semi-supervised manner.

■

Name: EoID

Link: https://github.com/mrwu-mac/EoID

Citation: Mingrui Wu, Jiaxin Gu, Yunhang Shen, Mingbao Lin, Chao Chen, Xiaoshuai Sun*. End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our end-to-end zero-shot HOI detection method, which is trained via vision and language knowledge distillation. It is built on PyTorch and contains training, validation, and test code.

■

Name: ActiveTeacher

Link: https://github.com/HunterJ-Lin/ActiveTeacher

Citation: Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao*, Rongrong Fu, Qiang Xu, Rongrong Ji. Active Teacher for Semi-Supervised Object Detection. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: This package is the implementation of our Active Teacher for Semi-Supervised Object Detection. It includes the three data-measurement metrics proposed in the paper, as well as semi-supervised training and inference code.

■

Name: WSOVOD

Link: https://github.com/HunterJ-Lin/WSOVOD

Citation: Jianghang Lin, Yunhang Shen, Bingquan Wang, Liujuan Cao*, Shaohui Lin, Ke Li. Weakly Supervised Open-Vocabulary Object Detection. The 38th Annual AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of our Weakly Supervised Open-Vocabulary Object Detection (WSOVOD). It includes training and testing code for various mixed datasets in an open-vocabulary manner.

■

Name: SSAL

Link: Not yet open-sourced

Citation: Rongrong Ji, Ke Li*, Yan Wang, Xiaowei Guo, Yongjian Wu, Feiyue Huang, and Jiebo Luo. Semi-Supervised Adversarial Monocular Depth Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

Description: This package is the implementation of our Semi-Supervised Adversarial Learning framework, which trains competitive deep models with limited labeled data. It also includes validation code for a typical computer vision application, i.e., monocular depth estimation.

■

Name: ASV

Link: A New Transfer Function for Volume Visualization.mp4

Citation: Chenxi Huang, Yisha Lan, Guokai Zhang, Gaowei Xu, Guoyan Zheng, Nianyin Zeng, E.Y.K. NG, YongQiang Cheng, Landu Jiang, NingZhi Han, Rongrong Ji*, Yonghong Peng. A New Transfer Function for Volume Visualization of Aortic Stent and Virtual Endoscopy Application. Transactions on Multimedia Computing, Communications, and Applications (TOMM).

Description: We release a video demonstration of our volume visualization method for virtual endoscopy applications.

■

Name: MLDT

Link: https://github.com/Klitter/Pyramidal_Person_ReID

Citation: Feng Zheng, Rongrong Ji*, Cheng Deng, Xing Sun, Xingyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang. Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: This is an implementation of our multi-loss dynamic training framework, which targets pyramidal person re-ID.

■

Name: FGVQA

Link: http://mac.xmu.edu.cn/rrji/codes/YiyiZhou_TPAMI2019_Fine-Grained Learning for Visual Question Answering.rar

Citation: Yiyi Zhou, Rongrong Ji*, Xiaoshuai Sun, Jingsong Su, Deyu Meng, Yue Gao, Chunhua Shen. Plenty is Plague: Fine-Grained Learning for Visual Question Answering. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

Description: This package is the implementation of our fine-grained learning strategy for VQA on the VQA2.0, VQA-CP, and Visual Genome datasets, which trains VQA models with fewer training examples and steps. It includes the Actor-Critic based training scheme and the three sampling strategies proposed in the paper.

■

Name: PIL

Link: https://github.com/xiangmingLi/PIL

Citation: Yiyi Zhou, Rongrong Ji, Jinsong Su, Xiangming Li, Xiaoshuai Sun. Free VQA Models from Knowledge Inertia By Pairwise Inconformity Learning. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI).

Description: This package is the implementation of the proposed Pairwise Inconformity Learning strategy for visual question answering on the VQA2.0 dataset. The strategy helps VQA models reduce the effect of language bias, which improves their robustness, and it can also improve model performance.

■

Name: WS-JDS

Link: https://github.com/shenyunhang/WS-JDS

Citation: Yunhang Shen, Rongrong Ji*, Yan Wang, Yongjian Wu, Liujuan Cao. Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: This package is the implementation of our efficient and effective framework, termed Weakly Supervised Joint Detection and Segmentation (WS-JDS). It includes training and testing code for various datasets, as well as several state-of-the-art WSOD methods.

■

Name: PPS

Link: https://github.com/shenyunhang/PPS

Citation: Yunhang Shen, Rongrong Ji*, Xiaopeng Hong, Feng Zheng, Xiaowei Guo, Yongjian Wu, Feiyue Huang. A Part Power Set Model for Scale-Free Person Retrieval. International Joint Conference on Artificial Intelligence (IJCAI).

Description: This package is the implementation of PPS, an end-to-end part power set model with multi-scale features. PPS captures the discriminative parts of pedestrians from global to local and from coarse to fine, enabling part-based, scale-free person re-ID. It includes training and testing code for various datasets, as well as several state-of-the-art re-ID methods.

■

Name: GAL-fWSD

Link: https://github.com/shenyunhang/GAL-fWSD

Citation: Yunhang Shen, Rongrong Ji*, Shengchuan Zhang, Wangmeng Zuo, Yan Wang, Feiyue Huang. Generative Adversarial Learning Towards Fast Weakly Supervised Detection. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).

Description: This package is the implementation of a novel generative adversarial learning paradigm, termed Generative Adversarial Learning Towards Fast Weakly Supervised Detection (GAL-fWSD), which treats the inference of weakly supervised detection (WSD) as a generative process supervised by a discriminator. It includes training and testing code for various datasets.

■

Name: CSC

Link: https://github.com/shenyunhang/CSC

Citation: Yunhang Shen, Rongrong Ji*, Kuiyuan Yang, Cheng Deng, Changhu Wang. Category-Aware Spatial Constraint for Weakly Supervised Detection. IEEE Transactions on Image Processing (TIP).

Description: This package is the implementation of the Category-aware Spatial Constraint (CSC) scheme for proposals, which is integrated into weakly supervised object detection in an end-to-end learning manner. In particular, we incorporate the global shape information of objects as an unsupervised constraint, which is inferred from built-in foreground-and-background cues, termed Category-specific Pixel Gradient (CPG) maps. It includes training and testing code for various datasets.

■

Name: OPG

Link: https://github.com/shenyunhang/OPG

Citation: Yunhang Shen, Rongrong Ji*, Changhu Wang, Xi Li, Xuelong Li. Weakly Supervised Object Detection via Object-Specific Pixel Gradient. IEEE Transactions on Neural Networks and Learning Systems (TNN).

Description: This package is the implementation of a novel scheme for weakly supervised object localization, termed object-specific pixel gradient (OPG). OPG is trained using image-level annotations alone and operates iteratively to localize potential objects in a given image robustly and efficiently. It includes training and testing code.

■

Name: Bi-MHGL

Link: https://github.com/cfh3c/BiMHGL

Citation: Rongrong Ji, Fuhai Chen, Liujuan Cao*, Yue Gao*. Cross-Modality Microblog Sentiment Prediction via Bi-Layer Multimodal Hypergraph Learning. IEEE Transactions on Multimedia (TMM).

Description: This is the implementation of bi-layer multimodal hypergraph learning for microblog sentiment prediction. It covers transductive learning, evaluation, and inference. The entry point is run_CV_gridsearch.m, and BiHG_learning2.m contains the core part of BiMHGL.