# VLFND

**Repository Path**: linrongxin2024/vlfnd

## Basic Information

- **Project Name**: VLFND
- **Description**: Fake news detection based on large vision-language models
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2025-04-20
- **Last Updated**: 2025-11-24

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Multimodal Fake News Detection Based on Large Vision-Language Models

## Build Environment

### Anaconda

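1. Install Anaconda to a directory of your choice on Linux (useful, for example, when the system disk is short on space).

Download page: [Download Now | Anaconda](https://www.anaconda.com/download/success)

2. On the download page, click the penguin icon for the Linux version; the page jumps to the installer list further down. In most cases, pick the first x86 entry, right-click it, and choose "Copy link address", as shown in the figure.
3. Change to the target directory on your Linux system and use wget to download the file from that URL: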

```bash
wget https://repo.anaconda.com/archive/Anaconda3-2024.10-1-Linux-x86_64.sh
```

4. Once the download succeeds, the .sh file appears in the directory.
5. Install Anaconda3:

```bash
bash Anaconda3-2024.10-1-Linux-x86_64.sh
```

6. Update the global environment variables:

```bash
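# Open the system-wide environment configuration file
vim /etc/profile
# Add the following line to /etc/profile, then save and exit
export PATH=/root/anaconda3/bin:$PATH
# Reload the configuration so the change takes effect in the current shell
source /etc/profile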
```

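* `export` sets an environment variable and exports it to the current shell process and its child processes, so all of them can use it.
* `PATH` is an essential environment variable that stores a list of directories. When looking up an executable, the system searches these directories in the order they appear in `PATH`.
* `/root/anaconda3/bin` is the `bin` directory under the Anaconda installation directory. It contains the commands and tools that Anaconda provides, such as `python` and `conda`.
* `$PATH` refers to the current value of the `PATH` environment variable.
* Taken together, the line prepends `/root/anaconda3/bin` to `PATH`, so the system searches that directory first when looking up an executable.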
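
The lookup order described above can be checked with a small sketch. It does not touch the real `PATH`; it creates two temporary directories (stand-ins for a system `bin` and for `/root/anaconda3/bin`), places a dummy executable named `demo_tool` in each, and asks `shutil.which` which one is found first. The names `demo_tool`, `make_fake_tool`, and `resolve` are illustrative only, and the sketch assumes Linux or macOS.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory: str, name: str = "demo_tool") -> str:
    """Create a tiny executable file so that shutil.which can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("#!/bin/sh\n")
    # Add the owner-execute bit; which() only returns executable files.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(name: str, search_path: str):
    """Return the first match for `name` along `search_path`, or None."""
    return shutil.which(name, path=search_path)


with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as conda_bin:
    system_tool = make_fake_tool(system_bin)
    conda_tool = make_fake_tool(conda_bin)

    # Before the change: only the "system" directory is searched.
    original_path = system_bin
    # After the change: the "conda" directory is prepended, as in
    # export PATH=/root/anaconda3/bin:$PATH
    prepended_path = conda_bin + os.pathsep + original_path

    found_before = resolve("demo_tool", original_path)
    found_after = resolve("demo_tool", prepended_path)

    print("before prepending:", found_before)
    print("after prepending: ", found_after)
```

With the directory prepended, the copy in the stand-in conda directory is resolved first, which is why `python` and `conda` from Anaconda take precedence once `/etc/profile` has been reloaded.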

7. Verify the installation:

```bash
conda info --envs
```

### Python Environment

```bash
# PyTorch: the two commands below pin different versions; install one of them
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install jieba
pip install opencv-python
pip install zhconv

pip install modelscope
pip install pandas
pip install git+https://gitee.com/linrongxin2024/hello accelerate
pip install scikit-learn
pip install qwen-vl-utils[decord]==0.0.8

# git clone https://gitee.com/linrongxin2024/Qwen2.5.git

# Download the model files
modelscope download --model 'Qwen/Qwen2.5-VL-7B-Instruct' --local_dir '/share/lin/models/Qwen2.5-VL-7B'
modelscope download --model Qwen/Qwen-VL-Chat --local_dir /root/autodl-tmp/models/Qwen-VL-Chat
modelscope download --model 'Qwen/Qwen2.5-VL-7B-Instruct' --local_dir '/root/autodl-tmp/models/Qwen2.5-VL-7B'
```
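
After running the commands above, it can help to confirm what actually got installed. The following is a minimal sketch that uses the standard-library `importlib.metadata` to report versions; the `PACKAGES` list is copied from the pip commands in this README, and the helper name `installed_versions` is illustrative.

```python
from importlib import metadata

# Distribution names taken from the pip commands in this README.
PACKAGES = [
    "torch",
    "torchvision",
    "torchaudio",
    "jieba",
    "opencv-python",
    "zhconv",
    "modelscope",
    "pandas",
    "accelerate",
    "scikit-learn",
    "qwen-vl-utils",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

Comparing the reported `torch`, `torchvision`, and `torchaudio` versions against the pinned ones shows which of the two PyTorch commands took effect.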

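## Getting Started

This work is evaluated on three datasets: Weibo, Twitter, and PHEME. The code for each dataset is organized into three parts: description-set generation, model structures, and the generated description sets.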

### PHEME

Based on the PHEME dataset.

#### Prompt

```bash
# Generate prompts for images and texts
python ./PHEME_code/prompt/prompt_text.py
python ./PHEME_code/prompt/prompt_image.py

# Run inference through Qwen2.5-VL-Instruct to generate detailed descriptions of the images and texts
python ./PHEME_code/prompt/VL_PHEME_text.py
python ./PHEME_code/prompt/VL_PHEME_image.py
```
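
The same four scripts are listed under each dataset directory in this README, so the prompt stage can be driven from one helper. The sketch below only builds and, optionally, runs the commands shown in this document; the names `DATASET_DIRS`, `PROMPT_SCRIPTS`, `prompt_commands`, and `run_prompt_stage` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Directory names as they appear in this README.
DATASET_DIRS = {
    "PHEME": "PHEME_code",
    "Weibo": "WEIBO_code",
    "Twitter": "Twitter_code",
}

# Script order as listed: prompt generation first, then VL inference.
PROMPT_SCRIPTS = [
    "prompt_text.py",
    "prompt_image.py",
    "VL_PHEME_text.py",
    "VL_PHEME_image.py",
]


def prompt_commands(dataset, python=sys.executable):
    """Return the four prompt-stage commands for one dataset, in order."""
    code_dir = DATASET_DIRS[dataset]
    return [[python, f"./{code_dir}/prompt/{script}"] for script in PROMPT_SCRIPTS]


def run_prompt_stage(dataset):
    """Run the commands one after another, stopping at the first failure."""
    for command in prompt_commands(dataset):
        subprocess.run(command, check=True)


# Print the commands instead of running them; call run_prompt_stage("PHEME")
# from the repository root to execute them.
for command in prompt_commands("PHEME", python="python"):
    print(" ".join(command))
```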

#### generate

The generated description sets are stored under the `generate` directory: `desimage` holds the image descriptions and `textinfo` holds the text descriptions.
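
This README does not specify the file format inside `desimage` and `textinfo`. As a rough sketch only, assuming one file per news item and matching file names (without extension) in both folders, the two description sets could be paired like this. The function name `pair_descriptions` and the one-file-per-sample layout are assumptions, not the repository's documented behavior.

```python
from pathlib import Path


def pair_descriptions(generate_dir):
    """Pair image and text description files that share the same file stem.

    Assumes (hypothetically) that generate/desimage and generate/textinfo
    each contain one file per sample, named after the sample id.
    """
    root = Path(generate_dir)
    image_files = {p.stem: p for p in (root / "desimage").iterdir() if p.is_file()}
    text_files = {p.stem: p for p in (root / "textinfo").iterdir() if p.is_file()}

    # Keep only the samples that have both kinds of description.
    shared_ids = sorted(image_files.keys() & text_files.keys())
    return [(sample_id, image_files[sample_id], text_files[sample_id])
            for sample_id in shared_ids]
```

Calling `pair_descriptions("generate")` returns a list of `(sample_id, image_description_path, text_description_path)` tuples; samples missing from either folder are skipped.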

#### models

##### Full version - model structure base

```bash
# train and test
python Instruct-base.py
```

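##### Model structure Ex

```bash
# train and test
python Instruct-Ex.py
```

"Ex IT" swaps the information sources that are fused with the description sets: text features are fused with the image descriptions, and image features are fused with the text descriptions.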

##### Model structure Image

```bash
# train and test
python Instruct-Image.py
```

"w/o T" removes the description text that corresponds to the news text.

##### Model structure text

```bash
# train and test
python Instruct-Text.py
```

"w/o I" removes the description text of the news image.

##### Model structure Original

```bash
# train and test
python Instruct-Original.py
```

"w/o all" removes all description sets but keeps the feature fusion module, which fuses the text and image features.

##### Model structure nothing

```bash
# train and test
python Instruct-nothing.py
```

"w/o" removes all description sets and the feature fusion module.

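The model variants listed above can be summarized in a small lookup table. This is an illustrative sketch, not code from the repository: the `Variant` fields, the `VARIANTS` dictionary, and the "base" label are hypothetical names, and the pairing shown for the base model (text features with text descriptions, image features with image descriptions) is inferred from the description of "Ex IT" as the swapped configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    script: str                       # training script named in this README
    text_fused_with: Optional[str]    # description set fused with text features
    image_fused_with: Optional[str]   # description set fused with image features
    feature_fusion: bool              # whether the feature fusion module is kept


VARIANTS = {
    # Assumed default pairing; inferred, not stated explicitly in the README.
    "base": Variant("Instruct-base.py", "textinfo", "desimage", True),
    # Swapped sources: text features with image descriptions and vice versa.
    "Ex IT": Variant("Instruct-Ex.py", "desimage", "textinfo", True),
    # Text description removed.
    "w/o T": Variant("Instruct-Image.py", None, "desimage", True),
    # Image description removed.
    "w/o I": Variant("Instruct-Text.py", "textinfo", None, True),
    # No description sets, feature fusion kept.
    "w/o all": Variant("Instruct-Original.py", None, None, True),
    # No description sets and no feature fusion.
    "w/o": Variant("Instruct-nothing.py", None, None, False),
}


def description_sets_used(label):
    """Return the set of description folders a variant relies on."""
    variant = VARIANTS[label]
    return {name for name in (variant.text_fused_with, variant.image_fused_with) if name}


for label, variant in VARIANTS.items():
    used = sorted(description_sets_used(label)) or ["none"]
    print(f"{label:8s} {variant.script:22s} descriptions: {', '.join(used)}")
```

The same six scripts appear under each dataset, so the table applies to PHEME, Weibo, and Twitter alike.
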
### Weibo

Based on the Weibo dataset.

#### Prompt

```bash
# Generate prompts for images and texts
python ./WEIBO_code/prompt/prompt_text.py
python ./WEIBO_code/prompt/prompt_image.py

# Run inference through Qwen2.5-VL-Instruct to generate detailed descriptions of the images and texts
python ./WEIBO_code/prompt/VL_PHEME_text.py
python ./WEIBO_code/prompt/VL_PHEME_image.py
```

#### generate

The generated description sets are stored under the `generate` directory: `desimage` holds the image descriptions and `textinfo` holds the text descriptions.

#### models

##### Full version - model structure base

```bash
# train and test
python Instruct-base.py
```

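##### Model structure Ex

```bash
# train and test
python Instruct-Ex.py
```

"Ex IT" swaps the information sources that are fused with the description sets: text features are fused with the image descriptions, and image features are fused with the text descriptions.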

##### Model structure Image

```bash
# train and test
python Instruct-Image.py
```

"w/o T" removes the description text that corresponds to the news text.

##### Model structure text

```bash
# train and test
python Instruct-Text.py
```

"w/o I" removes the description text of the news image.

##### Model structure Original

```bash
# train and test
python Instruct-Original.py
```

"w/o all" removes all description sets but keeps the feature fusion module, which fuses the text and image features.

##### Model structure nothing

```bash
# train and test
python Instruct-nothing.py
```

"w/o" removes all description sets and the feature fusion module.

### Twitter

Based on the Twitter dataset.

#### Prompt

```bash
# Generate prompts for images and texts
python ./Twitter_code/prompt/prompt_text.py
python ./Twitter_code/prompt/prompt_image.py

# Run inference through Qwen2.5-VL-Instruct to generate detailed descriptions of the images and texts
python ./Twitter_code/prompt/VL_PHEME_text.py
python ./Twitter_code/prompt/VL_PHEME_image.py
```

#### generate

The generated description sets are stored under the `generate` directory: `desimage` holds the image descriptions and `textinfo` holds the text descriptions.

#### models

##### Full version - model structure base

```bash
# train and test
python Instruct-base.py
```

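##### Model structure Ex

```bash
# train and test
python Instruct-Ex.py
```

"Ex IT" swaps the information sources that are fused with the description sets: text features are fused with the image descriptions, and image features are fused with the text descriptions.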

##### Model structure Image

```bash
# train and test
python Instruct-Image.py
```

"w/o T" removes the description text that corresponds to the news text.

##### Model structure text

```bash
# train and test
python Instruct-Text.py
```

"w/o I" removes the description text of the news image.

##### Model structure Original

```bash
# train and test
python Instruct-Original.py
```

"w/o all" removes all description sets but keeps the feature fusion module, which fuses the text and image features.

##### Model structure nothing

```bash
# train and test
python Instruct-nothing.py
```

"w/o" removes all description sets and the feature fusion module.

### Main Figure Example

`example.ipynb` selects cases from currently trending news and produces the corresponding image descriptions and text descriptions.

## Citation

If you find our code helpful, please cite the following paper:

```python

```