GitHub - tshu-w/EMBer: Code and data for the paper "Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction"

Bridging the Gap between Reality and Ideality of Entity Matching:
A Revisiting and Benchmark Re-Construction

Description

Code and data for the paper:

Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction

Data

Details of the released data can be found in the README of the data.

How to run

First, install the dependencies:

# clone project
git clone https://github.com/tshu-w/EMBer
cd EMBer

# [SUGGESTED] use conda environment
conda env create -n ember -f environment.yaml
conda activate ember

# [ALTERNATIVE] install requirements directly
pip install -r requirements.txt

Next, to obtain the main results of the paper:

bash scripts/download_images.sh

python scripts/run_ali.py --gpus 0 1 2 3
python scripts/test_ali.py --gpus 0 1 2 3
python scripts/run_dm_ali.py --gpus 0 1 2 3
python scripts/test_dm_ali.py --gpus 0 1 2 3

python scripts/print_results results/test -k test/f1 test/prc test/rec
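
The keys passed to `-k` above appear, judging by their names, to be F1, precision, and recall on the test split (an assumption; the README does not define them). As a reminder of how the three relate for binary match / non-match predictions, here is a minimal sketch. It is independent of the repository's code, and `prf1` is a hypothetical helper name.

```python
def prf1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = match, 0 = non-match)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against empty denominators (no predicted or no actual matches).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels for illustration only; these are not results from the paper.
precision, recall, f1 = prf1([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
print(f"prc={precision:.3f} rec={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is commonly reported alongside both for entity matching, where non-matches usually far outnumber matches.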

You can also run experiments with the run script.

# fit with the TextMatcher config
./run fit --config configs/ali_tm.yaml
# or with specific command-line arguments
./run fit --model TextMatcher --data AliDataModule --data.batch_size 32 --trainer.gpus 0,

# evaluate a checkpoint (replace ckpt_path with the path to your checkpoint file)
./run test --config configs/ali_tm.yaml --ckpt_path ckpt_path

# get the script help
./run --help
./run fit --help
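
The dotted flags used above, such as `--data.batch_size` and `--trainer.gpus`, suggest that command-line options address nested sections of the same configuration a YAML file would supply. The sketch below illustrates only that idea, assuming command-line values take precedence over file values. `apply_overrides` is a hypothetical helper, the base values are made up, and this is not the parser that `./run` actually uses.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted-key overrides applied."""
    result = copy.deepcopy(config)  # leave the caller's config untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Hypothetical base configuration (illustrative, not taken from configs/ali_tm.yaml).
base = {"data": {"batch_size": 16}, "trainer": {"max_epochs": 10}}

# Overrides mirroring the dotted flags shown in the commands above.
merged = apply_overrides(base, {"data.batch_size": 32, "trainer.gpus": "0,"})
print(merged)
```

Working on a deep copy keeps the base configuration reusable across runs, so the same file-level defaults can be combined with different overrides.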

Citation

@inproceedings{ijcai2022p552,
  title     = {Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction},
  author    = {Wang, Tianshu and Lin, Hongyu and Fu, Cheng and Han, Xianpei and Sun, Le and Xiong, Feiyu and Chen, Hui and Lu, Minlong and Zhu, Xiuwen},
  booktitle = {Proceedings of the Thirty-First International Joint Conference on
               Artificial Intelligence, {IJCAI-22}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Luc De Raedt},
  pages     = {3978--3984},
  year      = {2022},
  month     = {7},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2022/552},
  url       = {https://doi.org/10.24963/ijcai.2022/552},
}

