# README

This repository contains code and data for "The Impact of Peer Review on the Contribution Potential of Scientific Papers."

# Library requirements and versions

## Python

- python: 3.6.10
- numpy: 1.15.4
- pandas: 1.0.4
- statsmodels: 0.11.1
- nltk: 3.5
- textblob: 0.15.3
- tqdm: 4.46.1

## R

- R: 3.5.3
- survival: 2.44.1.1
- stargazer: 5.2.2

# Code for preprocessing and analysis

## Regression analysis

If you only want to run the analysis code, the preprocessed data is already provided in `data_for_analysis`:

### For Table 3

Run `analysis_code/sentiment_citaion_mixed_effect.ipynb`.

### For Table 2, A2-1, A2-2

Run `analysis_code/revise_decision_logit.R`.

## LDA (Figure 2 and A4-1)

- Download `lda_text_diff.txt` from [Google Drive](https://drive.google.com/file/d/1Aazyqpn1jbq77U9QzM-TE9HaCC2mzFtk/view?usp=sharing), unzip it, and put it in `LDA/`.
- Run `diff_topic_analysis.ipynb`.

# Preprocessing

To re-run all analyses, including preprocessing:

Download the .zip file from [Google Drive](https://drive.google.com/file/d/1Aazyqpn1jbq77U9QzM-TE9HaCC2mzFtk/view?usp=sharing) and unzip it. Then:

- Put `peerj_review_data.json` in `data/raw_data`.
- Put `diff_content.json` in `LDA/`.

Run the following scripts (collected in `pre.sh`):

- `prepro/prepro_review_data.py`
- `prepro/individual_sentiment_calculation.py`
- `prepro/reviwers_authors_sentiment.py`
- `prepro/individual_sentiment_calculation_vader.py`
- `prepro/reviwers_authors_sentiment_vader.py`
- `prepro/round_count.py`
- `prepro/revision_decision_logit_data.py`
- `LDA/lda_prepro.py`

After preprocessing finishes, run the analysis code.

## Regression analysis

### For Table 3 and 4

Run `analysis_code/sentiment_citaion_mixed_effect.ipynb`.

### For Table 2, A2-1 and A2-2

Run `analysis_code/revise_decision_logit.R`.

### Appendix: VADER sentiment calculation and Altmetric score

Run `analysis_code/sentiment_citaion_mixed_effect.ipynb`.

## LDA (Figure 2 and A4-1)

Run `diff_topic_analysis.ipynb`.
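
## Running the preprocessing scripts from Python (sketch)

For readers who cannot run `pre.sh` directly (for example on a system without a POSIX shell), the preprocessing steps listed under Preprocessing could be driven from Python roughly as follows. This is a minimal sketch, not the contents of `pre.sh`: the script order is simply the order given in this README, and the names `PREPROCESSING_SCRIPTS`, `build_commands`, and `run_all` are hypothetical helpers introduced here. It also assumes that each script is meant to be launched from the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the list in this README; the real pre.sh may differ.
PREPROCESSING_SCRIPTS = [
    "prepro/prepro_review_data.py",
    "prepro/individual_sentiment_calculation.py",
    "prepro/reviwers_authors_sentiment.py",
    "prepro/individual_sentiment_calculation_vader.py",
    "prepro/reviwers_authors_sentiment_vader.py",
    "prepro/round_count.py",
    "prepro/revision_decision_logit_data.py",
    "LDA/lda_prepro.py",
]


def build_commands(repo_root=".", python=sys.executable):
    """Return one [interpreter, script_path] command per preprocessing script."""
    root = Path(repo_root).resolve()
    return [[python, str(root / script)] for script in PREPROCESSING_SCRIPTS]


def run_all(repo_root=".", dry_run=False):
    """Run the scripts in order, stopping at the first failure.

    Assumption: the scripts expect the repository root as working directory.
    """
    root = Path(repo_root).resolve()
    for cmd in build_commands(root):
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(cmd, check=True, cwd=str(root))
```

Calling `run_all(dry_run=True)` from the repository root prints the commands without executing them, which is a quick way to confirm the paths before starting a full run.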