An open automation system for predatory journal detection

Li Xian Chen, Shih Wen Su, Chia Hung Liao, Kai Sin Wong, Shyan Ming Yuan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Scopus citations


The growing number of online open-access journals promotes academic exchange, but the prevalence of predatory journals is undermining the scholarly reporting process. Tools designed to distinguish legitimate from predatory academic journals and publisher websites commonly involve three steps: data collection, feature extraction, and model prediction. We include all three in our proposed academic journal predatory checking (AJPC) system, which is based on machine learning methods. The AJPC data collection process extracts information from 833 blacklisted and 1213 whitelisted websites, which is then used to identify words and phrases that might indicate the presence of predatory journals. Feature extraction identifies words and terms that help detect predatory websites, and the prediction stage uses eight classification algorithms to distinguish between potentially predatory and legitimate journals. We found that using diff scores (a measure of differences in specific word frequencies between journals) to enhance the classification efficiency of the bag-of-words model and the TF-IDF algorithm can assist in identifying predatory journal feature words. Results from performance tests suggest that our system works as well as or better than the tools currently used to identify suspect publishers and publications. The open system provides reference results rather than definitive judgments, and it accepts user inquiries and feedback, which are used to update the system and optimize performance.
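The abstract does not give a formula for the diff score, so the following is only a minimal sketch of one plausible reading: for each word, take the share of predatory-site texts that contain it minus the share of legitimate-site texts that contain it. The function names (`tokenize`, `document_frequency`, `diff_scores`, `top_feature_words`) and the toy snippets are hypothetical and are not taken from the AJPC system.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(docs):
    """Return, for each word, the fraction of docs containing it at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    return {word: n / len(docs) for word, n in counts.items()}


def diff_scores(predatory_docs, legitimate_docs):
    """Assumed diff score: predatory document frequency minus legitimate one.

    Positive values mark words that are more common on predatory sites,
    negative values mark words that are more common on legitimate sites.
    """
    pred = document_frequency(predatory_docs)
    legit = document_frequency(legitimate_docs)
    vocab = set(pred) | set(legit)
    return {w: pred.get(w, 0.0) - legit.get(w, 0.0) for w in vocab}


def top_feature_words(scores, k):
    """Pick the k words with the largest absolute diff score (ties by word)."""
    return sorted(scores, key=lambda w: (-abs(scores[w]), w))[:k]


# Toy, made-up website snippets used purely for illustration.
predatory = [
    "rapid publication guaranteed low fee",
    "submit today rapid review guaranteed",
]
legitimate = [
    "peer review policy and editorial board",
    "editorial board and review ethics policy",
]

scores = diff_scores(predatory, legitimate)
features = top_feature_words(scores, 6)
```

Words selected this way could then serve as the vocabulary for a bag-of-words or TF-IDF representation that is passed to a classifier; how AJPC actually combines these steps is not specified in the abstract.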

Original language: English
Article number: 2976
Journal: Scientific Reports
Issue number: 1
State: Published - Dec 2023

