Scientific Papers Reads Prediction

TEAM

Maryam Omar
Sara Altamimi

OVERVIEW

In this project, we build a model to predict the number of reads of scientific articles from different social science departments on ResearchGate.

STEPS

Topic modeling: first, we extract topics from the paper abstracts using NMF and LDA, then compare the results.
Compute text statistics, such as the number of words and sentences in the titles and abstracts, and add them as new features.
Perform feature engineering.
Create a regression model to predict the number of reads.
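
The text-statistics step could be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: the helper names (`text_stats`, `add_text_features`) and the output feature names are hypothetical, and sentence splitting uses a simple punctuation heuristic rather than a proper tokenizer.

```python
import re


def text_stats(text: str) -> dict:
    """Return rough word and sentence counts for a title or abstract."""
    # Words: runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Sentences: split on runs of '.', '!' or '?' (a heuristic, so
    # abbreviations such as "e.g." will be over-counted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {"n_words": len(words), "n_sentences": len(sentences)}


def add_text_features(row: dict) -> dict:
    """Copy a row and add count features for its title and abstract.

    The feature names (e.g. `title_n_words`) are assumptions made for
    this sketch; they are not taken from the project's dataset.
    """
    features = dict(row)
    for field in ("title", "abstract"):
        stats = text_stats(row.get(field) or "")
        features[f"{field}_n_words"] = stats["n_words"]
        features[f"{field}_n_sentences"] = stats["n_sentences"]
    return features


if __name__ == "__main__":
    example = {
        "title": "Social media and trust",
        "abstract": "We study trust. Results vary!",
    }
    print(add_text_features(example))
```

Keeping the counts as separate numeric columns lets them be fed to the regression model alongside the topic-model outputs.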

DATA

The data is scraped from researchgate.net and includes information about all publications from different social science departments. We chose social science because its topics do not require domain expertise to understand, and because the field has a solid number of publications covering diverse topics.

source:

https://www.researchgate.net/

Description

The scraped dataset consists of 1,899 rows and 12 initial columns.

Column	Description	Type
title	article's title	string
auther	author name	string
abstract	full text of the abstract	string
category	article, literature review, conference paper, etc.	string
date_published	date the article was published	date
date_added	date the article was uploaded to ResearchGate	date
figuers	1 if there are figures, 0 if not	int
full_text?	availability of the full text (detected using "Download" or "Request full-text" as keywords)	string
citation	number of times the paper was cited	int
reads	number of views (target variable)	int
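
As an illustration of how the `full_text?` column might be turned into a numeric feature, here is a small sketch. It assumes that a label containing "Download" means the full text is available and that "Request full-text" means it is not; that mapping and the function name `has_full_text` are assumptions, not something taken from the notebooks.

```python
from typing import Optional


def has_full_text(label: Optional[str]) -> Optional[int]:
    """Map a scraped `full_text?` label to 1, 0, or None.

    Assumed mapping (not confirmed by the source):
      - contains "request"  -> 0 (full text must be requested)
      - contains "download" -> 1 (full text is available)
      - anything else       -> None (unknown)
    """
    text = (label or "").strip().lower()
    # Check "request" first so "Request full-text" is never read as available.
    if "request" in text:
        return 0
    if "download" in text:
        return 1
    return None


if __name__ == "__main__":
    for raw in ("Download", "Request full-text", ""):
        print(repr(raw), "->", has_full_text(raw))
```

Returning `None` for unrecognized labels keeps missing values visible so they can be handled explicitly during feature engineering.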

We faced some limitations during the scraping process because ResearchGate only allows accounts with organizational emails and gives limited daily access.


MaryamOmar/research_popularity_prediction

Folders and files

Latest commit

History

Repository files navigation

Scientific Papers Reads Prediction

TEAM

OVERVIEW

STEPS

DATA

source:

Description

TOOLS

Data collection, Preprocessing & Modeling

Collaboration Tools

About

Topics

Resources

Stars

Watchers

Forks

Languages