Project for scraping amazon.com reviews, analyzing the data, and generating new reviews.

Jacktavitt/product_review_fun

Mining Amazon Product Reviews

Final project for csc558, Data Mining and Predictive Analytics II. I chose a broad set of unrelated products based on their dissimilarity to each other and on how many reviews they had. The data is then sent in two different directions, 'smart' and 'dumb'. In the 'smart' path the text is tokenized, and part-of-speech and word-count statistics are analyzed. In the 'dumb' path the text is tokenized, and single words or n-grams of words are analyzed by frequency. These results are then used to experiment with generating a positive or negative review. One technique uses Keras and TensorFlow to build an LSTM trained on the input text. The other technique generates a Markov chain based on the review text.
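
To make the 'dumb' path and the Markov chain idea concrete, here is a minimal sketch using only the standard library. It is not the code from this repository: the function names (`tokenize`, `ngram_counts`, `build_chain`, `generate`), the regex tokenizer standing in for NLTK, and the two sample reviews are all assumptions made for illustration.

```python
import random
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and pull out word tokens (a simple stand-in for an NLTK tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def build_chain(tokens, order=1):
    """Map each state (a tuple of `order` tokens) to the list of tokens seen right after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return chain


def generate(chain, length=10, seed=None):
    """Walk the chain from a random starting state, stopping early at a dead end."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    words = list(state)
    while len(words) < length:
        followers = chain.get(state)
        if not followers:
            break
        # Repeated followers appear multiple times in the list,
        # so frequent transitions are picked more often.
        words.append(rng.choice(followers))
        state = tuple(words[-len(state):])
    return " ".join(words)


# Hypothetical review snippets, invented for this example.
reviews = [
    "Works great, great value, and it arrived fast.",
    "It broke fast. Works great until it broke.",
]
# Flattening the reviews means the chain also learns the transition across the review boundary.
tokens = [tok for review in reviews for tok in tokenize(review)]
bigrams = ngram_counts(tokens, 2)
chain = build_chain(tokens, order=1)
sample = generate(chain, length=8, seed=0)
print(bigrams.most_common(3))
print(sample)
```

Training one chain on positive reviews and another on negative reviews would be one way to steer the sentiment of the generated text. A higher `order` tends to give more coherent output, but it copies more of the source text.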

Getting Started (still writing this)

You'll want Python 3.x (more things coming)

Prerequisites (still writing this)

NLTK library

import nltk
nltk.download('stopwords')
nltk.download('punkt')

Installing (still writing this)

A step by step series of examples that tell you how to get a development env running

Step 1

example

Step n-1

finished

End with an example of getting some data out of the system or using it for a little demo

Author

License

If it's helpful, let me know!

Acknowledgments

  • Billie Thompson README template - PurpleBooth
  • More
  • etc
