top of page

Viral Articles prediction model

Built a model on Python using a dataset provided by Mashable.com, a global, multi-platform media and entertainment company, which contained a summary of different attributes of their published articles. 

The model was built by adding the next best attribute that would increase the accuracy of the model until the next added attribute added no increase to the overall accuracy.

Because it can also be argued that the model is non-linear, (ie: an article under the category of entertainment is more likely to have more images in it, and both entertainment and number of images are both attributes being considered), then it was also appropriate to build a decision tree to see which attributes should be added to the logistic regression model.

bottom of page