I am planning to use the DL4J Doc2Vec implementation for sentiment analysis.
However, I don't want to start with an empty network; the starting point should be a pre-trained one. The initial training will be done with the Sentiment140 dataset, which can be found at https://www.kaggle.com/kazanova/sentiment140. It contains 1,600,000 tweets extracted using the Twitter API.
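Before any training, each row of the dataset has to be turned into a label plus the tweet text. The following is only a minimal sketch of that step, assuming the row layout given in the Kaggle dataset description: six quoted, comma-separated fields (polarity, id, date, query, user, text), with polarity 0 = negative, 2 = neutral and 4 = positive. The class and method names (`Sentiment140Parser`, `LabelledTweet`, `parse`) are hypothetical and not part of DL4J, and the sample row in `main` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class Sentiment140Parser {

    /** One labelled tweet: the polarity label and the raw tweet text. */
    public static final class LabelledTweet {
        public final String label;
        public final String text;

        public LabelledTweet(String label, String text) {
            this.label = label;
            this.text = text;
        }
    }

    /** Splits one CSV line into fields, honouring quoted fields and "" escapes. */
    public static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote = literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas inside quotes are kept
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Maps one row to a label and text (assumed layout: polarity,id,date,query,user,text). */
    public static LabelledTweet parse(String line) {
        List<String> f = splitCsvLine(line);
        if (f.size() != 6) {
            throw new IllegalArgumentException("Expected 6 fields, got " + f.size());
        }
        String label;
        switch (f.get(0)) {
            case "0": label = "negative"; break;
            case "2": label = "neutral";  break;
            case "4": label = "positive"; break;
            default: throw new IllegalArgumentException("Unknown polarity: " + f.get(0));
        }
        return new LabelledTweet(label, f.get(5));
    }

    public static void main(String[] args) {
        // Made-up sample row in the assumed Sentiment140 layout.
        String sample = "\"0\",\"1\",\"Mon Apr 06 22:19:45 PDT 2009\","
                + "\"NO_QUERY\",\"some_user\",\"is upset, can't update\"";
        LabelledTweet t = parse(sample);
        System.out.println(t.label + " -> " + t.text);
    }
}
```

The resulting label and text pairs could then be handed to whatever labelled-document iterator the training code uses. Reading the file line by line like this assumes that no tweet contains a raw line break.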
In this Gist I describe how to train and save a DL4J Doc2Vec model. The serialized model is available on Kaggle.