We are in phase three of our machine learning project now.
I had also spent a few dollars on Amazon Web Services computing time to discover this. Experimentation can get a little expensive. (Hint: if you have a budget, do not use "Autopilot" mode.)
We had tried a few approaches to analyzing our collection of 11,000 headlines from 5,500 headline tests – half winners, half losers. First we had taken the whole corpus in comma-separated value form and tried a "Hail Mary" (or, as I see it in retrospect, a "Leeroy Jenkins") with the Autopilot tool in AWS SageMaker Studio. This came back with a validation accuracy of 53 percent. That turns out not to be so bad, in retrospect, because when I used a model built specifically for natural language processing – AWS BlazingText – the result was 49 percent accuracy, which is worse than a coin toss. (If a lot of this sounds like nonsense, by the way, I recommend watching Part 2, where I go over these tools in much more detail.)
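For readers experimenting along at home, it may help to see what feeding a CSV of headlines to BlazingText actually involves. BlazingText's supervised mode does not take raw CSV; it expects fastText-style input, one example per line, starting with a `__label__` prefix, with the text lowercased and tokenized. The sketch below shows one way to do that conversion. The column names (`headline`, `won`) are hypothetical – our actual schema isn't shown here – and the tokenizer is deliberately crude.

```python
import csv

def to_blazingtext_format(csv_path, out_path):
    """Convert a headline-test CSV into the fastText-style format that
    BlazingText's supervised mode expects: one example per line,
    beginning with a __label__ prefix, lowercased and tokenized.

    Assumes hypothetical columns "headline" (text) and "won" ("1"/"0").
    """
    with open(csv_path, newline="") as src, open(out_path, "w") as dst:
        for row in csv.DictReader(src):
            label = "__label__winner" if row["won"] == "1" else "__label__loser"
            # Crude tokenization: lowercase, then pad punctuation with
            # spaces so "win!" becomes the two tokens "win" and "!".
            text = row["headline"].lower()
            for ch in ",.!?;:\"'":
                text = text.replace(ch, f" {ch} ")
            dst.write(f"{label} {' '.join(text.split())}\n")
```

The resulting file can be uploaded to S3 and pointed at a BlazingText training job in supervised mode; how much the tokenization choices matter is one of the things worth experimenting with.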
It was both a little comforting and a little discouraging that AWS technical evangelist Julien Simon had a similar lack of luck with our data. Trying an alternative model on our dataset in binary classification mode yielded an accuracy of only 53 to 54 percent. So now it was time to find out what was going on, and whether we could fix it with a few adjustments to the learning model. Otherwise, it might be time to take a completely different approach.