
Our overall experiment with AI continues: Did we break the machine?




Aurich Lawson | Getty Images

We are now in phase three of our machine learning project; that is, we have moved past denial and anger, and we are slipping into bargaining and depression. I have been tasked with using Ars Technica's collection of data from five years of headline tests, which pit two candidate headlines against each other in an "A/B" test and let readers decide which one to use for an article. The goal is to build a machine learning algorithm that can predict the success of a given headline. As of my last check-in, that was going ... not according to plan.
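For context, here is a minimal sketch of how paired A/B test results like these could be flattened into labeled training examples. The file and column names ("headline_tests.csv", "headline_a", "headline_b", "winner") are hypothetical; the article only says that each test pits two headlines against each other and records which one won.

```python
# Hypothetical sketch: turn one row per A/B test into two labeled rows,
# one positive (the winning headline) and one negative (the losing one).
import pandas as pd

tests = pd.read_csv("headline_tests.csv")  # one row per A/B test (hypothetical file)

rows = []
for _, t in tests.iterrows():
    rows.append({"headline": t["headline_a"], "label": int(t["winner"] == "a")})
    rows.append({"headline": t["headline_b"], "label": int(t["winner"] == "b")})

corpus = pd.DataFrame(rows)
corpus.to_csv("headline_corpus.csv", index=False)
print(corpus["label"].value_counts())  # a perfectly balanced 50/50 split
```

Flattening the tests this way is what produces the balanced winner/loser corpus described below.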

I had also spent a few dollars' worth of Amazon Web Services computing time to make that discovery. Experimentation can be a little expensive. (Hint: If you're on a budget, don't use "AutoPilot" mode.)

We had tried a few approaches to analyzing our collection of 11,000 headlines from 5,500 headline tests: half winners, half losers. First, we took the whole corpus in comma-separated-value form and tried a "Hail Mary" (or, as I see it in retrospect, a "Leeroy Jenkins") with the Autopilot tool in AWS' SageMaker Studio. That came back with a validation accuracy of 53 percent. This turns out not to be so bad, in retrospect, because when I used a model built specifically for natural language processing, AWS' BlazingText, the result was 49 percent accuracy, which is even worse than a coin toss. (If a lot of this sounds like nonsense, by the way, I recommend revisiting Part 2, where I go over these tools in much more detail.)
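For readers following along, BlazingText's supervised (text classification) mode expects a plain-text file with one example per line, the label prefixed by "__label__", and the text already tokenized. The sketch below shows one way the hypothetical corpus from earlier might be converted into that format; the file names and the simple whitespace tokenization are assumptions, not the article's actual preprocessing.

```python
# Rough sketch: write train/validation files in BlazingText's supervised format,
# i.e. "__label__<label> token token token" on each line.
import pandas as pd
from sklearn.model_selection import train_test_split

corpus = pd.read_csv("headline_corpus.csv")  # hypothetical file from the earlier sketch

def to_blazingtext(df, path):
    with open(path, "w") as f:
        for _, row in df.iterrows():
            # Lowercase + whitespace split; real preprocessing would likely
            # use a proper tokenizer.
            tokens = row["headline"].lower().split()
            f.write(f"__label__{row['label']} {' '.join(tokens)}\n")

train, validation = train_test_split(
    corpus, test_size=0.2, random_state=42, stratify=corpus["label"]
)
to_blazingtext(train, "headlines.train")
to_blazingtext(validation, "headlines.validation")
```

The two files would then be uploaded to S3 and supplied to the BlazingText container as its "train" and "validation" channels.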

It was both a little comforting and a little discouraging that AWS technical evangelist Julien Simon had a similar lack of luck with our data. Trying an alternative model with our dataset in binary classification mode yielded an accuracy rate of just 53 to 54 percent. So now it was time to figure out what was going on and whether we could fix it with a few adjustments to the learning model. Otherwise, it might be time to take a completely different approach.
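One quick way to check whether the headlines carry any learnable signal at all, before blaming the model, is a simple local baseline. The sketch below is not part of the article's AWS pipeline; it just runs a TF-IDF plus logistic regression classifier over the hypothetical corpus to see how far above a coin flip a basic model can get.

```python
# Sanity-check baseline: on a balanced dataset, accuracy near 0.50 means the
# model is doing little better than guessing, which matches the ~53 percent
# the SageMaker runs produced.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

corpus = pd.read_csv("headline_corpus.csv")  # hypothetical file from the earlier sketch

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)

scores = cross_val_score(
    baseline, corpus["headline"], corpus["label"], cv=5, scoring="accuracy"
)
print(f"Cross-validated accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```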

