Every day, small bits of logic constructed by very specific pieces of artificial intelligence technology make decisions that affect how you experience the world. It might be the ads served to you on social media or shopping sites, the facial recognition that unlocks your phone, or the directions you take to get where you're going. These discrete, invisible decisions are made largely by algorithms created by machine learning (ML), a segment of artificial intelligence technology that is trained to identify relationships between sets of data and outcomes. We've heard for years in movies and television that computers rule the world, but we've finally reached the point where machines make real autonomous decisions about stuff. Welcome to the future, I guess.
In my days as an Ars employee, I wrote quite a bit about artificial intelligence and machine learning. I talked to computer scientists who built predictive analytics systems based on terabytes of telemetry from complex systems, and I chatted with developers trying to build systems that could defend networks against attacks.
Many of the problems ML can be applied to are tasks whose conditions are intuitively obvious to humans, because we're trained to notice them through observation: which cat is fluffier, or what time of day traffic is most congested. Other problems suited to ML could be solved by humans given enough raw data, if humans had perfect memory, perfect eyesight, and an innate mastery of statistical modeling, that is.
But machines can do these tasks much faster because they don't have human limitations. And ML lets them do the tasks without people having to program the specific math involved. Instead, an ML system can learn (or at least "learn") from the data provided to it, building a problem-solving model itself.
However, that bootstrapping strength can also be a weakness. Understanding how an ML system arrives at its decisions is usually impossible once the algorithm is built (despite ongoing work to create explainable ML). And the quality of the results depends heavily on the quality and quantity of the data: ML can only answer questions that are discernible from the data itself. Bad data or insufficient data yields inaccurate models and bad machine learning.
Despite my previous exposure, I had never done any actual construction of machine learning systems. I'm a jack of all tech trades, and while I'm competent at basic data analysis and running all sorts of database queries, I don't consider myself a computer scientist or an ML programmer. My past Python experience is more about hacking interfaces than building them. And most of my coding and analysis skills have lately been turned toward using ML tools for very specific purposes related to information security research.
My only real superpower is not being afraid to try and fail. And with that, readers, I'm here to flex that superpower.
The task at hand
Here's a task some Ars writers are exceptionally good at: writing a solid headline. (Beth Mole, please report to collect your award.)
And headline writing is hard! It's a task with many constraints: length is the biggest (Ars headlines are limited to 70 characters), but hardly the only one. It's a challenge to cram into a small space enough information to accurately and sufficiently tease a story while also including all the things you're supposed to put in a headline (the traditional "who, what, when, where, why, and how" collection of facts). Some of those elements are dynamic: a "who" or a "what" with a particularly long name that eats up the character count can really throw a wrench into things.
In addition, we know from experience that Ars readers dislike clickbait and will fill the comments section with ridicule when they think they've spotted it. We also know that there are certain things people will reliably click on. And we know that, regardless of topic, some headlines get more people to click than others. (Is that clickbait? There's a philosophical argument to be had there, but the key thing separating "a headline everyone wants to click on" from "clickbait" is the headline's honesty: does the story under the headline fully deliver on the headline's promise?)
Regardless, we know that some headlines are more effective than others because we do A/B testing of headlines. Every Ars article starts out with two candidate headlines assigned to it, and the site presents both options on the homepage for a short period of time to see which one draws more traffic.
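In essence, an A/B headline test resolves by comparing click-through rates over the test window. Here is a minimal sketch of that logic; the field names and numbers are invented for illustration and are not Ars' actual schema or data:

```python
# Sketch of resolving an A/B headline test: the variant with the
# higher click-through rate (CTR) during the test window "wins."
# Field names and figures here are hypothetical, not real Ars data.

def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate: fraction of impressions that led to a click."""
    return clicks / impressions if impressions else 0.0

def pick_winner(test: dict) -> str:
    """Return the headline variant with the higher CTR."""
    rate_a = ctr(test["a_clicks"], test["a_impressions"])
    rate_b = ctr(test["b_clicks"], test["b_impressions"])
    return test["headline_a"] if rate_a >= rate_b else test["headline_b"]

test = {
    "headline_a": "Variant A", "a_clicks": 120, "a_impressions": 4000,
    "headline_b": "Variant B", "b_clicks": 95, "b_impressions": 4100,
}
print(pick_winner(test))  # Variant A (3.0% vs. roughly 2.3% CTR)
```

A production system would also weigh statistical significance, not just the raw rates, but the core comparison is this simple.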
There have been a few studies by computer scientists with far more experience in computer modeling and machine learning that have looked at what distinguishes "clickbait" headlines (those designed strictly to get large numbers of people to click through to an article) from "good" headlines (those that actually summarize the articles behind them effectively and don't make you write long complaints about the headlines on Twitter or in the comments). But those studies focused on understanding the content of the headlines rather than how many actual clicks they got.
To get a picture of what readers seem to like in a headline (and to try to understand how I might write better headlines for the Ars audience), I grabbed a set of 500 of the most-clicked Ars headlines from the past five years and did some natural language processing on them. After removing "stop words" (the most common words in English, which usually aren't relevant to a headline's topic), I generated a word cloud to see which topics attract the most attention.
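The preprocessing described above boils down to filtering out stop words and counting what's left (a word cloud is just a visualization of those counts). Here's a minimal sketch with a tiny hand-rolled stop-word list and invented example headlines; a real pipeline would more likely use NLTK's stop-word corpus and the `wordcloud` package:

```python
# Minimal sketch: strip common English "stop words" from headlines,
# then count the remaining words. The stop-word list and headlines
# below are illustrative stand-ins, not the actual data or tooling.
import re
from collections import Counter

STOP_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on",
    "for", "with", "is", "are", "was", "were", "it", "its", "as", "at",
}

def word_frequencies(headlines: list[str]) -> Counter:
    counts = Counter()
    for headline in headlines:
        words = re.findall(r"[a-z']+", headline.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

headlines = [
    "Trump administration sued over net neutrality rollback",
    "Millions of devices vulnerable to new Wi-Fi attack",
]
print(word_frequencies(headlines).most_common(5))
```

Feeding the resulting frequency table to a word cloud library then sizes each word by its count.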
Here it is: the shape of Ars headlines.
There’s a lot of Trump in there; the past few years have included a lot of tech news involving the administration, so that’s probably inevitable. But these are just the words from the winning headlines. I wanted a sense of what the difference between winning and losing headlines was, so I went back to the corpus of all of Ars’ headline pairs and divided them into winners and losers. These are the winners:
And here are the losers:
Remember that these headlines were written for exactly the same stories as the winning headlines, and for the most part they use the same words, with some notable differences. There is much less “Trump” in the losing headlines. “Million” is strongly favored in winning headlines but shows up somewhat less in losing ones. And the word “can,” a rather indecisive word, turns up more often in losing headlines than in winning ones.
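This kind of winner/loser comparison is straightforward to compute: count each word's occurrences in both groups and look at the difference. A hypothetical sketch, with made-up headlines standing in for the real corpus:

```python
# Sketch of the winner/loser word comparison: for each word, see how
# much more often it appears in losing headlines than in winning ones.
# The headlines below are invented examples, not real Ars data.
from collections import Counter

def word_counts(headlines: list[str]) -> Counter:
    counts = Counter()
    for headline in headlines:
        counts.update(headline.lower().split())
    return counts

winners = ["trump signs order", "million records leaked"]
losers = ["president can act", "data breach can grow"]

win, lose = word_counts(winners), word_counts(losers)

# Words overrepresented in losing headlines:
overused = {w: lose[w] - win.get(w, 0) for w in lose if lose[w] > win.get(w, 0)}
print(overused)
```

On a real corpus you'd normalize by the total word count of each group before comparing, since the two sets need not be the same size.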
This is interesting information, but by itself it doesn’t help predict whether a headline for a given story will succeed. Would it be possible to use ML to predict whether a headline would get more or fewer clicks? Could we use the accumulated wisdom of Ars readers to create a black box that can predict which headlines will be more successful?
Hell if I know, but we’ll try.
All of this brings us to where we are now: Ars has provided me with data on more than 5,500 headline tests from the past four years (11,000 headlines in all, each with its clickthrough rate). My assignment is to build a machine learning model that can figure out what makes a good Ars headline. And by “good,” I mean one that appeals to you, dear Ars reader. To do this, I’ve been given a small budget for Amazon Web Services computing resources and a month of nights and weekends (after all, I have a day job). No problem, right?
Before I started scouring Stack Exchange and various Git sites for magic solutions, however, I wanted to ground myself in what’s possible with ML and look at what people more talented than I am have already done with it. This survey is as much a roadmap of potential solutions as it is a source of inspiration.