We here at Federated Media have decided it’s time to lift the curtain a bit and talk about some of the high-level technology behind FM’s products. To that end, we’ve created the FM Tech Blog. In this blog we’ll share some of the results, surprises and problems we’ve uncovered along the way, as well as our ideas on where things might go in the future. And of course we’ll take your questions and try to provide cogent answers.
To get started, I’d like to talk about the technology that underlies Conversation Targeting, a new ad product launched in beta in the first half of 2011. You can probably guess that we have a semantic technology behind this. However, it’s probably not what you think it is. Yes, we have an engine that auto-generates tags, and even overarching topic labels, for thousands of new web pages every day. But there’s a lot more to it than that. Let me highlight three points of differentiation that I think you’ll find interesting.
The first point is what we define as a “conversation.” By “conversation targeting” we mean (1) getting ads onto pages that are part of an ongoing dialogue – such as blogs that spawn substantive discussion threads, and (2) targeting more than a single topic (think of a cluster of related topics).
Consider the “small business” conversation. Does that just mean pages with “small business” (or an obvious synonym, such as “home-based business”) in the content? If so, would this amount to nothing more than a whole bunch of character string matches (regular expressions or the like)?
Not really. It means that we notice where people are answering questions about the oddities of QuickBooks, or trying to determine whether a certain shipping company is less expensive for sending their customers extra-heavy packages. And many more things. In fact, several dozen related topics make up the “small business” conversation at any given time. Thus, our semantic technology does a great deal of analysis of how topics relate to other topics within communities of conversation. (We use a couple of different methods for that, which I can explain in a later post.)
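To make the idea of “related topics” a little more concrete, here is a minimal sketch of one common way to measure relatedness: counting how often two topic tags show up on the same pages (Jaccard similarity). This is an illustration only, not a description of the methods our engine actually uses. The function name, the tags and the sample pages are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def topic_relatedness(pages):
    """Score topic pairs by Jaccard similarity of the pages they appear on.

    `pages` is a list of tag sets, one per page. Illustrative sketch only;
    the post does not say which relatedness methods the real engine uses.
    """
    # Invert the input: for each tag, record which pages carry it.
    pages_with = defaultdict(set)
    for page_id, tags in enumerate(pages):
        for tag in tags:
            pages_with[tag].add(page_id)

    scores = {}
    for a, b in combinations(sorted(pages_with), 2):
        shared = len(pages_with[a] & pages_with[b])
        if shared:  # keep only pairs that co-occur at least once
            union = len(pages_with[a] | pages_with[b])
            scores[(a, b)] = shared / union
    return scores


# Hypothetical tagged pages from a "small business" community.
pages = [
    {"quickbooks", "invoicing"},
    {"quickbooks", "invoicing", "shipping rates"},
    {"shipping rates", "heavy packages"},
    {"quickbooks", "payroll"},
]
scores = topic_relatedness(pages)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

Pairs that never share a page get no score at all, so the result is a sparse graph of topics. Clusters in a graph like this are roughly what we mean by a “conversation.”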
Second point: Our community of authors and publishers is already telling us what the topics are, and we realize we’re better off just accepting this fact. So, our semantic engine was designed to work “bottom up”, not “top down”. That means we don’t have an elite editorial team dictating “from on high” what the official topics ought to be. Similarly, we don’t run a classifier to shoehorn our URLs into a canned taxonomy of topics. Instead, our engine finds out which phrases and concepts are pivotal in the conversations on our networks. It lets the community tell us what the topics of conversation are. Then, our advertisers are invited to join in with appropriate ads.
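As a rough illustration of the “bottom up” idea, the sketch below starts with no taxonomy at all. It simply collects two-word phrases from page text and keeps the ones that recur across several pages. Everything here is an assumption made for the example: the stopword list, the `min_pages` threshold and the sample pages are hypothetical, and a production engine would need far more than phrase counting.

```python
import re
from collections import Counter

# Tiny hypothetical stopword list, just enough for the example.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "is", "on", "with"}


def candidate_phrases(text, n=2):
    """Return the set of n-word phrases in `text` that neither start nor end with a stopword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrases = set()
    for i in range(len(words) - n + 1):
        gram = words[i:i + n]
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue
        phrases.add(" ".join(gram))
    return phrases


def pivotal_phrases(pages, min_pages=2):
    """Keep phrases that appear on at least `min_pages` distinct pages."""
    doc_freq = Counter()
    for text in pages:
        # Each page contributes a phrase at most once (document frequency).
        doc_freq.update(candidate_phrases(text))
    return {phrase: count for phrase, count in doc_freq.items() if count >= min_pages}


# Hypothetical page snippets.
pages = [
    "Shipping rates for heavy packages keep climbing",
    "Comparing shipping rates across carriers",
    "QuickBooks tips for sales tax",
]
topics = pivotal_phrases(pages)
print(topics)
```

The point of the design is that nothing in the code names a topic in advance. Whatever the community writes about often enough surfaces on its own.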
Third point: The key topics within a conversation are always changing. As of this writing, “Google Plus”, while a hot topic in some places, has not registered as a significant topic in the “small business” conversation. But who knows? It could do so at any moment. We have to keep a finger on the pulse. That means our algorithms have to be sensitive to a dormant topic suddenly erupting; that is, they must detect when a topic reaches critical mass within the relevant subset of our total collection of recent URLs. We have an algorithm for that, too.
We’ll be publishing some research results this year, with plenty of facts and figures about Conversation Targeting. Keep an eye out here for sneak peeks at the ongoing research.
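One simple way to think about a topic “erupting” is to compare its share of recent pages with its share of an older baseline and flag it when the ratio jumps. The sketch below does exactly that. It is not our production algorithm; the thresholds (`min_recent_pages`, `min_lift`), the smoothing choice and all of the counts are made up for illustration.

```python
def erupting_topics(baseline_counts, recent_counts, baseline_pages, recent_pages,
                    min_recent_pages=5, min_lift=3.0):
    """Flag topics whose share of recent pages far exceeds their baseline share.

    Illustrative sketch only: thresholds and smoothing are assumptions,
    not the detection algorithm referred to in the post.
    """
    erupting = {}
    for topic, recent in recent_counts.items():
        # "Critical mass": ignore topics seen on only a handful of recent pages.
        if recent < min_recent_pages:
            continue
        recent_share = recent / recent_pages
        # Add-one smoothing so a brand-new topic does not divide by zero.
        baseline_share = (baseline_counts.get(topic, 0) + 1) / (baseline_pages + 1)
        lift = recent_share / baseline_share
        if lift >= min_lift:
            erupting[topic] = lift
    return erupting


# Hypothetical page counts within the "small business" subset of URLs.
baseline = {"quickbooks": 120, "shipping rates": 80, "google plus": 1}
recent = {"quickbooks": 13, "shipping rates": 9, "google plus": 12}

flagged = erupting_topics(baseline, recent, baseline_pages=1000, recent_pages=100)
print(flagged)
```

In this toy data the steady topics stay near a lift of 1 and are ignored, while the formerly dormant one is flagged. Restricting the counts to the relevant subset of URLs is what keeps a topic that is hot elsewhere from registering before it matters to this conversation.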
And we welcome any opinions on how content-centered advertising could (or should) evolve. What else do you think needs to improve, in order for advertising to really be a value-add to a web page (instead of a distraction from it)?