Over the past few weeks I have published a series of posts on targeting-related innovations and their impact on the German online display advertising industry. My intention was to bridge the gap between a business view and a more technical understanding of this key area of the internet's evolution. Since the German ecosystem might be less interesting for an international audience, this summary focuses primarily on methodological considerations and insights. As I will argue, the latest wave of innovation looks promising with regard to the infrastructural side of the targeting conundrum, but leaves some key methodological aspects in the dark: aspects which need to be solved in order to make display advertising more effective.

Since last year, Real-Time Bidding (RTB) as well as Demand-Side (DSP) and Supply-Side Platforms (SSP) have dominated the technology debate about the future of the online display ad space. The RTB proposition, to value and buy each ad impression separately in real time, is in fact a huge step towards a more efficient marketplace. However, the RTB discussion is full of tacit assumptions regarding the quality of the bidding input. Most expert blogs convey the impression that getting the valuation of an ad impression right for a specific campaign is just a matter of algorithms, a mathematical problem so to speak. Unfortunately, it is a little more complex. Let me be clear: RTB is worthless when the information on which the valuation of the ad impression is based is either incorrect or irrelevant for a proper prediction of ad-related human perception and behavior.
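To make the dependence on input quality concrete, here is a minimal sketch. It assumes a buyer values an impression as the predicted response probability times the value of a response. The function name and all figures are hypothetical and chosen for illustration only; real bidding systems are considerably more involved.

```python
def impression_value(p_response: float, value_per_response: float) -> float:
    """Expected value of one impression: response probability times its worth."""
    if not 0.0 <= p_response <= 1.0:
        raise ValueError("p_response must be a probability")
    return p_response * value_per_response


# Hypothetical figure: one response is worth 40.00 to the advertiser.
VALUE_PER_RESPONSE = 40.0

true_p = 0.0005       # the user's actual response probability (assumed)
estimated_p = 0.0020  # what an imprecise interest profile suggests (assumed)

fair_bid = impression_value(true_p, VALUE_PER_RESPONSE)
actual_bid = impression_value(estimated_p, VALUE_PER_RESPONSE)

# How much the buyer overvalues the impression because of the bad input.
overpayment_factor = actual_bid / fair_bid
```

The arithmetic is trivial, which is exactly the point: the formula is perfectly "correct" in both cases, yet the second bid is four times too high because the profile behind `estimated_p` was wrong.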

The key question is: What indicates ad-relevant interest and receptivity? Most targeting solutions base their profiling on interest taxonomies which more or less mirror product categories such as automotive, finance, health etc. These top-level items are then broken down into sub-categories (usually in a hierarchical, mutually exclusive way). Websites are then tagged accordingly, usually page by page. To cut a long story short: This kind of categorization is pretty imprecise with regard to users’ motivations. It is difficult to determine why someone visits a content site; a visitor to a car content site could be just generally interested in luxury cars (although he/she would never be able to afford one) or actually be in a buying decision mode. Sure, data mining methodologies like RFM (recency, frequency, monetary value) help to detect when someone intensifies his or her interest or enters a product search phase. However, these methods require sufficient observation data per user, ideally across websites, which is still a real hurdle for most profiling systems.
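As an illustration of the RFM idea, the following sketch computes the three measures per user from a small observation log. The log format, the user IDs, and the dates are invented for this example, and the monetary value is simply set to 0.0 where no transaction was observed.

```python
from collections import defaultdict
from datetime import date

# Hypothetical observation log: (user_id, visit_date, monetary_value)
events = [
    ("u1", date(2011, 3, 1), 0.0),
    ("u1", date(2011, 3, 20), 0.0),
    ("u1", date(2011, 3, 28), 0.0),
    ("u1", date(2011, 3, 30), 0.0),
    ("u2", date(2011, 1, 5), 120.0),
]


def rfm(events, today):
    """Aggregate recency (days since last visit), frequency and monetary value."""
    stats = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for user, day, value in events:
        s = stats[user]
        s["last"] = day if s["last"] is None else max(s["last"], day)
        s["frequency"] += 1
        s["monetary"] += value
    return {
        user: {
            "recency_days": (today - s["last"]).days,
            "frequency": s["frequency"],
            "monetary": s["monetary"],
        }
        for user, s in stats.items()
    }


profiles = rfm(events, today=date(2011, 4, 1))
```

In this toy log, `u1` shows the pattern of intensifying interest (four visits, the last one two days ago), while `u2` has a single observation months back. With only one or two events per user, as is common in practice, the measures say very little, which is the hurdle described above.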

Another issue is that content categorization usually ignores the context of the content: a finance channel within a portal might reflect a different type of interest and attract a different type of visitor than a stand-alone finance destination. Nevertheless, both carry the very same interest label. Thus, my hypothesis is that URLs often carry contextual information which goes beyond content categorization. Good examples of this are “intent”-indicating environments such as e-commerce sites. A visitor to such a site most likely has a stronger interest than a visitor to an analogous content channel within a portal, for example. Companies like eXelate and Bluekai have built their business on this profiling advantage. Using e-commerce sites for profiling seems effective for performance campaigns, but does not help in the upper regions of the consumer decision funnel. A branding or product awareness campaign can’t be based entirely on e-commerce profiles which are close to transactions. Thus, an improved profiling approach for non-transaction content sites is required.

But what should a profiling approach look like that eliminates, or at least diminishes, human bias and lack of precision? How can the context of a site’s content be captured? The most granular approach would be a direct use of URLs as input parameters. On this basis, however, pattern recognition would be quite difficult, since a single URL as such cannot be generalized. Only high-traffic URLs would be part of discovered rules.
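One conceivable way to soften this sparsity problem, offered here only as a sketch and not as a tested method, is to back off from the full URL to shorter path prefixes until a level has enough observations. The function names, the example hosts, and the `min_support` threshold are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_levels(url):
    """Return feature candidates for a URL, from most to least specific."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    segments = [s for s in parts.path.split("/") if s]
    levels = [host + "/" + "/".join(segments[:i]) for i in range(len(segments), 0, -1)]
    levels.append(host)
    return levels


def build_counts(urls):
    """Count how often each level (full URL, path prefixes, host) was observed."""
    counts = Counter()
    for url in urls:
        counts.update(url_levels(url))
    return counts


def generalize(url, counts, min_support):
    """Pick the most specific level seen at least min_support times, else None."""
    for level in url_levels(url):
        if counts[level] >= min_support:
            return level
    return None


# Hypothetical page-view log.
log = [
    "http://portal.example/finance/stocks/dax",
    "http://portal.example/finance/stocks/mdax",
    "http://portal.example/finance/funds/overview",
    "http://cars.example/luxury/review-1",
]
counts = build_counts(log)
features = [generalize(url, counts, min_support=2) for url in log]
```

Here the two stock pages collapse into `portal.example/finance/stocks`, the funds page falls back to `portal.example/finance`, and the single car page yields no feature at all. The sketch keeps the host in every feature, so a finance channel inside a portal stays distinguishable from a stand-alone finance site, but whether path prefixes actually capture context is precisely the open question.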

Human categorizations of contexts carry the same risk of inducing a bias as the described taxonomy approach. Nevertheless – I think – it could be worthwhile annotating content with meta-information such as “OEM website”, “e-commerce environment”, “information portal”, “stand-alone interest site addressing xyz”. Of course this meta-categorization system needs to be thought through and then tested in order to see whether this additional information on the nature of the site really helps to capture the visitor’s motivation more appropriately.

Another approach could be a machine-driven, bottom-up classification of URLs. Methods like “latent semantic indexing” or, as an enhancement, “latent Dirichlet allocation” could be used to shift the classification decision from humans to machines, this time on an entirely semantic basis. Whether such an approach really captures the contextual aspect is an open question and would need to be tested.

All these ideas and considerations have one hypothesis in common: There is still room for improvement with regard to profiling methodologies. Whoever makes significant progress on this front will be able to estimate the value of any given ad impression more accurately than the seller and, ideally, any other competing buyer, thereby realizing arbitrage profits. This brings us back to the ecosystem level. The outcome of the display advertising battle depends heavily on customer knowledge, which in turn is the result of customer intelligence. Media agencies and premium publishers are working on these systems while large brands are still waiting on the sidelines. It is an interesting question to what extent brands and OEMs could build customer intelligence systems themselves. An important prerequisite would be a broad profiling base. The key question is: Which brands have sufficient digital touch points for profiling their existing and potential customers? But even if they lack these touch points, can big consumer brands afford to let agencies build these customer intelligence systems on their own, thereby occupying the strategic pole position in the online advertising and sales market? Maybe not…

This content is published under the Attribution-Noncommercial-No Derivative Works 3.0 Unported license.
