Engineers are, as a group, generally quick to absorb new information. But even the most nimble-minded among us can be forgiven if the convulsive, roiling world of digital advertising — 24 years after the first clickable banner ad — still feels a little arcane.
This article will attempt to demystify recent developments in digital advertising, provide some context for understanding how we got here, and touch on a few ongoing areas of concern for technologists.
Why Flash Had to Die
The death of Flash was presaged in the very early days of the mobile revolution. The harbinger of its death was, of course, the late, great Steve Jobs. When Apple announced in 2010 that its new mobile operating system wouldn’t support Adobe’s (heretofore) ubiquitous web technology, the writing was on the wall.
The stated reasons — security, performance, energy consumption, quality control — are common knowledge to most of you, as is the ensuing fallout; but as keenly as this disruption was felt by web developers everywhere, the ramifications for digital advertising were positively seismic.
To understand why, we need to define a few wonky vocabulary terms:
VPAID (Video Player Advertising Interface Definition): Officially released by the IAB in 2009, VPAID established an industry-standard API definition to facilitate interactions between players and digital ads. Prior to the advent of VPAID, advertisers and agencies who wanted to do anything interactive in a video context needed to work with publishers to create bespoke solutions. VPAID was designed to facilitate this kind of innovation at scale.
Verification: A general term encompassing fraud detection and prevention. While ad servers routinely fulfill this function as part of their standard product offering, a number of third-party vendors (DoubleVerify, Integral Ad Science, RealVu, et al.) have made a business out of ensuring that ads are being served in a manner and location consistent with advertiser expectations (e.g. never on black-listed sites, only to the appropriate geo regions). In cases where a violation is detected, some verification components will attempt to preempt ad playback on the client.
Viewability: A subset of verification that concerns itself specifically with determining the extent to which served ads were actually displayed in a viewable area of the screen. At time of writing, the Media Rating Council, the standards body responsible for ratifying metrics definitions, defines viewable as “50% of the ad’s pixels are visible on a screen for at least two consecutive seconds”.
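To make the viewability definition concrete, here is a minimal sketch of how a client might evaluate it from periodic visibility samples. The function name, the sample format, and the sampling approach are all assumptions for illustration; real measurement vendors do this in far more elaborate ways.

```python
from typing import Iterable, Tuple


def is_viewable(samples: Iterable[Tuple[float, float]],
                min_fraction: float = 0.5,
                min_seconds: float = 2.0) -> bool:
    """Return True if the ad met the threshold for an unbroken stretch of time.

    `samples` is a hypothetical stream of (timestamp_seconds, visible_fraction)
    pairs in ascending time order, e.g. polled by the player a few times a second.
    """
    run_start = None  # timestamp at which the current qualifying run began
    for timestamp, fraction in samples:
        if fraction >= min_fraction:
            if run_start is None:
                run_start = timestamp
            if timestamp - run_start >= min_seconds:
                return True
        else:
            # Visibility dropped below the threshold: the run is broken.
            run_start = None
    return False


continuous = [(0.0, 0.6), (0.5, 0.6), (1.0, 0.7), (1.5, 0.6), (2.0, 0.6)]
interrupted = [(0.0, 0.6), (1.0, 0.6), (1.5, 0.2), (2.0, 0.6), (3.0, 0.6)]
print(is_viewable(continuous), is_viewable(interrupted))
```

The word "consecutive" is doing the work here: the second series spends more than two seconds above 50% in total, but never in one unbroken run, so it doesn't count.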
So, what does all this have to do with the death of Flash?
Well, virtually all the VPAID ad creative in circulation today is authored in Flash (specifically .swf files). Why? Because SWF containers afford the ultimate level of control to advertisers, delivering a payload of ad creative and related functionality, all neatly self-contained in a compiled executable format. Unfortunately, many advertisers and agencies have ignored the original intent of VPAID (to facilitate the creation of new interactive ad models) and have instead used VPAID to freight their ad creative with verification and viewability capabilities. It's precisely this bundling of business concerns (media, measurement, verification, and targeting) that has the whole industry hooked on Flash VPAID, but it’s also a source of significant performance issues.
Consider that, in addition to loading the VPAID creative itself, the client is often forced to wait while verification and viewability components are downloaded, accompanied perhaps by a number of retargeting pixels, each hosted on a separate domain, and you get a sense of how sluggish VPAID creative can feel to the end user. The problem can be compounded by the fact that requests for third-party verification and viewability components are often synchronous, since these services typically need to be loaded and initialized before the actual creative is displayed.
All of this has put a tremendous strain on publishers to validate the vendors they work with for compliance with industry standards and internal policies.
At Viacom, we've instituted an extensive “Ad Readiness” program which validates new vendors and ad products for performance and stability. The program is a multi-team effort, marshaling QA, operations, and engineering resources in the service of a better user experience.
But the problems endemic in the digital ad ecosystem are too widespread for any one publisher to rectify.
A VAST Improvement?
In order to address the inherent weaknesses of the current ad ecosystem, industry organizations like the IAB have been working with publishers, networks, agencies and other ad vendors to establish standards which allow all parties involved to extract value from digital advertising in a way that protects and privileges user experience. One such emerging standard — VAST 4.0 — is designed to increase transparency and accommodate new, more performant ad models for a broad range of video-capable platforms and devices.
VAST (Video Ad Serving Template) is an XML schema used to deliver structured information about an ad campaign to video clients. It organizes the constituent resources of a campaign (media URIs, tracking pixels, companion display ads, etc.) in a standard way to facilitate playback on VAST-compliant players.
VAST has been around since 2008, but has evolved significantly over the years and, in its latest incarnation, is positioned to address many of the issues we’ve come to associate with VPAID.
In particular, the VAST 4.0 specification states that “Publishers and ad vendors need a way to separate the video file from its interactive components to ensure that ads play in systems that cannot execute the interactive components. These ads should also execute more efficiently in players that are equipped to handle the interactions.”
Accordingly, VAST 4.0 provides separate elements for media files, verification components, and scripts which enable interactivity — in addition to the customary set of elements for companion display ads, tracking pixels, etc.
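For a sense of what this separation looks like to a player, here is a rough sketch that parses a hand-written, heavily simplified VAST-style document and sorts its resources by concern. The element names reflect our reading of VAST 4.0, every URL is a placeholder, and the helper functions are hypothetical; treat the specification itself as the authority on structure and required fields.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative document: not a complete or validated VAST response.
SAMPLE_VAST = """\
<VAST version="4.0">
  <Ad id="example-ad">
    <InLine>
      <AdSystem>Example Ad Server</AdSystem>
      <AdTitle>Example Campaign</AdTitle>
      <Impression><![CDATA[https://ads.example.com/impression]]></Impression>
      <AdVerifications>
        <Verification vendor="example-verifier">
          <JavaScriptResource><![CDATA[https://verify.example.com/check.js]]></JavaScriptResource>
        </Verification>
      </AdVerifications>
      <Creatives>
        <Creative>
          <Linear>
            <Duration>00:00:15</Duration>
            <MediaFiles>
              <MediaFile delivery="progressive" type="video/mp4" width="1280" height="720"><![CDATA[https://cdn.example.com/ad.mp4]]></MediaFile>
              <InteractiveCreativeFile type="application/javascript"><![CDATA[https://cdn.example.com/interactive.js]]></InteractiveCreativeFile>
            </MediaFiles>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>
"""


def text_of(elements):
    """Collect the trimmed text content of each element."""
    return [(el.text or "").strip() for el in elements]


def split_concerns(vast_xml: str) -> dict:
    """Group resource URIs by the concern they serve."""
    root = ET.fromstring(vast_xml)
    return {
        "media": text_of(root.iter("MediaFile")),
        "interactive": text_of(root.iter("InteractiveCreativeFile")),
        "verification": text_of(root.iter("JavaScriptResource")),
        "impressions": text_of(root.iter("Impression")),
    }


for concern, uris in split_concerns(SAMPLE_VAST).items():
    print(concern, uris)
```

A player that can't execute the interactive or verification scripts can still pick up the plain media file and play the ad, which is the point of the passage quoted from the specification.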
Of course, there’s nothing inherent in VAST 4.0 that prohibits the continued use of Flash, but its timely death has thrown into stark relief the need to separate concerns (interactive functionality, verification, media) in the interest of improved performance and interoperability.
The underlying motivation for driving towards this new atomized model, however, isn’t the demise of Flash VPAID as a viable commercial format. Rather, it’s an entirely new development in the world of digital advertising that's forcing advertisers, publishers and service vendors alike to establish the next paradigm.
Stream stitching — or ad stitching — has emerged in the past few years as the next evolution in digital video advertising. Instead of relying on the client to manage ad insertion at runtime, stream stitching relies on server-side processes to retrieve assets from the existing ad network infrastructure, and stitch them directly into the content stream.
There are several models for this stitching process, but the most common method is to deliver content and ad media via an ABR technology like HLS. Using this technique, media segments for the ad typically reside on an ad server (where auditing and reporting functions are instrumented), while content segments can be hosted on the publisher’s CDN. Ads and content are then “stitched” together in the manifest files for the stream.
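As a rough illustration of stitching at the manifest level, the sketch below builds a single HLS media playlist that interleaves content segments with ad segments. The function, hostnames, and segment names are hypothetical. The use of the EXT-X-DISCONTINUITY tag at each boundary is our assumption about how a stitcher would signal the switch between differently encoded media; a production stitcher also has to handle variant playlists, live windows, and per-session ad decisions.

```python
import math
from typing import List, Tuple

Segment = Tuple[str, float]  # (uri, duration_seconds)


def stitch_playlist(content: List[Segment], ad: List[Segment], ad_after: int) -> str:
    """Build a VOD media playlist with the ad inserted after `ad_after` content segments.

    Assumes at least one segment overall; empty groups are skipped.
    """
    groups = [content[:ad_after], ad, content[ad_after:]]
    groups = [group for group in groups if group]
    longest = max(duration for group in groups for _, duration in group)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(longest)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for index, group in enumerate(groups):
        if index > 0:
            # Mark the boundary between content and ad media.
            lines.append("#EXT-X-DISCONTINUITY")
        for uri, duration in group:
            lines.append(f"#EXTINF:{duration:.3f},")
            lines.append(uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


content = [
    ("https://cdn.publisher.example/show/seg0.ts", 10.0),
    ("https://cdn.publisher.example/show/seg1.ts", 10.0),
]
ad = [("https://ads.example/spot/seg0.ts", 5.0)]
print(stitch_playlist(content, ad, ad_after=1))
```

From the player's point of view there is only one stream; the fact that some segment URIs point at an ad server and others at the publisher's CDN is invisible to it.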
This technique has a number of distinct advantages over the client-side methodology. Chief among these is performance. Client-side ad insertion requires that a content stream be paused or preempted to request and play a separate set of streaming or progressive advertising assets. Stream stitching, by contrast, doesn’t suffer from this context switching penalty, and so delivers a seamless playback experience akin to linear television.
Stream stitching also promises stability — particularly over complex and error-prone client-side VPAID creative — and broad compatibility, since any device or platform that can stream video content can also deliver ads, without the need for platform-specific client-side ad components.
What’s Not to Like?
So if stream stitching can deliver content and ads to more devices, quicker and more reliably than traditional client-side methods, why hasn’t the entire industry rushed to embrace it?
Well, while adoption of stream stitching is actually on the upswing (Viacom is rolling out its home-grown solution on the web imminently), the fact that it’s not ubiquitous at this point stems from the commercial demand to preserve the verification and reporting capabilities that the industry is now accustomed to (in no small part, thanks to VPAID).
Some of these demands are at least partially tractable with stream stitching. For instance, impression tracking (reporting if and how much of an ad has played) can be managed from the server, albeit imperfectly, by recording when segments of a given ad are requested, and dispatching beacons from the server accordingly. The fact that these impression beacons will now originate from a limited set of IP addresses, and therefore look potentially like fraud, can be mitigated by using X-Forwarded-For and X-Device-User-Agent request headers.
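A server-side beacon carrying the viewer's details might be assembled along these lines. This is only a sketch: the beacon URL, IP address, and user-agent string are placeholders, the helper name is ours, and whether a given ad server actually honors these headers is something to confirm with that vendor.

```python
import urllib.request


def build_beacon_request(beacon_url: str,
                         client_ip: str,
                         client_user_agent: str) -> urllib.request.Request:
    """Prepare (but do not send) an impression beacon on the viewer's behalf."""
    return urllib.request.Request(
        beacon_url,
        headers={
            # Pass along the viewer's address and device so the beacon is not
            # attributed to the stitching server itself.
            "X-Forwarded-For": client_ip,
            "X-Device-User-Agent": client_user_agent,
        },
        method="GET",
    )


request = build_beacon_request(
    "https://ads.example.com/impression?ad=example-ad",
    client_ip="203.0.113.7",
    client_user_agent="ExamplePlayer/1.0 (SmartTV)",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request, timeout=5)
```

The stitching service would fire a request like this when it sees the relevant ad segment being fetched, using the IP address and user agent it recorded from the viewer's own segment request.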
However, as anyone who’s worked for any length of time on the web will tell you, requesting a video asset is not the same as playing it. Depending on prevailing network conditions or the size of a player’s buffer, a TS chunk for an ad could be requested many seconds (or tens of seconds) before it actually plays. If a user exits or scrubs to a different spot in the timeline, an impression (i.e. view) may be counted where none has occurred.
Other kinds of information, such as viewability (in the MRC sense), or reporting on interactivity such as a tap or a click, can only be registered and reported from the client.
In the short term, it seems likely that some footprint will need to be maintained in the client in order to satisfy commercial requirements. But as server-side stitching solutions evolve over time, expect to see a model where video players send heartbeat signals to the stitching service, with data payloads to indicate specific events (e.g. CLICK, VIEWABLE-IMPRESSION, n%-COMPLETE). In other words, lighter clients, smarter servers.
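What might such a heartbeat look like? The sketch below is speculative. The field names, the JSON encoding, and the quartile-based completion events are all our own assumptions rather than any published standard; only the event labels come from the examples above.

```python
import json
import time
from typing import Optional


def completion_event(position_seconds: float, duration_seconds: float) -> Optional[str]:
    """Return the highest quartile reached, e.g. '25%-COMPLETE', or None before 25%."""
    quartiles = min(int(position_seconds * 4 // duration_seconds), 4)
    return f"{quartiles * 25}%-COMPLETE" if quartiles > 0 else None


def heartbeat_payload(session_id: str,
                      ad_id: str,
                      event: str,
                      position_seconds: float,
                      timestamp: Optional[float] = None) -> str:
    """Serialize one heartbeat for the (hypothetical) stitching service."""
    return json.dumps(
        {
            "sessionId": session_id,
            "adId": ad_id,
            "event": event,
            "positionSeconds": position_seconds,
            "sentAt": time.time() if timestamp is None else timestamp,
        },
        sort_keys=True,
    )


print(heartbeat_payload("session-1", "example-ad", "CLICK", 3.2))
print(heartbeat_payload("session-1", "example-ad",
                        completion_event(15.0, 30.0), 15.0))
```

The client's job shrinks to observing what only it can see (playback position, taps, on-screen visibility) and reporting it; deciding which third-party beacons to fire as a result would move to the server.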
The future of video advertising is bright, but it certainly won't be Flash(y).