PTF Blog

How AI is fighting the monopoly in sports advertising with GPUs and servers

Sporting events today are rife with advertising, from commercials on screens to static company logos on stadium billboards. Broadcasts reach multiple countries, each with its own brands and advertising laws. Thanks to breakthroughs in AI and AR technology, it is now possible to customize what is displayed for each audience, directly during the live broadcast of a match.
Advertising in sports arenas was initially static and intended for the people attending a particular event in a particular city. Later, LED screens appeared, which showed alternating adverts. The next evolution was to flash a green screen onto the billboard spots for a fraction of a second, allowing the AI to identify the space and insert unique content.

Nowadays, you can replace anything with anything. The challenge is to make sure that the replacement is done discreetly and realistically. The major players in this market use special cameras. Judging from photos, these cameras carry specialized sensors that capture positioning and alignment information. In other words, the popular solutions still rely on hardware crutches.

All this requires enormous computing power, as the sensor inputs and the original broadcast are processed by specialized software in real time. That is, compute servers are connected to the cameras and sensors. This equipment is expensive, the market is closed, and the technology is proprietary. This is where AI and cloud GPU servers come to the rescue.

Expensive, static and not for everyone

HOSTKEY has a client, a startup called PTF Lab, which has developed its own technology for implementing virtual advertising and integrating digital content (like augmented reality) in a multi-regional mode. Their solution promises seamless integration of adverts directly into the video stream.
Some time ago, they saw information about HOSTKEY in the Open Data Science (ODS) community and reached out to us when they needed cloud capacity, appreciating our service, pricing, and flexible lineup of available options.

The startup's goals are noble, understandable and quite achievable:

  1. To get away from expensive proprietary equipment and complex setup by shifting advert placement and frame construction to artificial intelligence that takes into account people and objects overlapping in the frame.
  2. To cover relatively small events (such as arena fights) and bring the technology to the masses.
  3. Ultimately, to make sports advertising accessible and relatively inexpensive.

Moreover, the startup has set an ambitious goal of surpassing the solutions offered by monopoly giants in terms of flexibility. For example, it aims to display "virtual adverts" during replays and from any camera angle, not just a few fixed ones.

How does it work?

The video signal from a sporting event venue can be processed using computing power not only at the venue itself, but also in the cloud. This allows for flexible load distribution and the choice of when to apply adverts: before or during broadcasting, taking into account different markets. Also, working with cloud services allows you to use advertising in locations where it is impossible to bring a server (and in principle it is more convenient).

Object segmentation is based on the U-Net neural network architecture. Neural networks are responsible for locating objects and for detecting and matching key points. However, the task is non-trivial, so all the existing solutions and neural networks had to be reworked and retrained for this use. It is especially difficult in martial arts broadcasts, where literally everything is unpredictable from the point of view of the picture: light sources, shadows, camera angles, the mesh overlapping sponsor logos, and the bodies of the fighters and referees.

Neural networks are not used everywhere. Sometimes, to solve a problem, it is enough to show ingenuity and use simple algorithms. For example, tracking algorithms combine neural network methods and systems of linear and nonlinear equations.

A significant part of the GPU capacity is taken up by segmentation. The better people and objects in the frame are detected and separated by depth plane and type, the more natural and attractive the frame will look after the advertising overlay.
A separate task is related to lighting and shadows, which must be taken into account in augmented reality when rendering a scene. The realism of shadows is a key element in assessing the "believability" of the picture.
Traditionally, neural networks for sports are trained on real broadcasts with manual annotation and on synthetic models. Here Blender comes to the rescue: the company builds 3D models of the ring, fighters, and referees in it and obtains rendered footage from the right angles together with the segmentation masks and the positions of objects and cameras needed for training. Annotating real data is time-consuming and expensive, but gives high quality for specific venues or types of competitions. Synthetic data is less realistic but provides more data for training.
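
One reason synthetic scenes are convenient is that labels come for free: when the camera and the objects are placed by hand in a 3D scene, the pixel position of any key point follows from plain geometry. The sketch below illustrates this with a simple pinhole camera model in pure Python. It is not PTF Lab's code or Blender's API; the function name, the camera parameters, and the corner coordinates are hypothetical values chosen for illustration.

```python
def project_point(point_world, cam_pos, rot_world_to_cam, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    rot_world_to_cam is a 3x3 rotation matrix (list of rows).
    Returns None when the point lies behind the camera.
    """
    # Express the point relative to the camera position.
    d = [p - c for p, c in zip(point_world, cam_pos)]
    # Rotate into the camera frame (x right, y down, z forward).
    x, y, z = (sum(r * v for r, v in zip(row, d)) for row in rot_world_to_cam)
    if z <= 0:
        return None
    # Perspective divide, then shift to the image centre.
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical synthetic scene: four ring corners in metres, a camera 10 m away.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ring_corners = {
    "corner_a": (-3.0, 1.0, 0.0),
    "corner_b": (3.0, 1.0, 0.0),
    "corner_c": (3.0, 1.0, 6.0),
    "corner_d": (-3.0, 1.0, 6.0),
}
keypoint_labels = {
    name: project_point(p, (0.0, 0.0, -10.0), IDENTITY, 1000, 1000, 960, 540)
    for name, p in ring_corners.items()
}
```

Each entry of `keypoint_labels` is a ground-truth 2D position that would otherwise have to be marked by hand on a real broadcast frame.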

The main difficulty is that the venues can vary. In one case it will be a boxing ring with ropes and in another case it will be an arena with mesh walls, each of which creates difficulties for segmentation.
Camera tracking and advert position are determined by comparing the point cloud from the 3D model of the venue with the actual positions of those points in the frame. This makes it possible to determine the camera position even for handheld cameras with chaotic movement. Once the 3D view has been reconstructed from the 2D frame (that is, the required camera angle has been determined), the advert is rendered in the 3D engine and combined with the broadcast frame.
A 3D scene has to be built before work starts, so in effect there is a virtual copy of the venue in the frame, into which real people and objects are fitted through render masks. It sounds complicated, but with enough computing power and optimized neural networks, it is possible to perform these somersaults instantly and seamlessly.
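
To make the idea of matching model points to frame points concrete, here is a minimal sketch of a simplified case. It assumes the matched points all lie on one flat surface (for example the ring floor), so the model-to-frame mapping is a homography that can be solved from four correspondences as a linear system. The full 3D case described above would need a 3D-to-2D pose solver instead. All function names and coordinates are hypothetical and not taken from PTF Lab's software.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(model_pts, frame_pts):
    """Return a 3x3 homography (h33 fixed to 1) from exactly 4 point pairs."""
    if len(model_pts) != 4 or len(frame_pts) != 4:
        raise ValueError("exactly four correspondences are required")
    a, b = [], []
    for (x, y), (u, v) in zip(model_pts, frame_pts):
        # Two linear equations per correspondence, eight unknowns in total.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, point):
    """Map a model-plane point to frame pixels using homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical data: a 6 m x 6 m mat in the venue model and its detected
# corners in one broadcast frame (pixels).
MAT_CORNERS_M = [(0, 0), (6, 0), (6, 6), (0, 6)]
MAT_CORNERS_PX = [(400, 700), (1500, 720), (1250, 380), (620, 370)]
H = estimate_homography(MAT_CORNERS_M, MAT_CORNERS_PX)
# Where a 2 m x 2 m logo in the centre of the mat lands in this frame.
logo_px = [project(H, p) for p in [(2, 2), (4, 2), (4, 4), (2, 4)]]
```

Recomputing `H` for every frame is what lets the advert stay glued to the surface while the camera moves. In practice many more than four points would be matched and outliers rejected, which this sketch leaves out.
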
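
The final step, fitting real people and objects over the rendered advert through masks, can be illustrated with per-pixel blending. The following is a minimal sketch in pure Python, assuming the segmentation network outputs a foreground mask with values between 0 and 1 and the 3D engine outputs an advert layer with its own coverage (alpha). The function name and the data layout are assumptions made for illustration. A production pipeline would run this on the GPU over whole arrays rather than in Python loops.

```python
def composite(frame, ad_layer, ad_alpha, foreground_mask):
    """Blend a rendered advert into a frame, keeping foreground pixels on top.

    frame, ad_layer: rows of (r, g, b) tuples with values 0..255.
    ad_alpha: rows of floats in [0, 1], 1 where the advert was rendered.
    foreground_mask: rows of floats in [0, 1], 1 where a person or object
    in front of the advert was segmented.
    """
    out = []
    for f_row, a_row, alpha_row, m_row in zip(frame, ad_layer, ad_alpha, foreground_mask):
        out_row = []
        for f_px, a_px, alpha, fg in zip(f_row, a_row, alpha_row, m_row):
            # The advert shows only where it was rendered and is not occluded.
            w = alpha * (1.0 - fg)
            out_row.append(tuple(round(w * a + (1.0 - w) * f) for f, a in zip(f_px, a_px)))
        out.append(out_row)
    return out


# Hypothetical one-row frame: no advert, visible advert, advert behind a fighter.
frame = [[(10, 10, 10), (10, 10, 10), (10, 10, 10)]]
ad_layer = [[(200, 0, 0), (200, 0, 0), (200, 0, 0)]]
ad_alpha = [[0.0, 1.0, 1.0]]
foreground_mask = [[0.0, 0.0, 1.0]]
result = composite(frame, ad_layer, ad_alpha, foreground_mask)
```

Fractional mask values at the edges of a person give a soft transition instead of a hard cut-out, which is one reason segmentation quality matters so much for how natural the result looks.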

Why does the project need GPU computing and server rentals?

PTF Lab has its own servers (and the possibility of using them on site at the competition venue was mentioned earlier), but it is more convenient to use remote resources: the service provider's engineers are responsible for equipment availability, so the company spends fewer of its own resources on this. The client also votes with its wallet, and cost-effective options are always preferable.

Also, the capacity required by the company is constantly growing. If necessary, it can be easily scaled up just by renting more (up to and including changing the server configuration to suit the company's needs).

In the future, the startup may need a lot of cloud capacity. It is easier to rent it than to buy and sell physical servers as the demand for the startup's services rises and falls.

The leased servers and GPU capacity are now being used in the following areas:

  1. Training of neural networks on GPUs (segmentation of people and other objects in sports broadcasts; 3D virtual camera tracking).
  2. Data backup (video from events, datasets, etc.).
  3. Working with video directly: testing cloud production, where the company's software is deployed on remote servers and video signals flow through it (input without graphics, output as one or more signals with graphics).

In addition to its own computing power, the startup is currently renting the following server configurations from HOSTKEY:

  • AMD Ryzen 9 5950X 3.4GHz (16 cores)/128GB/1TB NVMe SSD+12TB HDD/2xRTX 3090+PSU
  • Xeon E3-1230 3.2GHz (4 cores)/16GB/2x12TB HDD/PSU
  • AMD Ryzen 9 5950X 3.4GHz (16 cores)/128GB/1TB NVMe SSD+12TB HDD/2xRTX 3090+PSU+HDMI emulator
  • AMD Ryzen 9 5950X 3.4GHz (16 cores)/128GB/1TB NVMe SSD+12TB HDD/2xRTX 3090+PSU
  • AMD EPYC 7402P 2.8GHz (24 cores)/384GB/2x1.92TB U3 NVMe SSD/4xRTX 4090+2xPSU

As you can see, most of the rented GPU servers are based on the RTX 4090/3090. However, as computing power requirements increase, the startup would like the option of servers with more powerful, professional cards that offer better stability in continuous 24/7 operation. In the case of the 3090, renting a comparable A5000 might even be cheaper.

PTF Lab is at the beginning of its journey, but their results are already promising and we at HOSTKEY wish them success and growth, especially in terms of overcoming the monopoly of sports augmented reality. The witty David always defeats the clumsy proprietary Goliath.