Memory Management
Overview
On-Prem is on-premises software that Foursquare provides for local deployment on your own infrastructure or on AWS. It responds to geofencing and audience targeting queries in low-latency, high-volume environments, which makes it well suited to real-time transactions such as real-time bidding (RTB) or other dynamic ad serving decisions.
Since On-Prem is designed for low-latency querying, it stores the compiled binary filters for Audience and Proximity in memory. You can keep memory usage on your On-Prem server under control by following a few housekeeping practices and by avoiding filters that are unnecessarily large.
This document provides an overview of these best practices. In addition, your Foursquare representative is always available for assistance. If you have implementation questions related to the different versions of On-Prem (HTTP vs Java vs AMI), please refer to the On-Prem Overview documentation.
Understanding Filters
There are two types of filters that On-Prem stores in memory – Proximity and Audience.
A Proximity filter is the binary representation of the geofences that make up specific proximity targeting criteria. These geofences may be in the form of polygons, point/radius circles, or a combination of both. A Proximity filter may contain millions of individual geofences. In a filter, these geofences are compressed into a binary format optimized for run-time querying.
An Audience filter (also called “Audience Set”) is a collection of device IDs that match specific location activity patterns. These patterns enable Foursquare to build audience segments like “Business Travelers” or “Parents” or “frequent visitors to KFC”, for example. An Audience Set is the optimized binary representation of the device IDs in one of these segments.
Good memory management for On-Prem involves following two main guidelines:
- Design reasonably sized filters
- Manage active filters wisely
Ultimately, managing memory usage is not too different from how you would manage the storage on your phone or PC.
Designing Reasonably Sized Filters
Audience
The following are some examples of memory footprints for a number of Audience Sets.
Audience Set (based on US users) | # of devices | Set Size |
---|---|---|
Users seen within 5mi of a coffee shop in the last 1 yr | >50M | >300MB |
Foursquare business travellers | 7M | 44MB |
Foursquare in-market auto buyers | 2M | 14MB |
Users seen within 100m of a McDonald’s in the last 6 mo | 5M | 30MB |
Users seen within 200m of Home Depot or Best Buy in the last 6 mo | 2M | 13MB |
Note: # of devices is averaged per exchange. Actual numbers will vary depending on the underlying exchange.
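As a rough way to read the table above, you can divide each set size by its device count. The sketch below does that arithmetic on the table's own figures and suggests roughly 6 to 7 bytes per device. It assumes decimal megabytes, and the helper names are illustrative only, not part of On-Prem.

```python
# Figures copied from the table above: (label, devices, set size in MB).
# The first row is listed as ">50M" and ">300MB", so it is only a lower bound.
AUDIENCE_SETS = [
    ("Within 5mi of a coffee shop, 1 yr", 50_000_000, 300),
    ("Business travellers", 7_000_000, 44),
    ("In-market auto buyers", 2_000_000, 14),
    ("Within 100m of a McDonald's, 6 mo", 5_000_000, 30),
    ("Within 200m of Home Depot or Best Buy, 6 mo", 2_000_000, 13),
]

BYTES_PER_MB = 1_000_000  # assumption: the table uses decimal megabytes


def bytes_per_device(devices: int, size_mb: float) -> float:
    """Average number of bytes a set spends on each device ID."""
    return size_mb * BYTES_PER_MB / devices


def estimate_set_size_mb(devices: int, bytes_each: float) -> float:
    """Rough size estimate for a set with the given number of devices."""
    return devices * bytes_each / BYTES_PER_MB


ratios = [bytes_per_device(devices, size) for _, devices, size in AUDIENCE_SETS]
average = sum(ratios) / len(ratios)

for (label, _, _), ratio in zip(AUDIENCE_SETS, ratios):
    print(f"{label}: {ratio:.1f} bytes per device")
print(f"Average: {average:.2f} bytes per device")
```

Treat the result as a planning estimate only. As the note above says, actual device counts vary by exchange.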
Clearly, an Audience Set for users who have been within 5 miles of a coffee shop in the past year is unnecessarily large. Its targeting scope is overly broad and, although the encoding is efficient for that scale, it takes up a relatively large amount of memory: a typical server can hold only between 30 and 200 sets of this size. Avoid designing filters that produce a similar outcome unless you intend to run what is essentially a run-of-network campaign and accept relatively weak lift. In particular, it is a good idea to follow these guidelines when building custom audience sets (i.e., Tailored Location Segments, or TLS):
- For locations like airports, stadiums, or malls, keep the radius under 2,000m.
- For locations like major department stores or offices, keep the radius under 1,000m.
- For locations like smaller retail stores or QSR restaurants, keep the radius under 200m.
- Keep the number of results in the TLS under 300,000. In some cases, targeting a category like “Business and Services > Insurance” automatically yields more than 300,000 results. These may be viewed as exceptions to the general rule.
Think of the above guidelines as upper bounds on radius. The radius you actually use depends on your campaign’s goal. If you are looking for people who might be driving past the location, it may make sense to skew toward the higher end of the range. If you are looking for people who are actually close by or at the location, aim for a radius well below the upper bound. Additionally, these bounds apply only to national campaigns. For regional campaigns, the radius should be much tighter (e.g., one tenth of the numbers above), since you will likely want to capture people who are actually near the local businesses of interest.
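The guidelines above can be encoded as a simple pre-flight check on a segment design. The following is a minimal sketch, not a Foursquare tool: the category keys, the `SegmentDesign` class, and the function name are hypothetical, and only the numeric limits come from the guidelines above.

```python
from dataclasses import dataclass

# Upper bounds on radius (meters) taken from the guidelines above.
# The dictionary keys are hypothetical labels chosen for this sketch.
MAX_RADIUS_M = {
    "large_venue": 2_000,   # airports, stadiums, malls
    "major_store": 1_000,   # major department stores, offices
    "small_retail": 200,    # smaller retail stores, QSR restaurants
}
MAX_TLS_RESULTS = 300_000
REGIONAL_DIVISOR = 10  # "one tenth of the numbers above" for regional campaigns


@dataclass
class SegmentDesign:
    category: str
    radius_m: float
    result_count: int
    regional: bool = False


def check_design(design: SegmentDesign) -> list[str]:
    """Return a list of guideline warnings; an empty list means none apply."""
    warnings = []
    limit = MAX_RADIUS_M[design.category]
    if design.regional:
        limit = limit / REGIONAL_DIVISOR
    if design.radius_m >= limit:
        warnings.append(
            f"radius {design.radius_m:g}m is not under the {limit:g}m bound"
        )
    if design.result_count >= MAX_TLS_RESULTS:
        warnings.append(
            f"{design.result_count:,} results is not under {MAX_TLS_RESULTS:,}"
        )
    return warnings


national = SegmentDesign("small_retail", radius_m=150, result_count=12_000)
regional = SegmentDesign("small_retail", radius_m=150, result_count=12_000,
                         regional=True)
print(check_design(national))
print(check_design(regional))
```

A warning is a prompt to reconsider the design rather than a hard failure, since some categories legitimately exceed 300,000 results.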
Proximity
The following are some examples of memory footprints for a number of Proximity Filters.
Proximity Filter (based on US places) | # of results | Radius | Filter Size |
---|---|---|---|
Fast food, coffee shops, cafes, bakeries, juice bars, 5 mi | 378,000 | 8,000m | 707MB |
Colleges, 5 mi | 34,000 | 8,000m | 84MB |
Computer & tech businesses and computer retail stores, 150m | 209,000 | 150m | 30MB |
Best Buy, 500m | 1,000 | 500m | 0.7MB |
Honda dealers, 175m | 2,000 | 175m | 0.3MB |
The guidelines for keeping Proximity Filters from becoming overly large are similar to the ones mentioned above for Audience Sets. In particular, take special care with designs that contain hundreds of thousands of place results. As the results blanket more and more area, a radius much greater than 500 meters yields a filter that approximates targeting the entire population. See the example below.
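To get a feel for why radius matters so much at this scale, the back-of-the-envelope sketch below (an illustration added here, not a Foursquare-provided example) sums the circle areas for 378,000 places at several radii. It ignores overlap between circles, so the result is only an upper bound on covered area, and the US land area constant is an approximate figure assumed for the comparison.

```python
import math

US_LAND_AREA_KM2 = 9_150_000  # approximate figure, assumed for this sketch


def max_covered_area_km2(places: int, radius_m: float) -> float:
    """Upper bound on covered area: sum of circle areas, ignoring overlap."""
    radius_km = radius_m / 1000
    return places * math.pi * radius_km ** 2


for radius_m in (150, 500, 8_000):
    area = max_covered_area_km2(378_000, radius_m)
    share = area / US_LAND_AREA_KM2
    print(f"{radius_m:>5} m: up to {area:,.0f} km^2 "
          f"({share:.1%} of approx. US land area)")
```

At 8,000m the summed area is several times the size of the country, which means the circles overlap heavily and cover most populated places. Land area is not the same as population: places cluster where people live, so population coverage grows faster than these percentages suggest. The sketch does not model that effect.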
One additional guideline for Proximity Filters: avoid including a Payload in your filter when you don’t need one. Even if the payload only consists of a small number of attributes, the impact will be great – in some cases, the size of your index could be doubled. You should use the Payload option only when you absolutely need place attribute data in your creative.
If you have further questions about guidelines for keeping your filters a reasonable size, you can reach out to your Foursquare representative.
Managing Active Filters Wisely
Once a filter is activated, it will take up some amount of memory on your On-Prem instance. Since filters are not “purged” automatically, you should take certain steps to ensure that you are only keeping active the filters that are actually being used by your live campaigns. If you leave your active filters unchecked, you’ll risk potential runaway memory consumption as new filters keep getting added while old ones never get removed. This is analogous to your hard drive filling up. Note that if memory limits are exceeded, the On-Prem instance will stop loading new filters and log a state of `INSUFFICIENT_MEMORY`.
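One way to reason about headroom before activating another filter is to total the sizes of the filters that are already active. The sketch below is plain arithmetic under stated assumptions: the 1,024MB budget is a hypothetical number, the filter sizes are borrowed from the example tables earlier in this document, and the way On-Prem itself accounts for memory is not described here.

```python
def memory_headroom_mb(active_filter_sizes_mb, budget_mb):
    """Memory left in the budget after the active filters are counted."""
    return budget_mb - sum(active_filter_sizes_mb)


def can_activate(new_filter_mb, active_filter_sizes_mb, budget_mb, reserve_mb=0):
    """True if the new filter fits while keeping an optional reserve free."""
    headroom = memory_headroom_mb(active_filter_sizes_mb, budget_mb)
    return headroom - reserve_mb >= new_filter_mb


# Sizes (MB) taken from the example tables in this document.
ACTIVE_MB = [707, 84, 30, 0.7, 0.3, 44]
BUDGET_MB = 1024  # hypothetical budget for filters on one instance

print(f"Headroom: {memory_headroom_mb(ACTIVE_MB, BUDGET_MB):.1f} MB")
print("Fits a 300MB set:", can_activate(300, ACTIVE_MB, BUDGET_MB))
print("Fits a 14MB set:", can_activate(14, ACTIVE_MB, BUDGET_MB))
```

In this example a single 707MB Proximity Filter consumes most of the budget, which is why oversized filters are worth redesigning first.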
Monitoring Deployed Filters
It’s a good idea to periodically monitor what filters are currently active on your instance. Your Foursquare representative can help you in this regard by sending over a report (CSV file) listing all active filters in your On-Prem instance, including filters that have been requested but not yet activated. You can sort this list by size or date-updated to quickly check for memory-intensive or outdated filters.
Sample deployed filters report:
You should use this report to check for filters that are outdated and no longer being used. Work with your Partner Services representative to have these filters deactivated on your instance of On-Prem. You should also check for filters that are unusually large. In most cases, these filters can be redesigned to follow the memory-efficiency guidelines described in the previous section. You can ask your Foursquare representative to help redesign and rebuild such filters so that they take up less space.
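If you prefer to review the report programmatically, a short script can do the sorting for you. The sketch below assumes a CSV with columns named `filter_name`, `size_mb`, and `date_updated`. Those column names and the dates in the sample rows are hypothetical placeholders, since the actual report layout is not shown here, so adjust them to match the file you receive.

```python
import csv
import io
from datetime import date

# Hypothetical sample; the real report's columns and values may differ.
SAMPLE_REPORT = """\
filter_name,size_mb,date_updated
"Colleges, 5 mi",84,2023-01-15
"Best Buy, 500m",0.7,2023-06-01
"Honda dealers, 175m",0.3,2022-03-20
"""


def load_report(text):
    """Parse the CSV text and convert size and date columns to real types."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["size_mb"] = float(row["size_mb"])
        row["date_updated"] = date.fromisoformat(row["date_updated"])
    return rows


def largest(rows, n):
    """The n most memory-intensive filters, biggest first."""
    return sorted(rows, key=lambda r: r["size_mb"], reverse=True)[:n]


def stale(rows, cutoff):
    """Filters last updated before the cutoff date."""
    return [r for r in rows if r["date_updated"] < cutoff]


rows = load_report(SAMPLE_REPORT)
print("Largest:", [r["filter_name"] for r in largest(rows, 2)])
print("Stale:", [r["filter_name"] for r in stale(rows, date(2023, 1, 1))])
```

To run it against a real file, read the file's contents and pass them to `load_report` in place of `SAMPLE_REPORT`.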
Using the Management APIs
Foursquare has a set of deployment management APIs that enable you to activate and deactivate filters as needed. Since filters only take up memory when they are active, it is highly recommended that you use these APIs to activate filters right before their corresponding campaigns go live and deactivate filters right after their corresponding campaigns end. This guarantees that the only filters you keep active are the ones required by live campaigns. All other filters can be deactivated. This avoids wasting memory on filters that are not currently being used. Note that deactivating a filter merely “pauses” it and does not permanently delete the filter. You can easily re-activate the filter should a pending campaign require it. Please refer to the Management API Documentation for details.
For more information on the Deployment APIs, talk to your Foursquare Partner Services representative.
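The activate-before-launch, deactivate-after-end routine can be driven from your campaign calendar. The sketch below only computes which filters should change state on a given day and deliberately makes no Management API calls, because the endpoints are not described in this document. The `Campaign` class, the filter identifiers, and the campaign dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Campaign:
    name: str
    filter_id: str  # hypothetical identifier for the filter the campaign uses
    start: date
    end: date       # last day the campaign is live (inclusive)


def plan_filter_changes(campaigns, active_filter_ids, today, lead_days=1):
    """Return (to_activate, to_deactivate) as sorted lists of filter IDs.

    A filter is needed from lead_days before its campaign starts until the
    campaign's last day, so it is activated right before the campaign goes live.
    """
    lead = timedelta(days=lead_days)
    needed = {c.filter_id for c in campaigns if c.start - lead <= today <= c.end}
    active = set(active_filter_ids)
    return sorted(needed - active), sorted(active - needed)


CAMPAIGNS = [
    Campaign("spring-auto", "f-auto", date(2024, 3, 1), date(2024, 3, 31)),
    Campaign("summer-travel", "f-travel", date(2024, 6, 1), date(2024, 8, 31)),
]

to_activate, to_deactivate = plan_filter_changes(
    CAMPAIGNS, active_filter_ids={"f-travel"}, today=date(2024, 3, 10)
)
print("Activate:", to_activate)
print("Deactivate:", to_deactivate)
```

You would then pass the two lists to whatever activation and deactivation calls the Management API provides. Because deactivation only pauses a filter, a plan like this is safe to run daily.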
Using Deployment Tags
While the two main guidelines described above should handle most of your memory management needs, you can further alleviate memory usage by using Deployment Tags to allocate filters across your On-Prem instances in a more balanced manner.
A Deployment Tag is a string associated with a Deployment. You can add a Deployment Tag to a Deployment via the API described here. For details on how to add such tags without having to use the API, talk to your Partner Services representative.
Deployment Tags help you direct Deployments to specific instances of On-Prem. This enables you to systematically distribute active filters across your On-Prem instances, so that the same filter does not have to be loaded on multiple instances.
Suppose your system runs two instances of the On-Prem Java Library: one instance handles exchange A and the other handles exchange B. You can initiate the first On-Prem Java Library instance with a Deployment Tag “xA” and the second instance with “xB”. The first instance will only handle Deployments with the Deployment Tag “xA” and the second instance will only handle Deployments tagged with “xB”.
You can specify the Deployment Tags for instances of On-Prem Java Library by passing in a list of the Deployment Tags when instantiating the On-Prem Java Library instance for Proximity and Audience. See the On-Prem Java Library javadocs for the appropriate signatures.
If you’re using the On-Prem HTTP Server, the deployment tags can be specified in your config.yml file, using the `deployment-tags` parameter within the appropriate product (`audience.sets` or `proximity`). See the example config.yml file for more details.
Here are some use cases employing the Deployment Tag:
- Multiple Exchange Management. You might be handling RTB streams from different exchanges, and you may have configured your ad bidders to handle one exchange per instance. You might want each of the associated On-Prem Java Library instances to handle audience targeting for only the bidder’s exchange, either for data rights or technical reasons.
- Memory Management. On-Prem Java Library stores Geopulse Sets in memory. You might hit memory limits on a machine if the number of sets gets too large. A solution is to split the load between two machines and use the Deployment Tag to specify which machine’s instance handles which group of Geopulse Sets, lowering each machine’s memory requirement to roughly half of the original.
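The memory effect of splitting by tag can be modeled in a few lines. The sketch below is a simplified model rather than On-Prem code: it assumes an instance loads exactly the Deployments whose tag appears in its tag list, as in the “xA” and “xB” example above. The set sizes are borrowed from the Audience table earlier in this document, and the assignment of sets to tags is made up for illustration.

```python
def loaded_mb(deployments, instance_tags):
    """Total size of the deployments whose tag is in the instance's tag list."""
    return sum(size_mb for _, size_mb, tag in deployments if tag in instance_tags)


# (set name, size in MB, deployment tag); the tag assignment is hypothetical.
DEPLOYMENTS = [
    ("Business travellers", 44, "xA"),
    ("In-market auto buyers", 14, "xA"),
    ("Within 100m of a McDonald's", 30, "xB"),
    ("Within 200m of Home Depot or Best Buy", 13, "xB"),
]

# Hypothetical instance names mapped to the tags each one is started with.
INSTANCES = {"instance-A": ["xA"], "instance-B": ["xB"]}

total_mb = sum(size_mb for _, size_mb, _ in DEPLOYMENTS)
print(f"Every set on one machine: {total_mb} MB")
for name, tags in INSTANCES.items():
    print(f"{name} with tags {tags}: {loaded_mb(DEPLOYMENTS, tags)} MB")
```

The split is only as even as the tag assignment, so it is worth checking per-tag totals when one exchange carries noticeably larger sets than another.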