What are Data Transfer (DT) files?
DFP Data Transfer files are the event-level DFP server log data made available to publishers. They offer a very granular peek (yes, very granular) into the data collected by DFP. It's a paid add-on feature in DFP, for which you will need to contact your account manager.
Is it useful for me?
Yes, it is. But it all depends on the amount of technical expertise and resources you have to handle the data, as it's pretty raw. To break down the data received through DT, you’ll need to polish your grep and AWK skills (for medium-level data operations) or find other alternatives. If you are a big publisher, you’ll need to set up a robust ingestion system with dedicated cloud storage space to load the data on a daily basis. If you are thinking that you can simply use Excel or a similar tool to crunch the data, you are wrong, because the daily files are humongous.
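If grep and AWK are not your thing, the same idea (read one line at a time, never load the whole file) can be sketched in Python. This is only a minimal sketch: it assumes the file is gzip-compressed, has a header row, and uses a comma as the delimiter. The LineItemId column name and the sample rows are illustrative, so check them against your own DT form.

```python
import csv
import gzip
import io
from collections import Counter


def read_dt_file(path, delimiter=","):
    """Lazily yield rows of a gzip-compressed, delimited file as dicts.

    Assumes the first line is a header row (an assumption, not a given).
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)


def count_by_field(rows, field):
    """Count events per value of `field`, touching one row at a time."""
    counts = Counter()
    for row in rows:
        counts[row[field]] += 1
    return dict(counts)


# Tiny in-memory stand-in for a real file (made-up values).
sample = io.StringIO(
    "Time,UserId,LineItemId\n"
    "2024-01-01 00:00:01,u1,111\n"
    "2024-01-01 00:00:02,u2,222\n"
    "2024-01-01 00:00:03,u3,111\n"
)
print(count_by_field(csv.DictReader(sample), "LineItemId"))
# {'111': 2, '222': 1}
```

For a real file you would call count_by_field(read_dt_file(path), "LineItemId") instead. Because the rows are streamed, memory use stays flat no matter how big the file is.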
How is it different from reports?
For an ad request, you can check which line item served it, the postal code mapped to it, the geographic location, the time at which the request was made, the custom criteria used and much more (told you it's granular).
The following features make it unique:
- Custom Criteria: All the key-values (the key should be defined in the UI) in the tag that’s passed to DFP are recorded.
- Audience Segment IDs: Shows up to 35 matched/targeted IDs.
- Key-values which are not defined in a line item or under “Custom Targeting”, even the ones with wildcards (~cars, machine*, etc.) which are not available through the report UI.
- ord values passed in the ad tag.
- Time: The time at which the event was recorded (down to milliseconds).
- Track a user using the UserId field (encrypted DoubleClick cookie ID).
- Track conversions from Activity.
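To get at the key-values in the custom criteria field, you first have to split the raw string. Here is a minimal sketch, assuming the pairs look like key=value and are separated by semicolons. That layout is an assumption, as are the sample keys, so verify it against a real row from your own files.

```python
from collections import defaultdict


def parse_custom_criteria(raw, pair_sep=";", kv_sep="="):
    """Split a raw custom-criteria string into {key: [values]}.

    The "key=value;key=value" layout is assumed, not guaranteed.
    A key may appear more than once, so values are kept in a list.
    """
    parsed = defaultdict(list)
    for pair in raw.split(pair_sep):
        key, sep, value = pair.strip().partition(kv_sep)
        if not sep or not key:
            continue  # skip empty or malformed fragments
        parsed[key.strip()].append(value.strip())
    return dict(parsed)


print(parse_custom_criteria("section=cars;pos=top;section=news"))
# {'section': ['cars', 'news'], 'pos': ['top']}
```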
How to get it?
Approach your Account Manager to set it up. You will be given a form on which to mark all the required files (Impressions, Clicks, Active View, Rich Media, Video). The DT form lists the field names in the data files you will be getting. On the form you can also specify the delimiter you would like in your data files, as well as a Google Group through which users can access the files in the Google Cloud bucket. Once the whole process is done, you can see the details under the Admin tab in the Data Transfer Files option.
What all will be there?
Multiple fields can be enabled for the files. A few of them are IsInterstitial, CustomCriteria, IsCompanion, etc. Refer to DFP’s article for a detailed and updated list of the fields available. Adding new fields is free, but new files will cost you. Here’s a sample Data Transfer impression file you can play with to understand its structure.
How to use it?
Many hotshot publishers using DFP, like Pandora, ESPN, Weather.com and more, use services from YieldEx. DT files are pushed to YieldEx, which then crunches them for analysis, forecasting and reporting. This awesome PDF from DoubleClick is gonna be your best resource to start with: Doubleclick_Yieldex_Guide. You can check out the other DFP-approved partner companies providing DT-related services here.
> gsutil : This command-line tool offers an awesome set of commands for listing, copying and syncing the files in a Google Cloud Storage bucket. Use it to set up a system for downloading the hourly files on a regular basis.
> BigQuery : If the files are gonna get big, then you can go for BigQuery. This premium tool from Google integrates with the cloud bucket to pull in large chunks of data, which can then be processed using simple SQL queries.
It’s suggested that publishers ingest the files on a daily basis and keep them in a permanent storage solution, e.g. a separate Google Cloud Storage bucket. This gives full control over the log files and proves helpful in data analysis and warehousing. Moreover, networks are generally set to retain their DT files in the cloud storage for 60 days. Files older than that are purged automatically.
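Given the 60-day purge window, it helps to have a small check that flags files you have not yet copied and that are close to expiring. The sketch below works on a plain mapping of file name to file date. The file names are hypothetical and the 7-day warning margin is an arbitrary choice; how you obtain the listing (for example from your bucket) is left out.

```python
from datetime import date, timedelta

RETENTION_DAYS = 60  # retention period mentioned for DT files


def files_at_risk(file_dates, today, warn_days=7):
    """Return names of files that will pass the retention window
    within `warn_days` days, oldest names first alphabetically.

    file_dates: dict mapping file name -> datetime.date of the file.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS - warn_days)
    return sorted(name for name, day in file_dates.items() if day <= cutoff)


today = date(2024, 3, 1)
listing = {
    # Hypothetical file names, for illustration only.
    "impressions_old.gz": today - timedelta(days=58),
    "impressions_recent.gz": today - timedelta(days=10),
}
print(files_at_risk(listing, today))
# ['impressions_old.gz']
```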
What’s the main purpose?
Data Transfer files are primarily used for data warehousing, analysis and data mining. They help in forecasting and in drawing conclusions from trends in the data. You can also use DT to create custom 1st party audience segments with the user list upload feature.
Comparing the report data and data transfer files
There can be differences in how the Data Transfer logs are processed and ingested by different publishers. I have had multiple publishers come back to me asking about differences in the numbers. DT files are non-aggregated, event-level data, and the daily files of various networks can vary from a few MBs to TBs. Also, there are other factors, like time zone differences, parsing mechanisms, Master/Companion creatives, hourly file delays, spam filtering, etc., which could contribute to a discrepancy. Because of this, it isn’t recommended to compare the two. Our article on Comparing DT and QT data should be a good read in such cases.
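The time zone factor is easy to see with a toy example: the same three events land on different calendar days depending on the zone used for bucketing. This sketch assumes the timestamps are in UTC and formatted as YYYY-MM-DD HH:MM:SS, and it uses a fixed UTC offset that ignores daylight saving time. All of that is illustrative, so check how time is recorded in your own files.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def daily_counts(utc_timestamps, utc_offset_hours):
    """Count events per calendar day after shifting UTC timestamps
    to a fixed-offset time zone (no daylight saving handling)."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    counts = Counter()
    for stamp in utc_timestamps:
        utc_time = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        local_time = utc_time.replace(tzinfo=timezone.utc).astimezone(zone)
        counts[local_time.date().isoformat()] += 1
    return dict(counts)


events = [
    "2024-01-01 23:30:00",
    "2024-01-02 02:00:00",
    "2024-01-02 09:00:00",
]
print(daily_counts(events, 0))   # {'2024-01-01': 1, '2024-01-02': 2}
print(daily_counts(events, -8))  # {'2024-01-01': 2, '2024-01-02': 1}
```

Same events, different daily totals. If your report and your DT processing bucket by different zones, the per-day numbers will not line up even when the overall totals do.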
Creating first party audience segments
The data from DT files can be used to generate 1st party audience segments based on a wide range of combinations. The UserId field is the hashed version of the identifier used to map users. These identifiers can be pushed back to DFP for mapping against a particular 1st party segment. Krux has a similar setup, in which the Krux segment IDs are available against the user under CustomTargeting, from where they can be sorted and collected. The UserIds against these segment IDs are then mapped to the internal ID of the user in DFP to add the user to a 1st party segment. This is an add-on feature and you’ll need to talk to your AM to get it enabled (yup, it’s free). Once this is set up, you can also upload raw AdIDs and IDFAs to create audience segments.
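The collection step described above, gathering the UserIds seen against each segment ID, could look roughly like this. It is a sketch under assumptions: rows are dicts with UserId and CustomTargeting columns, the targeting string is a semicolon-separated list of key=value pairs, and the key carrying the segment ID is called "segment" (a hypothetical name, use whatever key your setup passes). The upload itself is not shown.

```python
from collections import defaultdict


def users_by_segment(rows, segment_key="segment"):
    """Group UserIds by the segment IDs found in CustomTargeting.

    rows: iterable of dicts with "UserId" and "CustomTargeting" keys.
    segment_key: hypothetical name of the key that carries segment IDs.
    """
    segments = defaultdict(set)
    for row in rows:
        user_id = row.get("UserId", "")
        if not user_id:
            continue  # nothing to map without a user identifier
        for pair in row.get("CustomTargeting", "").split(";"):
            key, sep, value = pair.strip().partition("=")
            if sep and key == segment_key and value:
                segments[value].add(user_id)
    # Sorted lists make the output stable and easy to write to a file.
    return {seg: sorted(users) for seg, users in segments.items()}


rows = [
    {"UserId": "u1", "CustomTargeting": "segment=s10;pos=top"},
    {"UserId": "u2", "CustomTargeting": "segment=s10;segment=s20"},
    {"UserId": "u1", "CustomTargeting": "segment=s20"},
    {"UserId": "", "CustomTargeting": "segment=s10"},
]
print(users_by_segment(rows))
# {'s10': ['u1', 'u2'], 's20': ['u1', 'u2']}
```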