This blog post will show you how you can extract and analyze data from your Marketo instance using the example of email data so that we can answer the age-old question:
When is the best time to send emails?
The truth is that the best send time is not one size fits all because your email recipients are in different geographies and have different schedules. The best time to send an email to a person is unique to that person and in this blog post, we will show you how to use a person’s historic email interaction and ChatGPT to tell when is the best time to email them.
There are two ways to extract the historic email activity in your database. Both of these methods will use the Marketo API so if you have not used the API before then you should check out the Marketo API Quick Start Guide to learn how to:
As discussed below the best method for your Marketo instance will depend on the available capacity you have as well as the email activity you are looking to extract and over what time frame. Marketo’s data retention limits for email activity are as follows:
Obviously the more email activities in your dataset and the longer the time frame you look over the more API calls or MB you will use to extract this data from Marketo.
In the project for this blog post, it was decided to extract email delivered and email opened data since the time emails are sent will affect when emails are opened, and giving ChatGPT this extra delivery information might give more accurate results than just sending open times alone. This meant that the “Email Delivered” activity limited the furthest we could look back in time at 90 days.
N.B. While the intention was for ChatGPT to use the delivered and opened times to make its decision, as we will see later on in the “Finding Best Send Time Using ChatGPT4” section it only used the opened times when doing its analysis and gave the most frequent hour of opening as the best send time. Therefore, you can only give ChatGPT a dataset of open times or you can refine your prompt to ensure that ChatGPT’s approach also factors in the delivered times and does a more advanced analysis.
The first approach uses the bulk extract API since this is the best method for extracting large amounts of data from Marketo. If you have not used the bulk extract API before then check out the Marketo Bulk API Quick Start Guide to understand the job workflow used to extract data from Marketo.
In Marketo, you will usually have a 500MB daily extraction limit unless you have paid for an increase in this limit. The limitation to using the Bulk extract API is that you may exceed this 500MB limit if you have a large database and lots of email activity. If this is the case then take a look at whether you have sufficient REST API call quota to use the REST API to extract this email activity.
The default REST API quota in Marketo is 50k API calls per day so see if this is enough for you to be able to extract the email data you need.
In order to help you determine which approach is best go to Admin > Web Services. Here you will see your REST API and Bulk API limits as well as your usage for both over the past 7 days. Clicking into the hyperlinked numbers beside “Requests in the Last 7 Days” and “Capacity Used in the Last 7 Days in MB” will show you day-over-day usage over the past 7 days.
Based on what these numbers show you will see whether you have more REST API or Bulk Extract capacity or if you are already very close to your limits with both then you might have to contact Marketo support to get an increase in capacity or reduce the number of calls or jobs currently being created to free up capacity.
It is hard to predict ahead of time how many API calls or MB will be needed to extract the email activity history for your database so it will take some experimentation to find out how much you need by making some API calls or creating some jobs for a subset of your database or over a short time frame. Then once you have done these small experiments you can extrapolate the API call or MB usage expected for your entire database or time frame of interest.
Whether you intend to use the REST API or the bulk extract API you will need the activity IDs for the email activities you are interested in. To get these:
Once you have these IDs check out the Github files linked below to see how to use these IDs in API requests to extract the email activity we are interested in.
Note that the code in all the Github files linked below is designed to be plug-and-play so that once you update the variables at the top of your code with your information then you should be able to run the code right away.
As mentioned in the docs the maximum time span for a job is 31 days so if we want to get data over a longer time frame e.g. 90 days, then we will need to create multiple jobs and then join the data from each job together. The Bulk Extract code uses a number of functions to do just that.
The REST API code works similarly to the bulk extract code except here it is data from “pages” that are being joined together instead of data from jobs. Since only a limited amount of data can be returned in a single REST API request, the data is paginated and we must loop through all the pages to extract all the data.
To begin a request is made to the paging token endpoint, specifying the start date in the past that we would like to extract data from. The paging token that is returned is then used as the starting point for a while loop that makes requests to the activities endpoint to get the activity data from each page.
As the while loop is iterating, the data returned for each page is added to a list and then this list is converted to a dataframe and saved as a CSV.
Once all the data over the desired timeframe is gathered some processing of the dataset is done to remove bot activity and tidy up the dataset to make it easier to process later. You can see how the data is gradually transformed by going through the leftmost tab to the rightmost tab in this sheet. You can see the code used to do this processing here.
By far the easiest way to find the best send time for each lead ID is to use the “Advanced Data Analysis” option with ChatGPT-4. In order to access ChatGPT-4 you need to have a paid subscription. Then to enable “Advanced Data Analysis” from the chat screen:
This “Advanced Data Analysis” feature allows us to upload our CSV of email activity and ask ChatGPT to give us an output file containing the best send time for each lead id. To analyze a file with ChatGPT-4:
Based on my testing and with the prompt I used below, I found that the best format for the input data was grouping the email activity by lead ID and email as shown in the Group by Lead & Email tab. Note that the times returned from the Marketo API are in UTC time and we need to ask ChatGPT to give us the output time in the same timezone as our Marketo instance.
"Given the data which contains the delivered and open times for the emails received by each lead please determine the best time of day to email these leads to maximize the chances that they will open future emails. Note that the times are in UTC time. Provide the output in a table containing 2 columns: 1 for lead Id and 1 for an optimum send time in CDT where the times should be to the nearest hour as a digit from 1-24."
Once you enter your prompt and the input file ChatGPT will start analyzing the data and print out the methodology that it is using to transform and analyze the data as it is going along.
As shown in this chat using the above prompt ChatGPT actually does a very simple analysis to get the best send time where it returns the hour in which someone most frequently opened emails. If this is all you are looking for you do not even need to use ChatGPT for this, you could use Python code or Google Sheets formulas to compute this best send time for yourself.
Therefore if you want ChatGPT to do a more complex analysis and take the email delivery times into account for its analysis then you will have to refine your prompt and ask ChatGPT to make adjustments in subsequent messages as you see the approach it is taking with the analysis.
As I show you can ask questions based on the input or output data, for example, I ask ChatGPT to give me a pivot table of best send times so I can see how the best send times are distributed across the output dataset. I also asked ChatGPT why so many lead IDs were missing from the output dataset and it answered that leads with no open times were not included in the output.
Once you are happy with the analysis that ChatGPT has done you can then ask ChatGPT to export the output dataset as a CSV. Next, take a look at the “Using the Best Send Time in Marketo” section to see how we can get these send times into Marketo and start using them to optimize our email open rates.
You might have noticed that the Marketo API associates the email activity with leadID so that the output dataset of best send times is for lead IDs. However, if we want to do a list upload to Marketo to populate the best send time for these leads we need to have the best send time for each email address instead.
In order to get these best send times into Marketo for each lead follow these steps:
Once the best send time has been uploaded for each email address you are then ready to build a smart campaign to send emails to each person at their unique best time. Being honest building the flow for this smart campaign is a pain because it consists of 24 “Send Email” steps and 23 “Wait” steps.
The smart campaign is scheduled to run at midnight on the day that we want the email to be sent. Therefore the very first action is to send the email to anyone whose best send time is “0”. Then we wait for 45mins with the wait ending at 1am before sending the next email to anyone whose best send time is “1am”. These actions are then repeated for each of the remaining hours in the 24-hour clock.
So long as you have this smart campaign nested within an email program in Marketo and you have the email selected in the “Control Panel” you will still be able to see the email performance in the program “Dashboard”. But what if you want to A/B test to see how sending at the best send time performs compared to sending to everyone at the same time?
In order to A/B test performance of the best send time:
Then upload these 2 lists to Marketo and then use the smart campaign outlined above to send to one list at their best send times and then a second smart campaign (or the email program) to send the email to the second list at a particular time e.g. 10am CDT as is so often recommended.
Then either using a Business Intelligence tool that has access to your Marketo data or using the bulk extract or REST APIs to extract email activity and then using Google Sheets or Python for the analysis:
Analyzing email activity from your Marketo instance is only the start. For example, you send lead activity information to ChatGPT and ask it to highlight the leads that are most engaged and you could then send these to your sales team. The sky is the limit!
Now that you know the power of the Marketo API you should take a look at the Marketo API Crash Course so that you can learn how to use both the REST and Bulk Extract APIs for so many different automation and marketing operations use cases.