How to create a package with {fusen} to enhance your Product Analytics in Mixpanel.
Companies who offer digital products (like us) want to understand how users interact with their products and services. It is important to understand what part of your app or which features are the ones your customers are engaging with and finding value in. Therefore understanding user behavior and engagement is an important aspect to successful product development and business growth. This realization has given birth to a vibrant landscape of product analytics tools, each designed to unearth insights from user interactions, helping businesses fine-tune their strategies and offerings.
Today I want to highlight event-based Analytics with Mixpanel. Mixpanel has been around for some years and it is the main tool with which our Product Managers analyze user pathways, time spent on certain pages etc. I can’t really say that it’s the best tool that is out there, but it’s what we’re working with.
Instead of just tracking page views or sessions, event-based analytics hones in on specific
user actions or events that occur during a user’s interaction with a website, app, or system
(in a previous post
I’ve already discussed the concept of user sessions).
An “event” is a user action or interaction with your product that you choose to track.
These events can be virtually anything – clicking a button, signing up for a newsletter, adding items to a shopping cart, or even scrolling down a page. Essentially, events are custom-defined actions that are relevant to your specific goals and objectives.
In addition to that you can provide further properties and information tied to events.
As an example you might have a Social Media feed where users can like posts.
You could set up an event Liked Post
where you would add further information about the
nature of said post (e.g. a boolean flag whether this was an ad or contained an image).
In our company we’re using Segment to store Frontend User tracking events that get then
sent to Mixpanel.
In order to get the best out of any analytics tool, the data basis must be good.
That means that there has to be a sound strategy of how the event tracking is set up
in the organization. As a data analyst, however, I have the opportunity to additionally
improve or enhance the analytical capabilities of this tool. The example I want to highlight in
this post is how to leverage Lookup Tables.
These let you enrich events with attributes about other entities (in my case companies).
Events are always tied to the user who performed the relevant action. But often
it might be beneficial to link information about other entities to properties.
To give an example: We still got our Social Media Site where people might subscribe to
different channels. We might want to link further information about the channel
to a potential event Subscribed to Channel
(which shouldn’t be set as properties in the specific tracking event).
In this case Lookup Tables come in handy.
The problem I want to solve is having up to date information in a Lookup table that I’ve setup in Mixpanel for my colleagues to use. Therefore I want to setup a scheduled R notebook that’s updating the relevant data on a daily basis. For the scheduled notebook I’m gonna use Deepnote (see this previous post).
The Mixpanel Ingestion API offers decent documentation of how to interact with it. Now I only have to figure out how to best wrap it’s functionality into R functions.
Authentication is an important issue when dealing with API’s (I’ve written a bit about API’s in this previous post). In order to provide the relevant credentials to authenticate yourself, a nice way is storing these in environment variables. Here is a post that describes how to do exactly that. Coming back to the Mixpanel Documentation. In order to use the Lookup Table endpoints, I had to setup a Service Account for authentication. Here is a guide to set these up. The following credentials are going to be used to interact with the API:
I will be turning two endpoints into R functions. First the GET List Lookup Tables
endpoint
under https://api.mixpanel.com/lookup-tables
and second the PUT
Replace a Lookup Table
endpoint.
I’m going to use the {httr2}
package to deal with HTTP requests. The package website
is a great resource and even offers a vignette of how to wrap API’s.
The main concept of the package is a pipeable request object (request()
) where you add information
about authentication, headers, query parameters and method. In the end the request
gets performed (req_perform()
) and you handle the response.
As the name suggests the get_lookup_tables function returns a list of available lookup tables in your Project and their id’s. The credentials are going to be passed as default arguments in the function.
get_lookup_tables <- function(mp_service_account_name = Sys.getenv("MP_SERVICE_ACCOUNT_NAME"),
mp_service_account_secret = Sys.getenv("MP_SERVICE_ACCOUNT_TOKEN"),
mp_project_id = Sys.getenv("MP_PROJECT_ID")) {
checkmate::assert_character(c(mp_service_account_name, mp_service_account_secret,
mp_project_id))
resp <- httr2::request("https://api.mixpanel.com/lookup-tables") |>
httr2::req_auth_basic(username = mp_service_account_name,
password = mp_service_account_secret) |>
httr2::req_headers(accept = "application/json") |>
httr2::req_url_query(project_id = mp_project_id) |>
httr2::req_perform()
tables <- resp |> httr2::resp_body_json() |>
(\(x) x$results )() |>
tibble::enframe(name = "n") |>
tidyr::unnest_wider(col = .data[["value"]])
return(tables)
}
After building up the request by adding authentication, headers and query parameter, I perform the request. Afterwards I have to do some data wrangling to convert the result into a tibble that gets returned.
Now I have created a function to identify the lookup_table_id
that is gonna be passed as
an argument in the main function that deals with updating said lookup table.
In addition to the credentials I also have to add the table_to_replace
as an argument which should be a dataframe or tibble.
The Replace a Lookup Table
endpoint uses a PUT method where I modify an existing object.
There are some requirements that need to be fulfilled when providing the lookup table data in order to successfully use the endpoint:
As I input a dataframe into the function, I’ll have to format this input into
the required csv string. This get’s done through the readr::format_csv()
function.
Now I can build up my request object as done before. One important difference to the
get_lookup_tables
function is the PUT method that needs to be added.
The request gets performed and the returned object is the {httr2}
response which
hopefully results in 200.
replace_lookup_table <- function(mp_service_account_name = Sys.getenv("MP_SERVICE_ACCOUNT_NAME"),
mp_service_account_secret = Sys.getenv("MP_SERVICE_ACCOUNT_TOKEN"),
mp_project_id = Sys.getenv("MP_PROJECT_ID"),
lookup_table_id,
table_to_replace) {
checkmate::assert_character(c(mp_service_account_name, mp_service_account_secret,
mp_project_id, lookup_table_id))
checkmate::assert_data_frame(table_to_replace)
table_to_replace_csv <- readr::format_csv(table_to_replace, escape = "none")
resp <- httr2::request("https://api.mixpanel.com/lookup-tables") |>
httr2::req_auth_basic(username = mp_service_account_name,
password = mp_service_account_secret) |>
httr2::req_headers(accept = "application/json",
`content-type` = "text/csv") |>
httr2::req_url_path_append(lookup_table_id) |>
httr2::req_url_query(project_id = mp_project_id) |>
httr2::req_body_raw(table_to_replace_csv) |>
httr2::req_method("PUT") |>
httr2::req_perform()
return(resp)
}
Now that I successfully wrapped two API endpoints into R functions, I wanted to put these into a
package called {mixpanelutils}
(you can find it on Github here).
I really like the {fusen}
package that let’s you
build an R package based upon an Rmarkdown file that get’s “inflated”.
The awesome thing is that it takes much of the pain and documentation efforts away.
I encourage you to watch this video on youtube to learn more.
The main thing that needs to be done is extending the functions I wrote with
some {roxygen2}
tags. Here is the example for the get_lookup_tables
function:
#' get_lookup_tables
#'
#' @param mp_service_account_name The Mixpanel Service Account Name
#' @param mp_service_account_secret The Mixpanel Service Account Token
#' @param mp_project_id str The Mixpanel Project ID
#'
#' @importFrom rlang .data
#'
#' @return
#' Tibble with information about Lookup tables
#' @export
get_lookup_tables <- function(mp_service_account_name = Sys.getenv("MP_SERVICE_ACCOUNT_NAME"),
mp_service_account_secret = Sys.getenv("MP_SERVICE_ACCOUNT_TOKEN"),
mp_project_id = Sys.getenv("MP_PROJECT_ID")) {
...
}
With {fusen}
you will be walked through the process of doing the necessary steps
in order to have a wonderful tested (not in my case) and checked package hosted on Github.
You mainly have to follow along the 0-dev_history.Rmd
file.
It is obviously up to the author to provide further documentation, helpful vignettes etc.,
but you’re already on a path to have a reliable package.
What I particularly like is the integration of {pkgdown}
that lets you create
websites for your R package. You can find the website of {mixpanelutils}
under
https://l-groeninger.github.io/mixpanelutils/. I only provided the bare minimum but
got a nice documentation set up.
In this post I talked a bit about Product Analytics Tools and how these might be enhanced by us Data Analysts. I laid out the example of Mixpanel and how I setup a scheduled notebook to update relevant information in Lookup tables on a daily basis for my colleagues to use.
I showed how to wrap API functionality into R functions using the {httr2}
package.
Afterwards I took the functions and created my own package {mixpanelutils}
with the help of {fusen}
(shoutout to Sébastien Rochette).
Maybe there is someone that finds the package useful.
Right now I have no intention to continue working on extending the package and wrap other
Mixpanel API endpoints, but who knows…
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Gröninger (2023, Sept. 3). Lukas R Blog: Wrapping the Mixpanel API. Retrieved from https://lukas-r.blog/posts/2023-09-03-wrapping-the-mixpanel-api/
BibTeX citation
@misc{gröninger2023wrapping, author = {Gröninger, Lukas}, title = {Lukas R Blog: Wrapping the Mixpanel API}, url = {https://lukas-r.blog/posts/2023-09-03-wrapping-the-mixpanel-api/}, year = {2023} }