Google Professional Machine Learning Engineer (PMLE) Exam Notes
The 7 big exam topics plus tips
I took and passed the 2024 PMLE exam today (November 2024) on my first attempt.
I finished the test in 85 minutes (the test allows a maximum of 120 minutes) and marked two questions for review (there are 50 questions total). There are no case-study pre-reads on this exam.
To prepare for this test, I studied for approximately 30 hours over the span of three weeks. I lack any sort of impressive ML background or training, so I had a lot of catching up to do on this subject. You might need more or less time depending on your own background and skill set.
General study tips
As with the other GCP exams, I prepared primarily by reading the product documentation, reviewing official sample questions and following all the links in the answers, and reading the best practices guides:
- Best practices for implementing machine learning on Google Cloud
- TensorFlow data preprocessing best practices
- Professional Machine Learning Engineer Sample Questions
This test is new as of October 1, 2024. As of mid-November, I saw no courses on Udemy, Coursera, or anywhere else with recent material. If courses are your preferred study method, keep this in mind.
It’s hard to tell what’s deprecated and what’s not, and some features are still in Public Preview. I found that the “Legacy” Feature Store documentation was better for learning the concepts than the current one. (It’s called “Legacy” but it’s not deprecated. What does that even mean?) Anyway, it’s weird out there right now.
One big fat pattern
At least half of the questions take the following form:
You work at [company], you are trying to build a model that does [X]. The pipeline needs to preprocess data from [database] using algorithms [A,B,C], train your model, and serve predictions under constraints of [Y,Z]. What do you do?
- [company] will be a bank, a healthcare company, a retailer, a gaming startup, etc.
- [X] ranges from predicting customer churn to classifying images, forecasting sales, and recommending products.
- [database] is often BigQuery or Cloud Storage, occasionally an on-premises system, and sometimes left unspecified.
- [A,B,C] are things like min-max scaling, one-hot encoding, etc.
- [Y,Z] can be “minimal cost and latency”, or “minimal infrastructure overhead and maximum scalability”, and everything in between.
Your job in these questions is to determine the best way to design a system that 1) accomplishes the business goal of training model [X], and 2) satisfies the constraints of [Y,Z], in that order. Don’t accidentally ignore the business requirements while trying to meet the technical constraints.
The 7 main topics
Here’s a non-exhaustive list of topics that you’ll need to know, roughly in order of how many times I remember seeing them come up on the test:
- Vertex Pipelines and Kubeflow. Almost every question seemed to involve pipelines or Kubeflow in some way, or at least one of the answer choices suggested using them (rightly or wrongly). You need to know when and how to orchestrate pipelines, how they track model lineage and let you compare versions, how they can be automated and monitored, and how they help you deploy and manage your models in production. (A minimal pipeline sketch follows this list.)
- Preprocessing and feature engineering. Know the TensorFlow preprocessing best practices (linked above) like the back of your hand. Some of these questions were tricky, especially when combined with the constraints given in the question. Know where certain kinds of preprocessing tasks can and cannot be performed, and know how flaws in these pipeline steps can lead to training-serving skew and how to fix them. (See the skew-avoidance sketch after the list.)
- Monitoring, retraining, explainability. Know how, whether, and when to retrain an existing model, whether based on new data arriving in the training set or on metrics gathered from serving the model in production. Know how to set up and use feature attribution to detect drift and explain predictions, and know the differences between the explanation methods (sampled Shapley, integrated gradients, XRAI, etc.).
- Vertex Workbench, Experiments, and Notebooks. Questions sometimes involve a team running “experiments,” and you need to know how to use and configure Experiments, Metadata, etc. to manage the chaos. Know the uses and limitations of Workbench and notebooks. (There’s a small experiment-tracking sketch below.)
- Serving and scaling. Know when to deploy a model to a new Vertex endpoint vs. deploying it to an existing one with traffic splitting. Know the features and limitations of Vertex endpoint autoscaling (it’s not good for sudden spikes!), and know the patterns for low-latency online serving vs. batch prediction. (A rollout sketch follows the list.)
- The algorithms. Matrix factorization is good for recommendations. ARIMA is good for time-series forecasting. Customer churn is a classification problem. Predicting house prices is a regression problem. And so forth. (See the BigQuery ML example below.)
- The basics. A solid grasp of the fundamentals is critical. You don’t want to spend exam time trying to remember the difference between accuracy and precision or between labels and features, or when to use a confusion matrix vs. an F1 score. If these aren’t second nature, the test will be even less fun than it already is. (The metrics snippet at the end of this section is a quick refresher.)
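
To make the pipelines topic concrete, here’s a minimal sketch of what a pipeline looks like in the Kubeflow Pipelines (KFP) v2 SDK, which is what Vertex AI Pipelines executes. The component bodies and names are hypothetical placeholders, not anything from the exam:

```python
from kfp import compiler, dsl

@dsl.component
def preprocess(raw_path: str) -> str:
    # Stand-in for a real step (e.g., pull from BigQuery, scale/encode
    # features, write the results to Cloud Storage).
    return raw_path + "/cleaned"

@dsl.component
def train(features_path: str) -> str:
    # Stand-in for a real training step; returns a model artifact path.
    return features_path + "/model"

@dsl.pipeline(name="churn-training-pipeline")
def churn_pipeline(raw_path: str):
    features = preprocess(raw_path=raw_path)
    train(features_path=features.output)

# Compile to a spec that Vertex AI Pipelines can run; the output file can
# then be submitted as a PipelineJob via the google-cloud-aiplatform SDK.
compiler.Compiler().compile(churn_pipeline, "churn_pipeline.json")
```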
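
On training-serving skew: one pattern worth recognizing is baking preprocessing into the model itself with Keras preprocessing layers, so training and serving apply identical transformations. A minimal sketch, with made-up feature names and values:

```python
import numpy as np
import tensorflow as tf

# Toy numeric feature and toy integer-coded categorical feature.
incomes = np.array([[30_000.0], [55_000.0], [92_000.0]])

normalizer = tf.keras.layers.Normalization()
normalizer.adapt(incomes)  # learns mean/variance from the training data

encoder = tf.keras.layers.CategoryEncoding(num_tokens=3, output_mode="one_hot")

income_in = tf.keras.Input(shape=(1,), name="income")
region_in = tf.keras.Input(shape=(1,), name="region", dtype="int32")
x = tf.keras.layers.Concatenate()([normalizer(income_in), encoder(region_in)])
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs=[income_in, region_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")
# Because the transforms live inside the graph, the exported SavedModel
# applies exactly the same preprocessing at serving time.
```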
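
For the Experiments topic, the google-cloud-aiplatform SDK includes a small tracking API. A sketch of logging one run (the project, experiment name, and values are hypothetical):

```python
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",            # hypothetical project ID
    location="us-central1",
    experiment="churn-experiments",  # experiment to group runs under
)

aiplatform.start_run("run-001")
aiplatform.log_params({"learning_rate": 0.01, "batch_size": 64})
# ... training happens here ...
aiplatform.log_metrics({"val_accuracy": 0.91, "val_loss": 0.24})
aiplatform.end_run()
```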
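
For the traffic-splitting scenario, here’s a hedged sketch of a gradual rollout to an existing endpoint using the same SDK (all resource IDs are placeholders):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint("1234567890")    # existing endpoint ID
new_model = aiplatform.Model("0987654321")      # newly trained model ID

# Route 10% of traffic to the new version; the prior deployment keeps 90%.
endpoint.deploy(
    model=new_model,
    traffic_percentage=10,
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=3,  # autoscaling bounds; scaling reacts gradually,
                          # which is why it's weak for sudden spikes
)
```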
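
As one example of the algorithm-to-problem mapping: time-series forecasting maps to ARIMA, which BigQuery ML exposes as ARIMA_PLUS. A sketch via the BigQuery Python client, with made-up dataset and column names:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

query = """
CREATE OR REPLACE MODEL `my_dataset.sales_forecast`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'order_date',
  time_series_data_col = 'daily_sales'
) AS
SELECT order_date, daily_sales
FROM `my_dataset.sales_history`
"""
client.query(query).result()  # blocks until the training job finishes
```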
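
Finally, on the basics: if any of the metric definitions below feel fuzzy, that’s the gap to close first. A quick scikit-learn refresher on made-up labels:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # rows = actual, cols = predicted
print(accuracy_score(y_true, y_pred))    # (TP + TN) / total
print(precision_score(y_true, y_pred))   # TP / (TP + FP)
print(recall_score(y_true, y_pred))      # TP / (TP + FN)
print(f1_score(y_true, y_pred))          # harmonic mean of precision/recall
```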
Tips and Trivia
Some other things that might be useful:
- Orchestrating an ML pipeline with Cloud Composer/Airflow is almost never the move.
- Knowing when/where you can use CPUs, GPUs, and TPUs in training will help you eliminate wrong answers.
- There were a couple of questions involving bias. Fix bias in the preprocessing stage rather than trying to correct for it after the model is trained.
- I had a question about Reduction Server, which I didn’t recognize at the time (it’s a Vertex AI feature that speeds up gradient all-reduce in multi-node distributed GPU training).
- There’s very little specifically about “generative AI.”
Hope that helps, and good luck!