The taskcluster-worker Mac OSX engine
This quarter, I worked on implementing the taskcluster-worker Mac OSX engine. Before talking about this specific implementation, let me explain what a worker is and how taskcluster-worker differs from docker-worker, currently the main worker in Taskcluster.
The role of a Taskcluster worker
When a user submits a task graph to Taskcluster, contrary to what you might expect (at least if you are used to how OS schedulers usually work), the tasks are submitted to the scheduler first, which is responsible for processing dependencies and enqueuing them. The Taskcluster manual page has a clear picture illustrating this concept.
The provisioner is responsible for looking at the queue and determining how many pending tasks exist; based on that, it launches worker instances to run these tasks.
Then comes the worker, which is responsible for actually executing the task. It claims a task from the queue, runs it, uploads the generated artifacts, and submits the status of the finished task, using the Taskcluster APIs.
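To make the claim, run, upload, report cycle concrete, here is a minimal sketch in Python. It is not the real Taskcluster client: `InMemoryQueue`, `Task` and `run_worker` are hypothetical names invented for illustration, and the queue lives in memory instead of behind the Taskcluster APIs.

```python
import subprocess
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    command: list  # argv of the command the task wants to run


@dataclass
class InMemoryQueue:
    """Hypothetical stand-in for the Taskcluster queue service."""
    pending: deque = field(default_factory=deque)
    artifacts: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)

    def claim(self):
        # Hand out the next pending task, or None when nothing is left.
        return self.pending.popleft() if self.pending else None

    def upload_artifact(self, task_id, name, data):
        self.artifacts[(task_id, name)] = data

    def report(self, task_id, state):
        self.status[task_id] = state


def run_worker(queue):
    """Claim tasks until the queue is empty; run, upload and report each one."""
    while (task := queue.claim()) is not None:
        result = subprocess.run(task.command, capture_output=True, text=True)
        # The captured standard output plays the role of a generated artifact.
        queue.upload_artifact(task.task_id, "log", result.stdout)
        state = "completed" if result.returncode == 0 else "failed"
        queue.report(task.task_id, state)
```

A real worker would also keep polling when the queue is empty and would handle timeouts and cancellation; the sketch leaves those out to keep the four steps visible.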
docker-worker is a worker that runs task commands inside a docker container. The task payload specifies a docker image as well as a command line to run, among other environment parameters. docker-worker pulls the specified docker image and runs the task commands inside it.
taskcluster-worker and the OSX engine
taskcluster-worker is a generic and modularized worker under active development by the Taskcluster team. The worker delegates the task execution to one of the available engines. An engine is a component of taskcluster-worker responsible for running a task under a specific system environment. Other features, like environment variable setting, live logging, artifact uploading, etc., are handled by worker plugins.
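The split between engines and plugins can be pictured with a small Python sketch. This is only an illustration of the idea, not the actual taskcluster-worker interfaces: the class names, the `before_run`/`after_run` hooks and the payload fields are all assumptions made up for the example.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Runs a task under one specific system environment."""

    @abstractmethod
    def run(self, payload, env):
        ...


class Plugin:
    """Adds a cross-cutting feature around the engine (hypothetical hooks)."""

    def before_run(self, payload, env):
        pass

    def after_run(self, payload, result):
        pass


class EchoEngine(Engine):
    # Toy engine: reports what it would run instead of running anything.
    def run(self, payload, env):
        return {"success": True, "command": payload["command"], "env": dict(env)}


class EnvPlugin(Plugin):
    # Copies environment variables from the task payload into the run.
    def before_run(self, payload, env):
        env.update(payload.get("env", {}))


class Worker:
    def __init__(self, engines, plugins):
        self.engines = engines
        self.plugins = plugins

    def execute(self, engine_name, payload):
        env = {}
        for plugin in self.plugins:
            plugin.before_run(payload, env)
        # The worker delegates the actual execution to the selected engine.
        result = self.engines[engine_name].run(payload, env)
        for plugin in self.plugins:
            plugin.after_run(payload, result)
        return result
```

With this shape, supporting a new platform means writing one more `Engine`, while features such as live logging stay in plugins shared by every engine.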
I am implementing the Mac OSX engine, which will mainly be used to run Firefox automated tests in the Mac OSX environment. There is a macosx branch in my personal GitHub taskcluster-worker fork to which I push my commits.
One specific aspect of the engine implementation is the ability to run more than one task at the same time. For this, we need to implement some kind of task isolation. In docker-worker, each task runs in its own docker container, so tasks are isolated by definition. But there is no such thing as a container for the OSX engine. Our earlier attempts with chroot failed miserably because of incompatibilities with the OSX graphics system. Our final solution was to create a new user on the fly and run the task with this user’s credentials. This not only provides some task isolation, but also helps prevent privilege escalation attacks, since tasks run as a different user than the worker.
Instead of dealing with the poorly documented Open Directory Framework, we chose to spawn the dscl command to create and configure users. Tasks usually take a long time to execute, spawning loads of subprocesses, so a few spawns of the dscl command won’t have any practical performance impact.
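As an illustration of what spawning dscl for a per-task user might look like, here is a Python sketch that builds the usual sequence of `dscl . -create` calls. The helper names, the user name, the UID and the choice of attributes are assumptions for the example and are not taken from the engine's source; group 20 is the standard `staff` group on OSX.

```python
import subprocess


def dscl_create_user_commands(username, uid, gid=20, shell="/bin/bash"):
    """Return the argv lists that create and configure a local user."""
    record = f"/Users/{username}"  # directory service record path
    home = f"/Users/{username}"    # home directory on disk
    base = ["dscl", ".", "-create", record]
    return [
        base,                                 # create the user record
        base + ["UserShell", shell],
        base + ["UniqueID", str(uid)],
        base + ["PrimaryGroupID", str(gid)],
        base + ["NFSHomeDirectory", home],
    ]


def create_user(username, uid, runner=subprocess.check_call):
    """Spawn one dscl process per attribute; needs root on a real machine."""
    for argv in dscl_create_user_commands(username, uid):
        runner(argv)
```

The `runner` parameter exists only so the command sequence can be inspected without touching the system, for example `create_user("task_abc", 1001, runner=print)`. A complete setup would also have to create the home directory itself and delete the user once the task finishes.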
One final aspect is how we bootstrap task execution. A task boils down to a script that carries out the task's duties. But where does this script come from? It doesn’t live on the machine that runs the worker. The OSX engine provides a link field in the task payload, in which a task can specify an executable to download and execute.
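The bootstrap step can be sketched in a few lines of Python. Only the `link` field comes from the description above; the function names, the `task-entrypoint` file name and the idea of a per-task working directory are assumptions made for illustration.

```python
import os
import stat
import subprocess
import urllib.request


def fetch_linked_executable(payload, workdir):
    """Download the executable named by payload['link'] and mark it runnable."""
    target = os.path.join(workdir, "task-entrypoint")  # hypothetical file name
    with urllib.request.urlopen(payload["link"]) as response:
        data = response.read()
    with open(target, "wb") as out:
        out.write(data)
    # Add the owner-execute bit so the file can be spawned directly.
    os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)
    return target


def run_task(payload, workdir):
    """Fetch the task's executable and run it inside the working directory."""
    target = fetch_linked_executable(payload, workdir)
    return subprocess.run([target], cwd=workdir, capture_output=True, text=True)
```

In the engine itself the executable would be run with the credentials of the freshly created task user rather than the worker's own; the sketch skips that part.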
Running the worker
The OSX engine will primarily be used to execute Firefox tests on Mac OSX, and the environment is expected to have a very specific set of tools and configurations. Because of that, I am testing the code on a loaner machine. To start the worker, it is just a matter of opening a terminal and typing:
$ ./taskcluster-worker work macosx --logging-level debug
The worker connects to the Taskcluster queue, claims the available tasks, and executes them. At the time of writing, all tests except the “Firefox UI functional tests” were green, running on optimized Firefox OSX builds. We intend to land Firefox tests in taskcluster-worker as Tier-2 next quarter, running them in parallel with Buildbot.