From Fedora Project Wiki

Revision as of 17:22, 15 March 2017 by Roshi (talk | contribs)

An overview of Taskotron dataflow

Taskotron Overview

Taskotron is made up of several components all working in concert.

This page maps the data flow through the Taskotron pipeline, step by step:

1) FedMsg Input

Pretty much every aspect of Fedora infrastructure emits status messages to the FedMsg bus. We use this bus to coordinate and kick off a variety of tasks, from desktop notifications to kicking off automated testing. It's a constant stream of event data. Taskotron listens to the FedMsg bus with the "taskotron-trigger" package.

2) Taskotron-trigger

Trigger sets up a local fedmsg-hub with several custom "consumers." These consumers look for specific FedMsg topics. For instance, the cloud_compose_complete_msg.py jobtrigger listens for 'org.fedoraproject.prod.pungi.compose.status.change' messages, and then kicks off the CloudComposeJobTrigger.
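As a rough illustration of the topic matching described above, here's a small sketch that routes messages to a handler by topic. It deliberately does not use the real fedmsg-hub consumer API: 'CONSUMERS', 'dispatch' and 'cloud_compose_trigger' are made-up names, and only the topic string comes from our example.

```python
# Hypothetical sketch: route bus messages to job triggers by topic.
# This does not use the real fedmsg-hub consumer API.

COMPOSE_TOPIC = "org.fedoraproject.prod.pungi.compose.status.change"


def cloud_compose_trigger(message):
    """Stand-in for CloudComposeJobTrigger; just reports that it was called."""
    return ("CloudComposeJobTrigger", message["msg"])


# Map each topic of interest to the trigger that handles it.
CONSUMERS = {COMPOSE_TOPIC: cloud_compose_trigger}


def dispatch(message):
    """Call the trigger registered for the message topic, if there is one."""
    handler = CONSUMERS.get(message.get("topic"))
    if handler is None:
        return None  # nothing is listening for this topic
    return handler(message)
```

Messages on any other topic simply fall through, which mirrors how a consumer only reacts to the topics it subscribed to.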

3) After a fedmsg is consumed, the "JobTrigger" parses the data from the FedMsg to prepare it for use by the execution engine. In our CloudComposeJobTrigger example, it first checks whether the message says the Pungi compose is "FINISHED". If so, it continues to process the data; otherwise it discards the message. That particular fedmsg contains only the following fields: 'status', 'location', and 'compose_id' [0].

When the JobTrigger is done, it returns a Python dictionary with the following fields (this is not an exhaustive list; a JobTrigger can add any fields it likes):

  • _msg: The raw fedmsg we parsed
  • message_type: the type of message, which is CloudCompose or AtomicCompose for our example
  • item: This is the data we want to hand to our task for execution. In our case, it's a URL that points to the qcow2 image the Pungi compose generated.
  • item_type: What kind of item the task will run on, in our case a 'compose' [1]
  • name: The human readable name of what's being passed in the item. In our example it'd be Fedora-Cloud or Fedora-Atomic.
  • version: What version of the thing are we using? In our example it'll be 24 (or whatever release of Fedora this fedmsg was for). Other JobTriggers operate on packages, and the version and release fields would reflect which package is being referenced in the fedmsg.
  • release: This is very similar to version, and for our example it would be the date the compose was executed on. For example, 20170310.0.
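Putting the "FINISHED" check and the field list together, a JobTrigger-style parser could be sketched as below. This is not the real CloudComposeJobTrigger code. It assumes the compose_id looks like "<name>-<version>-<date>.<respin>", it assumes the message type can be guessed from the compose name, and it takes a hypothetical 'locate_image' callable because how the trigger finds the qcow2 URL from 'location' isn't covered here.

```python
# Hypothetical sketch of a JobTrigger-style parser; not the real implementation.


def process(message, locate_image):
    """Return the data dict for a finished compose, or None to discard it."""
    msg = message["msg"]
    if msg["status"] != "FINISHED":
        return None  # only finished composes are of interest

    # Assumption: compose_id looks like "<name>-<version>-<date>.<respin>".
    name, version, release = msg["compose_id"].rsplit("-", 2)

    return {
        "_msg": message,
        # Assumption: the message type is derived from the compose name.
        "message_type": "AtomicCompose" if "Atomic" in name else "CloudCompose",
        # locate_image is a hypothetical helper that maps the compose
        # location to the URL of the qcow2 image.
        "item": locate_image(msg["location"]),
        "item_type": "compose",
        "name": name,
        "version": version,
        "release": release,
    }
```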

Once this data is ready, it's paired with the task defined in the 'trigger_rules.yml' config file. For our example, it looks like this:

- when:
    message_type: AtomicCompose
  do:
    - {tasks: [upstream-atomic, fedora-cloud-tests]}

The 'message_type' inside the dict the JobTrigger generated maps directly to the trigger rule. The next question is where the task itself lives and how trigger knows what it is. Each task lives in its own repository, which taskotron mirrors locally [2] to speed up execution times. For our example, upstream-atomic and fedora-cloud-tests live on pagure [3].
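To show how a data dict gets paired with tasks, here's a minimal sketch of rule matching. 'RULES' mirrors the YAML rule above as plain Python data (so the sketch needs no YAML parser), and 'tasks_for' is a made-up helper, not taskotron-trigger's actual matching code.

```python
# Hypothetical sketch of matching a data dict against trigger rules.
# RULES mirrors the YAML rule above as plain Python data.

RULES = [
    {
        "when": {"message_type": "AtomicCompose"},
        "do": [{"tasks": ["upstream-atomic", "fedora-cloud-tests"]}],
    },
]


def tasks_for(data, rules=RULES):
    """Collect the tasks of every rule whose 'when' fields all match data."""
    selected = []
    for rule in rules:
        if all(data.get(key) == value for key, value in rule["when"].items()):
            for action in rule["do"]:
                selected.extend(action.get("tasks", []))
    return selected
```

A dict with message_type 'AtomicCompose' selects both tasks; any other message type selects nothing.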

Let's take a look at what exactly a "task" is and what it's made of. There are already docs for this [4], but here's a brief overview. Each task contains at least two things: a task.yml file and some bit of code that is the task. The task.yml contains all the information required by the runner to execute the task. Our fedora-cloud-tests example looks like this:

---
name: fedora-cloud-tests
desc: Run the Fedora Cloud or Atomic tests against a qcow2 image
maintainer: roshi

environment:
  rpm:
    - ansible
    - testcloud
    - git

actions:
  - shell:
      # Clone the upstream test repo
      - git clone https://pagure.io/fedora-qa/cloud-atomic-ansible-tests.git

  - python:
      file: run_cloud_tests.py
      callable: run
      test_image: ${compose}
      artifactsdir: ${artifactsdir}
    export: output

  - shell:
      - ignorereturn:
          - rm inventory

  - shell:
      - rm -rf cloud-atomic-ansible-tests

  - name: report results to resultsdb
    resultsdb:
      results: ${output}

The 'environment' section defines what the runner needs installed in order to run the task: here, the ansible, testcloud and git packages must be present on the buildslave. The 'actions' section defines the steps the task takes. It uses a few different directives, including 'shell' and 'python'. The 'shell' directive does what you'd expect: it runs raw shell commands [5]. The 'python' directive takes a few options.
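The 'shell' behavior, including the 'ignorereturn' form from footnote [5], can be approximated with a few lines of Python. This is only a sketch of the semantics as described here, with a made-up 'run_shell' function; it is not how libtaskotron implements the directive.

```python
import subprocess


def run_shell(commands):
    """Run shell commands in order; entries under 'ignorereturn' may fail."""
    for entry in commands:
        if isinstance(entry, dict):
            # {'ignorereturn': [...]}: run each command and tolerate failures.
            for command in entry["ignorereturn"]:
                subprocess.run(command, shell=True, check=False)
        else:
            # A plain string: a non-zero exit status raises CalledProcessError.
            subprocess.run(entry, shell=True, check=True)
```

With this model, 'rm inventory' under 'ignorereturn' can fail (say, because the file was never created) without stopping the remaining steps.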

First is 'file' which tells it which python file to execute (this is located in the same directory as task.yml). 'callable' is which method inside the 'file' to call. Those are the only two options required when it comes to running python files inside a task. The last two options (test_image and artifactsdir) are keyword arguments that get passed into the method 'callable' points to (you can set as many of these as you like - they just have to match the keyword arguments required as input to the method referenced in 'callable').
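One way a runner could honor 'file', 'callable' and the extra keyword arguments is sketched below using the standard importlib machinery. 'run_python_directive' is a hypothetical name and this is an assumption about the general approach, not libtaskotron's code.

```python
import importlib.util


def run_python_directive(file, callable_name, **kwargs):
    """Load `file` as a module and call `callable_name` with the kwargs."""
    spec = importlib.util.spec_from_file_location("task_module", file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # The remaining options become keyword arguments, so they have to
    # match the parameters of the function being called.
    return getattr(module, callable_name)(**kwargs)
```

For our example this would amount to run_python_directive('run_cloud_tests.py', 'run', test_image=..., artifactsdir=...), which means 'run' has to accept parameters named test_image and artifactsdir.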

'${compose}' is a passed in argument from the runner [6], and '${artifactsdir}' is an absolute path to where the runner stores task artifacts (like log files), which is configured in the libtaskotron installation. The 'item' in the data dictionary created by the JobTrigger ends up in the '${compose}' variable for our task when it's run by the execution engine.
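The '${name}' placeholders happen to use the same syntax as Python's string.Template, so the substitution step can be sketched with it. Whether the runner itself uses string.Template is not something this page establishes; 'render' and the example values below are made up for illustration.

```python
from string import Template


def render(value, variables):
    """Replace ${name} placeholders in strings; recurse into dicts and lists."""
    if isinstance(value, str):
        return Template(value).substitute(variables)
    if isinstance(value, dict):
        return {key: render(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [render(item, variables) for item in value]
    return value


# Example values only; the URL and path are placeholders.
python_step = {
    "file": "run_cloud_tests.py",
    "callable": "run",
    "test_image": "${compose}",
    "artifactsdir": "${artifactsdir}",
}
variables = {
    "compose": "http://example.test/image.qcow2",
    "artifactsdir": "/var/lib/artifacts",
}
rendered = render(python_step, variables)
```

Note that substitute() raises KeyError for a placeholder with no matching variable, which is a reasonable way to catch a typo in a task file early.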

The last section of the task.yml file is how we get the results of the task into resultsDB. The output of the python call gets saved to 'output' (via the 'export: output' line of the python directive), which is then referenced for submission to resultsDB. Now that we've had an overview of what a task is, let's get back to the rest of the overview.
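To make the 'export' hand-off concrete, here's a toy sketch in which each step may store its return value under a name that later steps can read. The 'run' key and the 'run_actions' function are invented for this illustration; the real runner is more involved.

```python
def run_actions(actions, variables):
    """Run steps in order; a step's 'export' names where its result is stored."""
    for action in actions:
        result = action["run"](variables)
        if "export" in action:
            variables[action["export"]] = result
    return variables


reported = []
actions = [
    # Stand-in for the python directive with 'export: output'.
    {"run": lambda variables: "PASSED", "export": "output"},
    # Stand-in for the resultsdb directive reading ${output}.
    {"run": lambda variables: reported.append(variables["output"])},
]
final_variables = run_actions(actions, {})
```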

4-5) These two things, the task to be run, and the data dict created by the JobTrigger, are bundled together and sent to the execution engine to be queued for execution. The execution engine gets these two bits of information, and then finds an available job runner to execute the task on.
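The queue-and-dispatch behavior can be modeled in a few lines. This is a toy model under obvious simplifying assumptions (one job per runner, first come first served); 'ExecutionEngine' and its methods are hypothetical names and have nothing to do with the actual buildmaster code.

```python
from collections import deque


class ExecutionEngine:
    """Toy model: queue (task, data) jobs and hand them to idle runners."""

    def __init__(self, runners):
        self.idle = deque(runners)
        self.queue = deque()
        self.running = {}

    def submit(self, task, data):
        """Queue a job and start it right away if a runner is free."""
        self.queue.append((task, data))
        self._assign()

    def finished(self, runner):
        """Called when a runner reports that its job is done."""
        del self.running[runner]
        self.idle.append(runner)
        self._assign()

    def _assign(self):
        # Pair queued jobs with idle runners until one side runs out.
        while self.queue and self.idle:
            self.running[self.idle.popleft()] = self.queue.popleft()
```

With a single runner, the second submitted job waits in the queue until finished() is called for the first one.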

6) The job runner (buildslave) receives the data from the buildmaster and then executes the task. If we were to run this locally, it would look like this:

 runtask -t data['item_type'] -i data['item'] runtask.yml

The '-t' flag is the type of item we want to run on (in this case, a 'compose') and the '-i' flag is the item we're going to run against. The task references the value passed in via '-i' as '${compose}' in task.yml; the variable name corresponds to the 'item_type' passed in via the '-t' flag.
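For completeness, here's how the data dict from the JobTrigger could be turned into that command line, following the 'runtask -t compose -i <url>' form shown in footnote [6]. 'runtask_command' is a made-up helper; only the '-t' and '-i' flags and the file name come from this page.

```python
def runtask_command(data, formula="runtask.yml"):
    """Build the argument list for a local `runtask` invocation."""
    return ["runtask", "-t", data["item_type"], "-i", data["item"], formula]
```

Returning a list rather than a single string means it can be handed straight to subprocess.run() without worrying about shell quoting of the image URL.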

7) The actual reporting of the results to resultsDB is handled in the task itself, but the job runner (buildslave) will report that it's completed running the task to the buildmaster so it knows that the buildslave is available for another task.


[0] These are the only fields we care about for this example, which is everything inside the 'msg' block of the emitted fedmsg. You can see a full example here: https://apps.fedoraproject.org/datagrepper/id?id=2017-f853e69c-5e95-43b9-9471-101f508efb6d&is_raw=true&size=extra-large

[1] There is a finite number of available types to choose from. They are all defined in the libtaskotron.check component, and the type determines what type of report (reporttype) is sent to resultsDB.

[2] Taskotron mirrors task repos with GrokMirror. All tasks currently running can be found in the taskotron ansible repo: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/inventory/group_vars/taskotron-prod#n19

[3] https://pagure.io/taskotron/task-upstream-atomic and https://pagure.io/taskotron/task-fedora-cloud-tests

[4] http://libtaskotron.readthedocs.io/en/latest/writingtasks.html

[5] Astute readers will also see the use of '- ignorereturn', which causes the runner to continue if the invoked command fails for some reason.

[6] A local run would look like this:

   "runtask -t compose -i <url to qcow2 image> runtask.yml"