From Fedora Project Wiki

Project Title : Fresque

Contact Information

About Me

  • I’m a senior student at the Indian Institute of Information Technology, Allahabad, majoring in Information Technology.
  • I am proficient in Golang, Python and C/C++, and I am passionate about building real-time, scalable web applications. I have gained experience with several major web frameworks, namely Flask, Django, Ruby on Rails and Martini. I am also comfortable with front-end technologies such as JavaScript, CSS and HTML.
  • My open source journey started when I became a Mozilla Student Ambassador. Initially my contributions were limited to localization and community building, and later they expanded to code. Recently I have been contributing to the release engineering group's projects. Examples of my contributions are mozregression and the releng API.

Q&A

Why do you want to work with the Fedora Project?

I have been using Fedora for the past five years, and I am fond of its user-friendly interface and reliable support forums. I have previously contributed to the fedora-infra tahrir project, and interacting with my mentor there was a great learning experience. I like the Fedora community and find everyone very helpful and motivating. In the last few years some great projects have been started in fedora-infra, such as progit, shumgrepper and fresque, and I would like to contribute to them.

Do you have any past involvement with the Fedora project or another open source project as a contributor?

Yes, I have contributed to the fedora-infra tahrir project. I have also contributed to other open source organizations such as Apache and Mozilla, and to Python projects on GitHub such as Python-Cliff and Python-Click.

Why should we choose you over other applicants?

I have 3+ years of experience with Python, which is the basic skill required for the project. Fresque is a Flask app, and I know the framework very well. I also have experience in writing unit tests, handling databases and coming up with new ideas. This project will involve development with Flask, SQLAlchemy, pygit2, unittest and so on, all of which I know well. I have already gone through the fresque project workflow and made a few contributions to the project, which I think makes me a strong candidate.

Contribution to Fresque Project

I have made a few contributions to the fresque project. The link to my contributions is here.

Did you participate in past GSoC programs? If so, in which years and with which organizations?

Yes. Last year I participated with Apache. The project was about developing a command-line application for the Libcloud API. The project repository is available on my GitHub page.

Will you continue contributing to or supporting the Fedora project after the GSoC 2015 program? If yes, which team(s) are you interested in?

I would love to keep working with the Fedora-Infra team because their projects intersect with my interests. I have worked on Progit and Tahrir for some time, and after GSoC I would like to continue contributing to these projects as well as to fresque.

Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the duration of the program?

Our college summer vacation starts in early May and ends in late July, so I can commit to this project full time. I can dedicate at least 40 hours per week to the work, and I have no other obligations from early May until mid-August.

Past Experience

In the past I have contributed to various projects of the Mozilla and Apache foundations. I have also written two open source libraries:

  • Parinx: a Sphinx docstring parser that provides an interface for extracting the relevant parameters.
  • Sec-Edgar: downloads companies' periodic reports, filings and forms from the EDGAR database.

Both are available on PyPI. Apart from this, I co-founded "Prequell", whose first project was scaling a Flask application using blueprints and Celery. I consider myself an experienced Python developer, as I have built various applications across different domains with it.

  • Prior to GSoC, I completed an internship at The Walt Disney Company, where I worked on building a highly personalized real-time news application using Golang and Python.

Goal

Fedora Fresque is a standard Python web application that abstracts away the intricacies of the package review process. Currently, any package entering the Fedora repository has to go through a review process that is mechanical and manual. During this period contributors receive valuable feedback, but once the package is imported all of that information is lost.

I will therefore develop a web application that exposes dedicated RPM reviews with inline comments and some level of automation. This will open up many new possibilities for packagers and allow them to connect to the packaging reality.

Project Details

The project will consist of 4 main phases:

Phase 1: Add git server on backend

The current package review process has no git integration, so with every new build of the package we lose the information from the previous one. This means we lose the evolution of the spec file. Adding git will let us track all the changes.

The workflow can be described as follows:
  • Whenever a new package review is requested, a bare git repository with the same name as the package is created on the Fedora review server.
  • The bare repository keeps track of the changes made by the packagers.
  • Packagers can clone this git repository to their local machines and make changes to it.
  • They can then push the updated code to the git repository on the Fedora review server.
  • The change data is saved in the review server repository.

In this phase we will use the Python library pygit2 to interact with git. I will also import most of the relevant features from pagure, such as keeping track of commits and revisions. For example, suppose Doe submits the spiderman package for review and updates his package with a .spec file. The process using git will look like the following.

On the server side:
import os

import pygit2


def create_git_repo(name, gitfolder):
    """Create a bare git repository for the package under review."""
    gitrepo = os.path.join(gitfolder, '%s.git' % name)
    if os.path.exists(gitrepo):
        raise IOError('The project repo "%s" already exists' % name)

    # create a bare git repository
    pygit2.init_repository(gitrepo, bare=True)
    return 'Project "%s" git repository created' % name

On the client side:
$ git clone git@fedorareviewserver.org:/var/git/doe/spiderman.git
$ cd spiderman && touch spiderman.spec
$ git add spiderman.spec
$ git commit -m "initial commit"
$ git push -u origin master

Branching

For every target release there will be a separate branch, in order to avoid conflicts during the review process. To keep it general, branches will be named fNN, such as f14 for Fedora 14. This will be set as the default (master) branch.
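
The fNN naming convention can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the function names release_branch and parse_release_branch are hypothetical and not part of fresque or pagure.

```python
import re


def release_branch(release):
    """Return the review branch name for a Fedora release, e.g. 14 -> 'f14'."""
    number = int(release)
    if number < 1:
        raise ValueError("release must be a positive integer, got %r" % (release,))
    return "f%d" % number


def parse_release_branch(branch):
    """Return the Fedora release number encoded in an fNN branch name."""
    match = re.fullmatch(r"f(\d+)", branch)
    if match is None:
        raise ValueError("%r is not an fNN branch name" % (branch,))
    return int(match.group(1))
```

Keeping the mapping in one place means the git backend and the web front-end would agree on branch names without repeating the format string.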

Reporting

Every new push will trigger a Fedora package review run, which builds the RPM for the appropriate target. The generated report will be sent to the contributor by email and as a notification. The Fedora review server will have its own lookaside cache to speed up the build process, or we can use Docker containers for it.

Phase 2: Streamlining review process

During package review (in which reviewers examine the package before it is sent to the Fedora server), the reviewer tries to identify bugs, gives feedback and helps keep the package maintainable. For the review process, Fedora packaging currently uses Bugzilla, which is primarily a bug tracker, and feedback is given by creating a Bugzilla ticket. Furthermore, most of the time a contributor is unclear about what he is waiting for. People forget about Fedora flags, and Bugzilla does not know about Fedora relationships, such as who is the contributor and who is the reviewer. In this phase the task is therefore to add features that remove these drawbacks and provide an efficient and reliable review mechanism.

The review process involves three roles:

  • Watcher: someone who is interested in getting reports on the package.
  • Contributor: the person who created the package that is being reviewed.
  • Reviewer: one or more developers who review the package.

The only requirement is that if this is the Contributor's first package, the Reviewer must be in the Sponsor group and be willing to sponsor that Contributor.
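
The sponsor requirement can be expressed as a small check. This is a minimal sketch: the function name, its parameters and the group name "sponsor" are assumptions made for illustration, not existing fresque code.

```python
def can_review(reviewer_groups, contributor_is_new, willing_to_sponsor=False):
    """Return True if the reviewer may take the review.

    Assumption: group membership is available as a collection of group
    names, and the Sponsor group is called "sponsor".
    """
    if contributor_is_new:
        # First package of a contributor: the reviewer must be a sponsor
        # and must be willing to sponsor this contributor.
        return "sponsor" in reviewer_groups and willing_to_sponsor
    # Otherwise any reviewer may review the package.
    return True
```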

Let us walk through the package review workflow to understand what happens underneath, and then look at how to remove its shortcomings.
  • The contributor creates the package using the fedora-package tool.
  • He then adds the .spec and .src.rpm files for the package, which contain information about the package and its builds.
  • After that he provides a short review summary, the URLs of the .spec and .src.rpm files, a description of the .spec file, the review description (noting whether it is his first package) and the sponsor for the package.

Now for the reviewer side:

  • The reviewer sets the flag to '?', which means the package is under review, and identifies bugs and issues.
  • The reviewer gives his feedback.
  • The contributor then makes the required changes and uploads the package again. If the reviewer approves it, the flag is changed to '+'.
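
The flag changes described above can be modelled as a tiny state machine. This is a sketch that covers only the two flag values mentioned in this proposal ('?' and '+'). The names ALLOWED_TRANSITIONS and set_review_flag are hypothetical.

```python
# Assumption: None means "no review flag set yet".
ALLOWED_TRANSITIONS = {
    None: {"?"},   # a reviewer takes the review
    "?": {"+"},    # the reviewer approves the package
    "+": set(),    # approved is treated as a final state in this sketch
}


def set_review_flag(current, new):
    """Return the new flag if the transition is allowed, else raise ValueError."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError("cannot change review flag from %r to %r" % (current, new))
    return new
```

Rejecting unexpected transitions in one place would keep a package from being marked approved without first having been under review.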

As we have seen, there is a lack of streamlined conversation: the reviewer cannot comment at the exact place where the issue is, and each package update is largely manual because the contributor has to repeat the steps above. Fresque will introduce several new features to overcome these shortcomings:

  • Threaded comments, inline discussions: These allow discussion at the level of individual lines in a file. I have written JavaScript code for selecting inline code text. Link to demo is here.
  • Review counts: Show the number of issues that have been identified and require modification.
  • Review tags: Describe the type of review, e.g. Bug, Enhancement.
  • Distributed teams: Profiles of reviewers and contributors.
  • Iterative reviews: Code review is inherently iterative, so we keep track of the newest revision and keep it on top.
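
One possible in-memory shape for threaded inline comments, review tags and review counts is sketched below. The class and field names are assumptions made for illustration. The real application would store this in the database through SQLAlchemy models.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InlineComment:
    """A comment attached to one line of one file, with threaded replies."""
    author: str
    path: str
    line: int
    body: str
    tag: Optional[str] = None      # e.g. "Bug" or "Enhancement"
    resolved: bool = False
    replies: List["InlineComment"] = field(default_factory=list)

    def reply(self, author, body):
        """Add a reply in the same thread (same file and line)."""
        child = InlineComment(author, self.path, self.line, body)
        self.replies.append(child)
        return child


def open_review_count(comments):
    """Count top-level comments that still require modifications."""
    return sum(1 for comment in comments if not comment.resolved)
```

For example, a reviewer comment on setup.py line 12 tagged "Bug" counts as one open review until it is marked resolved, and replies stay attached to that line.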

Example:

Contributor Doe creates a spiderman Python package and submits it for review by providing information about the package, target builds, sponsor and so on. The Python packaging guidelines can be found here.

spiderman/
    |--__init__.py
    |--setup.py
    |--db/
    |--lib/
    |--spider.spec
  • He submits his package for review. The reviewer, John, sets the flag to '?'.
  • John comments on the current setup.py, asking for dependencies to be added.
  • The reviewer may schedule a meeting at this point or simply comment. After the meeting takes place, the bugs are marked on the review server. The total count of reviews is also shown on the package page.
  • Inline comments support instant profile mentions: writing '@name' shows a list of reviewers and contributors.
  • Doe makes the changes to the source code and pushes the modified revision to the Fedora review server.
  • Doe then generates a new spec and marks off the bug.
  • After that John checks that the modifications were made, and if everything is fine he changes the review flag to '+'.
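
The '@name' profile lookup mentioned in the steps above could work roughly as follows on the server side. This is a minimal sketch: suggest_profiles, its parameters and the prefix-matching behaviour are assumptions, not an existing fresque API.

```python
def suggest_profiles(fragment, profiles, limit=5):
    """Return up to `limit` profile names that start with the typed fragment.

    `fragment` is what the user has typed so far, e.g. "@jo".
    `profiles` is an iterable of reviewer and contributor user names.
    """
    prefix = fragment.lstrip("@").lower()
    matches = sorted(name for name in profiles if name.lower().startswith(prefix))
    return matches[:limit]
```

Typing "@jo" with the users john, doe and joan known to the system would offer joan and john, in that order.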

Currently, contributors use their own spreadsheets and various other tools to keep track of changes. To remove this need, all comments and changes will be saved in the fedora-review database so that anyone can access them in the future, which removes developer-to-developer dependencies.

Phase 3: Fedora-Review integration

In this phase I will integrate fedora-review because, as of now, there is no automatic QE during the review process. Fedora-Review implements an API that removes many manual steps from the review process by doing automatic checks. Currently fedora-review does the following jobs:

  • Download SRPM & SPEC from Bugzilla report
  • Build and install package in mock
  • Download upstream source
  • Check md5sums
  • Run rpmlint
  • Generate review template with both manual & automated checks serving as a starting point for the review work.
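
To make the md5sum step concrete, the comparison between the source shipped in the SRPM and the upstream download can be sketched with the standard library. This is only an illustration and not how fedora-review implements the check. The function names and arguments are hypothetical.

```python
import hashlib


def md5sum(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large tarballs do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sources_match(srpm_source_path, upstream_source_path):
    """Return True if the packaged source is identical to the upstream one."""
    return md5sum(srpm_source_path) == md5sum(upstream_source_path)
```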

Integrating this will bring a high level of automation and pave the way for a continuous integration process for Fedora packages. Implementation-wise, fedora-review provides a Python API that can be easily integrated into the Flask app on the backend.

Figure: Flow diagram of the continuous integration system for packages using fedora-review and buildbot.

Let's look at an example. After Doe makes changes and pushes his code to the review server, the following steps take place. Our CI server is Taskotron, and its buildbot master will be invoked for the package build.

  • The Taskotron server keeps running in the background and watches for changes in the package.
  • If it detects changes in the package, it instructs its slaves to execute the build process.
  • The buildbot slaves set up the appropriate environment and then notify the master as well as the contributors. An example Dockerfile for a slave container follows.
# base image
FROM fedora:21
# install the dependencies and fedora-review
RUN yum install --assumeyes \
    buildbot-slave git tar rpmdevtools \
    gcc-c++ liblastfm-devel taglib-devel gettext boost-devel \
    qt-devel cmake gstreamer1-devel gstreamer1-plugins-base-devel glew-devel \
    libgpod-devel qjson-devel libplist-devel \
    usbmuxd-devel libmtp-devel protobuf-devel protobuf-compiler qca2-devel \
    libcdio-devel qca-ossl fftw-devel sparsehash-devel sqlite-devel \
    pulseaudio-libs-devel libqtwebkit-dev sha2-devel desktop-file-utils \
    libechonest-devel libchromaprint-devel python-pip fedora-review

RUN rpmdev-setuptree

RUN echo "fedora-21-64" > /fedora-review
# start the build process as mentioned in the config file
CMD ["/usr/bin/python", "/config/slave/start.py"]

  • Finally, a link to this build report will be posted on the package page.
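
The change-detection step can be illustrated independently of Taskotron or buildbot: compare the last commit that was built for each package with the current head of its review repository. The function and its arguments are hypothetical, and the real CI server has its own change sources.

```python
def packages_to_build(last_built, current_heads):
    """Return the sorted names of packages whose head commit changed.

    `last_built` maps package name -> commit id of the last build.
    `current_heads` maps package name -> commit id currently at the branch head.
    A package that has never been built is always scheduled.
    """
    return sorted(
        name
        for name, head in current_heads.items()
        if last_built.get(name) != head
    )
```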

Phase 4: Unit tests and deployment

Unit testing verifies that our code works correctly under a given set of conditions. Syntax errors will almost certainly be caught by running the tests, and the basic logic of a unit of code can be checked for correctness. In this phase I will write unit tests for the Flask application. For example, a unit test that checks whether the /packages endpoint returns the right result:

import json
import unittest

# Assumption: the Flask application object is importable as `app` from the
# fresque package; adjust the import to match the project layout.
from fresque import app


class FlaskTestCase(unittest.TestCase):
    # Using the Flask test client, call the /packages route of the app and
    # compare the JSON response with the expected value.
    def test_packages(self):
        tester = app.test_client()
        response = tester.get('/packages', content_type='application/json')
        self.assertEqual(response.status_code, 200)
        # Check that the result sent is {'packages': ["spiderman"]}
        self.assertEqual(json.loads(response.data), {"packages": ['spiderman']})


if __name__ == '__main__':
    unittest.main()

Deployment

Deployment involves creating the proper stack on the server and enabling appropriate security measures such as port forwarding and a firewall. The following steps will be followed:

  • Install and enable WSGI (Web Server Gateway Interface) support. WSGI is an interface between web servers and Python web apps.
  • Put the Flask app into the /var/www/ directory.
  • Install the Flask app's dependencies.
  • Set up the PostgreSQL database.
  • Start the server.

This can be shipped as a Docker container. The Dockerfile to deploy the Flask app is available on my GitHub page.
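
To show what the WSGI interface mentioned in the first step looks like, here is a minimal stand-alone WSGI callable. It is only a placeholder sketch. In the real deployment the server would load the Flask application object, which is itself a WSGI callable, and the response text below is made up.

```python
def application(environ, start_response):
    """Smallest possible WSGI app: the server calls this once per request."""
    body = b"fresque placeholder response\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Tell the server the status line and headers, then return the body
    # as an iterable of byte strings.
    start_response("200 OK", headers)
    return [body]
```

A WSGI server conventionally looks for a module-level callable named application, which is why the deployment step only needs a small entry-point file that exposes the app under that name.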

Deliverables

  • Web front-end and backend of fresque
  • Deployment of the app
  • Documentation and manual
  • Unit tests
  • Integration of the app with other applications

Timeline

Period | Task
April 27 - May 14 | Community bonding, reading documentation and getting familiar with the code.
May 15 - May 24 | Writing unit tests for the already written functions and fixing bugs.
May 25 | Official GSoC coding period begins.
May 25 - June 09 (2 weeks) | Git Backend Phase: development of the Fedora review git server.
June 10 - June 17 (1 week) | Add an HTML file browser for the git repository.
June 18 - June 25 (1 week) | Review Process Phase: adding backend functions for the review process.
June 25 - July 03 | Mid-term evaluation period.
July 04 - July 14 (10 days) | Front-end interface development for the review process.
July 15 - July 31 (2 weeks) | Fedora-Review Integration Phase: add the fedora-review tool for automatic testing of new reviews.
August 01 - August 05 (5 days) | Web front-end development, which involves creating a user-friendly design and embedding security features.
August 05 - August 12 (1 week) | Writing unit tests.
August 13 - August 20 (1 week) | Cleaning up the code, documenting everything, reviewing all the functionality and fixing bugs.
August 21 - August 27 (1 week) | Pencils-down period. Submitting the project for final evaluation.
