From Fedora Project Wiki


Revision as of 15:15, 24 March 2015 by Charul (talk | contribs)

Project Title : Shumgrepper

Personal Information

  • Name  : charul
  • Fedora Profile  : charul
  • GitHub  : charulagrl
  • Timezone  : India, UTC +5:30

Contact Information


Why do you want to work with the Fedora Project?

I have worked on several Fedora projects and gained experience while working on them. I participated in GSoC last year, working on this same project, Shumgrepper, and this year I would like to bring it to completion. Besides this, Fedora is my favorite Linux distro, and contributing to its projects gives me immense pleasure.

Do you have any past involvement with the Fedora project or another open source project as a contributor?

Yes, I have contributed to Datagrepper, Fedora-Packages, Shumgrepper and Summershum.

Did you participate with the past GSoC programs, if so which years, which organizations?

Yes, I participated last year (2014) with Fedora and worked on the Shumgrepper project.

Will you continue contributing/ supporting the Fedora project after the GSoC 2015 program, if yes, which team(s), you are interested with?

Yes, I will continue contributing to the Fedora Project after the GSoC 2015 program. I would prefer to contribute to more projects under the Fedora-infra team, as those projects align closely with my area of interest.

Why should we choose you over other applicants?

I have already been actively contributing to Fedora projects and worked on this project last year, so I have a good understanding of its codebase and requirements. I am confident that this time I will be able to bring it to completion.

Proposal Description

Overview and The Need

Shumgrepper is a web app built on top of Summershum. Summershum collects the md5sum, shasum and sha256sum of every file in every package. Shumgrepper uses this information to check integrity and detect duplication across packages. By comparing sha256 values, it can find the files that are common to different packages or that have changed between them. It also allows you to query by shum values.
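The comparison described above can be illustrated with a minimal sketch. It is not Shumgrepper's actual code: the helper names `file_digests` and `common_files`, the in-memory package contents, and the reading of "shasum" as SHA-1 are all assumptions made for illustration.

```python
import hashlib


def file_digests(data: bytes) -> dict:
    """Return three digests for one file's content.

    Assumption: "shasum" in the proposal is taken to mean SHA-1.
    """
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def common_files(pkg_a: dict, pkg_b: dict) -> dict:
    """Map each sha256 found in both packages to the file names carrying it.

    pkg_a and pkg_b are hypothetical {filename: content_bytes} mappings.
    """
    def index(pkg):
        by_hash = {}
        for name, data in pkg.items():
            by_hash.setdefault(file_digests(data)["sha256"], []).append(name)
        return by_hash

    a, b = index(pkg_a), index(pkg_b)
    # Only hashes present on both sides describe shared content.
    return {h: (sorted(a[h]), sorted(b[h])) for h in a.keys() & b.keys()}


# Made-up package contents, for illustration only.
foo = {"COPYING": b"license text", "foo.c": b"int main(void) { return 0; }"}
bar = {"LICENSE": b"license text", "bar.c": b"int bar(void) { return 1; }"}

shared = common_files(foo, bar)
```

Here `shared` holds a single entry pairing `COPYING` in the first package with `LICENSE` in the second, since the two files have identical content even though their names differ.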

Any relevant experience you have

I worked on this project during last year's GSoC. Besides this, I have been writing code in Python for more than 3 years. I have also built many applications with Flask and webapp2, used Jinja2 templates, and have good experience working as a backend developer.

How do you intend to implement your proposal

1. Database Migrations: We made some changes to the Summershum schema so that the Packages list page renders faster. As a first step, I will write an Alembic migration script to update the current data to the new schema. After this, it is important to check and compare the query results.

2. Running unit tests: It is important to run the unit tests before launching the product into production, in order to minimize failures.

3. Deployment: Once the two steps above are completed, we can plan to deploy the first version of Shumgrepper in production.

4. Testing and optimization: Comparing each file of one package with each file of other packages, to find common or differing files, makes the queries take too long to return results. I need to find ways to optimize these queries.
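One possible direction for step 4, sketched under assumptions: index the hash column and let the database join on it rather than comparing files pairwise. The table name `files`, its columns, and the sample rows are hypothetical and simplified; the real Summershum schema may differ, and SQLite is used here only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; not the actual Summershum schema.
conn.execute("CREATE TABLE files (pkg_name TEXT, filename TEXT, sha256sum TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("foo", "COPYING", "aaa"),
        ("foo", "foo.c", "bbb"),
        ("bar", "LICENSE", "aaa"),
        ("bar", "bar.c", "ccc"),
    ],
)
# An index on the hash column gives the join a way to look up matching
# rows without scanning the whole table for every file.
conn.execute("CREATE INDEX ix_files_sha256sum ON files (sha256sum)")

COMMON_FILES = """
    SELECT a.filename, b.filename
    FROM files AS a
    JOIN files AS b ON a.sha256sum = b.sha256sum
    WHERE a.pkg_name = ? AND b.pkg_name = ?
    ORDER BY a.filename, b.filename
"""
common = conn.execute(COMMON_FILES, ("foo", "bar")).fetchall()

# EXPLAIN QUERY PLAN shows how SQLite intends to run the query, which
# helps confirm whether the index is actually being used.
plan = conn.execute("EXPLAIN QUERY PLAN " + COMMON_FILES, ("foo", "bar")).fetchall()
```

Whether this helps on the production database would have to be measured on real data; the query plan output is a cheap first check before and after adding an index.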

5. GPL License: We already have the shum values of the files within packages. These can be used to check whether a package carries a genuine GPL license.
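A minimal sketch of how step 5 might work, assuming the check is an exact hash match of a package's license file against hashes of the official license texts. The names `KNOWN_LICENSE_HASHES` and `identify_license` are hypothetical, and the reference texts below are placeholders, not the real GPL text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex sha256 digest of some file content."""
    return hashlib.sha256(data).hexdigest()


# Placeholder strings stand in for the canonical license files; a real
# check would hash the official GPL text files instead.
REFERENCE_TEXTS = {
    "GPLv2": b"placeholder for the official GPLv2 text",
    "GPLv3": b"placeholder for the official GPLv3 text",
}
KNOWN_LICENSE_HASHES = {sha256_hex(t): name for name, t in REFERENCE_TEXTS.items()}


def identify_license(license_file_sha256: str):
    """Return the matching license name, or None if the hash is unknown."""
    return KNOWN_LICENSE_HASHES.get(license_file_sha256)


# A byte-identical copy matches; an edited copy does not.
genuine = identify_license(sha256_hex(b"placeholder for the official GPLv2 text"))
edited = identify_license(sha256_hex(b"placeholder for the official GPLv2 text, edited"))
```

An exact hash match only recognizes byte-identical copies, so a license file with changed whitespace or line endings would be reported as unknown rather than as GPL.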

6. Improving the GUI: We can improve the user experience of the app by making considerable changes to the UI. This could involve:

  • Visualization of the differences (on the basis of files changed) among different versions.
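The data behind such a visualization could be computed roughly as follows. This is a sketch under assumptions: `diff_versions` is a hypothetical helper, and each version is represented as a made-up `{filename: sha256}` mapping.

```python
def diff_versions(old: dict, new: dict) -> dict:
    """Classify files by comparing {filename: sha256} maps of two versions."""
    old_names, new_names = set(old), set(new)
    shared = old_names & new_names
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        # Same name in both versions but a different hash: content changed.
        "changed": sorted(n for n in shared if old[n] != new[n]),
        "unchanged": sorted(n for n in shared if old[n] == new[n]),
    }


# Made-up hashes for two versions of a hypothetical package.
v1 = {"README": "h1", "main.py": "h2", "old.py": "h3"}
v2 = {"README": "h1", "main.py": "h9", "new.py": "h4"}

summary = diff_versions(v1, v2)
```

The four lists in `summary` could then feed whatever UI element is chosen, for example a color-coded file list per version pair.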

7. Testing & Documentation: This will involve testing all the endpoints and their results, as well as documenting everything implemented so far.