m (Add a pipe for the internal link to userpage)
 
(104 intermediate revisions by the same user not shown)
Line 4: Line 4:
* FAS Account: bee2502
* Fedora userpage: [[User:Bee2502|Bee2502]]
*Email Address: bhagyashree dot iitg at gmail dot com (I rarely check my FAS mails)
* Email Address: bhagyashree dot iitg at gmail dot com  
*Blog URL: [https://networksfordata.wordpress.com https://networksfordata.wordpress.com]
* Blog URL: [https://networksfordata.wordpress.com https://networksfordata.wordpress.com]
*Freenode IRC Nick: bee2502
* Freenode IRC Nick: bee2502
*github :  
* Github : [https://github.com/bee2502 https://github.com/bee2502]




===Why do you want to work with the Fedora Project?===
===Why do you want to work with the Fedora Project?===


Your proposal should include the following: your project proposal, why you'd like to execute on this particular project, and the reason you're the best individual to do so. Your proposal should also include details of your academic, industry, and/or open source development experience, and other details as you see fit
'''I love Fedora OS '''


https://fedoraproject.org/wiki/CommOps
While Fedora isn't the first Linux distribution I have used, it is surely one which I have used the longest and am most comfortable with.


Being involved with the Community Operations team and contributing to Fedora has been a wonderful experience. GSoC offers me a rare opportunity to continue this involvement with CommOps and Fedora by spending my summer doing something I really love (metrics and contributing to Fedora) and making some really impactful contributions along the way.
'''I love the Fedora Community'''


===Do you have any past involvement with the Fedora project or any other open source project as a contributor?===
The Fedora community is very warm and welcoming. I especially like that CommOps encourages contributors to work in diverse areas and to try out new stuff, with the Fedora community always ready to help you out if stuck.


Yes, I have been involved with the Community Operations team for the past six months. Some of my past contributions include:
'''I love Fedora CommOps'''


* Collaborated with [[User:Jflory7|Justin Flory]] on [https://communityblog.fedoraproject.org/women-in-computing-and-fedora/ Women in Computing and Fedora article].
I love the work. I love the team and I want to continue contributing and helping improve Fedora. Period.


* Helped [[User:Jkurik|Jan Kurik]] organize ''' F23 Elections''' ! Also compiled the post-election metrics. Read more about the F23 elections on the CommOps retrospective [https://communityblog.fedoraproject.org/commops-2015-elections-retrospective/ here].
'''High Impact'''


* Compiled Fedora IRC metrics [https://communityblog.fedoraproject.org/meetbot-data-analytics-peek-fedora-irc-meetings/ here]
Even as a newcomer, I have had the opportunity to work on high-impact projects like organizing elections or working on metrics which affect strategic decisions. The huge impact your work can have on millions of Fedora users and contributors is something which motivates me to contribute to Fedora.


* Fedora Badges post for Newcomers : 'How to get started with Fedora Badges?' [https://networksfordata.wordpress.com/2015/10/19/fedorabadges/ here].
'''Great Learning Opportunity'''


* Some bug fixes for fedora-infra repo along with contributing to Community Blog and Fedora Magazine.
Due to the flat hierarchy in Fedora, I have already collaborated with or worked under some of the long term contributors and important figures in the Fedora community. This experience has been a great learning opportunity in many different ways and I look forward to many such chances in the future.


I am also a member of Fedora Women and recently started contributing to Fedora Hubs development too.
I look forward to working with and staying involved in Fedora. I aim to stick around and become a long-term contributor in the Fedora community.


===Did you participate with the past GSoC programs, if so which years, which organizations?===
===Do you have any past involvement with the Fedora project or any other open source project as a contributor?===


No
Yes, I have been involved with the Community Operations team for the past six months. Some of my past contributions include:


===Will you continue contributing/ supporting the Fedora project after the GSoC 2016 program, if yes, which team(s), are you interested with?===
==== Statistics related Contributions ====


I will, of course. I'll continue with the CommOps team and Hubs development. I am also interested in being an Ambassador (but that's for a bit later).
* Data Analytics to understand impact of FOSDEM : [https://networksfordata.wordpress.com/2016/03/08/fedora-at-fosdem/ read here] and [https://github.com/fedora-infra/fedora-stats-tools/blob/develop/event-activity.py code here]
* Year in Review metrics for Fedora CommOps : read the report with information about API queries, analysis and data visualizations [https://networksfordata.wordpress.com/2016/01/22/2015-in-numbers-fedora-commops/ here]
* Community Blog statistics : read the report with information about API queries, analysis and data visualizations [https://networksfordata.wordpress.com/2016/01/22/2015-in-numbers-fedora-community-blog/ here]
* Outreachy Impact metrics : [https://communityblog.fedoraproject.org/women-in-computing-and-fedora/ read here] and [https://apps.fedoraproject.org/datagrepper/charts/line?user=charul&user=pjha&user=riecatnor&user=ktnode&user=housewifehacker&user=smanuel16&user=marija&user=keekri&user=bee2502&user=dhrish20&user=devyani7  related API query here]
* F23 Dec/Jan Election related metrics : [https://communityblog.fedoraproject.org/commops-2015-elections-retrospective/ read here] and related statistics  [https://admin.fedoraproject.org/voting/results/famsco-nov-dec-2015 here], [https://admin.fedoraproject.org/voting/results/fesco-nov-dec-2015 here] and [https://admin.fedoraproject.org/voting/results/council-nov-dec-2015 here]
* IRC metrics using fedmsg activity and datagrepper : [https://communityblog.fedoraproject.org/meetbot-data-analytics-peek-fedora-irc-meetings/ read here] and [https://github.com/fedora-infra/fedora-stats-tools/blob/develop/scripts/meetbot_stats.py code here]
* Spammer Activity in Fedora - some graphs [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/message/4WREEL7EBU7X26Y33GEINLD5P3RWLM7S/ ML thread here] and related API query [https://apps.fedoraproject.org/datagrepper/charts/line?topic=org.fedoraproject.prod.fas.group.member.apply&delta=2592000 here] and [https://apps.fedoraproject.org/datagrepper/charts/line?topic=org.fedoraproject.prod.fas.group.member.apply&delta=31536000 here]


===Why am I the best fit for this project idea?===
==== Other Technical Contributions ====


I am really passionate about Data Analytics. With data, I want to understand and impact the community by bringing to light the critical issues along with identifying our strengths and weaknesses to help the leadership make informed decisions. My work in the Community Operations team at Fedora has revolved around these areas and I couldn't be more grateful for this wonderful experience and the awesome community.
* Contributed to Fedora Hubs for gaining technical knowledge of the codebase which would be helpful in developing metrics related widgets.  
Issues I have fixed include :


Apart from that,
[https://pagure.io/fedora-hubs/issue/106 https://pagure.io/fedora-hubs/issue/106]


* I'm really passionate about open source, love the CommOps and Fedora community and I will continue to contribute to Fedora and CommOps even when the project ends.
[https://pagure.io/fedora-hubs/issue/96 https://pagure.io/fedora-hubs/issue/96].
So, Choose me ! Choose me ! Choose me !


// something about watching CommOps grow
You can see my closed PRs [https://pagure.io/fedora-hubs/pull-requests?status=False&author=bee2502 here]
// Fedora community here


* "Bee has been a founding member of the CommOps team since October 2015. In her time contributing to CommOps, she has helped with F23 elections (which was the fourth most participated in election in Fedora history), generated metrics analyzing impact at the FOSDEM conference and telling the story of Fedora's Ambassadors in quantifiable terms (and being featured on the Fedora Magazine for it), and added her unique perspective and wisdom into the decision-making behind many CommOps decisions. Bee has been an integral part of helping CommOps succeed." --[[User:Jflory7|Jflory7]] ([[User talk:Jflory7|talk]]) 14:35, 16 March 2016 (UTC)
My Hubs related fedmsg activity [https://apps.fedoraproject.org/datagrepper/raw?category=pagure&user=bee2502 here]


==Project Proposal==
==== Other Contributions ====


===Overview===
* Collaborated with [[User:Jflory7|Justin Flory]] on [https://communityblog.fedoraproject.org/women-in-computing-and-fedora/ Women in Computing and Fedora article].


CommOps
* Helped [[User:Jkurik|Jan Kurik]] organize the '''F23 Elections'''! They had the fourth-highest participation of any Fedora election. Read more about the F23 elections in the CommOps retrospective [https://communityblog.fedoraproject.org/commops-2015-elections-retrospective/ here].


Metrics
* Completed the [https://fedoraproject.org/wiki/CommOps/Join CommOps Join Process]. Wrote a Fedora Badges post to aid Newcomers : 'How to get started with Fedora Badges?' [https://networksfordata.wordpress.com/2015/10/19/fedorabadges/ read it here].


Impact
* Helping with Fedora's diversity and women's outreach efforts as an active member of the Fedora Women community.


* I have also contributed to the Fedora Community Blog (see my works [https://communityblog.fedoraproject.org/author/bee2502/ here]) and to Fedora Magazine (see my works [https://fedoramagazine.org/fedora-looks-back-ahead-women-computing/ here]).


===What the project fulfills===
* I have a good knowledge of the wiki, Trac, IRC and mailing lists, and I am comfortable using them to communicate effectively. I have communicated and interacted with my mentors and other team members on IRC and the mailing lists and, having been involved with CommOps, I understand the ethics and values that make up the Fedora Community.


* You can see my overall contribution activity via fedmsg [https://apps.fedoraproject.org/datagrepper/raw?user=bee2502 here] and [https://apps.fedoraproject.org/datagrepper/charts/line?user=bee2502 here]


===Experience in me in order to meet the project requirements===
Apart from Fedora, I have done an Open Source Data Analytics project for Measurement Lab


*I am a student in the Department of Computer Science and Engineering, University of Moratuwa (The most sought after university for Engineering studies in Sri Lanka) and I’m studying in the end of 2nd year with a Grade Point Average of 4.1 out of 4.2.
===Did you participate with the past GSoC programs, if so which years, which organizations?===


*I have expertise in HTML, CSS, PHP, JavaScript, Java and C, and I am currently mastering Python and the Django framework, which are necessary for the development of this project.
No


*I have applied the MVC architecture in my web-based projects and I know Symfony, which is also an MVC-based framework. As Django is also based on the MVC architecture, my knowledge of MVC will be useful here.
===Will you continue contributing/ supporting the Fedora project after the GSoC 2016 program, if yes, which team(s), are you interested with?===


*I have built web sites using HTML, CSS, PHP and Javascript (One of the hosted web sites : [http://desolator.loomhost.com/ Desolator Tank Game])
I will, of course. I'll continue with the CommOps team and Hubs development. I am also interested in being an Ambassador (but that's for a bit later).


*I am currently learning how to develop responsive websites.
===Why am I the best fit for this project idea?===


*I have sufficient knowledge of the Git version control system and have used Git for many of my projects. My GitHub account: [http://github.com/anuradha1992/ GitHub]
I am really passionate about Data Analytics. With data, I want to understand and impact the community by bringing to light the critical issues along with identifying our strengths and weaknesses to help the leadership make informed decisions. My proposal for the Community Operations slot for Fedora in GSoC revolves around this idea. Along with sound technical skills required to implement this proposal, I also feel that I have the required non-technical skills ideal for effective open source contributions


*I have been in the mentoring program for GSoC 2014 which was held in our university in order to provide pre-knowledge of the importance of contributing towards FOSS development and how to make myself present in the mailing lists etc. I was mentored by Mr. Andun Sameera Liyanagunawardana who was a 2 times GSoC winner and an active open source contributor. (You can view his recommendation on me on my linkedin page here: [http://linkedin.com/in/anuradhawelivita/ LinkedIn] )
Some relevant points include :  


*I have my blog here at [http://anuradhanotes.blogspot.com/ Blogger]. I have written articles about Cloud Computing etc. on my blog and I am willing to blog about the progress of this project continuously.
* I'm really passionate about open source, love the CommOps and Fedora community and I will continue to contribute to Fedora and CommOps even when the project ends.
* I am comfortable coding in Python, C++ and R, and can write queries in SQL. I also have intermediate knowledge of HTML and CSS.
* I have working knowledge of fedmsg system and datagrepper queries and have done related data analytics projects before [https://fedoraproject.org/wiki/GSOC_2016/Student_Application_bee2502#Statistics_related_Contributions link here]
* I also know Machine Learning and NLP and I am interested in using these techniques to understand Fedora community better.
* I am learning Data Visualization techniques like d3.js so that I can develop interactive visualizations from data.
* I have also contributed to Fedora Hubs development in the past and have knowledge of the codebase. Issues I have fixed include:


===Final Deliverables===
[https://pagure.io/fedora-hubs/issue/106 https://pagure.io/fedora-hubs/issue/106]


[https://pagure.io/fedora-hubs/issue/96 https://pagure.io/fedora-hubs/issue/96].


*The main aim of this project is to improve the UX/UI of AskFedora, and the final outcome would be a consistent, fully responsive overhaul of the AskFedora UX/UI with complete rounds of testing and bug fixing.
You can see my closed PRs [https://pagure.io/fedora-hubs/pull-requests?status=False&author=bee2502 here] and my Hubs-related fedmsg activity [https://apps.fedoraproject.org/datagrepper/raw?category=pagure&user=bee2502 here]. Being familiar with the codebase, I can help with CommOps-related tasks for Hubs development.


*Future development might include researching how AskFedora is used by means of a web analytics tool, finding the areas most frequently used and the areas that seem to go unnoticed, and making further improvements based on the results.
* My contributions to Fedora have not been limited to technical aspects. To gain a deeper understanding of the Fedora Project, I have tried to contribute in diverse areas, including helping [[User:Jkurik|Jan Kurik]] organize the F23 elections (which had the fourth-highest participation in Fedora history), writing a Fedora Badges article to help newcomers ([https://networksfordata.wordpress.com/2015/10/19/fedorabadges/ link here]), contributing to the Community Blog (see my works [https://communityblog.fedoraproject.org/author/bee2502/ here]) and to Fedora Magazine (see my works [https://fedoramagazine.org/fedora-looks-back-ahead-women-computing/ here]), and helping with Fedora's diversity and women's outreach efforts as an active member of the Fedora Women community.


*Project documentation listing all the things that I have done in accomplishing the project targets. (This can be included in my blog at [http://anuradhanotes.blogspot.com/ Blogger])
* I blog regularly, and I believe this will help me develop interesting and well laid-out documentation as well as data analytics reports for the project.
* I have a good knowledge of the wiki, Trac, IRC and mailing lists, and I am comfortable using them to communicate effectively.
* I have communicated and interacted with my mentors and other team members in IRC and ML and having been involved with CommOps, I understand the ethics and values that make up the Fedora Community.


===My current approach towards the project ===
* '''If you haven't guessed it by now, I really love CommOps and contributing to Fedora and GSoC offers me a great opportunity to do so over the summer !!! ''' Additionally, I get to work on statistics and Machine Learning - what more could I ask ?!


==Project Proposal==


*I have studied the mockups provided regarding the project and have built a rough web interface according to them. You can view it at [http://askbot-anuradhaw.rhcloud.com/ OpenShift].
===Overview===


*I have also built an interface for the User Profile page of AskFedora on a mobile scale. You can view the code here at [http://github.com/anuradha1992/askfedora/ GitHub]
Fedora Community Operations(CommOps) : Statistical Simulation and Data Analytics for Fedora Infrastructure Message Bus Activity


*And in order to get familiar with the askbot code base I have cloned it from [http://github.com/ASKBOT/askbot-devel.git/ GitHub]
[https://fedoraproject.org/wiki/CommOps Community Operations], a.k.a. CommOps, aims to address the area of community infrastructure by providing the tools, resources, and utilities for the different subgroups of Fedora to increase communication across the Project.


*I have installed OpenShift rhc Client Tools and have learnt to create a new Python web project using that.
Because of the fedmsg stack, Fedora has very detailed raw data on Fedora contributor activity. My proposal revolves around programmatically querying Datagrepper API for data collection to build automated tools using Statistical and Machine Learning techniques for data analysis and visualization for different parameters.
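
As a rough illustration of what querying Datagrepper programmatically could look like, here is a minimal sketch using only the Python standard library. The <code>/raw</code> endpoint and the <code>user</code>, <code>category</code>, <code>topic</code> and <code>delta</code> filters are the ones used in the links on this page; the <code>page</code> parameter and the <code>raw_messages</code> / <code>topic</code> response fields are assumptions about the JSON layout and should be checked against the Datagrepper documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def build_query_url(user=None, category=None, topic=None, delta=None, page=1):
    """Build a Datagrepper /raw URL from optional filters.

    `delta` is a time window in seconds, as in the chart links on this page.
    `page` is an assumed pagination parameter.
    """
    params = {"page": page}
    if user:
        params["user"] = user
    if category:
        params["category"] = category
    if topic:
        params["topic"] = topic
    if delta:
        params["delta"] = delta
    return DATAGREPPER_RAW + "?" + urlencode(params)


def fetch_page(url, timeout=30):
    """Download one page of results and decode it as JSON (needs network)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_topics(payload):
    """Return the topic of every message in one decoded response page.

    Assumption: messages are listed under "raw_messages" and each one
    carries a "topic" key.
    """
    return [msg["topic"] for msg in payload.get("raw_messages", [])]
```

For example, <code>build_query_url(user="bee2502", category="pagure")</code> produces a URL equivalent to the Hubs activity link above, and <code>extract_topics(fetch_page(url))</code> would list the message topics on that page.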


*I have also cloned the source for testing repository from [https://github.com/fedoradesign/askbot-test/ GitHub].
=== GOALS ===


*And I have learnt to communicate in mailing lists and IRC channels as well by subscribing into the Fedora summer-coding mailing list and Fedora developers mailing and as well as to the IRC channels.
====  STATISTICAL TOOL FOR FEDORA EVENT ANALYTICS====


*I have communicated with the mentors '''Sarup Banskota''' and '''Suchakra Sharma''' via mailing lists and via IRC and got to know more about the project and the technical things that I need to master in developing this project.  
* Develop automated tools for data collection using tahrir API in conjunction with Fedora Infrastructure Message Bus activity
* Develop statistical tools and algorithms for real-time data analysis using the numpy and pandas Python libraries.
* Also, identify and code suitable clustering algorithms for demographic analysis using the scipy and scikit-learn Python libraries
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js


*And I have studied the AskFedora redesign plan as provided in the following [http://suchakra.in/random/redesign-plan.pdf document]


===How I plan to implement the proposal in sync with the redesign plan given===
==== STATISTICAL TOOL FOR FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY ANALYTICS  ====


====Stages of implementation====
* Develop automated tools for data collection using Datagrepper API queries for Fedora Infrastructure Message Bus activity.
* Develop statistical tools and algorithms for real-time data analysis using the numpy and pandas Python libraries, generating sub-project-wise metrics and contributor-wise metrics such as mean contributor age (by Fedora activity) and contributor retention rate
* Write Python scripts for time series modelling of the data using the scikit-learn and scipy libraries.
* Also, identify and implement suitable machine learning algorithms (such as temporal clustering) to find similarity patterns in sub-project activity, build contributions, etc.
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js


As also highlighted in the redesign-plan, the main approach for the project would be simplified as the following:
* Develop an automated tool for the fedmsg statistics for quarterly report for Fedora Project.
* Develop statistical tools to identify long-tail patterns in contribution activity (how many contributors are in the long tail from packaging just one thing?)
* Use machine learning algorithms such as Logistic Regression, SVM or Neural Networks to distinguish Red Hat vs non-Red Hat contributors on lists and conduct a statistical analysis.
* Develop a temporal-clustering-based tool to identify similarity in contribution patterns for long-time contributors (Do successful/old contributors have diverse contributions? Are their contributions in bursts or continuous over a period of time?) (optional)
* Alternatively, statistics tools could also be implemented as [https://github.com/fedora-infra/statscache statscache] plugins instead of automated python scripts, depending on feasibility.


'''Step 1:'''
* ML Discussion [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/message/ZXD2YGW2UREARMNGOUJRMW5YLFG7NCAR/ here], [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/message/GZ5HB7KYZ4AF53NGAOSQXATVCSKKJ5PJ/ here] and [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/message/ZXD2YGW2UREARMNGOUJRMW5YLFG7NCAR/ here] and related tickets on the CommOps Trac instance [https://fedorahosted.org/fedora-commops/ticket/32 here] and [https://fedorahosted.org/fedora-commops/ticket/31 here]
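
To make two of the metrics above concrete (contributor retention rate and the long-tail question), here is a minimal sketch. It assumes the message bus data has already been reduced to plain lists of usernames, one entry per recorded action; the function names and the "at most one action" definition of the long tail are illustrative assumptions, not settled definitions for the project.

```python
from collections import Counter


def retention_rate(active_before, active_after):
    """Fraction of contributors active in the first period who are
    also active in the second period."""
    before = set(active_before)
    if not before:
        return 0.0
    return len(before & set(active_after)) / len(before)


def long_tail_share(actions_by_user, threshold=1):
    """Fraction of contributors with at most `threshold` recorded actions.

    `actions_by_user` holds one username per action, so a user who acted
    three times appears three times.
    """
    counts = Counter(actions_by_user)
    if not counts:
        return 0.0
    tail = sum(1 for n in counts.values() if n <= threshold)
    return tail / len(counts)
```

In practice the inputs would come from Datagrepper queries split by time window, and the threshold would be tuned after looking at the real distribution.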


*Analyze the current UX of the system, identify the drawbacks of the current system and discuss the possible improvements.
==== STATISTICAL TOOL FOR MAILMAN/HYPERKITTY ACTIVITY ANALYTICS  ====


*This step includes identifying the user profiles and the problems faced by them with the current design of AskFedora.
* Develop automated tools for data collection using HyperKitty API in conjunction with Fedora Infrastructure Message Bus activity
The user profiles include:
* Develop statistical tools and algorithms for real-time data analysis using the numpy and pandas Python libraries to generate statistics like mean/median size of a mailing list (ML) thread, number of people in a thread, mean length of discussions, and Red Hat vs non-Red Hat activity.
* Write Python scripts for time series modelling of ML activity data using the scikit-learn and scipy libraries to identify activity patterns (bursts/highs and lows).
* Also, identify and implement suitable clustering algorithms to find activity-wise and trend-wise similarity patterns in lists.
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js
* Alternatively, statistics tools could also be implemented as [https://github.com/fedora-infra/statscache statscache] plugins instead of automated python scripts, depending on feasibility.


*'''The Seekers''' – People with specific questions to ask and who will land directly on the main page or on the question page of AskFedora.
* ML Discussion [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/thread/W2OMDI5MO3BN7SFHPIJ3DZD6VN5R63YU/#VK5PRO4HW5J7UKZFSZBR6IFR7YF6D5KO here] and [https://lists.fedoraproject.org/archives/list/commops@lists.fedoraproject.org/thread/GJCVSS6M4KOT4CZBDFQECFVXWPRJ4BAO/#ZPR2XMYHU4BMZP5ESLVQJJIJEPYFYJNH here]. Related tickets on CommOps Trac instance [https://fedorahosted.org/fedora-commops/ticket/42 here] and [https://fedorahosted.org/fedora-commops/ticket/26 here]
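
The thread-level statistics listed above (mean/median thread size, number of people in a thread) could be computed along these lines. This is only a sketch: it assumes mailing list messages have already been fetched and reduced to hypothetical <code>(thread_id, sender)</code> pairs, and it uses the standard library's <code>statistics</code> module in place of numpy/pandas to stay self-contained.

```python
from collections import defaultdict
from statistics import mean, median


def thread_stats(messages):
    """Summarize mailing list threads.

    `messages` is an iterable of (thread_id, sender) pairs, one per email.
    Returns None when there are no messages.
    """
    sizes = defaultdict(int)          # thread_id -> number of emails
    participants = defaultdict(set)   # thread_id -> distinct senders
    for thread_id, sender in messages:
        sizes[thread_id] += 1
        participants[thread_id].add(sender)
    if not sizes:
        return None
    size_list = list(sizes.values())
    people_list = [len(p) for p in participants.values()]
    return {
        "threads": len(size_list),
        "mean_size": mean(size_list),
        "median_size": median(size_list),
        "mean_participants": mean(people_list),
    }
```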


*'''The Contributors''' – People who want to answer user problems. They will mostly use the “Profile” and “Tags” pages to find areas in which they want to answer.
==== STATISTICAL TOOL FOR BUGZILLA ANALYTICS ====


*'''The Surfers''' – These are the people who land directly on the “Questions” page as a result of searching something via a search engine like Google and who may not want to sign up or login.  
* Develop automated tools for data collection using Bugzilla API in conjunction with Fedora Infrastructure Message Bus activity
* Develop statistical tools and algorithms for real-time data analysis using numpy and pandas python libraries.
* Write Python scripts for time series modelling of the data using the scikit-learn and scipy libraries to identify activity patterns (bursts/highs and lows/mean bug turnaround time).
* Also, identify and implement suitable clustering algorithms to find activity-wise and trend-wise similarity patterns.
* Identify relevant algorithms and develop a machine-learning-based tool to identify easy-fix or most relevant bugs (optional)
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js
* Alternatively, statistics tools could also be implemented as [https://github.com/fedora-infra/statscache statscache] plugins instead of automated python scripts, depending on feasibility.
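
As an illustration of the activity patterns mentioned above, the sketch below computes a mean bug turnaround time and flags "burst" days with a simple z-score rule. It assumes bug data has already been pulled from Bugzilla into <code>(opened, closed)</code> datetime pairs and daily counts; the z-score rule is a deliberately simple stand-in for proper time series modelling, and the threshold of 2.0 is an arbitrary assumption.

```python
from datetime import datetime
from statistics import mean, pstdev


def mean_turnaround_days(bugs):
    """Mean days between opening and closing, ignoring still-open bugs.

    `bugs` is an iterable of (opened, closed) datetimes; closed may be None.
    """
    durations = [
        (closed - opened).total_seconds() / 86400
        for opened, closed in bugs
        if closed is not None
    ]
    return mean(durations) if durations else None


def find_bursts(daily_counts, z=2.0):
    """Return indexes of days whose count lies more than `z` population
    standard deviations above the mean."""
    if len(daily_counts) < 2:
        return []
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return []  # perfectly flat activity has no bursts
    return [i for i, c in enumerate(daily_counts) if (c - mu) / sigma > z]
```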


'''Step 2:'''
=== STRETCH GOALS ===


*Further develop mock ups for the pages. Analyze the mock ups, discuss their drawbacks and get the mock ups finalized.
==== STATISTICAL TOOL FOR FEDORA BADGES ====


'''Step 3:'''
* Develop automated tools for data collection using tahrir API in conjunction with Fedora Infrastructure Message Bus activity
* Develop statistical tools and algorithms for real-time data analysis of badge collection activity using numpy and pandas python libraries.
* Also, identify and code suitable clustering algorithms for demographic analysis using the scipy and scikit-learn Python libraries
* Write Python scripts for temporal analysis of the data using the scikit-learn and scipy libraries to identify activity patterns
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js


*Develop the designed interfaces using HTML, CSS, Java Script and Python and with good responsiveness and consistency.
==== FEDORA HUBS WIDGETS ====


'''Step 4:'''
* Componentization of CommOps deliverables into Fedora Hubs Widgets.
* Develop metrics and statistics related widgets for Fedora Hubs


*Integrate them with ASKBOT
==== Some other cool Ideas bee2502 would like to work on ====
*Identify the bugs and resolve them.


====Technical Details====
* '''Automated NLP-based tool to find the expertise of Fedora Contributors'''


* Use of '''OpenShift''' (The Open Hybrid Cloud Application Platform by Red Hat) which is a cloud based service where I can host my applications in the public cloud and share my designs with the team.
Develop an automated tool using NLP techniques to find the expertise of Fedora contributors using meetbot logs of IRC meetings.
We are looking to answer questions like "Who can best solve my doubt?" OR "Who is the most qualified person for this task?"
NLP libraries for python like gensim or nltk will be used for tool development.


* Use of mock up design open source tools like '''Inkscape'''.
* '''Badge Recommendation Engine widget for Hubs'''
 
Much along the lines of Stack Overflow Badge Recommendations : "You are 50% of the way to earning the 'Master Editor Badge' "
Provide recommendations like "80% of contributors who last collected 'White Rabbit Badge' went on to collect 'Origin Badge' next "


* Use of '''LibreBoard''' (an open-source kanban board) where the team can organize things in cards, and cards in lists, giving a better overview of what is completed, what is in progress and what is still to be done.
Develop an automated tool with backend using Tahrir API to fetch data.
Identify suitable Recommendation Algorithms like Collaborative Filtering and develop the engine using them


* AskFedora is powered by AskBot, a Django-based web application; Django is a free and open source web application framework written in Python that follows the model–view–controller architectural pattern. I will study the '''Django framework''' and work with it when integrating my designs with AskBot.
This could be especially helpful for newcomers to explore different areas of Fedora Project
Some related representations by mizmo : https://fedoraproject.org/wiki/Fedora_RPG_OLD
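
A recommendation such as "80% of contributors who last collected badge X went on to collect badge Y next" can be derived from ordered badge histories. The sketch below is a simple first-order transition count, not the collaborative filtering approach named above; it assumes badge histories have already been fetched (for example via Tahrir) into a hypothetical dict mapping each user to badges in the order they were earned.

```python
from collections import Counter, defaultdict


def transition_shares(badge_histories):
    """For every badge, compute which badge contributors earned next
    and in what share of cases.

    `badge_histories` maps a username to a list of badge names in the
    order they were earned.
    """
    followers = defaultdict(Counter)
    for badges in badge_histories.values():
        # Pair each badge with the one earned immediately after it.
        for current, nxt in zip(badges, badges[1:]):
            followers[current][nxt] += 1
    shares = {}
    for badge, counter in followers.items():
        total = sum(counter.values())
        shares[badge] = {nxt: n / total for nxt, n in counter.items()}
    return shares


def recommend_next(shares, last_badge):
    """Return (badge, share) for the most common follow-up, or None."""
    options = shares.get(last_badge)
    if not options:
        return None
    best = max(options, key=options.get)
    return best, options[best]
```

A Hubs widget could then render the returned share as the percentage in the recommendation text.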


* Languages that can be used in the development process: '''HTML''', '''CSS''', '''JavaScript''', '''Sass''' (a CSS preprocessor offering features that don't exist in CSS, such as variables, nesting, mixins and inheritance, which make the code simpler), '''Python'''.
* '''Automated Tool to publish IRC meetings word clouds to social media like Twitter'''


* Further, '''Compass''' (an open-source CSS authoring framework which works with Sass) can be used as a mixin library with Sass, providing cross-browser compatibility so that we will not have to handle CSS hacks.
Generate wordclouds from meetbot data using NLP techniques or wordcloud tools/libraries for python.
Develop a tool using Twitter API to publish these wordclouds to Fedora handles on social media.
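
The input to a word cloud is essentially a table of word frequencies. A minimal sketch of that step is shown below, assuming meetbot logs are available as plain text lines; the stopword list is a tiny illustrative placeholder, and rendering the image or posting it to Twitter is left out.

```python
import re
from collections import Counter

# Illustrative placeholder; a real tool would use a fuller stopword list.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def word_frequencies(lines, top_n=50):
    """Count non-stopword words across log lines and return the
    `top_n` most common as (word, count) pairs."""
    counts = Counter()
    for line in lines:
        for word in re.findall(r"[a-z][a-z0-9'-]+", line.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(top_n)
```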


* Also '''Susy''' (A Compass responsive grid plugin) can be used to produce responsive web pages more easily by following a bottom-up approach (Doing the interfacing of the mobile devices first and moving into larger screens) in designing the web interfaces.
==== STATISTICAL TOOL FOR GITHUB ANALYTICS ====


===Timeline===
* Develop automated tools for data collection using Github API in conjunction with Fedora Infrastructure Message Bus activity
* Develop statistical tools and algorithms for real-time data analysis using numpy and pandas python libraries.
* Write Python scripts for time series modelling of the data using the scikit-learn and scipy libraries to identify activity patterns (bursts/highs and lows/mean issue turnaround time).
* Also, identify and implement suitable clustering algorithms to find activity-wise and trend-wise similarity patterns.
* Identify relevant algorithms and develop a machine-learning-based tool to identify easy-fix or most relevant issues (optional)
* Provide real-time interactive data visualizations using suitable tools such as matplotlib or d3.js


=== Final Deliverables ===


I would like to start studying and mastering the technical skills I need even before the Community Bonding period starts.
* Automated statistics tools (preferably Python scripts) for data collection, analysis and visualization, to be committed to the [https://github.com/fedora-infra/fedora-stats-tools fedora-stats-tools GitHub repo] and/or the [https://fedoraproject.org/wiki/CommOps#Toolbox Community Operations Toolbox], whichever is appropriate. Alternatively, statistics tools could also be implemented as [https://github.com/fedora-infra/statscache statscache] plugins instead of automated Python scripts, depending on feasibility.


So, my timeline goes as follows:
* Data files (.csv) to be committed to the [https://github.com/fedora-infra/fedora-stats-tools fedora-stats-tools GitHub repo]


====Up to the start of Community Bonding Period (28th of March - 27th of April)====
* Analysis reports and documentation of work to be published to the community via mailing lists, Fedora Planet and/or Community Blog posts, whichever is appropriate.


*Getting familiar with Python and the Django web application framework
* Report back weekly on Community Operations to Mailing Lists, Community Blog, and other channels when appropriate.
*Gain knowledge in Saas, Compass and Susy
*Getting familiar with the development environment of AskBot
*Understand the AskBot pages flow and possible improvements
*Learn responsive web design in detail


====Community bonding period (28th of April - 25th of May)====
=== Timeline ===


*Discuss further about the redesign plan of AskFedora via mailing lists and communicating actively in the IRC channels
I would like to start having a look and master the technical stuff that I need to fulfill even before the Community bonding period starts.
*Present my ideas on how to change the current design of AskFedora reflecting my own ideas.
*Present what I have already done with the design of AskFedora web interfaces and mock ups, receive feedback from the mentors and further carry on development work on the mock ups and interfaces according to their feedback.


====Work Period until mid-term evaluations (26th of May – 26th of June)====
====Community bonding period (22nd of April - 25th of May)====


*'''Week 1-2'''
* Work on improving the technical skills needed for the project (especially data visualizations)
**Finish doing mockups for all the pages in AskFedora and get them finalized.
* Understand the bugzilla API
*'''Week 3-4'''
* Discuss and finalize the Machine Learning Algorithms necessary for FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY statistical analysis tool
**Code the UI for the mockups.
* Discuss project specifications with mentor.
**First finish with the responsive UI for the Main page and Q/A page and integrate them with the AskBot.
* Develop an automated tool for the fedmsg statistics for quarterly report for Fedora Project.
**Testing and bug fixing.
*'''Week 5'''
**Start coding responsive UI for the other pages.


====Period of submitting mid-term evaluations (27th of June - 3rd of July)====
====Work Period until mid-term evaluations (25th of May – 20th of June)====


*Completing and submitting mid-term evaluations.
* Work on STATISTICAL TOOL FOR FEDORA EVENT ANALYTICS
*Carry on coding responsive UI for the other pages.
* Work on STATISTICAL TOOL FOR FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY ANALYTICS
* Communicate to the team regarding weekly status
* Update personal blog posts to be syndicated on Fedora Planet as per the progress


====Work Period (4th of July – 8th of August)====
====Period of submitting mid-term evaluations (20th of June - 27th of June)====


*'''Week 1'''
* Clean Code and test for bugs.
**Complete coding responsive UI for all the other pages.
* Fix related bugs and write documentation.
*'''Week 2-3'''
* Code review by mentor.
**Find and code the separate and individual left out elements
* Submitting and completing midterm evaluations.
**Integrate them with AskBot
* Update personal blog posts for weekly status, to be syndicated on Fedora Planet
**Testing and bug fixing
*'''Week 4-5'''
**Final integration with AskBot
**Testing and bug fixing
*'''Week 6'''
**This week is allocated in case of emergency reasons that I would not be able to complete some work within the schedule.
**Apart from the above I will be continuously blogging about the progress of the project and the work I do and on this week I will spend my time refining the content in my blog.


===Other Information===
====Work Period (27th of June – 15th of August)====


====Potential Risks and how I am going to avoid the risks====
* Work on STATISTICAL TOOL FOR MAILMAN/HYPERKITTY ACTIVITY ANALYTICS
* Work on STATISTICAL TOOL FOR BUGZILLA ANALYTICS
* Communicate to the team regarding weekly status
* Update personal blog posts to be syndicated on Fedora Planet as per the progress
* Also attend FLOCK :)


'''In case I will not be able to complete work within the schedule on time.'''
==== Final Week(15th August - 23rd August) ====
*In case this happens I will allocate double the time I am going to allocate on working on this project in the next stage and complete the missed work as soon as possible.
*And also I have created the timeline so that all the work is completed a week earlier than expected so that if there are work that I could not complete on time I will be able to complete those work during that last week.


'''Laptop break down.'''
* Clean Code and test for bugs.
*The Department of Computer Science and Engineering in University of Moratuwa is very much helpful for the students and therefore in case of such a situation they are willing to provide laptops to students for free. Hence I can ask for a laptop from my Department and work on it.
* Fix related bugs and write documentation.
*Also I am living very close to my university (University of Moratuwa) and I can use the computer labs in our Department until late night.
* Code review by mentor.
*I will commit and push all the work I do daily to GitHub so that way any of the work I will be doing on the project will not be lost.
* Submitting and completing midterm evaluations.
 
* Update personal blog posts for weekly status, to be syndicated on Fedora Planet
'''Loss of internet connectivity'''
* Wrap up and Complete tasks
*My university has free wifi connectivity all throughout the day and hence in case I lose my internet connectivity provided by my service provider I can immediately get to my University and work using wifi connectivity.
 
====Miscellaneous Information====
 
*I am very much capable of managing my time and hence I will be able to manage my time effectively and meet the project targets and deadlines on time.
*As I have great passion and interest towards UX/UI I will be very much willing to learn new things related to them. As I will be working on things I love it will not become a stress or a burden for me even though the hardness or quantity of work that I will be doing become high.
*And also my department in the university is spending time and money in organizing programs to encourage students to participate in GSoC and FOSS development activities.  In case I get to contribute towards this project I would also get a chance in talking to the other students about this project and my contribution towards it. I would also encourage the students to contribute towards design and development activities in Fedora and make them aware of the importance of contributing towards open source software.


===Potential Mentors===


'''Sarup Banskota''' and '''Suchakra Sharma''' have offered to mentor me.
Remy Decausemaker(decause) , Corey Sheldon(linux modder) and Justin Flory(jflory7)
 
===Attachments===
 
*The first one below is the interface I created by referring the mockups provided. You can view it on [http://askbot-anuradhaw.rhcloud.com/ OpenShift] as well.
 
*And the second one is the mobile interface I created for the User Profile page.
 
You can view the code for the following at [http://github.com/anuradha1992/askfedora/ GitHub]


I am currently learning about how to develop responsive web sites and while learning I am going to make these responsive as well. I will commit all the work I am doing in the designs of the web interfaces to my GitHub repo regularly. 






[[category:Summer coding 2016]]

Latest revision as of 18:33, 25 March 2016

Contact Information


Why do you want to work with the Fedora Project?

I love Fedora OS

While Fedora isn't the first Linux distribution I have used, it is surely one which I have used the longest and am most comfortable with.

I love the Fedora Community

The Fedora community is very warm and welcoming. I especially like that CommOps encourages contributors to work in diverse areas and to try out new stuff, with the Fedora community always ready to help you out if stuck.

I love Fedora CommOps

I love the work. I love the team and I want to continue contributing and helping improve Fedora. Period.

High Impact

Even as a newcomer, I have had the opportunity to work on high impact projects like organizing elections or working on metrics which affect strategic decisions. The huge impact your work can have on millions of Fedora users and contributors is something which motivates me to contribute to Fedora.

Great Learning Opportunity

Due to the flat hierarchy in Fedora, I have already collaborated with or worked under some of the long term contributors and important figures in the Fedora community. This experience has been a great learning opportunity in many different ways and I look forward to many such chances in the future.

I look forward to working with Fedora and staying involved. I aim to stick around and become a long-term contributor in the Fedora community.

Do you have any past involvement with the Fedora project or any other open source project as a contributor?

Yes, I have been involved with the Community Operations team for the past six months. Some of my past contributions include:

Statistics related Contributions

  • Data Analytics to understand impact of FOSDEM : read here and code here
  • Year in Review metrics for Fedora CommOps : read the report with information about API queries, analysis and data visualizations here
  • Community Blog statistics : read the report with information about API queries, analysis and data visualizations here
  • Outreachy Impact metrics : read here and related API query here
  • F23 Dec/Jan Election related metrics : read here and related statistics here, here and here
  • IRC metrics using fedmsg activity and datagrepper : read here and code here
  • Spammer Activity in Fedora - some graphs ML thread here and related API query here and here

Other Technical Contributions

  • Contributed to Fedora Hubs for gaining technical knowledge of the codebase which would be helpful in developing metrics related widgets.

Issues I have fixed include :

https://pagure.io/fedora-hubs/issue/106

https://pagure.io/fedora-hubs/issue/96.

You can see my closed PR's here

My Hubs related fedmsg activity here

Other Contributions

  • Helped Jan Kurik organize the F23 Elections! They had the fourth-highest participation of any Fedora election. Read more about the F23 elections on the CommOps retrospective here.
  • Helping in diversity and women outreach efforts of Fedora by being an active member in Fedora women community.
  • I have also contributed to the Fedora Community Blog (see my works here) and to Fedora Magazine (see my works here).
  • I have a good knowledge of the wiki, Trac, IRC and mailing lists, and I am comfortable using them to communicate effectively. I have interacted with my mentors and other team members on IRC and the mailing lists, and having been involved with CommOps, I understand the ethics and values that make up the Fedora Community.
  • You can see my overall contribution activity via fedmsg here and here

Apart from Fedora, I have done an Open Source Data Analytics project for Measurement Lab

Did you participate with the past GSoC programs, if so which years, which organizations?

No

Will you continue contributing/ supporting the Fedora project after the GSoC 2016 program, if yes, which team(s), are you interested with?

I will, of course. I'll continue with the CommOps team and Hubs development. I am also interested in being an Ambassador (but that's for a bit later).

Why am I the best fit for this project idea?

I am really passionate about Data Analytics. With data, I want to understand and impact the community by bringing critical issues to light and identifying our strengths and weaknesses, to help the leadership make informed decisions. My proposal for the Community Operations slot for Fedora in GSoC revolves around this idea. Along with the sound technical skills required to implement this proposal, I also feel that I have the non-technical skills needed for effective open source contributions.

Some relevant points include :

  • I'm really passionate about open source, love the CommOps and Fedora community and I will continue to contribute to Fedora and CommOps even when the project ends.
  • I am comfortable coding in Python, C++ and R, and can write queries in SQL. I also have intermediate knowledge of HTML and CSS.
  • I have working knowledge of the fedmsg system and datagrepper queries and have done related data analytics projects before (link here).
  • I also know Machine Learning and NLP and I am interested in using these techniques to understand Fedora community better.
  • I am learning Data Visualization techniques like d3.js so that I can develop interactive visualizations from data.
  • I have also contributed to Fedora Hubs development in the past and have knowledge of the codebase. Issues I have fixed include:

https://pagure.io/fedora-hubs/issue/106

https://pagure.io/fedora-hubs/issue/96.

You can see my closed PRs here. My Hubs related fedmsg activity here. Being familiar with the codebase, I can help in CommOps related tasks for Hubs development.

  • My contributions to Fedora have not been limited to technical aspects. To gain a deeper understanding of the Fedora Project, I have tried to contribute in diverse areas, including helping Jan Kurik organize the F23 elections (which had the fourth-highest participation in Fedora history), writing a Fedora Badges article to help newcomers (link here), contributing to the Community Blog (see my works here) and to Fedora Magazine (see my works here), and helping in diversity and women outreach efforts of Fedora by being an active member in Fedora women community.
  • I blog regularly, and I believe this will help me develop interesting and well laid-out documentation as well as data analytics reports for the project.
  • I have a good knowledge of the wiki, Trac, IRC and mailing lists, and I am comfortable using them to communicate effectively.
  • I have interacted with my mentors and other team members on IRC and the mailing lists, and having been involved with CommOps, I understand the ethics and values that make up the Fedora Community.
  • If you haven't guessed it by now, I really love CommOps and contributing to Fedora, and GSoC offers me a great opportunity to do so over the summer! Additionally, I get to work on statistics and Machine Learning. What more could I ask?

Project Proposal

Overview

Fedora Community Operations(CommOps) : Statistical Simulation and Data Analytics for Fedora Infrastructure Message Bus Activity

Community Operations, a.k.a. CommOps, aims to address the area of community infrastructure by providing the tools, resources, and utilities for the different subgroups of Fedora to increase communication across the Project.

Because of the fedmsg stack, Fedora has very detailed raw data on Fedora contributor activity. My proposal revolves around programmatically querying Datagrepper API for data collection to build automated tools using Statistical and Machine Learning techniques for data analysis and visualization for different parameters.
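As a rough illustration of the data collection step, the sketch below builds a paginated Datagrepper query URL and works out which further pages to request. The endpoint, the parameter names (`user`, `category`, `delta`, `page`, `rows_per_page`) and the `pages` field in the response are assumptions based on the public Datagrepper instance and should be checked against its documentation; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Datagrepper instance.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def build_query_url(user, category, delta_seconds, page=1, rows_per_page=100):
    """Return the URL for one page of a user's messages in one category."""
    params = [
        ("user", user),
        ("category", category),
        ("delta", delta_seconds),  # look back this many seconds from now
        ("page", page),
        ("rows_per_page", rows_per_page),
    ]
    return DATAGREPPER_RAW + "?" + urlencode(params)


def remaining_pages(first_response):
    """Given the parsed JSON of page 1, list the page numbers still to fetch."""
    return list(range(2, first_response.get("pages", 1) + 1))
```

A collection script would fetch the first URL, parse the JSON, and then loop over `remaining_pages(...)` to gather the rest of the messages before handing them to the analysis code.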

GOALS

STATISTICAL TOOL FOR FEDORA EVENT ANALYTICS

  • Develop automated tools for data collection using tahrir API in conjunction with Fedora Infrastructure Message Bus activity
  • Develop statistical tools and algorithms for real-time data analysis using numpy and pandas python libraries.
  • Also, identify and code suitable Clustering algorithms for demographic analysis using the scipy and scikit-learn Python libraries
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js


STATISTICAL TOOL FOR FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY ANALYTICS

  • Develop automated tools for data collection using Datagrepper API queries for Fedora Infrastructure Message Bus activity.
  • Develop statistical tools and algorithms for real-time data analysis using the numpy and pandas Python libraries to generate sub-project-wise metrics and contributor-wise metrics such as mean contributor age (by Fedora activity) and contributor retention rate.
  • Generate programmatic python scripts for Time Series Modelling of data using scikit-learn and scipy libraries in python.
  • Also, identify and implement suitable Machine Learning algorithms(like Temporal Clustering) to find similarity patterns in sub-project activity, build contributions etc
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js
  • Develop an automated tool for the fedmsg statistics for quarterly report for Fedora Project.
  • Develop statistical tools to identify long-tail patterns in contribution activity (how many contributors are in the long tail because they package just one thing?)
  • Use Machine Learning algorithms like Logistic Regression, SVM or Neural Networks to distinguish Red Hat vs non-Red Hat contributors on lists and conduct a statistical analysis.
  • Develop a Temporal Clustering based tool to identify similarity in contribution patterns for long-time contributors (Do successful/old contributors have diverse contributions? Are their contributions in bursts or continuous over a period of time?) (optional)
  • Alternatively, statistics tools could also be implemented as statscache plugins instead of automated python scripts, depending on feasibility.
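To make the contributor-wise metrics above concrete, here is a minimal sketch of contributor age and retention rate computed from (username, date) activity records. The definitions used (age as days between first and last activity, retention as the share of users seen before a cutoff who are seen again on or after it) are assumptions chosen for illustration, and the sample data is invented. It uses only the standard library; the same logic could be expressed with pandas group-by operations.

```python
from datetime import date


def contributor_age_days(activity):
    """Map each user to the days between their first and last activity."""
    first, last = {}, {}
    for user, day in activity:
        first[user] = min(first.get(user, day), day)
        last[user] = max(last.get(user, day), day)
    return {user: (last[user] - first[user]).days for user in first}


def retention_rate(activity, cutoff):
    """Share of users active before `cutoff` who are active on or after it."""
    before = {user for user, day in activity if day < cutoff}
    after = {user for user, day in activity if day >= cutoff}
    if not before:
        return 0.0
    return len(before & after) / len(before)


# Invented sample records standing in for parsed fedmsg messages.
sample = [
    ("alice", date(2016, 1, 5)),
    ("alice", date(2016, 3, 5)),
    ("bob", date(2016, 1, 20)),
    ("carol", date(2016, 2, 1)),
    ("carol", date(2016, 2, 11)),
]
```

The mean contributor age is then the average of the values returned by `contributor_age_days(sample)`.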

STATISTICAL TOOL FOR MAILMAN/HYPERKITTY ACTIVITY ANALYTICS

  • Develop automated tools for data collection using HyperKitty API in conjunction with Fedora Infrastructure Message Bus activity
  • Develop statistical tools and algorithms for real-time data analysis using the numpy and pandas Python libraries to generate statistics like mean/median size of an ML thread, number of people in a thread, mean length of discussions, and Red Hat vs non-Red Hat activity.
  • Generate programmatic python scripts for Time Series Modelling for ML activity data using scikit-learn and scipy libraries in python to identify activity patterns(bursts/highs and lows).
  • Also, identify and implement suitable Clustering algorithms to find activity-wise and trend-wise similarity patterns in lists.
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js
  • Alternatively, statistics tools could also be implemented as statscache plugins instead of automated python scripts, depending on feasibility.
  • ML Discussion here and here. Related tickets on CommOps Trac instance here and here
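The thread-level statistics listed above can be sketched as follows, assuming the collected data has already been reduced to (thread id, sender) pairs. That input shape, the function name and the sample data are hypothetical; the HyperKitty API is not called here.

```python
from statistics import mean, median


def thread_stats(messages):
    """Summarise thread sizes and participants from (thread_id, sender) pairs."""
    sizes, senders = {}, {}
    for thread_id, sender in messages:
        sizes[thread_id] = sizes.get(thread_id, 0) + 1
        senders.setdefault(thread_id, set()).add(sender)
    counts = list(sizes.values())
    people = [len(names) for names in senders.values()]
    return {
        "threads": len(counts),
        "mean_size": mean(counts),
        "median_size": median(counts),
        "mean_participants": mean(people),
    }


# Invented sample: three threads with 3, 1 and 2 messages.
sample_messages = [
    ("t1", "alice"), ("t1", "bob"), ("t1", "alice"),
    ("t2", "carol"),
    ("t3", "alice"), ("t3", "carol"),
]
```

Note that `thread_stats` expects at least one message; `statistics.mean` raises an error on empty input.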

STATISTICAL TOOL FOR BUGZILLA ANALYTICS

  • Develop automated tools for data collection using Bugzilla API in conjunction with Fedora Infrastructure Message Bus activity
  • Develop statistical tools and algorithms for real-time data analysis using numpy and pandas python libraries.
  • Generate programmatic python scripts for Time Series Modelling for data using scikit-learn and scipy libraries in python to identify activity patterns(bursts/highs and lows/mean Bug turnaround time).
  • Also, identify and implement suitable Clustering algorithms to find activity-wise and trend-wise similarity patterns.
  • Identify relevant algorithms and develop Machine Learning based tool to identify easy-fix or most relevant bugs (Optional)
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js
  • Alternatively, statistics tools could also be implemented as statscache plugins instead of automated python scripts, depending on feasibility.
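Two of the quantities named above, mean bug turnaround time and activity bursts, can be sketched like this. The burst rule (a day whose count lies more than a chosen number of standard deviations above the mean) is a simple stand-in for a proper time series model, and the input shapes are assumptions; no Bugzilla API call is made.

```python
from statistics import mean, pstdev


def mean_turnaround_days(bugs):
    """Average days from open to close; bugs still open (closed=None) are skipped."""
    spans = [
        (closed - opened).total_seconds() / 86400
        for opened, closed in bugs
        if closed is not None
    ]
    return mean(spans) if spans else None


def burst_days(daily_counts, threshold=2.0):
    """Indices of days whose count is more than `threshold` std devs above the mean."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return []  # perfectly flat activity has no bursts
    return [i for i, c in enumerate(daily_counts) if (c - mu) / sigma > threshold]
```

For example, a series of nine days with one bug each followed by a day with eleven bugs flags only the last day as a burst.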

STRETCH GOALS

STATISTICAL TOOL FOR FEDORA BADGES

  • Develop automated tools for data collection using tahrir API in conjunction with Fedora Infrastructure Message Bus activity
  • Develop statistical tools and algorithms for real-time data analysis of badge collection activity using numpy and pandas python libraries.
  • Also, identify and code suitable Clustering algorithms for demographic analysis using the scipy and scikit-learn Python libraries
  • Generate programmatic python scripts for Temporal Analysis for data using scikit-learn and scipy libraries in python to identify activity patterns
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js

FEDORA HUBS WIDGETS

  • Componentization of CommOps deliverables into Fedora Hubs Widgets.
  • Develop metrics and statistics related widgets for Fedora Hubs

Some other cool Ideas bee2502 would like to work on

  • Automated NLP-based tool to find the expertise of Fedora contributors

Develop an automated tool using NLP techniques to find the expertise of Fedora contributors using meetbot logs of IRC meetings. We are looking to answer questions like "Who can best solve my doubt?" OR "Who is the most qualified person for this task?" NLP libraries for python like gensim or nltk will be used for tool development.

  • Badge Recommendation Engine widget for Hubs

Much along the lines of Stack Overflow Badge Recommendations : "You are 50% of the way to earning the 'Master Editor Badge' " Provide recommendations like "80% of contributors who last collected 'White Rabbit Badge' went on to collect 'Origin Badge' next "

Develop an automated tool with backend using Tahrir API to fetch data. Identify suitable Recommendation Algorithms like Collaborative Filtering and develop the engine using them

This could be especially helpful for newcomers to explore different areas of the Fedora Project. Some related representations by mizmo: https://fedoraproject.org/wiki/Fedora_RPG_OLD
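A statement such as "80% of contributors who last collected X went on to collect Y" can be produced by counting badge-to-next-badge transitions. The sketch below does exactly that; it is a simple first-order transition count, not collaborative filtering, and it assumes each contributor's badges are already available as a list in the order earned (fetching them through the Tahrir API is not shown). Function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def next_badge_shares(histories):
    """For each badge, the share of contributors who earned each badge next."""
    transitions = defaultdict(Counter)
    for badges in histories:
        # Pair every badge with the one earned immediately after it.
        for current, following in zip(badges, badges[1:]):
            transitions[current][following] += 1
    shares = {}
    for current, counts in transitions.items():
        total = sum(counts.values())
        shares[current] = {badge: n / total for badge, n in counts.items()}
    return shares


def recommend(shares, last_badge):
    """Most common next badge and its share, or None if nothing is known."""
    options = shares.get(last_badge)
    if not options:
        return None
    return max(options.items(), key=lambda item: item[1])


# Invented histories: four of five contributors earn 'Origin' after 'White Rabbit'.
sample_histories = [["White Rabbit", "Origin"]] * 4 + [["White Rabbit", "Egg"]]
```

With the sample data, `recommend(next_badge_shares(sample_histories), "White Rabbit")` suggests 'Origin' with a share of 0.8.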

  • Automated Tool to publish IRC meetings word clouds to social media like Twitter

Generate wordclouds from meetbot data using NLP techniques or wordcloud tools/libraries for python. Develop a tool using Twitter API to publish these wordclouds to Fedora handles on social media.

STATISTICAL TOOL FOR GITHUB ANALYTICS

  • Develop automated tools for data collection using Github API in conjunction with Fedora Infrastructure Message Bus activity
  • Develop statistical tools and algorithms for real-time data analysis using numpy and pandas python libraries.
  • Generate programmatic python scripts for Time Series Modelling for data using scikit-learn and scipy libraries in python to identify activity patterns(bursts/highs and lows/mean issue turnaround time).
  • Also, identify and implement suitable Clustering algorithms to find activity-wise and trend-wise similarity patterns.
  • Identify relevant algorithms and develop Machine Learning based tool to identify easy-fix or most relevant issues (Optional)
  • Provide real - time interactive data visualizations using suitable tools from matplotlib or d3.js

Final Deliverables

  • Automated statistics tools(scripts in python,preferably) for data collection,analysis and visualizations to be committed to the fedora-stats-tools github repo and/or Community Operations Toolbox whichever appropriate. Alternatively, statistics tools could also be implemented as statscache plugins instead of automated python scripts, depending on feasibility.
  • Analysis Reports and documentation of work to be published to community via Mailing lists, Fedora Planet and/or Community Blog posts whichever appropriate.
  • Report back weekly on Community Operations to Mailing Lists, Community Blog, and other channels when appropriate.

Timeline

I would like to start having a look and master the technical stuff that I need to fulfill even before the Community bonding period starts.

Community bonding period (22nd of April - 25th of May)

  • Work on improving the technical skills needed for the project (especially data visualizations)
  • Understand the bugzilla API
  • Discuss and finalize the Machine Learning Algorithms necessary for FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY statistical analysis tool
  • Discuss project specifications with mentor.
  • Develop an automated tool for the fedmsg statistics for quarterly report for Fedora Project.

Work Period until mid-term evaluations (25th of May – 20th of June)

  • Work on STATISTICAL TOOL FOR FEDORA EVENT ANALYTICS
  • Work on STATISTICAL TOOL FOR FEDORA INFRASTRUCTURE MESSAGE BUS ACTIVITY ANALYTICS
  • Communicate to the team regarding weekly status
  • Update personal blog posts to be syndicated on Fedora Planet as per the progress

Period of submitting mid-term evaluations (20th of June - 27th of June)

  • Clean Code and test for bugs.
  • Fix related bugs and write documentation.
  • Code review by mentor.
  • Submitting and completing midterm evaluations.
  • Update personal blog posts for weekly status, to be syndicated on Fedora Planet

Work Period (27th of June – 15th of August)

  • Work on STATISTICAL TOOL FOR MAILMAN/HYPERKITTY ACTIVITY ANALYTICS
  • Work on STATISTICAL TOOL FOR BUGZILLA ANALYTICS
  • Communicate to the team regarding weekly status
  • Update personal blog posts to be syndicated on Fedora Planet as per the progress
  • Also attend FLOCK :)

Final Week(15th August - 23rd August)

  • Clean Code and test for bugs.
  • Fix related bugs and write documentation.
  • Code review by mentor.
  • Submitting and completing final evaluations.
  • Update personal blog posts for weekly status, to be syndicated on Fedora Planet
  • Wrap up and Complete tasks

Potential Mentors

Remy Decausemaker (decause), Corey Sheldon (linux modder) and Justin Flory (jflory7)