From Fedora Project Wiki

Fedora Statistics 2.0 — Now You See It!

We're working on rebuilding the way we produce and present statistics.

People involved

Add yourself if you want to get involved :)

Person        Role
Ian Weller    People wrangler
Luke Macken   Infrastructure + Fedora Community integrator
Jef Spaleta
Max Spevack

What we currently have

https://admin.fedoraproject.org/community/statistics

What we want to analyze

Community Activity

  • Define "activity" as a boolean per account, based on wiki edits, translations, mailing list posts, and SCM commits (CVS, git, etc.), then graph the number of active accounts over time
  • Classify each type of activity as "talk" or "action", and place active members on a sliding scale between the two
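
A minimal sketch of the two ideas above, assuming activity events have already been collected as (account, date, kind) tuples. The event kinds, the sample data, and the choice of which kinds count as "talk" are all hypothetical; the page has not yet decided on that classification.

```python
from collections import defaultdict
from datetime import date

# Hypothetical classification: which event kinds count as "talk".
# Everything else is treated as "action" in this sketch.
TALK_KINDS = {"list_post"}


def active_accounts_per_month(events):
    """Count accounts with at least one event of any kind in each month.

    "Active" is a boolean per account and month: one event is enough.
    """
    active = defaultdict(set)
    for account, day, _kind in events:
        active[(day.year, day.month)].add(account)
    return {month: len(accounts) for month, accounts in sorted(active.items())}


def action_share(events, account):
    """Return a value from 0.0 (all talk) to 1.0 (all action), or None."""
    kinds = [kind for who, _day, kind in events if who == account]
    if not kinds:
        return None
    talk = sum(kind in TALK_KINDS for kind in kinds)
    return 1 - talk / len(kinds)


# Hypothetical sample data, for illustration only.
events = [
    ("alice", date(2009, 1, 5), "wiki_edit"),
    ("alice", date(2009, 1, 9), "list_post"),
    ("bob", date(2009, 1, 20), "commit"),
    ("bob", date(2009, 2, 2), "translation"),
]

monthly = active_accounts_per_month(events)
print(monthly)
print(action_share(events, "alice"), action_share(events, "bob"))
```

The per-month counts can be fed directly to a graphing tool; a weekly or per-release bucket would only change the dictionary key.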

Fedora Accounts System

  • History over time of account registrations and signed CLAs
  • History over time of number of members/sponsors/admins in each group
  • History over time of involvement of people from $COMPANY (overall, in each group, as a sponsor, etc)

Packaging

package use

  • Parse mirror logs: which packages are downloaded the most?
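
One possible way to count downloads per package is sketched below. It assumes the mirror logs are Apache-style access logs (an assumption; the actual mirror log format is not stated here) and that RPM files follow the usual name-version-release.arch.rpm naming. The sample log lines are made up.

```python
import re
from collections import Counter

# Assumption: Apache-style access log lines with a quoted request
# followed by the HTTP status code.
REQUEST_RE = re.compile(r'"GET (\S+\.rpm) HTTP/[\d.]+" (\d{3})')


def package_name(rpm_filename):
    """Extract the package name from name-version-release.arch.rpm."""
    nvr, _arch = rpm_filename[: -len(".rpm")].rsplit(".", 1)
    name, _version, _release = nvr.rsplit("-", 2)
    return name


def count_downloads(lines):
    """Count successful (status 200) RPM downloads per package name."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match or match.group(2) != "200":
            continue
        filename = match.group(1).rsplit("/", 1)[-1]
        try:
            counts[package_name(filename)] += 1
        except ValueError:
            # File name does not follow the expected naming scheme.
            continue
    return counts


# Hypothetical log lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [01/Jun/2009:10:00:00 +0000] "GET /Packages/bash-4.0-7.fc11.i586.rpm HTTP/1.1" 200 1234',
    '192.0.2.2 - - [01/Jun/2009:10:00:05 +0000] "GET /Packages/bash-4.0-7.fc11.i586.rpm HTTP/1.1" 200 1234',
    '192.0.2.3 - - [01/Jun/2009:10:00:09 +0000] "GET /Packages/python-devel-2.6-9.fc11.x86_64.rpm HTTP/1.1" 200 99',
    '192.0.2.4 - - [01/Jun/2009:10:00:12 +0000] "GET /Packages/missing-1.0-1.fc11.noarch.rpm HTTP/1.1" 404 0',
]

downloads = count_downloads(sample_log)
print(downloads.most_common(2))
```

This sketch ignores partial downloads (status 206) and repeated requests from the same client, both of which would need a policy decision before the numbers mean much.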

pkgdb

  • Number of packages over time
  • Package to packager ratio over time
  • Number of people with X packages (histogram)
  • Number of packages with X people (histogram)
  • Percentage of packages with EPEL, OLPC branches
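
The ratio and the two histograms above could be computed along these lines, assuming package ownership has already been exported as a mapping from package name to a set of packagers. The mapping and its contents are hypothetical; this does not use any pkgdb API.

```python
from collections import Counter


def ownership_stats(owners_by_package):
    """Compute the package-to-packager ratio and both histograms.

    owners_by_package maps a package name to the set of people on it.
    """
    packagers = set().union(*owners_by_package.values())
    ratio = len(owners_by_package) / len(packagers) if packagers else None

    # How many packages each person has.
    packages_per_person = Counter()
    for owners in owners_by_package.values():
        packages_per_person.update(owners)

    # Histogram: X packages -> number of people with exactly X packages.
    people_with_x_packages = Counter(packages_per_person.values())
    # Histogram: X people -> number of packages with exactly X people.
    packages_with_x_people = Counter(
        len(owners) for owners in owners_by_package.values()
    )
    return ratio, people_with_x_packages, packages_with_x_people


# Hypothetical ownership data, for illustration only.
owners_by_package = {
    "foo": {"alice", "bob"},
    "bar": {"alice"},
    "baz": {"carol"},
    "qux": {"alice"},
}

ratio, by_person, by_package = ownership_stats(owners_by_package)
print(ratio)
print(dict(by_person))
print(dict(by_package))
```

Running the same function on one snapshot per date would give the "over time" series.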

bodhi

  • Number of updates over time
  • Update submitters
  • Feedback submitters
  • Most updated packages
  • Broken deps

rawhide

  • Number of updated packages over time
  • Most updated packages in a release cycle
  • Broken deps

Actual package contents (repoquery)

  • Percentage of packages with a common suffix (-devel, -doc, -data, -common)
  • Percentage of subpackages that aren't noarch but could be (Features/NoarchSubpackages)
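
The suffix percentage could be computed as sketched below, assuming a plain list of binary package names is already available (for example, one name per line from a repository query). The package names are made up, and reading the last suffix in the list as "-common" is an assumption.

```python
def suffix_percentages(names, suffixes=("-devel", "-doc", "-data", "-common")):
    """Return the percentage of package names ending in each suffix."""
    total = len(names)
    if total == 0:
        return {suffix: 0.0 for suffix in suffixes}
    return {
        suffix: 100.0 * sum(name.endswith(suffix) for name in names) / total
        for suffix in suffixes
    }


# Hypothetical package names, for illustration only.
names = [
    "glibc", "glibc-devel", "glibc-common", "python-doc",
    "foo-data", "bar", "baz", "qux-devel",
]

percentages = suffix_percentages(names)
print(percentages)
```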

Mailing lists

  • List activity
  • Popular threads
  • Most active posters
  • Number of subscriptions and unsubscriptions over time

Wiki

  • Wiki edits and other actions (page moves, etc)
  • People who actually use edit summaries

Fedora Hosted

  • Commits and committers

Non-fedorahosted.org SCMs

Red Hat Bugzilla

  • Bugs opened
  • Bugs closed
  • Bugs in the rugs

Mirrormanager

IRC meetings

Nagios, Zabbix, and other fun infrastructure things

Website logs