From Fedora Project Wiki

Revision as of 19:16, 26 March 2012

statistics++: Making Fedora Project data accessible
Ian Weller, Fedora Engineering, Red Hat, Inc.

Project overview

Fedora Infrastructure has so far made only a limited foray into statistics. The Statistics page on the Fedora Project Wiki reports the number of HTTP requests made to various infrastructure applications and the number of wiki edits made per month.

The statistics app in the first version of Fedora Community attempted to improve on the Statistics page, but ultimately failed for two reasons: adding new and relevant automated queries to the platform was complex, and Fedora's application servers could access only a limited amount of information.

With the planned messaging infrastructure for infrastructure applications, a statistics application can listen on the message bus and record activity in a database for later retrieval.

statistics++ consists of three services:

  1. A server daemon that listens on the infrastructure message bus and records activity to a database
  2. An HTTP application that provides a RESTful web API for downloading data stored in the database
  3. An HTTP application that produces automated data displays such as tables or charts
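
The first service in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's design: the planned message bus is replaced by a plain Python iterable, SQLite stands in for whatever database is eventually chosen, and the topic names, message fields, and the functions `open_store`, `record`, and `listen` are all hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the activity table if needed and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "id INTEGER PRIMARY KEY, topic TEXT NOT NULL, "
        "timestamp REAL NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def record(conn, message):
    """Store one bus message; the body is kept as JSON so its fields can vary."""
    conn.execute(
        "INSERT INTO activity (topic, timestamp, body) VALUES (?, ?, ?)",
        (message["topic"], message["timestamp"], json.dumps(message["body"])),
    )
    conn.commit()


def listen(conn, bus):
    """Consume messages from any iterable standing in for the message bus."""
    count = 0
    for message in bus:
        record(conn, message)
        count += 1
    return count


# Hypothetical messages: topic names and fields are invented for illustration.
sample_bus = [
    {"topic": "wiki.edit", "timestamp": 1332789360.0, "body": {"page": "Statistics"}},
    {"topic": "wiki.edit", "timestamp": 1332789420.0, "body": {"page": "Main_Page"}},
    {"topic": "build.complete", "timestamp": 1332789480.0, "body": {"package": "example"}},
]

store = open_store()
stored = listen(store, sample_bus)
```

Keeping the message body as a JSON string is one simple way to let each application publish differently shaped data without a schema change per application; a real deployment might prefer a different storage model.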

Target audience

Justification

Goals

This project aims to solve the following problems:

  • Data on the Statistics wiki page can only be generated and validated by those who have access to Fedora log servers.
  • Data on the Statistics wiki page requires a human to generate the data each week.
  • Data on the Statistics wiki page does not encompass all infrastructure applications.
  • Data on the Statistics wiki page can be modified by anybody who can edit the wiki.
  • To generate data for other infrastructure applications (such as FAS, Koji, and Bodhi), separate code has to be written for each application in order to download data.

To solve these problems, statistics++ will have the following functionality:

  • Open, read-only access to any anonymized data collected by infrastructure applications
  • A standard RESTful API for downloading data
  • Flexible schemas for storing and retrieving data from infrastructure applications
  • Live updates of statistical data from infrastructure applications
  • An interface for creating automated queries and representing data in tables or charts
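
As a rough illustration of the read-only download API described above, the following sketch serves per-topic activity counts as JSON from a WSGI application. Everything named here is an assumption made for the example: the `activity` table layout, the topic names, the `/counts` path, and the helpers `make_app` and `call` are hypothetical, and SQLite stands in for the real data store.

```python
import json
import sqlite3


def make_app(conn):
    """Build a WSGI app exposing per-topic counts from the activity table."""

    def app(environ, start_response):
        # Read-only access: anything other than GET is rejected.
        if environ.get("REQUEST_METHOD") != "GET":
            start_response("405 Method Not Allowed", [("Allow", "GET")])
            return [b""]
        rows = conn.execute(
            "SELECT topic, COUNT(*) FROM activity GROUP BY topic ORDER BY topic"
        ).fetchall()
        payload = json.dumps({topic: count for topic, count in rows}).encode("utf-8")
        start_response(
            "200 OK",
            [("Content-Type", "application/json"),
             ("Content-Length", str(len(payload)))],
        )
        return [payload]

    return app


def call(app, method):
    """Invoke the WSGI app directly, without a server; return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": method, "PATH_INFO": "/counts"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


# Hypothetical sample data so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (topic TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [("wiki.edit", "{}"), ("wiki.edit", "{}"), ("build.complete", "{}")],
)
app = make_app(conn)
```

Calling `call(app, "GET")` returns the status line and a JSON body of counts keyed by topic, while any other method gets a 405 response. To try it over HTTP, the same `app` object could be passed to `wsgiref.simple_server.make_server` from the standard library.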

Non-goals

Requirements

Use cases

Relationship to other services

Reviewers

Details

Schedule summary

Dependencies

Open issues

Resources for information

Design overview

Responsible parties