Purpose - Request for Resources
The purpose of this page is to help track requests for resources and to associate resources with their project leaders. In general, the way this works is that someone fills out an RFR ticket (see below) and sends an email to the infrastructure team for approval. If for some reason the request is denied, the requester should take their issue to the Fedora Council. If you work for a company or Red Hat and have a budget to back the request, you are far more likely to get the request approved. There's only so much stuff to go around.
First, please read this entire page and collect answers to the questions you will be asked. If your resource requires packages, please make sure that you create them, get them approved, and have them available in Fedora and/or EPEL before you file your RFR ticket. Then fill out a ticket on https://pagure.io/fedora-infrastructure/ (select type: RFR) and join the fedora-infrastructure-list. Once on the list, make sure to send an email about your RFR. Remember, not only is this a request, it is also a proposal. Include all information about why this is important for Fedora and why we should dedicate resources to test it and ultimately deploy, support and back it up.
All information on the page is required. Pay close attention to the "Expiration Date". This is the date by which this project will either be migrated to production hardware, or will have been declared a failure and taken to the Council to decide if it should be deleted or remain.
What to expect
All infrastructure discussion is done in the open as much as possible. This is especially true for new ideas and new requests, so be prepared to defend your ideas and make sure that you have a clearly defined goal in mind. Unfortunately our resources are finite and we can't just do everything for everyone whenever they ask for it. Requesters who are not satisfied here should not be discouraged: the final say always rests with the Fedora Project Board, and even if it says no, there are always other places to turn to for hosting your project.
Those whose project is approved may be given a complete test machine or only a service to work with. Those with an entire test machine must keep it up to date and in working order. The purpose of the machine is to make sure that requesters don't have to wait on the infrastructure team for action. It's a way for us to get out of your way while you are implementing this new service.
It is the job of those requesting resources to find a sponsor and convince them the project is worth their time. *Not all projects can be approved.* If your project is not initially accepted, remember to contact a sponsor and explain why the project should be accepted and how Fedora benefits from it. Contact a sponsor on the FIGs page related to the request. (For example, if you're building a website, contact a sponsor of the sysadmin-web or sysadmin-test group.)
In a perfect world your request would get approved and within days it would be live and people could access it. In reality it takes much longer and involves a lot more work. This is especially true for the project leader. The project leader is typically the person who made the RFR and is responsible for seeing the project through to conclusion. They will need to find an infrastructure member with access to the test servers and must work together with the team to make sure the work is getting done.
The infrastructure team doesn't intentionally neglect work, but other things often get busy. If a Project Leader isn't around to keep working with Infrastructure, asking when things will be done and what can be done to help, then that RFR typically gets forgotten about. This is actually a very community way of doing things. In the corporate world the boss says do and it gets done. In the community world, if the person who thought the thing up doesn't care to stick around and see it through to conclusion, then it's questionable whether that thing should have been done in the first place.
Responsibilities of an RFR Project Leader
If you have a service that is accepted as an RFR and that you plan to eventually have deployed permanently in Fedora, you are signing up for maintenance as well as for initial coding and deployment. Keep that in mind :-)
Here's a list of some of the things that we expect of an RFR maintainer. Note that infrastructure is a team, so it won't just be on your shoulders to do these things. Equally, infrastructure is a team of which you're a part, and it's not fair to the rest of the team to bring in a new maintenance burden without pulling your own weight.
Bringing in a team of people to do these things is always appreciated so there's not a single point of failure.
- Recruiting and training other people to work on the service so that you aren't a single point of failure
- Applying rpm updates to the service (and to any underlying pieces of the application stack) if necessary.
  - Note that there are infra policies on freeze periods around releases, and updating pieces of the software stack may require you to interact with the teams working on other services deployed in infrastructure.
- Applying hotfixes via the puppet hotfix module if there's a security fix or bugfix that needs to go in and it's not worth spinning a new rpm.
- Keeping up with upstream development
  - This includes keeping track of security fixes. Note that for many apps, this task involves much more work than simply following the Fedora package updates.
- Answering questions about whether a yum update (to your app or to the underlying stack) might break your app.
  - Also, testing if you don't immediately know the answer
  - Also, coding patches to fix things should it become apparent that the app is broken with the update
- Fixing things should an app start throwing errors in production for unknown reasons
  - Could include deploying to staging
  - Could include coding and diagnosing
  - Could include spending long hours staring at log files
  - Could include being paged or contacted off hours to get the service back online.
- Work on deployment problems
  - It's too slow, what can we change to speed up this page?
- Testing things in staging before deploying to production
- Rolling new rpms of the application
In addition to a Project Leader, each resource should have a community of maintainers around it. Resources with a single leader, or without a healthy turnover of contributors, can fail when the leader is busy or unavailable. Resources with a single leader/maintainer may be rejected until a larger community is available. If your resource is popular / desirable, you should have no trouble on-boarding interested maintainers.
When submitting a Request for Resources the RFR Leader should realize that they are committing to producing, testing, deploying and maintaining this resource. If they become unable to continue, it's the responsibility of the RFR project leader to hand off the resource or retire it. Don't expect things to magically continue when there are no interested folks maintaining or driving deployment.
All Fedora Projects must be on working, supported hardware. Hardware that is EOLed or "hobby hardware" will never be accepted into Fedora Infrastructure for a production application that we then support. The bottom line is that we're not going to say "we now support this" if the box might die in the middle of the night and no replacements can be found. Fedora has grown and matured quite a bit since its inception, and proper production hardware is part of that.
All production resources must be clusterable / load balanceable, preferably the latter. So remember when you're asking for resources, you're typically asking for _at least_ double what you need to get something up and running. Shared storage and databases are the exception to this. Make sure to think hard about what you're asking for. Don't forget about backups and storage space. Imagine your project becoming wildly popular and then suddenly running out of disk space.
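As a rough illustration of the "at least double" rule above, here is a minimal sketch of how a requester might tally the hosts to ask for. The component names, the Component class and the hosts_to_request helper are hypothetical, and the default factor of 2 is simply the minimum the text implies. This is not an official sizing tool.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    hosts_needed: int      # hosts needed just to get the piece running
    shared: bool = False   # shared storage or database: not doubled


def hosts_to_request(components, redundancy=2):
    """Total hosts to ask for, doubling everything except shared pieces.

    `redundancy` is an assumed multiplier; the page only says "at least double".
    """
    total = 0
    for c in components:
        factor = 1 if c.shared else redundancy
        total += c.hosts_needed * factor
    return total


# Hypothetical plan: two load-balanced tiers plus one shared database.
plan = [
    Component("web frontend", 1),
    Component("worker", 1),
    Component("database", 1, shared=True),
]

print(hosts_to_request(plan))  # 1*2 + 1*2 + 1*1 = 5
```

A tally like this only covers host counts. Backups and disk space still need their own estimate in the RFR.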
When considering driving an RFR, keep in mind the ongoing maintenance costs for the resource:
- How often does the package need to be updated (for instance for security fixes)?
- Do we sometimes have to write our own code to fix things, authenticate against FAS, etc.?
- Is the upstream alive or dead?
- Do we have a relationship with upstream where we can ask them to do things for us?
- Is the upstream branch going to be producing bugfixes (or at least, security fixes) to the service for a long time?
- How easy is updating? At one end of the spectrum it's "yum update" and done. At the other end, we're packaging it ourselves, porting our custom add-ons, running a series of scripts to update the production database, updating the config file, and finally taking an outage to actually do the update.
Tips and Hints
What can you do to get your RFR accepted and moved out to users better/faster?
- Have your resource use tools that are already widely deployed in Fedora Infrastructure (Python, Flask, etc.) rather than ones that are not (JBoss, Java, Ruby on Rails, etc.).
- Make sure your resource is already packaged for Fedora/EPEL before filing an RFR. Packaging can take a while, and filing an RFR too early will cause people to get discouraged or less excited by your project.
- Make sure you have multiple people involved/willing to help.
- Be available on IRC in #fedora-admin for problems or questions about your resource.
- Feel free to nag your sponsor and others to keep driving things forward, but keep in mind that this is a process and we will NOT skip steps or apply a shoddy workaround to get your resource deployed faster.
- Gather information from affected Fedora groups before proposing your RFR. For example, if your resource is related to package maintainers, tell them about it and get feedback first. A clear mandate from a Fedora group saying that your resource would help them improves its chances of being accepted and worked on.
- If at any point in time the primary or secondary contacts become unresponsive, the project may be removed.
- Test instances are NOT backed up unless requested. Once something goes live it will be brought into the normal fold of what we do.
- Ultimately the Board has final say on what is and is not "Fedora".