This is a project to convert our current package source control (CVS) into git. See Dist_Git_Proposal.
- Jesse Keating
- Toshio Kuratomi
Currently we are evaluating a number of tools to perform the tasks needed.
parsecvs is the tool we're currently using to convert CVS history into git format. It has been used to convert a number of projects, including xorg and the Gnome projects, so it has a good track record. It is quite fast, is able to translate CVS committer names into full git-style names and email addresses, and seems to handle our packages well. A script has been written that processes the CVS ,v files via parsecvs and creates the proper branches for release subdirs. The last full run took roughly 900 minutes to complete.
A trial import is available to the public via a public test system. Modules are exported via the git:// or ssh:// protocol, and the URL formats are
git://pkgs.stg.fedoraproject.org/<module>
ssh://[fedoraccount@]pkgs.stg.fedoraproject.org/<module>
For example, in order to clone the yum module anonymously, one would enter:
git clone git://pkgs.stg.fedoraproject.org/yum
To clone the yum module with write access, one would enter:
git clone ssh://pkgs.stg.fedoraproject.org/yum
git push should just work, provided you have pkgdb rights to commit to the yum module.
Unlike CVS, where we used subdirectories for "branches" and could therefore apply filesystem ACLs to the subdirs, git does not provide an easy way to do filesystem ACLs at a branch level. Therefore we will need to use an extra layer to accomplish our needs. This is not unlike our current use of CVS, where we rely on file system group ID to provide write access, and then use the CVS avail system to restrict that down.
Currently we are evaluating gitolite to provide the ACLs. It has the ability to provide users write access only to specific branches.
gitolite does have a few problems for our use that we are working out with upstream.
- It defines user groups internally rather than using getent
- It is designed around every user logging in through a single system user via ssh keys
- The config file system does not quite scale to our size
gitolite upstream has created a branch of the code for multi-user setups with a huge config file, namely the Fedora case, and is committed to making it work for us.
A preliminary script has been written that takes data from pkgdb and getent in order to draft a gitolite config file, which can then be "compiled" into what gitolite uses internally to check ACLs.
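As a rough illustration of what such a drafting step might look like, here is a minimal sketch. It is not the preliminary script itself. The function names, the shape of the input dictionaries, and the exact form of the branch pattern (`RW <branch>$ = ...`) are assumptions loosely modeled on gitolite's config syntax, and should be checked against the gitolite version in use. Package ACLs are passed in as plain data rather than fetched from pkgdb.

```python
def group_members(name):
    """Return a system group's member list, as `getent group NAME` would report it."""
    import grp  # Unix-only standard library module
    return list(grp.getgrnam(name).gr_mem)


def draft_gitolite_config(groups, packages):
    """Draft gitolite-style config text (hypothetical helper).

    groups:   {group_name: [user, ...]}, e.g. collected with group_members()
    packages: {module: {branch: [committer or @group, ...]}}, e.g. from pkgdb
    """
    lines = []
    # Group definitions first, so repo rules can refer to them as @group.
    for name in sorted(groups):
        members = " ".join(sorted(groups[name]))
        lines.append("@%s = %s" % (name, members))
    # One repo stanza per module, one write rule per branch.
    for module in sorted(packages):
        lines.append("")
        lines.append("repo %s" % module)
        for branch in sorted(packages[module]):
            committers = " ".join(sorted(packages[module][branch]))
            lines.append("    RW %s$ = %s" % (branch, committers))
    return "\n".join(lines) + "\n"
```

Calling `draft_gitolite_config({"provenpackager": ["alice", "bob"]}, {"yum": {"master": ["alice"], "F-12": ["@provenpackager"]}})` would yield a group line followed by a `repo yum` stanza with one rule per branch. Emitting one rule per branch is what keeps write access scoped to individual release branches.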
gitolite works by running gl-auth-command when an ssh connection is initiated. This is controlled by .ssh/authorized_keys, much as in our current CVS server setup. gl-auth-command checks ACLs against a pre-compiled hash and, if access is allowed, passes the rest of your ssh command on to the local git command. The update hook in each repo also checks whether you have rights to do what you are attempting on the ref in question (master, F-12, a tag, etc.).
This does bring up the wrinkle of admins with shell access to the git server, on whom we can't force gl-auth-command via authorized_keys. These few people will bypass the first auth check, and we'll have to rely upon file system permissions plus the repo update hook to deny them access to things they shouldn't access. A symlink on the filesystem will need to be provided so that people interacting directly with git use the same paths as people who interact via gl-auth-command (which defines its own git root path).
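The forced command described above is attached to each key in .ssh/authorized_keys. The sketch below shows how one such line could be assembled. The install path of gl-auth-command and the helper name are assumptions for illustration. The option names themselves (command=, no-port-forwarding, no-X11-forwarding, no-agent-forwarding, no-pty) are standard OpenSSH authorized_keys options.

```python
def authorized_keys_line(username, public_key,
                         auth_command="/usr/local/bin/gl-auth-command"):
    """Build one authorized_keys entry that forces gl-auth-command.

    auth_command is a hypothetical install path. username is passed as
    the argument so the forced command knows which user owns the key.
    """
    options = [
        'command="%s %s"' % (auth_command, username),
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    # Options are comma-separated and precede the key itself.
    return "%s %s" % (",".join(options), public_key)
```

With such a line in place, sshd ignores whatever command the client asked for, runs the forced command instead, and exposes the client's request in the SSH_ORIGINAL_COMMAND environment variable.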
Another problem is that when the update hook runs without gl-auth-command (e.g., for people who have full shell access), it will not allow writing, because data is missing from the environment. There are a few ways this could be fixed.
- An early check in the update hook that exits 0 if gl-auth-command wasn't used, bypassing branch level ACLs
- Running a secondary ssh server for admin shells, forcing all git traffic through gl-auth-command
- Forcing gl-auth-command for every ssh user, with a check in gl-auth-command that detects non-git actions and drops to the shell if the user is an admin user
All have their pros and cons, but the last option currently seems best: it avoids running more daemons, sends every git action through gl-auth-command and subsequently the update hook, and lets us define a git base path to keep the URLs short. This solution has one small wrinkle. Currently we have 'no-pty' as one of the options when using a forced command in authorized_keys. This means that admins would not be able to get a shell due to lack of a pty, so for certain users we'd have to not use that option. This may require some changes within FAS itself to make ssh auth commands and options more generic and defined by each group. This is the route we're currently taking to provide ssh write access to the repos.
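The decision the last option calls for can be sketched as follows. This is only an illustration of the idea, not gitolite code: the function names, the admin set, and the three outcome labels are hypothetical. It relies on the fact that sshd places the client's requested command in SSH_ORIGINAL_COMMAND when a forced command is in effect, and that git clients request git-upload-pack, git-receive-pack, or git-upload-archive.

```python
import shlex

# Server-side programs a git client asks for over ssh.
GIT_COMMANDS = {"git-upload-pack", "git-receive-pack", "git-upload-archive"}


def is_git_command(original_command):
    """True if the requested ssh command is one of the git transports."""
    if not original_command:
        return False  # interactive login, no command requested
    try:
        words = shlex.split(original_command)
    except ValueError:
        return False  # unbalanced quotes: not something git would send
    if not words:
        return False
    verb = words[0]
    # Accept both "git-upload-pack 'x'" and "git upload-pack 'x'".
    if verb == "git" and len(words) > 1:
        verb = "git-" + words[1]
    return verb in GIT_COMMANDS


def classify(user, original_command, admins):
    """Return "git", "shell", or "deny" for one incoming ssh session."""
    if is_git_command(original_command):
        return "git"  # continue with the normal ACL checks
    return "shell" if user in admins else "deny"


def ssh_key_options(is_admin):
    """authorized_keys options; admins keep a pty so a shell is usable."""
    options = ["no-port-forwarding", "no-X11-forwarding", "no-agent-forwarding"]
    if not is_admin:
        options.append("no-pty")
    return options
```

In a real wrapper, the value passed as original_command would come from os.environ.get("SSH_ORIGINAL_COMMAND"). Keeping the classification separate from the action taken makes the admin exception easy to audit.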
No code has been written yet for fedpkg; see Dist_Git_Proposal#fedpkg for a feature list. Ideally fedpkg would operate somewhat like the koji client or python-bugzilla client, or even the way git works, where fedpkg takes a series of global options, and then each command takes further options. This helps in isolating development on the tool and makes adding features to specific commands quite easy (look into python-argparse for this). fedpkg will most likely be a part of the fedora-packager package.
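Since no fedpkg code exists yet, the following is only a sketch of the global-options-then-subcommand layout using argparse. The option names (--verbose, --anonymous, --scratch) and the clone and build subcommands are hypothetical placeholders, not the feature list from the proposal. The clone URLs follow the test system format given earlier on this page.

```python
import argparse


def clone_command(args):
    """Return the git command a clone would run (not executed here)."""
    scheme = "git" if args.anonymous else "ssh"
    url = "%s://pkgs.stg.fedoraproject.org/%s" % (scheme, args.module)
    return ["git", "clone", url]


def build_parser():
    parser = argparse.ArgumentParser(prog="fedpkg")
    # Global options come before the subcommand name.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print more detail")
    subparsers = parser.add_subparsers(dest="command")

    # Each subcommand owns its options, so it can evolve independently.
    clone = subparsers.add_parser("clone", help="clone a package module")
    clone.add_argument("module")
    clone.add_argument("--anonymous", action="store_true",
                       help="clone read-only over git://")
    clone.set_defaults(func=clone_command)

    build = subparsers.add_parser("build", help="placeholder subcommand")
    build.add_argument("--scratch", action="store_true")
    return parser
```

For example, `build_parser().parse_args(["-v", "clone", "--anonymous", "yum"])` gives a namespace whose `func` builds the anonymous clone command for the yum module. Adding a new command then means adding one more `add_parser` call without touching the others.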
Koji upstream already has some code to deal with git repos, however our proposed layout will be different enough to require modification. No modification has been done yet.
The current target for conversion is shortly after the Fedora 13 release.
How can I help?
Find Jesse Keating (Oxf13) on freenode IRC