From Fedora Project Wiki
(added information how the fedora message bus can be queried to determine if the files on the master mirror have changed)
(→‎Communicating: Matrix -> Chat)
 
(41 intermediate revisions by 15 users not shown)
Line 1: Line 1:
{{autolang|base=yes}}{{header|infra}}
Fedora is fortunate to have '''over 200 volunteer [http://mirrors.fedoraproject.org/publiclist mirror sites] globally''', which helps distribute the Fedora software to end users.  We greatly appreciate our mirror sites and their system administrators.
Fedora is fortunate to have '''over 200 volunteer [https://admin.fedoraproject.org/mirrormanager/mirrors mirror sites] globally''', which help distribute the Fedora software to end users.  We greatly appreciate our mirror sites and their system administrators.
 
Please read the [https://docs.fedoraproject.org/en-US/legal/export/ Fedora Export Compliance/Customs Information].
 
== Communicating ==
* Mailing lists: [http://www.redhat.com/mailman/listinfo/mirror-list mirror-list] (announcements only) and [http://www.redhat.com/mailman/listinfo/mirror-list-d mirror-list-d]  (discussion)
* IRC: <code>{{fpchat|#fedora-admin}}</code> on Freenode
* Administrative changes: send email to <code>mirror-admin@fedoraproject.org</code>


== What are the size estimates? ==
* Mailing lists: [https://lists.fedoraproject.org/admin/lists/mirror-admin.lists.fedoraproject.org/ mirror-admin] (note: the list moved to this location on 2017-03-21)
* Chat: [https://matrix.to/#/#admin:fedoraproject.org <code>#admin:fedoraproject.org</code> on Matrix]
* Administrative changes: send email to <code>mirror-admin@lists.fedoraproject.org</code>
 
== Before you decide to become a mirror ==
 
=== What are the size estimates? ===


* http://dl.fedoraproject.org/pub/DIRECTORY_SIZES.txt
Please read [https://dl.fedoraproject.org/pub/DIRECTORY_SIZES.txt the DIRECTORY_SIZES.txt text file] carefully. If you can allocate the required amount of space for mirroring Fedora, read on.
Please read the text file carefully.


== Export Compliance ==
=== How can someone become a public mirror? ===


By downloading Fedora software, you acknowledge that you understand all of the following: Fedora software and technical information may be subject to the U.S. Export Administration Regulations (the “EAR”) and other U.S. and foreign laws and may not be exported, re-exported or transferred (a) to any country listed in Country Group E:1 in Supplement No. 1 to part 740 of the EAR (currently, Cuba, Iran, North Korea, Sudan & Syria); (b) to any prohibited destination or to any end user who has been prohibited from participating in U.S. export transactions by any federal agency of the U.S. government; or (c) for use in connection with the design, development or production of nuclear, chemical or biological weapons, or rocket systems, space launch vehicles, or sounding rockets, or unmanned air vehicle systems. You may not download Fedora software or technical information if you are located in one of these countries or otherwise subject to these restrictions. You may not provide Fedora software or technical information to individuals or entities located in one of these countries or otherwise subject to these restrictions. You are also responsible for compliance with foreign law requirements applicable to the import, export and use of Fedora software and technical information.
Becoming an official public mirror is easy, and getting easier. All we request is that your site have sufficient bandwidth and disk space to handle the load.


== How can someone become a public mirror? ==
==== Storage space ====


Becoming an official public mirror is easy, and getting easier.  All we request is that your site have sufficient bandwidth and disk space to handle the load.  Each Fedora release can consume upwards of 200GB of disk space, and downloaders can consume as much bandwidth as you've got.  Mirror sites have at least a 100Mbit/sec* connection to the Internet, many have Gigabit or larger pipes.  As of the Fedora 8 release, the total space consumed on the master server, thus what a mirror could consume, is 1.1TB and growing.  A 1-2TB volume would be most appropriate for a long-term mirror.  This content is hardlinked; if you can't hardlink (e.g. you're on AFS), you'll need much more disk space.  Actual space consumed is noted at [http://dl.fedoraproject.org/pub/DIRECTORY_SIZES.txt] .
Each Fedora release can consume upwards of 250GB of disk space.  As of the Fedora 26 release, the total space consumed on the master server, thus what a mirror could consume, is 1.5TB and growing.  Older releases are periodically archived, so a 3-4TB volume would be most appropriate for a long-term mirror.  This content is hardlinked; if you can't hardlink (e.g. you're on AFS), you'll need much more disk space.  Actual space consumed is noted in [https://dl.fedoraproject.org/pub/DIRECTORY_SIZES.txt DIRECTORY_SIZES.txt].


* 100Mbit/sec is the rule for countries with adequate mirror coverage already.  We can make exceptions for new mirrors in countries that have few mirrors.  Connections to Internet2, National Lambda Rail, GEANET2, RedIRIS, or other such high speed research and educational networks are always appreciated.
==== Bandwidth ====


== How can someone make a private mirror? ==
Downloaders can consume as much bandwidth as you've got.  Mirror sites have at least a 100Mbit/sec connection to the Internet; many have Gigabit/sec, 10 Gigabit/sec, or larger pipes.  100Mbit/sec is the minimum required for new mirrors in countries that already have adequate mirror coverage.  We can make exceptions for new mirrors in countries that have few mirrors.  Connections to Internet2, National Lambda Rail, GÉANT2, RedIRIS, or other such high-speed research and educational networks are always appreciated.
 
=== How can someone make a private mirror? ===


Private mirrors are mirrors that reside entirely within an organization (company, school, etc.) and can only be accessed by members of that organization.  They exist to speed up distributing Fedora within an organization, where local bandwidth costs are far less than going across the Internet.
Line 31: Line 38:
* Private mirrors are not crawled by the MirrorManager web crawler.  Therefore:
* Private mirrors must run report_mirror to inform the MirrorManager database of their content.  If you don't run report_mirror, your clients will not be automatically redirected.
You may also find it more beneficial to run an [https://fedorahosted.org/intelligentmirror/ IntelligentMirror] instead of a full rsync mirror.  In this way, only the updates your local users actually need will be cached on your local mirror, saving you the bandwidth from downloading updates you don't actually need.


== MirrorManager: the Fedora Mirror Management system ==
The [http://mirrormanager.org MirrorManager] software keeps track of all the mirrors without requiring a lot of manual text file editing.  Admins of the mirrors must ensure that the <code>report_mirror</code> script available with the <code>mirrormanager-client</code> is run after each <code>rsync</code> to update the Mirrormanager database.


=== Signing up ===
The [https://github.com/fedora-infra/mirrormanager2/ MirrorManager2] software keeps track of all the mirrors without requiring a lot of manual text file editing. Fedora's instance of this application is at [https://admin.fedoraproject.org/mirrormanager https://admin.fedoraproject.org/mirrormanager]


==== Fedora Account System ====
=== Mirror Status ===
* You must have an account in the [https://admin.fedoraproject.org/accounts/ Fedora Account System] . (More info also at [[Infrastructure/AccountSystem| ]] .) You are not required to sign the Contributors License Agreement to merely mirror Fedora content, but you must do so if you wish to contribute to other aspects of Fedora.
* If a public mirror, you must send an email to <code>mirror-admin@fedoraproject.org</code> introducing yourself, stating you would like to become a mirror, your IP address, your location (country), and your outbound bandwidth available for the mirror.  Private mirrors are encouraged to send a similar note.
* You must subscribe to [http://www.redhat.com/mailman/listinfo/mirror-list mirror-list]  and (optionally) [http://www.redhat.com/mailman/listinfo/mirror-list-d mirror-list-d]  (discussion) to be notified of new releases.


==== Registering in MirrorManager ====
MirrorManager tracks the state of all mirrors by regularly scanning their content. The scans are performed by the crawler, which crawls each mirror over either HTTP or rsync. We strongly recommend providing an rsync URL (at least for our crawler): crawling via rsync is much faster because it requires only a single network connection for each mirrored category. If only HTTP access is possible for Fedora's mirror crawler, it is important to enable [[Infrastructure/Mirroring#Keepalives|HTTP Keepalives]] to reduce the number of required network connections.
* Log into [https://admin.fedoraproject.org/mirrormanager mirrormanager]  using your FAS account.
* Create a new Site.
* create a new Host, and sign up that host for the Categories of content you'll carry, any other site administrators you want, your site's IP addresses used for our Access Control List, and the other details listed there if applicable to you.
* Please run <code>report_mirror</code> after each rsync run.
* You may list your site's IP address ranges (Netblocks). Clients coming from an IP address within your netblock will be automatically redirected to your mirror for any content you carry.
* You may list your site's BGP Autonomous System Number (ASN).  Clients on your ASN will be automatically redirected to your mirror for any content you carry.  One way to lookup up your ASN is to query it from the routeviews.org DNS servers.  It is like a PTR record lookup, but at a specific server.  For example, to look up 143.166.1.1, type:
$ dig txt 1.1.166.143.asn.routeviews.org @archive.routeviews.org
;; ANSWER SECTION:
1.1.166.143.asn.routeviews.org. 86400 IN TXT "3614" "143.166.0.0" "16"
Here, the answer is in the TXT record, the first value, 3614.


=== Mirroring ===
The crawler does not scan private mirrors, so private mirrors have to run <code>report_mirror</code> to tell MirrorManager that they are up to date.
The only sane way to do mirroring is to use <code>rsync</code>.  Note the options <code>-H</code> (hardlinks), <code>--delay-updates</code>, <code>--numeric-ids</code> and <code>--delete-after</code> are required to ensure your mirror content stays valid even during a new rsync run, until all the new data is available.


rsync -vaH --exclude-from=${EXCLUDES} --numeric-ids --delete --delete-after --delay-updates \
==== rsync health checking ====
  rsync://dl.fedoraproject.org/fedora-enchilada ${LOCAL_DIR}
If you are allowing MirrorManager to perform a health check via rsync (generally preferred due to the cost of doing HTTP HEAD requests and such), please make sure that 38.145.60.3 has read access to the respective rsync targets.


* You may exclude any content you desire, such as architectures, using an EXCLUDES file.
=== Signing up ===


* Please pull from one of the Tier 1 mirrors. See [[Infrastructure/Mirroring/Tiering]] .  Instead of using one of the Tier 1 servers, you may wish to pull from another fast mirror that's closer to you.  Contact the respective mirror admins to be added to their ACL.
==== Fedora Account System ====
* You must have an account in the [https://accounts.fedoraproject.org Fedora Account System] (more info also at [[AccountSystem]]). Note that this uses the same backend as the [https://accounts.centos.org CentOS Account System], so if you already have an account there you do not need to create a new one. You are not required to sign the Contributors License Agreement to merely mirror Fedora content, but you must do so if you wish to contribute to other aspects of Fedora.
* If you run a public mirror, you must send an email to <code>mirror-admin@lists.fedoraproject.org</code> introducing yourself and stating that you would like to become a mirror, your IP address, your location (country), and the outbound bandwidth available for the mirror. Private mirrors are encouraged to send a similar note.
* You must subscribe to [https://lists.fedoraproject.org/admin/lists/mirror-admin.lists.fedoraproject.org/ mirror-admin] to be notified of new releases.


* If you are using rsync 3.0 or higher, you can use the <code>--delete-delay</code> option instead of <code>--delete-after</code>, which is [http://lists.debian.org/debian-mirrors/2009/04/msg00017.html reported] to provide faster performance. Additionally, if enough free space is available on the receiving side, the option <code>--delay-updates</code> can be used to make the updating of the files a little more atomic and decrease the time during which the mirror is inconsistent.
==== Registering in MirrorManager ====


==== Mirror Frequency====
* Log into [https://admin.fedoraproject.org/mirrormanager/ MirrorManager] using your FAS account.
* Create a new Site (a site is a group of Hosts belonging to the same organization) by clicking on your user profile.
* Create a new Host, and sign up that host for the Categories of content you'll carry (under the ''formally optional'' section '''Categories Carried'''), any other site administrators you want, your site's IP addresses used for our Access Control List, the list of countries that are allowed to access your mirror, and the other details listed there if applicable to you.
** You may list your site's IP address ranges (Netblocks).  Clients coming from an IP address within your netblock will be automatically redirected to your mirror for any content you carry.
** You may list your site's BGP Autonomous System Number (ASN).  Clients on your ASN will be automatically redirected to your mirror for any content you carry.  One way to look up your ASN is to query it from the routeviews.org DNS servers.  It is like a PTR record lookup, but at a specific server.  For example, to look up 143.166.1.1, type:<br />
**:  <pre>$ dig txt 1.1.166.143.asn.routeviews.org @archive.routeviews.org
**:: ;; ANSWER SECTION:
**:: 1.1.166.143.asn.routeviews.org. 86400 IN TXT "3614" "143.166.0.0" "16"</pre>
**:: Here, the answer is in the TXT record, the first value, 3614.
** For a mirror to be listed in MirrorManager for a category of content you have selected to carry, you ''must'' add at least one URL through which that category can be accessed on your mirror (under '''Categories Carried'''). The URLs you add can be of type HTTP, HTTPS, or rsync. Fedora's MirrorManager instance does not support FTP URLs, so it is no longer possible to add an FTP URL.
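
The ASN lookup described above can be scripted. The following is only a sketch: the helper names <code>reverse_ip</code> and <code>asn_from_txt</code> are made up for illustration, and the live query (left commented out) assumes <code>dig</code> is installed and the routeviews.org servers are reachable.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MirrorManager.

# Reverse the octets of an IPv4 address, as in a PTR-style lookup.
reverse_ip() {
    printf '%s\n' "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
}

# Extract the first quoted field (the ASN) from a routeviews TXT answer.
asn_from_txt() {
    printf '%s\n' "$1" | sed -n 's/^[^"]*"\([0-9][0-9]*\)".*/\1/p'
}

# Live lookup (needs dig and network access), shown for 143.166.1.1:
# answer=$(dig +short txt "$(reverse_ip 143.166.1.1).asn.routeviews.org" @archive.routeviews.org)
# asn_from_txt "$answer"
```

Fed the answer line from the example above, <code>asn_from_txt</code> yields the first TXT value, which is the number to enter in the ASN field.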
 
=== Mirroring ===


* You should sync shortly after 0800 UTC (when rawhide is pushed), 1400 UTC (when bitflips occur), and another 3-5 times per day (updates are manually released).
* Please use the [https://pagure.io/quick-fedora-mirror quick-fedora-mirror] zsh script if at all possible. This script uses rsync and some informational files on the mirrors to allow you to only sync those files you need and saves lots of time and file seeking. '''Note:''' make sure that the mirror you intend to use provides the fedora-buffet rsync module, as otherwise the script will not work.


* You should add some random value to the start time of your rsync jobs, to even out load on the upstream mirrors. A cron line might look like:
* Please pull from one of the [[Infrastructure/Mirroring/Tiering#Tier_1_Mirrors|Tier 1 mirrors]]. Instead of using one of the Tier 1 servers, you may wish to pull from another fast mirror that's closer to you. Contact the respective mirror admins to be added to their ACL.
  45 */6 * * * perl -le 'sleep rand 1800' && bash -l ~/mirror-fedora > /dev/null


* For Tier 1 mirrors there is also the option to query the [https://apps.fedoraproject.org/datagrepper/ Fedora message bus] if the files on the master mirror have been updated since the last sync. In the [https://fedorahosted.org/fedora-infrastructure/ticket/4539 Fedora Infrastructure Ticket #4539] this is discussed in more detail and the script [https://fedorahosted.org/fedora-infrastructure/raw-attachment/ticket/4539/last-sync last-sync] is provided.
* You may exclude any content you desire, such as architectures or releases, using an EXCLUDES file or <code>--exclude</code> parameter.
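
If you are not using quick-fedora-mirror, a plain rsync invocation with an exclude file might look roughly like the sketch below. The local paths are assumed example values, the upstream URL should be replaced by the mirror you actually pull from, and the command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: every path below is an assumed example value, not a Fedora default.
UPSTREAM="rsync://dl.fedoraproject.org/fedora-enchilada"  # or a closer Tier 1 mirror
LOCAL_DIR="/srv/mirror/fedora"                            # hypothetical local tree
EXCLUDES="/etc/fedora-mirror/excludes.txt"                # hypothetical exclude list

# Print the rsync command instead of running it.
# -a is archive mode and -H preserves hardlinks; --delay-updates and
# --delete-after keep the tree usable until the whole transfer has finished.
print_sync_command() {
    printf 'rsync -aH --numeric-ids --delete-after --delay-updates --exclude-from=%s %s/ %s/\n' \
        "$EXCLUDES" "$UPSTREAM" "$LOCAL_DIR"
}

print_sync_command
```

The exclude file holds one rsync pattern per line, for example the directory of an architecture or release you do not want to carry.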


* The tool <code>last-sync</code> can be used to query if any files have been changed on the mirror or it can also be used to query if specific sections have new files:
==== Tools to avoid ====
<pre>
$ last-sync -h


last-sync queries the Fedora Message Bus if new data is available on the public servers
Please don't use tools like 'lftp mirror' to mirror content from the master mirrors. It places a heavy load on our master mirrors and is slow for both you and us. Please use [https://pagure.io/quick-fedora-mirror quick-fedora-mirror] instead.


Usage: last-sync [options]
==== Mirror Frequency====


Options:
* If you are using quick-fedora-mirror, you can sync as often as every 10 minutes in a loop. If there is no new content, the script should return almost immediately. Make sure to use locking so that only one instance runs at a time.
  -a, --all                query all possible releases (default)
                            (fedora, epel, branched, rawhide)
  -f, --fedora              only query if fedora has been updated during <delta>
  -e, --epel                only query if epel has been updated
  -b, --branched            only query if the branched off release
                            has been updated
  -r, --rawhide            only query if rawhide has been updated
  -q, --quiet              do not print out any informations
  -d DELTA, --delta=DELTA  specify the time interval which should be used
                            for the query (default: 86400)
</pre>


* The following is an example how <code>last-sync</code> could be used in a mirror script. It requires that the mirror has stored the information about their last successful sync of the their Fedora mirror:
* If you are not using quick-fedora-mirror you should sync a few times a day and consider switching.  
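
One way to get the locking mentioned above is a lock directory, since <code>mkdir</code> either creates the directory or fails atomically. This is a minimal sketch, not something shipped with quick-fedora-mirror; the function name <code>run_locked</code>, the lock path and the exit code 75 are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if no other instance holds the lock.
# Usage: run_locked LOCKDIR COMMAND [ARGS...]
run_locked() {
    lockdir=$1
    shift
    # mkdir fails if the directory already exists, so it doubles as a lock.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another sync is still running, skipping this run" >&2
        return 75
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return "$status"
}

# Example call from cron (not run here; both paths are hypothetical):
# run_locked /var/lock/fedora-mirror.lock /usr/local/bin/quick-fedora-mirror
```

Note that a run that is killed before it reaches <code>rmdir</code> leaves the lock directory behind, so a stale lock has to be removed by hand in this simple form.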


<pre>
==== Running report_mirror (only private mirrors) ====
# check if mirror needs to be updated
CURDATE=`date +%s`
LASTRUN=`the magic command to get the time of the last successful run of this mirror script`
DELTA=`echo ${CURDATE}-${LASTRUN} | bc`


/path/to/last-sync -d ${DELTA} -q
'''Note:''' quick-fedora-mirror can also check in for you and has no need of the <code>report_mirror</code> package/conf. To enable it, set the correct values for <code>CHECKIN_SITE</code> and <code>CHECKIN_PASSWORD</code> variables. In most cases, setting <code>CHECKIN_HOST</code> is not necessary. The following applies only if you are not using quick-fedora-mirror, and, as said above, you should consider switching.


if [ "$?" -ne "0" ]; then
MirrorManager includes a tool, <code>report_mirror</code>, which tells the mirror database that you have completed a sync run and what content you carry. This is required for private mirrors: they have to run <code>report_mirror</code> after every rsync job completes.
        # no changes on the master mirror
        # abort
        exit 0
fi
</pre>


==== Running report_mirror ====
  dnf install mirrormanager2-client
MirrorManager includes a tool, <code>report_mirror</code>, which can upload to the mirror database that you completed a run and what content you've got. This makes generating the yum mirrorlists and all other pages much simpler.  Please run <code>report_mirror</code> after every rsync job completes. You can get this tool via yum:


yum install mirrormanager-client
or yum (in RHEL and CentOS):


or get the files directly from the git tree [http://git.fedorahosted.org/cgit/mirrormanager/tree/client report_mirror files] .  Or it can be obtained using git:
yum install mirrormanager2-client


git clone git://git.fedorahosted.org/mirrormanager
or get the files directly from [https://github.com/fedora-infra/mirrormanager2 the official Git repository]:
  or
git clone http://git.fedorahosted.org/git/mirrormanager/  


git clone https://github.com/fedora-infra/mirrormanager2.git


You need both report_mirror and report_mirror.conf, and must edit report_mirror.conf to include the content you're carrying and the path to that content on your disk.
You need both report_mirror and report_mirror.conf, and must edit report_mirror.conf to include the content you're carrying and the path to that content on your disk. Only private mirrors can use report_mirror.
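
For a private mirror, the report step can be chained to the sync step so that it only runs after a successful sync. The sketch below assumes a hypothetical wrapper function <code>sync_then_report</code>; the script path and the <code>-c</code> configuration option in the commented example are assumptions to be checked against your installed <code>report_mirror</code>.

```shell
#!/bin/sh
# Sketch only: run the report step only when the sync step succeeded.
# Both arguments are simple command lines supplied by the caller.
sync_then_report() {
    sync_cmd=$1
    report_cmd=$2
    # Left unquoted on purpose so "cmd arg1 arg2" is split into words.
    if $sync_cmd; then
        $report_cmd
    else
        echo "sync failed, not reporting to MirrorManager" >&2
        return 1
    fi
}

# Example call (not run here; paths and the -c option are assumptions):
# sync_then_report "/usr/local/bin/mirror-fedora" \
#     "report_mirror -c /etc/mirrormanager-client/report_mirror.conf"
```

Skipping the report after a failed sync avoids telling MirrorManager that the mirror carries content it may not have completely.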


==== Available content ====
Line 134: Line 112:


===== Suggested rsync modules =====
===== Suggested rsync modules =====
{| border="1"
{| border="1"
|-
|-
|'''rsync module'''||'''Description'''||'''path on master server'''||Comments
! rsync module || Description || path on master server || Comments
|-
|-
|fedora-buffet0||Fedora - The whole buffet. All you can eat.||/pub||Please use this if you can, it provides the all current Fedora content, including pre-bitflip content.  This is open to specific mirrors by request.  Mirrors participating in our tiering should use this.  Mirrors syncing both fedora-enchilada and fedora-archive should use this, as we can now hardlink across both of those trees under fedora-buffet0.
|fedora-buffet||Fedora -- the whole buffet (all you can eat)||/pub||Please use this if you can; it provides all current Fedora content, including pre-bitflip content.  This is open to specific mirrors by request.  Mirrors participating in our tiering should use this.  Mirrors syncing both fedora-enchilada and fedora-archive should use this, as we can now hardlink across both of those trees under fedora-buffet0.
|-
|-
|fedora-enchilada0||Fedora - the whole enchilada||/pub/fedora||Please use this if you can, it provides the all current Fedora content, including pre-bitflip content.  This is open to specific mirrors by request.  Mirrors participating in our tiering should use this.
|fedora-enchilada||Fedora -- the whole enchilada||/pub/fedora||Please use this if you can; it provides all current Fedora content, including pre-bitflip content.  This is open to specific mirrors by request.  Mirrors participating in our tiering should use this.
|-
|-
|fedora-epel||Extra Packages for Enterprise Linux||/pub/epel||Please use this to mirror EPEL
|fedora-epel||Extra Packages for Enterprise Linux||/pub/epel||Please use this to mirror EPEL
Line 156: Line 136:
{| border="1"
{| border="1"
|-
|-
|'''rsync module'''||'''Description'''||'''path on master server'''||Comments
! rsync module || Description || path on master server || Comments
|-
|fedora-enchilada||Fedora - the whole enchilada||/pub/fedora||Please use this if you can, it provides the all Fedora content except pre-bitflip content.  This is open for the general public to use.
|-
|-
|fedora-linux-releases||Fedora Linux Releases||/pub/fedora/linux/releases||
|fedora-linux-releases||Fedora Linux Releases||/pub/fedora/linux/releases||
Line 165: Line 143:
|-
|-
|fedora-linux-updates||Fedora Linux Updates||/pub/fedora/linux/updates||
|fedora-linux-updates||Fedora Linux Updates||/pub/fedora/linux/updates||
|-
|fedora-epel||Extra Packages for Enterprise Linux||/pub/epel||EPEL doesn't do the bitflip trick, so this is the same as fedora-epel0 above.
|-
|-
|}
|}
===== Red Hat Enterprise Linux Beta =====
RHEL beta releases are also managed via MirrorManager.  Fedora's Tier 1 mirrors may sync these modules.
{| border="1"
|-
|'''rsync module'''||'''Description'''||'''path on master server'''||Comments
|-
|rhel-beta||RHEL||/pub/redhat||on rhm3.redhat.com
|}
==== DVDs, CDs, and the exploded trees ====
When a new release is available, it can be bandwidth-efficient to download only the ISOs first (say, the DVD ISOs), then explode those into the directory structure, then run a full normal rsync run.  This lets you avoid downloading the same RPMs twice (both on ISOs and as plain RPMs).  There's a tool somewhere to help do this.
==== Regular hardlink runs ====
While the Fedora release maintainers try to keep as little redundant packaging around as possible, there are some duplicate packages in the tree.  For example, when a Fedora Test release comes out, the package set included there looks remarkably like that of the development tree from a few days before.  By copying the development tree over into the new Test directory before starting your rsync run, and using <code>rsync -H</code>, you can avoid downloading all that content a second time.
In addition, it's good practice to run a tool like <code>hardlink++</code> on your tree occasionally (say, weekly), to ensure as much of your tree as possible is hardlinked.
==== Pre-Release: Copying Development tree to new release directory ====
In the days leading up to a release, either test or final, the development tree will stop taking new packages, and will closely resemble what winds up in the new release.  As a mirror, you can avoid downloading content that already is in your copy of the development tree that matches what's in the release tree by copying those packages using hardlinks, such as:
cp -lr fedora/linux/development/13/i386/os fedora/linux/releases/13/Fedora/i386/
cp -lr fedora/linux/development/13/x86_64/os fedora/linux/releases/13/Fedora/x86_64/
cp -lr fedora/linux/development/13/source fedora/linux/releases/13/Fedora/
and then start the rsync process, which will clean up any changes and fix up the timestamps.


==== Rsync Configuration (sample) ====
==== Rsync Configuration (sample) ====
Line 243: Line 193:
       ExpiresDefault "now"
       ExpiresDefault "now"
   </LocationMatch>
   </LocationMatch>
</pre>
===== Redirecting ISO downloads to FTP =====
While no longer a recommended practice, the following mod_rewrite rules will force all *.iso files to be downloaded via FTP.  In this example HEAD requests are not redirected, so the MirrorManager crawler is not disrupted.
<pre>
RewriteCond    %{REQUEST_METHOD} GET
RewriteRule    ^(.*\.iso)$ ftp://myserver/$1  [L,R=301]
</pre>
</pre>


Line 279: Line 221:
<pre>
<pre>
RewriteEngine on
RewriteEngine on
RewriteCond %{HTTP:Range} [0-9] $
RewriteCond %{HTTP:Range} [0-9]$
RewriteRule \.iso$ / [F,L]  
RewriteRule \.iso$ / [F,L]  
</pre>
</pre>
Line 316: Line 258:


===== Serving content to other mirrors =====
===== Serving content to other mirrors =====
Tier 1 mirrors will necessarily need to share content to Tier 2 mirrors before the bitflip.  This is done by running another instance of the rsync daemon, on a different port (e.g. 874), with an Access Control List to prevent public downloads, running as a user in the same group as downloaded the content (e.g. group mirror).  This could be user mirror, group mirror, who has group read/execute permissions on the still-private content.
 
[[Infrastructure/Mirroring/Tiering#Tier_1_Mirrors|Tier 1 mirrors]] will necessarily need to share content to Tier 2 mirrors before the bitflip.  This is done by running another instance of the rsync daemon, on a different port (e.g. 874), with an Access Control List to prevent public downloads, running as a user in the same group as downloaded the content (e.g. group mirror).  This could be user mirror, group mirror, who has group read/execute permissions on the still-private content.


Tier 1 mirrors tend to use different authentication methods for granting access to these non-public downloads, ranging from maintaining IP-based ACLs to assigning username/password combinations to mirrors wishing to sync from them.  Each method has advantages and disadvantages.  The IP list is 'simpler' from a MirrorManager perspective, as MirrorManager can give you the list of IPs, but it can be more difficult to automate (as rsync's configuration file does not allow that ACL list to be stored in a separate file).  Usernames and passwords can be more versatile, as mirroring sites can change IPs without notifying you, but it is easier for those credentials to leak out and be misused.


== Mirror Map ==
=== Other MirrorManager features ===
http://fedoraproject.org/maps/mirrors.png
 
==== Statistics ====
 
MirrorManager tracks [https://admin.fedoraproject.org/mirrormanager/statistics the number of accesses to the mirror list] and breaks it down by country, architecture, and repository.
 
==== Mirror Maps ====
 
[https://admin.fedoraproject.org/mirrormanager/maps The map of all public mirrors] is updated daily; if it doesn't display the correct data, please try again later.


This map is updated daily. If it doesn't display, please try again later.
===== Recognition =====


=== Recognition ===
This product includes [https://dev.maxmind.com/geoip/legacy/geolite/ GeoLite data] created by [https://www.maxmind.com/ MaxMind].
This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/.


[[Category:Infrastructure]]
[[Category:Infrastructure]]

Latest revision as of 21:17, 17 November 2023

Fedora is fortunate to have over 200 volunteer mirror sites [1] globally, which help distribute the Fedora software to end users. We greatly appreciate our mirror sites and their system administrators.

Please read the Fedora Export Compliance/Customs Information.

Communicating

Before you decide to become a mirror

What are the size estimates?

Please read the DIRECTORY_SIZES.txt text file carefully. If you can allocate the required amount of space for mirroring Fedora, read on.

How can someone become a public mirror?

Becoming an official public mirror is easy, and getting easier. All we request is that your site have sufficient bandwidth and disk space to handle the load.

Storage space

Each Fedora release can consume upwards of 250GB of disk space. As of the Fedora 26 release, the total space consumed on the master server, thus what a mirror could consume, is 1.5TB and growing. Older releases are periodically archived, so a 3-4TB volume would be most appropriate for a long-term mirror. This content is hardlinked; if you can't hardlink (e.g. you're on AFS), you'll need much more disk space. Actual space consumed is noted in DIRECTORY_SIZES.txt.

Bandwidth

Downloaders can consume as much bandwidth as you've got. Mirror sites have at least a 100Mbit/sec connection to the Internet; many have Gigabit/sec, 10 Gigabit/sec, or larger pipes. 100Mbit/sec is the minimum required for new mirrors in countries that already have adequate mirror coverage. We can make exceptions for new mirrors in countries that have few mirrors. Connections to Internet2, National Lambda Rail, GÉANT2, RedIRIS, or other such high-speed research and educational networks are always appreciated.

How can someone make a private mirror?

Private mirrors are mirrors that reside entirely within an organization (company, school, etc.) and can only be accessed by members of that organization. They exist to speed up distributing Fedora within an organization, where local bandwidth costs are far less than going across the Internet.

Private mirrors behave similarly to public mirrors, with a few exceptions:

  • Private mirrors are never listed in the MirrorManager publiclist pages.
  • Private mirrors cannot pull from the master Fedora download servers. They must pull from another listed public mirror.
  • Private mirrors must include IP netblocks in their MirrorManager configuration. This allows your network-local users to be automatically redirected to your mirror. You may list IP netblocks (e.g. 18.0.0.0/8), or if your network is NAT'd, the hostname of your NAT gateway.
  • Private mirrors are not crawled by the MirrorManager web crawler. Therefore:
  • Private mirrors must run report_mirror to inform the MirrorManager database of their content. If you don't run report_mirror, your clients will not be automatically redirected.

MirrorManager: the Fedora Mirror Management system

The MirrorManager2 software keeps track of all the mirrors without requiring a lot of manual text file editing. Fedora's instance of this application is at https://admin.fedoraproject.org/mirrormanager

Mirror Status

MirrorManager keeps track of all the mirrors by regularly scanning their content. The crawler scans each mirror using either HTTP or rsync. We strongly recommend providing an rsync URL (at least for our crawler), because crawling via rsync is much faster: it requires only a single network connection for each mirrored category. If only HTTP access is possible for Fedora's mirror crawler, it is important to enable HTTP Keepalives to reduce the number of required network connections.

The crawler does not scan private mirrors, so private mirrors have to run report_mirror to tell MirrorManager that they are up to date.

rsync health checking

If you are allowing MirrorManager to perform a health check via rsync (generally preferred due to the cost of doing HTTP HEAD requests and such), please make sure that 38.145.60.3 has read access to the respective rsync targets.

Signing up

Fedora Account System

  • You must have an account in the Fedora Account System (more info also at AccountSystem). Note that this uses the same backend as the CentOS Account System, so if you already have an account there you do not need to create a new one. You are not required to sign the Contributors License Agreement to merely mirror Fedora content, but you must do so if you wish to contribute to other aspects of Fedora.
  • If a public mirror, you must send an email to mirror-admin@lists.fedoraproject.org introducing yourself, stating you would like to become a mirror, your IP address, your location (country), and your outbound bandwidth available for the mirror. Private mirrors are encouraged to send a similar note.
  • You must subscribe to mirror-admin to be notified of new releases.

Registering in MirrorManager

  • Log into MirrorManager using your FAS account.
  • Create a new Site (a site is a group of Hosts belonging to the same organization) by clicking on your user profile.
  • Create a new Host, and sign up that host for the Categories of content you'll carry (under the formally optional section Categories Carried), any other site administrators you want, your site's IP addresses used for our Access Control List, the list of countries that are allowed to access your mirror, and the other details listed there if applicable to you.
    • You may list your site's IP address ranges (Netblocks). Clients coming from an IP address within your netblock will be automatically redirected to your mirror for any content you carry.
    • You may list your site's BGP Autonomous System Number (ASN). Clients on your ASN will be automatically redirected to your mirror for any content you carry. One way to look up your ASN is to query it from the routeviews.org DNS servers. It is like a PTR record lookup, but at a specific server. For example, to look up 143.166.1.1, type:
      $ dig txt 1.1.166.143.asn.routeviews.org @archive.routeviews.org
      
      ;; ANSWER SECTION:
      1.1.166.143.asn.routeviews.org. 86400 IN TXT "3614" "143.166.0.0" "16"
      Here, the answer is in the TXT record, the first value, 3614.
    • For a mirror to be listed in MirrorManager for a category of content you have selected to carry, you must add at least one URL through which that category can be accessed on your mirror (under Categories Carried). The URLs you add can be of type HTTP, HTTPS, or rsync. Fedora's MirrorManager instance does not support FTP URLs, so it is no longer possible to add an FTP URL.
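
The reversed-octet query name used in the dig example above can be built with a small shell helper. This is only a sketch for IPv4 addresses; the function name asn_query_name is hypothetical and not part of any Fedora tool.

```shell
# asn_query_name IPV4_ADDRESS
# Prints the asn.routeviews.org name to query: the four octets in reverse order.
asn_query_name() {
    # Split the dotted quad on "." into four variables.
    IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
    printf '%s.%s.%s.%s.asn.routeviews.org\n' "$o4" "$o3" "$o2" "$o1"
}

# Usage (needs network access):
# dig +short txt "$(asn_query_name 143.166.1.1)" @archive.routeviews.org
```

As in the example above, the first value of the returned TXT record is the ASN.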

Mirroring

  • Please use the quick-fedora-mirror zsh script if at all possible. This script uses rsync and some informational files on the mirrors to allow you to only sync those files you need and saves lots of time and file seeking. Note: make sure that the mirror you intend to use provides the fedora-buffet rsync module, as otherwise the script will not work.
  • Please pull from one of the Tier 1 mirrors. Instead of using one of the Tier 1 servers, you may wish to pull from another fast mirror that's closer to you. Contact the respective mirror admins to be added to their ACL.
  • You may exclude any content you desire, such as architectures or releases, using an EXCLUDES file or --exclude parameter.

Tools to avoid

Please don't use tools like 'lftp mirror' to mirror content from the master mirrors. It places a heavy load on our master mirrors and is slow for both you and us. Please use quick-fedora-mirror instead.

Mirror Frequency

  • If you are using quick-fedora-mirror, you can sync as often as every 10 minutes in a loop. If there is no new content, the script should return almost immediately. Make sure to use locking so that only one instance runs at a time.
  • If you are not using quick-fedora-mirror you should sync a few times a day and consider switching.
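
The locking advice above can be sketched in plain shell. This is a minimal illustration and not part of quick-fedora-mirror: the function name run_locked, the lock path, and the sync_fedora command are all hypothetical. It uses mkdir as the lock because creating a directory either succeeds or fails atomically, so two overlapping runs cannot both acquire it.

```shell
# run_locked LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created.
# Returns 75 if another run already holds the lock, else COMMAND's exit status.
run_locked() {
    lockdir=$1
    shift
    mkdir "$lockdir" 2>/dev/null || return 75
    rc=0
    "$@" || rc=$?
    rmdir "$lockdir"
    return "$rc"
}

# Hypothetical usage: attempt a sync every 10 minutes and skip a round
# if the previous one is still running.
# while true; do
#     run_locked /var/tmp/fedora-sync.lock sync_fedora
#     sleep 600
# done
```

A lock directory left behind by a crashed run has to be removed by hand; flock(1) from util-linux avoids that problem where it is available.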

Running report_mirror (only private mirrors)

Note: quick-fedora-mirror can also check in for you and has no need of the report_mirror package/conf. To enable it, set the correct values for CHECKIN_SITE and CHECKIN_PASSWORD variables. In most cases, setting CHECKIN_HOST is not necessary. The following applies only if you are not using quick-fedora-mirror, and, as said above, you should consider switching.

MirrorManager includes a tool, report_mirror, which tells the MirrorManager database that a private mirror has completed a sync run and what content it carries. This is required for private mirrors, which have to run report_mirror after every rsync job completes. Install it with dnf:

dnf install mirrormanager2-client

or yum (in RHEL and CentOS):

yum install mirrormanager2-client

or get the files directly from the official Git repository:

git clone https://github.com/fedora-infra/mirrormanager2.git

You need both report_mirror and report_mirror.conf, and must edit report_mirror.conf to include the content you're carrying and the path to that content on your disk. Only private mirrors can use report_mirror.
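
The requirement to run report_mirror after every rsync job can be sketched as a small wrapper. The function name sync_then_report is hypothetical. Reporting only after a successful sync is a design choice made for this sketch, so that a failed run is not announced as up to date.

```shell
# sync_then_report SYNC_COMMAND REPORT_COMMAND
# Runs SYNC_COMMAND through sh; only if it succeeds, runs REPORT_COMMAND.
sync_then_report() {
    sh -c "$1" || { echo "sync failed; not reporting" >&2; return 1; }
    sh -c "$2"
}

# Hypothetical usage. The upstream host, module name, and local path are
# placeholders, and the -c option and config path are assumptions; check
# the report_mirror help output on your system.
# sync_then_report \
#     'rsync -aH rsync://mirror.example.org/fedora-enchilada/ /srv/mirror/fedora/' \
#     'report_mirror -c /etc/mirrormanager-client/report_mirror.conf'
```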

Available content

The content modules available via rsync, and their locations in the directory tree, are:

Suggested rsync modules

| rsync module | Description | path on master server | Comments |
|---|---|---|---|
| fedora-buffet | Fedora -- the whole buffet (all you can eat) | /pub | Please use this if you can; it provides all the current Fedora content, including pre-bitflip content. This is open to specific mirrors by request. Mirrors participating in our tiering should use this. Mirrors syncing both fedora-enchilada and fedora-archive should use this, as we can now hardlink across both of those trees under fedora-buffet0. |
| fedora-enchilada | Fedora -- the whole enchilada | /pub/fedora | Please use this if you can; it provides all the current Fedora content, including pre-bitflip content. This is open to specific mirrors by request. Mirrors participating in our tiering should use this. |
| fedora-epel | Extra Packages for Enterprise Linux | /pub/epel | Please use this to mirror EPEL |
| fedora-archive | Historical Fedora releases | /pub/archive | Fedora Core 1-6 and Extras 3-6, and obsolete releases 7 and higher |
| fedora-secondary | Fedora Secondary Arches | /pub/fedora-secondary | |
| fedora-alt | Fedora Other | /pub/alt | |


These other modules exist for legacy purposes, and should be avoided by current mirrors.

| rsync module | Description | path on master server | Comments |
|---|---|---|---|
| fedora-linux-releases | Fedora Linux Releases | /pub/fedora/linux/releases | |
| fedora-linux-development | Fedora Linux Development (Rawhide) | /pub/fedora/linux/development | |
| fedora-linux-updates | Fedora Linux Updates | /pub/fedora/linux/updates | |

Rsync Configuration (sample)

Larger mirrors, like kernel.org, have slightly customized front-ends to rsync (mainly so that they can run a single rsync instance with multiple IP-based vhost configuration files). That said, what follows is a sample rsync configuration file for public syncing (it is not intended for private pre-bitflip mirroring):

[fedora]
        comment         = Fedora - RedHat community project
        path            = <path to your fedora directory>
        exclude         = lost+found/
        read only       = true
        max connections = 100
        lock file       = /var/run/rsyncd-mirrors.lock
        uid             = <user id (numeric, or textual) of an anonymous style user who should have read access>
        gid             = <group id (numeric, or textual) of an anonymous style user who should have read access>
        transfer logging = yes
        timeout         = 900
        ignore nonreadable = yes
        dont compress   = *.gz *.tgz *.zip *.z *.Z *.rpm *.deb *.bz2
        refuse options = checksum

Things to explicitly note:

  • The path above should be a full path to your fedora directory
  • You *really* should leave this read-only
  • Make sure your uid/gid are set to public users, not to the user that you run as your sync agent. If you set this to the user who does your syncs you will be inadvertently giving the public full pre-bitflip access.
  • Make sure 'refuse options' is set to checksum. Your server will be *MUCH* happier with this set, as it prevents public users from performing a checksum run against you. Checksum runs can be incredibly I/O abusive, so they should not be available to the general public.

HTTPd Configuration

Keepalives

HTTP Keepalives should be enabled on your mirror server to speed up client downloads. By default, Fedora's Apache httpd package has keepalives disabled. They should be enabled, with a timeout of at least 2 seconds (the default of 15 seconds might be too high for a heavily loaded mirror server, but 2 seconds is sufficient and appropriate for yum).

KeepAlive On
KeepAliveTimeout 2
MaxKeepAliveRequests 100

Other http servers such as lighttpd have keepalives enabled by default.

Caching of metadata

We don't want caching proxy servers between our mirrors and our end user systems to cache our yum repository metadata. So, add explicit metadata handling. (Suggested by the OpenSUSE download redirector.)

   <LocationMatch "\.(xml|xml\.gz|xml\.asc|sqlite)$">
       Header set Cache-Control "must-revalidate"
       ExpiresActive On
       ExpiresDefault "now"
   </LocationMatch>

Content Types

ISO and RPM files should be served using MIME Content-Type: application/octet-stream. In Apache, this can be done inside a VirtualHost or similar section:

<VirtualHost *:80>
AddType application/octet-stream .iso
AddType application/octet-stream .rpm
</VirtualHost>

Limiting Download Accelerators

Download accelerators will try to open the same file many times, and request chunks, hoping to download them in parallel. This can overload heavily loaded mirror servers, especially on release day. Here are some tricks to thwart such activities.

To limit connections to ISO dirs by some amount per IP:

<IfModule mod_limitipconn.c>
MaxConnPerIP 6
</IfModule>

To block range requests, which are what download accelerators rely on:

RewriteEngine on
RewriteCond %{HTTP:Range} [0-9]$
RewriteRule \.iso$ / [F,L] 

Similar things can be done with iptables and the recent module, which might give you a little more control over what is being done, either by limiting new connections or by dropping 50% of a user's packets.
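
To see which requests the Range rewrite rule shown earlier in this section would reject, the two patterns can be tried out in shell. This is a sketch only: the function name would_forbid is hypothetical, and it uses grep -E in place of Apache's regex engine, which behaves the same for these simple patterns. The path is tested against \.iso$ and the Range header value against [0-9]$.

```shell
# would_forbid REQUEST_PATH RANGE_HEADER
# Returns 0 if the rewrite rule would answer 403 Forbidden, 1 otherwise.
would_forbid() {
    # RewriteRule pattern: only paths ending in .iso are affected.
    printf '%s\n' "$1" | grep -Eq '\.iso$' || return 1
    # RewriteCond pattern: the Range header value must end in a digit.
    printf '%s\n' "$2" | grep -Eq '[0-9]$'
}

# Examples:
# would_forbid /pub/fedora/image.iso 'bytes=0-1048575'   # forbidden (bounded chunk)
# would_forbid /pub/fedora/image.iso 'bytes=1048576-'    # allowed (open-ended range)
# would_forbid /pub/fedora/image.iso ''                  # allowed (no Range header)
```

Because the pattern requires a trailing digit, an open-ended range such as bytes=1048576- (the usual form when resuming an interrupted download) is not blocked, while bounded chunk requests are.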

Logging Partial Content Downloads

Partial content can be logged correctly using Apache:

# this includes actual counts of actual bytes received (%I) and
# sent (%O); this requires the mod_logio module to be loaded.

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" %I %O \"%{User-Agent}i\"" combined

Pre-bitflip mirroring

Several days before each public release, the content will be staged to the master mirror servers, but with restricted permissions on the directories (generally mode 0750), specifically, not world readable.

Mirror servers should have several different user/group accounts on their server, for running the different public services. Typically you find:

  • HTTP server runs as user apache, group apache
  • FTP server runs as user ftp, group ftp
  • RSYNC server runs as user rsync, group rsync
  • a user account for downloading content from the masters (e.g. user mirror, group mirror).

The user account used to download content from the masters must not be the same as the HTTP, FTP, or RSYNC server accounts. This guarantees that content downloaded with permissions 0750 will not be made available via your public servers yet.

On the morning of the public release, the permissions on the directories on the master servers will change to 0755 - world readable. This is called the bitflip.
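
To check whether a directory on your mirror is still in its pre-bitflip state, you can inspect the "other" permission bits. This is a minimal sketch; the function name bitflip_state is hypothetical, and it looks only at the one directory given, not at its contents.

```shell
# bitflip_state DIRECTORY
# Prints "public" if others have both read and execute permission
# on DIRECTORY (e.g. mode 0755), otherwise "private" (e.g. mode 0750).
bitflip_state() {
    # Characters 8-10 of the ls -ld mode string are the "other" permissions.
    perms=$(ls -ld -- "$1" | cut -c8-10)
    case $perms in
        r?x) echo public ;;
        *)   echo private ;;
    esac
}

# Usage:
# bitflip_state /pub/fedora/linux/releases/9
```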

Mirrors may either rsync one more time to pick up these new permissions (but won't have to download all the data again), or preferably, can schedule a batch job to bitflip:

$ echo "chmod a+rx /pub/fedora/linux/releases/9" | at '14:45 UTC May 13 2008'

Serving content to other mirrors

Tier 1 mirrors will necessarily need to share content with Tier 2 mirrors before the bitflip. This is done by running another instance of the rsync daemon on a different port (e.g. 874), with an Access Control List to prevent public downloads, running as a user in the same group as the account that downloaded the content (e.g. user mirror, group mirror), which has group read/execute permissions on the still-private content.

Tier 1 mirrors use different authentication methods for granting access to these non-public downloads, ranging from IP-based ACLs to username/password combinations assigned to the mirrors that sync from them. Each method has advantages and disadvantages. An IP list is simpler from a MirrorManager perspective, as MirrorManager can give you the list of IPs, but it can be harder to automate (rsync's configuration file does not allow the ACL to be stored in a separate file). Usernames and passwords are more versatile, as mirroring sites can change IPs without notifying you, but those credentials can more easily leak and be misused.

Other MirrorManager features

Statistics

MirrorManager tracks the number of accesses to the mirror list and breaks it down by country, architecture, and repository.

Mirror Maps

The map of all public mirrors is updated daily; if it doesn't display the correct data, please try again later.

Recognition

This product includes GeoLite data created by MaxMind.