From Fedora Project Wiki
(adding some more steps)
(capture of most of the markup fixes from my experience plowing through the mw-render output to XML, then converting it to be contextual and clean)

Revision as of 21:21, 15 October 2008

This page collects notes on what needs to change, ideally by script, after converting wiki Beats content to XML with:

mw-render -c http://fedoraproject.org/w/ -w docbook Some_wiki_file_name -o Some_wiki_file_name.xml
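When several Beats pages need converting, the command above could be wrapped in a small script. The following is a minimal sketch only: it reuses the flags exactly as shown above, while the function names `render_command` and `render_pages` and the page list are hypothetical and not part of any existing tooling.

```python
import subprocess

# Wiki base URL as used in the command above.
WIKI_URL = "http://fedoraproject.org/w/"


def render_command(page, wiki_url=WIKI_URL):
    """Build the mw-render argument list for one wiki page (flags copied from the notes)."""
    return ["mw-render", "-c", wiki_url, "-w", "docbook", page, "-o", page + ".xml"]


def render_pages(pages, run=subprocess.run):
    """Render each page in turn; stops at the first failure because check=True."""
    for page in pages:
        run(render_command(page), check=True)
```

Calling `render_pages(["Some_wiki_file_name"])` would run the same command as above, assuming mw-render is installed and on the PATH.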


  • The content renders each page as a stand-alone book. (This is different from previous MoinMoin behavior, which made every page a chapter.) Some content needs removing or changing for the page to become a chapter.
    • Change the !DOCTYPE to 'chapter'
    • Remove the ?xml-stylesheet call entirely, or the <? remnant
    • Change the actual document from <book> to <chapter>
      • Using XSLT?
      • Hacky way is to chop the <book></book>, convert the <article...></article> to <chapter></chapter>; remove the <articleinfo>...</articleinfo> block entirely
    • Look for random empty containers, such as 'para' and 'literallayout' that likely came from extraneous empty lines
    • Many list elements and titles have a leading or following space inherited from something in the wiki
      • Need to figure out what that is and change the mw-render or our markup practices
    • Give each <section> an ID value equal to the contents of the <title>...</title> with '_' instead of spaces, starting with 'sn-'
      • for relnotes, all sections now have 'section id="sn-"'
    • Turn the admonition output into the equivalent DocBook admonition. Note that we are using only three admonitions, so a specific mapping needs to be made.[1]
    • Run the page through something similar to xmlformat or ... xmllint?
  • Search through the file for each of the markup output types covered in "Wiki markup output to XML, mapped to DocBook XML"; that is, do the following:
    • Search for each instance of 'emphasis' and replace it with the proper DocBook contextual markup
    • Search for each instance of 'code' and 'programlisting' and replace it with the proper DocBook contextual markup
    • Search and replace empty literallayout containers with proper markup
    • Convert inlinemediaobject to proper admonition
    • Make 'ulink' entries recursively single -- <ulink url="" />
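Several of the structural steps above could be scripted. The following is a rough sketch using Python's standard xml.etree.ElementTree, not a finished tool. It assumes the output has no XML namespace, that empty 'para' and 'literallayout' containers can simply be dropped (the notes do not say what the "proper markup" replacement should be), and that stripping whitespace from titles and list item paragraphs is safe. All function names are hypothetical. The DOCTYPE and ?xml-stylesheet changes are not handled, since ElementTree does not preserve them.

```python
import xml.etree.ElementTree as ET


def book_to_chapter(root):
    """The 'hacky way': keep the <article>, rename it to <chapter>, drop <articleinfo>."""
    article = root if root.tag == "article" else root.find("article")
    if article is None:
        raise ValueError("no <article> element found")
    article.tag = "chapter"
    article.attrib.clear()
    for info in article.findall("articleinfo"):
        article.remove(info)
    return article


def remove_keep_tail(parent, child):
    """Remove child from parent without losing the text that follows it."""
    tail = child.tail or ""
    index = list(parent).index(child)
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def drop_empty_containers(root, tags=("para", "literallayout")):
    """Drop childless, whitespace-only elements; repeat until nothing changes."""
    changed = True
    while changed:
        changed = False
        for parent in list(root.iter()):
            for child in list(parent):
                if (child.tag in tags and len(child) == 0
                        and not (child.text or "").strip()):
                    remove_keep_tail(parent, child)
                    changed = True


def trim_edges(elem):
    """Strip the leading and trailing whitespace of an element's content."""
    if elem.text:
        elem.text = elem.text.lstrip()
    children = list(elem)
    if children:
        if children[-1].tail:
            children[-1].tail = children[-1].tail.rstrip()
    elif elem.text:
        elem.text = elem.text.rstrip()


def section_id(title, prefix="sn-"):
    """'sn-' plus the title with '_' instead of spaces (whitespace runs collapsed)."""
    # Punctuation that is not valid in an XML ID is NOT filtered here.
    return prefix + "_".join(title.split())


def assign_section_ids(root):
    """Set id on every <section> that has a <title>."""
    for section in root.iter("section"):
        title = section.find("title")
        if title is not None:
            section.set("id", section_id("".join(title.itertext())))


def clean(xml_text):
    """Apply the steps in order and return the chapter as a string."""
    chapter = book_to_chapter(ET.fromstring(xml_text))
    drop_empty_containers(chapter)
    for title in chapter.iter("title"):
        trim_edges(title)
    for para in chapter.findall(".//listitem/para"):
        trim_edges(para)
    assign_section_ids(chapter)
    return ET.tostring(chapter, encoding="unicode")
```

The result of `clean()` would still need to go through a formatter such as xmlformat or xmllint, and the contextual markup replacements ('emphasis', 'code', 'programlisting') remain manual because they depend on meaning rather than structure.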


<section>

Notes

  1.         <para>
              <para>
                <para>
                  <inlinemediaobject>
                    <imageobject>
                      <imagedata contentwidth="35px" fileref="http://fedoraproject.org/w/uploads/a/a\
    4/Idea.png" scalefit="1" width="35px" />
                    </imageobject><caption>
                      <para />
                    </caption>
                  </inlinemediaobject>
                </para><para>
                  <emphasis> Visit <ulink url="http://docs.fedoraproject.org/release-notes/">http://\
    docs.fedoraproject.org/release-notes/</ulink> to view the latest release notes for Fedora, espec\
    ially if you are upgrading.</emphasis><literallayout>
    </literallayout>If you are migrating from a release of Fedora older than the immediately previou\
    s one, you should refer to older Release Notes for additional information. You can find older Re\
    lease Notes at <ulink url="http://docs.fedoraproject.org/release-notes/.">http://docs.fedoraproj\
    ect.org/release-notes/.</ulink>
                </para>
              </para>
            </para>
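
The admonition conversion for output shaped like the sample above could be sketched as follows. This is an illustration under assumptions, not the agreed procedure: it treats a 'para' whose first child is a 'para' holding only an 'inlinemediaobject' as an admonition wrapper, and it picks the DocBook element from the icon's file name. The mapping from Idea.png to 'tip' is a placeholder, since the notes do not record which three admonitions are in use. The function names are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET

# Placeholder mapping (an assumption): icon file name -> DocBook admonition element.
ICON_TO_ADMONITION = {"Idea.png": "tip"}


def icon_name(para):
    """Return the image file name if para holds only an inlinemediaobject, else None."""
    if (para.text or "").strip() or len(para) != 1:
        return None
    media = para[0]
    if media.tag != "inlinemediaobject" or (media.tail or "").strip():
        return None
    image = media.find("imageobject/imagedata")
    if image is None:
        return None
    return posixpath.basename(image.get("fileref", ""))


def convert_admonitions(root, mapping=ICON_TO_ADMONITION):
    """Rename each icon-led wrapper <para> to its admonition and drop the icon <para>.

    Returns the number of wrappers converted.
    """
    converted = 0
    for wrapper in list(root.iter("para")):
        children = list(wrapper)
        if not children or children[0].tag != "para":
            continue
        name = icon_name(children[0])
        if name not in mapping:
            continue
        wrapper.remove(children[0])
        wrapper.tag = mapping[name]
        converted += 1
    return converted
```

Any extra outer 'para' wrapper, as in the sample, is left in place and would still need to be removed by hand or in a later pass.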