Transforming Red Hat's comps.xml File
There are five and a half versions of the comps.xml file transformation here. The most useful (lightweight) is the fourth version, in English. For other languages, see the fifth version.
Note that the browsable, generated HTML is always color-coded yellow in the linked file tables for each version.
There are other interesting aspects of visualizing the comps.xml file not yet addressed here; for example:
Of course, maybe this is all just a ruse. Maybe I just wanted to give some webbots a _really_ good exercise -- nothing like thousands of links to keep those puppies busy! he he he.
To play with the comps.xml file locally you can make a copy of the file from the install CD-ROM at RedHat/base/comps.xml.
Note that starting with Fedora Core and Enterprise 3, the full comps.xml file is not on the CD. You must generate it using getfullcomps.py. This utility will whine about missing packages (mostly to do with alternate architectures). These messages can be ignored:
$ cp -p /path/to/top/Fedora/base/comps.xml .
$ rpm2cpio /path/to/top/Fedora/RPMS/comps-extras-9.0.3-2.noarch.rpm | \
    cpio -imvur --no-absolute-filenames ./usr/share/comps-extras/getfullcomps.py
$ chmod u+x ./getfullcomps.py
$ ./getfullcomps.py comps.xml /path/to/top "" > compsfull.xml
# now edit comps.xml and insert the file compsfull.xml at the end, before the
# closing </comps> tag
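The manual edit described in the comment above could also be scripted. What follows is only a minimal sketch, assuming the closing </comps> tag sits on a line of its own and that compsfull.xml holds a fragment rather than a complete document. The helper name merge_comps and the output name comps-merged.xml are made up for illustration; neither comes from comps-extras.

```shell
#!/bin/sh
# Sketch: splice an extra file into comps.xml just before the closing tag.
# merge_comps is a hypothetical helper name, not part of comps-extras.
# Assumption: the closing </comps> tag is on a line of its own.
merge_comps() {
    comps=$1    # the original comps.xml
    extra=$2    # output of getfullcomps.py (compsfull.xml)
    out=$3      # where to write the merged result

    awk -v extra="$extra" '
        /<\/comps>/ {
            # copy the extra file verbatim ahead of the closing tag
            while ((getline line < extra) > 0) print line
            close(extra)
        }
        { print }
    ' "$comps" > "$out"
}

# Example call (file names as used in the text, output name is made up):
#   merge_comps comps.xml compsfull.xml comps-merged.xml
```

Writing to a separate output file keeps the original comps.xml untouched, so the merge can be repeated if getfullcomps.py is run again.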
Then you need to include the XSLT stylesheet file by adding a reference at the top of the XML file; for example:
<?xml version="1.0"?>
<?xml-stylesheet href="comps.xsl" type="text/xsl" ?>
(Note that you don't need the stylesheet reference above if you only use xalan, i.e. if you decided not to clobber your browser by loading the XML file directly with the stylesheet reference.)
You may also want to remove the <!DOCTYPE ..> reference for xalan-java, which in my experience doesn't care for it.
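Both edits can be done by hand, but here is a rough sketch of how they might be scripted. It assumes the XML declaration is the first line of the file, that no stylesheet reference is present yet, and that the DOCTYPE declaration sits on a single line of its own with no internal subset. The function names add_stylesheet_ref and strip_doctype are hypothetical.

```shell
#!/bin/sh
# add_stylesheet_ref FILE XSL (hypothetical helper): print FILE with an
# xml-stylesheet processing instruction added right after the XML
# declaration, which is assumed to be on line 1.
add_stylesheet_ref() {
    awk -v xsl="$2" '
        { print }
        NR == 1 && index($0, "<?xml ") == 1 {
            printf "<?xml-stylesheet href=\"%s\" type=\"text/xsl\" ?>\n", xsl
        }
    ' "$1"
}

# strip_doctype FILE (hypothetical helper): print FILE without its DOCTYPE.
# Assumption: the declaration occupies one whole line, no internal subset.
strip_doctype() {
    sed '/^[[:space:]]*<!DOCTYPE[^>]*>[[:space:]]*$/d' "$1"
}

# Example calls (output names are made up):
#   add_stylesheet_ref comps.xml comps.xsl > comps-browser.xml
#   strip_doctype comps.xml > comps-xalan.xml
```

Keeping the two steps separate matches the two use cases: the stylesheet reference matters only for loading the file in a browser, while dropping the DOCTYPE matters only for xalan-java.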
The files used to transform the XML file are linked below. I have also added the slightly modified comps.xml file for your convenience. Of course, this XML file is large, at more than 600 KB. Not surprisingly, the generated comps.html file is also rather large.
You will want to shift-click (for Mozilla/Netscape) to pick up the files.
I decided to gzip/zip up the comps.xml file. This will save you the pain of accidentally clicking on it and sending your browser into a spin while it tries to present this large transform to you. This is especially true for the 2nd and 3rd versions of the XSLT file; they are fairly complex...
|comps.xml.gz||72068||the slightly modified Red Hat 9 comps file in (GNU) gzip format|
|comps.xml.zip||72190||the slightly modified Red Hat 9 comps file in ZIP format|
|comps.xsl||9646||the XSLT style sheet used in this example|
|comps.css||676||the CSS style sheet used in this example|
|comps.html||417964||the xalan-generated HTML file that allows you to browse the comps.xml file|
This time we set a global language parameter called lg at the top of the file. It is empty by default, which selects English. By passing this parameter from xalan we can loop through the list of supported languages and generate HTML that uses the group names and group descriptions for the specified language.
for i in cs da de es fr is it ja ko no pt ru sv zh_CN zh_TW ; do
    echo $i:::
    xalan -IN comps.xml -XSL comps-lang.xsl -PARAM lg $i -HTML -OUT comps-lang-$i.html
done
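The language list in the loop above is typed in by hand. As a rough sketch, it could instead be derived from the file itself, assuming the translated group names and descriptions in comps.xml are marked with xml:lang attributes. The function name list_langs is hypothetical.

```shell
#!/bin/sh
# list_langs FILE (hypothetical helper): print each distinct language code
# found in an xml:lang="..." attribute in FILE, one per line.
# Assumption: translations are tagged like <name xml:lang="de">...</name>.
list_langs() {
    # Put every tag at the start of its own line so that each line
    # carries at most one xml:lang attribute, then pull out the code.
    tr '<' '\n' < "$1" |
        sed -n 's/.*xml:lang="\([^"]*\)".*/\1/p' |
        LC_ALL=C sort -u
}

# Example: feed the result to the xalan loop from the text.
#   for i in $(list_langs comps.xml) ; do
#       xalan -IN comps.xml -XSL comps-lang.xsl -PARAM lg $i -HTML -OUT comps-lang-$i.html
#   done
```

This keeps the loop in step with whatever translations a given comps.xml actually carries.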
|Version 1 (lang)|
|comps-lang.xsl||11223||the XSLT style sheet used in this example|
|comps-lang-cs.html||420072||Czech group names and descriptions|
|comps-lang-da.html||419255||Danish group names and descriptions|
|comps-lang-de.html||418903||German group names and descriptions|
|comps-lang-es.html||419189||Spanish group names and descriptions|
|comps-lang-fr.html||419188||French group names and descriptions|
|comps-lang-is.html||419437||Icelandic group names and descriptions|
|comps-lang-it.html||418623||Italian group names and descriptions|
|comps-lang-ja.html||420155||Japanese group names and descriptions|
|comps-lang-ko.html||419114||Korean group names and descriptions|
|comps-lang-no.html||418431||Norwegian group names and descriptions|
|comps-lang-pt.html||419188||Portuguese group names and descriptions|
|comps-lang-ru.html||422085||Russian group names and descriptions|
|comps-lang-sv.html||419048||Swedish group names and descriptions|
|comps-lang-zh_CN.html||417735||Chinese Simplified group names and descriptions|
|comps-lang-zh_TW.html||417925||Chinese Traditional group names and descriptions|
In the second version of the stylesheet, I first generated a list of RPMs in XML format from the CD-ROM images. The entry for each RPM includes the name, the architecture and the descriptive summary. With a bit of sed the CD-ROM number is also added to the output. Finally the output files are combined into a file called comps-disks.xml, and a container tag is prepended and appended.
The new XSLT style sheet can now use both comps.xml and comps-disks.xml to generate a more detailed, but also much larger, HTML file. For each RPM package, the resulting HTML file now shows the descriptive summary, the architecture and the CD on which the RPM is found.
Additionally some 'quick' links :-) are added, turning the file into a click-here jamboree! The following table lists the new set of files. Note that the comps.xml file above is still needed, as is the CSS stylesheet file, comps.css. You will also need to edit the .xml file to reference the XSLT stylesheet that you wish to use.
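The steps just described might look roughly like the sketch below. It is not the genrpmlist script itself: the element names (rpmlist, rpm, cd, name, arch, summary), the function names and the mount points in the usage comment are all assumptions made for illustration. Only rpm -qp --queryformat with the NAME, ARCH and SUMMARY tags is standard rpm usage.

```shell
#!/bin/sh
# Sketch of building an RPM list in XML. All element and function names
# here are hypothetical; the real genrpmlist script may differ.

# xml_escape: escape &, < and > on stdin so free text is safe inside XML.
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# rpm_record NAME ARCH SUMMARY: print one <rpm> element on a single line.
rpm_record() {
    printf '<rpm><name>%s</name><arch>%s</arch><summary>%s</summary></rpm>\n' \
        "$1" "$2" "$(printf '%s' "$3" | xml_escape)"
}

# add_cd N: the "bit of sed" step, tagging every record on stdin with CD N.
add_cd() {
    sed "s|<rpm>|<rpm><cd>$1</cd>|"
}

# wrap_list: surround the records on stdin with a container element.
wrap_list() {
    echo '<rpmlist>'
    cat
    echo '</rpmlist>'
}

# list_cd N DIR: query every RPM file under DIR and tag it with CD number N.
# Three rpm calls per file is slow but keeps the sketch easy to follow.
list_cd() {
    for f in "$2"/*.rpm ; do
        [ -e "$f" ] || continue
        rpm_record "$(rpm -qp --queryformat '%{NAME}' "$f")" \
                   "$(rpm -qp --queryformat '%{ARCH}' "$f")" \
                   "$(rpm -qp --queryformat '%{SUMMARY}' "$f")"
    done | add_cd "$1"
}

# Example (mount points are made up):
#   { list_cd 1 /mnt/cd1/RedHat/RPMS ; list_cd 2 /mnt/cd2/RedHat/RPMS ; } |
#       wrap_list > comps-disks.xml
```

Escaping the summary text matters because package summaries can contain characters such as & that would otherwise break the XML parser during the transform.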
|genrpmlist||2014||the shell script used to extract the RPM names, architectures and summary descriptions.|
|comps-disks.xml.gz||29753||the generated Red Hat 9 CDROM file list in (GNU) gzip format|
|comps-disks.xml.zip||29881||the generated Red Hat 9 CDROM file list in ZIP format|
|comps-v2.xsl||13813||the XSLT style sheet used in this example|
|comps-v2.html||642877||the xalan-generated HTML file that allows you to browse the comps.xml file|
In the third version of the stylesheet, I added the source RPMs to the XSLT style sheet of version 2 above. The script 'extractSpecFiles' was used to extract the .spec files and to generate the necessary XML data.
Though most of the data can be generated automatically with this script, there are a few minor problems with about 2% of the .spec files that require some manual intervention:
In the version 3 generated HTML you will find links to the .spec files. If you click on a spec link, a new window will open and present the spec file. There is also an HTML-ized link that shows up as (html) if you like the Christmas-tree look :o}. The HTML-ized files were generated with a wee for-loop and the never-ending capacity of gvim to entertain me:
for i in *.spec ; do
    echo $i:::
    gvim -c ':let html_use_css = 1' -f +"syn on" +"run! syntax/2html.vim" +"wq" +"q" "$i"
    sleep 2
done
I also needed to edit the title tag to prevent it from telling the whole world the full real path to the file on the web server:
for i in *.html ; do
    echo $i:::
    sed 's%<title>/path/to/specs/%<title>%' <$i > $i.new
    /bin/mv $i.new $i
done
Note that the CD # does not yet show up for the source package section.
|extractSpecFiles||5594||the shell script used to extract the SOURCE RPM names, specfiles and summary descriptions.|
|srpms.xml.gz||40482||the generated Red Hat 9 SRPMS file list in (GNU) gzip format|
|srpms.xml.zip||40604||the generated Red Hat 9 SRPMS file list in ZIP format|
|comps-v3.xsl||17097||the XSLT style sheet used in this example|
|comps-v3.html||1165061||the xalan-generated HTML file that allows you to browse the comps.xml file|
The stylesheet generates the main html page, as in previous versions. However for the detailed 'package' and 'sources' templates three sets of PHP files are produced:
Additional features implemented are (and some scripts and xml files now bear a different name from their predecessors):
|jshead.php||1476||the small php function headers file that is included in each of the subsequently generated PHP files|
|comps-disks-desc.xml.gz||143434||the generated Red Hat 9 CDROM file list in (GNU) gzip format|
|comps-disks-desc.xml.zip||143567||the generated Red Hat 9 CDROM file list in ZIP format|
|genrpmlist2||2753||the shell script used to extract the RPM names, architectures, CD numbers, summaries and descriptions.|
|extractSpecFiles2||5988||the shell script used to extract the SOURCE RPM names, specfiles and summary descriptions.|
|srpms2.xml.gz||37789||the generated Red Hat 9 SRPMS file list in (GNU) gzip format|
|srpms2.xml.zip||37912||the generated Red Hat 9 SRPMS file list in ZIP format|
|comps-v4.xsl||20200||the XSLT style sheet used in this example|
|comps-v4.html||241059||the xalan-generated HTML file that allows you to browse the comps.xml file|
The buildLangXML script takes a while to run (about an hour!) and generates many thousands of files using up over 200 MB of disk space. The scripts were clearly not designed to be efficient :-/
The intermediate $lang-comps-disks-desc.xml.zip file for each language is linked under the Description column in the table below.
Only the rpm data is internationalized; I have not attempted to use the specspo data to internationalize the SOURCE package descriptions.
|Version 5 (lang)|
|comps-v4-lang.xsl||23753||the XSLT style sheet used in this example|
|buildLangXML||2589||the script used to generate the cdrom descriptions and the html|
|jshead-lang.php||1550||the small php function headers file that is included in each of the subsequently generated PHP files. It is slightly different from the version 4 file, since it passes the language variable in doHeader()|
|comps-v4-lang-cs.html||249401||Czech (cs) group names, descriptions and specspo data|
|comps-v4-lang-da.html||248782||Danish (da) group names, descriptions and specspo data|
|comps-v4-lang-de.html||248430||German (de) group names, descriptions and specspo data|
|comps-v4-lang-es.html||248716||Spanish (es) group names, descriptions and specspo data|
|comps-v4-lang-fr.html||248715||French (fr) group names, descriptions and specspo data|
|comps-v4-lang-is.html||248964||Icelandic (is) group names, descriptions and specspo data|
|comps-v4-lang-it.html||248150||Italian (it) group names, descriptions and specspo data|
|comps-v4-lang-ja.html||249682||Japanese (ja) group names, descriptions and specspo data|
|comps-v4-lang-ko.html||248641||Korean (ko) group names, descriptions and specspo data|
|comps-v4-lang-no.html||247958||Norwegian (no) group names, descriptions and specspo data|
|comps-v4-lang-pt.html||248715||Portuguese (pt) group names, descriptions and specspo data|
|comps-v4-lang-ru.html||251612||Russian (ru) group names, descriptions and specspo data|
|comps-v4-lang-sv.html||248575||Swedish (sv) group names, descriptions and specspo data|
|comps-v4-lang-zh_CN.html||253565||Chinese Simplified (zh_CN) group names, descriptions and specspo data|
|comps-v4-lang-zh_TW.html||253755||Chinese Traditional (zh_TW) group names, descriptions and specspo data|
|This web site hosted at:||Triumf - Canada|
|Created:||21 Feb 2003 @ 18:53 GMT|
|Last modified:||28 Sep 2003 @ 14:30 GMT|