Inasmuch as I’m going to be authoring test files for the XSLT 3.0 testbed, I wanted an easy way to use Wendell’s oXygenJATSFramework. The easiest way would have been as an Oxygen framework add-on, but there wasn’t one yet. Since (unsigned) framework add-ons can be Zip archives and I’d recently downloaded archived releases from other GitHub projects, I did some experimenting and got GitHub to host the framework descriptor and the framework add-on itself.
Inasmuch as the offer’s open a bit longer, you can still use the XML13 discount code for a 10% discount on your fees for this year’s XML Summer School.
This year, with Patricia Walmsley, I’m teaching Improving stylesheets through the use of advanced features in the XSLT and XQuery track:
Most XSLT developers stick to a familiar core set of XPath and XSLT instructions and functions. There are a number of advanced features, many of them introduced in version 2.0, that appear only rarely in stylesheets even though they can be very useful in certain situations. In this course, we will explore some of these less-used features, showing interactively how they can be applied to improve existing XSLT stylesheets. XPath features covered will include operators like >>, quantified expressions, and 2.0 functions that can significantly simplify your stylesheets. XSLT features covered will include grouping, regular expression matching and advanced modularization using modes and instructions like
Other sessions in the track are XSLT Efficiency and Effectiveness taught by Michael Kay, Querying XML Databases with XQuery taught by Adam Retter, and Trends in XSLT/XQuery taught by Florent Georges, while the other tracks during the week are XML Primer, Hands-on Introduction to XML, Publishing With XML, Semantic Technologies, Trends and Transients, and Hands-on Web Publishing. There are also the evening events, such as punting, the formal dinner, and the unconference session, where I'm lined up to present a lightning talk.
One, in the XSLT and XQuery track, is an update of the Developing and Testing in XSLT talk, again alongside Jeni Tennison, that got us such a good review last year:
Unit tests, profiling, debugging and, increasingly, test-driven development are part of the bread and butter of working with other programming languages but are not always so with XSLT or XQuery. In test-driven development, which is a fundamental part of agile approaches to software development, the developers write tests that describe the desired behaviour of their application, then write code that meets the tests. This style of development keeps code focused, avoids breaking existing code and facilitates refactoring.
In this session, Jeni Tennison and Tony Graham will describe both the state of the art in testing and debugging XSLT and XQuery and how test-driven development applies to XSLT and XQuery development. In particular, they will focus on the use of the XSpec testing framework.
The other, in the Publishing track, is XML and Publishing Workflows:
Some formats are better or worse than others for capturing and/or representing the information for publishing purposes. Can you create and manage life-cycle workflows which rationalise or regularise mixes of formats using XSLT and other XML toolsets? Should XML be the beginning of your publishing workflow, the hub format in the middle, the result, or all three? How can XSLT and related tools be used to cover up the deficiencies or excesses of the source XML? What are the arguments for moving authors towards submitting in XML (or not)? For moving editors?
Incorporating both live examples and war stories, Tony Graham will lead an examination of XML in publishing workflows, the advantages and disadvantages of using XML at each stage, and some of the tools and techniques available to you.
XML Summer School 2012 is on September 16–21 2012 at St Edmund Hall, Oxford University.
Inasmuch as I’d been threatening since the XML Summer School last year to do it, I’ve made a custom Ant task for running XML Calabash, currently only in my fork at firstname.lastname@example.org:MenteaXML/xmlcalabash1.git.
You can use this task to process:
- A single input file to produce a single output file
- A set of input files, processed one at a time, to produce a set of output files
- Multiple input files as the input to one XProc input port processed to produce a single output file
- Any of the above with additional input ports to each of which are applied one or more input files whose file names may be either fixed or mapped from the name(s) of the current main input file(s)
- Any of the above with additional output ports whose file names may be either fixed or mapped from the name(s) of the current main input file(s)
- Any of the above with Ant defaulting to not running the pipeline when the outputs are already up-to-date compared to the inputs and the pipeline
You can also specify options and parameters to be used by the pipeline.
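The "don't run when the outputs are already up-to-date" behaviour in the last list item can be sketched in a few lines. This is a minimal illustration in Python rather than the Ant task's actual Java code, and it assumes a plain modification-time comparison; the function name `needs_run` and its parameters are made up for the example.

```python
import os


def needs_run(inputs, outputs, pipeline):
    """Return True when the pipeline should be run.

    Hypothetical helper: the pipeline runs if any output is missing,
    or if any input or the pipeline file itself is newer than the
    oldest output.
    """
    outputs = list(outputs)
    # With nothing to compare against, err on the side of running.
    if not outputs or not all(os.path.exists(o) for o in outputs):
        return True
    sources = list(inputs) + [pipeline]
    newest_source = max(os.path.getmtime(s) for s in sources)
    oldest_output = min(os.path.getmtime(o) for o in outputs)
    return newest_source > oldest_output
```

Including the pipeline file among the sources matters: editing the pipeline should invalidate every output, even when no input document has changed.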
Inasmuch as it’s useful, when editing an Ant build file, to have a list of the targets in the file and the ability to jump to any of them, my Ant mode at email@example.com:tkg/ant-mode.git currently only does two things: make an “Ant” menu that lists all the targets and associate a RELAX NG compact syntax schema with build files.
Inasmuch as, back in January, I was teaching another XML course, I reviewed the basis for draconian error handling in XML in light of the sea change in recent years towards HTML5-style completely-defined error recovery.
At the time of the draconian error handling decision, I was on the larger “W3C SGML Working Group” mailing list that provided input, clamour, and distraction to the core “W3C SGML Editorial Review Board” that did the work and made the decisions on the road to XML. I followed the discussions on the mailing list at the time (as much as humanly possible), and the message about this that stuck in my mind is the “ERB votes on error handling” message from Tim Bray on behalf of the ERB, particularly this section:
2. We have a strong political reality to deal with here in that for the first time, the big browser manufacturers have noticed XML and have together made a strong request: that error-handling be completely deterministic, and that browsers not compete on the basis of excellence in handling mangled documents. It was observed that if they wanted to do this, they could just do it; but then pointed out that this is exactly why standards exist – to codify the desired practices shared between competitors. In any case, if we want XML to succeed on the Web, it will be difficult to throw the first serious request from M & N back in their face.
Inasmuch as the Wisent parsing and other CEDET/Speedbar/Semantic goodness for RELAX NG compact syntax files that I’m currently working on may not be ready for prime time for a while, here’s something to add to your
`flymake' runs Jing in the background to find syntax errors in your RELAX NG compact syntax files:
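(require 'flymake)

;; Build the Jing command line for the current buffer.
(defun flymake-rnc-init ()
  ;; Check a temporary copy of the buffer made beside the original file.
  (let* ((temp-file (flymake-init-create-temp-buffer-copy
                     'flymake-create-temp-inplace))
         (local-file (file-relative-name
                      temp-file
                      (file-name-directory buffer-file-name))))
    ;; "-c" tells Jing that the schema uses the compact syntax.
    (list "jing" (list "-c" local-file))))

;; Use the function above for files whose names end in ".rnc".
(add-to-list 'flymake-allowed-file-name-masks
             '(".+\\.rnc$"
               flymake-rnc-init
               flymake-simple-cleanup
               flymake-get-real-file-name))

;; Turn on flymake whenever rnc-mode starts.
(add-hook 'rnc-mode-hook 'flymake-mode)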
Inasmuch as it exists as a PDF file, you, too, can have your own copy of my “Schematron Testing Framework” (stf) poster from XML Prague 2012. I’m happy to say that I received constructive comments about stf from people at XML Prague 2012 who read the poster, and I’ll be looking at incorporating the feedback in the near future.
One suggestion, from George Bina, was to make a single “framework” file for running the tests – and including the test files in the framework file either directly or by using XInclude to refer to external test files – rather than the current decentralised approach. A single framework file would make it easier to make a report of the results, unlike the current approach where the idea is that the only report you really want to see is “<errors/>” when there are no more errors. A single framework file could also become very large and hard to navigate when there are lots of very similar tests in it. What do you think?
Inasmuch as a suite of Schematron tests contains many contexts where a bug in a document will make a Schematron assert fail or a report succeed, it follows that for any new test suite and any reasonably sized but buggy document set, there will straight away be many report messages produced by the tests. When that happens, how can you be sure your Schematron tests all worked as expected? How can you separate the expected results from the unexpected? What’s needed is a way to characterise the Schematron tests before you start as reporting only what they should, no more, and no less.
stf (https://github.com/MenteaXML/stf) is an XProc pipeline that runs a Schematron test suite on test documents (that you create), winnows out the expected results, and reports just the unexpected. stf uses a processing instruction (PI) in each of a set of (typically, small) test documents to indicate the test’s expected reports: the expected results are ignored, and all you see is what’s extra or missing. And when you have no more unexpected results from your test documents, you’re ready to use the Schematron on your real documents.
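The winnowing idea can be sketched outside XProc. The following Python is only an illustration, not how stf is implemented: the `<?expect …?>` PI and its space-separated list of message ids are a hypothetical syntax invented for the example, as are the function names. stf's real PI format is described in its own documentation.

```python
from collections import Counter
from xml.dom import Node, minidom


def expected_ids(xml_text, target="expect"):
    """Count the message ids listed in top-level <?expect ...?> PIs.

    The PI target and its contents are hypothetical, not stf's syntax.
    """
    doc = minidom.parseString(xml_text)
    counts = Counter()
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == target):
            counts.update(node.data.split())
    return counts


def winnow(expected, actual):
    """Drop the expected results; return (extra, missing) as Counters."""
    extra = actual - expected    # fired, but the PI did not predict it
    missing = expected - actual  # predicted by the PI, but never fired
    return extra, missing
```

Using counts rather than plain sets means a message that should fire twice but fires once still shows up as missing. A test document passes when both returned counters are empty.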