Reporting Done With a Local Web Stack
How I managed to streamline my local town association's reporting needs using HTML5, CSS3, paged.js, headless Chrome (through Puppeteer Sharp), Microsoft's Razor Engine, and ASP.NET Core.
Last updated
Being on the committee of an association puts you in charge of a lot of paperwork. Publishing regular reports on the whats and whys of the committee's decisions is an important legal obligation. Taking the minutes of all sorts of meetings and producing reports from them is a common way to account to the members of the association for the decisions that have been made.
Unfortunately, producing reports of high quality is still a time-consuming effort. Luckily, today's world of information technology allows us to choose from a plethora of tools that help us streamline the entire process.
In this article I'd like to show you the production setup I have put into practice over the past weeks to produce reports that are visually appealing and yet easy to handle for those producing them. In short: they require little more than editing a simple text file (preferably during such meetings or shortly after) and allow you to distribute a well-designed report at the click of a button.
A little background on my case: I'm responsible for taking the minutes of our local town association. With one to two meetings per month and two to three hours per meeting, there's quite a lot of paperwork to be done.
Working in IT, I see it as my duty to produce high-quality reports. At the same time I know how time-consuming it is to create a decent print layout. It starts with taking notes and making sketches during the meetings, which are then put into custom-made document templates, usually using a UI-based word processor.
Unfortunately, these reports often go through multiple iterations, so you are never really done with them quickly. In my case, a tentative pre-print version is sent to the committee for corrections and additions. Once these have been processed, the document is reviewed once more before it goes to print (or is sent as a PDF to the members of the association via e-mail).
What is more, many parts of the given process are dull and repetitive. Naturally, wrangling with the layout in a UI-based word processor while rearranging, adding or deleting paragraphs is prone to errors.
Usually this overhead could be reduced by opting for a simpler setup. Markdown converted to HTML would be a feasible solution, but it falls short when it comes to visually appealing layouts.
What is needed is an integrated print production chain that can handle simple notes and produce top-notch PDF reports from them with the click of a button. Ideally, those notes shouldn't care about layout issues; they should only contain the text needed to create such reports. Layout issues should be dealt with by a template that can be reused for all future reports.
The solution I was aiming for had to meet the following requirements:
it should be based on text (source) files to allow for the use of any version control system (preferably Git);
it should use a simple, semantic markup to allow for simple note taking which is flexible enough so that it can be used as the starting point for producing final reports without having to transform the notes into another (interim) format;
it should have a simple yet powerful template engine so as to use ready-made layout templates or create new templates for special occasions;
it should produce print-ready, high quality PDF files;
it should be highly integrated and automated; ideally, I would want to feed the final source file (with a reference to the template file) into a program and be passed back the final PDF report.
This is what I have come up with:
I have opted for an XML-based source format that allows me to take notes and parameterize the final outcome according to my needs. I use a template file with the essential boilerplate code for each new report and just fill in the details as I go. This is especially useful for frequently recurring bits of information such as attendance lists, agenda items, action items, and meta information (e.g. time and venue of the meeting).
The semantic structure of the generated report file is controlled by an HTML template file enriched with declarative Razor Syntax.
The layout specifics are handled by print-optimized CSS3. In order to give the layout a professional look and provide for special "print edge-cases", I had to look for improved print-support.
After all, things like headers, footers, counters, handling of page breaks (content fragmentation, pagination), etc. may all be irrelevant in terms of web-only publications but make a huge difference when it comes to print production (page layout). So, what CSS3 cannot do out of the box is taken care of by the awesome Paged.js library, "a free and open-source library that paginates any HTML content to produce beautiful print-ready PDF".
And this is how it all comes together. In short, the process goes like this: XML source file in, generated PDF file out. The specifics are explained below.
The user feeds the XML source file (together with a reference to the template file) into a simple console application (based on C# and .NET Core).
The console application creates a "live object" of the XML source and runs this through the templating engine. I am using Microsoft's Razor Engine (through Ivan Balan's RazorLight) for the Razor processing. The template is enriched with various resource files such as the Paged.js source and images that may be part of the template. The entire process returns the final HTML that is then saved locally.
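To make this step more tangible, here is a minimal sketch of how the XML source could be turned into a model object and rendered with RazorLight. It is not my actual code: the element and attribute names (minutes, venue, attendee, item, title), the model classes, and the template file name are hypothetical and only serve as an illustration. It assumes .NET 6 or later and the RazorLight NuGet package.

```csharp
// Sketch only: XML vocabulary, class names and file names are hypothetical.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.Linq;
using RazorLight; // NuGet package "RazorLight"

// Plain model classes that a Razor template can bind to via @Model.
public sealed class AgendaItem
{
    public string Title { get; set; } = "";
    public string Notes { get; set; } = "";
}

public sealed class Minutes
{
    public string Version { get; set; } = "";
    public string Venue { get; set; } = "";
    public List<string> Attendees { get; set; } = new();
    public List<AgendaItem> Items { get; set; } = new();

    // Turns the XML source into a "live object".
    public static Minutes Parse(string xml)
    {
        XElement root = XDocument.Parse(xml).Root!;
        return new Minutes
        {
            Version = (string?)root.Attribute("version") ?? "",
            Venue = (string?)root.Element("venue") ?? "",
            Attendees = root.Elements("attendee")
                            .Select(e => e.Value.Trim())
                            .ToList(),
            Items = root.Elements("item")
                        .Select(e => new AgendaItem
                        {
                            Title = (string?)e.Attribute("title") ?? "",
                            Notes = e.Value.Trim(),
                        })
                        .ToList(),
        };
    }
}

public static class ReportRenderer
{
    // Reads the XML source, renders it through a .cshtml template found in
    // templateDir, and returns the resulting HTML as a string.
    public static async Task<string> RenderAsync(
        string xmlPath, string templateDir, string templateName)
    {
        Minutes model = Minutes.Parse(await File.ReadAllTextAsync(xmlPath));

        var engine = new RazorLightEngineBuilder()
            .UseFileSystemProject(Path.GetFullPath(templateDir))
            .UseMemoryCachingProvider()
            .Build();

        // templateName is relative to templateDir, e.g. "report.cshtml".
        return await engine.CompileRenderAsync(templateName, model);
    }
}
```

A call such as `await ReportRenderer.RenderAsync("minutes.xml", "templates", "report.cshtml")` would return the HTML string, which can then be written to disk next to the Paged.js script and any images the template references.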
The HTML file thus produced (along with any dependencies) is then served by a local web server (using ASP.NET Core and Kestrel) and rendered (consumed) by a headless version of Google Chrome. I use Puppeteer Sharp as a Headless Chrome .NET API for full control over the rendering process.
The final outcome is a high quality, print-ready PDF file, which is then returned to the user.
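The serving and rendering steps could be sketched roughly as follows. Again, this is an illustration under assumptions rather than my production code: the class name, the port, and the directory layout are hypothetical, and it assumes .NET 6 or later with the ASP.NET Core shared framework (for example via Microsoft.NET.Sdk.Web) plus a recent version of the PuppeteerSharp NuGet package.

```csharp
// Sketch only: class name, port and paths are hypothetical.
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.FileProviders;
using PuppeteerSharp; // NuGet package "PuppeteerSharp"

public static class PdfExporter
{
    // Serves siteDir over Kestrel, loads htmlFile in headless Chrome
    // and writes the printed result to pdfPath.
    public static async Task ExportAsync(
        string siteDir, string htmlFile, string pdfPath)
    {
        const string baseUrl = "http://127.0.0.1:5080"; // assumed free port

        // Minimal static file server for the generated HTML and its assets.
        var app = WebApplication.CreateBuilder().Build();
        app.UseStaticFiles(new StaticFileOptions
        {
            FileProvider = new PhysicalFileProvider(Path.GetFullPath(siteDir)),
        });
        app.Urls.Add(baseUrl);
        await app.StartAsync();

        try
        {
            // Downloads a compatible browser build on first use.
            await new BrowserFetcher().DownloadAsync();

            await using var browser = await Puppeteer.LaunchAsync(
                new LaunchOptions { Headless = true });
            await using var page = await browser.NewPageAsync();

            // Wait until the page and its scripts have finished loading.
            await page.GoToAsync(
                $"{baseUrl}/{htmlFile}", WaitUntilNavigation.Networkidle0);

            await page.PdfAsync(pdfPath, new PdfOptions
            {
                PrintBackground = true,   // keep CSS backgrounds in the PDF
                PreferCSSPageSize = true, // honour the page size set via @page
            });
        }
        finally
        {
            await app.StopAsync();
        }
    }
}
```

Note that waiting for network idle does not guarantee that Paged.js has finished paginating a long document; a real setup may need an additional, explicit wait before printing. How best to detect that moment is left open here.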
You can have a look at the generated PDF file here.
I have enriched the entire process with some additional extras that help me keep track of my work and account for changes.
The source and template files (including any third-party references) are version controlled using Git. Additions and corrections in the source XML itself can be accounted for easily. Each modification increases a predefined version counter (version and versiondate in the root element) of the document, which will be visible in the final report's footer. Thus, versioning is a core part of the entire system.
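As an illustration of the version counter, the following sketch bumps the two values with LINQ to XML. It assumes that version and versiondate are attributes on the root element, that version is a plain integer, and that the date is written as yyyy-MM-dd; these details, as well as the class and method names, are assumptions made for the example.

```csharp
// Sketch only: attribute layout and date format are assumptions.
using System;
using System.Globalization;
using System.Xml.Linq;

public static class VersionStamp
{
    // Increments the integer "version" attribute and sets "versiondate"
    // on the root element, then returns the updated XML text.
    public static string Bump(string xml, DateTime today)
    {
        XDocument doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);
        XElement root = doc.Root!;

        // A missing version attribute is treated as version 0.
        int current = (int?)root.Attribute("version") ?? 0;

        root.SetAttributeValue("version", current + 1);
        root.SetAttributeValue(
            "versiondate",
            today.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));

        // Note: XDocument.ToString() does not emit an XML declaration.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

Calling `VersionStamp.Bump(File.ReadAllText("minutes.xml"), DateTime.Today)` before each commit would keep the footer of the report in step with the Git history.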
The entire print process is working smoothly, yet there are many areas of possible future improvements:
I would like to create an integrated environment for source and template file editing with live previews (I imagine Electron could come in very handy here). That way templates could be created and modified easily, and note takers could write their minutes with a real-time preview of the generated file.
PDF file generation takes about five seconds on standard off-the-shelf hardware. There is plenty of room for performance improvements here. I would love to see the entire setup process hundred- or thousand-page documents.
File type support for EPUB or Kindle (MOBI) documents would be an awesome improvement. A plug-in architecture could help extend the use cases of the entire process. I imagine projects like Calibre and Dog Tompson's Epub Creator could help tremendously here.
That's it. I will be happy to go into more specifics in a future post as well as publish the entire source of my setup online. If anyone is interested in the meantime, please drop me a line.