HTML2YAML

In short ...

We want to switch from the HTML files that document the ABINIT input variables to a single large YAML file (abinit_vars.yml). This file has already been generated, to a large extent, automagically by python scripts, and now simply needs to be corrected. From that YAML file, new HTML files can be generated automatically, and one should check that these new HTML files correspond to what we want.

A first group of developers will correct the abinit_vars.yml file (on Wednesday 1 October), while another group of developers will check that the new HTML files correspond to what we want.

A more detailed explanation

The documentation about ABINIT input variables, as seen on the Web page http://www.abinit.org/documentation/helpfiles/for-v7.8/input_variables/keyhr.html, is presently maintained as a set of .html files; see, in the package, the directory doc/input_variables (one index file, called keyhr.html, and thirteen var*.html files). These .html files already have some structure, but they are not convenient to work with, and at present nothing can be done to extract data from them automatically (e.g. examining the default values, or the types, with a view to automatically checking the consistency of the ABINIT sources with the ABINIT documentation).

YAML is as structured as HTML, and is moreover more human-readable. In the long run, ABINIT should rely on YAML instead of HTML for text files that should be both parsable and human-readable. This idea was discussed in Dinard, and then on several occasions, in different groups of developers.

This beautification is the occasion to switch from HTML to YAML and, by the same token, to clean the documentation about input variables.

What has already been done, and the problem

In order to convert the documentation about input variables from HTML to YAML, Yannick has written a series of python scripts, contained in doc/input_variables/YML_vars. You can read the README.txt to see what each script does. You can even try to run them yourself (btw you might need pyyaml http://pyyaml.org/wiki/PyYAML, or to log in on the testf machine of the test farm to run them)… You will see that there is a problem at the second step … Indeed, some fields of the YAML file need specific types of data … which are not in the correct format in the present HTML files !

Examples of incorrect format

(1) The “vartype” field should have specified the type of the variable in the following form:

[integer|real|complex] input_variable (dimensions)   

where (dimensions) is optional. Instead, very often, additional data has been mentioned, or “double precision” has been mentioned (instead of “real”), or … many other variants exist, which prevented the automatic generation of the correct entry in the abinit_vars.yml file.

(2) The “default” fields should typically have contained a numerical value, perhaps preceded by “*” (as an indefinite multiplier) or by a number times “*” (as a definite multiplier). When there is no default value, the string “null” should have been used. Conditionals can be taken into account in the abinit_vars.yml file, but have to be introduced manually (see later).

Again, very often, something more verbose has been mentioned. As an example:

all 0.0's

cannot be treated by the python script, while

  *0.0

would have been treated easily.
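
The expected format of the “default” field can be sketched with a few regular expressions. This is only an illustration of the rule stated above, not the code of the actual conversion scripts: the function name classify_default and the accepted number syntax (including Fortran-style d exponents) are assumptions.

```python
import re

# Assumed number syntax: optional sign, digits, optional fraction,
# optional exponent written with e/E or Fortran-style d/D.
NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eEdD][+-]?\d+)?"
SCALAR = re.compile(NUMBER)
# "*Y" (indefinite multiplier) or "N*Y" (definite multiplier).
MULTIPLE = re.compile(rf"(\d+)?\*({NUMBER})")


def classify_default(text):
    """Return 'null', 'scalar' or 'multiple', or None if the text is not parsable."""
    text = text.strip()
    if text == "null":
        return "null"
    if SCALAR.fullmatch(text):
        return "scalar"
    if MULTIPLE.fullmatch(text):
        return "multiple"
    return None


print(classify_default("*0.0"))       # accepted as a multiplier
print(classify_default("all 0.0's"))  # rejected: needs a manual correction
```

A string that yields None here corresponds to the kind of entry that has to be corrected by hand.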

Out of the 667 input variables, the abi_dict2struct.py python script has identified 223 errors. The list of errors is contained in the file doc/input_variables/YML_vars/err.txt.

An already existing YAML file

Yannick has then gone further, by generating an “incomplete” file abinit_vars.yml , from ABINIT_structvariables.yml. This has been done in 2 steps:

  1. First, the script abi_structclean has generated the abinit_vars_new.yml file. This step “beautifies” the documentation with nice indentation for HTML and corrects some links.
  2. Then, some corrections have been made in abinit_vars_new.yml to generate abinit_vars.yml. These corrections will be described as examples later in this document.

Although it is still incomplete, this file is now suitable for several purposes:

  1. generating automagically the .html files, using the abi_struct2html.py script ;
  2. being viewed with the Abivars.jar Java Previewer, see doc/input_variables/YML_vars/README.txt ;
  3. allowing the AbinitGUI http://forum.abinit.org/viewtopic.php?f=24&t=2240 to benefit from it (to be released soon) ;
  4. serving as the basic file to be modified during the beautification, by hand or thanks to the Abivars.jar previewer.

Description of the work

So, our tasks will be :

  1. To correct the errors detected by the python script in abinit_vars.yml (see the err.txt file)
  2. To check that the final HTML files, generated by abi_struct2html.py are correct and contain the requested information, and to clean and correct the documentation associated with the variables

Task 1 “Correct” will be done in Louvain-la-Neuve, at a coding party starting at 5pm on Wednesday 1 October, with several developers. See section Correct for the share of work.

Task 2 “Check” will be done by other developers, at the earliest when task 1 is finished. These developers will first have to merge the trunk in their branch. See section Check for the share of work.

Correction of the errors

In the “err.txt” file, there are different types of errors, depending on the field on which the parser failed:

  • [error reading default]
  • [error reading dimensions] or [no dimension]
  • [unknown type]

In general, this means that you have to read the old value and try to make it “YML”-compliant in the YML file. In the YML file (abinit_vars.yml), you'll find the fields “errordimensions”, “errordefault” and “errortype”, containing the text that was being parsed when the error occurred. Starting from this text, you should try to put the corresponding value in “dimensions”, “defaultval” and “vartype”.

The errors are also printed in red in the generated HTML files.

If the parser fails, it may simply be due to an “english”, human way of writing information that has not been understood. This is quite simple to tackle.

Sometimes, however, this might be more difficult:

  • For the moment, it is not possible to specify that the value depends on the way Abinit was compiled, so expressions depending on whether NetCDF is activated or depending on the compiler cannot yet be written in the YML format. Please put these expressions in the text description.
  • Sometimes, there are some long descriptions in the default value or in the dimensions. Two cases might occur:
    • The statements are useless: they should be removed. For example, references to old versions of Abinit and obvious statements should be removed.
    • The statements comment on the default value or on the dimensions: they can be used as comments. Each variable has two special fields, commentdims and commentdefault, that can be used for this purpose.

Specifications of the YML file

As values in the YML file, you can specify numbers, strings and arrays, following the standard specification of YAML.

Pay attention to strings. If a value is recognized directly as a string, you don't need quotes (' '). Otherwise, you need to put it between quotes.

For example, if you want to use a link as a value, use '[[varname]]'. If you forget the quotes, [[varname]] will be interpreted as a list, containing a list, containing the string varname, and not as a link !

However, in order to keep the information in the documentation, we have also introduced other types.

!variable

This is the type that contains the other fields.

!multiplevalue

This is the equivalent of the X*Y syntax in Fortran.

  X*Y

will become

  !multiplevalue
    number : X
    value : Y

If X is null, it means that you want to do *Y (all Y)
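
As an illustration of this mapping, the sketch below converts a Fortran-style X*Y string into the number/value pair and expands it into an explicit list. The function names from_fortran and expand are hypothetical, values are assumed to be plain floats, and the target length used when number is null has to be supplied by the caller.

```python
def from_fortran(text):
    """Split 'X*Y' (or '*Y') into the two fields of a !multiplevalue entry."""
    count, star, value = text.strip().partition("*")
    if not star:
        raise ValueError(f"no '*' found in {text!r}")
    # An empty count means '*Y', stored as number: null (None in Python).
    return {"number": int(count) if count else None, "value": float(value)}


def expand(entry, length):
    """Return the explicit list of values; 'length' is only used for '*Y'."""
    repeat = length if entry["number"] is None else entry["number"]
    return [entry["value"]] * repeat


print(from_fortran("3*0.0"))
print(expand(from_fortran("*0.0"), 4))
```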

!range

  !range
     start: 1
     stop: N

Used as a default value, it means that the default value is 1, 2, …, N

!valuewithconditions

This type allows conditions on values to be specified:

!valuewithconditions
    defaultval: -[[diemix]]
    '70 < [[iprcel]] and [[iprcel]] < 80': '[[diemix]]'
    '[[iscf]]<10': '[[diemix]]'
    '[[iprcel]]==0': '[[diemix]]'

defaultval is the default value if no condition is fulfilled. As conditions, please use strings with the most basic expressions, containing <, <=, >, >=, ==, !=, +, -, *, /, etc., to allow for further simple parsing !

As a convention, we write expressions in a “pythonic” way, so you can also use “or”, “and” and “in”, for example varname in [1,2,5]. If you need more advanced expressions, please contact Yannick !
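
To show why simple “pythonic” conditions are easy to process, here is a minimal sketch of an evaluator. It is not the parser used by the scripts or by the GUI: the function names are hypothetical, the whitelist of allowed syntax is an assumption, and so is the rule that the first fulfilled condition wins.

```python
import ast
import re

LINK = re.compile(r"\[\[(\w+)\]\]")  # matches [[varname]]

# Only these syntax elements are accepted in a condition (assumed whitelist).
ALLOWED = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not, ast.USub,
    ast.Compare, ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq,
    ast.In, ast.NotIn, ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Name, ast.Load, ast.Constant, ast.List, ast.Tuple,
)


def condition_holds(condition, values):
    """Evaluate one condition string, given the values of the linked variables."""
    tree = ast.parse(LINK.sub(r"\1", condition), mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"unsupported syntax in {condition!r}")
    code = compile(tree, "<condition>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(values)))


def resolve(entry, values):
    """Return the value of the first fulfilled condition, else defaultval."""
    for condition, value in entry.items():
        if condition != "defaultval" and condition_holds(condition, values):
            return value
    return entry["defaultval"]


entry = {
    "defaultval": "-[[diemix]]",
    "70 < [[iprcel]] and [[iprcel]] < 80": "[[diemix]]",
    "[[iscf]]<10": "[[diemix]]",
    "[[iprcel]]==0": "[[diemix]]",
}
print(resolve(entry, {"iprcel": 75, "iscf": 17}))
print(resolve(entry, {"iprcel": 45, "iscf": 17}))
```

Keeping the conditions within such a small subset of Python is what makes this kind of check possible without a dedicated parser.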

!valuewithunit

This type allows values with units to be specified:

 !valuewithunit
        units: eV
        value: 100.0

means “100 eV”.

Constraints between variables

In the YML file (and via the GUI), there are some constraints between variables that have been introduced.

You can specify “requires: CONDITION” and “excludes: CONDITION” in the YML file (or fill in the fields requires and excludes in the GUI).

If a varname has “requires: CONDITION”, it means that the variable is only relevant when CONDITION is fulfilled.

If a varname has “excludes: CONDITION”, it means that specifying the variable in the input file forbids the CONDITION from being fulfilled.

Some of the constraints have been defined during the coding party of the 01/10/2014 (Correct), please check them carefully and verify that no information has been lost !

Who will do what ?

Correct

Each of the following developers treats a set of errors in the err.txt file (on Wednesday 1 October, 5pm). This set of errors is simply defined by the initial of the input variable in the err.txt file. To treat an error, the easiest way is to use the GUI found in the doc/input_variables/YML_vars directory:

java -jar Abivars.jar

Developer in charge      Range of initials of input variables
Jean-Michel Beuken       [A-E]
Michiel Van Setten       [F-I]
Xavier Gonze             [J-M]
Aurélien Lherbier        [N]
Samuel Poncé             [O-Q]
Gian-Marco Rignanese     [R-T]
Yannick Gillet           [U-V]
Xavier Gonze             [W-Z]
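
To find out quickly who is in charge of a given variable from err.txt, the table above can be turned into a small lookup. This is just a convenience sketch; the function name developer_for is hypothetical and the ranges are copied from the table.

```python
# Share of work copied from the table: (first initial, last initial, developer).
ASSIGNMENTS = [
    ("A", "E", "Jean-Michel Beuken"),
    ("F", "I", "Michiel Van Setten"),
    ("J", "M", "Xavier Gonze"),
    ("N", "N", "Aurélien Lherbier"),
    ("O", "Q", "Samuel Poncé"),
    ("R", "T", "Gian-Marco Rignanese"),
    ("U", "V", "Yannick Gillet"),
    ("W", "Z", "Xavier Gonze"),
]


def developer_for(varname):
    """Return the developer in charge of the errors for this input variable."""
    initial = varname[0].upper()
    for first, last, developer in ASSIGNMENTS:
        if first <= initial <= last:
            return developer
    raise ValueError(f"no developer assigned for initial {initial!r}")


print(developer_for("ecut"))
```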

Presentation files for correction procedure

Check

Each developer treats one HTML documentation file, or more than one file, or a fraction of one file. The task consists in comparing visually (thanks to a browser !) the (old) HTML files in doc/input_variables and the new ones in doc/input_variables/YML_vars/html_output. If a discrepancy is observed, it has to be corrected in the file abinit_vars.yml thanks to the GUI (java -jar Abivars.jar issued in doc/input_variables/YML_vars). This task will start ONLY when the previous task is completed … This is expected on the morning of Thursday 2 October. You will need to merge the newest trunk revision in your branch.

Of course, do not hesitate to correct spelling mistakes, or style, or any other problem in the documentation provided for the input variables.

In case of problem with the GUI, do not hesitate to contact Yannick or Xavier.

Developer in charge          HTML files (with, possibly, the range of initials for input variables)
Bernard Amadon               vardev.html (H-Q)
Gabriel Antonius             vargs.html (A-N)
Jordan Bieder                vardev.html (A-G)
Fabien Bruneval              vargw.html (A-M)
Michel Côté                  vargs.html (O-Z)
Muriel Delaveau              varpar.html
Grégory Geneste              varrlx.html (A-J)
Xavier Gonze                 vargeo.html & varw90.html
Vincent Gosselin             varbas.html
François Jollet              varpaw.html (Pawprtden-Z)
Jonathan Laflamme Janssen    vargw.html (N-Z)
Alexandre Martin             varrlx.html (K-Z)
Micael Oliveira              varfil.html
Yann Pouillon                vardev.html (R-Z)
Marc Torrent                 varpaw.html (A-Pawovlp)
Matthieu Verstraete          varint.html
Bin Xu                       varrf.html
Joe Zwanziger                varff.html

What you have to check carefully (besides re-reading and cleaning)

* Default values:

  • they should reproduce the Abinit behaviour
  • check indefo.F90, indefo1.F90, inkpts.F90, … routines if you hesitate
  • if a comment is really needed, put it into commentdefault ! (Comment in default field in the GUI)
  • sometimes, we had difficulties expressing them in YML, so we might have used comments to express that problem !

* Dimensions:

  • check them carefully
  • if the internal representation is different, put this into commentdims ! (Comment in dimensions field in the GUI)

* Constraints: see above

* Description:

  • Clean the HTML: the equations might have been broken by the python beautifier, please modify them so that the final HTML files are right
  • The image links are broken because the folder has changed (we are now inside doc/input_variables/YML_vars/html_output instead of doc/input_variables)
  • Transform links to variables (<a href="varname">varname</a>) into [[varname]]
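
The transformation of links in the last item can be partly automated. The sketch below is one possible approach and is not part of the existing scripts: the function name links_to_wiki is hypothetical, the href in the example is made up, and only links whose text is a known variable name are converted, so that other links stay untouched.

```python
import re

# An <a href="..."> element whose text is a single word.
ANCHOR = re.compile(r'<a\s+href="[^"]*"\s*>\s*(\w+)\s*</a>', re.IGNORECASE)


def links_to_wiki(html, known_variables):
    """Replace <a href="...">varname</a> by [[varname]] for known variables."""
    def replace(match):
        name = match.group(1)
        # Leave the link as it is when the text is not an input variable.
        return f"[[{name}]]" if name in known_variables else match.group(0)
    return ANCHOR.sub(replace, html)


text = 'See <a href="varbas.html#ecut">ecut</a> for details.'
print(links_to_wiki(text, {"ecut"}))
```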

* Mnemonics: check that they correspond to the name of the variable

* Type: should be integer, real or string. There were some issues in the old HTML files !

Special variables

We have introduced a set of special variables that are needed to express conditions on the default values and on the dimensions. You can use them as links to variables, for example [[NPROC]] for the number of processors.

Here is the current list of such special variables:

  • AUTO_FROM_PSP: Means that the value is read from the PSP file
  • CUDA: True if CUDA is enabled (compilation)
  • ETSF_IO: True if ETSF_IO is enabled (compilation)
  • FFTW3: True if FFTW3 is enabled (compilation)
  • MPI_IO: True if MPI_IO is enabled (compilation)
  • NPROC: Number of processors used for Abinit
  • SEQUENTIAL: True if the code is compiled in sequential mode

Links are also automatically generated towards a new HTML file ABINIT_specials.html.

Documentation

Problems, remarks, comments

If you encounter any problem, please contact Yannick (yannick.gillet@uclouvain.be)

If you need to describe something and you're not able to do it using the current specifications, please contact Yannick !

Also, you can send remarks concerning the GUI, the YML file and/or the HTML files.

By the way, the HTML files inside $YML_VARS/html_output/ are used for the beautification. The effort has been put into the content and not really into the design, that's why they don't look that nice yet !

beauty/2014/html2yaml.txt · Last modified: 2014/11/05 22:53 by Yann Pouillon