[rfc][icedtea-web] get rid of custom@@ markup for documentation

Jacob Wisor gitne at gmx.de
Wed Oct 15 20:21:19 UTC 2014

On 10/15/2014 at 09:57 AM, Jiri Vanek wrote:
> On 10/14/2014 09:02 PM, Jacob Wisor wrote:
>> On 10/14/2014 at 05:35 PM, Jiri Vanek wrote:
>>> This patch replaces all usages of @BOLD...@ with the HTML equivalents <b> or </B>
>>> (case does not matter, nor do spaces inside).
>>> I don't have a better idea how to get rid of it :(
>>> There was one more note on this, to include some escaping. I'm not going to
>>> implement it until it is needed,
>> In my experience, this kind of statement is usually a strong indicator of a
>> bad personal attitude to
>> work. Your argument can be applied almost every time and everywhere. You are
>> delivering incomplete
>> work which is practically useless and in the end causes more problems than
>> it solves.
>>> but if it is needed, then I would go
>>> the same way: have HTML escaping in properties, and get rid of it in
>>> ReplacingFormatter.
>>> The docs look the same with this patch as they looked before.
>> :-\ This is a waste of energy because it is no better than before. I will not
>> even try to have a
>> look at it.
>> How about having methods that strip different kinds of formatting and return
>> plain text at runtime.
>> E.g.
>> .SH Man Title
>> .br -> stripMan() -> Man Title
>> or
>> <h1>XHTML title</h1>/<br/> -> stripXHTML() -> XHTML Title
>> Are there any problems with this approach, except for the fact that the
>> formatting would need to be
>> detected or stored as meta data either in the properties files or in code?
>> Another approach would be to have plain text and formatted text properties in
>> different name spaces,
>> like JavawsParamName.doc.man or JavawsParamName.doc.html for JavawsParamName, or
>> doc.html.JavawsParamName (or any other permutation you like, but consistent).
>> Although this would
>> "double" or maybe even "triple" some properties, it would at least be a clean
>> approach, consistent,
>> and keep all texts in *one* file, which is what you initially wanted to have
>> since you embarked on
>> your "Great Documentation Generator Endeavor".
>> Yet another approach would be to accept only HTML formatted code in the
>> property files and have it
>> converted to man or whatever document format when generated. It should be
>> pretty easy to strip HTML
>> tags from strings in Java. ;-)
> uh... this is exactly what the patch was doing...???...

No, it does not. This would require an HTML validator, or at least call for one.
If we set out to accept only HTML in message property files, then we should also
have a decent HTML validator test. The provided test does not test HTML but some
very specific character sequences which /tend/ to be, almost by accident, a
subset of valid HTML. And although I am not a strong proponent of software tests
(for various reasons), I can see a great benefit in a proper and complete test
in this case, because we have no other way to enforce proper formatting of
property values in message property files, which in turn makes sure that the
document generators will not break. So again, your approach to the problem is
not holistic.
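
As a rough sketch of the kind of check such a test could start from (far short
of a real HTML validator): the class below verifies that a property value uses
only allowed elements, that they are properly nested, and that no stray angle
brackets remain. The class name MarkupCheck, the method isWellFormed(), and the
allowed element set are assumptions made for illustration; none of them is part
of the patch.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MarkupCheck {
    // Assumption: the only elements allowed in property values.
    private static final Set<String> ALLOWED = new HashSet<>(Arrays.asList("b", "br"));
    // Assumption: elements that never take a closing tag.
    private static final Set<String> VOID = new HashSet<>(Arrays.asList("br"));
    // Group 1: "/" of a closing tag, group 2: element name, group 3: "/" of an empty element.
    private static final Pattern TAG =
            Pattern.compile("<\\s*(/?)\\s*([A-Za-z][A-Za-z0-9]*)\\s*(/?)\\s*>");

    /** True if all tags are allowed and properly nested and no unescaped '<' or '>' is left over. */
    public static boolean isWellFormed(String value) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(value);
        int last = 0;
        while (m.find()) {
            // Text between two tags must not contain angle brackets.
            if (hasStrayBracket(value.substring(last, m.start()))) {
                return false;
            }
            last = m.end();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !m.group(1).isEmpty();
            boolean empty = !m.group(3).isEmpty();
            if (!ALLOWED.contains(name) || (closing && empty)) {
                return false;
            }
            if (closing) {
                // A closing tag must match the most recently opened element.
                if (VOID.contains(name) || open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!empty && !VOID.contains(name)) {
                open.push(name);
            }
        }
        return !hasStrayBracket(value.substring(last)) && open.isEmpty();
    }

    private static boolean hasStrayBracket(String text) {
        return text.indexOf('<') >= 0 || text.indexOf('>') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("for <B>javaws </B>and the <B>browser plugin</B>"));
        System.out.println(isWellFormed("for <B>javaws and the browser plugin"));
    }
}
```

A unit test could then iterate over all values of the message property files
and assert isWellFormed() for each of them.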

> diff -r df05d1de5af4 netx/net/sourceforge/jnlp/resources/Messages.properties
> --- a/netx/net/sourceforge/jnlp/resources/Messages.properties	Mon Oct 13 16:05:27 2014 +0200
> +++ b/netx/net/sourceforge/jnlp/resources/Messages.properties	Tue Oct 14 17:25:36 2014 +0200
> @@ -270,13 +270,13 @@
> […]
> -# policyeditor man (note, spaces (especially the one around @@ markup) are important due to man pages markup)
> -PEintro= - view and modify security policy settings for @BOLD_OPEN@javaws @BOLD_CLOSE@and the @BOLD_OPEN@browser plugin@BOLD_CLOSE@
> +# policyeditor man (note, spaces (especially the one around markup) are important due to man pages markup). Only bold  tag is now recognized by RepalcingTextFormatter.
> +PEintro= - view and modify security policy settings for <B>javaws </B>and the <B>browser plugin</B>

Please keep in mind that HTML is white space agnostic in the sense that runs of
consecutive white space characters in content data are collapsed into one space
character (U+0020) before the content is forwarded to a rendering unit (like a
browser or a document generator). So documents that rely on multiple consecutive
white space characters for proper rendering may render distorted.
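
To illustrate the effect, here is a minimal sketch that approximates this
collapsing for a single string. The class name WhitespaceCollapse and the method
collapse() are made up for this example; real HTML processing has exceptions
(preformatted content, for one) that are ignored here.

```java
public class WhitespaceCollapse {
    // Approximates HTML white space collapsing: every run of spaces, tabs,
    // line feeds, form feeds, and carriage returns becomes one U+0020.
    public static String collapse(String text) {
        return text.replaceAll("[ \\t\\n\\f\\r]+", " ");
    }

    public static void main(String[] args) {
        // The extra spaces and the tab in the input do not survive.
        System.out.println(collapse("<B>javaws   </B>and  the\t<B>browser plugin</B>"));
    }
}
```

Any property value whose man page output depends on two or more adjacent spaces
would lose them under this rule.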

> diff -r df05d1de5af4 netx/net/sourceforge/jnlp/util/docprovider/formatters/formatters/ReplacingTextFormatter.java
> --- a/netx/net/sourceforge/jnlp/util/docprovider/formatters/formatters/ReplacingTextFormatter.java	Mon Oct 13 16:05:27 2014 +0200
> +++ b/netx/net/sourceforge/jnlp/util/docprovider/formatters/formatters/ReplacingTextFormatter.java	Tue Oct 14 17:25:36 2014 +0200
> @@ -1,18 +1,25 @@
>  package net.sourceforge.jnlp.util.docprovider.formatters.formatters;
> […]
>  public abstract class ReplacingTextFormatter implements Formatter {
>      public static String backupVersion;
> -
> +    private static final String BOLD_OPEN_REGEX = "<{1}\\s*[Bb]{1}\\s*>{1}";
> +    public static final Pattern BOLD_OPEN_PATTERN = Pattern.compile(BOLD_OPEN_REGEX);
> +    private static final String BOLD_CLOSE_REGEX = "<{1}\\s*/{1}\\s*[Bb]{1}\\s*>{1}";
> +    public static final Pattern BOLD_CLOSE_PATTERN = Pattern.compile(BOLD_CLOSE_REGEX);

Storing static regex pattern Strings *and* compiled Patterns is a waste of 
memory. A regex pattern string can always be retrieved from a Pattern object by 
calling Pattern.pattern(). So you can drop the static Strings.
Besides, I think you do not need the "{1}" quantifiers here. A single character 
/is/ a single character. ;-)
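
A sketch of what I mean, keeping only the compiled Patterns and dropping the
"{1}" quantifiers. The wrapper class BoldPatterns exists only to make the
example compile on its own; in the patch these constants would stay in
ReplacingTextFormatter.

```java
import java.util.regex.Pattern;

public class BoldPatterns {
    // Same matching behaviour as the regexes in the patch, without "{1}"
    // and without separate String constants.
    public static final Pattern BOLD_OPEN_PATTERN = Pattern.compile("<\\s*[Bb]\\s*>");
    public static final Pattern BOLD_CLOSE_PATTERN = Pattern.compile("<\\s*/\\s*[Bb]\\s*>");

    public static void main(String[] args) {
        // The regex source string is still available from the compiled Pattern.
        System.out.println(BOLD_OPEN_PATTERN.pattern());
        System.out.println(BOLD_CLOSE_PATTERN.matcher("< / B >").matches());
    }
}
```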

Speaking of an HTML validator, I also think that we should be tolerant and
accept empty elements too, like "<br/>" or "<br />" etc. Unless you insist
that we accept pure HTML only (and no XHTML).
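
For example, a tolerant pattern for line breaks could look like the sketch
below. The names BreakPattern and LINE_BREAK_PATTERN are hypothetical and not
part of the patch.

```java
import java.util.regex.Pattern;

public class BreakPattern {
    // Hypothetical pattern: accepts the HTML form <br> as well as the
    // XHTML style empty element forms <br/> and <br />, in any case.
    public static final Pattern LINE_BREAK_PATTERN =
            Pattern.compile("<\\s*br\\s*/?\\s*>", Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        for (String s : new String[] {"<br>", "<br/>", "<br />", "<BR/>", "</br>"}) {
            System.out.println(s + " -> " + LINE_BREAK_PATTERN.matcher(s).matches());
        }
    }
}
```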

