P.S.: RFR [9] 8133651: automated replacing of old-style tags in docs

Martin Buchholz martinrb at google.com
Tue Sep 29 21:23:41 UTC 2015

Hi Alexander,

Your change looks good.  It's OK to have manual corrections for automated
mega-changes like this, as long as they all revert automated changes.

Random comments:

Should you publish your specdiff?  I guess not - it would be empty!

            while((s = br.readLine()) != null) {

By matching only one line at a time, you lose the ability to make
replacements that span lines.  Perlers like to "slurp" in the entire file
as a single string.
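
A minimal Java sketch of the "slurp" idea, not the actual SimpleTagEditor
code: read the whole file into one string and match with DOTALL so that a
tag pair split across lines is still found.  The class and method names are
hypothetical, and the {@code $1} replacement is an assumption about the
desired output; nested tags and HTML entities are not handled here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class SlurpSketch {
    // DOTALL lets '.' match line terminators, so <code>...</code> may span lines.
    // The reluctant quantifier (.*?) stops at the first closing tag.
    private static final Pattern CODE =
        Pattern.compile("<code>(.*?)</code>", Pattern.DOTALL);

    // Read the entire file as a single string (works on Java 7 and later).
    static String slurp(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Replace every <code>...</code> pair, including pairs that span lines.
    static String replaceAcrossLines(String text) {
        return CODE.matcher(text).replaceAll("{@code $1}");
    }
}
```

A line-at-a-time loop would never see the opening and closing tags in the
same string when they sit on different lines, so it would skip such pairs.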

        s = s.replace( "<CODE>", tag1);
        s = s.replace( "<Code>", tag1);
        s = s.replace("</CODE>", tag2);
        s = s.replace("</Code>", tag2);

Why not use a case-insensitive regex?
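
Something like the following sketch would cover every capitalization with
two patterns instead of one replace() call per spelling.  The class name is
hypothetical, and tag1/tag2 are passed in because their actual values are
not shown in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class CaseInsensitiveTags {
    // CASE_INSENSITIVE matches <code>, <CODE>, <Code>, <cOdE>, ...
    private static final Pattern OPEN =
        Pattern.compile("<code>", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLOSE =
        Pattern.compile("</code>", Pattern.CASE_INSENSITIVE);

    static String replaceTags(String s, String tag1, String tag2) {
        // quoteReplacement keeps any '$' or '\' in the tags literal.
        s = OPEN.matcher(s).replaceAll(Matcher.quoteReplacement(tag1));
        s = CLOSE.matcher(s).replaceAll(Matcher.quoteReplacement(tag2));
        return s;
    }
}
```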

Here's an emacs-lisp one-liner I've been known to use:

(defun tt-code ()
  (query-replace-regexp "<\\(tt\\|code\\)>\\([^&<>\\\\]+\\)</\\1>" "{@code

With more work, one can automate transformation of embedded things like <

But of course, it's not even possible to transform ALL uses of <code> to
{@code, if there was imaginative use of nested html tags.
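
One conservative option, sketched in Java below, is to convert only the
pairs whose body contains no markup, entities, or backslashes, and to leave
everything else for a human.  The character class mirrors the visible part
of the emacs-lisp pattern above; the replacement string {@code $2} and the
class name are assumptions, since the original replacement text is cut off.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class ConservativeTags {
    // Group 1: tag name; \1 requires the same name in the closing tag.
    // Group 2: body with no '&', '<', '>' or '\', so nested tags and
    // HTML entities never match and are left untouched.
    private static final Pattern SIMPLE =
        Pattern.compile("<(tt|code)>([^&<>\\\\]+)</\\1>");

    static String convert(String text) {
        return SIMPLE.matcher(text).replaceAll("{@code $2}");
    }
}
```

With this pattern, "<tt>foo</tt>" becomes "{@code foo}", while
"<code><i>x</i></code>" and "<tt>a &lt; b</tt>" stay as they are.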

On Tue, Sep 29, 2015 at 3:21 AM, Alexander Stepanov <
alexander.v.stepanov at oracle.com> wrote:

> Updated: a few manual corrections were made (as @linkplain tags display
> nested {@code } literally):
> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch
> -checked with specdiff (which of course does not cover documentation for
> internal packages), no unexpected diffs detected.
> Regards,
> Alexander
> On 9/27/2015 4:52 PM, Alexander Stepanov wrote:
>> Hello Martin,
>> Here is a simple app to replace <code></code> tags with the new-style
>> {@code } one (which is definitely not as elegant as the Perl one-liners):
>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/SimpleTagEditor.java
>> Corresponding patch for jdk and replacement log (~62k of the tag changes):
>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch
>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/replace.log
>> (sorry, I still have to check the correctness of the patch with specdiff,
>> so this is rather a demo at the moment).
>> I don't know whether these changes (cosmetic by nature) are desired for now.
>> Moreover, some of them should probably go to other repos (e.g.,
>> awt, swing -> "client" instead of "dev").
>> Regards,
>> Alexander
>> ----- Original Message -----
>> From: alexander.v.stepanov at oracle.com
>> To: martinrb at google.com
>> Cc: core-libs-dev at openjdk.java.net
>> Sent: Thursday, September 24, 2015 16:06:56 GMT +03:00 Moscow,
>> St. Petersburg, Volgograd
>> Subject: Re: RFR [9] 8133651: replace some <tt> tags (obsolete in html5) in
>> core-libs docs
>> Hello Martin,
>> Thank you for review and for the notes!
>>   > I'm biased of course, but I like the approach I took with
>> blessed-modifier-order:
>>   > - make the change completely automated
>>   > - leave "human editing" for a separate change
>>   > - publish the code used to make the automated change (in my case,
>> typically a perl one-liner)
>> Automated replacement has an obvious advantage: it is fast and massive.
>> But there are some disadvantages at the same time (just IMHO).
>> Using a script, it is quite easy to miss some non-trivial cases, e.g.:
>> - removing unnecessary line breaks, like
>>    * <tt>someCode
>>    * </tt>
>> (which would be better replaced with a single-line {@code someCode});
>> - joining of successive terms, like "<tt>ONE</tt>, <tt>TWO</tt>,
>> <tt>THREE</tt>" -> "{@code ONE, TWO, THREE}";
>> - errors like an extra or missing "<" or ">", e.g. "* <tt>Collection
>> <T></tt>" (there were a lot of them);
>> - some cases when <tt></tt> should be replaced with <code></code>, not
>> {@code } (e.g. because of unicode characters inside of code etc.);
>> - extra tags inside of <tt> or <code> which should be moved outside of
>> {@code }, like <tt><i>someCode</i></tt> or <tt><b>someCode</b></tt>;
>> - simple removal of needless tags, like "<tt>{@link ...}</tt>" ->
>> "{@link ...}";
>> - replace HTML codes with symbols ('<', '>', '@', ...)
>> - etc.
>> - plus some other formatting changes and fixes for misprints which would
>> be omitted during the automated replacement (and wouldn't be done in
>> future manually because there is no motivation for repeated processing).
>> So sometimes it may be difficult to say where the border lies between
>> "trivial" and "human-editing" cases (and the portion of "non-trivial
>> cases" is definitely not minor); moreover, even the automated
>> replacement requires the subsequent careful review before publishing of
>> webrev (as well as by reviewers who probably wouldn't be happy to review
>> hundreds of files at the same time) and iterative checks/corrections.
>> specdiff is very useful for this task but also cannot fully cover the
>> diffs (as some changes are situated in the internal com/... sun/...
>> packages).
>> Moreover, I'm sure that some reviewers would be annoyed with the fact
>> that some (quite simple) changes were postponed because they are "not
>> too trivial to be fixed just now" (because they will suspect they would
>> be postponed forever). So the patch creator would (probably) receive
>> some advice during the review like "please also fix this and that"
>> (which is normal, of course).
>> So my preference was to make the changes package by package (in some
>> reasonable amount of files) not postponing part of the changes for the
>> future (sorry for these boring, repetitive review requests). Please note
>> that all of the above is *rather an explanation of my motivation
>> than an objection* :) (and of course I used some text editor replace
>> automation, which is surely not as advanced as Perl).
>>   > It's probably correct, but I would have left it out of this change
>> Yes, I see. Reverted (please update the web page):
>> http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/index.html
>> Thanks,
>> Alexander
>> P.S. The <tt> replacement job is mostly (I guess, ~80%) complete. But
>> probably this approach should be used if some similar replacement task
>> for, e.g., <code></code> tags would be planned in future (there are
>> thousands of them).
>> On 9/24/2015 6:10 AM, Martin Buchholz wrote:
>>> On Sat, Sep 19, 2015 at 6:58 AM, Alexander Stepanov
>>> <alexander.v.stepanov at oracle.com> wrote:
>>>      Hello,
>>>      Could you please review the following fix
>>>      http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/
>>>      http://cr.openjdk.java.net/~avstepan/8133651/jaxws.00/index.html
>>>      for
>>>      https://bugs.openjdk.java.net/browse/JDK-8133651
>>>      Just another portion of deprecated <tt> (and <xmp>) tags replaced
>>>      with {@code }. Some misprints were also fixed.
>>> I'm biased of course, but I like the approach I took with
>>> blessed-modifier-order:
>>> - make the change completely automated
>>> - leave "human editing" for a separate change
>>> - publish the code used to make the automated change (in my case,
>>> typically a perl one-liner)
>>>      The following (expected) changes were detected by specdiff:
>>>      - removed needless dashes in java.util.Locale,
>>>      - removed needless curly brace in xml.bind.annotation.XmlElementRef
>>> I would do a separate automated "removed needless dashes" changeset.
>>>      Please let me know if the following changes are desirable or not:
>>> http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html
>>> This is an actual change to the behavior of this code - the
>>> maintainers of jconsole need to approve it.  It's probably correct,
>>> but I would have left it out of this change. If you remove it, then I
>>> approve this change.
