Oh no, my URLs disappeared…(and how to get them back)

Recently we received a couple of complaints about the new Web Content article behaviour, specifically about the return value of the JournalArticle.getContent() method. The main problem is this: when developers embed an image into a Web Content article, or use a ddm-image or ddm-documentlibrary field in their structures, they expect to see the URL of the object (the image or the D&M asset) in the raw XML returned by JournalArticle.getContent(). That URL was there in 7.0, where the raw XML looked like this:

(...)

<dynamic-element 
    name="Image8r1v" type="image" index-type="text" instance-id="ryns">
        <dynamic-content 
            language-id="en_US" alt="" name="blonde.png" title="blonde.png" 
            type="journal" fileEntryId="34506" id="34835">
                 /image/journal/article?img_id=34835&amp;t=1531817578959
         </dynamic-content>
</dynamic-element>

(...)

 

There are two main differences in 7.1:
We switched from the internal JournalArticleImage table to the common Documents and Media repository as the storage for Web Content article images.
DDM fields for images and D&M assets changed their internal representation from a URL to a JSON object.

Now the raw XML of an article with images or ddm-image (ddm-documentlibrary) fields looks like this:

(...)


<dynamic-element 
    name="Image54q7" type="image" index-type="text" instance-id="wscg">
        <dynamic-content language-id="en_US">
            <![CDATA[{
                 "groupId":"20124","name":"allatonce.png","alt":"",
                 "title":"allatonce.png","type":"journal",
                 "uuid":"80269faa-dea9-fd5a-cb78-3c7aa9da51ea",
                 "fileEntryId":"36774","resourcePrimKey":"36772"}
            ]]>
        </dynamic-content>
</dynamic-element>


(...)
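If you do have to read this raw format yourself, the field value is now a small JSON object wrapped in CDATA rather than a URL. The following is only a minimal sketch using plain JDK classes, so that it runs outside the portal. RawImageFieldReader and its methods are hypothetical names, and the regular expression only handles flat objects with plain string values such as the sample above. Inside a Liferay module you would parse the value with JSONFactoryUtil instead.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Liferay.
final class RawImageFieldReader {

    // Matches "key":"value" pairs of a flat JSON object whose values are all
    // strings without escaped quotes, as in the 7.1 sample. Not a JSON parser.
    private static final Pattern PAIR =
        Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the trimmed text of the first dynamic-content descendant of the
    // dynamic-element with the given name, or null if no such field exists.
    static String fieldValue(String articleXml, String fieldName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations to avoid external entity expansion.
            factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);
            Document document = factory.newDocumentBuilder().parse(
                new InputSource(new StringReader(articleXml)));
            NodeList elements = document.getElementsByTagName("dynamic-element");
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                if (!fieldName.equals(element.getAttribute("name"))) {
                    continue;
                }
                NodeList contents = element.getElementsByTagName("dynamic-content");
                if (contents.getLength() > 0) {
                    // getTextContent() includes the text of the CDATA section.
                    return contents.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse article XML", e);
        }
    }

    // Collects the string pairs of a flat JSON object in document order.
    static Map<String, String> flatJson(String json) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(json);
        while (matcher.find()) {
            pairs.put(matcher.group(1), matcher.group(2));
        }
        return pairs;
    }
}
```

Calling fieldValue(xml, "Image54q7") on the sample above would give the JSON text, and flatJson(...) would then expose keys such as fileEntryId, uuid and groupId.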

 

It was an internal decision, and we didn’t realize that there could be developers out there who actually use the raw XML content for their own needs…

First I would like to explain why this was done, not to excuse it, but to help prevent such cases in the future. On the one hand, JournalArticle.getContent() is a public API method and its behaviour must be, at the very least, backward compatible. On the other hand, its behaviour depends on many components behind it; the signature of the method did not change, and the implementation details (including the raw XML format of the content) were never published. To avoid such problems, we strongly recommend that developers use the published means of processing Web Content, such as JournalContent and JournalArticleDisplay. Both provide the processed content of the article without any need to work with the raw XML. A clear example can be found in the Web Content Display portlet:

(...)

JournalArticleDisplay articleDisplay = _journalContent.getDisplay(
    article, ddmTemplateKey, viewMode, languageId, page,
    new PortletRequestModel(renderRequest, renderResponse), themeDisplay);

String processedContent = articleDisplay.getContent();

(...)

@Reference
private JournalContent _journalContent;

(...)

 

There is also a taglib that renders a specific journal article using its JournalArticleDisplay instance:

<liferay-journal:journal-article-display articleDisplay="<%= articleDisplay %>" />

 

Alternatively, the developer can call the JournalContent.getContent() method directly. The result should be the same: processed content where all the fields behave as expected.
Now let’s talk about how to get the URLs back. I understand that refactoring hundreds of lines of code could be a problem, and that for developers who use the raw XML the best way forward would be to keep processing the URLs as before.
Here I have to mention one detail: there is no way to return to the old URL format for embedded images. If you have some sort of regular expression matching “/image/journal/article...”, there is no way to make it work again.
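If the same code has to cope with field values from both versions, one possibility is to check the shape of the value before deciding how to process it. This is only a sketch under that assumption. RawFieldValueKind is a hypothetical name, and the check is a simple prefix test rather than real JSON validation.

```java
// Hypothetical helper for illustration; not part of Liferay.
final class RawFieldValueKind {

    enum Kind { LEGACY_URL, JSON, OTHER }

    // Classifies a raw field value by its shape only; it does not validate it.
    static Kind classify(String rawValue) {
        String value = (rawValue == null) ? "" : rawValue.trim();
        if (value.startsWith("{") && value.endsWith("}")) {
            // 7.1 style: a JSON object with fileEntryId, uuid, groupId, ...
            return Kind.JSON;
        }
        if (value.startsWith("/image/journal/article")) {
            // 7.0 style: the URL of an embedded article image.
            return Kind.LEGACY_URL;
        }
        return Kind.OTHER;
    }
}
```

Code that previously matched the URL with a regular expression could branch on the result and send JSON values to one of the lookups described below.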
There are two options for getting the URLs back. Both require small adaptations to your existing code that works with the raw XML.

The first option applies when you have a concrete file entry ID:

(...)

// here fieldValue is raw XML field value for your Image/DM field
JSONObject jsonObject = JSONFactoryUtil.createJSONObject(fieldValue);

long fileEntryId = jsonObject.getLong("fileEntryId");
FileEntry fileEntry = PortletFileRepositoryUtil.getPortletFileEntry(fileEntryId);

String fileEntryURL = PortletFileRepositoryUtil.getDownloadPortletFileEntryURL(
    themeDisplay, fileEntry, StringPool.BLANK);

(...)


The second option applies when you don’t have a specific file entry ID but do have the UUID and group ID of the target entry:

(...)

// here fieldValue is raw XML field value for your Image/DM field
JSONObject jsonObject = JSONFactoryUtil.createJSONObject(fieldValue);

long fileEntryGroupId = jsonObject.getLong("groupId");
String fileEntryUuid = jsonObject.getString("uuid");

FileEntry fileEntry = PortletFileRepositoryUtil.getPortletFileEntry(
    fileEntryUuid, fileEntryGroupId);

String fileEntryURL = PortletFileRepositoryUtil.getDownloadPortletFileEntryURL(
    themeDisplay, fileEntry, StringPool.BLANK);

(...)
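The two options can also be combined: use the file entry ID when the JSON carries one, and fall back to UUID plus group ID otherwise. The sketch below shows only that decision, in plain Java so that it can be tried outside the portal. FileEntryUrlChooser and its Lookup interface are hypothetical; in a real module the two Lookup methods would presumably wrap the PortletFileRepositoryUtil calls shown above. The input is assumed to be a map of the JSON keys to their string values.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Liferay.
final class FileEntryUrlChooser {

    // Hypothetical seam: each method would delegate to the portal lookups
    // shown in the two options above and return the download URL.
    interface Lookup {
        String urlById(long fileEntryId);
        String urlByUuid(String uuid, long groupId);
    }

    // Prefers fileEntryId; falls back to uuid + groupId; empty if neither fits.
    static Optional<String> resolve(Map<String, String> fields, Lookup lookup) {
        long fileEntryId = parseLong(fields.get("fileEntryId"));
        if (fileEntryId > 0) {
            return Optional.ofNullable(lookup.urlById(fileEntryId));
        }
        String uuid = fields.get("uuid");
        long groupId = parseLong(fields.get("groupId"));
        if (uuid != null && !uuid.isEmpty() && groupId > 0) {
            return Optional.ofNullable(lookup.urlByUuid(uuid, groupId));
        }
        return Optional.empty();
    }

    // Treats missing or non-numeric values as 0, meaning "not available".
    private static long parseLong(String text) {
        if (text == null) {
            return 0L;
        }
        try {
            return Long.parseLong(text.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }
}
```

Keeping the portal calls behind a small interface is a design choice of this sketch, not something the portal requires; it makes the fallback logic easy to unit test without a running Liferay instance.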

 

I hope these 5 lines of code help you solve the problem. We understand that it can be frustrating to deal with such changes, and we do our best to avoid them unless they are really necessary.
 


I *love* when someone from the development team writes about their achievements and changes - this one is particularly valuable - Thank you Pavel. I can't go without saying that a few words on https://docs.liferay.com/ce/apps/web-experience/latest/javadocs/com/liferay/journal/model/JournalArticleModel.html#getContent-- would help tremendously as well ;) Is it ok for Javadoc to link to blog articles? This would be a prime case for exactly that: give some "implementation dependent" content warning there and link to this article for the details and steps to solve the problem. Thank you for providing the detailed steps.

In my opinion the flaw is in the design. We developers should actually never care about the internal data storage. We shouldn't even "know" that the webcontent is stored as xml.

But we need it.

 

First, when we import content. It is not uncommon and I had to do it several times. Import some content from somewhere and store it as webcontent. In that case, we needed to create the xml "manually". Not too hard, but why do I have to know the internal representation of the data?

 

Secondly in ADTs: The tags to render webcontent with a given template were introduced only last year (if I recall correctly). We have several old templates that use the "parse xml" approach, but in some cases it was also the easiest approach to just parse the xml (instead of rendering the content).

There are several use cases where rendering the content just doesn't cut it. To give an example: in a webcontent template you only get the fields in the current language. You just can't access fields in other languages.

 

Third, through /api/jsonws. Reading the content gives us the xml.

Well, IMHO that REST API is quite broken anyway and really hard/impossible to use except for the most basic use cases.

 

 

If you really want to fix it:

Please introduce a nice "data transfer object" (dto) for webcontent. You already do that to some degree with fields (in the webcontent itself, forgot the class name).

 

We developers should only use said DTO to create/read/update webcontent. That would actually help all of us. You will be free to do whatever you want internally. And we will have a nice object to work with instead of fiddling with the xml content ...

 

I strongly support this idea, we should not be aware of what is the internal article storage mechanism. As Christoph mentioned, there could be an API that allows a developer (both java and freemarker) to manipulate the article contents.

 

It would also be very helpful if the API allowed explicit control over how the domain:port/context URLs are generated within the rendered article, without having to deal with Request and Response objects (however, the base method that includes Request and Response should remain in place, as it is still used in most cases).

At the moment the default JSON API has no resource path either. I have just one question: is this the same reason there are no resource addresses in the JSON service response (available on /api/jsonws)? Sorry, but in this form the default JSON API is not useful for sharing content, e.g. for mobile.

Thanks for the article. How can I do that with Velocity? I have an Asset Publisher ADT that I want to move to 7.1.

 

I used to write:

 

#set ($banner_image_val = $banner_image.selectSingleNode($root_element).getStringValue())

 

and I got the URL for the image. Now I get:

 

{"groupId":"20126","name":"DXP.jpeg","alt":"","title":"DXP.jpeg","type":"document","uuid":"313d1448-fcb0-1735-03c9-795ae17d8331","fileEntryId":"48416","resourcePrimKey":"48566"}

Here is a snippet to get the url in a Freemarker ADT: https://bitbucket.org/snippets/mrg3kko/pebq6x/liferay-71-adt-get-download-url-for-a

Dear Erik,

Thank you so much for this snippet. Coming from 6.2 to 7.3 hit me hard. I couldn't find any example like that in the Liferay documentation for days. I owe you a beer.

For those who are struggling like me, do not forget to remove the serviceLocator from the restricted variables in Control Panel > Configuration > System settings > Platform > Template Engine

 

I guess this is also why, when using document-library fields in FreeMarker, .data no longer returns the URL? .getData() does, but unfortunately we are using the .data syntax, which used to work before.

Great! It took a few hours of analysis to work out the best way to replace this part. Thanks Pavel for the solution and for the information about the change. Some of us have run into these kinds of problems when upgrading from 6.2 to 7.2. I think what Olaf says about documenting this in the Javadoc is a very good idea.