Link Checker Transformation

Last modified by Admin on 2024/05/01 22:34

Verifies if external URLs are valid and decorates them with parameters if not

Type: JAR
Category: Rendering Transformation
Developed by: XWiki Development Team
License: GNU Lesser General Public License 2.1
Bundled With: XWiki Standard
Compatibility: XWiki 3.3M1+

Installable with the Extension Manager

Description

See also the Link Checker Application for an example of how this technical transformation is used.

Checks the validity of external links (URLs) and stores the results in memory, where they can be accessed through a Component or through a Script Service.

Here's how it works:

  • When the LinkChecker Transformation Component is first looked up, it spawns a LinkChecker Thread.
  • All external links (URLs) found when rendering pages are put on a Queue so that checking them does not slow down the rendering process. To prevent flooding, a link is only added if the Queue does not already hold the maximum number of items (10 by default).
  • The LinkChecker Thread is a low-priority Thread that takes links off the Queue one by one, verifies each link by sending an HTTP request to it, and stores the response code in memory. Only links from pages that are not in the exclude list are checked. The exclude list can be configured in xwiki.properties using:
    #-# [Since 5.3RC1]
    #-# List of document references that are excluded from link checking, specified using regexes.
    #-# the default configuration is:
    # rendering.transformation.linkchecker.excludedReferencePatterns = .*:XWiki\.ExternalLinksJSON
  • The LinkChecker Thread only rechecks a given link after a certain amount of time has elapsed (1 hour by default). This timeout can be configured in xwiki.properties using:
    #-# [Since 3.3M2]
    #-# Defines the time (in ms) after which an external link should be checked again for validity.
    #-# the default configuration is:
    # rendering.transformation.linkchecker.timeout = 3600000
  • When an external link is checked and a response code < 200 or > 299 is returned, an event of type org.xwiki.rendering.transformation.linkchecker.InvalidURLEvent is sent. The passed Event data is a Map containing the following key/values:
    • url: the link reference
    • source: the reference to the source where the link was found, or "default" if no source reference was found
    • state: an org.xwiki.rendering.transformation.linkchecker.LinkState containing the response code and the last checked time
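
The rules above can be sketched in plain Java. This is only an illustration under stated assumptions, not the XWiki implementation: the class LinkCheckSketch and all of its method names are hypothetical, the recheck is assumed to become due once the timeout has fully elapsed, and the exclude pattern is assumed to be matched against the whole document reference. No HTTP request is made; the sketch only models the bounded queue, the recheck timeout, the exclude list and the response code test.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.regex.Pattern;

/** Illustrative stand-in for the behaviour described above; not the XWiki code. */
final class LinkCheckSketch
{
    // Defaults quoted in the text: 10 queued items, 1 hour between rechecks.
    static final int MAX_QUEUE_SIZE = 10;
    static final long RECHECK_TIMEOUT_MS = 3_600_000L;

    // Default value of rendering.transformation.linkchecker.excludedReferencePatterns.
    static final Pattern EXCLUDED = Pattern.compile(".*:XWiki\\.ExternalLinksJSON");

    private final Queue<String> queue = new LinkedBlockingQueue<>(MAX_QUEUE_SIZE);
    private final Map<String, Long> lastChecked = new ConcurrentHashMap<>();

    /** Rendering side: never blocks; the link is dropped when the queue is full. */
    boolean enqueue(String url)
    {
        return this.queue.offer(url);
    }

    /** Checker side: returns the next link to verify, or null if the queue is empty. */
    String next()
    {
        return this.queue.poll();
    }

    /** Records the time (in ms) at which the link was last verified. */
    void markChecked(String url, long nowMs)
    {
        this.lastChecked.put(url, nowMs);
    }

    /** A link is due when it was never checked or the timeout has elapsed (assumption: >=). */
    boolean isDue(String url, long nowMs)
    {
        Long last = this.lastChecked.get(url);
        return last == null || nowMs - last >= RECHECK_TIMEOUT_MS;
    }

    /** Assumption: the pattern must match the whole document reference. */
    static boolean isExcluded(String documentReference)
    {
        return EXCLUDED.matcher(documentReference).matches();
    }

    /** Codes below 200 or above 299 lead to an InvalidURLEvent, per the text. */
    static boolean isInvalid(int responseCode)
    {
        return responseCode < 200 || responseCode > 299;
    }
}
```

Using offer() rather than a blocking put() reflects the stated goal: the rendering thread is never held up, and links that do not fit on the queue are simply skipped until a later rendering.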

Accessing Link states from a Component

Example:

...
<dependency>
 <groupId>org.xwiki.rendering</groupId>
 <artifactId>xwiki-rendering-transformation-linkchecker</artifactId>
 <version>3.3-milestone-1</version>
</dependency>
...
...
@Inject
private LinkStateManager linkStateManager;
...
public void someMethod()
{
    Map<String, Map<String, LinkState>> linkStates = this.linkStateManager.getLinkStates();
}
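
As a follow-up, here is a minimal sketch of how the nested map returned by getLinkStates() could be walked to list broken links. To keep it runnable without the XWiki JARs, it uses a hypothetical stand-in class named State that exposes only getResponseCode() and getLastCheckedTime(); in real code you would use LinkState instead. The class BrokenLinkReport and the report format are assumptions made for illustration. The outer map key is taken to be the link reference and the inner key the content reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class BrokenLinkReport
{
    /** Hypothetical stand-in for LinkState, reduced to two getters. */
    static final class State
    {
        private final int responseCode;
        private final long lastCheckedTime;

        State(int responseCode, long lastCheckedTime)
        {
            this.responseCode = responseCode;
            this.lastCheckedTime = lastCheckedTime;
        }

        int getResponseCode()
        {
            return this.responseCode;
        }

        long getLastCheckedTime()
        {
            return this.lastCheckedTime;
        }
    }

    /** Returns one "url (source): code" line per state whose code is outside 200..299. */
    static List<String> brokenLinks(Map<String, Map<String, State>> linkStates)
    {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Map<String, State>> byUrl : linkStates.entrySet()) {
            for (Map.Entry<String, State> bySource : byUrl.getValue().entrySet()) {
                int code = bySource.getValue().getResponseCode();
                if (code < 200 || code > 299) {
                    report.add(byUrl.getKey() + " (" + bySource.getKey() + "): " + code);
                }
            }
        }
        return report;
    }
}
```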

Listening to links with problems

Here's an example in Groovy to listen to InvalidURLEvent events.

{{groovy}}
import groovy.util.logging.*
import org.xwiki.observation.*
import org.xwiki.observation.event.*
import org.xwiki.rendering.transformation.linkchecker.*
import com.xpn.xwiki.web.*
import com.xpn.xwiki.*

@Log
class MyLinkListener implements EventListener
{
   def xwiki
   def context

    MyLinkListener(xwiki, context)
    {
        this.xwiki = xwiki
        this.context = context
    }

    String getName()
    {
       return "myLinkListener"
    }

    List<Event> getEvents()
    {
       return Arrays.asList(new InvalidURLEvent())
    }

    void onEvent(Event event, Object eventSource, Object data)
    {
        def url = eventSource.get("url")
        def source = eventSource.get("source")
        def state = eventSource.get("state")

        log.info("Error for ${url} in ${source} - Response code: ${state.getResponseCode()} - Checked: ${String.format('%tF %<tT', state.getLastCheckedTime())}")
    }
}

// Register against the Observation Manager
def observation = Utils.getComponent(ObservationManager.class)
observation.removeListener("myLinkListener")
def listener = new MyLinkListener(xwiki, xcontext)
observation.addListener(listener)
{{/groovy}}

Adding custom Context Data to Link States

By default, the Link Checker Transformation saves some information about each link: the link URL, the reference to the source that contains it, and the HTTP response code returned when requesting the link. However, it is also possible to save any other custom data in link states.

To do so, provide a Component implementation for the role org.xwiki.rendering.transformation.linkchecker.LinkContextDataProvider.

Here's an example that provides the XWiki URL that generated the check on the link. A valid use case for it is the following:

  • The IRCBot application can report on broken links
  • Imagine that a broken link is found and fixed. This broken link will still exist if you navigate to the page containing it in the version that had the issue (by using the rev URL parameter)
  • Thus your IRC Bot listener could for example ignore broken links when there's a rev parameter used in the Context URL

package org.xwiki.linkchecker.internal;

import java.util.Collections;
import java.util.Map;

import javax.inject.Named;
import javax.inject.Singleton;

import org.xwiki.component.annotation.Component;
import org.xwiki.rendering.transformation.linkchecker.LinkContextDataProvider;

/**
 * Adds the HTTP Request URL to the Link data for external links being checked for status.
 */

@Component
@Named("requesturl")
@Singleton
public class RequestURLLinkContextDataProvider implements LinkContextDataProvider
{
   @Override
   public Map<String, Object> getContextData(String linkURL, String contentReference)
   {
       // We know that the Request URL is set as the name of the current thread so we get it from there...
       return Collections.<String, Object>singletonMap("requestURL", Thread.currentThread().getName());
   }
}

Then, you can access this data by using:

...
@Inject
private LinkStateManager linkStateManager;
...
public void someMethod()
{
    Map<String, Map<String, LinkState>> linkStates = this.linkStateManager.getLinkStates();
    Map<String, Object> contextData = linkStates.get("some link ref").get("some content ref").getContextData();
    // The context data values are typed as Object, so a cast is needed.
    String requestURL = (String) contextData.get("requestURL");
...
}

Or, if you're listening to the InvalidURLEvent event in an Event Listener:

...
   public void onEvent(Event event, Object source, Object data)
   {
        Map<String, Object> brokenLinkData = (Map<String, Object>) source;
        Map<String, Object> contextData = (Map<String, Object>) brokenLinkData.get("contextData");
        // The context data values are typed as Object, so a cast is needed.
        String requestURL = (String) contextData.get("requestURL");
...
   }
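
With the request URL available, the use case described earlier (ignoring broken links found while viewing an old revision) comes down to checking for a rev query parameter. The following is a minimal sketch under stated assumptions: the class RevisionFilter and the method hasRevParameter are hypothetical names, and the check only looks for a query-string parameter named exactly rev.

```java
import java.net.URI;

final class RevisionFilter
{
    /** Returns true when the query string of requestURL has a parameter named "rev". */
    static boolean hasRevParameter(String requestURL)
    {
        String query;
        try {
            query = URI.create(requestURL).getRawQuery();
        } catch (IllegalArgumentException e) {
            // Not a parseable URL, so there is no rev parameter to find.
            return false;
        }
        if (query == null) {
            return false;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            if (name.equals("rev")) {
                return true;
            }
        }
        return false;
    }
}
```

A listener could then return early from onEvent when hasRevParameter(requestURL) is true, so that only links broken in the current version of a page are reported.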

Prerequisites & Installation Instructions

We recommend using the Extension Manager to install this extension. To know whether this extension can be installed that way, make sure that the text "Installable with the Extension Manager" is displayed at the top right of this page.

You can also use the manual method which involves dropping the JAR file and all its dependencies into the WEB-INF/lib folder and restarting XWiki.


This transformation is not active by default. 

To activate it, edit WEB-INF/xwiki.properties, look for the property named rendering.transformations, and make sure it is uncommented and contains the value linkchecker.

For example:

rendering.transformations = macro, icon, linkchecker

If you activate the Link Checker transformation you need to be aware that in a multiwiki setup it will be possible for a script located in any subwiki to see all link statuses of all subwikis. Since this could be a privacy issue you need to be sure you're ok with this. We're planning to find a solution for this in the near future.

Make sure you restart your XWiki instance for the change to take effect.

Dependencies

Dependencies for this extension (org.xwiki.rendering:xwiki-rendering-transformation-linkchecker 16.3.0):

  • org.xwiki.commons:xwiki-commons-script 16.3.0
  • org.xwiki.commons:xwiki-commons-observation-api 16.3.0
  • org.apache.httpcomponents:httpclient 4.5.14
  • org.apache.commons:commons-lang3 3.14.0
  • org.xwiki.rendering:xwiki-rendering-api 16.3.0
