Confluence XML

Last modified by Admin on 2024/09/16 00:04

Filter stream extension to parse a Confluence XML package
Type: JAR
Developed by: Thomas Mortagne, XWiki Development Team
Active Installs: 131
Rating: 0 Votes
License: GNU Lesser General Public License 2.1

Installable with the Extension Manager

Description

Filter module used to read a Confluence XML package. It can be used, for example, to import such a package into an XWiki instance.

In an XWiki instance, it is generally used through the Filter Streams Converter Application, which needs to be installed separately. Select it as the input module and select the output module you want (for example, the instance module to import the Confluence package into the current instance).

Warning

Importing a Confluence instance is a multi-step process. This tool allows importing a Confluence export into XWiki pages, but additional specific actions need to be handled, including supporting standard or custom Confluence macros. For large instances, some specific issues might occur, such as page names or hierarchies that are too long to be imported, or unsupported macros.

XWiki SAS, a sponsoring company of the XWiki Open Source software, is providing software and services to help with the migration:

  • The paid macro package has been released and is available on the XWiki SAS store: https://store.xwiki.com/xwiki/bin/view/Extension/ProMacros/
  • XWiki SAS is working on a migration package to handle all steps of the migration in one application. This package is currently in development.
  • XWiki SAS can provide services to analyse a Confluence instance and study what is needed to migrate it (which package, which process, what macros are needed).

You can contact XWiki SAS on the XWiki SAS web site: https://xwiki.com

Tutorial

To get started with the Confluence Import, you will need to install the Filter Streams Converter Application and this Confluence XML module. You can install these apps using the Extension Manager.

You then need to export your Confluence data from the Confluence administration and upload the zip file to the XWiki server. Make sure the file is accessible to the user running your Java process.

extensions.png

After you have installed the two extensions, click on the Filter Stream Converter entry from the Applications panel.

appbar.png

Follow these steps, as shown in the screenshot below:

  • Choose the "Confluence XML input stream (confluence+xml)" input type
  • Fill in the source field which contains "file:" followed by the path of the Confluence zip file, located on the machine where XWiki is running
  • Choose the "XWiki instance output stream (xwiki+instance)" output type to import the Confluence pages in your wiki

Note: in some cases, the Java application server cannot access all the directories on your computer or server. You can try storing the file in a directory used by the application server (root directory of the application server or log directory). Alternatively, you can make your file accessible on a web server and enter the URL of that file.
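The source value described in the steps above can be prepared and sanity-checked with a few lines of Java. This is only a sketch: the class name SourceValue, its methods and the example path are hypothetical and are not part of the extension; only the "file:" prefix comes from the instructions above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Confluence XML module.
final class SourceValue
{
    // Builds the value for the source field: "file:" followed by the absolute path.
    static String toSourceValue(Path zip)
    {
        return "file:" + zip.toAbsolutePath().normalize();
    }

    // True if the current Java process can read the file. Run this as the same
    // OS user as the application server for the check to be meaningful.
    static boolean isReadable(Path zip)
    {
        return Files.isRegularFile(zip) && Files.isReadable(zip);
    }

    public static void main(String[] args)
    {
        // Example path only; pass the location of your own export as first argument.
        Path zip = Paths.get(args.length > 0 ? args[0] : "/tmp/confluence-export.zip");
        System.out.println(toSourceValue(zip));
        System.out.println("readable: " + isReadable(zip));
    }
}
```

If the readability check fails for the user running the application server, moving the file to a directory owned by that user is usually the simplest fix.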

filterconverter.png

After you have completed these steps, click the "Convert" button. After that, you will see the conversion progress. For example:

WSmess.png

For example with the following initial page from Confluence:

Confluencepage.png

You'll get the following page in XWiki after you've made the import:

WikiPage.png

You're all set!

Macro Conversion

The following macros are converted automatically:

In addition, the importer converts the Confluence macro syntax to the macro syntax of the default syntax defined in your XWiki instance. Unknown macros coming from the Confluence package are prefixed by default with "confluence_" (the prefix can be changed in the filter properties) to avoid collisions with existing XWiki macros of the same name, which are very unlikely to behave as expected. An error message will then appear saying that the macro cannot be found. You then have several choices:

  • Edit the page and manually select an existing XWiki macro to use.
  • Create a macro of the same name.

To help with cases where there is no direct equivalent in XWiki Standard, XWiki SAS, a sponsoring company of XWiki, has released the paid "Pro Macros" package, which supports a set of Confluence macros. See the "Warning" section in this document. In particular, if you see a "layout macro" error, this package will be very useful.

Macro Equivalents

This section is meant to help users pick the closest existing XWiki macro for each Confluence macro.

  • <add here>

Extending Macro Conversion

You can implement a MacroConverter XWiki component that will be used by this Confluence importer when performing macro conversions. To do so, write a component which implements org.xwiki.contrib.confluence.filter.MacroConverter, with a component hint named after the id of the Confluence macro you wish to convert automatically.

Events

As of version 9.21.0, two events are emitted: ConfluenceFilteringEvent and ConfluenceFilteredEvent. ConfluenceFilteringEvent is sent after the package has been read, but before the actual filtering has begun. ConfluenceFilteredEvent is sent after the filtering is done, but before the package is closed. Both events carry the Confluence package as data.

Reusable tool to parse and analyze a Confluence package

The main job of this extension is to provide an input filter to convert a Confluence package into something else, but it also exposes its parsing and analysis tooling as an API.

To parse a Confluence package, you can inject and use the component org.xwiki.contrib.confluence.filter.input.ConfluenceXMLPackage:

    @Inject
    private Provider<ConfluenceXMLPackage> confluencePackageProvider;

    public void analyze()
    {
        // Create a new instance of the ConfluenceXMLPackage component
        ConfluenceXMLPackage confluencePackage = this.confluencePackageProvider.get();

        // Parse a package located on the file system (any org.xwiki.filter.input.InputSource
        // that leads to zip content or to a directory is supported)
        confluencePackage.read(new DefaultFileInputSource(new File("path/to/the/confluencepackage.xml.zip")));

        // Call the various getters of ConfluenceXMLPackage
    }

Input parameters documentation

The Confluence XML package provides many parameters to customize a Confluence import.

Each entry below gives the parameter name, its description, an example value where available, and the default value.
Import archived documents

Confluence exports can contain archived documents. XWiki doesn't have a concept of archived document. This parameter lets you optionally import archived documents as regular XWiki documents instead of ignoring them.

Default value: false

Import archived spaces

Confluence exports can contain archived spaces. XWiki doesn't have a concept of archived space. This parameter lets you optionally import archived spaces as regular XWiki spaces instead of ignoring them.

Default value: false

Import attachments

If you don't want to import document attachments, set this to false.

Default value: true
Base URLs

The list of base URLs leading to the Confluence instance. They are used to convert wrongly entered absolute URLs into wiki links. The first URL in the list is used to compute the page URLs shown in the conversion report if the 'Store Confluence details' property is used.

This parameter is used to convert (fix) absolute URLs present as links in the imported documents. Enter the bases of these URLs in this field so that the links are converted from Confluence links to XWiki links. For example, adding www.<myconfluence>.com/wiki/spaces in this field will convert absolute links such as www.<myconfluence>.com/wiki/spaces/KEY/page into www.<myxwiki>.com/bin/view/KEY/page.

Default value: N/A
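The prefix matching that the Base URLs parameter relies on can be pictured with the sketch below. It is an illustration under assumptions, not the filter's actual code: the class BaseUrlMatcher, its method and the example.com URLs are hypothetical, and the real conversion also builds the resulting XWiki link, which is left out here.

```java
import java.util.List;
import java.util.Optional;

// Illustration only: the real conversion is performed by the filter itself.
final class BaseUrlMatcher
{
    // Returns the part of the link after the first matching base URL,
    // or empty when the link does not start with any configured base URL.
    static Optional<String> stripBase(String link, List<String> baseUrls)
    {
        for (String base : baseUrls) {
            String prefix = base.endsWith("/") ? base : base + "/";
            if (link.startsWith(prefix)) {
                return Optional.of(link.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args)
    {
        List<String> bases = List.of("https://confluence.example.com/wiki/spaces");
        // Recognized as a link into the Confluence instance: the remainder is "KEY/page"
        System.out.println(stripBase("https://confluence.example.com/wiki/spaces/KEY/page", bases));
        // Not recognized: such a link would be left as an external URL
        System.out.println(stripBase("https://other.example.org/KEY/page", bases));
    }
}
```

A link that matches none of the listed base URLs cannot be recognized as pointing to the Confluence instance, which is why every URL form used in your content should be listed.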
Blog Space name

This field defines the name of the space under which blog posts will be imported. By default, a "Blog" space will be created under "SPACE KEY".

blog-default-location.png

Default value: Blog
Import blog posts

This field decides whether to import blog posts or not. By leaving the default value "true", blog posts will be imported. By changing the value to "false", blog posts will not be imported. 

Default value: true
Cleanup mode

The mode to use for cleaning up temporary files produced when parsing the Confluence package.

  • SYNC: clean up right after the filter stream is done.
  • ASYNC: same, but asynchronously.
  • NO: don't clean up at all.

Default value: SYNC (ASYNC in the Confluence Migrator Pro Application)

Produce rendering events for the content

Parse the content to produce rendering events (if the output filter supports them).

This is needed in very specific conditions. We do not recommend modifying this parameter unless you know what you are doing.

Default value: false
Import contents

This parameter defines whether to import the body and set content of regular documents and blog posts (if blog posts are imported). We do not recommend modifying this parameter. 

Default value: true
XWiki Conversion

This parameter defines whether to convert:

  • user, space and document references from the Confluence names to the XWiki names. This includes the user id mapping and group name mapping described below, which will not be applicable if this parameter is set to false.
  • the Confluence syntax to the XWiki syntax.

This is needed in very specific conditions. We do not recommend modifying this parameter unless you know what you are doing. In particular, XWiki may not be able to render your imported documents and links will probably be broken if you disable this.

Default value: true
Default locale

This parameter defines the locale that will be used for the imported documents.

Usage example: you have an XWiki instance that you want to be localized in both en and fr. The default locale of your instance is en, and your Confluence instance has its content in French. If you import from Confluence to XWiki without setting this parameter, all the created documents will have their locale set to your XWiki default locale (en), while the content brought from Confluence will be in French. Ideally, you set the default locale parameter to "fr" so that all the imported documents have their locale set to "fr". If you then also want an English version of the documents, you create it and translate it.

Example value: fr
Default value: N/A
Page name validation

This parameter defines whether page names should be validated against, and converted using, XWiki's current page naming strategy. This works if the XWiki Conversion parameter is set to "true".

This is needed in very specific conditions. We do not recommend modifying this parameter unless you know what you are doing. In particular, XWiki may not be able to render your imported documents and links will probably be broken if you disable this.

Default value: true
Excluded pages

List in this field the Confluence pages to exclude from the import. The format is a comma-separated list of page IDs.

See also the Object ID ranges and the Included pages parameters.

Example value: 543234,123123,65423
Group Format

The format used to transform Confluence group names into XWiki group names. The string ${group} is replaced with the Confluence group name; ${group._clean} is replaced with the same name with the special characters removed.

${group._clean}Group
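The placeholder substitution described for the Group Format parameter can be sketched as follows. This is not the filter's implementation: the class GroupFormat is hypothetical, and "special characters" is assumed here to mean anything other than ASCII letters and digits, which may differ from the rule the filter actually applies.

```java
// Illustration only; the exact definition of "special characters" is an assumption.
final class GroupFormat
{
    static String apply(String format, String confluenceGroup)
    {
        // Assumed cleaning rule: keep ASCII letters and digits only
        String clean = confluenceGroup.replaceAll("[^A-Za-z0-9]", "");
        return format
            .replace("${group._clean}", clean)
            .replace("${group}", confluenceGroup);
    }

    public static void main(String[] args)
    {
        // With the format "${group._clean}Group", the group "confluence-users"
        // gives "confluenceusersGroup" under the cleaning rule assumed above.
        System.out.println(apply("${group._clean}Group", "confluence-users"));
        // With the format "${group}", the name is kept as is.
        System.out.println(apply("${group}", "confluence-users"));
    }
}
```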

Group name mapping

This field offers the option to specify a list of A=X relations where A is a Confluence group name and X is an XWiki group name. A will be renamed to X. These sets are separated using the pipe character (|). If several Confluence groups are mapped to the same XWiki group, the groups will be merged: users of all the Confluence groups will be added to the given XWiki group. For instance, with A=X|B=X, users in Confluence group A and in Confluence group B will be added to XWiki group X. If X is empty, the group will be discarded.

This parameter is used for two things:

  • Group and user imports
  • Permission migration (the Confluence permission applying to Confluence group A will be translated and will apply to XWiki group X).

Also note the existence of the "group name prefix" and "group name suffix" output filter stream parameters, which apply on top of this mapping.

Default value: system-administrators=XWikiAdminGroup|site-admins=XWikiAdminGroup|administrators=XWikiAdminGroup|users=XWikiAllGroup|confluence-users=XWikiAllGroup|_licensed-confluence=|confluence-administrators=XWikiAdminGroup
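To make the A=X|B=X format of the Group name mapping parameter concrete, here is a small sketch of how such a value can be read into a map. The class GroupMapping is hypothetical and is not the module's own parser; it only illustrates the rules stated above (pipe-separated pairs, several Confluence groups merged into one XWiki group, empty right-hand side meaning "discard").

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of the A=X|B=X format only; not the module's own parser.
final class GroupMapping
{
    // Parses "A=X|B=Y" into a map. An empty right-hand side ("A=") is kept
    // as an empty string, meaning that the Confluence group is discarded.
    static Map<String, String> parse(String mapping)
    {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : mapping.split("\\|")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + pair);
            }
            result.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return result;
    }

    public static void main(String[] args)
    {
        Map<String, String> m =
            parse("administrators=XWikiAdminGroup|users=XWikiAllGroup|_licensed-confluence=");
        m.forEach((confluence, xwiki) ->
            System.out.println(confluence + " -> " + (xwiki.isEmpty() ? "(discarded)" : xwiki)));
    }
}
```

The User id mapping parameter uses the same A=B|C=D shape, so the same kind of parsing applies to it.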
Import history

Set this to false if you want to discard previous revisions of documents (for performance, space or import speed concerns).

Default value: true

Home redirect

When non-nested import is used, home pages are renamed so they can be the home page of spaces in XWiki. If set to true, redirects are output so links to these pages are not broken. When nested import is used, this parameter is ignored.

Default value: true
Included pages

This field allows you to specify the pages that should be imported. The format is a comma-separated list of page IDs.

See also the Object ID ranges and the Excluded pages parameters.

Example value: 543234,123123,65423
Macro content syntax

This parameter defines the target syntax to be used.

This is needed in very specific conditions. We do not recommend modifying this parameter unless you know what you are doing.

Default value: N/A
Max Page count

If you want to limit the number of imported pages, set this to the desired number. -1 disables any limitation.

Default value: -1
Import non-blog content

This field defines whether to import non-blog content (normal documents) or not. When this parameter and the Import blog posts parameter are both set to "true", both regular pages and blog posts get imported. Set this parameter to "false" if you wish to import only the blog posts present in an export package.

Default value: true
Object ID ranges

Ranges of Confluence objects to read.

Can be used to restore an interrupted migration.

Several comma-separated ranges can be given. Note that these ranges do not follow increasing ids but the order in which objects are processed by the Confluence module. This order may change between versions of the parser, but is guaranteed to be the same between different runs using the same version of the Confluence module. Ranges must not overlap: overlapping ranges are not supported, may lead to surprising results, and their behavior is not guaranteed to be stable. In the same vein, ranges must be given in parsing order.

  • [4242,] - only read object id 4242 and all the following ones
  • (4242,] - same, but exclude object id 4242
  • [,4242] - read all objects until object id 4242 included
  • [,4242) - same, but exclude 4242
  • [4242,2424], [3456,1234] - read objects between 4242 and 2424 both included, then ignore objects until 3456 and read objects between 3456 and 1234 both included (notice how IDs may look disordered)
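The bracket notation used in the examples above can be read as follows. This sketch only parses the notation; the class ObjectIdRanges and its Range type are hypothetical and do not reproduce the module's parser. Note that it deliberately never compares the two bounds numerically, because the bounds follow the processing order and not increasing ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration of the range notation only; not the module's own parser.
final class ObjectIdRanges
{
    // One range; a null bound means "open" (start or end of the package).
    static final class Range
    {
        final Long from;
        final boolean fromIncluded;
        final Long to;
        final boolean toIncluded;

        Range(Long from, boolean fromIncluded, Long to, boolean toIncluded)
        {
            this.from = from;
            this.fromIncluded = fromIncluded;
            this.to = to;
            this.toIncluded = toIncluded;
        }

        @Override
        public String toString()
        {
            return (fromIncluded ? "[" : "(") + (from == null ? "" : from) + ","
                + (to == null ? "" : to) + (toIncluded ? "]" : ")");
        }
    }

    // "[" or "(", optional start id, comma, optional end id, "]" or ")"
    private static final Pattern RANGE =
        Pattern.compile("([\\[(])\\s*(\\d*)\\s*,\\s*(\\d*)\\s*([\\])])");

    static List<Range> parse(String ranges)
    {
        List<Range> result = new ArrayList<>();
        Matcher m = RANGE.matcher(ranges);
        while (m.find()) {
            Long from = m.group(2).isEmpty() ? null : Long.valueOf(m.group(2));
            Long to = m.group(3).isEmpty() ? null : Long.valueOf(m.group(3));
            // Square brackets include the bound, parentheses exclude it
            result.add(new Range(from, m.group(1).equals("["), to, m.group(4).equals("]")));
        }
        return result;
    }

    public static void main(String[] args)
    {
        for (Range r : parse("[4242,2424], (3456,]")) {
            System.out.println(r);
        }
    }
}
```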
 
Prefixed macros

This field stores an allowlist of macros that should be prefixed. A few macros exist both in XWiki and in Confluence under the same name. In order to allow the usage of the bridge macros (dedicated to displaying content in the same manner as it was in Confluence), those macros should be prefixed so that the bridge macros (e.g. "confluence_gallery") are used and not the original XWiki macro (e.g. "gallery"). We do not recommend modifying this parameter. 

Default value: attachments,gallery,chart
Import rights

This parameter defines whether permissions set in Confluence should be migrated into XWiki.

Importing users and groups is not mandatory; rights will be imported without them. However, if you disable user or group import in order to import them using another method, you will need to create the users and the groups with the exact same names as in Confluence, or as specified in the User id mapping and Group name mapping parameters, for the correct rights to apply to the correct users and groups (these two parameters are respected by the rights migration even if groups and users are not imported).

Default value: true
Root space name

If you want content to be put in a specific space instead of at the wiki root, set this to the name of this space.

Example value: Migrated
Title spaces from their home page

Title spaces using the Confluence home page titles instead of the Confluence space names.

Home pages in Confluence are usually named something like "Home" or "SPACENAME Home", which is not very helpful. In Confluence, spaces are named and that's usually what you want as the title of spaces in XWiki.

However, if you happen to have useful home page titles, you may want to set this to true.

Space names are always taken from Confluence space keys regardless of what you choose here.

Default value: false
Store Confluence details

This parameter specifies whether to store Confluence metadata in migrated documents as objects.

This is usually useless, but in some cases, this metadata can be useful for debugging purposes and might end up useful to support CQL-based Confluence macros like spacebylabel or detailssummary in the future.

Default value: false
Import tags

Set this to false if you don't want Confluence labels to be migrated to XWiki tags.

Default value: true
Unknown macro prefix

This field defines the prefix to be used for the macros specified in the "Prefixed macros" field. See also the Unprefixed macros parameter.

We do not recommend modifying this parameter. 

Default value: confluence_
Unprefixed macros

This field stores a denylist of macros that should not be prefixed. We do not recommend modifying this parameter. If set (not empty), it takes precedence over "Prefixed macros": any macro that is not listed in Unprefixed macros will be prefixed, and the Prefixed macros list will be ignored. If you want to prefix absolutely all macros, set this to an unlikely macro name.

Default value: N/A
User id mapping

A mapping between the Confluence user ids found in the package and the wanted XWiki user ids.

Similar to Group name mapping, this field stores a list of A=B pairs separated with a pipe character (|), where A is the name of a Confluence user and B is the desired user name in XWiki.

Example value: user1=User1|charliedo=CharlieDo|ConfluenceAlice=Alice|ConfluenceBob=Bob
Produce user references

This parameter defines if links to user profiles should be created for existing Confluence user profile links.

Default value: false
Import users

Import the users found in the Confluence package.

If this parameter is set to true, user profiles present in the export package will be imported into XWiki. Setting this parameter to "false" may be needed if you use a central user directory service like LDAP or Active Directory, in which case you need to decide on a careful user migration strategy. See also the "Import groups" parameter.

Users are only present in full Confluence export packages and are not present in space export packages.

If this option is set to false, only the user profiles are skipped. Permissions will still be imported for users, unless you also disable permission import.

Default value: true
Import groups

Import the groups found in the Confluence package.

If this parameter is set to true, groups present in the export package will be imported into XWiki. Setting this parameter to "false" may be needed if you use a central user directory service like LDAP or Active Directory, in which case you need to decide on a careful user migration strategy. See also the "Import users" parameter.

Groups are only present in full Confluence export packages and are not present in space export packages.

If this option is set to false, only the groups are skipped. Permissions will still be imported for groups, unless you also disable permission import.

Default value: true
Users wiki

The wiki where users and groups are located.

If applicable, users and groups will be imported into this wiki. Any user reference, including the ones in permission objects, will contain this wiki.
This is most useful in a multi-wiki environment when importing Confluence spaces into a subwiki while users are located in the main wiki (xwiki), in which case you should set this to "xwiki".

Default value: N/A (current wiki)
Verbose

This field defines whether to create detailed import logs or not.

If you are using the Confluence Migrator Pro Application, we do not recommend modifying this parameter.

Default value: true
Link Mapping

This field defines the link mapping to use to produce the correct links to pages missing from the Confluence package. Note: In Confluence Migrator Pro, this parameter is hidden and automatically managed.

Example value:

{
    "spaceKey1": {
        "page title 1": "Space.Doc1",
        "page title 2": "Space.Doc2"
    },
    "spaceKey2": {
        "page title 3": "Space.Doc3",
        "page title 4": "Space2.Doc4"
    },
    "spaceKey:ids": {
        "42": "Space.Doc5"
    },
    ":ids": {
        "43": "Space.Doc6"
    }
}
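One possible reading of this structure is sketched below as a lookup over nested maps. The class LinkMappingLookup is hypothetical, and the interpretation of the "spaceKey:ids" and ":ids" entries (page ids for a given space, and page ids without a known space) is an assumption drawn from the shape of the example, not a documented rule.

```java
import java.util.Map;
import java.util.Optional;

// Illustration only; the meaning given to the ":ids" keys is an assumption.
final class LinkMappingLookup
{
    // Looks up a page by space key and page title.
    static Optional<String> byTitle(Map<String, Map<String, String>> mapping, String spaceKey, String title)
    {
        return Optional.ofNullable(mapping.getOrDefault(spaceKey, Map.of()).get(title));
    }

    // Looks up a page by id: first in "<spaceKey>:ids", then in the space-less ":ids".
    static Optional<String> byId(Map<String, Map<String, String>> mapping, String spaceKey, long pageId)
    {
        String id = Long.toString(pageId);
        String reference = mapping.getOrDefault(spaceKey + ":ids", Map.of()).get(id);
        if (reference == null) {
            reference = mapping.getOrDefault(":ids", Map.of()).get(id);
        }
        return Optional.ofNullable(reference);
    }

    public static void main(String[] args)
    {
        // Same content as part of the JSON example, written as Java maps
        Map<String, Map<String, String>> mapping = Map.of(
            "spaceKey1", Map.of("page title 1", "Space.Doc1", "page title 2", "Space.Doc2"),
            "spaceKey:ids", Map.of("42", "Space.Doc5"),
            ":ids", Map.of("43", "Space.Doc6"));

        System.out.println(byTitle(mapping, "spaceKey1", "page title 2"));
        System.out.println(byId(mapping, "spaceKey", 42));
        System.out.println(byId(mapping, "anotherSpace", 43));
    }
}
```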

Release notes

Release notes can be found on the Confluence project page.

Prerequisites & Installation Instructions

We recommend using the Extension Manager to install this extension (Make sure that the text "Installable with the Extension Manager" is displayed at the top right location on this page to know if this extension can be installed with the Extension Manager).

You can also use the manual method which involves dropping the JAR file and all its dependencies into the WEB-INF/lib folder and restarting XWiki.

Dependencies

Dependencies for this extension (org.xwiki.contrib.confluence:confluence-xml 9.52.0):
