Description
The Substack Importer will import content from an export file downloaded from your Substack newsletter.
The following content will be imported:
- Posts and images.
- Podcasts.
- Comments (only for publicly accessible posts).
- Author information.
In the future, we plan to improve the importer by:
- Adding support for mailing lists.
- Enhancing the performance of processing export files with many posts and media.
Development
For running unit tests and contributing to the plugin, see the README on GitHub.
Tests can be run with wp-env or with any local WordPress setup paired with a Docker MySQL container. Run composer install first, then vendor/bin/phpunit.
Hooks
The Substack Importer provides filters and actions at key stages of the content conversion pipeline.
Post-level Filters
substack_importer_post_meta
Filter the post metadata loaded from the Substack API before it is used for author, comments, and other post data.
Parameters:
* $post_meta (array|null) – The post metadata from the Substack API response.
* $post (array) – The raw Substack post data from the CSV.
* $id (int) – The Substack post ID.
substack_importer_raw_content
Filter the raw HTML content before Gutenberg conversion. Runs after the subtitle has been prepended (if present). Useful for cleaning up Substack-specific HTML, adding custom elements, or stripping unwanted markup.
Parameters:
* $html_body (string) – The raw HTML content from the Substack export.
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
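As an illustration, a raw-content callback might strip inline style attributes before conversion. This is only a sketch: the function name my_strip_inline_styles is hypothetical, and the function_exists() guard is there solely so the file can be loaded outside WordPress.

```php
<?php
/**
 * Remove inline style attributes from the raw Substack HTML.
 * The function name is hypothetical; only the hook name and its
 * three parameters come from the documentation above.
 */
function my_strip_inline_styles( $html_body, $post, $post_meta ) {
	// Drop style="..." or style='...' attributes, leaving the tags intact.
	$cleaned = preg_replace( '/\sstyle=("[^"]*"|\'[^\']*\')/i', '', $html_body );

	// preg_replace() returns null on failure; fall back to the input.
	return null === $cleaned ? $html_body : $cleaned;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_raw_content', 'my_strip_inline_styles', 10, 3 );
}
```

The last argument to add_filter() is 3 so that WordPress passes all three documented parameters to the callback.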
substack_importer_subtitle
Filter the subtitle HTML before it is prepended to the post content. Return an empty string to skip the subtitle entirely.
Parameters:
* $heading (string) – The subtitle HTML (default: an h2 element).
* $post (array) – The raw Substack post data.
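For example, the subtitle could be demoted from an h2 to an h3. The sketch below assumes the default markup is a plain h2 element, as stated above; the function name my_subtitle_as_h3 is hypothetical.

```php
<?php
/**
 * Turn the default h2 subtitle into an h3.
 * Hypothetical helper; assumes the incoming HTML uses <h2> tags.
 */
function my_subtitle_as_h3( $heading, $post ) {
	// Rewrite both the opening and the closing tag, keeping any attributes.
	$converted = preg_replace( '/<(\/?)h2\b/i', '<${1}h3', $heading );

	// preg_replace() returns null on failure; fall back to the input.
	return null === $converted ? $heading : $converted;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_subtitle', 'my_subtitle_as_h3', 10, 2 );
}
```

To drop the subtitle entirely, WordPress's built-in __return_empty_string callback can be attached to the same filter instead.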
substack_importer_post_content_after_conversion
Filter the post content after Gutenberg conversion but before it is added to the WXR. Useful for wrapping paywalled content in custom blocks (e.g., membership plugins).
Parameters:
* $post_content (string) – The converted Gutenberg block content.
* $post (array) – The original Substack post data.
* $post_meta (array|null) – Additional post metadata from Substack API.
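A rough sketch of wrapping paid posts in a group block follows. It rests on assumptions not confirmed by this page: that the CSV row carries an 'audience' column and that paid posts use the value 'only_paid'. The function name and the members-only class are hypothetical; a membership plugin would normally supply its own block markup.

```php
<?php
/**
 * Wrap converted content of paid posts in a core group block.
 * Assumption: $post['audience'] exists and equals 'only_paid' for paid posts.
 */
function my_wrap_paid_content( $post_content, $post, $post_meta ) {
	$audience = isset( $post['audience'] ) ? $post['audience'] : '';

	// Leave free posts untouched.
	if ( 'only_paid' !== $audience ) {
		return $post_content;
	}

	// Standard serialized markup for a core/group block with a custom class.
	return "<!-- wp:group {\"className\":\"members-only\"} -->\n"
		. '<div class="wp-block-group members-only">' . $post_content . "</div>\n"
		. '<!-- /wp:group -->';
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_post_content_after_conversion', 'my_wrap_paid_content', 10, 3 );
}
```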
substack_importer_post_data
Filter the final post data array before it is added to the WXR.
Parameters:
* $post_data (array) – The post data.
* $post (array) – The original Substack post data.
Content Conversion Filters
substack_importer_converted_node
Filter the result of a single node conversion to a Gutenberg block. Allows modification of the block name and attributes. Return a null block_name to skip the node.
Parameters:
* $block_data (array) – Array with ‘block_name’ and ‘block_attributes’ keys.
* $node (DOMElement) – The converted DOM node.
* $node_name (string) – The original HTML tag name (e.g. ‘p’, ‘div’, ‘h2’).
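As a small example of the null block_name behavior described above, the following sketch skips every hr element. The function name is hypothetical, and the choice of hr is only for illustration.

```php
<?php
/**
 * Skip horizontal rules during conversion by nulling the block name.
 * Hypothetical helper; relies only on the documented 'block_name' key.
 */
function my_skip_horizontal_rules( $block_data, $node, $node_name ) {
	if ( 'hr' === strtolower( $node_name ) ) {
		// A null block_name tells the importer to skip this node.
		$block_data['block_name'] = null;
	}

	return $block_data;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_converted_node', 'my_skip_horizontal_rules', 10, 3 );
}
```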
substack_importer_image_result
Filter the image node conversion result. Useful for adjusting image sizes, captions, or link destinations.
Parameters:
* $result (array) – Array with ‘block_attributes’ and ‘node’ keys.
* $image_data (array|null) – The decoded image data from the Substack data-attrs attribute.
substack_importer_pre_embed_conversion
Short-circuit the embed node conversion before default handling. Return a non-null array to skip the built-in switch statement entirely. Useful for handling unsupported embed types or overriding the default conversion for a specific provider.
Parameters:
* $pre_result (array|null) – Return non-null to short-circuit. Expected keys: ‘node’, ‘block_attributes’, ‘block_name’.
* $node (DOMElement) – The embed DOM node before conversion.
* $parent (DOMElement) – The parent DOM element.
* $first_class (string) – The CSS class identifying the embed type (e.g. ‘youtube-wrap’, ‘tweet’).
substack_importer_embed_result
Filter the embed node conversion result after the default conversion. Useful for modifying embed URLs, adding custom attributes, or changing how embeds are represented.
Parameters:
* $output (array) – Array with ‘block_name’, ‘block_attributes’, and ‘node’ keys.
* $first_class (string) – The CSS class identifying the embed type.
substack_importer_audio_block
Filter the Gutenberg audio block HTML for podcast posts.
Parameters:
* $block (string) – The Gutenberg audio block HTML.
* $audio_url (string) – The URL of the podcast audio file.
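One possible use is appending a download link below the audio player. The sketch below is an assumption-laden example: the function name and link text are hypothetical, and htmlspecialchars() is used so the snippet runs without WordPress (inside a plugin, esc_url() would be the usual choice).

```php
<?php
/**
 * Append a paragraph block with a download link after the audio block.
 * Hypothetical helper; the audio block itself is passed through unchanged.
 */
function my_audio_with_download_link( $block, $audio_url ) {
	$link = sprintf(
		"\n\n<!-- wp:paragraph -->\n<p><a href=\"%s\">Download this episode</a></p>\n<!-- /wp:paragraph -->",
		htmlspecialchars( $audio_url, ENT_QUOTES )
	);

	return $block . $link;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_audio_block', 'my_audio_with_download_link', 10, 2 );
}
```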
Paywall Filters
substack_importer_paywall_marker_text
Filter the paywall marker text that appears in the imported content.
Parameters:
* $marker_text (string) – The default paywall marker text.
* $node (DOMElement) – The paywall node being converted.
* $parent (DOMElement) – The parent element.
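A minimal sketch of replacing the marker text; the function name and the replacement wording are hypothetical.

```php
<?php
/**
 * Replace the default paywall marker with custom wording.
 * Hypothetical helper; $node and $parent are accepted but not used here.
 */
function my_paywall_marker_text( $marker_text, $node, $parent ) {
	return 'The rest of this post is for paid subscribers.';
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'substack_importer_paywall_marker_text', 'my_paywall_marker_text', 10, 3 );
}
```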
substack_importer_paywall_content
Filter the entire paywall conversion result. Return a non-null value to override the default conversion.
Parameters:
* $result (array|null) – The conversion result, null to use default.
* $node (DOMElement) – The paywall node being converted.
* $parent (DOMElement) – The parent element.
Actions
substack_importer_before_post
Fires before a single Substack post is processed and converted. Useful for setting up state or performing actions before conversion begins.
Parameters:
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
* $id (int) – The Substack post ID.
substack_importer_after_post
Fires after a single Substack post has been converted and added to the WXR. Useful for logging, progress tracking, or performing cleanup after each post.
Parameters:
* $post_data (array) – The final post data that was added to the WXR.
* $post (array) – The raw Substack post data from the CSV.
* $post_meta (array|null) – The post metadata from the Substack API response.
* $id (int) – The Substack post ID.
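For progress tracking, an after-post callback might write one log line per post. This is a sketch: the function name and message format are hypothetical, and the return value exists only to make the counter easy to inspect, since WordPress ignores what action callbacks return.

```php
<?php
/**
 * Log each imported post and keep a running count for this request.
 * Hypothetical helper; uses only the documented $id parameter.
 */
function my_log_imported_post( $post_data, $post, $post_meta, $id ) {
	static $count = 0;
	$count++;

	// error_log() writes to the PHP error log configured for the site.
	error_log( sprintf( 'Substack import: processed post %d (%d so far)', $id, $count ) );

	return $count;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
	add_action( 'substack_importer_after_post', 'my_log_imported_post', 10, 4 );
}
```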
Installation
This plugin depends on the WordPress Importer plugin, which must be installed first.
To install the Substack Importer:
- Upload the substack-importer directory to the /wp-content/plugins/ directory.
- Activate the plugin through the ‘Plugins’ menu in WordPress.
FAQ
After about 30 seconds, the import stops and I am seeing a blank screen. What happened?
When trying to import a large number of posts and images, timeouts can occur. To solve this, you can try to run the import several times until all content has been imported.
Contributors & Developers
“Substack Importer” is open source software.
“Substack Importer” has been translated into 4 locales. Thank you to the translators for their contributions.
Changelog
1.2.0
- Compatibility: the plugin now requires PHP 7.4 or higher.
- Enhancement: added new pre-import options for forcing Draft status, choosing publish date mode, setting the first image as Featured Image, and applying a global Category/Tag.
- Enhancement: improved import behavior handling for featured image assignment and post metadata processing during import.
- Enhancement: added substack_importer_paywall_marker_text filter to customize paywall marker text.
- Enhancement: added substack_importer_paywall_content filter to override paywall block conversion.
- Enhancement: added substack_importer_post_content_after_conversion filter to modify content after Gutenberg conversion.
- Enhancement: added substack_importer_raw_content filter to modify raw HTML before Gutenberg conversion.
- Enhancement: added substack_importer_subtitle filter to customize or skip the subtitle heading.
- Enhancement: added substack_importer_post_meta filter to modify post metadata before processing.
- Enhancement: added substack_importer_converted_node filter to customize individual block conversions.
- Enhancement: added substack_importer_image_result filter to modify image block attributes.
- Enhancement: added substack_importer_embed_result filter to modify embed block results after conversion.
- Enhancement: added substack_importer_pre_embed_conversion filter to short-circuit embed conversion before default handling.
- Enhancement: added substack_importer_audio_block filter to customize the podcast audio block.
- Enhancement: added substack_importer_before_post action that fires before each post is processed.
- Enhancement: added substack_importer_after_post action that fires after each post is added to the WXR.
1.1.2
- Enhancement: support captions for images.
- Enhancement: support TikTok embeds
- Compatibility: the plugin now requires PHP 7.2 or higher.
- Fix: convert preformatted content to verse block.
- Fix: Twitter conversion bug.
1.1.1
- Tested up to WordPress 6.7
- Fix: null checking
1.1.0
- Update wxr-generator to the latest version. Fixes a bug where imports could error out due to a malformed timezone identifier.
1.0.9
- Use subtitle as post excerpt if not empty
- Testing the plugin up to WordPress 6.4.2
- Fix PHPCS error and cleanup composer.lock
1.0.8
- Removed the subscription input from post content
1.0.7
- Convert the paywall div to a paragraph
1.0.6
- Testing the plugin up to WordPress 6.2
1.0.5
- Add support for WordPress 6.1
1.0.4
- Fix Soundcloud embeds
1.0.3
- Identify authors for draft posts as «Draft Posts»
1.0.2
- Republishing to fix a CI error.
1.0.1
- Remove unnecessary load_meta_data line.
- Fix embeds not displaying properly on website.
1.0.0
- Add post meta for paid content.
- Convert Instagram embed to a link.
- Add the subtitle as an H2 at the beginning of the post.
- Set the correct comment_status for posts.
0.1.0
- Refactored the importer.
- Add support for authors.
- Add support for comments.
- Conversion of content to Gutenberg blocks.
- Convert the export to WXR and use the WordPress Importer plugin to import the WXR.
- Add progress indicator
- Add support for attachments.
0.1
Early proof-of-concept version.