XML file too big to import
Part of “Going self-hosted with Wordpress : A Wolfie Guide”
On one of my older posts (Going Self-Hosted with Wordpress) Lisa asked how to import an XML file larger than 2MB from her existing WordPress.com blog into her new self-hosted WordPress installation.
The first thing to say is that this is not a WordPress restriction; it is a restriction set by the hosting company being used. If you have cPanel on your host, take a look in ‘PHP Configuration’ and you’ll see that ‘upload_max_filesize’ is set to 2MB. (For some hosts this number may be smaller or larger; as always, your mileage may vary.) There is a way to change this value, although I’ve only managed to raise it to 8MB on my server. (Before making any of the changes that follow, please make sure that you have a working back-up of anything that you can’t afford to lose, just in case. I will not accept any responsibility for anything that goes wrong with your system and make no promises that any of these methods will work for you.)
In your public_html directory, there should be a file called .htaccess. This is a small text file that, at least on my server, looks like this:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
If you add the line php_value upload_max_filesize 32M just before the final line, then the next time you go to the import screen you’ll be told that the maximum file size is 8MB (at least, that is what happens on my server, even though the line asks for 32M). Whether the system will actually allow you to import an 8MB file is a different question, and I have not been able to test this. Assuming it does work, once you’ve uploaded the XML file I suggest editing .htaccess again to take the extra line back out.
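One possible reason for an 8MB ceiling after asking for 32M is that PHP also enforces a separate post_max_size setting on the whole upload request, so the smaller of the two values wins. Whether that is what is happening on any particular host is an assumption, not something confirmed here. The sketch below only illustrates the arithmetic; the function names are made up for this example, and it ignores special cases such as a post_max_size of 0.

```python
import re

# Multipliers for PHP's shorthand byte notation (K, M, G).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def php_size_to_bytes(value):
    """Convert a PHP shorthand size such as '2M' or '8192K' to bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def effective_upload_limit(upload_max_filesize, post_max_size):
    """Return the smaller of the two limits, in bytes."""
    return min(php_size_to_bytes(upload_max_filesize),
               php_size_to_bytes(post_max_size))


# Hypothetical values: 32M requested in .htaccess, 8M left as the POST limit.
limit = effective_upload_limit("32M", "8M")
print(limit // (1024 ** 2), "MB")
```

If the two settings really are 32M and 8M, the usable limit is 8MB, which would match what the import screen reports.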
If that does not work, then as far as I’m aware, you can’t change this setting yourself but your hosting company should be able to change it for you. If they can’t / won’t, then you need to look at a different solution which involves splitting the XML file into several smaller pieces.
First, get your exported XML file: from your wp.com dashboard go to ‘Manage’, then click ‘Export’. You get the option to restrict authors, but this won’t apply to most people. WordPress then saves an XML file to your hard drive. Next, open it up in a text editor. I use TextWrangler, because it’s free and because it helpfully colour-codes tags, but anything should work, even WordPad. What you’ll see is a huge amount of text, with lots of things in tags (such as <channel>, <rss> and <item>). All this text is what WordPress will use to reconstruct your blog on your new installation.
But that upload limit is a bit of a pain. I didn’t experience this issue when I moved The New Wolfs Howl because the export file was quite small (even now it’s only 1.4MB) but after a quick search around the forums it seems that this is quite a common problem. Unfortunately, splitting the XML file is not quite as simple as putting the second half of the file in a different document; there are certain things that have to be in each file.
The first thing to do is to work out how many files you need to split the file into. If your upload limit is 2MB and you have an 8MB file, then I would suggest you need five files. I know that eight divided by two is four, but I’ve added one to take care of the overlap. That will then give you a rough idea of how much of your file has to be moved each time. For example, my XML file is just over 21,500 lines, so I’d want just over 5,000 lines per file.
Take a look at your XML file and at the top you’ll see there are various items of header code (instructions from WordPress, etc). Everything from the top line of the file (<?xml version=…) down to the <wp:base_blog_url>http://… line needs to be in every file. Scroll right to the bottom of the file: the closing </channel> and </rss> tags also need to be in every file. So, before any content has gone in, you want an XML file that looks like this:
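The "divide, round up, add one for the overlap" rule can be written down as a small calculation. This is only a sketch of that rule of thumb; the function names and the overlap_files parameter are invented for the example.

```python
import math


def files_needed(export_bytes, limit_bytes, overlap_files=1):
    """Number of pieces: size divided by limit, rounded up, plus a spare."""
    return math.ceil(export_bytes / limit_bytes) + overlap_files


def lines_per_file(total_lines, file_count):
    """Rough number of lines to move into each piece."""
    return math.ceil(total_lines / file_count)


MB = 1024 * 1024
pieces = files_needed(8 * MB, 2 * MB)  # the 8MB export and 2MB limit from the text
print(pieces)
```

The spare file is there because every piece repeats the header and footer, so the pieces add up to slightly more than the original file.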
<?xml version="1.0" encoding="UTF-8"?>
<!-- This is a WordPress eXtended RSS file generated by WordPress as an export of your blog. -->
<!-- It contains information about your blog's posts, comments, and categories. -->
<!-- You may use this file to transfer that content from one site to another. -->
<!-- This file is not intended to serve as a complete backup of your blog. -->
<!-- To import this information into a WordPress blog follow these steps. -->
<!-- 1. Log into that blog as an administrator. -->
<!-- 2. Go to Manage: Import in the blog's admin panels. -->
<!-- 3. Choose "WordPress" from the list. -->
<!-- 4. Upload this file using the form provided on that page. -->
<!-- 5. You will first be asked to map the authors in this export file to users -->
<!-- on the blog. For each author, you may choose to map to an -->
<!-- existing user on the blog or to create a new user -->
<!-- 6. WordPress will then import each of the posts, comments, and categories -->
<!-- contained in this file into your blog -->
<!-- generator="WordPress/2.5.1" created="2008-07-12 05:47"-->
<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.0/"
>
<channel>
<title>Your Blog Title</title>
<link>http://yourblogdomain.com</link>
<description>Your blog descriptions</description>
<pubDate>Fri, 11 Jul 2008 19:33:37 +0000</pubDate>
<generator>http://wordpress.org/?v=2.5.1</generator>
<language>en</language>
<wp:wxr_version>1.0</wp:wxr_version>
<wp:base_site_url>http://yourdomain.com</wp:base_site_url>
<wp:base_blog_url>http://yourblogdomain.com</wp:base_blog_url>
[this is where the content will go]
</channel>
</rss>
So, now you need to get your content in there. In the first of the XML files, you’ll want to make sure that you include your categories and your tags; these are listed immediately after the <wp:base_blog_url> line. They only need to be included in one file. Then the rest of the file is filled up with content; just look for <item> and </item> tags and cut and paste information between files. Always make sure you only copy complete items, though, otherwise you’ll have an error.
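For anyone who would rather not cut and paste by hand, the same steps can be sketched in a short script. This is an untested-on-real-exports illustration, not a recommended tool: the function name split_wxr is made up, it treats the file as plain text, and it assumes that the literal strings <item> and </item> only ever appear as the real item tags. The first piece keeps the full header (including categories and tags); later pieces keep only the header up to the <wp:base_blog_url> line.

```python
import re

# A complete item, from its opening tag to its closing tag.
ITEM_RE = re.compile(r"<item>.*?</item>\s*", re.DOTALL)


def split_wxr(text, max_bytes):
    """Split an export into pieces of at most max_bytes, whole items only."""
    first_item = text.find("<item>")
    footer_at = text.rfind("</channel>")
    if first_item == -1 or footer_at == -1:
        raise ValueError("missing <item> or </channel>")

    full_header = text[:first_item]   # includes categories and tags
    footer = text[footer_at:]         # </channel> and </rss>
    items = ITEM_RE.findall(text[first_item:footer_at])

    # Later pieces only need the header up to the base_blog_url line.
    marker = "</wp:base_blog_url>"
    cut = full_header.find(marker)
    short_header = (full_header if cut == -1
                    else full_header[:cut + len(marker)] + "\n")

    def size(s):
        return len(s.encode("utf-8"))

    pieces, header, current = [], full_header, []
    used = size(header) + size(footer)
    for item in items:
        # Start a new piece when this item would push us over the limit.
        if current and used + size(item) > max_bytes:
            pieces.append(header + "".join(current) + footer)
            header, current = short_header, []
            used = size(header) + size(footer)
        current.append(item)
        used += size(item)
    pieces.append(header + "".join(current) + footer)
    return pieces


# Example use (file names are placeholders):
# text = open("export.xml", encoding="utf-8").read()
# for n, piece in enumerate(split_wxr(text, 2 * 1024 * 1024), start=1):
#     open(f"export-part{n}.xml", "w", encoding="utf-8").write(piece)
```

A single item that is larger than the limit on its own will still end up in an oversized piece; the sketch does not try to split inside an item, for the reason given above. Work on a copy of the export, never the only one.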
This way of splitting files is a laborious process and will take a fair while, but will work if you do it properly. There are file splitting utilities out there, but I have not tested any of them for effectiveness (or simplicity).
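Before uploading each piece, it may be worth checking that it is still well-formed XML, since a half-copied item is the most likely mistake. The sketch below uses Python's standard xml.etree.ElementTree parser; the function name count_items is invented for the example, and passing this check does not guarantee that WordPress will accept the file.

```python
import xml.etree.ElementTree as ET


def count_items(xml_text):
    """Parse one piece and return how many complete items it holds.

    Raises xml.etree.ElementTree.ParseError if the piece is not
    well-formed, for example when an </item> tag is missing.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    return len(root.findall("./channel/item"))


# Example use (file name is a placeholder):
# print(count_items(open("export-part1.xml", encoding="utf-8").read()))
```

Adding up the counts for all the pieces and comparing the total with the count for the original export is a quick way to confirm that no post was lost or duplicated along the way.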
Comments
12 Responses to “XML file too big to import”


Hi Wolfie-
Thanks so much for your help. Unfortunately, this didn’t work for me either.
But I just want to let you know that the Blue Host (my server) folks finally helped me, AND I did not have to split my XML file into smaller files. All I can say is this is what the support tech said he did:
I went through and cleared all the php.ini’s. I disabled FastCGI. I then created a brand-new php.ini.default; I fixed the parameters, renamed it, and put it in the proper folders. Logged out and logged back in…
Set php5 again to take effect of the changes
@Lisa:
Sounds like there was a lot of stuff broken if he had to do all that! Glad it’s working for you now, though.
Wolfie, have you had any issues with comments not displaying on the WordPress blog? I’ve just migrated from wordpress.com to my own domain and all is working well, except that the comments are in the database but not displaying…
any advice? Thanks, Tammy
@Tammy:
Sorry Tammy. That one’s got me stumped. Provided the comments are present in the XML file, then they should import with the rest of the information from your blog. And I can’t think of any reason why they wouldn’t be present in the XML file.
Hi there Tammy and Wolfie-
I have the same problem. I uploaded and then imported my XML file.
All my comments are there in the admin panel under comments, BUT they are not showing up on the individual posts that they were originally written for. Each post says “No comments” even though the comments did import. They somehow lost their ‘link’ to their respective posts. Any ideas?
Thanks!
LL
http://www.llworldtour.com
This is an odd one. Can one of you send me your XML file so that I can test it with my local installation? Email it to me at wolfieb[at]wolfshowl[dot]com.
hi there, the maximum upload file size of my web host is 8,192KB, but why is it that I can’t still completely import my posts?
@empressofdrac:
Based on the information you’ve given me, I have no idea. What process have you gone through? I need to know the steps you’ve followed so far, and any error messages you’re getting to have any chance of offering a solution.
Thank you, thank you, thank you! This was a tremendous help. I could have saved myself quite a bit of scalp tissue had I found this before ripping my hair out.
@jenifleur
Glad you found the article useful. Sorry it didn’t save the hair-pulling.
Hi there… I edited my .htaccess file, but when I reloaded import.php inside wordpress it returned a 500 error message. I have set my .htaccess like this:
# -FrontPage-
IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*
order deny,allow
deny from all
allow from all
order deny,allow
deny from all
AuthName paulforcey.com
AuthUserFile /home/forceyp/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/forceyp/public_html/_vti_pvt/service.grp
# BEGIN WordPress
php_value upload_max_filesize 32M
# END WordPress
is this correct?
@Princess
I assume that the import.php page loaded before you made the changes, and that it works again when you go back to your original .htaccess file? I’ve cut and pasted your .htaccess file into mine and everything still seems to work. So the problem would seem not to be the .htaccess file itself but something else on your installation.
As I said in the article, this method may not work for everyone.