MS Office Word Error: The file cannot be opened because there are problems with the contents

So, you are working on that important Final presentation for your college Marketing Course and after toiling for hours you find Microsoft Office’s Word will no longer open your file.  When you try, you get an error box that (after clicking on the “Details>>>” button) looks like this:

 

Image

 

Argggggh!!!!!!! Six hours of work down the drain, or so my wife thought. 

So you may have tried Microsoft’s fix, but alas, that was a waste of time and hope.  (Microsoft’s page on this error is here). So here is the simple solution to all your problems.

This is nothing more than an XML error.  XML is eXtensible Markup Language, which is not so much a language as a format (like HTML).  Everything has to be enclosed in containers.  Containers look like this:

<container type="sample">Stuff</container>

Container is a “node”.  A node has a start tag <container> and a matching end tag with a slash </container>.  The stuff inside the start tag, between the node name and the closing angle bracket, is the node’s attributes (type="sample").  This is XML in its simplest form; where it gets tricky is multiple nesting of duplicate nodes.  If this is confusing you, don’t worry, it really isn’t important.
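If you would rather see the idea than read about it, here is a minimal sketch in Python (my choice for illustration, nothing in the fix requires it). It feeds a well-formed container and a broken one to the standard library parser. The <inner> node and the is_well_formed helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# A well-formed container: start tag, content, matching end tag.
good = '<container type="sample">Stuff</container>'

# Malformed: the hypothetical <inner> node is opened but never closed.
bad = '<container type="sample"><inner>Stuff</container>'


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(good))
print(is_well_formed(bad))
```

The second string is the same kind of mistake Word is complaining about: a start tag with no matching end tag.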

1. Make a copy of the offending document.

2. Rename the file to badfile.zip.  Now, if you use Explorer to rename it, that won’t suffice if Explorer is hiding file extensions, because the file keeps the .docx extension at the end (you end up with badfile.zip.docx).  You can go to DOS (cmd) and rename it instead.  Go to your Start menu search box and type “cmd” thusly:

Image

Once you get the distinctive DOS style box, type “ren badfile.docx badfile.zip”, like so:

Image

Now I recommend putting the file in a folder close to the root of the drive to avoid lots of cd (change directory) commands to get to the file.  If you need help navigating to the directory your file is in, google “DOS commands”.

Now all Word 2007+ documents are really just zipped folders.  Once you rename that file, the document icon should turn from a Word icon to a zip folder icon. 

3. Now you can extract the contents of that folder by right clicking and selecting “Extract All…” and accepting the default folder location (should be a folder called “badfile”, the same name as the zip file).
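If you are comfortable with a little scripting, steps 1 through 3 can be sketched in Python instead of Explorer and cmd. This is only a sketch under some assumptions: the function name copy_and_extract and the folder names are my own inventions, and it will only work if the zip container itself is intact (if the file is truly corrupted, zipfile raises BadZipFile).

```python
import shutil
import zipfile
from pathlib import Path


def copy_and_extract(docx_path, work_dir):
    """Copy the document to a .zip, unpack it, and return the path of
    word/document.xml inside the unpacked folder."""
    docx_path = Path(docx_path)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Steps 1 and 2: work on a copy, named .zip, so the original is untouched.
    backup = work_dir / (docx_path.stem + "_copy.zip")
    shutil.copy2(docx_path, backup)

    # Step 3: extract everything into a subfolder.
    extract_dir = work_dir / "extracted"
    with zipfile.ZipFile(backup) as archive:
        archive.extractall(extract_dir)

    return extract_dir / "word" / "document.xml"


# Example call (hypothetical file and folder names):
# document_xml = copy_and_extract("badfile.docx", "repair_work")
```

Python does not care about the file extension, so the rename is really only there to keep the copy recognizable as a zip.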

4. Download an XML Editor.  Free ones are available, but you will need one that will open the file even if the XML is malformed.  Microsoft’s XML Editor won’t do that.  The XML Editor in Visual Studio will, if you have that program.  If not, I have found that Syntext’s Serna Free works well enough, although it is slow with large files.  The key is that you need an XML Editor that will open malformed XML and identify the errors it finds.  Specifically, the file will be missing end tags, that is, the tags of the form </NodeName>.

NOTE: On large files, some of these editors are SLLLLOOOOOWWWWW. Like you type a letter and it takes 20 seconds to show up slow, maybe even slower. You don’t have to do much here so be patient.

5.  Navigate to the badfile folder and select the word directory.  In that directory you will find the offending document.xml.  Open the document in the XML Editor.  Select the default template if prompted, and when it chokes on the error most editors will either open in edit mode or ask if you want to edit in a text editor. 

Image

Since the XML is malformed, the editor will show it as raw text on just one or two very long lines.  Your error message told you what line and column the error is on.  Most XML editors will tell you the exact error, like so:

Image

6.  Go to the spot of the error, either by double clicking the error message so the editor takes you there or just by scrolling to it.  There is normally a column counter somewhere at the bottom of the editor.  Some editors will show the location of the error by underlining the error area in red.

7.  Now we need to insert the end tag so the XML is well formed.  Normally the spot identified as the error is the end of the end tag where the editor expected to find the missing tag.  So, if the error is at position 23213, go to that position and back up to the previous end tag.  This is where you need to insert the missing end tag.  The error message will say what the tag should be but usually omits the brackets and slash.  So if the error said it is missing a v:textbox end tag, you need to insert a </v:textbox>.

Image

 

So, the line should look like this:

Image

 

Now, I would recommend saving and re-opening the file after each fix.  That way you can tell if it is working or not before you spend too much time.  Also, it is important to make a copy of this document.xml file so that if you foul it up terribly, you can start over.
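If the editor is painfully slow, one way to re-check the file after each fix is a few lines of Python. This is a rough sketch, not part of the original procedure: first_xml_error is a name I made up, and it uses the expat parser from the Python standard library, which stops at the first problem it finds. The column it reports counts from 0, so it may be off by one from what Word or your editor shows.

```python
import xml.parsers.expat


def first_xml_error(xml_bytes):
    """Return (line, column, message) for the first well-formedness error,
    or None if the whole document parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_bytes, True)  # True means this is the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


# Example call (the path is hypothetical):
# with open("badfile/word/document.xml", "rb") as handle:
#     print(first_xml_error(handle.read()))
```

Like Word, this only reports the first error, so expect to run it again after each end tag you add until it comes back with None.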

Once all the errors are gone, the file should open in the XML Editor without having to resort to the text file editor.

Image

 

Once this file looks like a real XML file again, you can fix the Word document.  Remember the Word document is now a zip file.  So, right click the good copy of document.xml and select “Copy”.  Double click the zip folder, and navigate to the word folder.  Once you are in the word folder, right click and select “Paste”. It will ask if you want to overwrite the file; you, of course, say “Yes”.

Almost done.  Now you just need to rename the file back to a .docx name of your choosing.  Again use cmd (DOS), the same as in step 2, and type “ren badfile.zip repaired.docx”, or whatever you want to call the fixed file.
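For the script-minded, the copy-paste-rename dance can also be sketched in Python. Treat this as an illustration under assumptions rather than the official way: rebuild_docx and the file names are hypothetical, and it writes a brand new archive instead of editing the zip in place, copying every entry across and swapping in the repaired document.xml.

```python
import zipfile


def rebuild_docx(original_zip, fixed_document_xml, output_docx):
    """Copy every entry of the original archive into a new .docx,
    replacing word/document.xml with the repaired file."""
    with zipfile.ZipFile(original_zip) as src, \
            zipfile.ZipFile(output_docx, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            if name == "word/document.xml":
                # Swap in the repaired XML instead of the broken original.
                with open(fixed_document_xml, "rb") as fixed:
                    dst.writestr(name, fixed.read())
            else:
                dst.writestr(name, src.read(name))


# Example call (hypothetical file names):
# rebuild_docx("badfile.zip", "document.xml", "repaired.docx")
```

Writing a new file leaves badfile.zip untouched, so you can try again if Word still refuses to open the result.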

It should now open in Word. Yippee.

If this is all too much, email me and maybe we can work something out.


37 responses to this post.

  1. Posted by Ahmed Yasser on April 26, 2013 at 3:08 pm

    I’d Like to thank you very very much, because your tutorial helped me to restore very important data.


  2. You are very welcome. Glad to hear this helped. I know how frustrating it is to lose work.


  3. Posted by matt on May 20, 2013 at 12:15 pm

    Hey, I also had a similar problem. My issue was with a Macro Enabled Document(.docxm). I eventually fixed it like this:

    1. Edit the ‘m’ off the end of the extension.

    The document then opened as a regular .docx, and worked just fine. Which makes me think my problem was with a Macro.

    Note: I was tipped off to the potential that it was a macro when the document opened in Libreoffice, but not Word. So that might be something to look out for


  4. Posted by bart on May 29, 2013 at 8:26 am

    i follow the steps, but sternafree does not come up with any errors. however, the file is still damaged… how can i locate these errors myself?


    • If the XML Editor (most of them from my experience) do not give you any warnings or errors when you open the document.xml file then it is probably not a malformed xml error. I have helped people with other issues that generate this error and will post some of those fixes.


    • I know this was a couple of weeks ago, but if you are still having a problem, post the exact error message (every word of it); that could help identify the real issue.


  5. Posted by Nigel on June 16, 2013 at 1:15 pm

    This is the fix that helped me too. For everyones’ info, Syntexts Sernafree is no longer available as of June 2013. I used Oxygen XML Editor instead.


    • Nigel, thanks for the tip on the Oxygen XML Editor. Is it responsive? Biggest problem I had with Serna Free is it was very slow on large files.


      • Posted by Nigel on June 17, 2013 at 5:08 am

        Oxygen was fairly good yeah. The document.xml I was playing with was 20MB uncompressed so it wasn’t quick. I was waiting no more than 10 secs at any point (this was on a 7yr old laptop). Only thing is that Oxygen is not free. I used the 30-day trial version for which you need to apply to get a trial license. Also, I had to do several iterations of: Validate, add an end tag, validate, add an end tag, validate . . . Word 2010 bombs and reports only the first missing end tag it finds. Honestly, I’m not an XML whiz, just dabbled in programming. If I could fix this, it should be childs play for the MS development team to write repair routines built into Word, at least for the common problems.

  6. So, this fix works only when you see the additional info about “the end tag….”. Tags are specifically xml implements. There are many supplemental messages you may see with this error, such as corrupt files, that have other fix steps.


  7. Posted by zet on July 15, 2013 at 6:37 am

    i need help, can you please fix mine ? we had a similar or even same problem. if you want i’ll send mine on your email. please, i can’t try it from your tutorial. there is a deadline i need to fulfilled. i’m begging you


  8. Posted by mark robert on August 30, 2013 at 12:31 pm

    I need help me too, can you repair my document??i need to my thesis….plsssssssssssssss…God Bless


  9. Posted by Chane on October 12, 2013 at 10:42 pm

    Hi, I have installed Syntext Serna Free but the program keeps not responding. It is a large file – 15 Mb. Is there any way that you could maybe take a look at it? I spent 7 hours doing the final touchups on the document and then damn Word went and did this.


    • Sure, send me the file at the link at the bottom of the blog. If it is an XML error, I can use Visual Studio’s XML editor. Most of the files I get sent have standard corruption and not the XML error I outlined above. I have less success when it is a standard file corruption issue (I once recovered most of a file that was corrupted).


      • Posted by Chane on October 13, 2013 at 7:41 am

        From what I have seen once Syntex Serna finally opened there are 4 xml errors. Only problem is that the program doesn’t respond long enough for me to actually fix it. I have sent the email.

  10. Posted by Elspeth on October 14, 2013 at 11:14 pm

    Brilliant article this enabled me to recover a document that one of our senior managers had been working on all day. I really appreciate you posting this information.


  11. So, a couple of quick notes. I have received dozens of requests to help and I do my best if I have the time. About 60% of the errors I receive are generic corruption errors, and not XML errors. I have been able to partially recover one corrupted file; with the remaining 5 or so I was no help. With corruption, the issue is with how MS zips and unzips the document directory structure. The remaining 40% were true XML errors, and I have been pretty successful with those. One common error that MS seems to create is improperly nested nodes, where an outer node is closed before a node that was opened inside it. This often occurs with graphical objects. These can be easily fixed by moving the inner node’s open tag in front of the outer node’s open tag. One common one I found has to do with Math objects (oMath) and the AlternateContent tag. What you see in the document is an AlternateContent node that opens, an oMath node that opens inside it, and then the AlternateContent end tag arriving before the oMath end tag. This is wrong because the AlternateContent node is closed before the oMath node. The issue is corrected by moving the oMath open node in front of the AlternateContent node.

    Microsoft has a specific FixIt solution for this which I have NEVER, EVER gotten to work on any instance, when clearly this is the problem. None of the FixIt programs have EVER worked even though it is clear the problem it is supposed to fix is the solitary issue with the document. Oh well.

    The biggest problem with implementing my fix is that it is difficult, especially if you are not XML savvy. The second issue is that XML editors are very slow on these large files. I use Visual Studio (not free, very expensive) and have learned a tip: enter the node you want to insert into another document (Notepad for instance) and copy it into your clipboard (Ctrl-C). Then paste it into the XML Editor at the desired location. That is better than typing each letter individually and waiting 30 seconds for it to put it into the document. The reason for the long delay is that most XML editors reparse the entire document after each individual change, so an insert from the clipboard is one change, just as typing a single letter is a single change. It still takes 30 seconds, but you only have to do it twice (once to delete the node, once to copy it in the new location).

    I wish all of you luck, don’t hesitate to contact me. If I don’t respond right away, try mailto:kferrel@asu.edu. This appears to be showing up in search engines as I got no emails for the first 6 months, now I get something several times a week.

    Good luck to all.


  12. I had this same error , when I tried to open word 2013 !
    you can see the article, it helped
    http://microsoftshare.blogspot.com/2013/10/resolved-why-i-cant-open-my-word-2013.html


  13. Posted by Janne on November 22, 2013 at 8:33 pm

    THANK YOU!
    I had about half a year work behind with my bachelor’s thesis and you saved me. I am not a geek and your guide was perfect. =)


  14. Posted by xStatic on November 24, 2013 at 8:51 pm

    This post was FANTASTIC! Helped me recover a very important document


  15. Posted by Yasir on November 25, 2013 at 2:12 pm

    Thank you so much sir, I recovered my final year project report through your guide…God bless you.


  16. Posted by Aka on March 18, 2014 at 1:11 am

    This is amazing.
    I had a document about 35 pages with 2 pictures per page + comments. Document fucked up with no reason, saved perfectly etc.

    Tried to convert pdf to doc, tried to open in gmail etc etc nothing worked.

    Downloaded Oxygen XML, opened document.XML and used “Checked Well-Formedness” or use Ctrl+Shift+W (it is on the Red Tick Page for the icon).
    Then I followed up your tutorial, and as soon as I added ” </ " it was automatically writting the correct text.

    Perfect, followed up tutorial, rebuilt .docx and worked.

    A huge thank you to you.


    • Aka,

      It is such a simple fix, I am not sure why Word doesn’t try to make the fix when it sees corruption. Seems to happen a lot with the oMath nodes, which I believe is a function inserted into the document.

      Glad it helped!

      Regards,
      Ken


  17. Posted by Ali Kaiser on March 24, 2014 at 10:48 pm

    Dear sir thank you so so so so very much….u don’t know how many blogs and fora i tried before stumbling here, and that was at a time when our group had given up and decided to stay up indefinitely to remake the project report….everyone was breaking down because we had lost 10 days worth of work (of course we have separate versions here and there but only one master file which had a lot of unique compiled info)

    The fact that the file finally opened has literally made us breathe so hard!!


    • Ali,

      So very glad my blog helped. I tried to make it as easy as possible to do; it is a complicated task.

      Not sure why Microsoft can’t get this figured out.

      Regards,
      Ken


  18. Posted by Shawn on November 5, 2013 at 10:32 pm

    My link was wrong, this is the correct link, sorry for the confusion.

    https://skydrive.live.com/redir?resid=A79B758FDC35E8D9!315


  19. I’ll try to take a look at it tonight or first thing in the morning. Wish me luck!


  20. Posted by Shawn on November 6, 2013 at 12:10 am

    Great, thank you very much! I hope you have better success than me. Good luck!
