Monday, February 20, 2012

Kindle Formatting 1: Preparing Word Document

Microsoft Word Used: Office 2010, Version 14.0.6112.5000 (32-bit)

Clean Up the Word Document

Go to File → Options → Proofing → AutoCorrect Options and, on the “AutoFormat As You Type” tab, make sure “Straight quotes” with “smart quotes” is ticked.

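Go through your document and use “Find/Replace” to replace a single quote ( ' ) with a single quote ( ' ), and a double quote ( " ) with a double quote ( " ). Trust me, it works: with the smart quotes option ticked, Word curls each straight quote as it replaces it.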
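Now, go through your document and find every place you have a contraction that starts with an apostrophe, like ‘tis or ‘66, and fix it. Word curls these the wrong way, as opening quotes (‘) instead of apostrophes (’). Type a’tis and delete the “a”. Yep, it’s a pain, but hopefully you don’t have that many contractions.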
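If you have a lot of these, a small script can do the hunting on text copied out of Word. This is only a rough sketch: the word list is my own guess at common cases (not anything Word or Kindle defines), and a genuinely quoted word such as ‘cause’ would be changed too, so check the result by eye.

```python
import re

LEFT_QUOTE = "\u2018"   # opening single quote, what Word wrongly inserts
APOSTROPHE = "\u2019"   # closing single quote, which doubles as the apostrophe

# Hypothetical list of elided words; extend it for your own manuscript.
ELIDED_WORDS = ("tis", "twas", "em", "til", "cause")

# Match an opening quote only when it is followed by a two-digit year
# or by one of the elided words, each ending at a word boundary.
_PATTERN = re.compile(
    LEFT_QUOTE + r"(?=(?:\d{2}|(?:" + "|".join(ELIDED_WORDS) + r"))\b)",
    re.IGNORECASE,
)

def fix_leading_apostrophes(text):
    """Turn a wrongly curled opening quote into an apostrophe."""
    return _PATTERN.sub(APOSTROPHE, text)
```

Ordinary quoted phrases such as ‘Hello’ are left alone, because the opening quote is not followed by anything on the list.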

You want to fix all of these smart quotes before you start editing HTML, because you do not want curly quotes in HTML tags.
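Once you do have an HTML file, you can double-check that no curly quotes slipped inside a tag. Here is a minimal sketch, assuming the file contents have already been read into a string; the function name is mine, and the tag pattern is deliberately simple rather than a full HTML parser.

```python
import re

# The four curly quote characters: left/right single, left/right double.
CURLY_QUOTES = "\u2018\u2019\u201c\u201d"

# Rough pattern for one tag: everything from "<" to the next ">".
TAG = re.compile(r"<[^>]*>")

def tags_with_curly_quotes(html):
    """Return every tag that contains at least one curly quote."""
    return [
        match.group(0)
        for match in TAG.finditer(html)
        if any(quote in match.group(0) for quote in CURLY_QUOTES)
    ]
```

Curly quotes in the text between tags are fine and are not reported; only quotes inside the angle brackets are.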

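Replace all double spaces with single spaces. Go to “Find/Replace” and space twice in the Find box, and once in the Replace box. Repeat “Replace All” until Word reports no more replacements, so runs of three or more spaces are caught too. Modern typography does not require a double space at the end of the sentence.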
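For comparison, the same cleanup on plain text takes one pass with a regular expression. This is just an illustrative sketch with a function name I made up; it only touches ordinary space characters, not tabs or non-breaking spaces.

```python
import re

def collapse_spaces(text):
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", text)
```

Unlike a single Find/Replace pass in Word, the pattern handles runs of any length at once.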

Go ahead and remove all page numbers, headers and footers that will not be in the Kindle file.

Generate Table of Contents without Page Numbers

In the Table of Contents dialog, uncheck “Show page numbers” and make sure “Use hyperlinks instead of page numbers” is checked.

Insert the Kindle Required Markers

Kindle requires two markers: “toc” for the Table of Contents, and “start” for the beginning of the text.
Put your cursor right before the “T” in the Table of Contents, go to the Insert tab, and hit Bookmark. Add “toc”.
Now put your cursor right before the first letter of either your “Part 1” or “Chapter 1” and insert a bookmark. Add “start”.
Because of a bug in KindleGen [kindlegen(Windows) V2.3 build 36043] we will end up deleting the Table of Contents and using Calibre to generate it. But for now, leave it in so you can use the bookmark.
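Later, after the document has been saved as HTML, you can confirm that both bookmarks survived. The sketch below assumes Word writes each bookmark as an anchor tag with a name attribute (something like <a name="toc">); that matches what I have seen, but treat it as an assumption and look at your own file. The class and function names are made up for this example.

```python
from html.parser import HTMLParser

# The two bookmark names described above.
REQUIRED_MARKERS = {"toc", "start"}

class AnchorCollector(HTMLParser):
    """Collect the name/id of every <a> tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key in ("name", "id") and value:
                    self.names.add(value)

def missing_markers(html):
    """Return the required markers that are not present, sorted."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return sorted(REQUIRED_MARKERS - collector.names)
```

An empty list means both “toc” and “start” were found.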

Generate HTM

Create a sandbox folder or directory.
Place the following documents in your sandbox:
• Word doc, mynovel.docx [Word 2010 format]
• Cover image, mycover.jpg [sized to approx. 800x600]
Eventually, two other files will also be placed in the sandbox: toc.ncx and mynovel.opf. But let’s not worry about these yet.
Save the Word file in the sandbox as type "Web Page, Filtered" (.htm) to strip most of the Microsoft-specific markup. You'll notice an additional folder (mynovel_files) added, where any images you might have will be stored. Don't worry about the presence of the "toc.htm" file in the picture below. It does not get used because I haven't figured out how to include it in the source file. I use Calibre later on to generate the Table of Contents; however, we need the Word-generated TOC tags to generate the required toc.ncx file.
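A quick way to confirm the sandbox holds what it should at this point is a few lines of Python. This is only a sketch: the file names come from the list above, "mynovel.htm" is my assumption for what Word names the filtered page, and the function name is made up.

```python
from pathlib import Path

# Names from this post; "mynovel.htm" is assumed from the .docx name.
EXPECTED_FILES = ("mynovel.docx", "mycover.jpg", "mynovel.htm")

def missing_sandbox_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not in the folder."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]
```

Call it with the path to your sandbox folder; an empty list means everything is in place.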

Download and install Notepad++ from
Download and install Calibre from
Congratulations. Take a break and have a cup of coffee.

