12-02-2012 11:47 PM
My name is David and my wife, Pat, finished her first epic fiction novel a few days ago. I'm handling the technical side of self-publishing her work as an ebook. My experience in electronic publishing and prepress stretches back over a decade.
We published her book first at Amazon because I've prepared books for the Kindle before and am familiar with mobi file creation. I'm hoping to duplicate those efforts here for the good folks at B&N.
I write directly in html and the text for Pat's book is stored in a single 4 MB html-coded file. Why one text file? Because it makes editing—especially global tasks—much easier. The file is about 1.7 million characters (approx. 385 thousand words) so it's about the size of three good-size books (roughly 1,300 pages if it were printed). And, since there is only one text file, I placed my css definitions in it so that a separate css file would be unnecessary. Finally, to keep things simple, I am using only the stock fonts in the Nook—no external fonts are used or included.
I had no trouble creating the opf and ncx files to generate my own mobi master for the Kindle from my html. The layout looks good and it paginates well. But I've run into a variety of problems trying to duplicate this feat for the Nook.
I understand that each platform has its idiosyncrasies in the way it interprets html and the default text (css) styles it imposes. So, we're not expecting the book to look the same on the Nook as it does on the Kindle. Yet I've run into such serious problems with the Nook that I'm on the verge of abandoning my effort to support it.
Here's where I am: I've successfully created an ePub version of the book that is compatible (as far as I can tell) with the Nook and passes the ePubCheck test with zero errors. But it works erratically in the Nook PC (Windows) software. Here are the problems I'm having:
The closer the reader gets to the end of the book (remember, this book is BIG), the slower it goes. Pages advance normally at the beginning but can take over 20 seconds to advance near the end.
I use the standard css "page-break-before" attribute to force each chapter to begin on a new page. When the reader stops at a chapter beginning and chooses to go back to the previous page (the last page of the preceding chapter), the Nook software will sometimes beep and refuse to advance back to the previous "page". This happens to about 5-10% of the book's 75 chapters and it consistently occurs at the same ones.
We've included a number of internal hyperlinks to aid reader navigation. For example, at the end of each chapter is a link to jump back to the beginning of the chapter and another link to jump back to the Table of Contents. When the reader taps a hyperlink that should take them to a previous ID tag in the book, it works some of the time but jumps to a wrong location other times. For example, the same hyperlink will sometimes jump to the first Table of Contents page (correct) and other times jump to the Cover page (incorrect).
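One way to rule out the html itself as the source of wrong jumps is to check that every internal link points at an ID that actually exists in the file. The following is only a rough sketch: it uses simple regular expressions rather than a real HTML parser, the function name `find_broken_fragment_links` is made up for illustration, and the sample markup is invented, not taken from the book.

```python
import re

def find_broken_fragment_links(html):
    """Return fragment targets (href="#x") that have no matching id="x".

    Crude on purpose: regexes will also pick up attributes such as data-id,
    so treat the result as a hint, not a validation report.
    """
    ids = set(re.findall(r'\bid\s*=\s*["\']([^"\']+)["\']', html))
    targets = re.findall(r'\bhref\s*=\s*["\']#([^"\']+)["\']', html)
    return sorted({t for t in targets if t not in ids})

# Invented sample: two links resolve, one points at an id that is missing.
sample = '''
<h1 id="toc">Contents</h1>
<h2 id="ch1">Chapter 1</h2>
<p><a href="#ch1">Top of chapter</a> <a href="#toc">Contents</a></p>
<p><a href="#ch2">Next</a></p>
'''

print(find_broken_fragment_links(sample))
```

If a check like this comes back empty for the real file, the inconsistent jumps are more likely to be down to the reader software than to the markup.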
I've carefully examined my html code and am convinced that these problems are the fault of the Nook software. Unfortunately, the only test platform that we seem to have from B&N is the Windows PC Nook software and it seems, frankly, very poorly implemented.
Question: Is it possible to test my ePub file with the Android app version of the Nook software? Apparently, the only way to load a user-created ePub file into the Windows PC Nook software is to drag the file from the Windows Explorer window onto the Nook software window. The Nook software appears to have no "file open" dialog or facility. Is there a way to load a user-created ePub file into the Android Nook app?
This is important because we're at the "go / no go" stage with this project and, if the Nook platform cannot handle large books reliably, then we'll have to forgo it.
Question: The B&N ePub Formatting Guide (which doesn't appear to have been updated since 2010) states that each html file should be less than 300 KB in size. Obviously, my single 4 MB file (4,000 KB) exceeds this. Could this be the reason why my book becomes so sluggish as the reader gets farther into it? Would the same problem occur if the book were divided into 14 x 286 KB html files with a separate css file? Sadly, we're unwilling to break the book into multiple smaller text files because it would make editing a nightmare. But it would help me better understand the nature of the problem.
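For reference, the "14 x 286 KB" figure follows directly from the numbers above. A quick check, taking 4 MB as 4,000 KB as the post does:

```python
import math

TOTAL_KB = 4000   # uncompressed size of the single html file, as stated in the post
LIMIT_KB = 300    # per-file limit quoted from the B&N ePub Formatting Guide

# Smallest number of files that keeps every file under the limit.
file_count = math.ceil(TOTAL_KB / LIMIT_KB)

# Average size if the text were divided evenly across those files.
avg_kb = TOTAL_KB / file_count

print(file_count, round(avg_kb))  # 14 286
```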
By the way, the 4 MB size is the uncompressed file size. The file compresses almost 80% when I add it to my ePub wrapper file. The compressed html text file that the Nook software "sees" is actually 862 KB in size. The total size of my compressed ePub file is 1.23 MB.
Also, I assembled an uncompressed 4.5 MB version of my ePub file to see if it would have any effect on the speed of the Nook software and it did not appear to. The Windows Nook PC software was able to open the uncompressed version of the ePub file with no problem and it seemed to exhibit similar performance as the compressed file.
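The compressed and uncompressed variants described above can be reproduced with Python's standard `zipfile` module. This is a minimal sketch, not a complete ePub: the helper name `build_archive`, the entry name `OEBPS/book.html` and the filler text are all assumptions for illustration, and a real ePub also needs its container, opf and ncx files. The one firm rule shown is that the `mimetype` entry goes first and is stored uncompressed.

```python
import io
import zipfile

def build_archive(files, compression):
    """Zip `files` (name -> text) in memory using the given compression method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The ePub container format expects "mimetype" first and uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            zf.writestr(name, data, compress_type=compression)
    return buf.getvalue()

# Filler text standing in for the book; repetitive, so it compresses very well.
files = {"OEBPS/book.html": "<p>Lorem ipsum dolor sit amet.</p>\n" * 2000}

stored = build_archive(files, zipfile.ZIP_STORED)      # "uncompressed" variant
deflated = build_archive(files, zipfile.ZIP_DEFLATED)  # usual compressed variant

print(len(stored), len(deflated))
```

Either way the reader has to inflate the html back to its full size before laying it out, which would be consistent with the two variants performing about the same.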
I hope you can help and I look forward to hearing from you. Thanks in advance for your kind response.
Warm regards, David (for Pat)
12-03-2012 01:51 AM - edited 12-03-2012 01:54 AM
I don't have a direct answer for you; I've never seen an EPUB use one massive html file as opposed to splitting it up into one file per chapter.
For the sake of curiosity I created an EPUB just now with a single 2.1 meg html file. I managed to get Nook for PC to crash with it. I've never had Nook for PC crash on any EPUBs. I then tossed it in Adobe Digital Editions and though it didn't crash, it did struggle with it at some parts. I don't have my actual Nook on hand right now to test it on there, though I'm now somewhat curious to see how it handles an EPUB with a single large HTML file, and whether I can crash it too or cause it to lock up.
I also created a 300k word file with no page breaks and nothing to signify any chapter breaks or new pages. I tossed it in a program that creates EPUBs to see how it would handle it and it forced separate html files all split up approximately the same size.
Not very scientific and not 100% conclusive but I'm leaning towards your problem being the one single large HTML file. Have you tested the EPUB on a Nook reader or in other EPUB readers on the PC? (Adobe Digital Editions is a popular one). Regardless of how Nook for PC handles the EPUB you should test it on a handheld Nook device.
My suggestion would be to split the HTML into <300k files and test it. No need to perfect the book in that form, just split it up enough so that it will function in Nook for PC as a test. That shouldn't take a lot of time and should answer your question definitively.
As for editing an EPUB with multiple HTML files: once you have your final EPUB file and you come across errors you need adjusted, just use Sigil to edit the EPUB directly. I can't imagine how having one html file versus multiple (when edited using Sigil) would be beneficial.
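A quick test split along these lines could be scripted rather than done by hand. The sketch below assumes every chapter starts with an `<h2>` tag, which is a guess about the markup and would need adjusting; the function names `split_chapters` and `pack` are made up for illustration. It cuts the body text before each chapter heading and then groups consecutive chapters so each group stays under the limit.

```python
import re

LIMIT_BYTES = 300 * 1024  # per-file ceiling taken from the B&N formatting guide

def split_chapters(body):
    """Cut the body before every chapter heading (assumed here to be <h2>)."""
    starts = [m.start() for m in re.finditer(r"<h2\b", body)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)          # keep any front matter before chapter one
    ends = starts[1:] + [len(body)]
    return [body[s:e] for s, e in zip(starts, ends)]

def pack(chapters, limit=LIMIT_BYTES):
    """Greedily group consecutive chapters so each group stays under `limit`.

    A single chapter larger than the limit still ends up alone and oversize.
    """
    groups, current, size = [], [], 0
    for chapter in chapters:
        n = len(chapter.encode("utf-8"))
        if current and size + n > limit:
            groups.append("".join(current))
            current, size = [], 0
        current.append(chapter)
        size += n
    if current:
        groups.append("".join(current))
    return groups

# Tiny invented body and a tiny limit, just to show the grouping.
body = "".join("<h2>Ch%d</h2>" % i + "x" * 100 for i in range(3))
groups = pack(split_chapters(body), limit=250)
print([len(g) for g in groups])
```

Each group would still need its own html head and its own entry in the opf and ncx files, and any link that crosses a file boundary has to name the target file as well as the ID.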
Hope some part of this helps.
12-03-2012 03:28 AM
Just a random suggestion, but if the mobi works well for you, have you tried converting mobi to epub using Calibre or a similar program?
Also, no offense, but it kind of sounds like you're really overcomplicating the whole process. Are you uploading the HTML right to PubIt! or trying to convert it first? You can upload it as is without converting it, which might fix some of the errors you're having, I imagine. If you do that, you can download the converted epub before publishing and test it on whatever device you want, too.
12-03-2012 12:00 PM
Thank you for the thoughtful reply.
Respectfully, the reason for the single document file seems obvious to us. Perhaps we're missing something. Why would you want to break a book up if you didn't have to? It seems illogical to do so.
Have you ever written and edited a very large fictional story? As the story develops, you continually need to revisit different parts of it to update the spelling of a name or make other changes for story consistency that will be scattered throughout. This is complicated by the fact that Pat has created a number of different languages. They make a minor appearance in Book One but will become more significant in Book Two. It was a lot easier to work on them in one place than scattered through different files.
Plus there's the fact that, when working with a large story with parallel story threads, the writer can't keep every detail in mind. You may need to find where something was mentioned by a character but you can't remember where. Again, having one file makes a search so much easier.
We're aware that some word processor and document editor programs allow the user to work on a publication composed of multiple documents as a whole. But we've been unimpressed with how well this is implemented.
Regarding Sigil, we learned about this program some time ago and wanted to try it. But we have never been able to get it to install on any of our computer systems. The installer always errors out. I'm a Microsoft Windows Developer, so my skills in this area are pretty deep. I've written a number of installers, myself. Believe me: If there were a way to get Sigil to work on our systems, I would have found it.
But even though we couldn't use Sigil itself, we still found it helpful. This is because its User Manual is a DRM-free ePub document. As you're probably aware, an ePub file is really just a zip file serving as a wrapper for a group of folders and files containing the various parts of the publication. All you have to do is rename it by changing its ".epub" extension to ".zip" and Windows will allow you to open it and expand its contents. It gave me a great ePub sample.
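The same inspection can be done without renaming anything, since Python's standard `zipfile` module will open a DRM-free ePub directly. A minimal sketch: the function name `list_epub` is made up, and the demo archive is built in memory with a typical layout (`mimetype`, `META-INF/container.xml`, and an opf under an `OEBPS` folder, the last being a common convention rather than a rule). It is not the actual contents of the Sigil manual.

```python
import io
import zipfile

def list_epub(source):
    """Return (name, uncompressed size, compressed size) for each entry.

    `source` can be a path to an .epub file or a file-like object.
    """
    with zipfile.ZipFile(source) as zf:
        return [(i.filename, i.file_size, i.compress_size) for i in zf.infolist()]

def make_demo_epub():
    """Build a tiny stand-in archive in memory so the example runs anywhere."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", "<container/>")
        zf.writestr("OEBPS/content.opf", "<package/>")
    return buf.getvalue()

demo_bytes = make_demo_epub()
for name, size, packed in list_epub(io.BytesIO(demo_bytes)):
    print(name, size, packed)
```

With a real file, `list_epub("manual.epub")` would do the same job, where the file name is only a placeholder.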
It amazes me that B&N does not provide samples like this for self-publishers to examine. Amazon has a variety of great samples in its KDP area (both fixed and non-fixed layouts). It makes me wonder just how serious B&N is about self-publishing on the Nook.
I've heard from some self-published writers that they have a higher sales volume on B&N than Amazon. But I'm unsure what genres they represent. I don't want to hamper the success of Pat's book by not publishing for the Nook platform. But my time is limited and I'm not sure that I'll have time to divide the text file.
What I'd like to do first, is test our existing ePub product on an Android version of the Nook software. After all, the Nook is simply a custom Android tablet. Therefore, I expect the Android Nook app will provide a better test bed for our ePub file than a Windows PC. If the problems persist under the Android version, then we'll definitely give strong consideration to breaking our html into sub-300 KB files. But I must confess that it irks me that the Kindle is fine with our single-html-file method and the Nook may not be. It makes it more work for self-publishers who want to support a variety of platforms.
Does anyone know if a self-published ePub file can be opened with the Android Nook app? If so, how?
Kind regards, David (for Pat)
12-03-2012 01:12 PM - edited 12-03-2012 01:22 PM
Thanks for the suggestion. We were unaware that Calibre would convert a mobi to an ePub. I'll look into that further.
I hope that I'm not overcomplicating things. Thanks for mentioning it.
Let me ask you something—I mean no disrespect. How many ebooks have you read and what do you think about the overall quality of their layout and typography? I'm not asking about the technical limits of the readers and their software. I'm asking about the quality of the implementation of ebooks within those environments.
As a professional who has worked in and with the publishing industry for many years, I am appalled at the poor quality of MOST ebook implementations. Even more amazing is the fact that the quality of work seems to have very little relationship with the size of the publisher. I've seen many ebooks from well-known professional publishing houses whose layout and formatting are just as lousy as that from rank amateurs with no idea how to create an attractive, readable layout.
It doesn't take much research to discover that the converters offered by the ebook distributors (like Amazon and B&N) are imperfect. Even if you have the artistry and skill to create a great layout, the converters are likely to introduce problems.
This is why it is generally understood in the ebook industry that the ONLY way (at present) to create a great layout for your book is to control the process yourself. Whether you are a professional publisher or a self-publisher, you have to do the whole thing, yourself, if you want the best results.
The goal is to provide the distributor with a ready-to-publish ebook file so their converter doesn't touch your work. Ideally, all they should need to do is wrap their DRM layer around your file prior to publication. For Amazon, this means the publisher needs to create the ready-to-publish mobi or Format 8 file. For B&N, this means the publisher needs to create the ready-to-publish ePub file.
Unfortunately, this complicates things. Creating a great ready-to-publish file will almost certainly require some expertise with both html and ePub. Why? Because the idiosyncrasies of the way the ebook platforms interpret the underlying html codes vary. This forces the publisher to tweak the html code of their layout to fit each destination.
Consider the differences between the Kindle and Nook formats:
The Kindle tries to be more like a printed book. Its html interpreter tries to suppress the space that is normally inserted between html paragraphs while trying to force automatic indentation of the first line of every paragraph. If you need to do something different (for example, you have nested sections of quotation where a text is quoted by a text that is quoted) then you must know how to manually compensate with css definitions to circumvent the Kindle defaults.
The Nook tries to be more like a webpage. Its html interpreter tries to suppress css definition attributes which would automatically indent the first line of a paragraph as well as block efforts to suppress the space normally present after an html paragraph. This forces the publisher to take an opposite approach when editing html code for the Nook when indentation and suppression of paragraph space is desired.
The reason I wrote that both platforms also require the publisher to have expertise with the ePub format is because both the Kindle and Nook publication process require opf and ncx files.
Neither platform is adequately documented and it is left to the publisher to experiment and discover, for example, which css margin attributes are recognized and, if they are, how they are interpreted.
It's too bad that this is the state of the ebook industry today. If someone would create a world-class ebook wysiwyg editor program that is capable of creating optimized ready-to-publish files for Amazon and Nook, then much of the complexity would disappear. The user wouldn't have to know html or the details of the ePub's various standards. The only requirement then, would be talent and skill on the part of the publisher to know what constitutes an attractive and readable layout. I'd pay for such a program in a heartbeat.
I fully expect such a program to appear someday—if we're still here. In the meantime, one of the tools I use is EditPad Pro 7.1.1. It is a great html and xml editor.
In summary, the reason for the complexity is to gain control. Specifically, control over the layout with which an ebook will be presented to the reader on their chosen reader platform.
Kind regards, David (for Pat)
12-03-2012 04:40 PM - edited 12-03-2012 05:36 PM
Not to underestimate your abilities but have you tried installing Sigil with all AV, firewall, etc programs turned off? I had issues at one point with Zone Alarm and MSE crashing out install programs and the second I shut them both off and rebooted without either starting up at all the install went in fine.
I don't know what other programs you've tried for EPUB editing but I feel like if you saw Sigil in action you would understand why I think you could split your HTML file up into multiple ones and still have an easy time editing it. As you may have noticed from my sig at the end of my last post, I do conversions and formatting for ebooks. In thinking back at past books I cannot imagine a time where having all the HTMLs combined into one would've made things easier on me. Sigil will let you search across the whole book and has an easy toggle to go from book view to code view and I'm only guessing here but I think that would cover what you are describing in searching/editing.
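Whole-book searching does not depend on any one editor, either. As a rough illustration (plain scripting, not anything Sigil itself provides), a few lines of Python can search every html file in a folder and report the file and line of each hit. The function name `search_book`, the file names and the sample text are all invented.

```python
import re
import tempfile
from pathlib import Path

def search_book(folder, pattern):
    """Return (file name, line number, line text) for every matching line."""
    rx = re.compile(pattern)
    hits = []
    for path in sorted(Path(folder).glob("*.html")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, 1):
            if rx.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits

# Demo with two invented chapter files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ch01.html").write_text("<p>Aldric rode north.</p>\n<p>Rain fell.</p>\n",
                                      encoding="utf-8")
    Path(tmp, "ch02.html").write_text("<p>Snow fell.</p>\n<p>Aldric slept.</p>\n",
                                      encoding="utf-8")
    for hit in search_book(tmp, r"Aldric"):
        print(hit)
```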
(To answer your other question) I've formatted/converted/edited books with the features you've described (I've formatted/edited a lot of fiction, non-fiction, and reference books. I've formatted books in English, Norwegian, and German. I've formatted and edited books with over 350 internal links (end notes) that take the reader directly from where they're reading to the corresponding endnote in the back of the book and then back to exactly where they left off) and not had any of the issues you've mentioned and the one thing they don't have in common seems to be the single HTML file. It of course could be some CSS issue in your book but I feel like with the B&N EPUB formatting guide saying the files need to be a certain size and my rudimentary test I did from my last post that your issue is in that single file HTML.
You mention wanting to test it on an Android version of the Nook software, but even if that works you'll still have the issue with it not working on Nook for PC. I would think the goal should be to get it working across all platforms.
Kindle's (MOBI) format is proprietary and used only by them. They set the rules and they make the devices that read it. EPUB format is an open format with standards used by many readers and many companies and not created by B&N. *IF* keeping HTML files under 300K is part of that standard then it likely exists as one of the many guidelines for companies to build their readers around so Company X's Reader can read the same file that Company Y's can.
One other note because of your mention about creating the EPUB for B&N - The EPUB file you're creating can be used on at least 2 other major (major and semi-large) retailers directly. Probably not right for me to mention their names here but when you have a working EPUB you might as well go direct through 3 instead of just this one.
I'm curious to help get to the bottom of your issue and can test your EPUB on a Nook device for you if you like. If you're interested just go to my website and submit via the form your info so I can contact you or use the email address listed at the top of that form. (I'm not listing it here so as to avoid web crawlers from finding it and starting up the spam machine).
12-03-2012 08:15 PM - edited 12-03-2012 08:19 PM
Thanks again for another great reply!
I worked through the answer to the first question of my original post on my own. The answer is "Yes, the Android Nook app will allow the user to open a user-created ePub file as long as it does not have DRM." The process is called "sideloading". Here are the steps (they assume that the Nook app is already installed on the Android device):
1 - Using your computer (PC, Mac or Linux), mount the memory (SD) card of your Android device.
2 - Copy your ePub file(s) from your computer to the "Nook/MyDocuments/" folder of your memory card.
3 - Unmount the memory card from your computer and remount it from your Android device.
4 - Launch the Nook app on your Android device.
5 - In the library, select filter "My Files".
6 - If necessary, tap on the sync button to prompt the Nook app to search for new ebooks (as long as "My Files" is selected, it searches in the "Nook/MyDocuments" folder). The ebook(s) should appear.
Similar instructions are located in the FAQs in the Android Nook app but the "MyDocuments" folder is incorrectly spelled "My Documents" in the FAQs. In reality it does not have the space.
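For anyone who sideloads often, the copy in step 2 can be scripted. A minimal sketch: the function name `sideload` is made up, and where the card is mounted depends entirely on your computer, so the demo below uses a temporary folder as a stand-in for the card. The only detail taken from the steps above is the folder name, "Nook/MyDocuments" with no space.

```python
import shutil
import tempfile
from pathlib import Path

def sideload(epub_path, card_root):
    """Copy an ePub into Nook/MyDocuments under the mounted card's root."""
    dest_dir = Path(card_root) / "Nook" / "MyDocuments"  # no space in the name
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(epub_path, dest_dir))

# Demo: a temporary folder plays the part of the mounted SD card.
with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp) / "book.epub"
    book.write_bytes(b"placeholder, not a real ePub")
    copied = sideload(book, Path(tmp) / "sdcard")
    print(copied.relative_to(Path(tmp) / "sdcard"))
```

The remaining steps (unmounting the card and selecting "My Files" in the app) still have to be done by hand.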
After installing Nook on my device and testing my ePub file it quickly became apparent that similar and worse problems exist. I was able to open the book but hyperlinks jumped to incorrect locations and the Nook app hung. Attempts to "force close" the app caused the Android OS, itself, to hang. I had to pull the battery and do a cold boot. Then, for safety, I reformatted the Android system cache and made sure there were no orphan fragments or corrupt code in the Dalvik cache. (Sorry for the techno speak---if there are any Android developers reading this, they will understand. The salient point is that I'm trying to be thorough.)
I'm not surprised that my ePub file caused a bigger crash with the Android Nook app than the Windows PC Nook app because the former has much smaller memory limits for apps. My big html text file must have sprung a major memory leak in the Android OS. Evidently, none of B&N's Android programmers considered that someone would do what I did because they sure don't seem to have included any error handlers for it.
So I'm satisfied that adherence to the 2-year-old requirement to stay under 300 KB for each html file is still essential and that exceeding it is likely the cause of the weirdness I've observed. I honestly didn't believe it until I saw it with my own eyes because other capacity limits (like the size of allowable image files) have increased so dramatically during these past 2 years. Plus, all of the Kindle apps work fine with the mobi version of the book and I expected similar from the Nook apps.
Therefore, I will be dividing my monster html file and plan to test ePub again in the near future.
By the way, the Android Nook app seems to work WAY BETTER than the Windows PC version. On one hand it was a surprise, considering the resources available in a high-end PC. B&N (and Adobe) need to give more attention to their PC offerings. On the other hand, it wasn't a surprise because the Nook is a custom Android device so you would expect them to get the Android software right before any other.
Regarding installing Sigil - Your suggestions were already tried. Since I've created installers, myself, I've had to deal with similar problems many times. The most common culprit is overprotective security software like antivirus, antispyware, antimalware and firewall software. In addition, Windows Vista and 7 can, themselves, be overprotective at times and interfere with the legitimate operation of an installer.
I had circumvented all of these issues by temporarily disabling all such security measures (after first scanning the Sigil installer package for safety) and it did not help. I have no plans to revisit Sigil until our computers are replaced with Windows 8 systems in the future.
Incidentally, the initial problem we encountered seems to be an attempt by the Sigil installer to replace/modify a dynamic link library (DLL), which we cannot allow because other critical applications on our PC systems depend on it. But that probably wasn't the only issue because there were a lot of additional cryptic error messages that flew by so fast that it was impossible to follow them without training a video camera on the display to capture each message.
It isn't a big deal to me right now. Creating the opf, ncx and html files is easy enough for me. Remember, my ePub file passes the ePubCheck test with zero errors—I've already cleaned my css and html to make sure it complies with the ePub standards.
Regarding the proprietary Kindle mobi format - In my limited experience so far, it seems to be a superior format. I seem to have fewer formatting issues to overcome and I prefer the bias toward being more book-like for an ebook reader. It certainly does NOT have an html size limit. And it supports a much larger character set. As you undoubtedly know, the process to create a mobi file is surprisingly similar to the process of creating a standardized ePub. In fact, that is the place where some publishers start.
I have no problem with a proprietary format as long as it is well documented, offers the features I need and is stable. Sometimes, I think open source development misses the point by being too fragmented as it's pulled in too many different directions by the various egos involved. So "proprietary" and "open source" are not good and evil to me. Each project and application has to be evaluated on its own merits. Great stuff has come from both approaches. Terrible stuff has also come from both.
My impression of B&N so far, is sadly quite poor. Their support for self-publishing of ebooks seems far behind Amazon. For example: (1) The B&N "Formatting Guide" is laughable. It basically tells you to go read the many ePub standards documents. That's extremely "unfriendly" for a newbie ebook self-publisher. (2) No preview tools are provided to emulate the operation of B&N's various Nook readers and software. The only way a self-publisher can test their book on the various readers is to go get them. (3) No validation tools are provided. Rather, the self-publisher is sent to ePubCheck with no assistance. Then they are left largely to their own devices to interpret the results.
It's not surprising that most self-publishers would choose to simply submit a Word html document to the B&N ePub converter and tolerate the results.
On balance, my opinion (for what it's worth) is that B&N has the best reader software (the Android version of the Nook is really nice) but Amazon has the best format (it is easier to create great-looking and easy-reading books in mobi and kf8).
Finally, thanks for the reminder that ePub is used by more than B&N. We haven't forgotten that and it is another reason why we're still here plugging away at this.
Kind regards, David (for Pat)
12-03-2012 10:29 PM - edited 12-03-2012 10:30 PM
Yah I'm not overly surprised it crashed the smaller device as opposed to the PC app. I guess they just coded the PC app to run with the same specifications and memory allocation as the device? .. Otherwise I would think the PC app on any normal PC could handle the larger files.
You mentioned the specifications are 2 years old and seem a bit surprised they've not been updated but the devices they had in mind 2 years ago when they made those specifications are still very much in use so even if the newest devices can handle larger HTML files, the old ones are still what a lot of the people out there are using so the specifications wouldn't change.
As for an explanation for this behavior and requirement (and why the Kindle/MOBI format doesn't have it) I found this on a quick search : http://www.mobileread.com/forums/archive/index.php
I'm surprised that Sigil gave you so many errors as it's a widely used program. I hope you're able to make use of it eventually and that it is a solution to your issue with editing the larger book.
My reasoning for pointing out the proprietary format of MOBI versus EPUB's open format wasn't really an endorsement of, or a preference for, either one. Technically it'd benefit me if there were 43987 different formats that people needed formatting for. All I meant to say was that since Amazon developed the format for their readers specifically then they can set the standards however they like and then make sure their devices can handle any MOBI file thrown at them. With EPUB being an open format, I'd imagine some of the specifications are what they are (possibly so as to require fewer resources?) to allow a wider range of products from multiple companies to make use of the format.
B&N's support for indie publishers does leave a lot to be desired and they seem to be relying on EPUB as an open format to provide the tools as opposed to paying a team of programmers to code the tools, previewers, and other bells and whistles.
I'm not a fan of the 'upload an MS Word file and deal with what comes out' approach but my opinion is probably a bit biased since I know how to create EPUBs and that is where I earn my money. It's beneficial for the people who have a book that is just a straight-forward novel with text and nothing much else in it or for people on a budget with a desire to work at perfecting it that way. I've always been a bit surprised that this forum isn't filled with more questions from indie publishers with issues.
One other note .. If at any point it sounds like I'm defending B&N or one place over another I'm not. I just like getting to the root cause of issues so I can help myself (and perhaps people reading this thread) avoid the same mistakes.
12-04-2012 12:35 PM - edited 12-04-2012 12:37 PM
Hello again, SF50,
Thanks for finding the "downside of using a single 'chapter'?" thread over at the MobileRead forums. I wish that I had seen that many months ago! You've convinced me that multiple small html files are the way to go even if we (Pat and I) don't like it. I think our workflow will stay with a single large html file until a story is "finished" and wait until ebook conversion to divide it.
However, there was one significant error: HarryT wrote in the 13th post that Kindles that use the kf8 format have the same limitations as ePub readers and this is untrue in my experience. Pat's large book with its single 4 MB html file works fine on kf8 Kindles. It does not slow down as you get near the end as it does on an ePub device like the Nook. Therefore, I believe his analysis about the kf8 Kindles having this huge memory overhead with a large html file is wrong.
Nor do I agree that the problems with large html files can be excused in the name of broad ePub hardware compatibility. Just the opposite: it is sloppy programming, plain and simple. The programmers working on the ePub reader software have failed to provide adequate error handlers for files that exceed the limits of their hardware and/or the various ePub standards. Our ePub file should not have crashed the Nook reader and it certainly should not have crashed the Android OS. The Nook app should have gracefully presented an error message explaining that the ePub document contained a component that exceeded the 300 KB limit of the standard and then closed the ePub document.
The scenario that HarryT describes is (if it's true), again, poor programming. What shortsighted programmer came up with the idea to hold all that stuff in memory at one time? If true, it's ludicrous. This is 2012 (soon 2013) and surely we've learned how to protect memory from files that are too big without crashing the app and the operating system it's running within! What shortsighted programmer thought that html files would stay below 300 KB? These folks must have worked on DOS if their thinking is so small.
Regarding older readers and the 300 KB limit, it was my understanding that updated software is pushed to readers when improvements are made. I know for a fact that Amazon does this with its Kindles and Google does this in conjunction with hardware manufacturers for Android devices. (Google was also "outed" a few years ago for hiding a secret backdoor in Android which allows them to unilaterally push an update to any connected Android device. To date, the only time this backdoor has been used by Google was to remove a programming flaw that allowed a hacker exploit.) The user has no choice. At best, an Android device user can delay a system update for a short time (in case they are in the middle of something important and cannot have their device's operation suspended). But the update will eventually be forced onto their device unless they root it and take appropriate countermeasures.
And those readers that lack the hardware (for example, lack sufficient RAM and/or ROM) to accommodate new reader software are orphaned. We already have lots of publications that can be read on a Kindle Fire which are not compatible with the original Kindle reader for this reason. And my brief time with the Nook has shown the same thing—it identifies which publications are and are not available for my Nook device/software.
That is why I didn't take the 300 KB limit seriously at first. It was listed in a B&N Formatting Guide from 2010. It never occurred to me that this was still the limit. Surely the limit would have been raised and updates automatically pushed to ePub readers. I'm sorry my assumption was wrong.
As publishers, we have to decide how far back we want to support. I certainly want to support the vast majority of readers, but I have no desire to dumb down my format options for the sake of supporting the earliest readers IF I can avoid it.
I'm explaining this for the sake of other self-publishers who may read this thread and face a similar decision. Do I want to try to support every ePub reader or does my publication require advanced formatting that is only supported by later or more powerful ePub devices? We all have to draw the line somewhere.
So I'll begrudgingly adhere to ePub's 300 KB html/xml file size limit because the vast majority of ePub readers require it. But the ePub standards and the ePub readers have not won my confidence. They are made inferior by the way they appear to utilize memory.
This limit will have to change. I doubt that the ePub standards will survive much longer with a 300 KB limit. The handwriting is already on the wall—content is becoming more "rich"—not less!
SF50, thanks again for your help. You've certainly helped answer my original questions and I appreciate your insights very much. You ought to write a short book describing the pitfalls of self-publishing and explain why a "pro" can be an essential part of ebook conversion. I don't think that most self-publishers understand the scope of the problem. Here are some suggestions for the book:
1 - Explain and illustrate what both bad and good layouts/formats look like: typography, alignment, use of white space, leading and margins, etc.
2 - Explain hyperlinks and what goes into a good navigation system.
3 - Explain how the underlying assumptions of each e-reader platform differ and why this is important.
4 - Illustrate what happens when you run a Word document through a distributor's converter (include samples from all the distributors whom you support which depict their limitations and errors).
5 - Explain the skills required to control the process and guarantee a good result.
I have no desire to get into the ebook conversion business; I'm content just to publish Pat's and my work. But all but the simplest publications have a huge need for services like yours. In many cases the old adage "you don't know what you don't know" explains why so many poor-quality ebooks get published.
Kind regards, David (for Pat)
12-09-2012 09:53 AM
This is a follow-up to let you know the outcome. I divided the giant html file of Pat's book into 83 small html files. Basically, each item in the book's table of contents (TOC) was made into a unique file. The largest file size was only 66 KB. This was a mammoth undertaking owing to the size and complexity of the book.
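For anyone facing the same job, the split can be sketched in a few lines of Python. This is only an illustration and not the process we actually used: it assumes every TOC item begins with an `<h1>` element, and the function name `split_chapters` is made up for the example. It copies the original `<head>` into every piece and does not rewrite internal links, so any `href="#anchor"` that points into a different piece would still have to be changed to `filename.html#anchor` by hand or with a second pass.

```python
import re


def split_chapters(html: str, heading_tag: str = "h1") -> list[str]:
    """Split the <body> of one large HTML document at each opening heading tag.

    Assumption: every table-of-contents item starts with <heading_tag>.
    Returns one complete document string per piece, each reusing the
    original <head> so the style definitions travel with every file.
    """
    head_match = re.search(r"<head\b.*?</head>", html, flags=re.S | re.I)
    body_match = re.search(r"<body\b[^>]*>(.*)</body>", html, flags=re.S | re.I)
    if body_match is None:
        raise ValueError("no <body> element found")
    head = head_match.group(0) if head_match else "<head><title></title></head>"
    body = body_match.group(1)

    # Zero-width lookahead: cut immediately before every opening heading tag,
    # so each heading stays attached to the text that follows it.
    pieces = re.split(rf"(?=<{heading_tag}\b)", body, flags=re.I)

    return [
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"{head}\n<body>\n{piece.strip()}\n</body>\n</html>\n"
        for piece in pieces
        if piece.strip()  # drop empty pieces (e.g. when the body starts with a heading)
    ]
```

Anything that appears before the first heading (title page, dedication) comes back as its own piece. Each returned string can then be written to a numbered file such as `chapter_001.html` (a hypothetical naming scheme) and listed in the opf manifest and spine.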
The results were very satisfying. The book operates very smoothly with all Nook software and devices that we've tested. It is very fast regardless of whether the reader is at the beginning or near the end of the book. All hyperlink issues have disappeared and the html TOC works as it should.
My experience underscores the need to keep your html/xml text files down to a modest size. B&N recommends 300 KB or less but I recommend keeping them smaller as this seems to help your chapters load faster.
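A quick way to check a finished ePub against that figure is to look inside the zip container and list any text file that is too big. The sketch below is an illustration only: it treats 300 KB as 300 × 1024 bytes, and the list of file extensions and the name `oversized_entries` are my own assumptions.

```python
import zipfile

LIMIT_BYTES = 300 * 1024  # assumption: "300 KB" taken as 300 * 1024 bytes
TEXT_SUFFIXES = (".html", ".xhtml", ".htm", ".xml", ".ncx", ".opf")


def oversized_entries(epub_path, limit=LIMIT_BYTES):
    """Return (name, uncompressed_size) for each text entry larger than limit.

    epub_path may be a file name or an open binary file object.
    Images and fonts are skipped; only html/xml-type entries are checked.
    """
    with zipfile.ZipFile(epub_path) as zf:
        return [
            (info.filename, info.file_size)  # file_size is the uncompressed size
            for info in zf.infolist()
            if info.filename.lower().endswith(TEXT_SUFFIXES)
            and info.file_size > limit
        ]
```

An empty list means every html/xml component is within the limit. Passing a smaller `limit` (for example `100 * 1024`) is an easy way to enforce a stricter house rule.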
Breaking Pat's book into many small html files presented us with a quandary since we also publish for the Kindle. It accepts large single html text files with no trouble and this is how we prefer to work when in the creative and editing stage of a book.
Fortunately, we arrived at an acceptable solution for this: We have devised a workflow where we divide a work into individual chapter-size html files after the editing has been completed. Then we create an ePub file that is ready for the Nook. Finally, we create an ePub file for the Kindle and convert it to a mobi file.
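The "create an ePub file" step of this workflow can be scripted. What follows is a minimal sketch under my own assumptions, not a description of our actual tooling: the function name `build_epub`, the `overrides` argument and all file names in the usage note are hypothetical. It zips one shared source folder into an ePub and lets you swap in different files for a given target.

```python
import zipfile
from pathlib import Path


def build_epub(source_dir, output_path, overrides=None):
    """Zip source_dir into an ePub, swapping in per-target files from overrides.

    overrides maps an archive path (e.g. "OEBPS/styles.css") to the file on
    disk that should be stored under that name for this target. Only names
    that already exist in source_dir are replaced.
    """
    source_dir = Path(source_dir)
    overrides = {name: Path(p) for name, p in (overrides or {}).items()}

    with zipfile.ZipFile(output_path, "w") as zf:
        # The ePub container format requires "mimetype" to be the first
        # entry in the archive and to be stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)

        for path in sorted(source_dir.rglob("*")):
            if path.is_dir():
                continue
            name = path.relative_to(source_dir).as_posix()
            if name == "mimetype":
                continue  # already written above
            zf.write(overrides.get(name, path), arcname=name,
                     compress_type=zipfile.ZIP_DEFLATED)
```

With a layout like this, `build_epub("book", "nook.epub")` would package the folder as it stands, while `build_epub("book", "kindle.epub", {"OEBPS/styles.css": "kindle/styles.css"})` would package the same opf, ncx and html files with one file replaced. The resulting ePub should still be run through ePubCheck.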
We are able to use the same opf, ncx and html files for both platforms by using a few tricks:
First, we maintain a separate version of the main css styles definition file (styles.css) for each platform and use the appropriate one when creating each ePub file. The ePub file for the Nook uses the Nook version of our styles.css. The ePub file for the Kindle uses the Kindle kf8 version of our styles.css.
Second, we employ the "media" attribute only to detect if an older mobi-only (non kf8) Kindle is being used and, if one is, a second set of style definitions (mobi.css) is loaded to override the kf8 ones that are not compatible. The associated html code causes no trouble for the Nook because a Nook device never identifies itself as a mobi device.
Third, we maintain a separate version of the cover image (cover.jpg) for each platform and use the appropriate one when creating each ePub file. Each image is the "optimal" size for its platform.
Fourth, we use a style as a sort of "flag" to identify when a non-Kindle device is being used. When a non-Kindle reader is detected, the "Cover" link in our table of contents is displayed. When a Kindle reader is detected, the "display" property of the "Cover" link is set to "none" so that it will not appear. Nor will it take up a blank line as an "invisible" element would. (For those who are unaware, the Kindle handles the cover page in a unique way that prevents the publisher from linking to it within the html of an ebook.) We also include the appropriate elements in the opf file to flag the Kindle converter to avoid creating a double cover since we must include an html cover page for ePub readers.
Fifth, we regrettably limit our use of extended text characters to only those supported by ePub readers like the Nook. This is a much smaller set than that supported by the Kindle (which supports just about everything covered by the html 5 standard). By the way, the "Basic Latin Unicode Characters" tables in the B&N Formatting Guide are misleading. B&N provides the tables in their Formatting Guide to show publishers which characters a Nook can display. The fact is that any extended character that has an html name (for example, &bull; for a bullet, •) should display correctly on a Nook. What the Nook cannot display are extended characters without an html name (for example, the hollow bullet ◦). Therefore, the tables in B&N's Formatting Guide are incomplete.
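If you want to find out which extended characters in a manuscript fall on which side of that line, a short script can sort them. This is a sketch under one loud assumption: it uses Python's built-in table of HTML 4.01 entity names (`html.entities.codepoint2name`) as the definition of "has an html name". Whether a given character really displays on a Nook is something only a test on the device can confirm. The function name `classify_extended_chars` is made up for the example.

```python
from html.entities import codepoint2name


def classify_extended_chars(text):
    """Split the non-ASCII characters in text into named and unnamed groups.

    Returns (named, unnamed): named maps each character to its entity
    (e.g. "&bull;"); unnamed is the set of characters that have no entry
    in the HTML 4.01 entity table.
    """
    named, unnamed = {}, set()
    for ch in set(text):
        if ord(ch) < 128:
            continue  # plain ASCII needs no check
        name = codepoint2name.get(ord(ch))
        if name:
            named[ch] = f"&{name};"
        else:
            unnamed.add(ch)
    return named, unnamed
```

Running it over the text of each chapter file gives a list of characters to review: those in `unnamed` are the ones that, in my experience, are at risk on the Nook.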
Finally, since Sigil is not compatible with our PCs, we are using EditPad Pro 7 to edit the html/xml code of our books. It includes a "Project" feature which we were delighted to discover can accommodate very large projects (up to 4 GB). In our case, it had no trouble opening all 83 html files in our book, grouping them as a single project and allowing us to do global spellcheck, search and replace across all files simultaneously. It also allows us to open, save and close all files in the project simultaneously. It is very fast!!!
EditPad Pro 7 is primarily a tool for programmers and it offers syntax checking for most programming languages, including html and css. It also has tools to clean and simplify html code (although we rarely need to use them). And it allows you to convert code between different text encoding standards. What it isn't is a general-purpose word processor so it would not be the best choice for a writer who is first writing a story. But it is an excellent tool to use when the writing/editing is finished and you are ready to convert the work to html/xml.
I have no financial interest in EditPad Pro. My only reason for mentioning it here is that we were unable to use the free open-source Sigil program and, if you are also unable to use Sigil, then EditPad Pro may be helpful. It won't automatically generate opf and ncx files for you (as Sigil evidently does) because the program is ignorant of the ePub standards.
Kind regards, David (for Pat)