Formatting Source Code For Blog Posts

February 7, 2008. Posted in: Blogging | Tools

I've struggled quite a bit over the past few months trying to come up with a good way of including code snippets in my blog posts. The problem, of course, is making the code look good both on the website and in the RSS feed. It's not the first time I've run into this issue, either.

As I've mentioned in the past, one of the things making it hard is my blog engine: the dasBlog build I'm currently running doesn't properly respect whitespace in the original HTML. Because of this, posting code is a big pain in the neck. You need to format the code in HTML manually, since using good old <pre><code></code></pre> tags will render the code unreadable in the RSS feed.
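
To make "formatting the code in HTML manually" concrete, here is a minimal sketch in Python of one way to preserve layout without <pre>. It is an illustration under assumptions, not something any of the tools below are known to do: the function name code_to_html and the tab_width parameter are hypothetical, and it simply escapes each line, swaps spaces for non-breaking spaces, and joins lines with <br />.

```python
import html


def code_to_html(source: str, tab_width: int = 4) -> str:
    """Render source code as HTML that keeps its layout without <pre>.

    Hypothetical helper: the name and the tab_width default are assumptions.
    """
    rendered = []
    # expandtabs() turns tabs into spaces, restarting the column count on each line.
    for line in source.expandtabs(tab_width).splitlines():
        # Escape &, < and > first; the resulting entities contain no spaces.
        escaped = html.escape(line, quote=False)
        # Non-breaking spaces are not collapsed the way ordinary spaces are.
        rendered.append(escaped.replace(" ", "&nbsp;"))
    # Explicit line breaks stand in for the newlines a <pre> block would keep.
    return "<br />\n".join(rendered)


if __name__ == "__main__":
    print(code_to_html("if a < b:\n\treturn a"))
```

Because every space becomes a non-breaking space, long lines will not wrap, which is usually what you want for code but may force horizontal scrolling in narrow feed readers.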

Beyond just having the spacing and indentation right, there's also the matter of syntax highlighting. There are several options I've tried over the years:

  • CopySourceAsHtml: This is an OK tool, though I've had to compile a custom build to make it work on my machine (which has some weird clipboard issues at times; not sure why). I'm not 100% satisfied with it, though. For one thing, it generates pretty ugly HTML. It's also not very useful for posting code in a language not supported by Visual Studio itself, which I occasionally do (example: PowerShell snippets).
  • CodeHTMLer: An online site for converting code snippets to HTML, supporting a few more languages than CopySourceAsHtml. I've used this one extensively, and it usually does a good job, provided I explicitly check the "convert whitespace" option as well as "Inline Tags" for formatting. You don't get much choice in how the code is formatted, though. I believe there's a Windows Live Writer plugin based on this somewhere, but I always forget where, so it's more convenient for me to just use the web application.
  • Syntaxhighlighter: A JavaScript library for formatting code snippets. Looks nifty, but I think it's only usable on the website and not in RSS feeds. It also seems to rely heavily
  • Iris: This is an interesting project, based on Vim's syntax highlighting. It actually does a pretty good job, though I haven't tried using it directly on my blog yet. The Ajax interface is slick, and there's also a version you can download to use on your own desktop. It only seems to support using CSS for the syntax highlighting, though.
  • TOhtml: I've also been experimenting lately with using Vim's :TOhtml command. This can actually provide very nice HTML if you set up the right options, and given the breadth of syntaxes supported by Vim, you can use it to pretty-print almost any code snippet you can think of.
    Another nicety of TOhtml is that it will use the colors in your selected Vim colorscheme, so the generated HTML looks exactly like it does in Vim. It can also generate the formatting using CSS or inline tags.
    As a sample, this is what I used in my last PowerShell post, with the options html_use_css, html_no_pre and use_xhtml. The color scheme I was using was the black variant of the recently updated moria scheme, aided by Peter Provost's excellent PowerShell syntax files.

I'm sure there are many other options out there. I know there are some very nice plugins for other blogging platforms (like WordPress, from what I've been reading), but since I'm running dasBlog, those aren't very useful to me.

Another issue that can be a bit bothersome with code formatting is the choice between using CSS rules or using inline tags.

In an ideal world, CSS rules would be much preferred, particularly if you can keep them in an external CSS file. One obvious benefit is that if you later decide to change your formatting preferences or your color scheme, or simply change your blog's layout and colors and want your code to match them, it becomes a whole lot easier (though this isn't all that feasible for someone like me, with 6 years of past blog posts using all kinds of code formatting).

The downside of using CSS is that it's pretty much a website-only option, so it's not very useful for formatting code in your RSS feed. At least, my experience has been that most RSS generators and/or consumers will strip any inline CSS rules found in blog posts (this was, in fact, what happened to my last PowerShell post mentioned above).

I know of no way to easily reference an external CSS file for this, but if anyone knows of a way, I'd sure appreciate knowing about it!

So, for now, that pretty much leaves only one option: skipping <pre> tags and resorting to inline tags. Yuck! So, what's the secret sauce others are using for this?
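
For illustration, the inline-tag fallback could be applied as a post-processing step on CSS-based highlighter output. The following is only a rough sketch in Python, assuming the highlighter emits flat, non-nested <span class="..."> elements; the class names, the colors, and the inline_colors function are all hypothetical and not taken from any of the tools mentioned above.

```python
import re

# Hypothetical class-to-color table; a real one would mirror your stylesheet.
COLORS = {
    "Comment": "#808080",
    "Keyword": "#0000ff",
    "String": "#a31515",
}

# Matches a flat (non-nested) span carrying a single class name.
SPAN_RE = re.compile(r'<span class="(\w+)">(.*?)</span>', re.DOTALL)


def inline_colors(markup, colors=None):
    """Replace class-based spans with <font color> tags (assumed helper)."""
    table = COLORS if colors is None else colors

    def to_font(match):
        css_class, body = match.group(1), match.group(2)
        color = table.get(css_class)
        if color is None:
            # Unknown class: keep the text, drop the styling hook.
            return body
        return '<font color="{}">{}</font>'.format(color, body)

    return SPAN_RE.sub(to_font, markup)


if __name__ == "__main__":
    print(inline_colors('<span class="Keyword">if</span> x'))
```

A regular expression is enough here only because of the no-nesting assumption; markup with nested spans would need a real HTML parser.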



Thursday, February 07, 2008 11:09:08 AM (SA Pacific Standard Time, UTC-05:00)
If you use Windows Live Writer, you should really get the CodeHtmler Live Writer Plugin. It allows you to customize pretty much everything about the formatting and, best of all, your settings are persisted, so you don't need to re-customize every time. You can find it at http://www.codeplex.com/CodeHtmler. The complete source is there as well, so if there is something you don't like, by all means ask for it or contribute to the project.
Thursday, February 07, 2008 11:33:46 AM (SA Pacific Standard Time, UTC-05:00)
+1 on CodeHTMLer's WLW plugin. I recently added F# support as well as custom font support, so I could choose Consolas or Lucida Console instead of Courier for my code snippets.

Of course, we could just fix dasBlog's RSS feed so that it respected whitespace correctly!
Tuesday, February 12, 2008 2:47:59 AM (SA Pacific Standard Time, UTC-05:00)
Hi Tomas,

I'm the guy who wrote Iris, and I'd be up for writing the code to do what you need (since it's a pretty common need). After all, it was so much work parsing the Vim syntax files, evaluating Vim expressions, and converting Vim regex to .NET, that it seems a waste not to go the last mile and write a simple formatter class. :)

So, if you had to specify a formatter exactly, what would it do? From what you said in the post, it sounds like I'd have to:

1. Convert spaces/tabs to &nbsp;

2. Use <font> for color, and <i>, <u>, <b>, etc. for font style, text decoration and weight

3. Not use pre

Anything else? Here are some questions I have:

A. Do you care about line numbers?
B. If so, how important is it for users to be able to copy and paste without being bothered by the line numbers?
C. Are there any other 'structural' aspects you'd like to see in the HTML? What should the code be wrapped in?
D. If the highlighter took a Visual Studio config file as input for a Color Scheme, would that help?

cheers,
Gustavo
Tuesday, February 12, 2008 8:04:16 AM (SA Pacific Standard Time, UTC-05:00)
Hi Gustavo! Actually thanks for posting the tool. Yes, indeed those would be some of the ideas I'd have. As for line numbers, I don't use them so they're not really a concern. I don't think supporting VS config files would be strictly necessary; the support for colorschemes there seems to do the trick already (though some people might find it useful; not sure).

About

Tomas Restrepo is co-founder of devdeo. His interests include .NET, Connected Systems, PowerShell and, lately, dynamic programming languages.

Copyright © 2002-2008, Tomas Restrepo.

Powered by: newtelligence dasBlog 2.1.8102.813
