Remove Textile from a WordPress blog

Here’s a Python script to remove Textile mark-up from a WordPress blog.

Download the script here: untextile.py

I’ve recently upgraded The Raven to the latest version of WordPress. When I first set it up (back when WordPress was at version 1.2!!) I installed the Textile plug-in. Most of the users weren’t tech-savvy, so they needed an easy way to write posts without having to know HTML. Nowadays, WordPress has a nifty WYSIWYG editor, which is much easier for casual users than Textile ever was. Unfortunately, I couldn’t just turn off the plug-in – lots of old posts still contained the mark-up.

Fortunately, there’s a Textile module for Python. I wrote this little script to automatically purge Textile mark-up from a WordPress blog. If you find it useful, let me know.

syntax: python untextile.py [WP-CONFIG]

Just run the script in your blog directory, or give it the filename of your wp-config.php file.

Example:

% cd my/wordpress/directory
% python untextile.py
or
% python untextile.py my/wordpress/directory/wp-config.php
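
For readers wondering what the script needs from wp-config.php: presumably the database settings and the table prefix. The sketch below is not the code from untextile.py. It is a minimal illustration, assuming the standard define('NAME', 'value') lines and $table_prefix assignment that WordPress puts in that file. The function name parse_wp_config and the sample values are made up for the example, and the fallback to 'wp_' is an assumption too.

```python
import re


def parse_wp_config(text):
    """Pull define() constants and $table_prefix out of wp-config.php source.

    Hypothetical helper for illustration; it only handles simple quoted
    values and will not cope with values that contain quote characters.
    """
    settings = {}
    # Matches lines such as: define('DB_NAME', 'blog');
    define_re = re.compile(
        r"""define\(\s*['"](\w+)['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
    )
    for name, value in define_re.findall(text):
        settings[name] = value
    # Matches: $table_prefix = 'wp_';
    prefix = re.search(r"""\$table_prefix\s*=\s*['"]([^'"]*)['"]""", text)
    # Assumption: fall back to 'wp_' when no prefix line is found.
    settings["table_prefix"] = prefix.group(1) if prefix else "wp_"
    return settings


# Made-up sample contents standing in for a real wp-config.php.
sample = """<?php
define('DB_NAME', 'blog');
define('DB_USER', 'raven');
define('DB_PASSWORD', 'secret');
define('DB_HOST', 'localhost');
$table_prefix  = 'wp_';
"""

config = parse_wp_config(sample)
print(config["DB_NAME"], config["DB_HOST"], config["table_prefix"])
```

With settings like these in hand, a script can open the database and look up the posts table by prepending the prefix to "posts".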


  1. Christopher Ross said,

    10 October, 2008 @ 13:49

    It’s great to see somebody else using Python with WordPress!

  2. Sumit said,

    5 November, 2008 @ 15:52

    I have several old Rails sites I’ve been moving into WordPress. I leave the Textile plug-in & WP-MarkItUp Toolbar on for those sites and turn it off for any new stuff, just because people who have used it before are used to it.

    The Visual Editor in WP is surprisingly clean, well, as clean as it can be. If you cut and paste from Word into it, you’re still gonna get a ton of garbage.

    I still use Textile personally for my own stuff because I do like the markup and I also like the way it auto handles footnotes with the[1] and fn1. symbols.

    This is a cool script though. I’ll have to file this away in case I ever want to untextile a site.

  3. Lisa said,

    4 December, 2008 @ 23:24

    Hi! This thing is a potential godsend. After much putzing around trying to get the textile and uuid extensions installed for Python correctly, and then modifying your code to allow for the path to mysql.sock to be in the connect string, it’s working! Sort of. I’m getting through a few and then it cacks. I wonder if you could help me troubleshoot? If you have time, here’s the error I’m getting:

    user@domain:~/domains/domain.com/html$ python untextile.py
    Converting post: 1
    Converting post: 2
    Converting post: 104
    Traceback (most recent call last):
      File "untextile.py", line 84, in ?
        untextile(db,wpconfig.table_prefix)
      File "untextile.py", line 29, in untextile
        text = textile.textile(text)
      File "/home/mt/users/.home/lib/python/textile.py", line 1052, in textile
        return Textile().textile(text, **args)
      File "/home/mt/users/.home/lib/python/textile.py", line 299, in textile
        text = self.block(text)
      File "/home/mt/users/.home/lib/python/textile.py", line 577, in block
        o1, o2, content, c2, c1 = self.fBlock(tag, atts, ext, cite, line)
      File "/home/mt/users/.home/lib/python/textile.py", line 667, in fBlock
        content = self.graf(content)
      File "/home/mt/users/.home/lib/python/textile.py", line 833, in graf
        text = self.links(text)
      File "/home/mt/users/.home/lib/python/textile.py", line 871, in links
        text = re.compile(pattern, re.X).sub(self.fLink, text)
      File "/home/mt/users/.home/lib/python/textile.py", line 894, in fLink
        url = self.relURL(url)
      File "/home/mt/users/.home/lib/python/textile.py", line 786, in relURL
        if (not o.scheme or o.scheme == 'http') and not o.netloc and re.search(r'^\w', o.path):
    AttributeError: 'tuple' object has no attribute 'scheme'

    Any idea what could be going wrong? I’d really appreciate your help. Thanks again for this! Great stuff.

  4. abiye said,

    13 December, 2008 @ 21:27

    I can’t understand what this is exactly. Can someone explain?

  5. chris said,

    3 February, 2009 @ 17:01

    Lisa – the issue is that you are running a version of Python prior to 2.5, which is when attribute access to the results of “urlparse” was introduced. See: http://docs.python.org/library/urlparse.html#urlparse.urlparse

    You can fix this with a small patch to textile.py:

    Line 784:

    def relURL(self, url):
        o = urlparse(url)
        # XXX PATCH
        # o[0,1,2,3,4,5] = [scheme,netloc,path,params,query,fragment]
        # attribute access only introduced in python 2.5+
        if (not o[0] or o[0] == 'http') and not o[1] and re.search(r'^\w', o[2]):
            url = self.hu + url
        if self.restricted and o[0] and o[0] not in self.url_schemes:
            return '#'
        return url
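
The index-based access used in the patch above can be tried on its own. The sketch below is only an illustration, not part of textile.py: it pulls the first condition of relURL out into a standalone function (the name is_relative is made up here) and uses the Python 3 import location for urlparse. The result of urlparse still behaves as a tuple, so positions 0, 1 and 2 give the scheme, netloc and path on any version.

```python
import re
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse


def is_relative(url):
    """Mirror the first test in the patched relURL using tuple indexing.

    Hypothetical helper for illustration only.
    """
    parts = urlparse(url)
    # Index access works whether or not named attributes are available.
    scheme, netloc, path = parts[0], parts[1], parts[2]
    return (
        (not scheme or scheme == 'http')
        and not netloc
        and bool(re.search(r'^\w', path))
    )


for candidate in ('images/a.png', 'http://example.com/a', '/abs/path'):
    print(candidate, is_relative(candidate))
```

Only the first candidate counts as relative here: it has no host and its path starts with a word character.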
    
