How to convert Microsoft Word .doc files to PDF from the command line

January 14th, 2009

I know a lot of people need it: Google is full of requests from hundreds, maybe thousands, of users asking for a doc2pdf converter or something similar. I need it too. It is useful to have all your files in PDF format (and maybe all merged into a single file), and if you have a lot of files to convert by hand, believe me, you’re not going to have a nice day.

The easy way

It is pretty easy:

$ abiword --to=pdf filename.doc

I don’t think there is much to explain here. It converts filename.doc to filename.pdf and saves it in the current directory. It was too easy. Why would you need a hard way? I don’t know about you, but I’m sure I need one. Unfortunately AbiWord’s support for Microsoft .doc files is not so good: it lacks the math and image/clipart features. I’m not sure whether this affects all versions of AbiWord, but it certainly affects the one packaged for Ubuntu (which is not installed by default; you have to apt-get install it).
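If the AbiWord output is good enough for your files, the one-liner above is easy to wrap in a loop. Here is a minimal Python sketch, assuming abiword is on your PATH and accepts --to=pdf exactly as shown above. The function names are mine, made up for illustration.

```python
import subprocess
from pathlib import Path


def abiword_command(doc_path):
    """Build the same command line shown above for a single file."""
    return ["abiword", "--to=pdf", str(doc_path)]


def convert_all(directory="."):
    """Run abiword on every .doc file in `directory`; return the ones that failed."""
    failed = []
    for doc in sorted(Path(directory).glob("*.doc")):
        # check=False: keep going even if one document fails to convert
        result = subprocess.run(abiword_command(doc), check=False)
        if result.returncode != 0:
            failed.append(doc)
    return failed


if __name__ == "__main__":
    for doc in convert_all():
        print("could not convert:", doc)
```

Passing the arguments as a list (instead of one big string) means file names with spaces need no special quoting.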

Anyway, I really need to see plots and formulas. What did you say? OpenOffice supports them, check it out? Yes, I know: OpenOffice can almost always read plots and images in .doc files. But bad luck strikes again: OpenOffice lacks the kind of command line interface AbiWord has, so the only way is to open the .doc files one by one and click the Export as PDF button. It is very frustrating. So, here is the hard way.

The hard way

Short version (for those who don’t like reading me but do like reading a lot): check the Python-UNO site.

Long version: first you need to know what Python-UNO is.

The Python-UNO bridge allows to

  • use the standard OpenOffice.org API from the well known python scripting language.
  • to develop UNO components in python, thus python UNO components may be run within the OpenOffice.org process and can be called from Java, C++ or the built in StarBasic scripting language.
  • create and invoke scripts with the office scripting framework (OOo 2.0 and later).

You can find the most current version of this document from

Oh no! I’ll have to download this Python-UNO, read manuals to learn how to use those APIs, and who knows if it’ll even work… No. Just don’t panic. I’m going to tell you two things that make this a not-so-hard way:

  • Python-UNO has shipped with OpenOffice since version 1.1, so if you have OpenOffice installed you’re already at 50% of the work. You don’t have to download and install anything.
  • The Python-UNO guys are so cool that their code examples contain everything we need.

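From the examples page you can download the script (called SCRIPT.py below; substitute the name of the file you actually downloaded). Its usage is very simple. Start OpenOffice listening on a socket, then run the script:

$ openoffice -invisible "-accept=socket,host=localhost,port=2002;urp;" &
$ python SCRIPT.py --pdf filename.doc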
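If you are curious about what goes over that socket, here is a rough sketch of how a conversion can look with PyUNO. It is not the downloaded script, just my own minimal version. It assumes OpenOffice is already listening on localhost:2002 as above, that you run it with a Python that can import uno (usually the one bundled with OpenOffice), and that the input is a text document. The helper names pdf_path_for and convert_to_pdf are invented for this example.

```python
from pathlib import Path


def pdf_path_for(doc_path):
    """filename.doc -> filename.pdf, next to the original file."""
    return Path(doc_path).with_suffix(".pdf")


def convert_to_pdf(doc_path, host="localhost", port=2002):
    """Ask an already running, listening OpenOffice to export doc_path as PDF."""
    # Imported here so the module still loads where PyUNO is not available.
    import uno
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        p = PropertyValue()
        p.Name = name
        p.Value = value
        return p

    # Connect to the office started with -accept=socket,host=...,port=...;urp;
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        "uno:socket,host=%s,port=%d;urp;StarOffice.ComponentContext" % (host, port))
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    source = Path(doc_path).resolve()
    target = pdf_path_for(source)
    document = desktop.loadComponentFromURL(
        source.as_uri(), "_blank", 0, (prop("Hidden", True),))
    if document is None:
        raise RuntimeError("OpenOffice could not load %s" % source)
    try:
        # "writer_pdf_Export" is the PDF export filter for text documents.
        document.storeToURL(
            target.as_uri(), (prop("FilterName", "writer_pdf_Export"),))
    finally:
        document.dispose()
    return target
```

Calling convert_to_pdf("filename.doc") should leave filename.pdf next to the original, provided the office is reachable on that port.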

The result is almost the same as with the easy way, but this uses OpenOffice for the conversion, so it does a better job. You may also want a little shell script to automate the conversion of a bunch of files, so here is a very simple version:


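openoffice -invisible "-accept=socket,host=localhost,port=2002;urp;" &
sleep 5   # assumption: a few seconds is enough for OpenOffice to start listening
for i in *.doc; do
	python SCRIPT.py --pdf "$i"   # SCRIPT.py: the script from the examples page
done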

Remember to kill OpenOffice when it’s done :o) OpenOffice now has batteries included.
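If you keep forgetting that last step, the whole loop can be driven from Python so that OpenOffice is stopped for you. This is only a sketch under a few assumptions: the openoffice launcher stays in the foreground (if yours forks and exits, terminate() will not reach the real process and you still have to kill it by hand), a plain TCP connect is a good enough readiness check, and script is the path to the file you downloaded from the examples page. The names wait_for_port and convert_directory are mine.

```python
import socket
import subprocess
import time
from pathlib import Path

ACCEPT = "-accept=socket,host=localhost,port=2002;urp;"


def wait_for_port(host, port, timeout=30.0):
    """Return True once something accepts TCP connections on host:port."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not listening yet, try again shortly
    return False


def convert_directory(script, directory="."):
    """Start OpenOffice, convert every .doc in `directory`, then stop OpenOffice."""
    office = subprocess.Popen(["openoffice", "-invisible", ACCEPT])
    try:
        if not wait_for_port("localhost", 2002):
            raise RuntimeError("OpenOffice never started listening on port 2002")
        for doc in sorted(Path(directory).glob("*.doc")):
            # Same call as in the shell loop: python <script> --pdf <file>
            subprocess.run(["python", script, "--pdf", str(doc)], check=False)
    finally:
        office.terminate()  # the "remember to kill OpenOffice" step
        office.wait()
```

Polling the port replaces a fixed sleep: the loop starts as soon as OpenOffice is ready and gives up with an error if it never is.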