Author: Wilfried Hennings (texconvfaq "at" gmx.de). This page was last updated on June 30, 2011.
The URL of this page is http://tug.org/utilities/texconv/textopc.html
I maintain these pages because I need converters between LaTeX and PC Textprocessors for my work and I want to share the information with others who need it. Because I maintain them in my spare time (uh, what is spare time?), I cannot answer individual questions.
This list is only as good as the support it gets, and it needs YOUR support to stay up to date and complete. Please tell me if you know of more and/or better converters. There are some more converters on the CTAN sites, but the following seem to be the most promising for conversion to and from the current versions of word processors.
Neither correctness nor completeness is guaranteed.
All opinions mentioned (if any) are my own. Please send corrections, enhancements and supplements (German is also welcome) to the following address:
texconvfaq "at" gmx.de
Note that this FAQ list contains information ONLY about converters between LaTeX and PC word processors. Converters to and from other formats may have their own FAQ lists; for example, see the link for converters to and from HTML.
For the impatient, here is a table with an overview of the features of the most recent converters.
Before looking for a converter, stop and think about a fundamental question:
Do you want to convert the document structure, i.e. a heading should remain a heading, a list should remain a list, etc., no matter what it will look like in the target format?
Or do you want to convert the appearance, i.e. what it looks like, no matter how it is represented in the target format?
Or do you want a mixture of both?
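The difference between the first two choices can be illustrated with a deliberately tiny sketch. It is not taken from any of the converters listed on this page; the function names, the single supported command and the HTML target are assumptions made only for illustration.

```python
import re

# Hypothetical pattern for one LaTeX command; real converters handle far more.
SECTION = re.compile(r"\\section\{([^}]*)\}")


def convert_structure(latex: str) -> str:
    """Keep the role: a heading stays a heading in the target format."""
    return SECTION.sub(r"<h2>\1</h2>", latex)


def convert_appearance(latex: str) -> str:
    """Keep the look: bold, larger text without any heading semantics."""
    return SECTION.sub(
        r'<p style="font-size:140%;font-weight:bold">\1</p>', latex
    )


source = r"\section{Results}"
print(convert_structure(source))
print(convert_appearance(source))
```

Both results may look similar on screen, but only the first one still tells the target program that the text is a heading.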
To use SGML as an intermediate format, you would have to specify the translation rules yourself (as far as I understand). This makes sense, and it explains why different people have very different opinions about which converter best fits their needs: they simply have different demands and expectations about what should be converted and how.
So it is not only that, in practice, no converter is good for everyone and every purpose; such a converter is impossible in principle, because there are no well-defined requirements which a converter should meet.
An additional problem is that TeX/LaTeX can be extended by an unlimited number of macros. Unless the converter contains a full-scale TeX system, it can at best support the publicly available macro commands, not the ones privately written by individual users. In practice, you can expect it to support the standard LaTeX commands and perhaps a few of the more widely used packages. The only converter which uses a full-scale TeX system is TeX4ht.
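The macro limitation can be sketched as follows. This is a toy translator under stated assumptions, not the method of any converter named here: the table of known commands, the HTML output and the macro name \myunit are all hypothetical.

```python
import re

# Hypothetical table of commands a simple converter "knows"; illustrative only.
KNOWN = {"emph": "<em>{}</em>", "textbf": "<b>{}</b>"}
COMMAND = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}")


def convert(latex: str) -> tuple[str, list[str]]:
    """Translate known commands and collect the names of unknown ones."""
    unknown: list[str] = []

    def replace(match):
        name, arg = match.group(1), match.group(2)
        if name in KNOWN:
            return KNOWN[name].format(arg)
        unknown.append(name)
        return match.group(0)  # leave unsupported markup untouched

    return COMMAND.sub(replace, latex), unknown


text, missing = convert(r"\emph{fast} and \myunit{5}")
print(text)
print(missing)
```

Without a TeX engine to expand the private macro \myunit, a translator of this kind can only pass it through or report it.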
So keep this in mind when looking through the following list of converters, try them yourself, and decide what you need.
To illustrate this, let me restrict the discussion to the Microsoft Word case:
The most complete converters that are currently maintained / supported are:
TeX2Word - a shareware LaTeX import filter for MS Word
GrindEQ - a shareware LaTeX import filter for MS Word
latex2rtf - a free standalone LaTeX -> RTF converter for PC, Macintosh and Unix
TeX4ht - a free LaTeX to HTML or XML converter for PC and Unix; it produces HTML which is good for loading into Word. TeX4ht relies on other software: it needs at least a full TeX system.
There are also converters to PowerPoint and to FrameMaker (see further below).
"Aurora" (formerly named Ribbit) can now convert a LaTeX-coded equation (which must be placed on the Windows clipboard) to Word. The converter is still experimental and as such has a number of limitations, some of which will be addressed in future releases. The converter’s output will generally need some manual touching up to achieve a level of fidelity on a par with the original document.
The other functionality of Aurora, which was the only functionality of its predecessor "Ribbit", is letting you enter LaTeX equations in word processors such as MS Word or in PowerPoint. You enter the equation in LaTeX markup, and the formatted equation is inserted as an object.
See homepage (external link) (Shareware)
Aurora needs a working LaTeX installation. If there is none, it will install a minimal version of MiKTeX.
"LaTeX in Word": See homepage (external link) (Freeware, GPL).
It lets you enter equations in LaTeX markup in word processors such as MS Word, and the formatted equation is inserted into the word processor as a PNG bitmap. It needs a server which performs the conversion. Server installation files are available from the download page (external link).
MathType (external link) allows typing and pasting equations in LaTeX markup and also direct conversion of an equation in LaTeX markup which is part of the Word document text.
OpenOffice allows typing equations in LaTeX-like markup.
Word 2007 allows typing equations in LaTeX-like markup (although not 100% compatible), see http://blogs.msdn.com/microsoft_office_word/archive/2006/10/04/Equations-in-Word-2007.aspx
TeX2Word:
Shareware, $99 ($45 academic).
Current version: 3.0, Feb 2011.
Support for more document styles and packages will be available with future versions. You can also add support for document styles, packages and user-defined macros yourself (this needs TeX programming knowledge).
Needs:
* MS Windows 2000, XP or later,
* MS Word 97 or later,
* with Word 97 to 2003, MathType (external link) 4 or later is also needed (the full version of the Equation Editor which comes with MS Word).
See homepage (external link)
GrindEQ LaTeX-to-Word: Shareware, 99 EUR (49 EUR academic).
Converts LaTeX, AMS-LaTeX, Plain TeX, or AMS-TeX documents to Microsoft Word format.
You can choose among the following formats for TeX/LaTeX equations: Microsoft Equation 2007, Microsoft Equation 3.x, or MathType.
Works with Microsoft Word 97/2000/XP/2003/2007 and Microsoft Windows 98/Me/NT/2000/XP/2003/x64/Vista.
The evaluation version is restricted to 10 launches.
See homepage (external link)
latex2rtf: LaTeX-to-RTF-converter. See the more detailed page.
TexPort converts your TeX and LaTeX files to WordPerfect or Microsoft Word documents.
See more detailed page.
See homepage (external link)
tex2doc by Thomas Link (external link): LaTeX to WinWord 6 and WinWord 7 (95) converter, written as Word macros. Also attempts to convert tables! Not compatible with Word 2000 and up; no further development.
See homepage (external link)
ltx2word, by myself: LaTeX to WinWord 6, WinWord 7 (95) and WinWord 97 converter, written as Word macros. No tables. Not compatible with Word 2000 and up; no further development.
See more detailed page.
The following are not full converters but only allow typing or pasting LaTeX code into Word:
"Aurora" (formerly named Ribbit) can now convert LaTeX code (which must be placed on the Windows clipboard) to Word. The converter is still experimental and as such has a number of limitations, some of which will be addressed in future releases. The converter’s output will generally need some manual touching up to achieve a level of fidelity on a par with the original document.
The other functionality of Aurora, which was the only functionality of its predecessor "Ribbit", is letting you enter LaTeX equations in word processors such as MS Word or in PowerPoint. You enter the equation in LaTeX markup, and the formatted equation is inserted as an object.
See homepage (external link) (Shareware)
Aurora needs a working LaTeX installation. If there is none, it will install a minimal version of MiKTeX.
"LaTeX in Word": It lets you enter equations in LaTeX markup in word processors such as MS Word, and the formatted equation is inserted into the word processor as a PNG bitmap. It needs a server which performs the conversion.
See homepage (external link) (Freeware, GPL)
Server installation files are available from the download page (external link).
TexPoint enables the easy use of LaTeX symbols and formulas in PowerPoint presentations. See homepage (external link). (Shareware)
The latest version requires PowerPoint 2000 and does not work with earlier versions of PowerPoint.
Aurora (formerly named Ribbit) now also supports PowerPoint. See homepage (external link) (Shareware)
cost free unless otherwise stated
Because HTML is a structured format, the conversion between HTML and LaTeX is rather straightforward. However, the limitations of HTML compared to LaTeX remain, i.e. there are many elements in LaTeX which cannot (yet?) be represented in HTML. Converters from LaTeX to HTML are:
Recommended if you have TeX installed or don't mind installing it:
TeX4ht (external link) is a highly configurable TeX-based converter to hypertext. It comes with a built-in default setting for plain TeX, LaTeX and TeXinfo, and it generates HTML with an accompanying CSS stylesheet, XHTML, or XML. The converter needs a full TeX installation, but this gives the advantage that TeX's full support for macros and styles is available (with only a few exceptions).
Equations are converted to either bitmaps or MathML. Several different MathML flavors exist; the one to use can be chosen by an option.
(The following description is partially copied from the TeX4ht web site.)
The special command oolatex is available for producing XML compatible with OpenOffice, LibreOffice (and probably also StarOffice). The output of the command oolatex <filename> is a zipped file with the same name and an ".odt" extension (containing the document in XML format, which does not suffer the limitations of HTML). For this to work, TeX4ht needs a zip program, which is not included in the TeX4ht distribution but is included e.g. in the MiKTeX distribution.
A zip program can be downloaded from e.g. http://www.info-zip.org/ (external link).
The resulting .odt file can be opened directly in OpenOffice or LibreOffice, and the converted equations are editable in OpenOffice's or LibreOffice's own equation editor. OpenOffice and LibreOffice can save the document in MS Word 97/2000/XP (.doc) format, but some equations may not be converted correctly to Word. It is also possible to save as Microsoft Word 2007 XML (.docx), but this breaks the equations (at least in LibreOffice 3.3.2).
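As a minimal sketch of the oolatex step, the following script builds the command, derives the expected output name and checks for required tools. The file name paper.tex and the executable name "zip" are assumptions; the actual zip program on your system may be named differently.

```python
import shutil
from pathlib import Path


def oolatex_command(tex_file: str) -> list[str]:
    """Build the argument list for the oolatex call described above."""
    return ["oolatex", tex_file]


def expected_output(tex_file: str) -> Path:
    """Same base name as the input, with an .odt extension."""
    return Path(tex_file).with_suffix(".odt")


def missing_tools(tools=("oolatex", "zip")) -> list[str]:
    """Return the tools from `tools` that are not found on PATH.

    The name "zip" is an assumption; adjust it to your zip program.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


print(oolatex_command("paper.tex"))
print(expected_output("paper.tex"))
print(missing_tools())
```

The command list can be passed to subprocess.run once missing_tools() returns an empty list.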
A command of the form
htlatex filename "html,word" "symbol/!"
asks for HTML output tuned toward Microsoft Word. Such a format,
however, relies on bitmaps for mathematical formulas.
Conversion to bitmaps additionally needs Ghostscript and ImageMagick or netpbm.
For more information see
http://tug.org/applications/tex4ht/mn.html
(external link).
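The Word-oriented htlatex call shown above can also be issued from a script. The following is only a sketch: the file name paper.tex and the helper names are hypothetical, and by default the command is returned without being run.

```python
import subprocess


def htlatex_word_command(tex_file: str) -> list[str]:
    """Build the argument list for the Word-oriented htlatex call.

    Each quoted string of the shell command becomes one list element,
    so no shell quoting is needed.
    """
    return ["htlatex", tex_file, "html,word", "symbol/!"]


def convert_for_word(tex_file: str, dry_run: bool = True) -> list[str]:
    """Return the command; run it only when dry_run is False."""
    command = htlatex_word_command(tex_file)
    if not dry_run:
        # Requires a working TeX4ht installation on PATH.
        subprocess.run(command, check=True)
    return command


print(convert_for_word("paper.tex"))
```

Call convert_for_word("paper.tex", dry_run=False) to perform the conversion on a machine where TeX4ht is installed.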
TTH (external link): LaTeX-to-HTML converter which translates LaTeX into HTML 4.0 markup. Formulae are also translated into standard HTML markup. (Free for non-commercial applications.)
A sister of TTH, TtM (external link), converts formulae to MathML (the Linux version is free; the Windows version must be paid for).
ltoh (external link): LaTeX-to-HTML converter which is highly customizable, i.e. you can define how the LaTeX macros which are used in your document are to be translated. Requires that the input file conforms to LaTeX2e (see documentation). It was last updated in 1997, and it seems that the homepage is no longer available, so look on CTAN in .../support/ltoh/ .
HEVEA (external link): LaTeX-to-HTML converter which translates LaTeX into HTML 4.0 markup. Formulae are also translated into standard HTML markup (not yet using MathML).
Hyperlatex (external link) allows the use of a subset of LaTeX to produce documents in HTML.
Some converters are available from CTAN (external link) ("Comprehensive TeX Archive Network"), e.g. in .../support/latex2html. (The ... stands for a host-specific base directory, which often is either "/pub/tex" or "/tex-archive".)
Word 8 (97) and up contain the HTML converter by default (but its installation may have to be explicitly chosen during the Word setup in user-defined mode).
For Word 6 and 7 (95) for Windows and Mac there are free HTML converters available from Microsoft:
Download... IA for Word 6 (external link) / IA for Word 7 (95) (external link) / IA for Word for Mac (external link)
WordPerfect 7 and up have an integrated InternetPublisher.
For WordPerfect 6.1 for Windows, the InternetPublisher is available separately:
Download... InternetPublisher for WPWin 6.1 (external link)
OpenOffice can also import HTML, but it is much better to use TeX4ht for lossless conversion to the native OpenOffice format.
There are ways to use SGML as an intermediate format, and others have used it successfully. Having had a quick look at it, I found it rather complicated; in particular, it seems that you have to define the translation rules yourself. So I did not put more effort into trying to use it. If anyone can give me a ready-to-use cookbook solution, I will include it here.
An upcoming format is XML. A subset of it can be exported and imported by Microsoft Office 2000 and up, OpenOffice uses it as its native format, and the browser programmers are working on implementing XML. It actually is a restricted form of SGML. As it is more powerful than HTML, conversion from LaTeX to XML would lose much less information than conversion from LaTeX to HTML. There is a good chance that it could be used as a general exchange format in the future. TeX4ht already has scripts for converting to XML (TEI or DocBook). MS Word 2000 and earlier cannot import XML; for these target systems, convert to HTML+CSS using the xwtex and xwlatex scripts. MS Word 2003 can export and import XML, but I haven't yet tested whether it can import the TEI or DocBook files produced by TeX4ht.
The most successful path is to use TeX4ht to convert to the OpenOffice format (.odt, which actually is a zip-compressed archive containing the document and vector graphics as XML and the bitmap graphics as bitmap files) and open this in OpenOffice. One could stop there, as OpenOffice is publicly available, or go on and save from OpenOffice as an "MS Word 97/2000/XP" file.
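Because the OpenOffice file is an ordinary zip archive, its parts can be listed with standard tools. The sketch below uses Python's zipfile module; the demo archive is built in memory as a stand-in for a real converted document, and its member names are merely illustrative of the XML-plus-bitmaps layout described above.

```python
import io
import zipfile


def list_members(package) -> list[str]:
    """Return the member names of a zip-based office document."""
    with zipfile.ZipFile(package) as archive:
        return archive.namelist()


# Stand-in archive built in memory so the sketch runs without a real file.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("content.xml", "<office:document-content/>")
    archive.writestr("Pictures/figure1.png", b"")
demo.seek(0)

print(list_members(demo))
```

For a real document, pass the path of the converted file to list_members instead of the in-memory stand-in.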
Most astonishingly, one could also use PDF as an intermediate format. Generating PDF from LaTeX is straightforward if you have a full TeX implementation installed. If you have the full commercial version of Adobe Acrobat 7, you can open the PDF and "save as" e.g. RTF (saving "as Word doc" actually generates an RTF file, too), XML, or plain text. Or you can use other commercial software to convert PDF to Word; just do a web search for "pdf to word" to get several hits. On this conversion path, however, the document structure and probably some formatting will be lost.
Finally, you can use OCR software to convert any printed document to Word or plain text. To avoid the inaccuracy introduced by printing to paper and scanning, you can convert the TeX output to PS or PDF, convert this to a bitmap (using Ghostscript), and feed this bitmap into the OCR software.
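The bitmap step of this path might look like the following sketch. The output pattern, the 300 dpi resolution and the png16m device are assumptions chosen for illustration, and the executable name "gs" is the usual one on Unix (on Windows it is typically gswin32c or gswin64c). The OCR step itself is not shown.

```python
def pdf_to_bitmap_command(
    pdf_file: str,
    out_pattern: str = "page-%03d.png",
    dpi: int = 300,
    executable: str = "gs",
) -> list[str]:
    """Build a Ghostscript call that renders each page to a PNG bitmap."""
    return [
        executable,
        "-dBATCH",                      # exit after the last page
        "-dNOPAUSE",                    # do not wait for a key between pages
        "-sDEVICE=png16m",              # 24-bit colour PNG output device
        f"-r{dpi}",                     # rendering resolution in dpi
        f"-sOutputFile={out_pattern}",  # %03d is replaced by the page number
        pdf_file,
    ]


print(pdf_to_bitmap_command("paper.pdf"))
```

The resulting list can be handed to subprocess.run on a machine where Ghostscript is installed; the generated PNG files are then the input for the OCR software.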
la2mml: converts LaTeX to FrameMaker format. May be outdated; the latest version was created in Nov. 1995.
See more detailed page.
See homepage (external link)
FrameMaker Utilities: contains converters for both directions (LaTeX <-> FrameMaker) as well as templates which make conversion from FrameMaker to LaTeX easier.
See homepage (external link)
This HTML page is part of the texconv pages.
Copyright © 1998 … 2011 Wilfried Hennings
You may copy and redistribute it under the following conditions:
Please also note the disclaimer.