Convert small pieces of HTML to DOCX

Jun 7, 2012 at 8:23 AM

Hi everyone!

I've got HTML in my DB and I want to convert it to docx. Is there any way to convert HTML to docx? It doesn't have to be a new document; showing the docx 'code' on the screen is good enough.

So I'm looking for the name of the function that converts the HTML to docx. Does anyone have an idea?

Kind regards!

Coordinator
Jun 7, 2012 at 5:41 PM
Edited Jun 7, 2012 at 6:47 PM

Hi roestbak

Well, this is what this project does.

For an example of how to perform the conversion, look at the file "example.php" (in the htmltodocx folder). For slightly more detailed documentation, put the files on your web server and browse to htmltodocx/documentation/index.php.

Jun 8, 2012 at 8:12 AM

Hi neilt17!

Thanks for your reply.

When I look at the example, a URL is provided for the page you want to convert to docx. I'm looking for the name of the function that converts the HTML to docx. I tried the h2d_insert_html function, but that doesn't work. I've got a query which pulls some HTML data out of my DB, and I want to convert that to docx. I'm using another tool to create the final docx.

Kind regards!

Coordinator
Jun 8, 2012 at 5:42 PM

The function that does all the work is h2d_insert_html_recursive() (called from h2d_insert_html()). It is entirely dependent on PHPWord, so if you are not feeding a PHPWord object into it, it won't work. I guess you could use another tool, but you would need to rewrite a lot of the code here to work with the methods that tool uses to create a docx document. By the way, what tool are you thinking of using?

Jun 12, 2012 at 8:28 AM

Hi!

Thanks for your answer!

I've got a web application which creates HTML reports. I also want the report as a Word document. With this tool I can convert a whole HTML page, but I have some trouble with tables. Sometimes a table row has one column and sometimes a row has multiple columns. I use 'colspan' on the one-column row to make it the same width as a row with 4 columns. When I use this tool to convert the HTML to docx, my colspan is broken, and I cannot fix it.
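
For reference, in the docx format (WordprocessingML) an HTML colspan corresponds to a w:gridSpan element inside the cell's properties. The sketch below is not part of htmltodocx or PHPWord; it is a minimal, standard-library Python illustration of that mapping. The names TableCellCollector and html_table_to_wordml are made up for this example. It assumes a single, simple, non-nested table with integer colspan values, and it only produces a table fragment: the w: namespace declaration and the rest of document.xml would have to come from whatever tool builds the final file.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class TableCellCollector(HTMLParser):
    """Collect the cells of one simple HTML table as rows of [text, colspan]."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list per <tr>, holding [text, span] pairs
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            # A missing or empty colspan attribute counts as a span of 1.
            span = int(dict(attrs).get("colspan") or "1")
            self.rows[-1].append(["", span])
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1][0] += data


def html_table_to_wordml(html):
    """Return a <w:tbl> fragment in which colspan becomes w:gridSpan."""
    collector = TableCellCollector()
    collector.feed(html)
    collector.close()

    # The table grid needs one column per unit of span in the widest row.
    grid_cols = max(
        (sum(span for _, span in row) for row in collector.rows), default=0
    )

    parts = ["<w:tbl><w:tblPr/><w:tblGrid>"]
    parts.append("<w:gridCol/>" * grid_cols)
    parts.append("</w:tblGrid>")
    for row in collector.rows:
        parts.append("<w:tr>")
        for text, span in row:
            # Only cells that span more than one grid column get w:gridSpan.
            props = (
                f'<w:tcPr><w:gridSpan w:val="{span}"/></w:tcPr>'
                if span > 1
                else ""
            )
            parts.append(
                f"<w:tc>{props}<w:p><w:r><w:t>{escape(text.strip())}"
                "</w:t></w:r></w:p></w:tc>"
            )
        parts.append("</w:tr>")
    parts.append("</w:tbl>")
    return "".join(parts)
```

Feeding it a table whose first row is a single cell with colspan="4" above a row of four cells gives a grid of four columns, with w:gridSpan set to 4 on the first cell only. If a converter drops the colspan, this is the element to look for in the generated document.xml.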

I also tried another tool (OpenTBS). It works with a docx template and merges my data (from the DB) into the template. That works great with tables, but not with the HTML data from my DB.

My idea was to combine both tools, but I don't think that is going to work...