matbaric Posted August 31, 2010

Hi! Any help would be most appreciated.

I am working on a module (PHP or Java, whichever is easier; I don't mind, as it will be used as a service) whose task is to insert letters from a .box file into a PDF. One solution is to convert the .box file (explained below) into some kind of XML first and then into PDF; another is to convert the .box file with its coordinates into PDF directly.

The .box file looks like this:

V 558 4211 612 4247
T 233 4078 329 4199
R 330 4078 422 4198
O 425 4078 534 4199
S 538 4077 624 4198

and so on. Taking the first line as an example:

V - the letter that must be inserted into the PDF
558 - coordinate of the left edge of the letter
4211 - coordinate of the bottom edge of the letter
612 - coordinate of the right edge of the letter
4247 - coordinate of the top edge of the letter

My first question is which programming language to use, Java or PHP. The second is whether anyone has ideas on how to do it.

Thanks in advance.

[attachment deleted by admin]
jayarsee Posted August 31, 2010

Because of the forum you're asking on, I would say most people are going to recommend PHP over Java.

For converting that format into XML, if you /weren't/ going to use PHP I would suggest Perl, since converting documents like that is pretty much what it was made for.

Within PHP, you have two great options among several:

PHP's PDF extension: http://php.net/manual/en/book.pdf.php

or, if this is part of a bigger project, my weapon of choice, Zend Framework's PDF component: http://framework.zend.com/manual/en/zend.pdf.pages.html

The coordinates seem a little funny to me, because if the unit is pixels I would expect their applicability to depend on the font size and line height of the preceding characters.

But converting this to XML would be trivial in PHP. You would simply explode() the .box file into an array of lines, and then explode() each line on spaces into an array of fields. From there, you could use SimpleXML (http://php.net/manual/en/book.simplexml.php) to convert the arrays into XML.
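The two explode() steps and the SimpleXML step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: each line holds exactly five fields separated by single spaces, the function names parseBox() and glyphsToXml() are hypothetical, and the XML element and attribute names (box, glyph, left, and so on) are invented here rather than taken from any existing schema.

```php
<?php
// Hypothetical helper: parse .box text into an array of glyph records.
// Assumes one glyph per line: "<char> <left> <bottom> <right> <top>",
// with fields separated by single spaces.
function parseBox(string $boxText): array
{
    $glyphs = [];
    foreach (explode("\n", $boxText) as $line) {
        $line = trim($line); // also strips a trailing "\r"
        if ($line === '') {
            continue; // skip blank lines
        }
        $parts = explode(' ', $line);
        if (count($parts) < 5) {
            continue; // skip malformed lines rather than guessing
        }
        $glyphs[] = [
            'char'   => $parts[0],
            'left'   => (int) $parts[1],
            'bottom' => (int) $parts[2],
            'right'  => (int) $parts[3],
            'top'    => (int) $parts[4],
        ];
    }
    return $glyphs;
}

// Optional XML step. Element and attribute names are made up for illustration.
function glyphsToXml(array $glyphs): string
{
    $xml = new SimpleXMLElement('<box/>');
    foreach ($glyphs as $glyph) {
        $node = $xml->addChild('glyph');
        foreach ($glyph as $name => $value) {
            $node->addAttribute($name, (string) $value);
        }
    }
    return $xml->asXML();
}

// Usage with two sample lines from the first post.
$glyphs = parseBox("V 558 4211 612 4247\nT 233 4078 329 4199\n");
echo glyphsToXml($glyphs);
```

If the XML is not needed by anything else, the array returned by parseBox() is already enough to drive a PDF library, so the second function can be dropped.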
matbaric (Author) Posted August 31, 2010

First, thanks for the reply, jayarsee.

Besides the PHP PDF library, there is another one I would suggest: http://www.fpdf.org/

I have no problem with explode(), but what bothers me is those 4 coordinates and how to use them. I didn't get that from your previous post.

For example, the PHP PDF library function pdf_show_xy

bool PDF_show_xy ( resource $p , string $text , float $x , float $y )

accepts only x and y coordinates. Examples:

pdf_show_xy($pdf, "There are more things in heaven and earth, Horatio,", 50, 750);
pdf_show_xy($pdf, "than are dreamt of in your philosophy", 50, 730);

What about the other coordinates? How do I insert the letters at the given positions?

Please help.
jayarsee Posted August 31, 2010

You know, I didn't realize when I originally read your post that PDFBox was Apache's project for working with PDF files in Java. If your documents are already in a PDF format, but that format is unique to Java, I would have to say Java will probably require fewer steps, since your file is already prepared.

That's not to say, however, that the conversion would be tremendously difficult in PHP.

Hopefully the coordinates in that format are PostScript points and not pixels or some other unit of measure. If they are points, they will be directly usable as the coordinates PHP's PDF library expects, and you can pass them straight to PDF_set_text_pos().

The only other consideration is font size. The third parameter of PDF_setfont() expects the font size in points. If your coordinates are in points, which hopefully and presumably they are, you could simply subtract the bottom edge of a letter from its top edge to arrive at that letter's font size.

Does that make sense?
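The font-size arithmetic above, plus a fallback in case the values turn out to be pixels, might look like the sketch below. The function names are hypothetical, and the 300 dpi figure is purely an assumed example: the real resolution has to come from whoever produced the .box file. The conversion itself relies only on the definition 1 point = 1/72 inch. Note also that values around 4000 would fall far outside a typical 612 x 792 point Letter page, which hints that the unit may not be points. Treating the bounding-box height as the font size is an approximation, since a glyph rarely fills the full height of its font.

```php
<?php
// Convert a pixel measurement to PostScript points (1 pt = 1/72 inch).
// $dpi is an assumption: it must match the resolution of the source image.
function pixelsToPoints(float $pixels, float $dpi): float
{
    return $pixels * 72.0 / $dpi;
}

// Approximate a letter's font size as its bounding-box height (top minus bottom).
function fontSizeFromBox(float $bottom, float $top): float
{
    return $top - $bottom;
}

// First sample line from the thread: V 558 4211 612 4247
$heightInFileUnits = fontSizeFromBox(4211, 4247);

// Only needed if the file units are pixels; 300 dpi is an assumed example value.
$heightInPoints = pixelsToPoints($heightInFileUnits, 300.0);

printf("height in file units: %.2f, in points at 300 dpi: %.2f\n",
    $heightInFileUnits, $heightInPoints);
```

If the file is already in points, pixelsToPoints() is unnecessary and the bounding-box height can be passed to PDF_setfont() as it is.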
matbaric (Author) Posted August 31, 2010

Thanks again, jayarsee, for the quick reply.

So let's make clear what my task is and what I have. My task is to extract the coordinates from the .box file and make a PDF from them, so the only thing I have is that .box file. One of my guys suggested first turning the .box file into XML and then turning that into a PDF file (look at the attachments in the first post).

First of all I must check with my guys what unit of measure the .box file uses (points, pixels or something else), because the file was produced by OCR or something like it.

Let's take another look at one line:

V - the letter that must be inserted into the PDF
558 - coordinate of the left edge of the letter
4211 - coordinate of the bottom edge of the letter
612 - coordinate of the right edge of the letter
4247 - coordinate of the top edge of the letter

The next step you are suggesting is to take the left and right coordinates and use them with the PDF library function

bool PDF_set_text_pos ( resource $p , float $x , float $y )

so in my case $x would be 612 (the right edge) and $y would be 558 (the left edge). As for the font,

bool PDF_setfont ( resource $pdfdoc , int $font , float $fontsize )

the third parameter would be 4247 - 4211 = 36 pt.

Did I get this all right? Please advise.
jayarsee Posted August 31, 2010

Really, the XML wouldn't be necessary. It sounds suspiciously like that's someone's Java experience talking. If you exploded the format into multi-dimensional arrays, you could create the PDFs directly from those.

You got it on the font size.

The $x,$y that the positioning function refers to is either the top left or the bottom left of the letter. I actually don't remember which, so you'd have to experiment, though a single test should make it easy to tell.
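Putting the advice so far together, the mapping from one .box line to placement values could be sketched like this. It assumes the positioning point is the lower-left corner of the letter, so $x is the left edge and $y is the bottom edge; as noted above, that should be confirmed with a one-page test. The helper name placementFor() and the optional $scale factor (for unit conversion, 1.0 meaning "already in points") are hypothetical. The sketch only computes and prints the values so that it runs without any PDF extension installed.

```php
<?php
// Hypothetical helper: turn one parsed glyph into placement parameters.
// Assumes x,y is the lower-left corner of the letter; verify with one test page.
// $scale converts file units to points (1.0 if the file is already in points).
function placementFor(array $glyph, float $scale = 1.0): array
{
    return [
        'text' => $glyph['char'],
        'x'    => $glyph['left'] * $scale,
        'y'    => $glyph['bottom'] * $scale,
        'size' => ($glyph['top'] - $glyph['bottom']) * $scale,
    ];
}

// Two sample glyphs from the first post, already split into fields.
$glyphs = [
    ['char' => 'V', 'left' => 558, 'bottom' => 4211, 'right' => 612, 'top' => 4247],
    ['char' => 'T', 'left' => 233, 'bottom' => 4078, 'right' => 329, 'top' => 4199],
];

foreach ($glyphs as $glyph) {
    $p = placementFor($glyph);
    // With the PDF extension loaded, these values would be handed to
    // PDF_setfont($pdf, $font, $p['size']) and
    // PDF_show_xy($pdf, $p['text'], $p['x'], $p['y']).
    printf("%s at (%.2f, %.2f), size %.2f\n", $p['text'], $p['x'], $p['y'], $p['size']);
}
```

If the test page shows the letters sitting one letter-height too low, the library is most likely anchoring at the top left, and 'y' should use the glyph's top edge instead.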
matbaric (Author) Posted August 31, 2010

jayarsee, your help is very much appreciated.

OK, I will see what I can do for now, and for the rest of the forum members who are interested in this kind of topic, I'll post the complete code of the solution.

I'll bring new information to this thread as soon as I get some news about the measurement units in the .box file.

Thanks again, jayarsee.