Catching a 3 letter word from a text file


newphpnoob13

Hello Everyone,

 

I have a log file in txt format. Here's a part of it:

================================================

[Jan 25 2012 11:47:31] - ID2PDF v.2.6 (ID2PDF.jsx)

General options: [default] (ID2PDF_options.xml)

================================================

Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml

extendedPrintPDF started

Postfix '3.0' was append from file 'ESQ030112ELAM_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd

printPDF started

PDF Export Preset: Some preset

PDF file created: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.pdf'.

File someFileName.xml removed

postprocessingDocument started

INDD file removed: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd

================================================

[Jan 25 2012 15:18:23] - ID2PDF v.2.6 (ID2PDF.jsx)

General options: [default] (ID2PDF_options.xml)

================================================

Loaded options from XML file: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml

extendedPrintPDF started

Postfix '8.1' was append from file 'ESQ030112Politics_Russia_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.indd

printPDF started

PDF Export Preset: Hearst PDF 1.6_1_16_08

PDF file created: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.pdf'.

File oneMoreFile.xml removed

postprocessingDocument started

INDD file removed: /Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.indd

 

And it continues this way. I need to collect the 3 letter words like ESQ and COS (taken ONLY from the file path on the very first line of each entry) into an array. I can't work out the logic or a RegEx for this because the user name in the path may change often.

 

Can anyone give me a heads-up on how I could achieve this?

 

Would appreciate the help.

 

Thanks


So you're looking for the "COS" in

Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml'

Is the folder structure always the same? At least in terms of the SM_Folder/*? Then it's as simple as

/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-z]{3})\/Contract_Proof_\1\/processing\/ID2PDF_options\.xml'$/im

(\1 being a backreference to the [a-z]{3} matched earlier)
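To see the backreference at work, here is a minimal sketch. The two sample strings are made up for illustration (they are not from the log), and "#" is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical sample paths; only the folder names matter here.
$lines = [
    "SM_Folder/COS/Contract_Proof_COS/processing", // both codes agree
    "SM_Folder/COS/Contract_Proof_ESQ/processing", // codes differ
];

// ([a-z]{3}) captures the code; \1 must repeat exactly what was captured.
$pattern = '#SM_Folder/([a-z]{3})/Contract_Proof_\1/processing#i';

$results = [];
foreach ($lines as $line) {
    // Store the captured code on a match, null otherwise.
    $results[] = preg_match($pattern, $line, $m) === 1 ? $m[1] : null;
}

var_dump($results); // first entry is "COS", second is NULL
```

The second string fails only because the text after "_Proof_" is not the same three letters that the group captured.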

 

If the folder structure varies much more than that then how do you know where to find the "COS"?


Okay. So, now I've got to

/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-zA-Z])\/[a-zA-Z]_Proof_\1\/processing\/ID2PDF_options.xml'$/im

But I get an EMPTY array. Still better than getting an error, but how do I fix this to get my 3 letter word?

Note: If I take off the '\' from

\/([a-zA-Z])\

I get an unknown modifier '(' error.


The [a-zA-Z] you added matches exactly one character. You need a quantifier to match more than one: + for one or more, or {3} for exactly three.
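A quick way to check this for yourself, as a rough sketch. The test strings '/COS/' and '/ABCD/' are made up for the demonstration.

```php
<?php
$segment = '/COS/';

// A bare character class matches exactly one letter, so "/COS/" does not fit.
$one = preg_match('#/([a-zA-Z])/#', $segment);

// "+" allows one or more letters.
$many = preg_match('#/([a-zA-Z]+)/#', $segment, $a);

// "{3}" demands exactly three letters.
$three = preg_match('#/([a-zA-Z]{3})/#', $segment, $b);

// A four-letter folder is rejected by {3} because a slash must follow the third letter.
$four = preg_match('#/([a-zA-Z]{3})/#', '/ABCD/');

var_dump($one, $many, $three, $four); // int(0), int(1), int(1), int(0)
```

Since you only want 3 letter words, {3} is the safer choice: + would also accept folder names of any other length.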

 

// "#" as the delimiter means the slashes in the path need no escaping
$regex = '#^Loaded options from XML file: \'/.*/SM_Folder/([a-z]{3})/\w+_\1/processing/ID2PDF_options\.xml\'$#im';
preg_match_all($regex, $text_from_file, $matches) && var_dump($matches);

works for me (with the wonky quotes in the example text fixed).
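On the "unknown modifier" error: PHP treats the first character of the pattern as the delimiter, so with "/" as the delimiter an unescaped "/" ends the pattern early and whatever follows is read as modifiers. Here is a small sketch of the two ways around it. The line is a shortened, made-up version of one log entry.

```php
<?php
// Shortened, hypothetical log line with the quotes fixed.
$line = "Loaded options from XML file: '/Users/someone/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml'";

// "/" as the delimiter: every literal slash has to be written as \/
$slashDelimited = '/\/SM_Folder\/([a-z]{3})\//i';

// "#" as the delimiter: slashes are ordinary characters
$hashDelimited = '#/SM_Folder/([a-z]{3})/#i';

preg_match($slashDelimited, $line, $m1);
preg_match($hashDelimited, $line, $m2);

var_dump($m1[1], $m2[1]); // both are "COS"
```

Both patterns are equivalent. The "#" version is just easier to read when the text being matched is full of slashes.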


@requinix:

This data is closer to my original file. I messed up the quotes a little bit in my original post.


================================================
[Jan 25 2012 11:26:03] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/someFile_options.xml
extendedPrintPDF started
Postfix '4.4' was append from file 'ABCFileName.xml' for file: /This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.indd
PDF file created: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.4.pdf'.
INDD file removed: /This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.indd
================================================
[Jan 25 2012 11:44:50] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/ID2PDF_options.xml
extendedPrintPDF started
Postfix '4.2' was append from file 'XYZFileName.xml' for file: /This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.indd
PDF file created: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.2.pdf'.
INDD file removed: /This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.indd

However, I'm still getting an empty array, both with

/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options.xml$/im

and with the RegEx you gave.

I'm trying everything and it doesn't seem to help. :(

P.S. I've left out the parts of the file that aren't needed for this particular problem.


Well, the regex specifically says "SM_Folder" while your output has "SoMe_Folder", and the regex also expects "ID2PDF" where your output has "someFile".

 

If you don't know those names for sure then use a wildcard. Also I'm removing the leading slash from the filename, just in case.

/^Loaded options from XML file: '.*\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/.*_options.xml$/im

When a regex doesn't match, make it more and more generic (more wildcards, fewer literal characters) until it matches. Then figure out what you may or may not need to add back to get the exact results you need.
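Here is a rough sketch of that loosening process, run against two "Loaded options" lines copied from the sample above. The labels and variable names are only for illustration.

```php
<?php
// Two entries from the sample log (note: no closing quote on these lines).
$log = <<<'LOG'
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/someFile_options.xml
extendedPrintPDF started
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/ID2PDF_options.xml
LOG;

// From most literal to most generic; each step swaps a literal name for a wildcard.
$candidates = [
    'literal folder and file' =>
        '#^Loaded options from XML file: \'/.*/SM_Folder/([a-zA-Z]{3})/[a-zA-Z]+_Proof_\1/processing/ID2PDF_options\.xml$#im',
    'any folder, literal file' =>
        '#^Loaded options from XML file: \'.*/([a-zA-Z]{3})/[a-zA-Z]+_Proof_\1/processing/ID2PDF_options\.xml$#im',
    'any folder, any file' =>
        '#^Loaded options from XML file: \'.*/([a-zA-Z]{3})/[a-zA-Z]+_Proof_\1/processing/.*_options\.xml$#im',
];

$found = [];
foreach ($candidates as $label => $pattern) {
    preg_match_all($pattern, $log, $m);
    $found[$label] = $m[1]; // captured three-letter codes, possibly empty
}

print_r($found);
```

The first pattern finds nothing because of "SM_Folder", the second finds only XYZ because the other entry uses "someFile_options.xml", and the third returns both ABC and XYZ. If your real file has Windows line endings, the "$" anchor may also need attention, but that is a guess since the thread doesn't say.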
