
Catching a 3 letter word from a text file


newphpnoob13


Hello Everyone,

 

I have a log file in txt format. Here's a part of it:

================================================

[Jan 25 2012 11:47:31] - ID2PDF v.2.6 (ID2PDF.jsx)

General options: [default] (ID2PDF_options.xml)

================================================

Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml

extendedPrintPDF started

Postfix '3.0' was append from file 'ESQ030112ELAM_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd

printPDF started

PDF Export Preset: Some preset

PDF file created: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.pdf'.

File someFileName.xml removed

postprocessingDocument started

INDD file removed: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd

================================================

[Jan 25 2012 15:18:23] - ID2PDF v.2.6 (ID2PDF.jsx)

General options: [default] (ID2PDF_options.xml)

================================================

Loaded options from XML file: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml

extendedPrintPDF started

Postfix '8.1' was append from file 'ESQ030112Politics_Russia_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.indd

printPDF started

PDF Export Preset: Hearst PDF 1.6_1_16_08

PDF file created: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.pdf'.

File oneMoreFile.xml removed

postprocessingDocument started

INDD file removed: /Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/oneMoreFile.indd

 

The file continues this way. I need to capture the 3-letter codes such as ESQ and COS into an array, taking them ONLY from the file path on the first line of each entry (the "Loaded options from XML file" line). I can't work out the logic or a regex for this because the user name in the path may change often.

 

Can anyone give me a heads-up on how I could achieve this?

 

Would appreciate the help.

 

Thanks

So you're looking for the "COS" in

Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml'

Is the folder structure always the same? At least in terms of the SM_Folder/*? Then it's as simple as

/^Loaded options from XML file: '/.*/SM_Folder/([a-z]{3})/Contract_Proof_\1/processing/ID2PDF_options.xml'$/im

(\1 being a backreference to the [a-z]{3} matched earlier)

 

If the folder structure varies much more than that then how do you know where to find the "COS"?
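As a small illustration of the backreference mentioned above, here is a minimal sketch. The shortened paths and the variable names are made up for the demo, and it uses # as the pattern delimiter so the slashes need no escaping:

```php
<?php
// \1 must repeat exactly what the first group ([a-z]{3}) captured.
$pattern = '#/([a-z]{3})/Contract_Proof_\1/#i';

// The folder code and the suffix agree (COS ... COS), so this matches.
$same = preg_match($pattern, '/SM_Folder/COS/Contract_Proof_COS/processing/', $m);

// The folder code and the suffix differ (COS ... ESQ), so this does not match.
$different = preg_match($pattern, '/SM_Folder/COS/Contract_Proof_ESQ/processing/', $n);

var_dump($same, $m[1], $different);
```

The capture in `$m[1]` is the 3-letter code itself, which is the value being asked for.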

Okay. So, now I've got to

/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-zA-Z])\/[a-zA-Z]_Proof_\1\/processing\/ID2PDF_options.xml'$/im

But, I get an EMPTY array. Still better than getting an error. But, how do I resolve this to get my 3 letter word?

Note: If I take off the '\' from

\/([a-zA-Z])\

I get an unknown modifier '(' error.

The [a-zA-Z] you added will only match one character. You'd need a + to match more than one.

 

$regex = '#^Loaded options from XML file: \'/.*/SM_Folder/([a-z]{3})/\w+_\1/processing/ID2PDF_options.xml\'$#im';
preg_match_all($regex, $text_from_file, $matches) && var_dump($matches);

works for me (with the wonky quotes in the example text fixed).
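To make the two points above concrete, here is a rough sketch run against one line from the first post (with the closing quote in place). The variable names are only for illustration. It also shows that with # as the delimiter the slashes in the path do not need backslashes, which is likely what caused the "unknown modifier" error when / was the delimiter:

```php
<?php
$line = "Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/COS/Contract_Proof_COS/processing/ID2PDF_options.xml'";

// A bare character class matches exactly ONE letter, so it cannot
// cover the three letters that sit between the two slashes.
$one = preg_match('#/SM_Folder/([a-zA-Z])/#', $line, $m1);

// {3} asks for exactly three letters (a + would allow one or more).
$three = preg_match('#/SM_Folder/([a-zA-Z]{3})/#', $line, $m2);

var_dump($one, $three, $m2[1]);
```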

@requinix:

This data is closer to my original file. I messed up the quotes a little in my original post.


================================================
[Jan 25 2012 11:26:03] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/someFile_options.xml
extendedPrintPDF started
Postfix '4.4' was append from file 'ABCFileName.xml' for file: /This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.indd
PDF file created: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.4.pdf'.
INDD file removed: /This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.indd
================================================
[Jan 25 2012 11:44:50] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/ID2PDF_options.xml
extendedPrintPDF started
Postfix '4.2' was append from file 'XYZFileName.xml' for file: /This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.indd
PDF file created: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.2.pdf'.
INDD file removed: /This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/XYZFileName.indd

However, I'm still getting an empty array, both with

/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options.xml$/im

and with the regex you gave.

I'm trying everything and it doesn't seem to help. :(

P.S. I've ignored stuff from the file that's not required for this particular problem.

Well, the regex specifically says "SM_Folder" while your output has "SoMe_Folder", and the regex also expects "ID2PDF" where your output has "someFile".

 

If you don't know those names for sure then use a wildcard. Also I'm removing the leading slash from the filename, just in case.

/^Loaded options from XML file: '.*\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/.*_options.xml$/im

When a regex doesn't match, make it more and more generic (more wildcards, fewer literal characters) until it matches. Then figure out what you may or may not need to add back to get the exact results you need.
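Putting the loosened pattern to work, the following is a sketch of collecting the codes into an array. The sample text is an abridged copy of the log posted above, and the variable names are only for illustration. Two things here are assumptions rather than part of the regex as posted: the optional closing quote (the pasted log has none, but the real file may) and the optional \r in case the file uses Windows line endings.

```php
<?php
// Abridged sample of the log from this thread, kept in a nowdoc so the
// quotes and slashes are taken literally.
$log = <<<'LOG'
================================================
[Jan 25 2012 11:26:03] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/someFile_options.xml
extendedPrintPDF started
INDD file removed: /This/folder/Users/userName/Desktop/SoMe_Folder/ABC/SomeName_Proof_ABC/processing/ABCFileName.indd
================================================
[Jan 25 2012 11:44:50] - ID2PDF v.2.6 (ID2PDF.jsx)
General options: [default] (ID2PDF_options.xml)
================================================
Loaded options from XML file: '/This/folder/Users/userName/Desktop/SoMe_Folder/XYZ/SomeOtherName_Proof_XYZ/processing/ID2PDF_options.xml
extendedPrintPDF started
LOG;

// Same idea as the regex above, written with # delimiters.
// m: ^ and $ work per line; i: letters match in either case.
$regex = '#^Loaded options from XML file: \'.*/([a-zA-Z]{3})/[a-zA-Z]+_Proof_\1/processing/.*_options\.xml\'?\r?$#im';

preg_match_all($regex, $log, $matches);

// Index 0 holds the full matched lines, index 1 holds the first group.
$codes = $matches[1];
print_r($codes);
```

Only the "Loaded options" lines can match, so the "INDD file removed" lines that contain the same folders are ignored. For the real file, the text would come from something like file_get_contents() on the log's path instead of the inline sample.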
