
Issue with PHP regular expressions


Sandeep590


Hello Everyone,

 

 

I have a problem with a regular expression in PHP.

Input given: 9780735585157 (pbk.;teacher's manual)

Expected output: 9780735585157

 

 

For example, take 9780735585157 (pbk.;teacher's manual). My PHP code splits the text on the delimiter ; so the record is split into two separate texts: 9780735585157 (pbk. and teacher's manual).

My question is: how do I write PHP code that replaces (pbk.;teacher's manual) with an empty string, so that only 9780735585157 is displayed in the output?

Note that several other records have such delimiters inside the parentheses, which causes the original text to be displayed in two parts.

Kindly help me with this issue.

 

 

With Regards,

Sandeep.

 

 

 

 


To answer your question number 1:

The input comes from a text file that has not been properly formatted. We cannot reformat it by hand because it contains a huge number of records.

To answer your question number 2:

I have no control over why there is no proper data structure separating the 10- or 13-digit ISBN from the rest of the input text file.

Is there any solution you can provide? I would be thankful.


You could just search and replace the ( (together with the space before it) with ,( which will give you a comma-separated list, as long as that format is consistent in your data, meaning numbers followed by a left parenthesis.

 

That would give you

9780735585157,(pbk.;teacher's manual)

 

Then you could do whatever you want with the properly formatted data.
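
The search-and-replace idea above could be sketched roughly like this. It is only an illustration and assumes every record is an ISBN, a single space, and then a parenthesized note; the function name splitRecord is hypothetical and not from the original post.

```php
<?php

// Hypothetical helper: splits one record into the ISBN and the parenthesized note.
// Assumes the note begins at the first " (" in the record.
function splitRecord(string $record): array
{
    // Replace " (" with ",(" so the record becomes comma separated.
    $csv = str_replace(' (', ',(', $record);

    // Split on the first comma only, because the note itself may contain commas.
    return explode(',', $csv, 2);
}

[$isbn, $note] = splitRecord("9780735585157 (pbk.;teacher's manual)");
echo $isbn, "\n"; // 9780735585157
echo $note, "\n"; // (pbk.;teacher's manual)
```

A record without any " (" comes back as a one-element array, so real code would need to check the element count before destructuring.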

Edited by benanamen

Since you aren't dealing with a proper data structure, I would not make any assumptions about details like the presence of a semicolon.

 

One line (apparently) consists of an ISBN followed by additional information in parentheses:

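<?php

// A whole line: an ISBN (digits and hyphens), optional whitespace,
// then additional information in parentheses.
const BOOK_REGEX = '~\A(?<isbn>[\d-]+)\s*\([^)]+\)\s*\z~';

$bookFile = fopen('/path/to/file', 'r');
if ($bookFile === false)
{
    exit('Could not open the book file.');
}

$isbnCollection = [];
$lineNumber = 1;
$matches = null;
// Compare with false explicitly so a last line such as "0" does not end the loop early.
while (($line = fgets($bookFile)) !== false)
{
    // Skip lines that contain only whitespace.
    if (!ctype_space($line))
    {
        if (preg_match(BOOK_REGEX, trim($line), $matches))
        {
            $isbnCollection[] = $matches['isbn'];
        }
        else
        {
            // Report lines that don't fit the expected format instead of guessing.
            echo "Malformed line {$lineNumber}: ".htmlspecialchars($line, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8').'<br>';
        }
    }

    $lineNumber++;
}

var_dump($isbnCollection);

fclose($bookFile);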
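
If the goal is only to blank out the parenthesized part, as the original question asks, a smaller sketch with preg_replace might also do. It assumes the parentheses are never nested; the function name stripParenthesized is hypothetical.

```php
<?php

// Hypothetical helper: removes every parenthesized group together with
// the whitespace in front of it. Nested parentheses are not handled.
function stripParenthesized(string $record): string
{
    return trim(preg_replace('~\s*\([^)]*\)~', '', $record));
}

echo stripParenthesized("9780735585157 (pbk.;teacher's manual)"), "\n"; // 9780735585157
```

Unlike the loop above, this does not report malformed lines, so it suits data you already trust.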

Well, thank you all so much for your valuable input and suggestions.

The regular expression I used to extract the 10-digit or 13-digit ISBN is shown below.

preg_match_all('/\d+(?:\d|X)/', $str, $matches);

Here $str is the string to be parsed and $matches receives the results.
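
Note that the pattern above matches any run of two or more digits (or digits followed by X), not only 10- or 13-digit numbers. A stricter variant might look like the sketch below. It assumes the ISBNs contain no hyphens and stand as separate words; the names ISBN_REGEX and extractIsbns are hypothetical.

```php
<?php

// Either exactly 13 digits, or 9 digits followed by a digit or X (ISBN-10).
// The \b anchors keep longer or shorter digit runs from matching.
const ISBN_REGEX = '~\b(?:\d{13}|\d{9}[\dX])\b~';

// Hypothetical helper: returns every ISBN-like token found in $str.
function extractIsbns(string $str): array
{
    preg_match_all(ISBN_REGEX, $str, $matches);

    return $matches[0];
}

echo implode(', ', extractIsbns("9780735585157 (pbk.;teacher's manual) 080442957X 12345")), "\n";
// 9780735585157, 080442957X
```

This checks only the shape of the number; it does not validate the ISBN check digit.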

