
Where to start? Reading variables from changeable texts


SG


Hi all - It's my first post, so please treat me as the ultimate newbie (forum-, PHP- and MySQL-wise)!  ::)

I have hunted around for solutions for a while and have become baffled by all the different functions (strstr, substr, preg_match etc.), and I am left confused about which direction to head in right from the start.

Essentially, the backbone of what I want to achieve is a simple textarea form field where people can paste in chunks of text (I have no control over the formatting of the text, as it's produced by another application). The text will contain various snippets of information that I would like to read out of the form and ultimately into a MySQL db, but for now I just want to work out the reading part.
An example of some text to read might be this:

Dog
---
Legs: 4 Eyes: 2 Weight: 10.00 lbs
Age: 11 Hair: Black

Habits:
Bites
Barks
Sniffs itself

Another example might be:

Bird
----
Wings: 2 Eyes: 2
Age: 1

Habits:
Targets clean cars

As you can see from these two examples the supplied information will vary each time and so I need to produce a flexible text reader that will scan the text for predefined keywords ("Wings:", "Eyes:", "Age:") and record the information ready to add to a db.

Important things I need to consider are:
The name field has no preceding keyword to trigger off but will always be the first line.
Some lines of the text will have more than one set of information to be recorded separately.
Some information (Like Habits in the example) would be some kind of list or array of multiple lines.
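
For the first and third points, one possible starting point is sketched below. It is only a sketch under assumptions: the function name parseRecord() is made up for illustration, the name is taken to be the first line of the pasted text, and the list is assumed to start at a line reading exactly "Habits:" and to end at the next blank line (or the end of the text).

```php
<?php
// Hypothetical helper: parseRecord() is an illustrative name, not an existing function.
function parseRecord(string $text): array
{
    // Split on any newline style (\R covers \n, \r\n and \r) and trim each line.
    $lines = array_map('trim', preg_split('/\R/', trim($text)));

    // Assumption: the first line is always the name.
    $record = ['name' => array_shift($lines), 'habits' => []];
    $inHabits = false;

    foreach ($lines as $line) {
        if ($line === 'Habits:') {
            $inHabits = true;       // the following lines belong to the list
            continue;
        }
        if ($inHabits) {
            if ($line === '') {
                $inHabits = false;  // a blank line ends the list
                continue;
            }
            $record['habits'][] = $line;
        }
    }
    return $record;
}

$dog = parseRecord("Dog\n---\nLegs: 4 Eyes: 2 Weight: 10.00 lbs\nAge: 11 Hair: Black\n\nHabits:\nBites\nBarks\nSniffs itself\n");
print_r($dog);
```

With the Dog example this gives a name of "Dog" and a three-item habits array. The "Keyword: value" lines are deliberately ignored here, since they need a different kind of splitting.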

So far I have the form and the script to loop through all the lines, but I am unsure how to start splitting up the lines to separate out the keywords and values, and which functions to use. There is no easy character that separates the information (except spaces, but spaces are also within the values so I can't use them).

Can anyone advise me on how I should approach this problem?  ???

Thanks in advance,
Steven

One thing you could use: all the keywords in the text end with ":". If this is always the case, you can use that to identify them. If, however, you might occasionally get "Weight : 10 lbs" (or worse, "Weight 10 lbs"), then you will need an array to define every possible keyword, and that only works so long as keywords cannot also occur in the data. E.g. this could screw things up:

Monk
--------
Clothing: habits

Then it's a simple matter of splitting the text into words and processing each word to see if it is a keyword. If it is, then all words up to the next keyword are data.
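
A minimal sketch of that word-by-word idea, assuming every keyword is a single word ending in ":" (the helper name parseLine() is hypothetical):

```php
<?php
// Illustrative helper (hypothetical name): collect "Keyword: value" pairs
// from one line by walking through its words.
function parseLine(string $line): array
{
    $pairs = [];
    $key = null;

    // Split on runs of whitespace, dropping empty pieces.
    foreach (preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (substr($word, -1) === ':') {
            // A word ending in ':' starts a new keyword.
            $key = rtrim($word, ':');
            $pairs[$key] = '';
        } elseif ($key !== null) {
            // Every word up to the next keyword is data for the current one.
            $pairs[$key] = ltrim($pairs[$key] . ' ' . $word);
        }
    }
    return $pairs;
}

print_r(parseLine('Legs: 4 Eyes: 2 Weight: 10.00 lbs'));
print_r(parseLine('Age: 11 Hair: Black'));
```

For the first line this yields Legs => 4, Eyes => 2 and Weight => "10.00 lbs". It will misread keywords made of several words, and any data word that happens to end in ":", which is where the array of known keywords would come in.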

Unfortunately, not all the keywords have a ':' that can be used to flag them. Below are some exact examples of text spat out by the application (it's a text-based online game, and the information is detailed stats of objects in the game):

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                              A Pale Bone Dagger
                              - ---- ---- ------
    Base Cost: 450          Level: 5          Weight: 1.00 lbs
        Damage: 1d5          Class: dagger      Attack: pierce 
      Element: physical
          Acid: 0%            Fire: 0%            Cold: 0%

    Condition: damaged
    Materials: Bone
    Wear Loc.: right.hand, left.hand
        Layer.: base

          Affects:
            This item is water proof.
            This item may be replicated
            This item has only a marginal amount of metal in it.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                        A Dusty-brown Bear Skin Shield
                        - ----------- ---- ---- ------
    Base Cost: 2000        Level: 25          Weight: 4.00 lbs
  Armor Class: 11            Global AC: 0 
          Acid: 0%            Fire: 0%            Cold: 0%

    Condition: very good
    Materials: Leather
    Wear Loc.: left.hand
        Layer.: base

          Affects:
            Strength by 20.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                              A Cap Of The Blessed
                              - --- -- --- -------
    Base Cost: 30000        Level: 70          Weight: 1.50 lbs
  Armor Class: 0            Global AC: 0 
          Acid: 0%            Fire: 0%            Cold: 0%

    Condition: damaged
    Materials: Cloth
    Wear Loc.: head
        Layer.: bottom

          Affects:

          General Synergies
            Modifies a mage' skills by -50.00.
            Modifies a thief' skills by -50.00.
            Modifies a warrior' skills by -50.00.
            Modifies a ranger' skills by -50.00.
            Modifies a blade dancer' skills by -50.00.
            Modifies a monk' skills by -50.00.
            Modifies a sorcerer' skills by -50.00.
            Modifies a druid' skills by -50.00.
            Modifies an unfinished class.
            Modifies an unfinished class.
            Modifies an unfinished class.
            Modifies an unfinished class.

          Skill Specific Synergies
            Modifies heal by 15.00.
            Modifies restoration by 28.00.
            Modifies group critical by 27.00.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

I can capture all this information into a database using an application that is set up to read this text as it is streamed to my computer. The 'rules' I would set up to read the text would be something like:

Rule 1:
^(%s)Base Cost~:(%s)(%d) // Capture and process (%d) as cost information
Rule 2:
(%s)Level~:(%s)(%d) // Capture and process (%d) as level information
Rule 3:
(%s)Weight~:(%s)(%d)(%s)lbs$ // Capture and process (%d) as weight information
Rule 4:
^(%s)General Synergies$ // Capture following text until a blank line and process as Synergy information

OR

I could make a single rule that manages the above rules 1, 2 and 3 in a single line capture as this particular line is very consistent in formatting:

^(%s)Base Cost~:(%s)(%d)(%s)Level~:(%s)(%d)(%s)Weight~:(%s)(%d)(%s)lbs$

My program reads the text and looks for matches to the above rules, interpreting (%s) as any number of spaces/tabs, (%d) as a number, ^ as the beginning of a line and $ as the end of a line. As it picks up data that fits an individual rule, I can read the information and handle it accordingly.
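
For comparison, here is a rough sketch of how the combined rule and Rule 4 might look as PCRE patterns for preg_match. It rests on assumptions: (%s) is written as [ \t]*, (%d) as a number with an optional decimal part (so that 1.50 matches), "~:" is taken to mean a literal colon, and the group names and variable names are purely illustrative.

```php
<?php
// Assumed PCRE version of the combined rule; the /m flag makes ^ and $ work per line.
$statsRule = '/^[ \t]*Base Cost:[ \t]*(?<cost>\d+)[ \t]*Level:[ \t]*(?<level>\d+)'
           . '[ \t]*Weight:[ \t]*(?<weight>\d+(?:\.\d+)?)[ \t]*lbs[ \t]*$/m';

// Assumed version of Rule 4: capture the non-blank lines after the heading.
$synergyRule = '/^[ \t]*General Synergies[ \t]*\n((?:[ \t]*\S.*\n?)+)/m';

$text = <<<'TXT'
                              A Cap Of The Blessed
                              - --- -- --- -------
    Base Cost: 30000        Level: 70          Weight: 1.50 lbs

          General Synergies
            Modifies a mage' skills by -50.00.
            Modifies a thief' skills by -50.00.

          Skill Specific Synergies
            Modifies heal by 15.00.
TXT;

// Textarea submissions usually arrive with \r\n line endings, so normalise first.
$text = str_replace(["\r\n", "\r"], "\n", $text);

$stats = [];
if (preg_match($statsRule, $text, $m)) {
    $stats = [
        'cost'   => (int) $m['cost'],
        'level'  => (int) $m['level'],
        'weight' => (float) $m['weight'],
    ];
}

$synergies = [];
if (preg_match($synergyRule, $text, $m)) {
    // $m[1] holds the captured block; split it into trimmed lines.
    $synergies = array_map('trim', explode("\n", trim($m[1])));
}

print_r($stats);
print_r($synergies);
```

The same pattern-per-rule layout could be extended with one pattern for each of the other lines, but that is a design choice rather than the only way to do it.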

However, I am finding myself lost recreating this system (or something similar) in PHP, and I'm not sure which functions I should be swatting up on to achieve this.

[quote]... not sure which functions I should be swatting up on to achieve this.[/quote]

I get the feeling you'll need quite an arsenal of functions for this one. Start with these pages

REGEX
http://uk2.php.net/manual/en/ref.regex.php
http://uk2.php.net/manual/en/ref.pcre.php

STRING
http://uk2.php.net/manual/en/ref.strings.php

Great! Thanks for the links - At least I have some direct focus on what I should be learning and understanding now  :)

It is typical of me to take on a big project and bite off more than I can chew  ;D - But there is nothing like jumping in at the deep end and working your way up :)

Thanks again,
Steven