stlyusss, posted May 21, 2018 (edited):

I want to create a function that first reads a .txt file and creates arrays from repeating patterns inside of it. Let's pretend the file contains the following information (word for word).

    1} ABC
    Answer: More Words
    *NEXT*
    1} DEF
    Answer: More Stuff
    *NEXT*

What I'd like the function to do is read the file, put all content after 1} into an array up until Answer: . Also inside the same array, I want to store the answer -- the answer stops at *NEXT*. Therefore, the final array would be:

    array (
        (ABC, More Words),
        (DEF, More Stuff)
    );

Eventually I'd use a for each or while statement to read the array. How would a professional PHP coder tackle this?

benanamen, posted May 21, 2018:

You would be much better off telling us the actual problem you're trying to solve rather than telling us how you want to solve it.

stlyusss (author), posted May 21, 2018:

I'm building a PHP-based script that reads documents like the one I described and posts them on my question-answer website.

Barand, posted May 21, 2018:

I certainly would not start with a text file in that format. Here are some alternatives.

1) Simple ini file

    ;; Questions and Answers
    ;;
    ;; Format:
    ;;     question text="answer text"
    ;;
    What was the US President's name in 1976?="Donald Trump"
    How many sides has dodecahedron?="Twenty"
    What is the capital of Mongolia?="Ulaanbaatar"

    ------------------------------------------------------------------
    TO PROCESS
    ------------------------------------------------------------------
    <?php
    $array = parse_ini_file('qa1.txt');
    ?>

    ------------------------------------------------------------------
    RESULTS (question is the array key and the answer is the value)
    ------------------------------------------------------------------
    Array
    (
        [What was the US President's name in 1976?] => Donald Trump
        [How many sides has dodecahedron?] => Twenty
        [What is the capital of Mongolia?] => Ulaanbaatar
    )

2) Complex ini file

    ;; Questions and Answers
    ;;
    ;; Format:
    ;;     [question_number]
    ;;     Q="question text"
    ;;     A="answer text"
    ;;
    [1]
    Q="What was the US President's name in 1976?"
    A="Donald Trump"
    [2]
    Q="How many sides has dodecahedron?"
    A="Twenty"
    [3]
    Q="What is the capital of Mongolia?"
    A="Ulaanbaatar"

    ------------------------------------------------------------------
    TO PROCESS
    ------------------------------------------------------------------
    <?php
    $array = parse_ini_file('qa2.txt', true);
    ?>

    ------------------------------------------------------------------
    RESULTS
    ------------------------------------------------------------------
    Array
    (
        [1] => Array
            (
                [Q] => What was the US President's name in 1976?
                [A] => Donald Trump
            )
        [2] => Array
            (
                [Q] => How many sides has dodecahedron?
                [A] => Twenty
            )
        [3] => Array
            (
                [Q] => What is the capital of Mongolia?
                [A] => Ulaanbaatar
            )
    )

3) CSV file

    "What was the US President's name in 1976?","Donald Trump"
    "How many sides has dodecahedron?","Twenty"
    "What is the capital of Mongolia?","Ulaanbaatar"

    ------------------------------------------------------------------
    TO PROCESS
    ------------------------------------------------------------------
    <?php
    $array = [];
    $fh = fopen('qa3.txt', 'r');
    while ($qa = fgetcsv($fh)) {
        $array[] = $qa;
    }
    fclose($fh);
    ?>

    ------------------------------------------------------------------
    RESULTS
    ------------------------------------------------------------------
    Array
    (
        [0] => Array
            (
                [0] => What was the US President's name in 1976?
                [1] => Donald Trump
            )
        [1] => Array
            (
                [0] => How many sides has dodecahedron?
                [1] => Twenty
            )
        [2] => Array
            (
                [0] => What is the capital of Mongolia?
                [1] => Ulaanbaatar
            )
    )

stlyusss (author), posted May 21, 2018 (edited):

Unfortunately all the .txt files come in this format. Here's a sample of a text file (I uploaded a sample, didn't appear). That's why I asked the initial question.

kicken, posted May 21, 2018:

I typically tackle things like this by just splitting the text into components by the static separators. Repeat multiple times until you get what you want, or narrow the scope down to something easier to work with. For example:

    <?php
    $text = file_get_contents('yourfile.txt');

    //Split each question/answer by the *NEXT* separator
    $parts = explode('*NEXT*', $text);
    $result = [];
    foreach ($parts as $part){
        //Split the question and answer by the Answer: separator
        list($q, $a) = explode('Answer:', $part, 2);

        //Remove the #} from the question
        $q = preg_replace('/^\s*\d+}/', '', $q);

        //Trim whitespace
        $q = trim($q);
        $a = trim($a);

        $result[$q] = $a;
    }

    var_dump($result);

This depends on your separator values being unique. If your separator values occurred somewhere else then it'd fail.

stlyusss (author), posted May 21, 2018:

This is excellent. It actually worked, just not perfectly. Perhaps we can tweak it? Here's the sample file I'm working with: https://expirebox.com/download/710b51edbc953c940616624375d4f4c0.html

I get the following:

    Array
    (
        [This person devised a simple formula for calculating an index of intelligence, or intelligence quotient (IQ). A) Theo Simon B) Wilhelm Stern C) Franz Gall D) Louis Thurstone] => B
        [3 1} Which of the following individuals has the highest IQ according to Stern's formula? A) Clarissa, with a mental age of 9 and a chronological age of 9 B) Matt, with a mental age of 9 and a chronological age of 10 C) Cecilee, with a mental age of 9 and a chronological age of 6 D) They would all be close in IQ; the difference would not be significant.] => C
        [1 1} William Stern’s formula for the intelligence quotient was (mental age/chronological age) × 100. What is the IQ of a 12-year-old with a mental age of 9? A) 75 B) 85 C) 125 D) 135] => A
        [Twelve-year-old Arnold received an IQ test score of 75. What is his mental age? A) 9 B) 10 C) 5 D) 7] => A
        [3 1} What is the IQ of a 12-year-old with a mental age of 16? A) 147 B) 70 C) 133 D) 145] => C
        [] =>
    )
    1

Notice that sometimes there is content after *NEXT* in the sample file that needs to be excluded. How can we account for that?

Barand, posted May 21, 2018:

You should be able to follow the hints given by Kicken and work it out for yourself now. If you hadn't wasted everyone's time by posting totally non-representative data in the first instance you could have been given a working solution by now.

kicken, posted May 21, 2018:

Use the 1} token as another separator to split the question from the extra data. Assuming the text is always 1} and never any other number, it'd be pretty much the same code as for the Answer: token.

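A minimal sketch of the approach kicken describes, assuming the token is always the literal `1}` and that anything between `*NEXT*` and the following `1}` is leftover data to discard. The function name `parseQuestions` and the sample string are illustrative and do not come from the thread's sample file.

```php
<?php
// Hypothetical helper: returns a list of [question, answer] pairs.
function parseQuestions(string $text): array
{
    $result = [];
    foreach (explode('*NEXT*', $text) as $part) {
        // Split on the first "1}"; whatever precedes it is leftover data.
        $afterToken = explode('1}', $part, 2);
        if (count($afterToken) < 2) {
            continue; // no question in this chunk (e.g. text after the last *NEXT*)
        }
        // Same idea as before: split question from answer on "Answer:".
        $qa = explode('Answer:', $afterToken[1], 2);
        if (count($qa) < 2) {
            continue; // question without an answer marker; skipped in this sketch
        }
        $result[] = [trim($qa[0]), trim($qa[1])];
    }
    return $result;
}

// Sample text modelled on the first post, with a stray "3" after *NEXT*.
$sample = "1} ABC\nAnswer: More Words\n*NEXT* 3\n1} DEF\nAnswer: More Stuff\n*NEXT*\n";
print_r(parseQuestions($sample));
```

Storing pairs rather than using the question as the array key avoids losing entries when two questions happen to share the same text. As with the original code, this only works while `1}`, `Answer:` and `*NEXT*` never appear inside a question or answer.
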
stlyusss (author), posted May 21, 2018:

Barand said:
    You should be able to follow the hints given by Kicken and work it out for yourself now. If you hadn't wasted everyone's time by posting totally non-representative data in the first instance you could have been given a working solution by now.

damn that was rude, i'm learning, man

@kicken, do you mind writing it for me?

Psycho, posted May 21, 2018:

Try this

    <?php

    //Read file into variable
    $file = "Sample.txt";
    $text = file_get_contents($file);

    //Create array to hold results
    $results = array();

    //Split the content based on *NEXT*
    $questions = preg_split("#\*NEXT\*[^\n]*#is", $text);

    //Process each question section
    foreach($questions as $question)
    {
        //Find the question text
        if(preg_match("#}(.*)#", $question, $question_match))
        {
            //Extract the question text
            $question_text = trim($question_match[1]);

            //Find the answers
            preg_match_all("#([ABCD]\)) ([^\n]*)#i", $question, $answers_match, PREG_PATTERN_ORDER);
            $answers = array_combine(['A','B','C','D'], array_map('trim', $answers_match[2]));

            //Find the correct answer
            preg_match("#Answer\: ([ABCD])#", $question, $correct_match);
            $correct = $correct_match[1];

            //Put question parts into results
            $results[] = array(
                'question' => $question_text,
                'answers' => $answers,
                'correct' => $correct
            );
        }
    }

    //See results
    echo "<pre>" . print_r($results, true) . "</pre>";

stlyusss (author), posted May 21, 2018:

That's incredible. If I could bug you for a small adjustment? Can you make it so that the question and answer options are combined into one field called question? The reason is that not all the questions are multiple choice, so this would make it more versatile, I think. Also, not all options are A, B, C and D; some answers could be paragraphs.

Barand, posted May 21, 2018:

And so it continues!

kicken, posted May 21, 2018:

stlyusss said:
    damn that was rude, i'm learning, man

Based on this thread so far, no you are not learning, you're trying to get free work.

stlyusss said:
    @kicken, do you mind writing it for me?

No, I'm not going to write it for you. You've been given enough details to be able to accomplish what you want on your own if you put even a tiny amount of effort into it, which shouldn't be a problem if you're really trying to learn.

stlyusss (author), posted May 22, 2018:

I did try, in fact. Just couldn't figure it out.

    //Split the content based on *NEXT*
    $questions = preg_split("#\*NEXT\*[^\n]*#is", $text);

    //Process each question section
    foreach($questions as $question)
    {
        //Find the question text
        if(preg_match("#}(.*)#", $question, $question_match))
        {
            //Extract the question text
            $question_text = trim($question_match[1]);

            //Find the correct answer
            preg_match("#Answer\: (.*)#", $question, $correct_match);
            $correct = $correct_match[1];

            //Put question parts into results
            $results[] = array(
                'question' => $question_text,
                //'answers' => $answers,
                'correct' => $correct
            );
        }
    }

    //See results
    echo "<pre>" . print_r($results, true) . "</pre>";

I just can't seem to combine the options all in one variable.

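One likely reason the attempt above captures only the first line of each question is that `.` does not match newlines unless the `s` modifier is set on that pattern. A hedged sketch of one way to keep the question and its options together, assuming each chunk contains a `}` before the question and the literal text `Answer:` before the answer. The function name `parseCombined` is hypothetical, and the second sample question is made up for illustration.

```php
<?php
// Hypothetical helper: question text and its options end up in one 'question' field.
function parseCombined(string $text): array
{
    $results = [];
    // Split on *NEXT* and drop whatever follows it on the same line.
    foreach (preg_split('#\*NEXT\*[^\n]*#', $text) as $chunk) {
        // The "s" modifier lets "." match newlines, so the lazy (.*?) can span
        // several lines up to the first "Answer:"; (.*) then takes the rest.
        if (preg_match('#\}(.*?)Answer:(.*)#s', $chunk, $m)) {
            $results[] = [
                'question' => trim($m[1]),
                'correct'  => trim($m[2]),
            ];
        }
    }
    return $results;
}

// One multiple-choice question taken from the thread, one made-up free-text question.
$sample = "1} What is the IQ of a 12-year-old with a mental age of 16?\n"
        . "A) 147\nB) 70\nC) 133\nD) 145\nAnswer: C\n*NEXT* 3\n"
        . "1} Define IQ.\nAnswer: Mental age divided by\nchronological age, times 100.\n*NEXT*\n";
print_r(parseCombined($sample));
```

Because the answer is captured as everything after `Answer:`, it can be a single letter or a multi-line paragraph. Chunks without both markers are skipped, which also drops the empty chunk after the final `*NEXT*`.
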