mike2 Posted July 2, 2014 (edited)

To whom it may concern:

I have PHP code that I am trying to improve. I want to change it so that when I put a dash in front of a word, that word is excluded from the search. I would also like to know what I would have to change so that when I search for words or phrases within PDF documents, I get exactly what I asked for.

Whenever the Exact Phrase radio button is selected and I type in the phrase Oil Gas Company, three PDFs appear. However, when I open those PDF documents and search for that phrase, the exact phrase is not there. Since it is not there, what would I have to do so that "0 matches found" appears on my screen instead?

As for the dash, is it possible to type something like Tax credit -appraisal in the search box and, whether the Exact Phrase or the Any Phrase radio button is selected, have the search work as it normally does for that button while also excluding the word appraisal? Any Phrase means it searches for any of the words in the phrase and brings up PDFs that contain one or more of them.

I ask because when I type bank and loans -money into my search box while the Exact Phrase radio button is selected, it is only supposed to return PDF documents that contain that exact phrase but not the word money. Is this possible with my code?

My PHP code is attached to this message. Any help would be greatly appreciated.

Finalsearchcode2.txt

Edited July 2, 2014 by mike2
mac_gyver Posted July 2, 2014

what you are asking is likely possible, but it would take reading the swish documentation and examples. have you done so?

then, once you have determined whether swish can exclude words and force exact matches, you would need to write the php program logic that supplies the necessary parameters/search term to the swish methods.
mike2 Posted July 2, 2014 (Author)

mac_gyver said: "...it would take reading the swish documentation and examples. have you done so?"

Yes, I have. I have read both of these documents, and so far they haven't been much help to me.

http://devzone.zend.com/1591/indexing-web-content-with-php-and-swish-e/
http://swish-e.org/docs/swish-search.html#phrase_searching
mac_gyver Posted July 3, 2014

you need to do a little defining before you can write any code. for each of the four possibilities, 1) exact, 2) exact with excluded word(s), 3) any, 4) any with excluded word(s), what will be the search term you build and supply to the $swish->query() method?

you also need to define whether you are going to allow multiple excluded words (e.g. -word1 -word2), and you must handle the case where there is a - as part of an exact phrase search, such as finding a negative number -1000, and how to distinguish that from excluded word(s). you also need to decide whether an excluded -word can appear anywhere within the term entered in the search box or must be at the end. (you could even have a separate box for excluded word(s). this would address the questions i posed about things like a -1000 in an exact phrase search, and you could do away with the - to identify excluded words, since any excluded words would come from a different input text box.)

also, if the search box contains any of the and, or, near or not boolean operator keywords, you must surround those keywords with double-quotes, making them a phrase, so that they won't affect the actual search result logic.

the key to understanding the search terms you are trying to build is that the search term is a logical statement that produces a boolean (true/false) value. when the search term is TRUE for a file, that file will be in the search results.

1) for an exact phrase search, the phrase must be enclosed by double-quotes. the double-quotes are part of the string you pass to the $swish->query() method, not to be confused with any double-quotes that delimit php strings as part of the php syntax in your code. for your Oil Gas Company example, the literal query would be "Oil Gas Company".

2) for an exact phrase with excluded word(s), the phrase is enclosed by double-quotes, followed by the not keyword and the word(s) to exclude. for the example Oil Gas Company -money, the search term would be "Oil Gas Company" not money (there is an implied/default and operator right before the not keyword.)

3) your any phrase condition is actually an or (the default between words/terms is an and). for the example Oil Gas Company, the search term would be Oil or Gas or Company.

4) for the any phrase with excluded word(s), for the example Oil Gas Company -money, the search term would be (Oil or Gas or Company) not money (the (...) around the or'ed term ensures the intended operator precedence, due to the implied/default and operator right before the not keyword.)

next, in addition to your exact phrase and any word searches, are you planning on an all word search (all the word(s) must be present, but not necessarily together in the form of a phrase)? this would use and between the words (or simply leave it out between the words, since it is the default.)

lastly, you may want to consider having the users put double-quotes around the exact phrase part of what they are searching for, so that they can enter just about anything for a search, e.g. "Oil Gas Company" credit -money (search for the exact phrase Oil Gas Company, with the word credit, and without the word money.) in this case, they would need to select between the any/all search (or you could give them an option to enter the search exactly the way they want, with the and, or, near or not boolean operator keywords fully under their control.)
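The four search terms described above could be built along these lines. This is only a minimal sketch, not the attached code: the function names (splitSearchTerms, protectKeyword, buildSwishQuery) and the 'exact'/'any' mode values are hypothetical, and treating a dash followed by a letter as an excluded word (so that -1000 stays part of the search) is just one possible answer to the question raised above. Double-quotes typed by the user are stripped, which is a simplification. The returned string is what would be handed to $swish->query().

```php
<?php
// Hypothetical helpers for illustration; none of these names come from the original code.

/**
 * Split the raw search box text into included and excluded words.
 * Assumption: a token is "excluded" only if it is a dash followed by a letter,
 * so a negative number such as -1000 stays part of the search.
 */
function splitSearchTerms(string $input): array
{
    $include = [];
    $exclude = [];
    $tokens = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        // Drop user-typed double-quotes so they cannot break the phrase quoting.
        $token = str_replace('"', '', $token);
        if ($token === '') {
            continue;
        }
        if (preg_match('/^-[A-Za-z]/', $token)) {
            $exclude[] = substr($token, 1); // strip the leading dash
        } else {
            $include[] = $token;
        }
    }
    return [$include, $exclude];
}

/**
 * Wrap the and/or/near/not keywords in double-quotes so they are
 * searched for as words instead of acting as operators.
 */
function protectKeyword(string $word): string
{
    $keywords = ['and', 'or', 'near', 'not'];
    return in_array(strtolower($word), $keywords, true) ? '"' . $word . '"' : $word;
}

/**
 * Build the search term for $mode 'exact' or 'any' (hypothetical values
 * standing in for the two radio buttons).
 */
function buildSwishQuery(string $input, string $mode): string
{
    [$include, $exclude] = splitSearchTerms($input);
    if ($include === []) {
        return ''; // nothing to search for
    }
    if ($mode === 'exact') {
        // Cases 1 and 2: the whole phrase goes inside double-quotes.
        $query = '"' . implode(' ', $include) . '"';
    } else {
        // Cases 3 and 4: any word may match, so join the words with "or".
        $words = array_map('protectKeyword', $include);
        $query = implode(' or ', $words);
        if ($exclude !== [] && count($words) > 1) {
            // Parentheses keep the or'ed group together before the "not".
            $query = '(' . $query . ')';
        }
    }
    // Assumption: several excluded words are chained as "not a not b";
    // this has not been checked against a real index.
    foreach ($exclude as $word) {
        $query .= ' not ' . protectKeyword($word);
    }
    return $query;
}
```

For example, buildSwishQuery('Oil Gas Company -money', 'any') returns (Oil or Gas or Company) not money, and the same input in 'exact' mode returns "Oil Gas Company" not money. Whether a token like -1000 can actually be matched depends on how the index was configured.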
mike2 Posted July 10, 2014 (Author)

(quoting mac_gyver's July 3 post)

OK, I will take all of that into consideration. In your opinion, is it still possible to do this using the Swish-e search engine? Here is the Swish-e link: Swish-e.org