jconey

  1. I think I found my solution, but it wasn't what I set out looking for; I was looking in the wrong direction. As I surfed for a solution I stumbled on data mining, page scraping and data harvesting. Most of the files I have to work with are .html, so I dug into how to use these methods and struck gold. First I created an .html file with a link to every one of the files, which was simpler than I thought it would be. Once all the files were "linked" by the new file, I could run Web-Harvest (SourceForge) or any number of other tools available on the web. With the files linked, the program treated the whole set as a site and crawled it, extracting the data I wanted. Web-Harvest took some playing around with to configure, but it worked in the end. That got me thinking: if you ever run into a website that has the information you need spread all through it, this same tactic would work perfectly. As a matter of fact, that is what these tools were really created for. I'd recommend HTTrack or Web2Disk by Inspyder to capture the website's content, then run Web-Harvest to extract the data to a CSV, a spreadsheet or whatever you need. All the tools mentioned are available on the web, some free and some not free but cheap just the same. Keep this information in mind, it might come in handy some day! Thank you to all who gave thought to my problem! Jeff
  2. Yes... I agree. Now I need to find a way to do that. Got any suggestions? Jeff
  3. Basically I have a lot of .html and .txt files that have data in them, and I need to extract that data. If I can get it into an Excel spreadsheet or MS Access, I can get it to MySQL from there. A closer estimate is about 152,400 files. I'd love to find a batch method, or some automated or semi-automated way, to extract this data into a usable format (spreadsheet, MS Access table, MySQL table...). The first post explains the file contents. I'm pretty good at importing data if there is a consistent delimiter. I have two issues here as I see it: #1, how do I handle so many files without repeating a set of procedures 152K times, and #2, the lack of a consistent delimiter. The files all have the same type of information, in the same order, though. I'm sure someone in cyberspace has run into this before; I just hope the solution is not in the realm of theoretical physics. Thanks, Jeff
  4. BTW: Not sure if it matters for this particular question, but my host is running MySQL version 5.0.90-community-log, Apache version 2.0.63 and PHP version 5.2.9. I also use MS Excel/Access 2007 (or earlier), PHPMagic Pro & Plus, Adobe CS3 Master Suite... Thanks again! JConey
  5. New to PHP and MySQL. Not sure I'm even posting this in the right area, but here goes. I have HTML and text files with data in them, not delimited in the normal way at all. Most of the text is in paragraph form, but all the files have the same data in them. For example, a page might look like this: Item1 text Item2 Text Text Text… Item3 text Item 4 Text Text Text Text Text. Here Item# is a name like Year, item number, description, etc. There are about 25 items (or fields), and each varies in length and paragraph style. For instance, Item 4 in the example might have just one word or it might have 7 paragraphs. This would be easy if I only had two dozen files, but I have upwards of 100,000 files, most of them .html, on a CD. :-\ Oh, one more thing: many of the item titles are followed by a colon (description:), but not all item names have it. I'm not very DB literate, but I am IT/PC literate. I really need to find a quick and hopefully semi-automated way to import/convert this information in batches. Even if I could only get it into Excel or Access, I could get it into PHP/MySQL from there myself. Don’t know if it matters, but one of the fields has a photo, and I just need its name/link, not the photo itself. Please let me know if you have any ideas or need more information. Thank you! JConey
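
The two steps described in these posts, building one page that links every file and then pulling the same fields out of each file into a CSV, could also be sketched in plain Python. This is not the Web-Harvest configuration Jeff used; it is a rough alternative under stated assumptions. The label names in `LABELS` are hypothetical (the posts only say there are about 25 fields, always in the same order, with an optional colon after the title), and the function names are made up for illustration.

```python
import csv
import re
from html import escape
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import quote

LABELS = ["Year", "Item Number", "Description"]  # hypothetical field titles
EXTENSIONS = {".html", ".htm", ".txt"}


def data_files(root):
    """Yield every .html/.htm/.txt file under root, in a stable order."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in EXTENSIONS:
            yield path


def write_index(root, index_path):
    """Write one HTML page that links every data file (hrefs relative to root)."""
    root, index_path = Path(root), Path(index_path)
    links = [
        f'<a href="{quote(p.relative_to(root).as_posix())}">{escape(p.name)}</a><br>'
        for p in data_files(root)
        if p.resolve() != index_path.resolve()  # do not link the index to itself
    ]
    body = "<html><body>\n" + "\n".join(links) + "\n</body></html>\n"
    index_path.write_text(body, encoding="utf-8")
    return len(links)


class TextAndImages(HTMLParser):
    """Collect visible text chunks and the src attribute of <img> tags."""

    def __init__(self):
        super().__init__()
        self.chunks, self.images, self._skip = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True
        elif tag == "img" and dict(attrs).get("src"):
            self.images.append(dict(attrs)["src"])

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def split_fields(text, labels=LABELS):
    """Cut text at each label (start of line, optional colon), in label order."""
    hits, pos = [], 0
    for label in labels:
        pattern = re.compile(r"^[ \t]*" + re.escape(label) + r"\b\s*:?", re.I | re.M)
        m = pattern.search(text, pos)
        if m:
            hits.append((label, m.start(), m.end()))
            pos = m.end()
    fields = {}
    for i, (label, _, end) in enumerate(hits):
        stop = hits[i + 1][1] if i + 1 < len(hits) else len(text)
        fields[label] = " ".join(text[end:stop].split())  # collapse paragraphs
    return fields


def extract_file(path, labels=LABELS):
    """Return one row: the file name, each field found, and the first image link."""
    path = Path(path)
    # Assumes UTF-8; undecodable bytes are replaced rather than raising.
    raw = path.read_text(encoding="utf-8", errors="replace")
    text, images = raw, []
    if path.suffix.lower() != ".txt":
        parser = TextAndImages()
        parser.feed(raw)
        parser.close()
        text, images = "\n".join(parser.chunks), parser.images
    row = split_fields(text, labels)
    row["file"] = str(path)
    row["photo"] = images[0] if images else ""
    return row


def extract_tree(root, out_csv, labels=LABELS):
    """Write one CSV row per data file under root; return the number of rows."""
    count = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["file", *labels, "photo"], restval="")
        writer.writeheader()
        for path in data_files(root):
            writer.writerow(extract_file(path, labels))
            count += 1
    return count
```

Something like `extract_tree("D:/", "items.csv")` would then produce a file that Excel or Access can import directly. Because the splitter only looks for a label at the start of a line, a paragraph that happens to begin with a label word would be cut in the wrong place, so it is worth spot-checking a sample of rows before trusting a run over 150,000 files.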