Hi All
I'm after some advice on how to write a parser that matches the many different naming formats of a TV episode. The one thing we can rely on is that episodes sit inside a folder named after the show, for example "House MD\episode name. s1e07.mkv". So we already know the show name, and that part is easy. The bit I need to match is the season and episode number.
The format of the file name can differ in so many ways. Here are some examples:
- S01E01
- s1e10
- S3e6
- 105 (for season 1, episode 5)
- EP01 (usually there's one season in this case, but not 100% of the time)
So as you can see there are a few variations. The show name is almost always in there too, and sometimes the name contains the resolution, e.g. "House.MD S1E09 720p HD.mkv". There are many combinations, but since we already know the show name I don't think we need to worry *too* much about that part.
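To make the formats above concrete, here is a rough sketch of the kind of patterns I have in mind. The function name parseEpisode and the regular expressions are just my own guesses, not taken from any library, and the "105" rule in particular is only a heuristic (it assumes a single-digit season and would still be fooled by things like years in the file name).

```php
<?php
// Hypothetical helper (not from any library).
// Returns ['season' => int, 'episode' => int], or null if nothing matched.
function parseEpisode(string $filename): ?array
{
    // Drop resolution tokens such as 720p or 1080p so they cannot be
    // mistaken for a "105"-style season/episode number.
    $name = preg_replace('/(?<![a-z0-9])\d{3,4}p(?![a-z0-9])/i', ' ', $filename);

    // Ordered from most to least specific; the first pattern to match wins.
    $patterns = [
        // S01E01, s1e10, S3e6
        '/(?<![a-z0-9])s(?<season>\d{1,2})e(?<episode>\d{1,3})(?!\d)/i',
        // EP01 (no season given, so season 1 is assumed below)
        '/(?<![a-z0-9])ep(?<episode>\d{1,3})(?!\d)/i',
        // 105 meaning season 1, episode 05 (assumes a single-digit season)
        '/(?<![a-z0-9])(?<season>\d)(?<episode>\d{2})(?![a-z0-9])/i',
    ];

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $name, $m) === 1) {
            return [
                'season'  => isset($m['season']) ? (int) $m['season'] : 1,
                'episode' => (int) $m['episode'],
            ];
        }
    }

    return null;
}
```

With this sketch, parseEpisode('House.MD S1E09 720p HD.mkv') should give season 1, episode 9, and the 720p is ignored because it is stripped before matching.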
My question relates to how you would approach this. You can see from the examples above that some of these would be hard to match reliably.
My initial idea is to have a class called 'tvmatcher', or something similar, which has match handlers: one handler for each format we need to match. A handler would be a class that extends tvmatcher and has the same method, like $handler->match($string); the first handler to match would be the winner.
This could be extended to sanity check the result and ensure that the season/ep actually exists.
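Roughly, the handler idea might look like the sketch below. All the names (MatchHandler, RegexHandler, TvMatcher) are made up for illustration, and I've used an interface here rather than having handlers extend the matcher, which is just one possible choice. It assumes PHP 7.4 or later. The validator callable is a stand-in for the sanity check: in real use it would ask an external API whether the season/ep exists, but here it is faked with a fixed episode limit.

```php
<?php
// Hypothetical names throughout; none of this comes from an existing library.
interface MatchHandler
{
    /** Returns ['season' => int, 'episode' => int], or null if the format does not apply. */
    public function match(string $filename): ?array;
}

final class RegexHandler implements MatchHandler
{
    private string $pattern;
    private int $defaultSeason;

    public function __construct(string $pattern, int $defaultSeason = 1)
    {
        $this->pattern = $pattern;
        $this->defaultSeason = $defaultSeason;
    }

    public function match(string $filename): ?array
    {
        if (preg_match($this->pattern, $filename, $m) !== 1) {
            return null;
        }
        return [
            // Formats like EP01 carry no season, so fall back to the default.
            'season'  => isset($m['season']) ? (int) $m['season'] : $this->defaultSeason,
            'episode' => (int) $m['episode'],
        ];
    }
}

final class TvMatcher
{
    /** @var MatchHandler[] */
    private array $handlers;
    /** @var callable|null */
    private $validator;

    public function __construct(array $handlers, ?callable $validator = null)
    {
        $this->handlers = $handlers;
        $this->validator = $validator;
    }

    public function match(string $filename): ?array
    {
        // Try handlers in order; the first result that passes the sanity check wins.
        foreach ($this->handlers as $handler) {
            $result = $handler->match($filename);
            if ($result === null) {
                continue;
            }
            if ($this->validator === null || ($this->validator)($result)) {
                return $result;
            }
        }
        return null;
    }
}

$matcher = new TvMatcher(
    [
        new RegexHandler('/(?<![a-z0-9])s(?<season>\d{1,2})e(?<episode>\d{1,3})(?!\d)/i'),
        new RegexHandler('/(?<![a-z0-9])ep(?<episode>\d{1,3})(?!\d)/i'),
    ],
    // Fake sanity check: pretend every season has at most 24 episodes.
    fn(array $r): bool => $r['episode'] >= 1 && $r['episode'] <= 24
);
```

Calling $matcher->match('House MD s1e07.mkv') should return season 1, episode 7. One thing I'm unsure about is whether a result that fails the sanity check should fall through to the next handler (as it does here) or stop the search altogether.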
I really don't know how to go about this, the idea above is my best so far, so again my questions really are:
How would you approach this?
Is there a pattern that would help?
Any other ideas about how this could be achieved?
It's worth noting that I'd be using external APIs to get show information. That would be outside the scope of this library, but it could help with said sanity checks.
Cheers,
Billy