Haskell - Enforcing Naming Convention with Parsec

Posted on September 4, 2020

In my project migratum I wanted to enforce a naming convention like this

There are two ways I could accomplish this:

  • Regex
  • Parser Combinators

I’m not sure how to do this with regex because, to be honest, I only know very simple regex and I don’t have a regex license. I’m decent with parser combinators, and they’re in the title of this blog post, so that’s what I’m going to use for the task I set out to do.

Parser combinators really shine when you parse something recursive like JSON. You know how JSON can contain arrays and objects, and those arrays and objects can contain more arrays and objects, and so on? Yeah, parser combinators are really neat for that. So don’t judge parser combinators solely on this blog post. I chose them because they’re my go-to method for parsing, and I personally think they’re convenient.

Before we touch parsing, let’s create a type representation of the file name based on the ascii diagram above.
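The diagram isn’t reproduced here, so what follows is only a sketch of what such a type representation could look like. It assumes a file name shaped like V1__uuid_extension.sql with five segments. The type names, the String fields, the double underscore, and the sql extension are my assumptions for illustration, not necessarily what migratum uses.

```haskell
-- Assumed shape of a file name (hypothetical): V1__uuid_extension.sql

-- One newtype per segment, so each piece can be parsed and compared on its own.
newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

-- The whole file name is just the five segments in order.
data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)
```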

Now we can focus on specific segments of the file name instead of thinking about parsing the entire file name at once. I think that makes it a bit easier. At least to me it does. I can think of parsing the file version, underscore, file name, etc. individually.

Let’s start with the imports. Popular libraries that come to mind are parsec, megaparsec, and attoparsec. Evaluate which one suits your project, but in this blog post we’re using parsec.
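As a rough guide, the imports for a small parsec parser over String input might look like the following. Parsing String rather than Text is an assumption on my part to keep the sketches short.

```haskell
-- Entry point, error type, and the combinators used for this kind of parser.
-- Text.Parsec re-exports the character parsers such as char, digit, and string.
import Text.Parsec (ParseError, char, digit, many1, parse, string, (<|>))
-- Parser specialised to String input (Text.Parsec.Text offers the same for Text).
import Text.Parsec.String (Parser)
```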

Let’s start with the file version parser. Our convention says that the file name must start with the character V, followed by one or more digits.
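A minimal sketch of that parser, assuming FileVersion simply wraps a String that keeps the leading V (the newtype and the name fileVersionParser are illustrative):

```haskell
-- parse is imported because it is what runs the parser,
-- e.g. parse fileVersionParser "" "V69".
import Text.Parsec (char, digit, many1, parse)
import Text.Parsec.String (Parser)

-- Assumed representation: the version keeps its leading 'V', e.g. "V69".
newtype FileVersion = FileVersion String deriving (Show, Eq)

-- A literal 'V' followed by one or more digits.
fileVersionParser :: Parser FileVersion
fileVersionParser = do
  v <- char 'V'
  digits <- many1 digit
  pure (FileVersion (v : digits))
```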

Let’s try this in ghci, starting with an input such as “V69”.

That’s what we want, to match Right and return our FileVersion with the text “V69”.

Next, an input with nothing after the “V”. That’s right, it should fail when there’s no version number.

With extra characters after the digits, it drops the part of the input that isn’t a number.

Finally, it will fail if it doesn’t find the “V” character.
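The four cases above can be written down as plain values instead of a ghci session. This is a sketch under the same assumptions as before (String input, a FileVersion newtype that keeps the V), and the inputs are my own examples.

```haskell
import Text.Parsec (ParseError, char, digit, many1, parse)
import Text.Parsec.String (Parser)

newtype FileVersion = FileVersion String deriving (Show, Eq)

fileVersionParser :: Parser FileVersion
fileVersionParser = do
  v <- char 'V'
  digits <- many1 digit
  pure (FileVersion (v : digits))

-- Succeeds: Right (FileVersion "V69")
versionOk :: Either ParseError FileVersion
versionOk = parse fileVersionParser "" "V69"

-- Fails with a Left: there is no digit after the 'V'.
versionMissingNumber :: Either ParseError FileVersion
versionMissingNumber = parse fileVersionParser "" "V"

-- Succeeds: Right (FileVersion "V69"); the trailing "abc" is left unconsumed.
versionTrailingInput :: Either ParseError FileVersion
versionTrailingInput = parse fileVersionParser "" "V69abc"

-- Fails with a Left: the input does not start with 'V'.
versionMissingPrefix :: Either ParseError FileVersion
versionMissingPrefix = parse fileVersionParser "" "69"
```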

Woohoo! One parser down, four to go!

Let’s do the rest of the parsers, and feel free to try these out in ghci.
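Here is one way the remaining four parsers might look. The separator being a double underscore, the name allowing letters, digits, and single underscores, and the extension being sql are all assumptions for the sake of the sketch.

```haskell
import Text.Parsec (alphaNum, char, many1, parse, string, (<|>))
import Text.Parsec.String (Parser)

newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

-- The separator between version and name (assumed to be "__").
underscoreParser :: Parser Underscore
underscoreParser = Underscore <$> string "__"

-- One or more letters, digits, or underscores; stops at the dot.
fileNameParser :: Parser FileName
fileNameParser = FileName <$> many1 (alphaNum <|> char '_')

-- The dot before the extension.
dotParser :: Parser Dot
dotParser = Dot <$> string "."

-- The only accepted extension in this sketch (assumed to be "sql").
extensionParser :: Parser FileExtension
extensionParser = FileExtension <$> string "sql"
```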

Then, to create a parser for the whole file name, we combine all of our parsers.
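A sketch of that combination in applicative style. The segment types and parsers are repeated so the block compiles on its own, and they carry the same assumptions (names, "__" separator, sql extension) rather than being the exact code from migratum.

```haskell
import Text.Parsec (ParseError, alphaNum, char, digit, many1, parse, string, (<|>))
import Text.Parsec.String (Parser)

newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)

fileVersionParser :: Parser FileVersion
fileVersionParser = FileVersion <$> ((:) <$> char 'V' <*> many1 digit)

underscoreParser :: Parser Underscore
underscoreParser = Underscore <$> string "__"

fileNameParser :: Parser FileName
fileNameParser = FileName <$> many1 (alphaNum <|> char '_')

dotParser :: Parser Dot
dotParser = Dot <$> string "."

extensionParser :: Parser FileExtension
extensionParser = FileExtension <$> string "sql"

-- Run the segment parsers left to right and feed the results to the constructor.
filenameParser :: Parser FilenameStructure
filenameParser =
  FilenameStructure
    <$> fileVersionParser
    <*> underscoreParser
    <*> fileNameParser
    <*> dotParser
    <*> extensionParser

example :: Either ParseError FilenameStructure
example = parse filenameParser "" "V1__uuid_extension.sql"
```

The constructor takes its arguments in the same order the segments appear in the file name, which is why the applicative chain reads almost like the naming convention itself.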

We can also do this with the do syntax if that’s more convenient.
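The same combination written with do, again as a standalone sketch under the same assumptions about names and segments:

```haskell
import Text.Parsec (alphaNum, char, digit, many1, parse, string, (<|>))
import Text.Parsec.String (Parser)

newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)

fileVersionParser :: Parser FileVersion
fileVersionParser = FileVersion <$> ((:) <$> char 'V' <*> many1 digit)

underscoreParser :: Parser Underscore
underscoreParser = Underscore <$> string "__"

fileNameParser :: Parser FileName
fileNameParser = FileName <$> many1 (alphaNum <|> char '_')

dotParser :: Parser Dot
dotParser = Dot <$> string "."

extensionParser :: Parser FileExtension
extensionParser = FileExtension <$> string "sql"

-- Each segment is bound to a name, then assembled at the end.
filenameParser :: Parser FilenameStructure
filenameParser = do
  version <- fileVersionParser
  underscore <- underscoreParser
  name <- fileNameParser
  dot <- dotParser
  extension <- extensionParser
  pure (FilenameStructure version underscore name dot extension)
```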

Finally, our parser “runner”
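A runner can be a thin wrapper around parsec’s parse function. In this sketch the name parseFilename is hypothetical, and I pass the file name as the source name too so that it shows up in error messages.

```haskell
import Text.Parsec (ParseError, alphaNum, char, digit, many1, parse, string, (<|>))
import Text.Parsec.String (Parser)

newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)

filenameParser :: Parser FilenameStructure
filenameParser =
  FilenameStructure
    <$> (FileVersion <$> ((:) <$> char 'V' <*> many1 digit))
    <*> (Underscore <$> string "__")
    <*> (FileName <$> many1 (alphaNum <|> char '_'))
    <*> (Dot <$> string ".")
    <*> (FileExtension <$> string "sql")

-- Left carries parsec's error, Right carries the structured file name.
parseFilename :: String -> Either ParseError FilenameStructure
parseFilename name = parse filenameParser name name
```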

Instead of manually trying them out in ghci, we can write some unit tests so we don’t have to keep messing with the terminal.
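To keep this sketch free of extra dependencies, the checks below are plain Bool values rather than tests in a framework such as hspec or tasty; the same cases would carry over to either. The parser and the sample file names are assumptions for illustration.

```haskell
import Data.Either (isLeft, isRight)
import Text.Parsec (ParseError, alphaNum, char, digit, many1, parse, string, (<|>))
import Text.Parsec.String (Parser)

newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)

filenameParser :: Parser FilenameStructure
filenameParser =
  FilenameStructure
    <$> (FileVersion <$> ((:) <$> char 'V' <*> many1 digit))
    <*> (Underscore <$> string "__")
    <*> (FileName <$> many1 (alphaNum <|> char '_'))
    <*> (Dot <$> string ".")
    <*> (FileExtension <$> string "sql")

parseFilename :: String -> Either ParseError FilenameStructure
parseFilename = parse filenameParser ""

-- Each check pairs a description with whether the parser behaved as expected.
checks :: [(String, Bool)]
checks =
  [ ("accepts a well-formed name", isRight (parseFilename "V1__uuid_extension.sql"))
  , ("rejects a missing version number", isLeft (parseFilename "V__uuid_extension.sql"))
  , ("rejects a missing V prefix", isLeft (parseFilename "1__uuid_extension.sql"))
  , ("rejects an unknown extension", isLeft (parseFilename "V1__uuid_extension.txt"))
  ]

-- Descriptions of the checks that did not hold; empty when everything passes.
failedChecks :: [String]
failedChecks = [description | (description, ok) <- checks, not ok]
```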

Now that we have our parser, we can use it to check for duplicates. Because our types have an Eq instance, we can do equality checking.
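As an illustration, here is one way to find versions that appear more than once using nothing but Eq. Treating a repeated version as the kind of duplicate worth catching is my assumption, and the helper names are hypothetical.

```haskell
import Data.List (nub, (\\))

newtype FileVersion = FileVersion String deriving (Show, Eq)
newtype Underscore = Underscore String deriving (Show, Eq)
newtype FileName = FileName String deriving (Show, Eq)
newtype Dot = Dot String deriving (Show, Eq)
newtype FileExtension = FileExtension String deriving (Show, Eq)

data FilenameStructure
  = FilenameStructure FileVersion Underscore FileName Dot FileExtension
  deriving (Show, Eq)

-- Pull the version segment out of a parsed file name.
versionOf :: FilenameStructure -> FileVersion
versionOf (FilenameStructure version _ _ _ _) = version

-- Removing one occurrence of every distinct version leaves only the repeats;
-- nub then reports each repeated version once.
duplicateVersions :: [FilenameStructure] -> [FileVersion]
duplicateVersions files = nub (versions \\ nub versions)
  where
    versions = map versionOf files
```

Since this only needs Eq, it works on the parsed output directly. That fits the idea of keeping checks like this outside the parsers themselves.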

When creating your parsers, avoid doing any validation or any “smart” logic. Concentrate on parsing the input and nothing more. Once you have your parsers then you can do whatever you want with the output.

References

Haskell from First Principles

Monadic Parser Combinators