GitHub as a Database (CI with Users)

February 12, 2020

Recently, I decided to put together a silly little website so that my wife could upload her recipes to the internet. I’m a big fan of static site generators like Jekyll (the tool used to make this blog) because they let you publish new content without having to create new markup each time. For a technical person like me, it’s no problem writing a bit of YAML and Markdown to get a page onto the internet for free with GitHub Pages.

Now, add your non-developer wife into the mix, and you’ll quickly learn that YAML isn’t fun for non-developers. That makes using Jekyll by itself an issue, because you have to name and format files in a very specific way, and that’s hard to keep track of. The typical way to handle this is to store user data in a database and serve it up with a server of some sort. That approach is honestly great, and I’ve used it on a lot of projects in the past. This one, though, is a little different. I know that my wife and I are the only ones who will be modifying this data, and I also know that I want to host it for free. See where this is headed? Was the title enough of a clue?

So it begins. I built a Jekyll site, manually crafted the YAML files, and got a working wireframe together so she could see how it looked. She gave me some pointers on things to change, and I made the tweaks. Once she signed off on the layout, I got to work figuring out how she could get her information onto the site without having to learn YAML.

Obviously this needs to be done from the same website, so an API it is. I develop in C# in my day job, and I really enjoy it, so I got to work writing some code to tie this together. Of course, there’s still a problem with writing a normal ASP.NET Core API for this, and that’s the cost of hosting. This API is expected to have very little traffic, and I don’t want to pay to keep an API running all the time. I could run it on my home network, but I’m picky about what I allow to run here and what I expose to the internet. After careful consideration, I landed on Azure Functions. These are perfect for the job! I can use all the language features I’m used to in C#, and I don’t have to worry about the server.

That covers the website communicating with the backend. But what is the backend, anyway? Where are we storing all this information? Well, GitHub supports Jekyll with very minimal configuration. All you need to do is enable GitHub Pages in your repo, and the site will start getting served automatically. When you commit new changes to the branch you specify (master by default), they are quickly reflected in your site. Sweet! That’s free CI/CD with no work from me, and I’m going to use it to my advantage. All the API needs to do to trigger a deployment is commit the right files to the master branch of the GitHub repo.

Lucky for me, there’s already a .NET library for interacting with Git: LibGit2Sharp. This makes things much easier than trying to figure out a way to get access to a command line version of git within the function. All you need is a simple NuGet package, and that takes care of getting things in and out of git. I don’t love working with YAML, and I definitely didn’t want to deal with posting YAML to this API using JavaScript on the frontend. Instead, I created some C# DTO classes to represent the YAML data, and that’s what the API accepts (as JSON). From there, I used YamlDotNet to convert my objects into the proper YAML structures.
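To make the JSON-to-YAML step concrete, here’s a minimal sketch of the idea. It is not the code from the project: it’s written in Python rather than C#, it hand-rolls the YAML instead of using YamlDotNet, and the `Recipe` fields (`title`, `servings`, `ingredients`) are made up for illustration. It leans on the fact that a JSON-quoted string is also a valid double-quoted YAML scalar, which sidesteps most quoting problems for flat data like this.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Recipe:
    # Hypothetical fields; the real DTOs in the project may look different.
    title: str
    servings: int
    ingredients: list = field(default_factory=list)


def recipe_from_json(payload: str) -> Recipe:
    """Parse the JSON request body into the DTO (unknown keys raise TypeError)."""
    return Recipe(**json.loads(payload))


def yaml_scalar(value) -> str:
    """Quote a string or number the JSON way, which YAML also accepts."""
    return json.dumps(value, ensure_ascii=False)


def to_front_matter(recipe: Recipe, body: str = "") -> str:
    """Render a Jekyll page: YAML front matter between '---' lines, then the body."""
    lines = ["---"]
    for key, value in vars(recipe).items():
        if isinstance(value, list) and value:
            # Non-empty lists become YAML block sequences.
            lines.append(f"{key}:")
            lines.extend(f"  - {yaml_scalar(item)}" for item in value)
        elif isinstance(value, list):
            lines.append(f"{key}: []")
        else:
            lines.append(f"{key}: {yaml_scalar(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n" + body
```

The frontend only ever sends JSON, and the string returned by `to_front_matter` is what would get written into the repo as the page file.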

From here, the backend function simply has to write a file based on the data the API received to the correct place, stage the changes in git, create a commit, and push it up to GitHub. There were hardly any snags while setting this up. The first one I ran into is that if you use Visual Studio to do a default zip file deployment to a function app, you end up with a read-only filesystem for your function. That’s not conducive to creating new files in a git repo, so keep that in mind if you try this. The second is that, because this uses git, it can hit the same ‘problems’ you’d see in any other git repo with multiple committers. You can get merge conflicts or end up with an out-of-date local repo, so you need to write code to handle those scenarios.
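Here’s a rough sketch of that write, stage, commit, and push flow, including a simple answer to the out-of-date repo problem. Again, this is an illustration and not the project’s code: it’s Python shelling out to the `git` command line rather than C# with LibGit2Sharp, the `publish` and `run_git` names are hypothetical, and it assumes the clone lives somewhere writable with push credentials already configured.

```python
import subprocess
from pathlib import Path


def run_git(repo_dir, *args):
    """Run one git command inside repo_dir and return its exit code."""
    return subprocess.run(["git", "-C", str(repo_dir), *args]).returncode


def publish(repo_dir, rel_path, content, message, git=run_git, attempts=3):
    """Write one file, commit it, and push, re-syncing if the push is rejected."""
    repo_dir = Path(repo_dir)

    # Start from the latest remote state so the local clone isn't stale.
    if git(repo_dir, "pull", "--rebase") != 0:
        raise RuntimeError("could not update the local clone")

    # Write the new page into the working tree.
    target = repo_dir / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")

    # Stage just that file and commit it.
    git(repo_dir, "add", "--", rel_path)
    if git(repo_dir, "commit", "-m", message) != 0:
        raise RuntimeError("commit failed (possibly nothing changed)")

    for _ in range(attempts):
        if git(repo_dir, "push") == 0:
            return True
        # Someone else pushed first: replay our commit on top of theirs.
        if git(repo_dir, "pull", "--rebase") != 0:
            # A real conflict; back out and let a human sort it out.
            git(repo_dir, "rebase", "--abort")
            raise RuntimeError("merge conflict needs manual resolution")
    return False
```

Something like `publish("/path/to/clone", "_recipes/chili.md", page_text, "Add chili")` would cover the happy path, where `_recipes/` is just a placeholder folder. Passing the git runner in as a parameter is only there to make the function easy to test without a real repo.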

Adding authentication was also surprisingly easy. Azure Functions has great support for Azure Active Directory, and you can control which directory is the authority. This makes it easy to lock down your functions to only the people you invite. All you have to do to access the logged-in user’s information is add a ClaimsPrincipal as an argument to your function. On the frontend, I used the MSAL.js library to get a token from Azure AD that authenticates users to access the functions.

This was a super fun project to work on, and I’m happy I got it working. Of course, the way this site was hacked together is not best practice, but it’s a testament to how cheaply you can get a semi-dynamic website on the internet in this day and age. If you can dream it, you can do it! Feel free to check this repo out on GitHub!