We left off after getting all of the Class and Framework data into a database table. The next challenge is getting some numbers out of the APIs.
- GitHub - Holds user-submitted code from published projects. This is a great place to find solutions, examples, and custom frameworks that authors have put out for the community to use and contribute to. It is also a great place to see where iOS frameworks are being implemented. The more often a given class in a framework is found here, the more common or popular it is.
- StackExchange - Contains the solutions to all your woes: crowd-sourced solutions to problems and answers to questions. This indicates interest, so the more a given class is mentioned here, the more it is being used in situations where a problem is being posed to the members. Just because you find a term here, though, doesn't mean that it is popularly used (as GitHub would reflect). More on that later.
How do we get the data we are after?
There are a few parts we have to address for each set of data.
- A place to store it.
We need to add some fields to the table that holds our classes.
- API Keys
Each API service requires user-specific API keys before you can really start to dig into the data with any speed. A key raises the number of calls you're allowed to make to the API. Without one, you might be limited to a few calls a minute, whereas with an API key you may be allowed dozens of calls per second. In any event, you need it.
- A URL call to the API.
This is simply a URL string that contains all your search criteria, your key, and the API URL.
- A programmatic means to loop over the classes and insert the results.
You need a way to submit each class to the API, get the information you're after, and put it into the database. This could be done in any programming language that can connect to your database, but I used ColdFusion (CFML) because it is very simple and I have lots of history with this powerful yet out-of-vogue language.
1. Add some fields to store relevant data.
This required a bit of poking and prodding, as you can't do a "code" level search. The results that come back for a given language and framework are based on whether the class is mentioned in the title of the GitHub project or within the description. This is pretty helpful in the sense that if a project showcases a class and it is worth mentioning in the description, it is probably important to the project.
So the primary field where we will store our resulting data will be called "totalSwiftCountGitHub".
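As a rough illustration of adding that field, here is a minimal sketch in Python using an in-memory SQLite database. Only the column name totalSwiftCountGitHub comes from the text; the table name `classes`, its other columns, and the choice of SQLite are assumptions made for the example, so adapt them to whatever database you are actually using.

```python
import sqlite3

# Assumption: a stand-in for the existing class table. The table name and the
# framework/className columns are hypothetical; only totalSwiftCountGitHub
# is named in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classes ("
    "id INTEGER PRIMARY KEY, framework TEXT, className TEXT)"
)

# Add the field that will hold the GitHub total for each class.
conn.execute("ALTER TABLE classes ADD COLUMN totalSwiftCountGitHub INTEGER")

# List the column names to confirm the new field is there.
columns = [row[1] for row in conn.execute("PRAGMA table_info(classes)")]
print(columns)
```

The new column starts out as NULL for every row, which makes it easy to tell later which classes have not been looked up yet.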
2. API access... get some.
You are going to need some sort of means to connect with the API that will allow you to pass lots of queries to GitHub in quick succession. Without it, GitHub will force you to stagger your URL requests. There are a few ways to do this but I found that registering as an App and getting a Client Id and a Client Secret was the simplest way to get what I needed.
This covers the basics of how to get started : https://developer.github.com/guides/basics-of-authentication/
This is where you start your registration to get your Client ID/Secret : https://github.com/settings/developers
Here is the registration page : https://github.com/settings/applications/new
It will ask you for a handful of fields. My application isn't located at these URLs; that is just my personal website.
When you are finished, you get your Client ID/Secret and it will look like this :
3. Call the API
We want to test this out a bit and refine the query. Here is a link to help describe all these aspects.
But, to shortcut all the documentation and boil down to what we will use, let's break down the URL that worked for me.
Note : The above link won't work for you because I changed the secret.
- URL : https://api.github.com/search/repositories?
- q : This is how they format their query string. Here we are specifying that we are looking for "avaudioplayer" in the Name, Description and Readme, and it needs to be in the language of Swift.
- page : I don't need more than one page's results because I am looking for the total, and that is part of the metadata that will be returned.
- type : I know we said we couldn't query the actual code, but this looks for the keyword at the code level, though not inside the code itself.
- per_page : I don't need more than one result per page since the data we are after is in the metadata.
- client_id / client_secret : You put in the strings GitHub gave you when you registered your app.
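To make the pieces above concrete, here is a hedged sketch in Python of how such a URL could be assembled. The function name `build_search_url` and the placeholder credentials are hypothetical, and the exact qualifier string (`in:name,description,readme language:swift`) is my reading of the description of q rather than a copy of the original URL. The value used for type is not shown in the text, so it is left out here. GitHub has also since moved away from accepting credentials in the query string, so check the current API documentation before relying on this shape.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_URL = "https://api.github.com/search/repositories"


def build_search_url(class_name, client_id, client_secret):
    """Build a repository-search URL for one class name (illustrative only)."""
    # Assumption: qualifier syntax inferred from the description of q above.
    query = f"{class_name.lower()} in:name,description,readme language:swift"
    params = {
        "q": query,
        "page": 1,       # only the first page is needed
        "per_page": 1,   # one result is enough; we only want the total
        "client_id": client_id,
        "client_secret": client_secret,
    }
    # urlencode handles the escaping of spaces, colons, and commas.
    return f"{API_URL}?{urlencode(params)}"


# Placeholder credentials, not real ones.
url = build_search_url("AVAudioPlayer", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
print(url)
```

Building the query with `urlencode` rather than string concatenation avoids having to escape the search terms by hand.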
So what comes back when you put your URI into a browser?
We are after the total_count part of this JSON response (the GitHub API returns JSON rather than XML). We will get at that nugget of data when we put it into some code that will extract it and put it into the database.
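The GitHub search API returns its response as JSON, with total_count at the top level. As a small sketch of pulling that value out, the Python below parses a made-up sample body; the number 40 and the empty items list are invented for illustration and are not real results, and `extract_total_count` is a hypothetical helper name.

```python
import json

# Assumption: a trimmed, invented response body in the general shape of a
# GitHub search result. The count here is not real data.
sample_response = """
{
  "total_count": 40,
  "incomplete_results": false,
  "items": []
}
"""


def extract_total_count(body):
    """Return the total_count value from a search response body."""
    data = json.loads(body)
    return int(data["total_count"])


print(extract_total_count(sample_response))
```

Because the total lives in the top-level metadata, the items themselves never need to be read, which is why one result per page is enough.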
4. Loop and stick it into the database
For the next part I am going to use a language called ColdFusion, which is a procedural language for integrating logic into web pages. If you're not familiar with it, it is like PHP or .NET in that it puts programming logic into a page that, when called, crunches the instructions and sends out the resulting HTML. In this example, though, I will only use it for the crunching part and output to the screen to confirm that what I am working on is working.
Also, I am not going to go over how to install ColdFusion. This is a high-level view of what I did to get the results I wanted, but you should use whatever language you're familiar with. The logic should be pretty straightforward.
When we run this, we limit each run to about 430 records because we don't want the page to time out. If you're running this on something that is not tied to a web server request (through IIS, for example), then this sensitivity to time-outs may not be an issue.
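The loop itself can be sketched in a few lines. What follows is a Python stand-in, not the ColdFusion code used for this project. The table layout, the `update_batch` function, and the idea of picking up rows whose total is still NULL are all assumptions made for the example. The API call is passed in as a function so the sketch can be run with fake totals instead of live requests; the numbers in `fake_totals` are invented.

```python
import sqlite3

BATCH_SIZE = 430  # roughly the per-run limit mentioned above


def update_batch(conn, fetch_total, batch_size=BATCH_SIZE):
    """Look up and store totals for up to batch_size classes without one yet."""
    rows = conn.execute(
        "SELECT id, className FROM classes "
        "WHERE totalSwiftCountGitHub IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, class_name in rows:
        # fetch_total stands in for the real API call and total_count lookup.
        total = fetch_total(class_name)
        conn.execute(
            "UPDATE classes SET totalSwiftCountGitHub = ? WHERE id = ?",
            (total, row_id),
        )
    conn.commit()
    return len(rows)


# Hypothetical demo data; the totals are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classes ("
    "id INTEGER PRIMARY KEY, className TEXT, totalSwiftCountGitHub INTEGER)"
)
conn.executemany(
    "INSERT INTO classes (className) VALUES (?)",
    [("AVAudioPlayer",), ("UIView",), ("NSURLSession",)],
)
fake_totals = {"AVAudioPlayer": 40, "UIView": 900, "NSURLSession": 75}

# A small batch size here just to show that a second run picks up the rest.
processed = update_batch(conn, fake_totals.get, batch_size=2)
print(processed)
```

Selecting only the rows that still have no total means each run continues where the previous one stopped, so the same page or script can simply be run again until it reports zero rows processed.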
And that is it!
We should have totals for all our classes!
Yeah, GitHub was one side of the challenge and StackExchange is similar in some regards but different in others. We will approach it the same way but the code will be a little bit different in how we call that API.
Next time... we tackle StackExchange.