
Swift-Snips

80/20 of Swift : Day 2

Wade Cantley

Let's Recap!

We left off having loaded all of the class and framework data into a database table. The next challenge is getting some numbers out of the APIs.

  • GitHub - Holds user-submitted code from published projects.  This is a great place to find solutions, examples, and custom frameworks put out there by the authors for the community to use and contribute to.  It is also a great place to determine where iOS frameworks are being implemented.  The more often a given class in a framework is found here, the more common or popular it is.
  • StackExchange - Contains the solutions to all your woes.  It holds crowd-sourced solutions to problems and answers to questions. This indicates interest: the more a given class is mentioned here, the more it is being used in situations where a problem is being posed to the members.  Just because you find a term here, though, doesn't mean that it is popularly used (as GitHub would reflect). More on that later.

How do we get the data we are after?

There are a few parts we have to address for each set of data.

  1. A place to store it.
    We need to add some fields to the table that holds our classes.
  2. API Keys
    Each API service requires user-specific API keys before you can really start to dig into the data with any speed.  A key raises the number of calls you're allowed to make to the API.  Without one, you might be limited to a few calls a minute, whereas with an API key you can do dozens of calls per second.  In any event, you need it.
  3. A URL call to the API.
    This is simply a URL string that contains all your search criteria, your key, and the API URL.
  4. A programmatic means to loop over the classes and insert the results.
    You need a way to submit each class to the API, then get the information you're after and put it into the database. This could be done in any programming language that can connect to your database, but I used ColdFusion (CFML) because it is very simple and I have lots of history with this powerful yet out-of-vogue language.

Let's begin.


GitHub

1. Add some fields to store relevant data.

This required a bit of poking and prodding, because you can't do a "code"-level search. The results that come back for a given language and framework are based on whether the class is mentioned in the name of the GitHub project, in its description, or in its readme.  That is pretty helpful: if a project showcases a class and it is worth mentioning in the description, it is probably important to the project.

So the primary field where we will store the resulting data is called "totalSwiftCountGitHub".
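As a rough sketch of this step, here is one way to add such a field. Everything about the setup is an assumption made for illustration: Python and an in-memory SQLite database stand in for the real database, and the two-column schema is guessed from the `id` and `class` columns that the query later in this post selects. Leaving the new column NULL by default gives a handy marker for "not looked up yet".

```python
import sqlite3

def add_github_count_column(conn):
    """Add a nullable integer column that will hold the GitHub total."""
    conn.execute(
        "ALTER TABLE tblSwiftMethods ADD COLUMN totalSwiftCountGitHub INTEGER"
    )

# Stand-in for the table built on Day 1 (this schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblSwiftMethods (id INTEGER PRIMARY KEY, class TEXT)")
conn.executemany(
    "INSERT INTO tblSwiftMethods (class) VALUES (?)",
    [("AVAudioPlayer",), ("UIView",)],
)

add_github_count_column(conn)

# Existing rows start out NULL, which marks them as not yet looked up.
pending = conn.execute(
    "SELECT class FROM tblSwiftMethods "
    "WHERE totalSwiftCountGitHub IS NULL ORDER BY id"
).fetchall()
print(pending)
```

The same NULL check is what lets a later run pick up where an earlier one stopped.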

2. API access... get some.

You are going to need some means of connecting with the API that will allow you to pass lots of queries to GitHub in quick succession. Without it, GitHub will force you to stagger your URL requests.  There are a few ways to do this, but I found that registering as an App and getting a Client ID and a Client Secret was the simplest way to get what I needed.

This covers the basics of how to get started : https://developer.github.com/guides/basics-of-authentication/

This is where you start your registration to get your Client ID/Secret : https://github.com/settings/developers

Here is the registration page : https://github.com/settings/applications/new

It will ask you for a handful of fields. My application isn't actually located at the URLs I entered; that is just my personal website.

When you are finished, you get your Client ID/Secret and it will look like this :

p.s. I will have changed my client secret so you can't use these.  ;)

3. Call the API

We want to test this out a bit and refine the query.  Here is a link that describes all of the search options.

https://developer.github.com/v3/search/#search-repositories

But, to shortcut all the documentation and boil down to what we will use, let's break down the URL that worked for me.

https://api.github.com/search/repositories?q=avaudioplayer+in:name,description,readme+language:swift&page=1&type=Code&per_page=1&client_id=85c31734801c959e966c&client_secret=46b4554d510f6e4e746c748d15421b583a8353b9

Note : The above link won't work for you because I changed the secret.

  • URL : https://api.github.com/search/repositories?
  • q : This is how they format their query string.  Here we are specifying that we are looking for "avaudioplayer" in the name, description and readme, and that the project needs to be in the Swift language.
  • page : I don't need more than one page of results because I am looking for the total, and that is part of the metadata that will be returned.
  • type : I know we said we couldn't do a code-level search, but this asks for matches of the keyword at the code level without searching inside the code itself.
  • per_page : I don't need more than one result per page since the data we are after is in the metadata.
  • client_id / client_secret : These are the strings GitHub gave you when you registered your app.
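The breakdown above can be sketched as a small helper that assembles the same URL from its parts. This is an illustration in Python rather than the ColdFusion used later; the function name `build_search_url` and the placeholder credentials are made up for the example. GitHub's authentication rules have changed over time, so check the current API documentation before relying on credentials in the query string.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/repositories"

def build_search_url(class_name, client_id, client_secret):
    """Build the repository-search URL for one class name."""
    params = {
        # Lowercased class name plus the same qualifiers as the URL above.
        "q": f"{class_name.strip().lower()} in:name,description,readme language:swift",
        "page": 1,
        "type": "Code",
        "per_page": 1,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    # safe=":," keeps the qualifier punctuation readable; spaces become "+".
    return SEARCH_URL + "?" + urlencode(params, safe=":,")

url = build_search_url("AVAudioPlayer", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
print(url)
```

Building the query string from a dictionary keeps each parameter visible on its own line and avoids hand-escaping mistakes when a class name contains unusual characters.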

So what comes back when you put your URI into a browser?

There's more... but this is good enough to show that we are after the total_count

We are after the total_count part of this JSON response, and we will get at that nugget of data when we put the call into some code that will extract it and put it into the database.
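To make the extraction concrete, here is a minimal Python sketch of pulling total_count out of a response body. The sample body is made up and trimmed for illustration (the count of 40 is not a real result), and `extract_total_count` is a hypothetical helper name. It returns None when the field is missing, which is what you would expect from an error payload.

```python
import json

def extract_total_count(body):
    """Return total_count from a search response body, or None if absent."""
    data = json.loads(body)
    if isinstance(data, dict) and "total_count" in data:
        return data["total_count"]
    return None

# Made-up, trimmed response; the number is illustrative only.
sample = '{"total_count": 40, "incomplete_results": false, "items": []}'
print(extract_total_count(sample))

# A body without total_count (for example, an error message) yields None.
print(extract_total_count('{"message": "something went wrong"}'))
```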

4. Loop and stick it into the database

For the next part I am going to use a language called ColdFusion, which is a procedural language for integrating logic into web pages.  If you're not familiar with it, it is like PHP or .NET in that it puts programming logic into a page that, when called, crunches the instructions and sends out the resulting HTML.  In this example, though, I will only use it for the crunching part, and output to the screen to confirm that what I am working on is working.

Also, I am not going to go over how to install ColdFusion.  This is a high-level view of what I did to get the results I wanted, but you should use whatever language you're familiar with.  The logic should be pretty straightforward.

<!--- Get the data and put it into a variable that we can loop over. --->
<cfquery name="getSwiftClasses" datasource="appclaytonweb" maxrows="430" >
	select 
		tblSwiftMethods.id,
		tblSwiftMethods.class 
	from util.tblswiftmethods
	where tblSwiftMethods.totalSwiftCountGitHub is null
	order by tblSwiftMethods.id asc

</cfquery>

<!--- GitHub - Search - Score for Swift --->
<cfoutput query="getSwiftClasses">
	<!--- Filter out where there are more than 1 word. --->
	<cfif listlen(class, ' ') eq 1>

		<!--- make lowercase so that it can be better searched. --->
		<cfset lowerCaseClassName = trim(lcase(class))>

		<!--- Make the call to the API --->
		<cfhttp url="https://api.github.com/search/repositories?q=#lowerCaseClassName#+in:name,description,readme+language:swift&page=1&type=Code&per_page=1&client_id=85c31734801c959e966c&client_secret=46b4554d510f6e4e746c748d15421b583a8353b9" method="get"  result="result" charset="utf-8">
		</cfhttp>

		<!--- Get the JSON data and turn it into something we can work with --->
		<cfset deserializedJsonResults = deserializeJSON(result.filecontent)>

		<!--- Check that the data we are after exists --->
		<cfif isdefined("deserializedJsonResults.total_count")>
			<cfset total = deserializedJsonResults.total_count >

			<!--- send to the browser the current looped result --->
			#lowerCaseClassName# : #total#<br>

			<!--- Update the current record with the data --->
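			<cfquery name="updateData" datasource="appClaytonWeb">
				UPDATE [Util].[tblSwiftMethods]
				 SET [totalSwiftCountGitHub] = <cfqueryparam value="#total#" cfsqltype="cf_sql_integer">
				 WHERE id = <cfqueryparam value="#id#" cfsqltype="cf_sql_integer">
			</cfquery>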

		<cfelse>
			<cfdump var="#deserializedJsonResults#" abort="true">
		</cfif>
	</cfif>

	<!--- Pause for a couple of seconds so that we don't get blocked for hitting GitHub too many times at once --->
	<cfset sleep(2100)>
	<cfflush>

</cfoutput>

Each run is limited to 430 records (the maxrows on the query) because we don't want the page request to time out.  With a 2.1-second pause per record, a full run takes at least 15 minutes.  If you're running this from something that isn't tied to a browser request through IIS or another web server, this sensitivity to time-outs may not be an issue.
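For readers going the standalone route, the same loop can be sketched outside a web request. This is an illustration, not a port of the exact code above: it uses Python with an in-memory SQLite table as a stand-in database, and `update_github_counts` and `fetch_total` are hypothetical names. The fetcher is passed in as a function so the loop can be tried with fake data before pointing it at the real API. Unlike the ColdFusion version, it pauses only after an actual lookup.

```python
import sqlite3
import time

def update_github_counts(conn, fetch_total, pause_seconds=2.1, max_rows=430):
    """Fill totalSwiftCountGitHub for rows that are still NULL.

    fetch_total is any callable that takes a lowercase class name and
    returns an int, or None when the response had no total_count.
    Returns the number of rows updated.
    """
    rows = conn.execute(
        "SELECT id, class FROM tblSwiftMethods "
        "WHERE totalSwiftCountGitHub IS NULL ORDER BY id LIMIT ?",
        (max_rows,),
    ).fetchall()
    updated = 0
    for row_id, class_name in rows:
        name = class_name.strip().lower()
        if len(name.split()) != 1:
            continue  # skip multi-word entries, as the CFML does
        total = fetch_total(name)
        if total is None:
            break  # stop and inspect, like the cfdump/abort branch
        conn.execute(
            "UPDATE tblSwiftMethods SET totalSwiftCountGitHub = ? WHERE id = ?",
            (total, row_id),
        )
        conn.commit()
        updated += 1
        time.sleep(pause_seconds)  # stay under the API's request limit
    return updated

# Demo with a fake fetcher and no pause; swap in a real HTTP call to use it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblSwiftMethods "
    "(id INTEGER PRIMARY KEY, class TEXT, totalSwiftCountGitHub INTEGER)"
)
conn.executemany(
    "INSERT INTO tblSwiftMethods (class) VALUES (?)",
    [("AVAudioPlayer",), ("Two Words",), ("UIView",)],
)
fake_totals = {"avaudioplayer": 40, "uiview": 7}  # made-up numbers
count = update_github_counts(conn, fake_totals.get, pause_seconds=0)
print(count)
```

Because only NULL rows are selected, the script can be stopped and restarted without redoing lookups that already succeeded.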

And that is it!

We should have totals for all our classes!


GitHub was one side of the challenge. StackExchange is similar in some regards but different in others.  We will approach it the same way, but the code will be a little different in how we call that API.

Next time... we tackle StackExchange.