Saturday, 2 September 2017

Unit Testing with Mocha, a local instance of dynamoDB & Promises

I'm writing the backend for my current iOS App in JavaScript using node.js and AWS Lambda along with DynamoDB.

My AWS Lambda code is mostly AWS Lambda agnostic except for the initial handler methods, which makes it fairly testable outside of AWS. However, it depends on DynamoDB. It's quite easy to write Unit Tests that run against a live version of DynamoDB but I wanted to run against a local instance, ideally an in-memory instance, so that it would be quick (not that running against a real instance is that slow) and so that I could have a clean dB each time.

NOTES

  • As these tests are running against a dB it might be more accurate to call them Integration Tests implemented using a Unit Testing framework but I'll refer to them as Unit Tests (UTs).
  • This is running on MacOS
  • I don't Unit Test the actual AWS Lambda function. Instead I export the underlying functions & objects that the AWS Lambda uses and Unit Test these.
  • I'm a JavaScript, Node, AWS n00b so if you spot something that's wrong or bad please comment.
  • I don't like the callback pyramid so I use Promises where I can. I would use async and await but the latest version of Node that AWS Lambda supports doesn't support them :-(

In order to run this locally you'll need:

My goal was to have a clean dB for each individual Unit Test. The simplest way to achieve this, I thought, was to create an in-memory instance of dynamoDB and destroy it after each Unit Test.


This uses the child_process npm package to create an instance before each test, store the handle to the process in a local-ish variable and, following the test, just kill it. The important point here is that the '-inMemory' option is used, meaning that when the dB instance is killed and another started everything is effectively wiped without having to do anything.
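A minimal sketch of that spawn-and-kill approach follows. It assumes DynamoDB Local has been downloaded and unpacked into a directory named by the hypothetical DYNAMODB_LOCAL_DIR environment variable (falling back to ./dynamodb_local) and that java is on the path. The function names are illustrative only.

```javascript
const { spawn } = require('child_process');

// Assumption: where the DynamoDB Local download was unpacked (hypothetical variable name).
const DYNAMO_DIR = process.env.DYNAMODB_LOCAL_DIR || './dynamodb_local';

// Build the java arguments for an in-memory instance listening on the given port.
function dynamoArgs(port) {
  return [
    '-Djava.library.path=' + DYNAMO_DIR + '/DynamoDBLocal_lib',
    '-jar', DYNAMO_DIR + '/DynamoDBLocal.jar',
    '-inMemory',
    '-port', String(port)
  ];
}

// Start the instance and return the process handle so it can be killed later.
function startDynamo(port) {
  return spawn('java', dynamoArgs(port), { stdio: 'ignore' });
}

// Kill the instance; with -inMemory nothing survives the process.
function stopDynamo(child) {
  child.kill();
}

// Intended use inside a Mocha file (not executed here):
// let dynamo;
// beforeEach(function () { dynamo = startDynamo(8000); });
// afterEach(function () { stopDynamo(dynamo); });
```

Keeping the argument list in its own function makes it easy to check the flags without actually launching java.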

The problem I had with this approach is that in addition to creating the dB each time I also needed to create a table. The documentation for local dynamoDB says that one of the differences between the AWS hosted & the local versions is that CreateTable completes immediately. It seems that although the function does indeed complete immediately, the table isn't immediately available. This meant the UT using the table often failed with:

1) awsLambdaToTest The querying an empty db for a user Returns{}:
     ResourceNotFoundException: Cannot do operations on a non-existent table

I'm going to jump ahead and show the completed Unit Test file and explain the various things I had to do in order to get it working. This shows the tests.


before()/after() - creating/destroying dynamoDB


Rather than attempting to create & destroy the dB for each Unit Test I settled for creating it once per Unit Test file. This is handled in the before() & after() functions. Here the local instance of dynamoDB is spawned using the child_process package and a reference to the process retained. This is then used to kill it afterwards. The important point to note here is the use of the sleep package & function.

I found that when I had multiple test files, each with their own before() & after() functions that did the same as these, the process hadn't died immediately even though kill purported to have killed it (I checked the killed flag). This meant that the before() function in the next set of tests would successfully connect to the dying instance of dynamoDB. Then later, when any operation was performed, it would just hang until Mocha timed out the Unit Test/before handler. I tried various ways to detect that the process was really dead but none worked, so I settled for a sleep.

beforeEach()/afterEach() - creating/destroying the table

Where possible I use Promises. Mocha handles promises quite simply for both hooks (the before*/after* functions) and Unit Tests. The key is to make sure to return the final promise (or pass in the done parameter & call it - though I don't use this mechanism).

Looking at the beforeEach() function, createTable() is called, which returns a promise (from the AWS.Request type that aws-sdk.DynamoDB.createTable() returns). This promise is then chained to the waitFor method, which polls the dB for the state of the table. The returned promise will not complete until the table has been created and waitFor has completed.
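A sketch of that chain, assuming the aws-sdk (v2) AWS.DynamoDB client, whose requests expose promise(). The table definition and the function name are hypothetical, and 'tableExists' is assumed to be the waiter state being polled for.

```javascript
// Hypothetical table definition; the real table name and keys will differ.
const tableParams = {
  TableName: 'Users',
  AttributeDefinitions: [{ AttributeName: 'userId', AttributeType: 'S' }],
  KeySchema: [{ AttributeName: 'userId', KeyType: 'HASH' }],
  ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
};

// dynamodb is expected to be an AWS.DynamoDB client pointed at the local instance.
// The returned promise settles only after the table is created and the waiter completes.
function createTableAndWait(dynamodb, params) {
  return dynamodb.createTable(params).promise()
    .then(() => dynamodb.waitFor('tableExists', { TableName: params.TableName }).promise());
}

// Intended use inside a Mocha file (not executed here):
// beforeEach(function () { return createTableAndWait(dynamodb, tableParams); });
```

Passing the client in as a parameter keeps the helper free of any global configuration, so the same function works against a local or a hosted dB.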

I am not convinced that waitFor is needed. According to the AWS vs Local DynamoDB guide, tables are created immediately for local instances. I added this check as occasionally I was getting resource errors like the one earlier. However, I think the cause of that was that I forgot the return statement before the call to createTable(), meaning no Promise was returned to Mocha so it thought the beforeEach() function had completed. I have since removed the waitFor call from my real Unit Tests and they all seem to work.
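The missing return is easy to reproduce without Mocha or DynamoDB. The sketch below uses a stand-in for the asynchronous call (createTableStub is hypothetical) and shows what a caller such as Mocha receives from each version of the hook.

```javascript
// Stand-in for an asynchronous call such as createTable().promise().
function createTableStub() {
  return new Promise((resolve) => setTimeout(resolve, 10));
}

// Correct: the promise is returned, so the caller can wait for it to settle.
function hookWithReturn() {
  return createTableStub();
}

// Bug: the promise is created but dropped, so the caller receives undefined
// and treats the hook as finished straight away.
function hookWithoutReturn() {
  createTableStub();
}

console.log(hookWithReturn() instanceof Promise); // true
console.log(hookWithoutReturn());                 // undefined
```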

Unit Tests

That's really it. The hard part wasn't writing the UTs but getting a local instance of DynamoDB running with the table that the functions under test use in the correct state. Again, because the functions being tested usually return promises themselves, it is necessary to return a Promise. The assertion(s) are made synchronously in a then continuation chained to the Promise returned from the function being tested, and the Promise from the whole chain is returned.

If an assertion fails then even though it's within a continuation Mocha detects this and the test fails. If the function under test throws then Mocha also catches this and the test fails.

And finally

timeout

There's also the this.timeout(5000); at the top of the tests for this file. By default Mocha has a 2 second timeout for all tests and hooks. With the 1 second sleep when starting the dB it's possible for the hook to time out, causing everything else to fail. The 5 second timeout protects against this.

localhost

When creating the local instance it uses the default port of 8000 and is accessible via the localhost hostname alias. This is used to access the dB when creating the reference to it in before(). A local aws.config is also constructed in each of the functions that access the dB in the actual code under test below.
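A sketch of the connection options for such a local instance. The region and the credential values are placeholders chosen for illustration (assumptions, not values from the real code), and the function name is hypothetical.

```javascript
// Options for reaching DynamoDB Local on localhost; 8000 is its default port.
// region, accessKeyId and secretAccessKey are placeholder values (assumptions).
function localDynamoOptions(port) {
  return {
    region: 'us-east-1',
    endpoint: 'http://localhost:' + (port || 8000),
    accessKeyId: 'local',
    secretAccessKey: 'local'
  };
}

console.log(localDynamoOptions().endpoint); // http://localhost:8000

// Intended use with the aws-sdk (v2) package (not executed here):
// const AWS = require('aws-sdk');
// const dynamodb = new AWS.DynamoDB(localDynamoOptions());
```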

The actual code being tested


Tuesday, 3 January 2017

Debugging AWS Lambda functions locally using VS Code and lambda-local

I've just started using AWS Lambda with node.js. I was able to develop these locally using the lambda-local npm package, e.g. with node.js installed (via brew) and lambda-local installed (using npm) then the following "hello, world" example is run as follows:

hellolambda.js

'use strict';

console.log('Loading function');

exports.handler = (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    console.log('value1 =', event.key1);
    console.log('value2 =', event.key2);
    console.log('value3 =', event.key3);
    callback(null, event.key1);  // Echo back the first key value
    //callback('Something went wrong');

};

defaultevent.js

module.exports =
{
"key1": "hello",
"key2": "lambda",
"key3": "node"

};

/usr/local/bin/lambda-local -l hellolambda.js -e defaultevent.js

Loading function
info: Logs
info: ------
info: START RequestId: d683128b-ac14-93c3-b2c1-5541f3bb3fda
Received event: {
  "key1": "hello",
  "key2": "lambda",
  "key3": "node"
}
value1 = hello
value2 = lambda
value3 = node
info: END
info: Message
info: ------
info: hello
info: -----

info: lambda-local successfully complete.

Rather than use bash and vi (I'm running on MacOS) I wanted to use some sort of IDE. VS Code seemed ideal as it's free and it also has builtin node.js debugging. Using it for editing is very simple, just open the folder containing the source. In this case ~/tmp/hellolambda

However, switching to the debugging section and creating the default launch configuration, where VS Code will launch node with the specified file as the program, doesn't do much good.



This is because when running a lambda locally using lambda-local the program that node needs to run is the local lambda environment that lambda-local creates, which in turn launches the lambda function.

This can be simply configured by specifying the lambda-local script as the program (it's a node script) and then passing the lambda script and the event data as arguments using the args key (which isn't included when using the VS Code option to add a configuration). The original example above can be launched using the following configuration.

In the output window at the bottom the results of executing the lambda are shown. Breakpoints can be set and hit.

It's important that each command line argument, i.e. the option and its value, is specified separately. Even though '-l' and its value are a pair they are separate command line arguments (2 in total), whereas "-l ${workspaceRoot}/hellolambda.js" would be a single argument.
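A sketch of what such a configuration might contain, written as a JavaScript object that prints the JSON for the "configurations" array in .vscode/launch.json. The name is hypothetical, and the program path assumes lambda-local lives where it was invoked from earlier.

```javascript
// One launch configuration; type, request, name, program, cwd and args are
// standard attributes of a VS Code node launch configuration.
const launchConfiguration = {
  type: 'node',
  request: 'launch',
  name: 'lambda-local hellolambda',        // hypothetical display name
  program: '/usr/local/bin/lambda-local',   // assumed install location of the script
  cwd: '${workspaceRoot}',
  // Each option and each value is its own element, giving 4 arguments in total.
  args: [
    '-l', '${workspaceRoot}/hellolambda.js',
    '-e', '${workspaceRoot}/defaultevent.js'
  ]
};

// Print it as JSON ready to paste into launch.json.
console.log(JSON.stringify(launchConfiguration, null, 2));
```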

NOTE: The lambdas I'm writing are also using the AWS DynamoDB. Using a local instance of DynamoDB along with installing the AWS SDK via npm I've been able to successfully invoke local lambdas that have used the local instance of the DB.

Monday, 1 February 2016

Swift's defer statement is funkier than I thought

Swift 2.0 introduced the defer keyword. I've used this a little but only in a simple way, basically when I wanted to make sure some code would be executed regardless of where control left the function, e.g.

private func resetAfterError() throws
{
  defer
  {
    selectedIndex = 0
    isError = false
  }
  if /* condition */
  {
    // Do stuff
    return
  }

  if /* other condition */
  {
    // Do other stuff
    return
  }

  // Do default stuff
}

In my usage to date there has always been some code that should always be executed prior to the function's exit and additionally only one piece of code. Therefore I've always put the defer statement at the top of the function so when reading it's pretty obvious.

I was aware that if there were multiple defer statements then they'd be executed in reverse order but what I'd not given any thought to before was what happens if the defer statement isn't reached. In fact I'd just assumed it was more of a declaration that this code should always be executed on function exit and as I put mine right at the start of the function this was effectively the case.

However, for some functions (probably most) you don't want this. You only want the deferred code executing if something else has happened. This is shown simply in The Swift Programming Language book example:

func processFile(filename: String) throws {
    if exists(filename) {
        let file = open(filename)
        defer {
            close(file)
        }
        while let line = try file.readline() {
            // Work with the file.
        }
        // close(file) is called here, at the end of the scope.
    }
}

In this if the file is not opened then the deferred code should not be executed. Another very important usage is:

extension NSLock
{
  func synchronized<T>(@noescape closure: () throws -> T) rethrows -> T
  {
    self.lock()

    defer
    {
      self.unlock()
    }
    return try closure()
  }
}

If the lock is never obtained then it should never be unlocked. In this case that shouldn't happen as self.lock() will not return until it obtains the lock, but if that line were replaced with self.

This is how defer works. If the defer statement is never reached and/or encountered then the deferred code block will never be executed. This includes branches (if-statements etc.). The following example:

enum WhenToReturn
{
  case After0
  case After1
  case After2
}

func deferTest(whenToReturn: WhenToReturn, shouldBranch: Bool)
{
  print("Defer Test - whenToReturn:\(whenToReturn), shouldBranch:\(shouldBranch)")
  defer
  {
    print("defer 0")
  }
  print("0")
  if whenToReturn == WhenToReturn.After0
  {
    return
  }
  defer
  {
    print("defer 1")
  }
  print("1")
  if whenToReturn == WhenToReturn.After1
  {
    return
  }
  if shouldBranch
  {
    defer
    {
      print("shouldBranch")
    }
  }
  defer
  {
    print("defer 2")
  }

  print("3")
}

deferTest(WhenToReturn.After0, shouldBranch: false)
deferTest(WhenToReturn.After1, shouldBranch: true)
deferTest(WhenToReturn.After2, shouldBranch: false)
deferTest(WhenToReturn.After2, shouldBranch: true)

Results:

Defer Test - whenToReturn:After0, shouldBranch:false
0
defer 0

Defer Test - whenToReturn:After1, shouldBranch:true
0
1
defer 1
defer 0

Defer Test - whenToReturn:After2, shouldBranch:false
0
1
3
defer 2
defer 1
defer 0

Defer Test - whenToReturn:After2, shouldBranch:true
0
1
shouldBranch
3
defer 2
defer 1
defer 0

Program ended with exit code: 0


This shows that returning before and/or not branching results in defer statements not being encountered, hence the deferred code is not executed. This is no different to, say, a finally-block in C#. The reason for my initial confusion is that there is no additional context for a defer block as there is for a finally block, i.e. the presence of the try, e.g.

try
{
  // Try some stuff 
}
finally
{
  // Always do something having tried something regardless of whether it worked or not
}

Whereas the only and actual context of the defer block is its position.
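The same rule for finally can be sketched in JavaScript, which has the same try/finally construct. This is only an analogy to the Swift behaviour above, and the function name is made up for illustration. Leaving before the try is entered skips the finally block, just as returning before a defer statement skips the deferred code.

```javascript
// Returns a log of what ran so the order is visible.
function finallyTest(returnEarly) {
  const log = [];
  (function run() {
    log.push('start');
    if (returnEarly) {
      return;              // leaves before the try is ever entered
    }
    try {
      log.push('try');
      return;              // leaves from inside the try
    } finally {
      log.push('finally'); // runs only because the try was entered
    }
  })();
  return log;
}

console.log(finallyTest(true));  // [ 'start' ]
console.log(finallyTest(false)); // [ 'start', 'try', 'finally' ]
```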

Tuesday, 26 January 2016

The Perils of debugging with return statements in languages without semi-colon statement terminators, i.e. Swift

This is a pretty obvious post but perhaps writing it will stop me falling prey to this issue.

When I'm debugging and I know that some code executed in a function is not to blame but is noisy in terms of what it causes to happen etc. I'll often just prevent it from being executed in order to simplify the system, e.g.

func foo()
{
/*
f()
g()
h()
// Do lots of other things...
*/
}

Sometimes I like to be quicker so I just put in an early return statement, i.e.

func foo()
{
return
f()
g()
h()
// Do lots of other things...
}

I must also go temporarily warning blind and ignore the following:


The effect of this is that, rather than preventing everything after the return statement from executing, as per the warning the return statement takes f() as its argument and explicitly calls it, returning its value, though not executing the remaining functions. In this case, as foo() (and f(), though it's not shown) is void, that is nothing. In fact if foo() or f() had non-void return types this wouldn't compile.

The fix is easy. Just put a semi-colon after the return.

func foo()
{
return;
f()
g()
h()
// Do lots of other things...

}

I use this 'technique' when I'm debugging C++ where this works fine. This is slightly interesting as C++ has the same semantics. The following C++ code has the same problem as the Swift, in that this code also invokes f() as its return value.

void foo()
{
return
f();
g();
h();
}

I guess the reason it's not an issue with C++ (as much or at all) is that my muscle memory or something else is always wanting to terminate lines with semi-colons so the natural way to write the return would be 'return;' whereas in Swift without the semi-colon requirement it's natural not to hence this issue becomes slightly more prevalent.

Tuesday, 19 January 2016

OAuth authentication on tvOS

Recently I've just published an Apple TV (tvOS) App to view photos stored on Microsoft OneDrive.



Implementing this on tvOS rather than iOS presented one unique challenge. The OneDrive REST API requires OAuth2 authentication in order to obtain an OAuth token which is then used for all the other calls.

Normally (well based on my limited experience) OAuth within Apps is handled by using a UIWebView along with delegate code that performs the OAuth handshake (image linked from IBM).



tvOS does not contain any form of web view, i.e. no UIWebView and no WKWebView (not that it would be that much use due to the lack of hooks). As the actual authentication is performed within the UIWebView by the authenticating 3rd party (Microsoft in this case requiring the user logs in with their Microsoft Account credentials) there's not a lot that can be done without it.

However, both iOS and tvOS devices are generally logged into using an Apple Id, which is also used to log in to iCloud, and an Apple TV owned by the same person who owns another iOS device will generally use the same Apple Id. Therefore, what I did was to write a very simple iOS App that:
  1. Performs the OAuth Authentication handshake
  2. Stores the resulting OAuth token in the iCloud KeyValue Store






When written this is usually synchronized to iCloud very quickly. On the other side the tvOS app reads the iCloud KeyValue Store checking to see if the OAuth token exists.


If it does then it can continue as per any other App that has successfully performed the OAuth handshake.


I believe that iCloud Storage and the process of writing to and reading from iCloud is secure. This is important as following the handshake the OAuth token acts effectively as a password. Each token obtained from Microsoft is valid for one hour so after that the user needs to perform the authentication from the iOS device again.

It is possible to request an OAuth refresh token which allows a client to update an expired token as long as access to OneDrive for the App has not been revoked. However, I prefer to err on the side of caution at the moment. I also only request read-only access to OneDrive.

For this to work the same user (Apple Id) needs to be logged into both the Apple TV and the iOS device and additionally be signed into iCloud on these devices. From a programmatic perspective both Apps need the iCloud capability enabling, but only Key-value storage.

However, you'll notice that CloudKit has also been enabled. This is so that CKContainer methods can be called. In particular (well only)

CKContainer.defaultContainer().accountStatusWithCompletionHandler

In order to establish whether the user is currently signed in to iCloud and any changes (signing in & out potentially as a different user) whilst the App is running.

Enabling this automatically creates an Entitlement file (named <AppName>.entitlements) and within it creates a Key confusingly called 'iCloud Key-Value Store' with the default value of '$(TeamIdentifierPrefix)$(CFBundleIdentifier)' - this is not the key to access values you store BTW but is just the iCloud KV configuration. This will happen for both the iOS and tvOS Apps.

NOTE: The two collapsed keys are as a result of enabling CloudKit.


For both Apps to have access to the same iCloud Key-Value storage the results of expanding the '$(TeamIdentifierPrefix)$(CFBundleIdentifier)' macros need to be the same. For my App I've created a single App that has both an iOS and tvOS component so their CFBundleIdentifier is the same. The TeamIdentifierPrefix is taken from your Developer Apple Id.

The first part of the value has to be $(TeamIdentifierPrefix) as in order to make the KV Storage secure this value forms part of the signing process. If you replaced the whole value with say 'BOB' then it won't build properly.


As such it's possible for all your Apps (published with the same Apple Id) to share iCloud Key-Value Storage contents.

Reading & writing is very simple. I just use a single Key-Value pair to read, write (& where necessary delete) the token. This is accomplished using:

NSUbiquitousKeyValueStore.defaultStore().setDictionary(stuff.asDict(), forKey: "mykey")

to write and

let stuff = NSUbiquitousKeyValueStore.defaultStore().dictionaryRepresentation["mykey"] as? [String:AnyObject]

to read.

This example reads & writes a dictionary as I needed to store a set of KV Pairs (a dictionary) as the value of a single KV-pair but fundamental data types can be stored directly too.

When starting the App, according to the docs it is important to call the synchronize method to initiate timely iCloud synchronization.

When the tvOS based Apple TV was first released there were various articles about how to enable users to log in to their accounts for certain apps. These often involved similar configuration requiring the user to use an iOS device to input a string of numbers presented by the tvOS App. However, this solution was usually for Apps that managed their own accounts. My solution is similar in that it solves the cumbersome entry problem but also enables the use of browser (UIWebView) based OAuth2 on a device that doesn't directly support it.

Wednesday, 23 December 2015

Git remote repos with OneDrive

I have various public git repositories on GitHub but I like to keep some source (usually my active App Store apps) private. Whilst it'd be nice to use GitHub private repositories, given my Apps are for fun, don't really make anything and don't require collaboration, the pricing is prohibitive.

However, I really like the idea of having an offsite copy of my repository. As it happens I have an Office 365 Subscription which comes with 1TB of OneDrive space. I use OneDrive on OSX to sync a bunch of folders. I could use a synchronised OneDrive for my work directory but I don't want OneDrive synchronising all the temporary build files etc every time I build.

It turns out the ideal solution is to create a remote clone in a OneDrive synchronised folder. In fact I have a dedicated OneDrive folder called 'src' that contains clones of all of my git repos. Then, each time I commit to the local git repository and push OneDrive performs the synchronisation. If it happens that the code I'm working on is public then adding a public GitHub repo is a doddle.

Having created a local Git repo (though usually Xcode does this when starting a new project) it's easy to create the OneDrive clone:

  1. cd /Users/Pete/OneDrive/src
  2. git clone --bare file:////Users/Pete/Projects/<ProjectName>/.git <ProjectName>.git
As the remote clone is really just a backup-cum-master that will never be a working repository I create a bare clone and suffix the directory name with '.git'. I think this is a fairly common convention.

At this point the source repository is cloned but the source is now a remote of the new repository rather than the other way round. This is easy to fix:

cd-ing into <ProjectName>.git (my example Project is Photone) and running git remote -v gives:

~/OneDrive/src/Photone.git[23]git remote -v
origin file:////Users/Pete/Projects/PhotoneViewer/.git (fetch)
origin file:////Users/Pete/Projects/PhotoneViewer/.git (push)

then running:
  1. git remote remove origin
Removes the relationship between the new remote and the original source. To implement the desired reverse relationship:


  1. cd /Users/Pete/Projects/<ProjectName>
  2. git remote add OneDrive file:////Users/Pete/OneDrive/src/<ProjectName>.git
I name my OneDrive remote repos 'OneDrive'. This helps if I have multiple remotes.

git remote (for my current project) now gives:


~/Projects/PhotoneViewer[49]git remote -v
OneDrive file://Users/Pete/OneDrive/src/Photone.git (fetch)
OneDrive file://Users/Pete/OneDrive/src/Photone.git (push)


From this point onwards I use SourceTree. However, if you use the command line then a couple of extra steps are required otherwise git complains. 

Firstly, if when pushing you just want to do:

git push OneDrive

you need to tell git that the new OneDrive remote is the upstream for the master branch. This is done by:

git push --set-upstream OneDrive master

Secondly, unless you've already set the push.default setting or just use 'git push --all' then you'll need to decide which option you want. The help from git describes these well:

warning: push.default is unset; its implicit value has changed in
Git 2.0 from 'matching' to 'simple'. To squelch this message
and maintain the traditional behavior, use:

  git config --global push.default matching

To squelch this message and adopt the new behavior now, use:

  git config --global push.default simple

When push.default is set to 'matching', git will push local branches
to the remote branches that already exist with the same name.

Since Git 2.0, Git defaults to the more conservative 'simple'
behavior, which only pushes the current branch to the corresponding
remote branch that 'git pull' uses to update the current branch.

See 'git help config' and search for 'push.default' for further information.
(the 'simple' mode was introduced in Git 1.7.11. Use the similar mode
'current' instead of 'simple' if you sometimes use older versions of Git)

If you're using the remote clone as a backup then perhaps the original, i.e. matching, behaviour is desirable. Whilst I use OneDrive this configuration should work for any other file synchronisation service, e.g. iCloud or Dropbox, or a standard mounted File System, e.g. Samba, NFS etc.

Swift Enums and Protocols

I'm trying to clear out my inbox before Christmas and I noticed an email to myself entitled 'Enum question. Add protocol to enum?'.

The short answer is yes. The longer one: take the following protocol:

protocol Foo
{
  func f() -> Void
}

A simple enum can be created that implements it:

enum Test: Foo
{
  case One
  func f()
  {
    print("Hello")
  }
}

So the following short program:

let baz = Test.One

baz.f()

generates the output:

Hello
Program ended with exit code: 0

The protocol can also be implemented by extension so the following is equivalent and produces the same results:

extension Test: Foo
{
  func f()
  {
    print("Hello")
  }
}

It's not surprising that Swift's enum types support protocols. Off the top of my head I can't think of any clever reasons why you'd have enums implement a protocol. Given an enum is a first class type in Swift then it makes perfect sense. In fact it's documented on page 424 in The Swift Programming Language.