
Serverless architectures for multimodal experiences

AWS Lambda, Azure Functions, Google Cloud Functions, and IBM OpenWhisk are offerings from the major tech companies catered to providing serverless architectures for developers. These offerings were born out of a shift in modern cloud development: moving from monoliths, large single systems that handle all of the logic, to microservices, modular pieces of code that usually perform a single task. This microservice-centric approach lends itself nicely to the creation of multimodal experiences, experiences that exist across multiple modes of interaction like a mobile app, chatbot, or virtual assistant.

What is serverless?

The simplest way to think of serverless architecture is Functions-as-a-Service, or FaaS. Instead of one monolithic app server, serverless systems are composed of many disparate, single-use functions. The term serverless refers to the fact that these functions are not kept constantly running like traditional application servers. Instead, they are spun up as needed and taken down when done executing. The cloud provider controls these resources, so developers don’t have to be cloud system experts to manage their functions. This “as-needed” approach is driven by event triggers. These triggers vary by platform, but in general they are based on either a direct HTTP request or an action performed on another service in the provider’s platform, e.g. when a document is added to a NoSQL database. Finally, some providers allow for built-in chaining of these functions. Features like AWS Step Functions or OpenWhisk Sequences allow a developer to set up these chains, taking the output of one function as input to another. In theory, one could create a pseudo-functional language, making the backend just different function chains. Here’s a potential chain for a very simple customer service Twitter bot: [Diagram: function chain for a customer service Twitter bot]
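As a rough sketch of such a chain (the stage names and logic here are illustrative inventions, not a real bot or API), each stage has the generic dictionary-in/dictionary-out shape these platforms use, so the output of one stage feeds the next:

```swift
import Foundation

// Hypothetical stages of a customer service Twitter bot chain.
// Each stage takes and returns [String:Any], the generic event shape,
// so a platform feature like Step Functions or a Sequence can chain them.
func parseTweet(_ event: [String:Any]) -> [String:Any] {
    let text = event["tweet"] as? String ?? ""
    return ["text": text.lowercased()]
}

func classifyIntent(_ event: [String:Any]) -> [String:Any] {
    let text = event["text"] as? String ?? ""
    // Toy classifier: a real bot would call an NLU service here.
    let intent = text.contains("refund") ? "billing" : "general"
    return ["intent": intent]
}

func routeReply(_ event: [String:Any]) -> [String:Any] {
    let intent = event["intent"] as? String ?? "general"
    // The last link in the chain would call the Twitter API to reply.
    return ["reply": "Routing you to our \(intent) team."]
}

// The platform performs this composition for you:
let result = routeReply(classifyIntent(parseTweet(["tweet": "I need a REFUND"])))
// result["reply"] == "Routing you to our billing team."
```

Each stage is independently reusable: the same classifier could sit in a chain fed by email or chat instead of Twitter.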

Multimodal Experiences and Serverless

Serverless systems lend themselves nicely to multimodal experiences. The functions are inherently agnostic and operate off of generic events. This allows code to be reused across multiple platforms, like an Android and iOS app, and across a suite of products. A company could create a unified bug reporting system where a function exists as an HTTP endpoint, takes in some BugReport JSON object, and sends it into the company’s bug tracking system. Now, instead of implementing the same bug reporting system’s SDK in each and every projection (like your web app, iOS app, Android app, etc.), they can all be funneled through this single endpoint.

These functions are also great for pushing data to all of these different endpoints based on ambient triggers. A user could trip some local event trigger, like a beacon or entering a geo-fence, which then fires off an email or push notification asking them to give feedback about their visit. These genericized interactions make adding new projections much simpler.

Not only can an app’s business logic be moved off into these on-demand functions, but those functions themselves can be reused across different function chains. One could write a function that takes in a phone number and some text and uses an SMS service to send a text. Now, if any system needs the ability to send texts, this function can be used across all of them. A company could create a library of service wrappers, data processors and workflow integrators and have n-number of apps utilize them without having to duplicate effort. A similar type of system can be seen with Microsoft’s Flow offering. This modularity and composability is critical in a world where we care more about what a user’s intention is than the interface they’re using.
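As a sketch, that reusable SMS function might look like the following OpenWhisk-style action. The `main` shape is the platform’s, but the payload format and "queued" status are hypothetical placeholders; a real action would POST to your SMS provider’s actual API:

```swift
// SendTextAction.swift -- hedged sketch of a reusable "send a text" action.
// The payload shape and "queued" status are illustrative placeholders,
// not any real SMS provider's API.
func main(args: [String:Any]) -> [String:Any] {
    guard let number = args["number"] as? String,
          let text = args["text"] as? String else {
        return ["error": "number and text are required"]
    }
    // A real action would send this payload to an SMS service
    // (e.g. via URLSession); here we just echo what would be sent.
    let payload: [String:Any] = ["to": number, "body": text]
    return ["status": "queued", "payload": payload]
}
```

Any chain that needs SMS, whether it is triggered by a chatbot, a mobile app, or a geo-fence event, can end in this one action.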

Potential Drawbacks

So far serverless seems like a no-brainer: after all, reusable code has been the holy grail of development for years. However, moving this code onto a cloud provider comes with its own set of unique limitations. The primary one is latency. Regardless of the abstraction a platform might use, these functions are ultimately called via an HTTP request. This means network performance can limit the speed at which events are sent out and their results received. The on-demand aspect of serverless functions also causes initial latency. Unlike traditional servers, which are always running, these functions need to be spun back up after some period of inactivity. If the execution environment runs something like the JVM, this means there is a not-insignificant delay before the code can actually run. [Chart: cold-start benchmark on AWS Lambda.] Above you can see a benchmark comparing Node, Java and Python on AWS Lambda that was done here.

Serverless functions are also not well suited for long-running processes. Providers generally cap the execution time a function is allowed, or charge extra, so most actions need to be short and simple. The local debugging tools for many of these offerings are also very much in their infancy. This should improve over time, but as of this writing there’s a lot of trial and error, or having to use the provider’s web interface to do debugging. Finally, there is a good amount of vendor lock-in when setting up a serverless backend. Each offering behaves slightly differently under the hood, and most get set up using the interface of the hosting provider and integrate more easily with that provider’s services. This means that if someone wanted to switch from IBM’s OpenWhisk to AWS’s Lambda/Step Functions, they would have to recreate a good chunk of their triggers and rebuild their sequences in a new tool.

An Example using OpenWhisk

To showcase a very simple example, we will be using OpenWhisk as the FaaS provider on a macOS system. We chose OpenWhisk because it is possible to run a local instance by following the directions in its README. For those who don’t want to go through the hassle of setting it up, you can always use the hosted instance on IBM’s Bluemix. Non-macOS users will have to get the tools through their own package manager or from the site itself, but the OpenWhisk instance should work on Windows and Linux as well.

Setting up Openwhisk

This sample uses brew, so if you don’t have it on your macOS system, install it from here or, if you’re feeling lazy, just run this script to install it (taken from the Homebrew site):

/usr/bin/ruby -e "$(curl -fsSL"

And if you already have brew, or even if you just got it, be sure to run brew update before getting started.

You’ll need the following installed to run OpenWhisk locally:

brew cask install vagrant

brew cask install virtualbox

And of course the OpenWhisk CLI located here

*Ignore anything below the CLI download unless you’re using Bluemix.

Once downloaded, just make it executable and move it into your bin directory:

sudo chmod +x wsk
mv wsk /usr/local/bin

With those installed we’ll need to get the OpenWhisk code:

git clone --depth=1 openwhisk

We’ll then need to go to the following directory:

cd openwhisk/tools/vagrant

NOTE: For simplicity’s sake, all files and commands in this sample will live and run in this directory.

Run the following script:


and after a lengthy build process the sample action should print out:

wsk action invoke /whisk.system/utils/echo -p message hello --result
{
    "message": "hello"
}

We then need to point the wsk CLI to the newly spun-up instance (the IP address should be the same):

wsk property set --apihost --auth `vagrant ssh -- cat openwhisk/ansible/files/auth.guest`

Creating The Actions

OpenWhisk supports actions in JavaScript, Python, Java, arbitrary Docker containers and, in the case of this example, Swift, which is unique to OpenWhisk. A Swift action must have a function with the following signature: func main(args: [String:Any]) -> [String:Any]. That is, a function named main, which will be the entry point, that takes in a Dictionary of String:Any and returns a Dictionary of String:Any (essentially JSON in both cases). Keep in mind that OpenWhisk uses the Linux build of Swift, so not all Swift features may be available. A good place to test your actions is the Swift Sandbox, which also executes in a Linux environment. Our first action will take in JSON with the key names and an array of Strings as its value. It will then return all names that aren’t Matt.

// FilterAction.swift
func main(args: [String:Any]) -> [String:Any] {
    if let names = args["names"] as? [String] {
        let nonMatts = names.filter {$0.lowercased() != "matt"}
        return ["names" : nonMatts]
    } else {
        return [ "error" : "No names provided" ]
    }
}

Assuming your OpenWhisk environment is all set up, you can add your action using the following command from Terminal:

wsk action create -i FilterAction FilterAction.swift

(The -i flag is only needed for local/non-HTTPS instances.)

To push any changes you make to your source file just run this command:

wsk action update -i FilterAction FilterAction.swift

To run the action, run this command:

wsk action invoke -i --result FilterAction

and currently you should see:

{
    "error": "No names provided"
}

So let’s pass in some parameters:

wsk action invoke -i --result FilterAction --param-file names.json

where names.json is a file with the following contents:


and we should now see:

{
    "names": [
        ...
    ]
}

Let’s create another action, this time it will sort a list of names alphabetically:

// SortAction.swift
func main(args: [String:Any]) -> [String:Any] {
    if let names = args["names"] as? [String] {
        return ["names" : names.sorted(by: {$0 < $1})]
    } else {
        return [ "error" : "No names provided" ]
    }
}

We can use the same commands to create and then run the action:

wsk action create -i SortAction SortAction.swift

and then invoke it:

wsk action invoke -i --result SortAction --param-file names.json

and you should see the following output:

{
    "names": [
        ...
    ]
}

Now we have two Actions which currently operate independently of each other. In the next section we’ll hook them up so that they can talk to each other.

Creating a Sequence

Next, let’s create a simple sequence that, given the same JSON that we’ve been using, filters out all the Matts and sorts the remaining names. A Sequence is nothing more than an Action that chains multiple actions together where the output of the previous is used as input for the next.

Creating one is very similar to creating an Action, only with the --sequence option and the actions to combine as inputs:

wsk action create -i FilterAndSortAction --sequence FilterAction,SortAction

Let’s fire off our new sequence:

wsk action invoke -i --result FilterAndSortAction --param-file names.json

and we should see our sorted and filtered names:

{
    "names": [
        ...
    ]
}

So our FilterAndSortAction has taken in the JSON file, sent it to FilterAction, which then removes all Matts, sends that filtered JSON to the SortAction, which alphabetically sorts the names, and returns the sorted JSON as our result.

Triggers and Rules

As mentioned in the sections above, these FaaS systems are driven by events. In OpenWhisk these are called Triggers, and they are associated with Actions via Rules. You can set up Rules such that a single Trigger fires multiple Actions, and multiple Actions can share the same Trigger. Triggers and Rules typically shouldn’t be used to get a result back (we’ll see why later); instead, the last Action in a Rule’s Sequence should push its data to the client that needs it (e.g. by sending an email or push notification, or writing to a DB).

Creating a Trigger is simple. Let’s create one that would simulate a new name being added to our list (in a real example this could be a JSON document being updated in your database):

wsk trigger create -i namesUpdated

To see that your Trigger was added run the following command:

wsk trigger list -i

And you should see the following:

/guest/namesUpdated         private

Don’t worry about what all of that means! Our example doesn’t get into namespacing or exposing endpoints publicly. Now that it has been created, let’s fire it:

wsk trigger fire namesUpdated -i --param-file names.json

and you should see the following:

ok: triggered /_/namesUpdated with id <someID>

So the trigger fired but it’s not hooked up to anything. Let’s fix that. To associate a Trigger to an Action or a Sequence we need to create a Rule:

wsk rule create -i nameRule namesUpdated FilterAndSortAction

This creates a Rule named nameRule which associates the namesUpdated Trigger with our FilterAndSortAction Sequence.

Now let’s fire our new rule:

wsk trigger fire namesUpdated -i --param-file names.json

Which gets us this:

ok: triggered /_/namesUpdated with id <someID>

All this tells us is that our Trigger was fired and our nameRule Rule ran with some activation id. To actually see the result, we need the activation id of the latest activation in our FilterAndSortAction Sequence. However, the id we get from the Trigger is the activation of the first Action in our Sequence, not the last. To see the most recent activation, we run the following command, which lists the most recent activation for an Action or Sequence; in our case that will be the activation of the last Action in our FilterAndSortAction Sequence:

wsk activation list --limit 1 -i FilterAndSortAction

We should see the following:

<someID> FilterAndSortAction

That is the actual id we’re looking for. To see its result we must run the following:

wsk activation result -i <someID>

which will give us our sorted and filtered JSON!

{
    "names": [
        ...
    ]
}

Now you can see why Rules and Triggers aren’t ideal for actually getting data, but more for kicking off asynchronous actions that run behind the scenes. If you want the result of an Action/Sequence you should just invoke it directly.

If you’re interested in learning more advanced OpenWhisk features or using the REST/iOS interfaces, check out their GitHub page.

You can also explore some other serverless systems, such as AWS Lambda, Azure Functions, and Google Cloud Functions.


The sample provides a rather trivial example, but it should show the power of being able to combine functions to process data and perform asynchronous actions outside of any particular client. Directly invoking an action is very similar to hitting a particular REST endpoint, and these sequences can often be placed behind an API gateway to do just that. The Trigger and Rule invocation shows that you can create complex event-based systems, where your system fires off multiple actions based on some trigger and invokes various services without the need for direct input from the user. This setup can provide seamless experiences for users who see things react to them, and even proactively receive information, across an ecosystem of modalities, ultimately providing the rich experience we want all users to enjoy.
