CSRF in Echosim.io

Echosim.io is a nice experimental site which puts a virtual Amazon Echo in your browser!

You give the site access to your microphone, and then full control over your Alexa account (which it will keep indefinitely as you are guaranteed to forget you did this), and then you speak your Alexa questions and commands to Echosim.io. It works pretty damn well actually. Props to its developers at @iquariusmedia.

Well, it turns out there are no CSRF defences (the lord giveth props, and the lord taketh props away). This means you can just trick users into uploading your own voice commands to Echosim.io and do whatever you damn well please on their Amazon/Alexa account.

Considering the site is predominantly for use by developers and is in beta, I was reasonably forgiving… but then I took a few moments to look into exactly what you can do with full access to Alexa. Turns out, it’s lots.

To demonstrate, I’ve made my own version of the classic Rick-Roll, entitled the ‘Rick (and Morty) Roll’. It’s a simple HTML page which uploads audio samples of my voice to Echosim.io in succession to do the following:

  • If you have an Amazon Fire device, it plays Rick and Morty on Netflix (your TV turns on if it supports CEC).
  • Changes all connected smart lights to a nice R&M ‘portal green’.
  • Orders 54 Rolls of Andrex Supreme Quilts Toilet Tissue (yes, you will need to cancel these rolls).

You can find that delightful experiment here:

https://hiburn8.org/experiments/rickandmortyroll.html

Now, all Echosim.io needs to do is throw up a ‘same-site’ cookie policy, and the issue goes away. But what I found quite interesting about making that page was learning that chained interactions (Alexa interactions which require more than one command… for example ordering a product from Amazon.com – which requires a ‘yes’ confirmation) do not have the level of security measures in place I just assumed they would.
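For anyone wondering what a ‘same-site’ cookie policy looks like in practice, here is a minimal sketch using Python’s standard library. I don’t know how Echosim.io actually manages sessions, so the cookie name `session` and the helper `build_session_cookie` are purely my own assumptions for illustration.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return the value for a Set-Cookie header with SameSite=Strict.

    The cookie name "session" is a hypothetical choice, not anything
    taken from Echosim.io itself.
    """
    cookie = SimpleCookie()
    cookie["session"] = session_id
    # SameSite=Strict asks the browser not to attach this cookie to
    # requests that originate from another site, which is what a CSRF
    # page relies on.
    cookie["session"]["samesite"] = "Strict"
    # Secure and HttpOnly are not CSRF defences as such, but they are
    # sensible companions for a session cookie.
    cookie["session"]["secure"] = True
    cookie["session"]["httponly"] = True
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header_value = build_session_cookie("abc123")
print("Set-Cookie:", header_value)
```

With that attribute set, a page on someone else’s domain can still fire requests at the site, but the browser should leave the session cookie out of them.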

I expected that chained events like ordering a product or paying a bill would be strung together with a series of tokens, each validating the next request. This would be done to keep the order of events in check, avoid potential race-conditions, and basically just provide an additional level of security and integrity for more complex interactions. Well, apparently it doesn’t work like that… third-party controlling apps, at least, authenticate via an API key, and that key can do anything. The end. There is no contextually aware security; asking the time has the same security as asking to disable your burglar alarm.

Ordering 54 rolls of toilet paper, as the experiment page does, is achieved by first uploading audio of me saying “Order Andrex Supreme Quilts Toilet Tissue, 54 Rolls” followed shortly after by my ‘Yes’ confirmation. I really expected this ‘Yes’ request to require some shared secret from the previous response, but nope… just the API key is fine. The order is placed.
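To make the ‘shared secret from the previous response’ idea concrete, here is a rough sketch of the kind of token chaining I had assumed would exist. To be clear, this is not how Alexa or Echosim.io work; the class `ConfirmationChain` and its methods are hypothetical names of my own. The point is that a blind cross-site request cannot read the first response, so it would never learn the token needed for the ‘yes’.

```python
import hmac
import secrets
from typing import Dict, Optional, Tuple


class ConfirmationChain:
    """Hypothetical server-side store for actions awaiting confirmation."""

    def __init__(self) -> None:
        # Maps an API key to (one-time token, pending action).
        self._pending: Dict[str, Tuple[str, str]] = {}

    def begin(self, api_key: str, action: str) -> str:
        """Record a pending action and return a fresh one-time token.

        The token would be sent back in the response to the first
        request, e.g. "Order 54 rolls".
        """
        token = secrets.token_urlsafe(32)
        self._pending[api_key] = (token, action)
        return token

    def confirm(self, api_key: str, token: str) -> Optional[str]:
        """Return the pending action if the token matches, else None.

        The pending entry is removed on every attempt, so a wrong
        guess cancels the action rather than allowing retries.
        """
        entry = self._pending.pop(api_key, None)
        if entry is None:
            return None
        expected, action = entry
        # Constant-time comparison on bytes to avoid timing leaks.
        if not hmac.compare_digest(expected.encode(), token.encode()):
            return None
        return action


chain = ConfirmationChain()
token = chain.begin("my-api-key", "order toilet tissue x54")
print(chain.confirm("my-api-key", token))
```

Under this scheme the API key alone would not be enough for the follow-up request: the ‘Yes’ would only count if it arrived with the token handed out in the previous response.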

I think the whole authentication and authorisation model of Smart Home Assistants is going to need a pretty good facelift in the next few years, or mine will be going out the window. It feels very much like the days of the WWW before the concept of the Same Origin Policy. In the meantime, be sure you are careful with those 3rd-party apps!

-Daniel