The Register reported recently on the discovery of large amounts of data exposed on Amazon's S3 cloud storage platform. Before I discuss the issues, I want to say that I am an S3 user myself.
The data was exposed through what are popularly referred to as "leaky buckets". Buckets are the basic storage containers on S3, and "leaky" means that the data in a bucket is exposed so that anyone can read it. I discussed this a year or so ago, but here I want to discuss the broader idea of authorization as it relates to data use.
How was the data exposed?
The data was exposed because the people who put it in the cloud didn't specify the proper access permissions. As I noted in the earlier post, the default is to protect the data, and users have to take specific action to expose it. That is, they have to authorize its use. Put simply, authorization is who is allowed to do what. A formal definition from RFC 4949, the Internet Security Glossary, Version 2, says authorization is "An approval that is granted to a system entity to access a system resource." A user is an example of a "system entity".
In the case of the leaky buckets, this means the individuals configuring them did not properly manage the permissions that control authorization.
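As a rough illustration of what "leaky" looks like in practice, the sketch below inspects a bucket's access control list (ACL) for grants to S3's predefined "everyone" groups. It assumes the ACL is a dictionary shaped like the response from boto3's get_bucket_acl call. The function name and both sample ACLs are hypothetical, and ACLs are only one of the mechanisms that can expose a bucket (bucket policies are another), so treat it as a starting point rather than a complete check.

```python
# URIs S3 uses for its predefined "everyone" groups in ACL grants.
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return the permissions an ACL grants to the public groups above.

    `acl` is assumed to look like a get_bucket_acl response:
    {"Grants": [{"Grantee": {...}, "Permission": "READ"}, ...]}
    """
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUPS:
            found.append(grant.get("Permission"))
    return found


# Hypothetical ACLs, made up for illustration only.
leaky_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}
private_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
    ]
}

print(public_grants(leaky_acl))    # the bucket grants READ to everyone
print(public_grants(private_acl))  # no public grants
```

An empty result only says the ACL has no public grants. It does not prove the bucket is private.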
How are users authorized to access data?
In most computer systems, a user somehow proves their identity to the system. Maybe they log in with a password, maybe they provide a PIN, or maybe they use a fingerprint. Once that identity is established, the system can use it to decide what the user is allowed to do. For instance, if the user Mary has a file she allows her group to read, but only allows herself to edit, that defines a set of permissions for the file that controls who has access. If Tom is a member of the group, he is authorized to read the file, but not to edit it. If Luisa is not a member of the group, she is not authorized to read it.
User and group permissions may be a good way to control access on a traditional computer or server, but they are not always practical on the internet. Yes, S3 has a very good permission system (though it is only good if it is used), but that may not be appropriate in all cases. In some situations, an alternative is to encrypt the data and manage the encryption keys to give access. In the example above, Mary could share the encrypted file in such a way that anyone can retrieve it. She could then give Tom and the other members of her group an encryption key they can use to decrypt and read the file. This could be implemented in a few different ways, but that's a bit beyond the scope of this post.
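The Mary, Tom, and Luisa example can be sketched in a few lines of Python. This is a simplified model loosely patterned on owner/group/other file permissions, not how any particular operating system implements them. The class, field, and function names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FilePermissions:
    """Who may do what with one file (hypothetical, simplified model)."""
    owner: str
    group: set = field(default_factory=set)
    owner_can: set = field(default_factory=lambda: {"read", "edit"})
    group_can: set = field(default_factory=lambda: {"read"})
    others_can: set = field(default_factory=set)


def is_authorized(user, action, perms):
    """Return True if `user` is allowed to perform `action` on the file."""
    if user == perms.owner:
        return action in perms.owner_can
    if user in perms.group:
        return action in perms.group_can
    return action in perms.others_can


# Mary owns the file; Tom is in her group; Luisa is not.
marys_file = FilePermissions(owner="Mary", group={"Tom"})

for user in ("Mary", "Tom", "Luisa"):
    for action in ("read", "edit"):
        print(user, action, is_authorized(user, action, marys_file))
```

Note that the check only runs after identity is established: the function trusts that `user` really is who they claim to be.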
It is important on servers, shared systems, cloud storage, and other environments to verify the access permissions for data. There are many tools security professionals can use to assess S3 bucket security and search for unsecured buckets. There are also tools for server and end-user operating systems to scan for potentially incorrect permissions.
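For the server and end-user side, a very small version of such a scan might look like the following. It is a sketch that assumes a POSIX-style system where files carry "other" permission bits. The function name is hypothetical, and real auditing tools check far more than this.

```python
import os
import stat


def world_accessible(root):
    """Walk `root` and list regular files that anyone on the system
    can read or write, based on the "other" permission bits."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or cannot be examined; skip it
            if not stat.S_ISREG(mode):
                continue  # ignore symlinks, sockets, devices, and so on
            flags = []
            if mode & stat.S_IROTH:
                flags.append("world-readable")
            if mode & stat.S_IWOTH:
                flags.append("world-writable")
            if flags:
                findings.append((path, flags))
    return findings
```

Calling world_accessible on a directory that should be private, such as one holding keys or customer exports, returns a list of (path, flags) pairs to review. Whether a given finding is actually a problem depends on the policy for that data.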
The key is to decide, based on the organization's security policy (or plan), who is authorized to do what. Once that is decided (and it can change over time), set permissions accordingly and audit regularly to verify that the permissions actually implement the policy.
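The audit step amounts to comparing what the policy says with what is actually configured. The sketch below assumes both can be written down as a mapping from a resource name to a set of (who, action) pairs. That representation, the function name, and the sample data are all assumptions for illustration.

```python
def audit(policy, actual):
    """Compare intended permissions with configured ones.

    Both arguments map a resource name to a set of (who, action) pairs.
    Returns, per resource, grants that exceed the policy and grants the
    policy calls for that are missing.
    """
    findings = {}
    for resource in sorted(policy.keys() | actual.keys()):
        wanted = policy.get(resource, set())
        granted = actual.get(resource, set())
        excess = granted - wanted
        missing = wanted - granted
        if excess or missing:
            findings[resource] = {"excess": excess, "missing": missing}
    return findings


# Hypothetical policy and configuration.
policy = {"reports": {("Mary", "read"), ("Mary", "edit"), ("Tom", "read")}}
actual = {"reports": {("Mary", "read"), ("Mary", "edit"), ("Tom", "read"),
                      ("everyone", "read")}}

print(audit(policy, actual))
```

Here the audit flags the ("everyone", "read") grant as excess, which is exactly the kind of mismatch behind a leaky bucket.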
To your safe computing,