Author: Karthik Padmanabhan
Karthik has been with PageUp for over 5 years. Having originally joined in a Technical Advisor role, he is now our Head of Architecture and has been a key contributor to our AWS architecture and strategy.
Some time ago at PageUp, we adopted an AWS multi-account strategy. Today we have about 80 AWS accounts.
The multi-account strategy enabled autonomy and ownership at PageUp but, like anything, it has trade-offs.
Even with a handful of accounts, managing elevated access was a big challenge for us. We did not have the luxury of a person dedicated to this, nor did we want to centralise responsibility for it.
So, when it came to a multi-account strategy, we knew it would pay to be proactive and automated.
As you may have figured out by now, this article discusses how we at PageUp automated on-demand elevated access to AWS for our development teams.
Principles
We tried to stick to the “principle of least privilege.”
- Everyone has ReadOnly access to AWS Accounts they belong to by default.
- Only provide elevated access to AWS accounts that are requested/needed.
- Elevation of access is only for exceptional situations, and it is on-demand, peer-reviewed, auditable, time-bound and automated.
The last principle in particular reflects the importance PageUp places on Continuous Deployment and Infrastructure as Code practices.
Intended experience
Elevated AWS access experience @PageUp
Most of the time, developers require elevated access to AWS only when they are dealing with a production situation, so we wanted to ensure the experience is as straightforward and pain-free as possible.
In Crowd — The solution
AWS State Machine
Given that the process requires some form of state machine, our readily available choice was AWS Step Functions.
We already had some experience with AWS Landing Zone, which we had deployed not long before, so the team was quite comfortable with it.
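To make the idea of a state machine concrete, here is a minimal Python sketch of the states an elevation request might pass through. The state names and transitions are assumptions for illustration, drawn from the principles above (peer-reviewed, time-bound); this is not the actual Step Functions definition used at PageUp.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"  # developer has asked for elevated access
    APPROVED = "approved"    # a peer has reviewed and approved the request
    REJECTED = "rejected"    # a peer has declined the request
    ELEVATED = "elevated"    # elevated access is currently active
    REVOKED = "revoked"      # time window elapsed and access was removed


# Allowed transitions (hypothetical, for illustration only).
TRANSITIONS = {
    State.REQUESTED: {State.APPROVED, State.REJECTED},
    State.APPROVED: {State.ELEVATED},
    State.ELEVATED: {State.REVOKED},
    State.REJECTED: set(),
    State.REVOKED: set(),
}


def advance(current: State, target: State) -> State:
    """Return the target state if the transition is allowed, else raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target


def run_happy_path() -> list:
    """Walk a request through approval, elevation and revocation."""
    state = State.REQUESTED
    history = [state]
    for target in (State.APPROVED, State.ELEVATED, State.REVOKED):
        state = advance(state, target)
        history.append(state)
    return history
```

Keeping the transitions in a table means a request can never jump straight from requested to elevated without an approval step, which is the property the peer-review principle asks for.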
Organising AWS Accounts
As I mentioned above, we have more than 80 accounts; we use a combination of AWS Organizational Units and PageUp AWS Domains to organise them.
Each PageUp AWS Domain has seven accounts. The creation and management of these AWS accounts are handled by a Continuous Deployment system called Domain Vending Machine (it uses AWS’s Account Vending concept, a subject for another day).
Example: HR Organisation is a big thing in PageUp, and so is Screening (in the context of recruitment). Each has its own PageUp AWS Domain.
IAM User & Group management
All our users, groups and their associated permissions reside in a Git repo, which allows for easy audit and change management. As you can see below, each user is associated with groups, each representing a PageUp AWS Domain.
Example
users:
  karthikp:
    groups:
      - SoftwareDevelopers
      - RecruitmentDomain
      - CareersDomain
  terence:
    groups:
      - SoftwareDevelopers
      - RecruitmentDomain
      - CareersDomain
  ...
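As a rough sketch of how such a file could be consumed, the snippet below mirrors the example as a Python dictionary and checks group membership. The function names are hypothetical, and the dictionary stands in for the parsed YAML so that no parser is needed.

```python
# Mirrors the structure of the YAML example as a plain dictionary,
# so the sketch needs no YAML parser.
USERS = {
    "karthikp": {
        "groups": ["SoftwareDevelopers", "RecruitmentDomain", "CareersDomain"],
    },
    "terence": {
        "groups": ["SoftwareDevelopers", "RecruitmentDomain", "CareersDomain"],
    },
}


def groups_for(users: dict, username: str) -> list:
    """Return the groups listed for a user, or an empty list if unknown."""
    return list(users.get(username, {}).get("groups", []))


def belongs_to(users: dict, username: str, group: str) -> bool:
    """True when the user is listed as a member of the given group."""
    return group in groups_for(users, username)
```

A check like `belongs_to(USERS, "karthikp", "CareersDomain")` is the kind of lookup an elevation workflow could run before accepting a request, although whether the real system does so is not stated here.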
Groups such as “SoftwareDevelopers” provide access to standard IAM actions that are basic for all developers.
For every PageUp AWS Domain, we have two policies as highlighted below.
groups:
  Screening:
    policy_file: Screening/Screening.j2
  ScreeningMaintenance:
    policy_file: Screening/ScreeningMaintenance.j2
  ...
The maintenance policy is used for elevation and has PowerUserExecutionRole or a similar role associated with the group.
{# Default Policy sample #}
{
  "Statement": [ {
    "Sid": "AllowAssumeScreeningUserExecutionRole",
    "Effect": "Allow",
    "Action": [ "sts:AssumeRole" ],
    "Condition": {
      "Bool": {
        "aws:MultiFactorAuthPresent": "true"
      }
    },
    "Resource": [
      {# screening_deploy #}
      "arn:aws:iam::xxx:role/ReadOnlyExecutionRole",
      {# screening_staging #}
      "arn:aws:iam::xxxx:role/ReadOnlyExecutionRole",
      ....

{# Maintenance policy sample #}
{
  "Statement": [ {
    "Sid": "AllowAssumeScreeningUserExecutionRole",
    "Effect": "Allow",
    "Action": [ "sts:AssumeRole" ],
    "Condition": {
      "Bool": {
        "aws:MultiFactorAuthPresent": "true"
      }
    },
    "Resource": [
      {# screening_deploy #}
      "arn:aws:iam::xxx:role/PowerUserExecutionRole",
      {# screening_staging #}
      "arn:aws:iam::xxxx:role/PowerUserExecutionRole",
      ....
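The two samples differ only in the role name. A minimal Python sketch of that idea follows: it builds the same statement for a list of account IDs and a role name. The function name and the account IDs are placeholders rather than values from PageUp, and the real policies are Jinja2 templates, not Python.

```python
import json


def assume_role_policy(account_ids, role_name,
                       sid="AllowAssumeScreeningUserExecutionRole"):
    """Build a policy allowing sts:AssumeRole on role_name in each account,
    only when the caller authenticated with MFA."""
    return {
        # "Version" is not shown in the truncated samples; added here as the
        # current IAM policy language version.
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Action": ["sts:AssumeRole"],
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
                "Resource": [
                    f"arn:aws:iam::{account_id}:role/{role_name}"
                    for account_id in account_ids
                ],
            }
        ],
    }


# Placeholder account IDs, for illustration only.
ACCOUNTS = ["111111111111", "222222222222"]

default_policy = assume_role_policy(ACCOUNTS, "ReadOnlyExecutionRole")
maintenance_policy = assume_role_policy(ACCOUNTS, "PowerUserExecutionRole")

print(json.dumps(maintenance_policy, indent=2))
```

Because both policies come from one function, the MFA condition cannot drift between the default and the maintenance variant.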
Circling back to the workings
The system knows that every PageUp AWS Domain has two policies: a default one and a maintenance one. Once a request is approved, it attaches the corresponding maintenance policy to the user for a configured amount of time.
A simple explanation of the system
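The grant-then-expire behaviour can be sketched in a few lines of Python. This is an in-memory illustration only: the class, the `Maintenance` suffix rule (taken from the `Screening` / `ScreeningMaintenance` example) and the one-hour default are assumptions, and a real system would call IAM rather than update a dictionary.

```python
from datetime import datetime, timedelta, timezone


class ElevationTracker:
    """Tracks which users currently hold a domain's maintenance group."""

    def __init__(self):
        # Maps (username, group name) to the time the grant expires.
        self._grants = {}

    @staticmethod
    def maintenance_group(domain: str) -> str:
        # Naming follows the Screening / ScreeningMaintenance example.
        return f"{domain}Maintenance"

    def grant(self, username, domain, now, duration=timedelta(hours=1)):
        """Record an approved elevation that expires after `duration`."""
        group = self.maintenance_group(domain)
        self._grants[(username, group)] = now + duration
        return group

    def is_elevated(self, username, domain, now) -> bool:
        """True while the user's grant for the domain has not expired."""
        expiry = self._grants.get((username, self.maintenance_group(domain)))
        return expiry is not None and now < expiry

    def revoke_expired(self, now) -> list:
        """Drop grants whose window has passed and return what was revoked."""
        expired = [key for key, expiry in self._grants.items() if expiry <= now]
        for key in expired:
            del self._grants[key]
        return expired


# Example usage with a fixed clock so the behaviour is easy to follow.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
tracker = ElevationTracker()
tracker.grant("karthikp", "Screening", now=start)
```

Passing `now` in explicitly, instead of reading the clock inside the class, keeps the expiry logic easy to test.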
It has been a good 18 months since we put it in place, and we have not had to touch it since, because it’s one of those simple solutions that just works.
Credits
Credit for designing and building this simple solution goes to my team of excellent developers, current and past.
Terence Tham: designed the overall solution.
Colin Scott: among several things, he named it “In Crowd.”
Jack G: contributed to “PageUp-AWS”