The AWS Well-Architected Framework is a set of documents and tools that sets out best practices and procedures for building high-quality applications in the cloud on AWS.
The core of the Well-Architected Framework is 5 pillars that group the important considerations and practices to follow when building on AWS.
The 5 Pillars of the AWS Well-Architected Framework
- Security
- Reliability
- Performance Efficiency
- Operational Excellence
- Cost Optimization
The core tenet of the Well-Architected Framework is that most of what you need to do has already been done, and its problems already navigated, by others. The best practices drawn from many deployments are the ones to follow when building your own products on AWS.
Amazon also provides a tool that helps you review your current workload against these best practices and identify where the design can be improved:
https://aws.amazon.com/well-architected-tool/
More detailed explanations of the 5 pillars of the Well-Architected Framework, along with links to the extensive whitepapers on each of them, can be found here: https://aws.amazon.com/blogs/apn/the-5-pillars-of-the-aws-well-architected-framework/
For most purposes, though, the brief explanation below should give you enough of an understanding of them.
The general design principles, which can be applied to any project, fall under the 5 pillars of the Well-Architected Framework. Each pillar has a number of topics that help guide you to an excellent design.
AWS Well-Architected Framework – Security
- Implement a strong identity foundation.
- The principle of least privilege means that you grant no permissions by default and specifically grant permissions and rights to perform actions and access resources. Also, when you do grant them, you give the minimum necessary.
- Separate duties and take advantage of users, roles and policies to manage access.
- Manage privileges centrally and issue short-term credentials where possible.
- Enable Traceability.
- Make sure that you have the ability to monitor actions and analyse them afterwards so that you can find out who or what did a particular action.
- Apply security at all layers.
- Don’t just protect the outer perimeter of your system. Implement protection at all levels and wherever components communicate with each other.
- Automate security best practices.
- Don’t leave the setup and maintenance of security to humans. People should set the policy and then as much as possible, it should be managed as code.
- Protect data in transit and at rest.
- Understand which bits of your data could be sensitive and understand how it is moved around and where it goes. Take advantage of methods for protecting that data.
- Prepare for a security event.
- Sooner or later it is likely that a breach or security event will happen. Have a plan for how to act when that happens. Run mock security breaches or events to help you get your processes in place and even expose holes and issues.
The Well-Architected Framework Security Pillar whitepaper:
https://d1.awsstatic.com/whitepapers/architecture/AWS-Security-Pillar.pdf
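As a minimal sketch of the least-privilege principle above, the Python below builds an IAM policy document that allows read-only access to a single S3 bucket and nothing else, then checks a policy for bare `*` wildcards. The bucket name `example-reports` and the helper function names are hypothetical and chosen only for illustration; the JSON layout follows the standard IAM policy grammar.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListOneBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjectsInOneBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def find_wildcard_statements(policy: dict) -> list:
    """Return the Sids of Allow statements whose Action or Resource is a bare '*'.

    This is a deliberately simple check: it does not catch partial
    wildcards such as 's3:*'.
    """
    flagged = []
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if "*" in actions or "*" in resources:
            flagged.append(statement.get("Sid", "<no Sid>"))
    return flagged


if __name__ == "__main__":
    policy = read_only_bucket_policy("example-reports")  # hypothetical bucket name
    print(json.dumps(policy, indent=2))
    print("Overly broad statements:", find_wildcard_statements(policy))
```

Keeping the policy in code like this also supports the "automate security best practices" point: the wildcard check can run in a build pipeline before any policy is deployed.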
AWS Well-Architected Framework – Reliability
- Test Recovery processes.
- It’s considered essential to test your application, whether inside or outside of the cloud, but it’s rare to test recovery procedures. In the cloud, you can easily test them, either by spinning up an identical environment and causing it to fail or, once you are confident in how it will react and recover by itself, by simulating an actual failure in your production system.
- This lets you make sure that your processes are rock solid.
- Automate recovery from failure.
- Don’t rely on humans to wake up and follow some procedure to recover your system. By implementing your system in code and taking advantage of the built-in high availability options in AWS, it is possible to have your application recover fully from most scenarios.
- Scale horizontally to increase aggregate system availability.
- Machines fail. Given enough time, any server will crash eventually. By scaling horizontally and replacing 1 big machine with many smaller ones, you can easily tolerate the loss of one or more machines and still be operational while they are automatically replaced.
- Stop guessing capacity.
- Most capacity planning involves some degree of guesswork, which is almost always destined to be suboptimal. You either massively over-provision and waste money, or you under-provision (perhaps in relation to usage spikes) and your system crashes. In AWS, you can monitor usage and automate the process of scaling your system to cope with changes in demand.
- Manage change automatically.
- Make changes through automation: change the automation and its rules, and let the automation change the infrastructure. Don’t change the underlying resources by hand, as this is error-prone.
The Well-Architected Framework Reliability Pillar whitepaper: https://d1.awsstatic.com/whitepapers/architecture/AWS-Reliability-Pillar.pdf
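Two of the reliability principles above can be illustrated with simple arithmetic. The sketch below assumes that instances fail independently and that any single surviving instance can serve the load, which is a simplification; under that assumption, n instances each available a fraction a of the time give an aggregate availability of 1 - (1 - a)^n. The second function is a simplified proportional scaling rule for illustration only, not the exact algorithm any AWS service uses, and all of the numbers are hypothetical.

```python
import math


def aggregate_availability(instance_availability: float, instances: int) -> float:
    """Availability of a group where at least one instance must be up.

    Assumes independent failures, so the group is down only when every
    instance is down at the same time.
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("instance_availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return 1.0 - (1.0 - instance_availability) ** instances


def desired_capacity(current_instances: int, observed_load: float,
                     target_load: float, minimum: int, maximum: int) -> int:
    """Scale the instance count in proportion to observed vs. target load.

    For example, if average CPU is 90% and the target is 60%, the group
    needs roughly 1.5 times as many instances. The result is clamped to
    the configured minimum and maximum.
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    wanted = math.ceil(current_instances * observed_load / target_load)
    return max(minimum, min(maximum, wanted))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "instance(s):", aggregate_availability(0.99, n))
    print("Scale out to:", desired_capacity(4, 90.0, 60.0, minimum=2, maximum=10))
    print("Scale in to:", desired_capacity(4, 15.0, 60.0, minimum=2, maximum=10))
```

The minimum and maximum bounds matter in practice: the minimum keeps enough machines running to tolerate a failure, and the maximum caps spend if the load metric misbehaves.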
AWS Well-Architected Framework – Performance Efficiency
- Democratize advanced technologies
- It is inefficient to have your team learn every advanced technology that you use. Instead, let the cloud provider be experts in that field and free up your staff’s time to focus on other tasks that are specific to your business.
- Go global in minutes.
- In AWS, most services are available in multiple regions throughout the world. What works in one region will generally work in others, allowing you to deploy closer to your customers to reduce latency, as well as to deal with data protection and locality concerns.
- Use serverless architectures
- Where possible, use services rather than managing your own servers. This frees up time and allows extra efficiencies and even cost savings where capacity can more closely match demand.
- Experiment more often.
- Because so many resources are automatable, it is easy to spin tests and experiments up and then destroy them afterwards, making them easy to do and cheap.
- Mechanical sympathy.
- Use the best technology for what you want to do, and don’t get locked into systems just because you have experience of managing them.
The Well-Architected Framework Performance Efficiency Pillar whitepaper: https://d1.awsstatic.com/whitepapers/architecture/AWS-Performance-Efficiency-Pillar.pdf
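The "go global" principle combines two concerns: latency and data locality. As a rough sketch, the function below picks the lowest-latency region from those that your data-residency rules permit. The latency figures and the allowed-region set are hypothetical inputs that you would measure or decide yourself; only the region identifiers are real AWS region names.

```python
def pick_region(latency_ms: dict, allowed_regions: set) -> str:
    """Return the allowed region with the lowest measured latency.

    latency_ms maps a region name to a latency measured from where your
    customers are; allowed_regions holds the regions permitted by your
    data protection and locality requirements.
    """
    candidates = {
        region: ms for region, ms in latency_ms.items() if region in allowed_regions
    }
    if not candidates:
        raise ValueError("no measured region satisfies the locality constraint")
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical measurements from a customer base in western Europe.
    measured = {
        "us-east-1": 95.0,
        "eu-west-1": 24.0,
        "eu-central-1": 31.0,
        "ap-southeast-2": 280.0,
    }
    eu_only = {"eu-west-1", "eu-central-1"}
    print("Deploy to:", pick_region(measured, eu_only))
```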
AWS Well-Architected Framework – Operational Excellence
- Perform operations as code
- Specifying all of your infrastructure as code makes it testable, reproducible and easily changeable. It can be peer reviewed, which massively reduces the chances of someone making a mistake or forgetting to set something up.
- Annotated documentation
- You can automatically create documentation on your systems after every build, reducing the workload. This can also be used as further input to your code.
- Make frequent, small, reversible changes.
- Small changes that you can roll back carry far less risk and allow for much more rapid development.
- Refine operations procedures frequently
- Constantly look to improve procedures as you learn more about both the technology and your specific needs.
- Anticipate failure
- Failure will happen. Look for weaknesses and conduct regular assessments to discover points of failure and where you can mitigate them. Plan your actions in the event of a failure.
- Learn from all operational failures
- Conduct post mortems to see where better actions could be taken and where changes can remove failure points.
The Well-Architected Framework Operational Excellence Pillar whitepaper: https://d1.awsstatic.com/whitepapers/architecture/AWS-Operational-Excellence-Pillar.pdf
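To make "perform operations as code" concrete, here is a minimal sketch that generates a CloudFormation template for a versioned, tagged S3 bucket from a Python function. The logical ID `ReportsBucket`, the `environment` tag and the function names are hypothetical; the template keys (`AWSTemplateFormatVersion`, `Resources`, `AWS::S3::Bucket`, `VersioningConfiguration`) are standard CloudFormation. This is one possible approach, not the only way to manage infrastructure as code.

```python
import json


def bucket_template(logical_id: str, environment: str) -> dict:
    """Build a CloudFormation template describing one versioned S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Versioned bucket for the {environment} environment",
        "Resources": {
            logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Versioning lets an overwritten or deleted object be restored.
                    "VersioningConfiguration": {"Status": "Enabled"},
                    # Tags make the resource traceable in cost reports.
                    "Tags": [{"Key": "environment", "Value": environment}],
                },
            }
        },
    }


def render(template: dict) -> str:
    """Serialise with sorted keys so identical input gives identical text."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    # The same function produces dev and prod definitions that differ
    # only in the values passed in, which keeps environments consistent.
    print(render(bucket_template("ReportsBucket", "dev")))
```

Because the output is deterministic, the rendered template can be committed to version control and peer reviewed, and a diff between two versions shows exactly what a change will alter.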
AWS Well-Architected Framework – Cost Optimization
- Adopt a consumption model
- Pay only for what you use and need. Can you turn off dev and test environments while your staff are not at work?
- Measure overall efficiency.
- Measure the business output of your workload and the cost of delivering it, so that you understand what gains you make by increasing output or reducing cost.
- Stop spending money on data center operations
- Let Amazon focus on the hardware and instead focus on what is specific to your company and needs.
- Analyze and attribute expenditure
- Make sure that you take advantage of things like tagging resources so that you can tell where the money is being spent and from there measure returns and efficiency as well as identify and fix inefficiency.
- Use managed services to reduce cost of ownership
- Using systems managed by someone else reduces the costs associated with running them.
The AWS Well-Architected Framework Cost Optimization Pillar whitepaper: https://d1.awsstatic.com/whitepapers/architecture/AWS-Cost-Optimization-Pillar.pdf
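Two of the cost principles above lend themselves to a quick calculation. The sketch below estimates the fraction of compute hours saved by switching an environment off outside working hours, and groups billing line items by a tag to attribute spend. It assumes purely per-hour billing for the running resources (storage and similar charges continue while an environment is off), and the line-item structure, the `team` tag key and all amounts are hypothetical.

```python
from collections import defaultdict

HOURS_PER_WEEK = 7 * 24  # 168


def off_hours_saving_fraction(hours_on_per_week: float) -> float:
    """Fraction of always-on compute hours avoided by a weekly schedule."""
    if not 0 <= hours_on_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours_on_per_week must be between 0 and 168")
    return 1.0 - hours_on_per_week / HOURS_PER_WEEK


def cost_by_tag(line_items: list, tag_key: str) -> dict:
    """Sum cost per value of tag_key; items without the tag go to 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


if __name__ == "__main__":
    # A dev environment that runs 10 hours a day, 5 days a week.
    saving = off_hours_saving_fraction(10 * 5)
    print(f"Compute hours avoided: {saving:.0%}")

    # Hypothetical billing line items.
    items = [
        {"cost": 12.5, "tags": {"team": "web"}},
        {"cost": 7.5, "tags": {"team": "web"}},
        {"cost": 40.0, "tags": {"team": "data"}},
        {"cost": 5.0, "tags": {}},
    ]
    print(cost_by_tag(items, "team"))
```

A large "untagged" total is itself a useful signal: it shows how much spend cannot yet be attributed to anyone.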
If you prefer to watch or listen to learn about the well architected framework in AWS then this is a good video that covers it quite well: