Introduction to the AWS Cloud
Define what the AWS Cloud is and describe the basic global infrastructure
Cloud Computing: the on demand delivery of IT resources and applications via the internet
- Instead of having to design and build our data centers, we can access a data center and all of its resources over the internet
- Allows us to scale as computing demand goes up, without having to plan capacity in advance
- Frees companies from the limitations of running their own servers
Scalable computing platform - AWS CLOUD
ALL THREE (console, CLI, SDKs) ACCESS THE AWS API
AWS CLI
- Open source tool built for interacting with AWS services
- Environments:
- Linux shells: Linux, macOS, Unix
- Windows: PowerShell or Windows Command Processor
- Remotely: Run commands on Amazon EC2 instances through SSH or with Amazon EC2 Systems Manager
AWS SDKs allow you to manage infrastructure as code (Python's Boto, for example)
Knowledge Check: The power to scale computing up or down easily?
- Elasticity
Core Services
EC2 Elastic Compute Cloud
- Cloud hosted compute resources that can be elastic (increase or decrease instances depending on demand)
- Called EC2 Instances
- Pay as you go
- Broad selection of Hardware and Software
- Selection of where to host (Global hosting)
Product Demonstration
- Login to console
- Choose a region
- Launch EC2 Wizard
- Select AMI (software, or SW)
- Amazon Machine Image
- Select instance type (Hardware or HW)
- Configure network
- Configure storage
- Configure key pairs
- Launch & connect
Elastic Block Store
Designed to be available and durable
You can change sizes without shutting down the instance
S3
- Managed cloud storage service
- Store virtually unlimited number of objects
- Access any time, from anywhere
- Rich security controls
You can access S3 through the console, CLI, and SDKs
Virtual Private Cloud (VPC)
Integrated Services
Application Load Balancer
The second type of load balancer.
- Supported request protocols
- CloudWatch Metrics
- Access Logs
- Health Checks
Features:
- Ability to add path and host-based routing
- Native IPv6 support
- AWS WAF
- Dynamic Ports
- Deletion Protection & Request Tracing
Example: The ability to use containers to host microservices and route to those applications from the load balancer.
Application Load Balancer allows you to route different requests to the same instance, with each path mapped to a different port.
Auto Scaling
Auto Scaling helps you ensure that you have the correct number of EC2 instances available to handle the load for your application. It answers two questions:
- How can I ensure that my workload has enough EC2 resources to meet fluctuating performance requirements? - SCALABILITY
- How can I automate EC2 resource provisioning to occur on-demand? - AUTOMATION
Scaling out - adding more instances
Scaling in - terminating instances
- Launch Configuration
- What will be launched by auto scaling?
- AMI
- Instance type
- Security Groups
- Roles
- Auto Scaling Group
- Where a deployment takes place and some boundaries for deployment
- VPC and Subnets
- Load balancer
- Minimum instances
- Maximum instances
- Desired capacity
- Auto Scaling Policy
- When to launch or terminate EC2 instances
- Scheduled
- On-demand
- Scale-out policy
- Scale-in policy
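The interaction between scaling policies and the group boundaries (minimum, maximum, desired capacity) can be sketched in plain Python. This is only an illustration, not the AWS API: the `AutoScalingGroup` class and `apply_policy` function are hypothetical names, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class AutoScalingGroup:
    """Hypothetical stand-in for an Auto Scaling group's capacity settings."""
    min_size: int
    max_size: int
    desired_capacity: int


def apply_policy(group: AutoScalingGroup, adjustment: int) -> int:
    """Apply a scale-out (+n) or scale-in (-n) adjustment.

    The result is clamped to the group's minimum and maximum boundaries.
    """
    target = group.desired_capacity + adjustment
    group.desired_capacity = max(group.min_size, min(group.max_size, target))
    return group.desired_capacity


group = AutoScalingGroup(min_size=2, max_size=6, desired_capacity=2)
print(apply_policy(group, +3))   # scale out: 2 -> 5
print(apply_policy(group, +3))   # capped at the maximum: 6
print(apply_policy(group, -10))  # floored at the minimum: 2
```

Whatever the policy asks for, the group never leaves the range between minimum and maximum instances.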
Route 53
DNS - Domain Name System
Route 53 is a DNS service that routes users to applications.
DNS is a reliable way to route end users to endpoints
DNS Resolution Strategies
- Simple routing
- Geo-location
- Failover
- Weighted round robin
- Latency-based
- Multi-value answer
Domain registration, global and highly available DNS, public and private DNS names, multiple routing algorithms, both IPv4 and IPv6, integrated with other AWS cloud services
RDS (Relational Database Service)
Challenges of Relational Databases:
- Server maintenance and energy footprint
- Software install and patches
- Database backups and high availability
- Limits on scalability
- Data security
- OS install and patches
RDS is a managed service that sets up and operates a relational database in the cloud
You manage:
- Application optimization
AWS manages:
- OS installation and patches
- Database software install and patches
- Database backups
- High availability
- Scaling
- Power and rack & stack
- Server maintenance
Creates a standby instance in a different Availability Zone; if the main instance goes down, the application fails over to the standby
Read Replicas:
Summary: highly scalable, high performance, easy to administer, available and durable, secure and compliant
AWS LAMBDA
- Fully-managed serverless compute
- Event-driven execution
- Sub-second metering
- Multiple languages supported
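A minimal sketch of what event-driven execution looks like in Python. The `(event, context)` handler shape is the one Python Lambda functions use; the function name `handler` and the `"name"` field in the event are assumptions chosen for illustration, and the call at the bottom is a local invocation, not a deployment.

```python
import json


def handler(event, context):
    """Entry point in the (event, context) shape used by Python Lambda functions.

    The "name" key in the event is a hypothetical field chosen for illustration.
    """
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"Hello, {name}!"})}


# Local invocation with a hand-built event; no AWS account is involved here.
print(handler({"name": "AWS"}, None))
```

The function holds no server state: each event arrives, is processed, and the result is returned.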
AWS Elastic Beanstalk
How can I quickly get my application into the cloud? -> AWS Elastic Beanstalk
- Platform as a Service
- Allows quick deployment of your applications
- Reduces management complexity
- Keeps control in your hands
- Choose your instance type
- Choose your database
- Set and adjust Auto Scaling
- Update your application
- Access server log files
- Enable HTTPS on load balancer
- Supports a large range of platforms
- Packer Builder
- Single Container, Multicontainer, or Preconfigured Docker
- Go
- Python
- PHP
- Ruby
- Node.js
- …
- Components
- Update your application as easily as you deploy it
Simple Notification Service (SNS)
- Flexible, fully managed pub/sub messaging and mobile communications service
- Coordinates the delivery of messages to subscribing endpoints and clients
- Easy to set up, operate, and send reliable communications
- Decouple and scale microservices, distributed systems and serverless applications
CloudWatch
Monitors your AWS resources and the applications you run on AWS in real time
Features: Collect and track metrics -> collect and monitor log files -> set alarms -> automatically react to changes
Use Cases:
- Respond to state changes in your AWS resources
- Automatically invoke an AWS Lambda function to update DNS entries when an event notifies that an Amazon EC2 instance enters the Running state
- Direct specific API records from CloudTrail to a Kinesis stream for detailed analysis of potential security or availability risks
- Take a snapshot of an Amazon EBS volume on a schedule
- Log S3 Object Level Operations Using CloudWatch Events
Components:
Metrics
- Data about the performance of the systems
- Represents a time-ordered set of data points that are published to CloudWatch
- By default, several services provide free metrics for resources
- Such as EC2 instances, EBS volumes, and RDS DB instances
- Publish your own application metrics
- Load all the metrics in your account for search, graphing, and alarms
Alarms
- Watches a single metric
- Performs one or more actions
- Based on the value of the metric relative to a threshold over a number of time periods
- The action can be
- EC2 action
- An auto scaling action
- A notification sent to an SNS topic
- Invokes actions for sustained state changes only
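The "sustained state changes only" behaviour can be sketched as follows. This is a simplified local model, not the CloudWatch API: `alarm_state` is a hypothetical function, the CPU values are invented, and only a "greater than threshold" comparison is modelled.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return "ALARM" only if the last `evaluation_periods` values all exceed the threshold.

    Simplified: real alarms also support other comparison operators,
    "M out of N" evaluation, and missing-data handling.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(value > threshold for value in recent) else "OK"


cpu = [40, 85, 50, 82, 88, 91]  # hypothetical CPU utilization, one value per period
print(alarm_state(cpu[:3], threshold=80, evaluation_periods=3))  # brief spike only
print(alarm_state(cpu, threshold=80, evaluation_periods=3))      # sustained breach
```

A single spike above the threshold does not trigger the action; three consecutive breaching periods do.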
Events
- Near real-time stream of system events that describe changes in AWS resources
- Use simple rules to match events and route them to one or more target functions or streams
- Become aware of operational changes as they occur
- Respond to these operational changes and take corrective action as necessary
- Schedule automated actions that self-trigger at certain times using cron or rate expressions
Logs
- Monitor and troubleshoot systems and applications using existing log files
- Monitor logs for specific phrases, values or patterns
- Retrieve the associated log data from CloudWatch Logs
- Includes an installable agent for Ubuntu, Amazon Linux, and Windows at no additional charge
CloudWatch Logs Features
- Monitor Logs from Amazon EC2 Instances in Real-time
- Monitor AWS CloudTrail Logged Events
- Archive Log Data
Dashboards
- Customizable home pages in the CloudWatch console to monitor your resources in a single view
- Even those resources that are spread across different regions
- Create customized views of the metrics and alarms for your AWS resources
- Each dashboard can display multiple metrics, and can be accessorized with text and images
- Create dashboards by using the console, the AWS CLI, or by using the PutDashboard API
CloudFront Overview:
- Global, Growing Network
- Secure Content at the Edge
- Deep integration with key AWS services
- High Performance
- Cost effective
- Easy to use
CloudFormation
CloudFormation simplifies the task of repeatedly and predictably creating groups of related resources that power your applications
- Fully-managed service
- Create, update and delete resources in stacks
VERY SIMILAR TO TERRAFORM
Template Files
- Resources to provision
- Text file
- JSON or YAML format
- Self-documenting environment
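To make the template-file idea concrete, here is a small sketch that builds a minimal JSON template in Python. The top-level keys and the `AWS::S3::Bucket` resource type are standard CloudFormation; the logical name `NotesBucket` and the description text are hypothetical, and nothing here is sent to AWS.

```python
import json

# Minimal template: one S3 bucket. "NotesBucket" is a hypothetical logical name.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: a single S3 bucket managed as a stack resource",
    "Resources": {
        "NotesBucket": {"Type": "AWS::S3::Bucket"},
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

Because the template is just text, it can be versioned and reviewed like any other source file.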
Architecture
The AWS Well-Architected Framework
- Assess and improve architectures
- Understand how design decisions impact business
- Learn the five pillars and design principles
- Security
- Identity and access management (IAM)
- Detective controls
- Infrastructure protection
- Data protection
- Incident response
- DESIGN PRINCIPLES:
- Implement security at all layers
- Enable traceability
- Apply principle of least privilege
- Focus on securing your system
- Automate
- Reliability
- Recover from issues/failures
- Apply best practices in:
- Foundations
- Change management
- Failure management
- Anticipate, respond to, and prevent failures
- DESIGN PRINCIPLES:
- Test recovery procedures
- Automatically recover
- Scale horizontally
- Stop guessing capacity
- Manage change in automation
- Performance efficiency
- Select customizable solutions
- Review to continually innovate
- Monitor AWS services
- Consider the trade-offs
- DESIGN PRINCIPLES:
- Democratize advanced technologies
- Go global in minutes
- Use serverless architectures
- Experiment more often
- Have mechanical sympathy
- Cost optimization
- Use cost-effective resources
- Matching supply with demand
- Increase expenditure awareness
- Optimize over time
- DESIGN PRINCIPLES:
- Adopt a consumption model
- Measure overall efficiency
- Reduce spending on data center operations
- Analyze and attribute expenditure
- Use managed services
- Operational Excellence
- Manage and automate changes
- Respond to events
- Define the standards
Fault Tolerance and High Availability
Fault Tolerance:
- Ability of a system to remain operational
- Built-in redundancy of an application’s components
High Availability:
- Systems are generally functioning and accessible
- Downtime is minimal
- Minimal human intervention is required
- Minimal up-front financial investment
High Availability: On Premises vs AWS
- Traditional (on premises)
- Expensive
- Only mission-critical applications
- AWS
- Multiple servers
- Availability zones
- Regions
- Fault-tolerant services
High Availability Service Tools
- Elastic load balancers
- Distributes incoming traffic (loads)
- Sends metrics to Amazon CloudWatch
- Triggers/notifies
- High latency
- Over utilization
- Elastic IP addresses
- Are static IP addresses
- Mask failures (if they were to occur)
- Continues to access applications if an instance fails
- Amazon Route 53
- Authoritative DNS service
- Translates domain names into IP addresses
- Supports
- Simple routing
- Latency-based routing
- Health checks
- DNS failovers
- Geo-location routing
- Auto scaling
- Terminates and launches instances based on conditions
- Assists with adjusting or modifying capacity
- Creates new resources on demand
- Amazon CloudWatch
- Distributed statistics gathering system
- Tracks your metrics of your infrastructure
- Create and use your own custom metrics
- Used with auto scaling
Fault Tolerant Tools
- Amazon Simple Queue Service (SQS)
- Amazon Simple Storage Service (S3)
- Amazon Relational Database Service (RDS)
Security
Introduction to AWS Security
Security is of the utmost importance to AWS
- Approach to security
- AWS environment controls
- AWS offerings and features
Network Security
- Built-in firewalls
- Encryption in transit
- Private/dedicated connections
- Distributed denial of service (DDoS) mitigation
Inventory and Configuration Management
- Deployment tools
- Inventory and configuration tools
- Template definition and management tools
Data Encryption
- Encryption capabilities
- Key management options
- AWS Key Management Service
- Hardware-based cryptographic key storage options
- AWS CloudHSM
Access Control and Management
- Identity and access management (IAM)
- Multi-factor authentication (MFA)
- Integration and federation with corporate directories
- Amazon Cognito
- AWS SSO
Monitoring and Logging
- Tools and features to reduce your risk profile:
- Deep visibility into API calls
- Log aggregation and options
- Alert notifications
The Shared Responsibility Model
Application Stack:
- Physical - AWS buildings and servers
- Network - AWS locks it down
- Hypervisor - AWS (uses Xen?)
- EC2 (if running!) sits at the boundary between these two groups: AWS is responsible for the layers above, the customer for the layers below
- AWS cannot see the elements below in the stack
- Guest OS
- Application
- User Data
Identity and Access Management
IAM - we want to be extremely specific about what each ‘word’ means
USER
- A permanent named operator (could be human could be machine)
- Credentials are permanent and stay with that user
GROUP
- A collection of USERS
- Users can belong to many groups, groups can contain many users
ROLE
- A Role is NOT your permissions
- It is an authentication method
- IT IS TEMPORARY
- Authentication method for your user (or operator)
Policy DOCS
- Attaches to a USER, GROUP or directly to a ROLE
- Lists the specific APIs that I am allowing against which resources
USER/GROUP/ROLE - Authentication
POLICY DOCS - Authorization
This also addresses compromised credentials. Example: someone gets in with a stolen username and password. A security manager can make a single API call that removes the policy docs from all users, groups, and roles.
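A rough sketch of how a policy doc lists allowed APIs against resources. The JSON layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) is the standard policy format; the bucket name is made up, and `is_allowed` is a hypothetical, heavily simplified evaluator, not the real IAM evaluation logic.

```python
from fnmatch import fnmatchcase

# Hypothetical policy document; the bucket name is made up for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-bucket/secret/*"},
    ],
}


def is_allowed(policy, action, resource):
    """Simplified check: an explicit Deny wins, otherwise an Allow is required.

    Real IAM evaluation also handles lists of actions, conditions,
    principals, and several policy types; none of that is modelled here.
    """
    allowed = False
    for stmt in policy["Statement"]:
        if fnmatchcase(action, stmt["Action"]) and fnmatchcase(resource, stmt["Resource"]):
            if stmt["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::example-bucket/a.txt"))
print(is_allowed({"Statement": []}, "s3:GetObject", "arn:aws:s3:::example-bucket/a.txt"))
```

The second call shows why removing policy docs locks out a compromised identity: with no statements attached, nothing is allowed, even though the credentials still authenticate.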
Amazon Inspector
IT security matters, and securing IT infrastructure is:
- complex
- expensive
- Time consuming - build/configure/maintain
- Difficult to track all the changes in IT environment
- Hard to do effectively
Amazon Inspector
- Assesses applications for:
- Vulnerabilities
- Deviations from best practices
- Produces a detailed report with:
- Security findings
- Prioritize steps for remediation
Amazon Inspector enables you to
- Quickly and easily assess your AWS resources
- Offload security assessments so you can focus on more complex security issues
- Gain a deeper understanding of your AWS resources
AWS Shield
AWS Shield is a managed Distributed Denial of Service (DDoS) protection service that safeguards applications running on AWS.
DoS - Denial of Service attack: a deliberate attempt to make your website or application unavailable to users
DDoS - Distributed Denial of Service attack: multiple sources are used to attack the target; infrastructure and application layers can be affected
DDoS mitigation challenges
- Complex setup and implementation
- Bandwidth limitations
- Manual intervention
- Time consuming
- Degraded performance
- expensive
AWS Shield tiers
- Standard
- Automatic protection available for all AWS customers, at no additional charge
- Any AWS resource
- Any AWS region
- Quick detection - Always-On
- Online attack mitigation
- Built-in automated mitigation techniques
- Avoids latency impact
- Self service
- No need to engage AWS support
- Advanced
- Paid service for higher levels of protection, features, and benefits
- Specialized support
- Advanced attack mitigation
- Visibility and attack notification
- Always-On monitoring
- Amazon Route 53, CloudFront, Elastic Load Balancer, Elastic IP
- Enhanced detection
- DDoS cost protection
Benefits
- Cost efficient
- Seamless integration and deployment
- Customizable protection
AWS Shield provides:
- Built-in protection against DDoS attacks
- Access to tools, services and expertise to help you protect your AWS applications
Security Compliance
AWS shares security information by:
- Obtaining industry certifications
- Publishing security and control practices
- Compliance reports
Control environment
- Includes policies, processes, and control activities to secure the delivery of AWS service offerings
- Supports the operating effectiveness of AWS control framework
- Integrates cloud-specific controls
- Applies leading industry practices
AWS security compliance programs help customers:
- Understand robust controls in place
- Establish and operate in an AWS security control environment
Pricing and Support
Similar to paying for utilities - only pay for services you use
For each service, you pay for what you use
“Pay as you go”
Cost fundamentals
- Pay for:
- Compute capacity
- Storage
- Outbound data transfer (aggregated)
- No charge for:
- Inbound data transfer
Offerings
- EC2
- Web service that:
- Provides resizable compute capacity in the cloud
- Allows the configuration of capacity with minimal friction
- Provides complete control
- Charges only for capacity used
- Cost Factors:
- Clock-second/hourly billing:
- Resources incur charges only when running
- Instance configuration:
- Physical capacity of the instance
- Pricing varies with:
- AWS region
- OS
- Instance type
- Instance size
- Purchase types
- On-demand instance:
- Compute capacity by the hour and second
- Min of 60 seconds
- Reserved instances:
- Low or no up-front payment instances reserved
- Discount on hourly charge for that instance
- Spot instances:
- Bid for unused Amazon EC2 capacity
- Other considerations
- Number of instances: provision multiple instances to handle peak loads
- Load Balancing: use ELB to distribute traffic. Monthly cost based on
- Hours load balancer runs
- Data load balancer processes
- Product options
- Monitoring:
- Use Amazon CloudWatch to monitor instances
- Basic monitoring (default)
- Detailed monitoring (fixed rate; prorated partial months)
- Auto Scaling:
- Automatically adjusts number of instance
- No additional charge
- Elastic IP addresses:
- No charge when associated with a running instance
- OS and software
- OS prices included in instance prices
- Software:
- Partnership with other vendors
- Vendor licenses required
- Existing licenses accepted through specific vendor programs
- S3
- What is S3?
- Object storage built to store and retrieve any amount of data from anywhere
- Provides:
- Durability, availability and scalability
- Comprehensive security and compliance capabilities
- Query in place
- Flexible management and data transfer
- Compatibility - supported by partners, vendors, and AWS services
- Storage Classes
- Standard Storage:
- 99.999999999% durability
- 99.99% availability
- Standard-Infrequent Access (S-IA):
- 99.999999999% durability
- 99.9% availability
- Storage cost:
- Number and size of objects
- Type of storage
- Pricing based on:
- Requests:
- Number of requests
- Type of requests - different rates for GET requests
- Data transfer:
- Amount of data transferred out of the Amazon S3 region
- EBS
- What is EBS?
- Block-level storage for instances
- Volumes persist independently from the instance
- Analogous to virtual disks in the cloud
- 3 Volume types:
- General Purpose (SSD)
- Provisioned IOPS (SSD)
- Magnetic
- Cost Factors
- Volumes: All types charged by the amount provisioned per month
- IOPS:
- General Purpose (SSD): Included in price
- Magnetic: Charged by the number of requests
- Provision IOPS (SSD): Charged by the amount you provision in IOPS
- Snapshots: Added cost per GB-month of data stored
- Data transfer:
- Inbound data transfer has no charge
- Outbound data transfer charges are tiered
- RDS
- What is RDS?
- Relational database in the cloud
- Cost-efficient and resizable capacity
- Management of time-consuming administrative tasks
- Cost Factors
- Clock-hour billing: Resources incur charges when running.
- Database characteristics: engine, size, and memory class impacts cost
- DB purchase type:
- On-demand database instances are charged by the hour
- Reserved database instances require up-front payment for database instances reserved
- Provision multiple DB instances to handle peak loads
- Provisioned storage:
- No charge for backup storage of up to 100% of database storage
- Backup storage for terminated DB instance billed at GB/month
- Additional storage: Backup storage in addition to provisioned storage billed at GB/Month
- Deployment type:
- Storage and I/O charges variable
- Single availability zones
- Multiple availability zones
- Data transfer
- No charge for inbound data transfer
- Tiered charges for outbound data transfer
- CloudFront
- What is CloudFront?
- Web service for content delivery
- Integration with other AWS services
- Low latency
- High data transfer speeds
- No minimum commitments
- Cost factors
- Pricing varies across geographic regions
- Based on:
- Requests
- Data transfer out
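The EC2 on-demand billing rule noted in the offerings above (per-second billing with a 60-second minimum, charges only while running) can be sketched as simple arithmetic. The function name `on_demand_charge` is hypothetical and the hourly rate is invented, not a real AWS price.

```python
def on_demand_charge(seconds_running: int, hourly_rate: float) -> float:
    """Charge for one run of an instance billed per second with a 60-second minimum."""
    if seconds_running <= 0:
        return 0.0  # resources incur charges only when running
    billable_seconds = max(seconds_running, 60)
    return billable_seconds * hourly_rate / 3600


RATE = 0.10  # hypothetical hourly price in USD, not a real AWS rate
print(round(on_demand_charge(20, RATE), 6))    # billed as 60 seconds
print(round(on_demand_charge(5400, RATE), 6))  # 1.5 hours
```

Actual prices vary with region, OS, instance type, and instance size, as listed under the cost factors.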
Trusted Advisor
Provides best practices (or checks) in 4 categories:
- Cost optimization
- Performance
- Security
- Fault tolerance
Support plans
- Basic
- Developer
- Business
- Enterprise
Neptune
Redshift
AWS Cloud Computing and Global Infrastructure
Cloud computing is on-demand delivery of compute power, database, storage, applications and other IT resources via internet with pay-as-you-go pricing.
Benefits of cloud computing
- agility
- elasticity
- Cost savings
- Deploy globally in minutes
AWS Cloud
- A broad portfolio of global cloud-based products that are on-demand, available in seconds, with pay-as-you-go pricing
AWS Global Infrastructure
- Regions
- Completely isolated from each other
- Certain resources tied to specific regions
- Availability Zones are within regions
- Availability Zones
- Each AZ is isolated from other AZs within the region
- All AZs are interconnected by fiber
- Local Zones
- Infrastructure deployment close to population centers
- Wavelength Zones
- Deploys on 5G networks
- Low-latency projects
- Direct Connect Locations
- Bypasses internet and connects directly to AWS
- CloudFront
- Edge Locations
- Regional Edge Caches
Intro to Compute
Intro to AWS Compute
- Allows you to develop, deploy, run, and scale workloads in the AWS Cloud
Benefits of Amazon EC2
- Elasticity
- Control
- Flexible
- Integrated
- Reliable
- Secure
- Cost-effective
- Easy
Instance Types
- Wide selection of hardware and software configurations optimized to fit different use cases
- General purpose
- Compute optimized
- Memory optimized
- Accelerated computing
- Storage optimized
- Families within each type
- Generations within each family
Amazon Machine Images
- Initial software configuration of an instance
- Can use AWS or marketplace or user community or custom AMIs
Why Scaling Matters
- Launch new instances in advance of peak periods
- Use monitoring to programmatically scale out
- Automatically scale in
- Pay for the resources needed, only when needed
Auto Scaling group
- Automatically adjusts resource capacity
- Define where Amazon EC2 Auto Scaling deploys resources
- Specify the Amazon VPC and subnets
Elastic load balancing
- Automatically distribute traffic across multiple EC2 instances
- Increases availability and fault tolerance
- Configure health checks
- Offload encryption and decryption
- Types
- Application Load Balancer (application layer)
- Network Load Balancer (network layer)
- Gateway Load Balancer (third-party virtual appliances)
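How health checks and traffic distribution work together can be sketched locally. This is not the Elastic Load Balancing API: `route_requests` is a hypothetical function, the instance IDs are invented, and plain round-robin is assumed as the distribution rule.

```python
def route_requests(instances, health, request_count):
    """Assign requests to healthy instances in round-robin order."""
    healthy = [i for i in instances if health.get(i, False)]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    return [healthy[n % len(healthy)] for n in range(request_count)]


# Hypothetical instance IDs; "i-b" is failing its health check.
health = {"i-a": True, "i-b": False, "i-c": True}
print(route_requests(["i-a", "i-b", "i-c"], health, 4))
```

The failing instance receives no traffic, which is how a load balancer increases availability and fault tolerance.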
Intro to Storage
Storage Services
Elastic Block Storage (EBS)
- Network-attached block storage for use with Amazon EC2 instances
- Persist independently from instance
- Used like a physical hard drive
- Automatically replicated
- Attached to any instance in the same AZ
- One EBS volume to one EC2 instance
- One instance to many EBS volumes
- EBS volumes can retain data after EC2 instance termination
- Allow point-in-time snapshots to S3; volume sizes are set in GiB increments
Simple Storage Service (S3)
- Infinite scalability, greater analysis, and faster data retrieval
- 99.999999999% (11 9s) durability
- Common s3 use cases:
- Data lakes
- Backup and storage
- Application hosting
- Media hosting
- Software delivery
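To get a feel for what 11 nines of durability means, the sketch below computes the expected number of objects lost per year. It assumes the figure is an annual per-object rate and that losses are independent; `expected_annual_loss` is a hypothetical helper.

```python
ANNUAL_LOSS_RATE = 1e-11  # 1 - 0.99999999999, i.e. the complement of 11 nines


def expected_annual_loss(object_count: int, loss_rate: float = ANNUAL_LOSS_RATE) -> float:
    """Expected number of objects lost per year, treating losses as independent."""
    return object_count * loss_rate


loss = expected_annual_loss(10_000_000)
print(f"{loss:.6f} objects/year, about one object every {1 / loss:,.0f} years")
```

With ten million stored objects, that works out to roughly one lost object every 10,000 years on average.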
Databases
Database services
- Purpose-built for specific application use cases
- Offload time-consuming management tasks
EC2-hosted vs. AWS Database Services
Networking Services
Networking Services
- Isolate cloud infrastructure and scale request-handling capacity
Virtual Private Cloud (VPC)
- Networking layer for AWS resources
- A virtual network dedicated to a customer's AWS account
Subnet
- A range of IP addresses in a VPC
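The relationship between a VPC's address range and its subnets can be sketched with Python's standard `ipaddress` module. The `10.0.0.0/16` block and the choice of /24 subnets are assumptions for illustration only.

```python
import ipaddress

# Hypothetical VPC address range; any private CIDR block would do.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC range into /24 subnets and take the first three.
subnets = list(vpc.subnets(new_prefix=24))[:3]
for subnet in subnets:
    print(subnet, "inside VPC:", subnet.subnet_of(vpc))
```

Each subnet is a smaller, non-overlapping slice of the VPC range, which is what lets traffic be controlled per subnet.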
Securing a VPC
- Network Access Control Lists
- Control traffic at the subnet level
- Security groups
- control traffic at the instance level
- Flow logs
- Capture network flow information
- Host-based firewalls
- Operating system firewalls
Intro to Security
Cloud security on AWS
- Inherit benefits from AWS data center and network architecture
- Similar to on premises data centers, without maintaining facilities and hardware
- Can be easily automated
- Inherit all the best practices of AWS
Security, identity, and compliance services
- One of the most important concepts to understand
- AWS is designed to help build secure, high-performing, resilient, and efficient infrastructure for applications
AWS shared responsibility model
Identity and Access Management (IAM)
- Securely manage access to AWS services and resources
- Fine-grained access control to AWS resources
- Multifactor authentication
- The ability to analyze access
- Integration with corporate directories
Intro to Solution Design
Migration Strategies - Seven R’s
- Rehost - Lift and shift
- Recreating the on-premises network, only hosted on AWS
- Automating with tools such as AWS Application Migration Service
- Easier to optimize and re-architect applications after migration
- Relocate - hypervisor-level lift and shift
- Migration specific to VMware Cloud on AWS
- Example:
- Migrate hypervisor host Oracle database to VMware Cloud on AWS
- Replatform - lift, tinker, and shift
- Retaining the core architecture
- Making targeted AWS cloud optimizations
- Examples:
- Migrating databases to Amazon RDS
- Migrating applications to Amazon Elastic Beanstalk
- Refactor - modernize
- Re-imagining how the application is architected and developed
- Using cloud-native features
- Other strategies
- Retire
- Shutting off non-useful applications
- Reducing spend, management, and security
- Retain/Revisit
- Keep certain applications on-premises
- Repurchase
- Moving workflows to software as a service (SaaS)
Cloud Architecture Best Practices
- Design for failure and nothing fails
- Avoid single points of failure
- Multiple instances
- Multiple availability zones
- Separate a single server into a multi-tiered application
- For Amazon RDS, use the Multi-AZ feature
- Build security in every layer
- Encrypt Data at rest and in transit
- Enforce principle of least privilege in IAM
- Implement both Security Groups and Network Access Control Lists (NACL)
- Consider advanced security features and services
- Leverage different storage options
- Move static web assets to Amazon S3
- Use Amazon CloudFront to serve globally
- Store session state in DynamoDB
- Use ElastiCache between hosts and databases
- Implement elasticity
- Implement Auto Scaling policies
- Architect resiliency to reboot and relaunch
- Leverage managed services like S3 and DynamoDB
- Think parallel
- Scale horizontally, not vertically
- Decouple compute from session/state
- Use elastic load balancing
- Right-size your infrastructure
- Loose coupling sets you free
- Instead of single, ordered workflow, use multiple queues
- Use Amazon Simple Queue Service and Simple Notification Service (SQS and SNS)
- Leverage existing services
- Don’t fear constraints
- Rethink traditional constraints
- Need more RAM? Distribute across instances
- Better IOPS for database? Scale horizontally instead
- Response to failure? Rip and replace, decommission and spin up replacement
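The "loose coupling" practice from the list above can be illustrated with an in-memory queue. This is only a local analogy: `queue.Queue` stands in for a managed queue such as SQS, and the worker, message names, and `None` sentinel are assumptions of this sketch.

```python
import queue
import threading

orders = queue.Queue()  # in-memory stand-in for a managed queue such as SQS
processed = []


def worker():
    """Consume messages until a None sentinel arrives."""
    while True:
        message = orders.get()
        if message is None:
            break
        processed.append(f"handled {message}")


consumer = threading.Thread(target=worker)
consumer.start()

# The producer only talks to the queue and never calls the worker directly.
for order_id in ("order-1", "order-2", "order-3"):
    orders.put(order_id)
orders.put(None)  # tell the worker to stop
consumer.join()
print(processed)
```

Because producer and consumer share only the queue, either side can be scaled or replaced without changing the other.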
Well-Architected Framework
Well-Architected Framework
- A framework for ensuring infrastructures are:
- secure
- high-performing
- resilient
- efficient
- sustainable
- Practices developed through reviewing customers’ architectures on AWS
- Systematic approach for evaluating and implementing architectures
- Well-Architected Tool in the console
Cloud Adoption Framework and Perspectives
Cloud Adoption Framework
- Migrating to the cloud is a process
- Successful cloud migration requires expertise
- Harness different perspectives
- Ensure that you have the right talent
- AWS Professional Services created the AWS CAF
- AWS CAF enables a smooth transition through 6 perspectives
6 perspectives
- Business perspective
- Ensure IT aligns with business
- Creates a strong business case for cloud adoption
- Ensure business aligns with IT
- Common roles include: Business managers, finance managers, budget owners, strategy stakeholders
- People perspective
- Support change management strategy
- Evaluate organizational structures and roles
- Evaluate new skill and process requirements
- Identify gaps
- Prioritize training
- Common roles: Human Resources, staffing, people managers
- Governance perspective
- Focus on skills and processes
- Ensure the business values are maximized and risks are minimized
- Update the staff skills and processes
- Measure cloud investments to evaluate business outcomes
- Common roles: Chief Information Officer, Program Managers, Enterprise Architects, Business analysts, and Portfolio Managers
- Platform perspective
- Implement new solutions in the cloud
- Migrate on-premises workloads to the cloud
- Understand and communicate the structure of IT systems and their relationships
- Describe the architecture of the target state environment in detail
- Common roles: Chief Technology Officer (CTO), IT managers, and Solutions architects
- Security perspective
- Meet security objectives for visibility, auditability, control, and agility
- Structure the selection and implementation of security controls
- Common roles: Chief Information Security Officer (CISO), IT security managers, and IT security analysts
- Operations perspective
- Enable, run, use, operate, and recover IT workloads
- Define how business is conducted
- Align with and support the business operations
- Define current operating procedures
- Common roles: IT operations managers and IT support managers
Action Plan
- Uncover gaps in skills and processes
- Use inputs as a basis for creating the AWS CAF Action Plan
- Guide an organization's change management
- Keep on track toward achieving their desired outcomes
AWS Solutions - Vetted, technical reference implementations designed to help customers solve common problems and build faster
AWS Solution Space helps customers who need help deploying an AWS Solution by highlighting AWS Competency Partner solutions
- AWS Quick Starts - prebuilt solutions for specific use cases
Presenting AWS Solutions to Customers
Customer-facing discussions fall into three distinct categories, based upon typical milestones in the sales cycle.
- Discovery is the information-gathering meeting to help you understand your customer's challenges
- After all the necessary information is collected that identifies the customer’s goals and pain points, you will meet with the customer again to present your findings and propose one or more AWS solutions. This may actually end up being several meetings, depending on whether tweaks to the solution are needed.
- After the customer agrees to a potential solution, you will ask them if they would like to move forward with a proof of concept (POC), where they evaluate the solution in their own environment
Discovery Best Practices
Preparing for discovery
- Research customer’s business
- Determine market segment
- Identify industry trends
- Identify customer’s competitors
- Research recent news
- Research customer relationship to AWS
Encourage detailed conversation
- Ask targeted questions
- Ask open-ended questions
Five Whys
- Dive deeper
- Uncover the real desired outcomes
Whiteboarding
- Keep track of the conversation
- Illustrate workflows and ideation
Customer Meeting Best Practices
Best Practices
- Data-driven approach
- Use case studies
- Dive deep
- Have backbone
- Keep the momentum going
Common objection response
- Security
- Security at AWS is our top priority
- Higher security posture than in legacy environments
- Customers inherit all the benefits of our experience
- Validated against the strictest of third-party assurance frameworks
- Cost or cost savings
- Reduce total cost of ownership (TCO)
- Achieve continuously optimized and predictable spend
- No longer overprovision infrastructure for peak demand
- GE Oil and Gas decreased TCO by 52%
- Scalability and Response
- Build Cloud Foundation Team
- Create guardrails around security, availability, reliability and compliance
- AWS Control Tower gives maximum control—without sacrificing speed and agility
Keys to consistent results
- Prepare
- Anticipate
- Differentiate
- Stay on message
DO NOT
- Use words like definitely, never, or guaranteed
- Use acronyms or technical jargon
- Focus on technology
- Focus on the short/mid-term
- Read the slides
Delivering a Proof of Concept
POC Fundamentals
Building a POC
- Customer agrees to move forward with POC
- Determine what success looks like
- Include any modifications
- Consult as necessary
- Collect the following information
- Networking and security
- Application code
- Databases
- Data
POC resources: the APN partner portal has training for POCs
AWS Quick Starts
- Rapidly deploy architectures based upon best practices
- Launch, configure, and run AWS services required to deploy a specific workload on AWS
- Reduce manual procedures into few steps
- Check back frequently for updates
Migration Considerations
The Migration Process
Minimum Viable Product (MVP)
- Avoid building a solution where you only discover if there is success at the end
- Instead start with something basic and gather feedback as you get more complex
MVP and delivering results
Going to Production
Best practices
- Involve AWS account team (Solutions Architect or Technical Account Manager)
- Customer-specific regulatory requirements
- AWS support level
Well Architected Review
- Architectural guidance
- Continuous review
- Improved architectures
Modernization
Modernize to drive growth
- Retire expensive legacy solutions
- Reduce TCO, improve cost optimization
- Gain agility through automation
- Free up resources to drive innovation
Modernization of architectures
- Containers
- Serverless
- Data lakes and analytics
Containers
- Package code, configurations, and dependencies into a single object
- Share an operating system
- Run as resource-isolated processes
- AWS offers resources and orchestration services
Containers use cases
- Microservices
- Batch processing
- Machine learning
- Hybrid applications
- Application migration to the cloud
- Platform as a service
Serverless
Serverless benefits
- No provisioning, maintaining, and administering servers
- AWS handles fault tolerance and availability
- Focus on product innovation
Data Lakes and Analytics
- Data in different silos can be difficult to access and analyze
- Store data in a “data lake”
- Easy to read data and obtain insights
Intro to AWS Organizations
Security
- Control access with AWS Identity and Access Management (IAM).
- IAM policies enable you to allow or deny access to AWS services for users, groups, and roles
- Service control policies enable you to allow or deny access to AWS services for individual accounts or groups of accounts in an organizational unit (OU).
Accessing Organizations
- Management Console
- CLI (command line tools)
- SDKs
- HTTPS Query API
Advantages of Cloud Computing
6 advantages of cloud computing
- Pay as you go
- Instead of investing in data centers and hardware before you know you are going to use them, you pay only when you use computing resources, and pay only for how much you use
- Benefit from massive economies of scale
- By using cloud computing, you can achieve a lower cost than you can get on your own. Because usage from hundreds of thousands of customers is aggregated in the cloud, AWS can achieve higher economies of scale, which translates into lower pay-as-you-go prices
- Stop guessing capacity
- Eliminate guessing on your infrastructure capacity needs. When you make a capacity decision prior to deploying an application, you often end up either sitting on expensive idle resources or dealing with limited capacity. With cloud computing, these problems go away. You can access as much or as little capacity as you need, and scale up and down as required with only a few minutes’ notice.
- Increase speed and agility
- IT resources are only a click away, which means that you reduce the time to make resources available to your developers from weeks to minutes. This results in a dramatic increase in agility for the organization, since the cost and time it takes to experiment and develop is significantly lower.
- Realize cost savings
- Companies can focus on projects that differentiate their business instead of maintaining data centers. Cloud computing lets you focus on your customers, rather than on the heavy lifting of racking, stacking, and powering physical infrastructure. This is often referred to as undifferentiated heavy lifting.
- Go global in minutes
- Applications can be deployed in multiple regions around the world with a few clicks. This means that you can provide lower latency and a better experience for your customers at a minimal cost.
Resources
AWS Global Infrastructure
AWS Region considerations:
- Compliance
- Enterprise companies often must comply with regulations that require customer data to be stored in a specific geographic territory. If applicable, choose a Region that meets your compliance requirements.
- Latency
- If your application is sensitive to latency (the delay between a request for data and the response), choose a Region that is close to your user base. This helps prevent long wait times for your customers. Synchronous applications such as gaming, telephony, WebSockets, and Internet of Things (IoT) are significantly affected by high latency. Asynchronous workloads, such as ecommerce applications, can also suffer from user connectivity delays.
- Pricing
- Due to the local economy and the physical nature of operating data centers, prices vary from one Region to another. Internet connectivity, imported equipment costs, customs, real estate, and other factors impact a Region’s pricing. Instead of charging a flat rate worldwide, AWS charges based on the financial factors specific to each Region.
- Service availability
- Some services might not be available in some Regions. The AWS documentation provides a table that shows the services available in each Region.
Resources
Interacting with AWS
The AWS Management Console
One way to manage cloud resources is through the web-based console, where you log in and choose the desired service. This can be the easiest way to create and manage resources when you first begin working with the cloud. Below is a screenshot that shows the landing page when you first log in to the AWS Management Console.
The services are placed in categories, such as Compute, Storage, Database, and Analytics
On the upper-right corner is the Region selector. If you choose it and change the Region, you will make requests to the services in the chosen Region. The URL changes, too. Changing the Region setting directs your browser to make requests to a different AWS Region, represented by a different subdomain.
The AWS Command Line Interface (AWS CLI)
Consider the scenario where you run tens of servers on AWS for your application’s front end. You want to run a report to collect data from all the servers. You need to do this programmatically every day because the server details might change. Instead of manually logging in to the AWS Management Console and then copying and pasting information, you can schedule an AWS CLI script with an API call to pull this data for you.
The AWS CLI is a unified tool that you can use to manage AWS services. You can download and configure one tool that you can use to control multiple AWS services from the command line and automate them with scripts. The AWS CLI is open-source, and installers are available for Windows, Linux, and macOS.
AWS SDKs
API calls to AWS can also be performed by running code with programming languages. You can do this by using AWS software development kits (SDKs). SDKs are open source and maintained by AWS for the most popular programming languages, such as C++, Go, Java, JavaScript, .NET, Node.js, PHP, Python, and Ruby.
Developers commonly use AWS SDKs to integrate their application source code with AWS services. For example, consider an application with a front end that runs in Python. Every time the application receives a cat photo, it uploads the file to a storage service. This action can be achieved in the source code by using the AWS SDK for Python.
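The cat photo upload described above can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch, not the course's implementation: the bucket name, the "cat-photos" key prefix, and both function names are assumptions made for illustration. The SDK is imported inside the upload function so the key-building helper can be used on its own.

```python
from pathlib import Path


def object_key_for(photo_path: str, prefix: str = "cat-photos") -> str:
    """Build the object key for a local photo (hypothetical naming scheme)."""
    return f"{prefix}/{Path(photo_path).name}"


def upload_cat_photo(photo_path: str, bucket: str) -> str:
    """Upload one file to Amazon S3 and return the object key that was used."""
    # Imported here so object_key_for works even where boto3 is not installed.
    import boto3

    key = object_key_for(photo_path)
    # Credentials and Region are resolved from the environment or AWS config.
    s3 = boto3.client("s3")
    s3.upload_file(Filename=photo_path, Bucket=bucket, Key=key)
    return key
```

Calling upload_cat_photo("whiskers.jpg", "example-bucket") would need valid credentials and an existing bucket; both values here are placeholders.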
Resources
Security and the AWS Shared Responsibility Model
AWS responsibility AWS is responsible for security of the cloud. This means AWS protects and secures the infrastructure that runs the services offered in the AWS Cloud. AWS is responsible for:
- Protecting and securing AWS Regions, Availability Zones, and data centers, down to the physical security of the buildings
- Managing the hardware, software, and networking components that run AWS services, such as the physical servers, host operating systems, virtualization layers, and AWS networking components
The level of responsibility AWS has depends on the service. AWS classifies services into three categories. The following table provides information about each, including the AWS responsibility.
Note: Container services refer to AWS abstracting application containers behind the scenes, not Docker container services. This enables AWS to move the responsibility of managing the platform away from customers.
Customer responsibility Customers are responsible for security in the cloud. When using any AWS service, you’re responsible for properly configuring the service and your applications, in addition to ensuring that your data is secure.
Your level of responsibility depends on the AWS service. Some services require you to perform all the necessary security configuration and management tasks, while other more abstracted services require you to only manage the data and control access to your resources. Using the three categories of AWS services, you can determine your level of responsibility for each AWS service you use.
Due to the varying levels of effort, customers must consider which AWS services they use and review the level of responsibility required to secure each service. They must also review how the shared security model aligns with the security standards in their IT environment, in addition to any applicable laws and regulations.
A key concept is that customers maintain complete control of their data and are responsible for managing the security related to their content. For example, you are responsible for the following:
- Choosing a Region for AWS resources in accordance with data sovereignty regulations
- Implementing data-protection mechanisms, such as encryption and scheduled backups
- Using access control to limit who can access your data and AWS resources
Resources
- External Site: AWS: Shared Responsibility Model
- External Site: AWS: Enabling a Hardware MFA Device (Console)
- External Site: AWS: Enabling a U2F Security Key (Console)
- External Site: AWS: Enabling a Virtual Multi-Factor Authentication (MFA) Device (Console)
- External Site: AWS: Table of Supported MFA Devices
IAM: Identity and Access Management
IAM policies:
- Grant or deny permission to take actions
- Actions are AWS API calls
- Attach policies to AWS Identities
An IAM user represents a person or service that interacts with AWS.
IAM user credentials An IAM user consists of a name and a set of credentials. When you create a user, you can provide them with the following types of access:
- Access to the AWS Management Console
- Programmatic access to the AWS Command Line Interface (AWS CLI) and AWS application programming interface (AWS API)
To access the AWS Management Console, provide the user with a user name and password. For programmatic access, AWS generates a set of access keys that can be used with the AWS CLI and AWS API. IAM user credentials are considered permanent, which means that they stay with the user until there’s a forced rotation by admins.
IAM groups An IAM group is a collection of users. All users in the group inherit the permissions assigned to the group. This makes it possible to give permissions to multiple users at once. It’s a more convenient and scalable way of managing permissions for users in your AWS account.
IAM policies To manage access and provide permissions to AWS services and resources, you create IAM policies and attach them to IAM users, groups, and roles. Whenever a user or role makes a request, AWS evaluates the policies associated with them. For example, if you have a developer inside the developers group who makes a request to an AWS service, AWS evaluates any policies attached to the developers group and any policies attached to the developer user to determine if the request should be allowed or denied.
Policy Structure
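An IAM policy is a JSON document made of statements, each with an Effect, Action, and Resource. The sketch below writes two hypothetical policies as Python dictionaries (the bucket name and the group/user split are assumptions) and uses a deliberately simplified evaluator to illustrate how a group policy and a user policy are considered together. It ignores conditions, principals, and other policy types, and it matches case-sensitively, so it is not how AWS actually evaluates requests in full.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical policy attached to the developers group: read-only on one bucket.
group_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-cat-photos",
                "arn:aws:s3:::example-cat-photos/*",
            ],
        }
    ],
}

# Hypothetical policy attached to one user: deny everything under private/.
user_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-cat-photos/private/*",
        }
    ],
}


def _as_list(value):
    """Action and Resource may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policies, action, resource):
    """Simplified evaluation: explicit Deny wins, otherwise need an Allow."""
    allowed = False
    for policy in policies:
        for stmt in _as_list(policy["Statement"]):
            action_hit = any(fnmatchcase(action, p) for p in _as_list(stmt["Action"]))
            resource_hit = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
            if action_hit and resource_hit:
                if stmt["Effect"] == "Deny":
                    return False
                allowed = True
    return allowed  # no matching Allow means the request is denied


policy_json = json.dumps(group_policy, indent=2)  # the form you would attach in IAM
```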
Resources
Role-Based Access in AWS
IAM roles:
- No static login credentials
- IAM Roles are assumed programmatically
- Credentials are temporary for a configurable amount of time
- Credentials expire and are rotated
Lock down the AWS root user – The root user is an all-powerful and all-knowing identity in your AWS account. If a malicious user were to gain control of root-user credentials, they would be able to access every resource in your account, including personal and billing information. To lock down the root user, you can do the following:
- Don’t share the credentials associated with the root user
- Consider deleting the root user access keys
- Enable MFA on the root account
Follow the principle of least privilege – Least privilege is a standard security principle that advises you to grant only the necessary permissions to do a particular job and nothing more. To implement least privilege for access control, start with the minimum set of permissions in an IAM policy and then grant additional permissions as necessary for a user, group, or role.
Use IAM appropriately – IAM is used to secure access to your AWS account and resources. It simply provides a way to create and manage users, groups, and roles to access resources in a single AWS account. IAM is not used for website authentication and authorization, such as providing users of a website with sign-in and sign-up functionality. IAM also does not support security controls for protecting operating systems and networks.
Use IAM roles when possible – Maintaining roles is more efficient than maintaining users. When you assume a role, IAM dynamically provides temporary credentials that expire after a defined period of time, between 15 minutes and 12 hours. Users, on the other hand, have long-term credentials in the form of user name and password combinations or a set of access keys. User access keys only expire when you or the account admin rotates the keys. User login credentials expire if you applied a password policy to your account that forces users to rotate their passwords.
Consider using an identity provider – If you decide to make your cat photo application into a business and begin to have more than a handful of people working on it, consider managing employee identity information through an identity provider (IdP). Using an IdP, whether it’s an AWS service such as AWS Single Sign-On or a third-party identity provider, provides a single source of truth for all identities in your organization. You no longer have to create separate IAM users in AWS. You can instead use IAM roles to provide permissions to identities that are federated from your IdP. For example, you have an employee, Martha, who has access to multiple AWS accounts. Instead of creating and managing multiple IAM users named Martha in each of those AWS accounts, you could manage Martha in your company’s IdP. If Martha moves in the company or leaves the company, Martha can be updated in the IdP, rather than in every AWS account in the company.
Consider AWS Single Sign-On – If you have an organization that spans many employees and multiple AWS accounts, you might want your employees to sign in with a single credential. AWS SSO is an IdP that lets your users sign in to a user portal with a single set of credentials. It then provides users access to their assigned accounts and applications in a central location. Similar to IAM, AWS SSO offers a directory where you can create users, organize them in groups, set permissions across the groups, and grant access to AWS resources. However, AWS SSO has some advantages over IAM. For example, if you’re using a third-party IdP, you can sync your users and groups to AWS SSO. This removes the burden of having to re-create users that already exist elsewhere, and it enables you to manage the users from your IdP. More importantly, AWS SSO separates the duties between your IdP and AWS, ensuring that your cloud access management is not inside or dependent on your IdP.
Resources
AWS Compute
Compute as a Service
Servers
The first building block you need to host an application is a server. Servers usually can handle Hypertext Transfer Protocol (HTTP) requests and send responses to clients following the client-server model, although any API-based communication also falls under this model. A client is a person or computer that sends a request. A server handling the requests is a computer, or collection of computers, connected to the internet serving websites to internet users.
Servers power your application by providing CPU, memory, and networking capacity to process users’ requests and transform them into responses. For context, common HTTP servers include:
- Windows options, such as Internet Information Services (IIS)
- Linux options, such as Apache HTTP Web Server, Nginx, and Apache Tomcat
3 types of compute
- Virtual Machines
- Container Services
- Serverless
If you have prior infrastructure knowledge, a virtual machine is often the easiest compute option to understand. This is because a virtual machine emulates a physical server and allows you to install an HTTP server to run your applications. To run virtual machines, you install a hypervisor on a host machine. The hypervisor provisions the resources to create and run your VMs.
In AWS, virtual machines are called Amazon Elastic Compute Cloud, or Amazon EC2. Behind the scenes, AWS operates and manages the host machines and the hypervisor layer. AWS also installs the virtual machine operating system, called the guest operating system.
Resources
Elastic Compute Cloud (EC2)
Amazon Machine Image (AMI)
- Root volume template
- Typically the OS
- Applications pre-installed on instance at boot
- Launch permissions
- Block device mapping
Resources
EC2 Instance Lifecycle
Instance Types
Instance Families
Instance Locations - inside network called VPCs
EC2 instance lifecycle
When you stop your instance, the data stored in memory (RAM) is lost. When you stop-hibernate an instance, AWS signals the operating system to perform hibernation (suspend-to-disk), which saves the contents from the instance memory (RAM) to the Amazon EBS root volume.
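The lifecycle can be read as a small state machine. The sketch below is a simplified view under stated assumptions: the state names follow the EC2 instance states, but the event names "ready" and "done" are hypothetical labels for transitions AWS performs on its own, and reboot and failure paths are left out.

```python
# (current state, event) -> next state. Simplified: not every real path is listed.
TRANSITIONS = {
    ("pending", "ready"): "running",
    ("running", "stop"): "stopping",
    ("running", "stop-hibernate"): "stopping",
    ("stopping", "done"): "stopped",
    ("stopped", "start"): "pending",
    ("running", "terminate"): "shutting-down",
    ("shutting-down", "done"): "terminated",
}


def next_state(state: str, event: str) -> str:
    """Return the state an instance moves to, or raise if the move is invalid."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"cannot apply {event!r} to an instance that is {state!r}") from None


def ram_survives(event: str) -> bool:
    """Only stop-hibernate saves memory contents to the Amazon EBS root volume."""
    return event == "stop-hibernate"
```

Both "stop" and "stop-hibernate" lead to the same stopped state; the difference is only whether the contents of RAM are kept.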
3 purchasing methods for instances
- On-demand instances
- Pay for compute capacity with no long-term commitments. Billing runs while the instance is running and stops when the instance is stopped. The price per second is fixed
- For when instance can be stopped
- Reserved instances
- 3 payment options within this - either 1 or 3 year commitments
- All upfront - higher discount than partial upfront
- Partial upfront - higher discount than no upfront
- No upfront - higher discount than on-demand
- Provide discounted hourly rate and an optional capacity reservation for EC2 instances
- Spot Instances
- Takes advantage of unused EC2 capacity in AWS Cloud
- Up to 90% discount compared to On-Demand prices
- Set limit on how much you would like to pay for the instance hour
- Spot instance might be interrupted
- App must be able to be interrupted if capacity is no longer available
Resources
- External Site: AWS: Amazon EC2
- External Site: AWS: Default VPC and Default Subnets
- External Site: AWS: AWS Reliability Pillar (PDF)
- External Site: AWS: Instance Lifecycle
- External Site: AWS: Amazon EC2 Pricing
- External Site: AWS: Amazon EC2 On-Demand Pricing
- External Site: AWS: Amazon EC2 Spot Instances Pricing
- External Site: AWS: Amazon EC2 Reserved Instances Pricing
Container Services
Container orchestration services:
- Amazon Elastic Container Service (ECS)
- Amazon Elastic Kubernetes Service (EKS)
AWS Fargate is a serverless compute platform for ECS or EKS
A container is a standardized unit that packages your code and its dependencies. This package is designed to run reliably on any platform, because the container creates its own independent environment. With containers, workloads can be carried from one place to another, such as from development to production or from on premises to the cloud.
Docker When you hear the word container, you might associate it with Docker. Docker is a popular container runtime that simplifies the management of the entire operating system stack needed for container isolation, including networking and storage. Docker helps customers create, package, deploy, and run containers.
Containers vs Virtual Machines
Containers share the same operating system and kernel as the host they exist on, whereas virtual machines contain their own operating system. Each virtual machine must maintain a copy of an operating system, which results in a degree of wasted resources. A container is more lightweight. They spin up quicker, almost instantly. This difference in startup time becomes instrumental when designing applications that need to scale quickly during input/output (I/O) bursts. While containers can provide speed, virtual machines offer the full strength of an operating system and more resources, like package installation, dedicated kernel, and more.
Manage Containers with Amazon Elastic Container Service (ECS)
To run and manage your containers, you need to install the Amazon ECS container agent on your EC2 instances. This agent is open source and responsible for communicating to the Amazon ECS service about cluster management details. You can run the agent on both Linux and Windows AMIs. An instance with the container agent installed is often called a container instance.
Once the Amazon ECS container instances are up and running, you can perform actions that include, but are not limited to, launching and stopping containers, getting cluster state, scaling in and out, scheduling the placement of containers across your cluster, assigning permissions, and meeting availability requirements. To prepare your application to run on Amazon ECS, you create a task definition. The task definition is a text file, in JSON format, that describes one or more containers. A task definition is similar to a blueprint that describes the resources you need to run a container, such as CPU, memory, ports, images, storage, and networking information.
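As an illustration of the blueprint idea, here is a minimal task definition built as a Python dictionary and serialized to JSON. The family name, container name, image, and resource values are assumptions chosen for the example, and a real task definition usually carries more settings than shown here.

```python
import json

# Hypothetical task definition describing a single web container.
task_definition = {
    "family": "cat-photo-web",  # assumed name for this group of task revisions
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",  # container image to run
            "cpu": 256,               # CPU units reserved for the container
            "memory": 512,            # hard memory limit in MiB
            "essential": True,
            "portMappings": [
                {"containerPort": 80, "hostPort": 80, "protocol": "tcp"}
            ],
        }
    ],
}

# The JSON text is what you would store in a file or register with Amazon ECS.
task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```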
Manage Containers with Amazon Elastic Kubernetes Service (EKS)
Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services. By bringing software development and operations together by design, Kubernetes created a rapidly growing ecosystem that is very popular and well established in the market. Amazon EKS is conceptually similar to Amazon ECS, but with the following differences:
- An EC2 instance with the ECS agent installed and configured is called a container instance. In Amazon EKS, it is called a worker node.
- An ECS container is called a task. In Amazon EKS, it is called a pod.
- While Amazon ECS runs on AWS native technology, Amazon EKS runs on top of Kubernetes.
Resources
Lambda Resources
AWS Lambda
Resources
- External Site: AWS: Serverless
- Coursera Course: Building Modern Python Applications on AWS
- External Site: AWS: AWS Serverless Resources
- External Site: AWS: Building Applications with Serverless Architectures
- External Site: AWS: Best Practices for Organizing Larger Serverless Applications
- External Site: AWS: Managing AWS Lambda Functions
- External Site: AWS: 10 Things Serverless Architects Should Know
- External Site: AWS: AWS Alien Attack! A Serverless Adventure
Networking
IP addresses To properly route your messages to a location, you need an address. Just like each home has a mailing address, each computer has an IP address. However, instead of using the combination of street, city, state, zip code, and country, the IP address uses a combination of bits, 0s and 1s.
IPv4 notation Typically, you don’t see an IP address in its binary format. Instead, it’s converted into decimal format and noted as an IPv4 address. In the following diagram, the 32 bits are grouped into groups of 8 bits, also called octets. Each of these groups is converted into decimal format separated by a period.
In the end, this is what is called an IPv4 address. This is important to know when trying to communicate to a single computer. But remember, you’re working with a network. This is where CIDR notation comes in.
CIDR notation 192.168.1.30 is a single IP address. If you want to express the IP addresses in the range from 192.168.1.0 to 192.168.1.255, how can you do that? One way is to use Classless Inter-Domain Routing (CIDR) notation. CIDR notation is a compressed way of specifying a range of IP addresses. Specifying a range determines how many IP addresses are available to you. It begins with a starting IP address and is separated by a forward slash (the “/” character) followed by a number, for example 192.168.1.0/24. The number at the end specifies how many of the bits of the IP address are fixed. In this example, the first 24 bits of the IP address are fixed. The rest are flexible.
32 total bits minus 24 fixed bits leaves 8 flexible bits. Each of these flexible bits can be either 0 or 1, because they are binary. That means that you have two choices for each of the 8 bits, providing 2^8 = 256 IP addresses in that IP range. The higher the number after the /, the smaller the number of IP addresses in your network. For example, a range of 192.168.1.0/24 is smaller than 192.168.1.0/16. When working with networks in the AWS Cloud, you choose your network size by using CIDR notation. In AWS, the smallest IP range you can have is /28, which provides 16 IP addresses. The largest IP range you can have is a /16, which provides 65,536 IP addresses.
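The arithmetic above can be checked with Python's standard ipaddress module. This is a small sketch; the helper names are made up for the example, and the /16 range is written as 192.168.0.0/16 because the module expects the flexible bits of a network address to be zero.

```python
import ipaddress


def flexible_bits(cidr: str) -> int:
    """Number of bits that are not fixed by the prefix length."""
    return 32 - ipaddress.ip_network(cidr).prefixlen


def address_count(cidr: str) -> int:
    """Total IP addresses in the range: 2 ** flexible_bits."""
    return ipaddress.ip_network(cidr).num_addresses


for cidr in ("192.168.1.0/24", "192.168.0.0/16", "10.0.0.0/28"):
    net = ipaddress.ip_network(cidr)
    # First and last address show the span that the notation compresses.
    print(cidr, flexible_bits(cidr), address_count(cidr), net[0], net[-1])
```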
Resources
Virtual Private Cloud
Amazon VPC A virtual private cloud (VPC) is an isolated network that you create in the AWS Cloud, similar to a traditional network in a data center. When you create a VPC, you must choose three main factors:
- Name of the VPC.
- Region where the VPC will live. Each VPC spans multiple Availability Zones within the selected Region.
- IP range for the VPC in CIDR notation. This determines the size of your network. Each VPC can have up to four /16 IP ranges.
Using this information, AWS will provision a network and IP addresses for that network.
Create a subnet After you create your VPC, you must create subnets inside the network. Think of subnets as smaller networks inside your base network – or virtual local area networks (VLANs) in a traditional, on-premises network. In an on-premises network, the typical use case for subnets is to isolate or optimize network traffic. In AWS, subnets are used to provide high availability and connectivity options for your resources. When you create a subnet, you must specify the following:
- VPC you want your subnet to live in. In this case: VPC (10.0.0.0/16)
- Availability Zone you want your subnet to live in. In this case: AZ1
- CIDR block for your subnet, which must be a subset of the VPC CIDR block. In this case: 10.0.0.0/24
When you launch an EC2 instance, you launch it inside a subnet, which will be located inside the Availability Zone you choose.
High availability with a VPC When you create your subnets, keep high availability in mind. To maintain redundancy and fault tolerance, create at least two subnets configured in two Availability Zones. As you learned earlier, remember that “everything fails all of the time.” With the example network, if one of the AZs fails, you will still have your resources available in another AZ as backup.
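A subnet plan can be sanity-checked before anything is created. The sketch below uses the VPC and first subnet from the example; the second subnet (10.0.1.0/24 in AZ2) and the validate_plan function are assumptions added for illustration. It checks three things from the text: each subnet is a subset of the VPC range, subnets do not overlap, and at least two Availability Zones are used.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# Availability Zone label -> subnet CIDR block. AZ2 is a hypothetical addition.
planned_subnets = {
    "AZ1": ipaddress.ip_network("10.0.0.0/24"),
    "AZ2": ipaddress.ip_network("10.0.1.0/24"),
}


def validate_plan(vpc_net, subnets):
    """Return a list of problems found in the plan; an empty list means it is fine."""
    problems = []
    items = list(subnets.items())
    if len(items) < 2:
        problems.append("use at least two subnets in two Availability Zones")
    for az, net in items:
        if not net.subnet_of(vpc_net):
            problems.append(f"{net} ({az}) is not inside {vpc_net}")
    for i, (az_a, net_a) in enumerate(items):
        for az_b, net_b in items[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{net_a} ({az_a}) overlaps {net_b} ({az_b})")
    return problems


print(validate_plan(vpc, planned_subnets))
```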
Reserved IPs For AWS to configure your VPC appropriately, AWS reserves five IP addresses in each subnet. These IP addresses are used for routing, Domain Name System (DNS), and network management. For example, consider a VPC with the IP range 10.0.0.0/22. The VPC includes 1,024 total IP addresses. This is divided into four equal-sized subnets, each with a /24 IP range with 256 IP addresses. Out of each of those IP ranges, there are only 251 IP addresses that can be used because AWS reserves five.
The five reserved IP addresses can impact how you design your network. A common starting place for those who are new to the cloud is to create a VPC with an IP range of /16 and create subnets with an IP range of /24. This provides a large amount of IP addresses to work with at both the VPC and subnet levels.
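The reserved-address arithmetic for the 10.0.0.0/22 example can be reproduced with the standard ipaddress module. This is a sketch only; the constant and function names are made up for the example.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # routing, DNS, and network management, per the text

vpc = ipaddress.ip_network("10.0.0.0/22")

# Divide the VPC range into four equal-sized /24 subnets.
subnets = list(vpc.subnets(new_prefix=24))


def usable_addresses(subnet) -> int:
    """Addresses left for your resources after AWS reserves its five."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


print("VPC total:", vpc.num_addresses)
for subnet in subnets:
    print(subnet, subnet.num_addresses, usable_addresses(subnet))
```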
Gateways
Internet gateway To enable internet connectivity for your VPC, you must create an internet gateway. Think of the gateway as similar to a modem. Just as a modem connects your computer to the internet, the internet gateway connects your VPC to the internet. Unlike your modem at home, which sometimes goes down or offline, an internet gateway is highly available and scalable. After you create an internet gateway, you attach it to your VPC.
Virtual private gateway A virtual private gateway connects your AWS VPC to another private network. Once you create and attach a virtual private gateway to a VPC, the gateway acts as the anchor on the AWS side of the connection. On the other side of the connection, you will need to connect a customer gateway to the other private network. A customer gateway device is a physical device or software application on your side of the connection. Once you have both gateways, you can then establish an encrypted VPN connection between the two sides.
Resources
VPC Routing
Main route table When you create a VPC, AWS creates a route table called the main route table. A route table contains a set of rules, called routes, that are used to determine where network traffic is directed. AWS assumes that when you create a new VPC with subnets, you want traffic to flow between them. Therefore, the default configuration of the main route table is to allow traffic between all subnets in the local network. Below is an example of a main route table. The destination and target are two main parts of this route table.
- The destination is a range of IP addresses where you want your traffic to go. In the example of sending a letter, you need a destination to route the letter to the appropriate place. The same is true for routing traffic. In this case, the destination is the VPC network’s IP range.
- The target is the connection through which to send the traffic. In this case, the traffic is routed through the local VPC network.
Custom route tables While the main route table is used implicitly by subnets that do not have an explicit route table association, you might want to provide different routes on a per-subnet basis, for traffic to access resources outside of the VPC. For example, your application might consist of a front end and a database. You can create separate subnets for the resources and provide different routes for each of them. If you associate a custom route table with a subnet, the subnet will use it instead of the main route table. Each custom route table you create will have the local route already inside it, allowing communication to flow between all resources and subnets inside the VPC. The local route cannot be deleted.
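To make the destination and target idea concrete, the sketch below models a route table as a list of (destination, target) pairs and picks the most specific matching route for an address. The 10.0.0.0/16 VPC range, the "igw-example" target, and the function name are assumptions for illustration; this is a toy model, not an AWS API.

```python
import ipaddress

# Main route table: only the local route, so traffic stays inside the VPC.
main_route_table = [("10.0.0.0/16", "local")]

# Custom route table for a public subnet: local route plus a route to a
# hypothetical internet gateway for everything else.
public_route_table = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")]


def resolve_target(route_table, destination_ip: str):
    """Return the target of the most specific route that matches, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in route_table:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the traffic has nowhere to go
    return max(matches)[1]  # longest prefix wins


print(resolve_target(main_route_table, "10.0.1.25"))
print(resolve_target(public_route_table, "203.0.113.10"))
```

With only the local route, an address outside the VPC range has no matching route, which is why a subnet needs a custom route to reach resources outside the VPC.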
VPC Security
Secure subnets with network access control lists Think of a network access control list (network ACL) as a firewall at the subnet level. A network ACL enables you to control what kind of traffic is allowed to enter or leave your subnet. You can configure this by setting up rules that define what you want to filter. Here’s an example.
The default network ACL, shown in the preceding table, allows all traffic in and out of the subnet. This is a good starting place if you want data to flow freely to the subnet. However, you might want to restrict data at the subnet level. For example, if you have a web application, you might restrict your network to allow only HTTPS traffic and Remote Desktop Protocol (RDP) traffic to your web servers.
Notice that in the preceding network ACL example, you allow inbound 443 and outbound range 1025–65535. That’s because HTTPS uses port 443 to initiate a connection, and the server responds to an ephemeral port on the client. Network ACLs are considered stateless, so you need to include both the inbound and outbound ports used for the protocol. If you don’t include the outbound range, your server would respond but the traffic would never leave the subnet. Since network ACLs are configured by default to allow incoming and outgoing traffic, you don’t need to change their initial settings unless you need additional security layers.
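The stateless behavior can be sketched with a small simulation. This is a simplified model, not how AWS evaluates rules internally: real network ACLs also use rule numbers, protocols, and CIDR ranges, and the rule tuple format and nacl_allows helper here are invented for the example.

```python
# Hypothetical rule format: (direction, low_port, high_port, action).
NACL_RULES = [
    ("inbound", 443, 443, "allow"),       # HTTPS requests coming in
    ("outbound", 1025, 65535, "allow"),   # responses to ephemeral client ports
]

def nacl_allows(direction, port, rules=NACL_RULES):
    """Stateless check: every packet is matched against the rules on its own."""
    for rule_direction, low, high, action in rules:
        if rule_direction == direction and low <= port <= high:
            return action == "allow"
    return False  # anything that matches no rule is denied

request_ok = nacl_allows("inbound", 443)
response_ok = nacl_allows("outbound", 50000)
print(request_ok, response_ok)

# Without the outbound range, the response cannot leave the subnet.
inbound_only = [("inbound", 443, 443, "allow")]
print(nacl_allows("outbound", 50000, inbound_only))
```

Because nothing is remembered between packets, the response is only allowed out when an outbound rule explicitly covers its port.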
Secure EC2 instances with security groups The next layer of security is for your EC2 Instances. Here, you can create a firewall called a security group. The default configuration of a security group blocks all inbound traffic and allows all outbound traffic.
You might be wondering, “Wouldn’t this block all EC2 instances from receiving the response of any customer requests?” Well, security groups are stateful. That means that they will remember if a connection is originally initiated by the EC2 instance or from the outside, and temporarily allow traffic to respond without modifying the inbound rules. If you want your EC2 instance to accept traffic from the internet, you must open up inbound ports. If you have a web server, you might need to accept HTTP and HTTPS requests to allow that type of traffic into your security group. You can create an inbound rule that will allow port 80 (HTTP) and port 443 (HTTPS), as shown.
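As a minimal sketch, the inbound rules for port 80 and port 443 can be written as plain Python data in the shape that boto3's authorize_security_group_ingress call accepts for its IpPermissions parameter. The web_ingress_rules helper and the 0.0.0.0/0 source range are assumptions for illustration, and nothing here contacts AWS.

```python
def web_ingress_rules(cidr="0.0.0.0/0"):
    """Build inbound rules for HTTP (80) and HTTPS (443) from the given CIDR."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}],
        }
        for port in (80, 443)
    ]

rules = web_ingress_rules()
print(rules)

# With boto3 installed and credentials configured, the rules could be applied
# roughly like this (the group ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(GroupId="sg-...", IpPermissions=rules)
```

No outbound rule is needed for the responses, because security groups are stateful.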
You learned in a previous unit that subnets can be used to segregate traffic between computers in your network. Security groups can be used in the same way. A common design pattern is to organize resources into different groups and create security groups for each to control network communication between them.
This example defines three tiers and isolates each tier with defined security group rules. In this case, internet traffic to the Web Tier is allowed over HTTPS, Web Tier to Application Tier traffic is allowed over HTTP, and Application Tier to Database Tier traffic is allowed over MySQL. This is different from traditional on-premises environments, in which you isolate groups of resources via a VLAN configuration. In AWS, security groups allow you to achieve the same isolation without tying it to your network.
Resources
- External Site: AWS: Route Tables
- External Site: AWS: Example Routing Options
- External Site: AWS: Working with Routing Tables
- External Site: AWS: Network ACLs
- External Site: AWS: Security Groups for Your VPC
- External Site: AWS: I Host a Website on an EC2 Instance. How Do I Allow My Users to Connect on HTTP (80) or HTTPS (443)?
Storage
Storage Types
File storage You might be familiar with file storage if you have interacted with file storage systems like Windows File Explorer or Finder on macOS. Files are organized in a tree-like hierarchy that consists of folders and subfolders. For example, if you have hundreds of cat photos on your laptop, you might want to create a folder called Cat photos, and place the images inside that folder to organize them. Since you know these images will be used in an application, you might want to place the Cat photos folder inside another folder called Application files.
Each file has metadata such as file name, file size, and the date the file was created. The file also has a path, for example, computer/Application_files/Cat_photos/cats-03.png. When you need to retrieve a file, your system can use the path to find it in the file hierarchy. File storage is ideal when you require centralized access to files that need to be easily shared and managed by multiple host computers. Typically, this storage is mounted onto multiple hosts, and requires file locking and integration with existing file system communication protocols. Common use cases for file storage include:
- Large content repositories
- Development environments
- User home directories
Block storage While file storage treats files as a singular unit, block storage splits files into fixed-size chunks of data called blocks that have their own addresses. Since each block is addressable, blocks can be retrieved efficiently. When data is requested, the addresses are used by the storage system to organize the blocks in the correct order to form a complete file to present back to the requestor. Outside of the address, no additional metadata is associated with each block. So, when you want to change a character in a file, you just change the block, or the piece of the file, that contains the character. This ease of access is why block storage solutions are fast and use less bandwidth.
Since block storage is optimized for low-latency operations, it is a typical storage choice for high-performance enterprise workloads, such as databases or enterprise resource planning (ERP) systems, that require low-latency storage.
Object storage Objects, much like files, are treated as a single unit of data when stored. However, unlike file storage, these objects are stored in a flat structure instead of a hierarchy. Each object is a file with a unique identifier. This identifier, along with any additional metadata, is bundled with the data and stored. Changing just one character in an object is more difficult than with block storage. When you want to change one character in an object, the entire object must be updated.
With object storage, you can store almost any type of data, and there is no limit to the number of objects stored, which makes it readily scalable. Object storage is generally useful when storing large datasets; unstructured files, like media assets; and static assets, like photos.
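The difference between updating a block and updating an object can be sketched with a toy in-memory model. The 4-byte block size, the helper names, and the sample content are all made up to keep the example small; real storage systems use much larger blocks.

```python
BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable

def to_blocks(data, size=BLOCK_SIZE):
    """Split bytes into fixed-size blocks keyed by block address."""
    return {i // size: data[i:i + size] for i in range(0, len(data), size)}

def block_update(blocks, offset, new_byte, size=BLOCK_SIZE):
    """Change one byte by rewriting only the block that contains it."""
    address = offset // size
    block = bytearray(blocks[address])
    block[offset % size] = new_byte
    blocks[address] = bytes(block)
    return len(blocks[address])  # number of bytes rewritten

def object_update(store, key, offset, new_byte):
    """Change one byte by writing a full replacement object."""
    data = bytearray(store[key])
    data[offset] = new_byte
    store[key] = bytes(data)
    return len(store[key])  # number of bytes rewritten

content = b"cats are great"
blocks = to_blocks(content)
objects = {"cats.txt": content}

print(block_update(blocks, 0, ord("b")))                # rewrites one block
print(object_update(objects, "cats.txt", 0, ord("b")))  # rewrites the whole object
```

Both stores end up holding the same updated content, but the block store touches only one small block while the object store replaces everything.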
EC2 Instance Storage and Amazon Elastic Block Store
Amazon EC2 instance store Amazon EC2 instance store provides temporary block-level storage for an instance. This storage is located on disks that are physically attached to the host computer. This ties the lifecycle of the data to the lifecycle of the EC2 instance. If you delete the instance, the instance store is deleted, as well. Due to this, instance store is considered ephemeral storage. Instance store is ideal if you host applications that replicate data to other EC2 instances, such as Hadoop clusters. For these cluster-based workloads, having the speed of locally attached volumes and the resiliency of replicated data helps you achieve data distribution at high performance. It’s also ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content.
Amazon Elastic Block Store (Amazon EBS) As the name implies, Amazon EBS is a block-level storage device that you can attach to an Amazon EC2 instance. These storage devices are called Amazon EBS volumes. EBS volumes are essentially drives of a user-configured size attached to an EC2 instance, similar to how you might attach an external drive to your laptop. EBS volumes act similarly to external drives in more than one way.
- Most Amazon EBS volumes can only be connected with one computer at a time. Most EBS volumes have a one-to-one relationship with EC2 instances, so they cannot be shared by or attached to multiple instances at one time. (Recently, AWS announced the Amazon EBS multi-attach feature that enables volumes to be attached to multiple EC2 instances at one time. This feature is not available for all instance types, and all instances must be in the same Availability Zone. Read more about this scenario in the EBS documentation.)
- You can detach an EBS volume from one EC2 instance and attach it to another EC2 instance in the same Availability Zone, to access the data on it.
- The external drive is separate from the computer. That means, if an accident occurs and the computer goes down, you still have your data on your external drive. The same is true for EBS volumes.
- You’re limited to the size of the external drive, since it has a fixed limit to how scalable it can be. For example, you might have a 2-TB external drive, which means you can only have 2 TB of content on it. This relates to EBS as well, since a volume also has a max limitation of how much content you can store on it.
Scale Amazon EBS volumes You can scale Amazon EBS volumes in two ways.
- Increase the volume size, as long as it doesn’t increase above the maximum size limit. For EBS volumes, the maximum amount of storage you can have is 16 TB. If you provision a 5-TB EBS volume, you can choose to increase the size of your volume until you get to 16 TB.
- Attach multiple volumes to a single Amazon EC2 instance. EC2 has a one-to-many relationship with EBS volumes. You can add these additional volumes during or after EC2 instance creation to provide more storage capacity for your hosts.
Amazon EBS use cases Amazon EBS is useful when you must retrieve data quickly and have data persist long-term. Volumes are commonly used in the following scenarios.
- Operating systems: Boot/root volumes to store an operating system. The root device for an instance launched from an Amazon Machine Image (AMI) is typically an Amazon EBS volume. These are commonly referred to as EBS-backed AMIs.
- Databases: A storage layer for databases running on Amazon EC2 that rely on transactional reads and writes.
- Enterprise applications: Amazon EBS provides reliable block storage to run business-critical applications.
- Throughput-intensive applications: Applications that perform long, continuous reads and writes.
Amazon EBS volume types Amazon EBS volumes are organized into two main categories – solid-state drives (SSDs) and hard-disk drives (HDDs). SSDs provide strong performance for random input/output (I/O), while HDDs provide strong performance for sequential I/O. AWS offers two types of each.
Amazon EBS benefits Here are the benefits of using Amazon EBS.
- High availability: When you create an EBS volume, it is automatically replicated in its Availability Zone to prevent data loss from single points of failure.
- Data persistence: The storage persists even when your instance doesn’t.
- Data encryption: All EBS volumes support encryption.
- Flexibility: EBS volumes support on-the-fly changes. You can modify volume type, volume size, and input/output operations per second (IOPS) capacity without stopping your instance.
- Backups: Amazon EBS provides the ability to create backups of any EBS volume.
Amazon EBS snapshots Errors happen. One error is not backing up data and then inevitably losing it. To prevent this from happening to you, always back up your data – even in AWS. Since your EBS volumes consist of the data from your Amazon EC2 instance, you should make backups of these volumes, called snapshots. EBS snapshots are incremental backups that only save the blocks on the volume that have changed after your most recent snapshot. For example, if you have 10 GB of data on a volume, and only 2 GB of data have been modified since your last snapshot, only the 2 GB that have been changed are written to Amazon Simple Storage Service (Amazon S3). When you take a snapshot of any of your EBS volumes, the backups are stored redundantly in multiple Availability Zones using Amazon S3. This aspect of storing the backup in Amazon S3 is handled by AWS, so you won’t need to interact with Amazon S3 to work with your EBS snapshots. You manage them in the Amazon EBS console, which is part of the Amazon EC2 console. EBS snapshots can be used to create multiple new volumes, whether they’re in the same Availability Zone or a different one. When you create a new volume from a snapshot, it’s an exact copy of the original volume at the time the snapshot was taken.
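The incremental idea can be sketched in a few lines. This is a toy model, not how EBS is implemented: the volume is a dictionary of ten blocks standing in for 10 GB of data, and take_snapshot is a hypothetical helper that keeps only the blocks that changed since the previous snapshot.

```python
def take_snapshot(volume, previous_state):
    """Return only the blocks that differ from the state at the last snapshot."""
    return {
        address: data
        for address, data in volume.items()
        if previous_state.get(address) != data
    }

# A hypothetical volume with 10 blocks, standing in for 10 GB of data.
volume = {address: f"data-{address}" for address in range(10)}

first = take_snapshot(volume, previous_state={})  # the first snapshot copies everything
state_at_first = dict(volume)

# Modify 2 of the 10 blocks, standing in for 2 GB of changes.
volume[3] = "changed"
volume[7] = "changed"

second = take_snapshot(volume, state_at_first)    # only the changed blocks are saved
print(len(first), len(second))
```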
Object Storage with Amazon Simple Storage Service
Amazon S3
S3 Bucket Policies
- Use JSON format
- Specify what actions are allowed or denied on the bucket
- Can only be placed on buckets
Amazon S3 Unlike Amazon Elastic Block Store (Amazon EBS), Amazon Simple Storage Service (Amazon S3) is a standalone storage solution that isn’t tied to compute. It enables you to retrieve your data from anywhere on the web. If you have used an online storage service to back up the data from your local machine, then you most likely have used a service similar to Amazon S3. The big difference between those online storage services and Amazon S3 is the storage type. Amazon S3 is an object storage service. Object storage stores data in a flat structure, using unique identifiers to look up objects when requested. An object is a file combined with metadata. You can store as many of these objects as you’d like. All of the characteristics of object storage are also characteristics of Amazon S3.
Amazon S3 concepts In Amazon S3, you store your objects in containers called buckets. You can’t upload any object, not even a single photo, to Amazon S3 without creating a bucket first. When you create a bucket, you specify, at the very minimum, two details – the AWS Region you want the bucket to reside in and the bucket name.
To choose a Region, you will typically select a Region that you have used for other resources, such as your compute. When you choose a Region for your bucket, all objects you put inside the bucket will be redundantly stored across multiple devices, across multiple Availability Zones. This level of redundancy is designed to provide Amazon S3 customers with 99.999999999% durability and 99.99% availability for objects over a given year. When you choose a bucket name, it must be unique across all AWS accounts. AWS stops you from choosing a bucket name that has already been chosen by someone else in another AWS account. Once you choose a name, that name is yours and cannot be claimed by anyone else unless you delete the bucket, which then releases the name for others to use. AWS uses the bucket name as part of the object identifier. In S3, each object is identified using a URL, as shown.
In the URL http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.html, for example, the bucket name comes right after the http://. In this example, the bucket is named doc. Then, the identifier uses the s3 service name and the service provider, amazonaws. After that, you have an implied folder inside the bucket called 2006-03-01 and the object inside the folder that is named AmazonS3.html. The object name is often referred to as the key name. You can have folders inside of buckets to help you organize objects. However, remember that no actual file hierarchy supports this on the backend. It is instead a flat structure where all files and folders live at the same level. Using buckets and folders implies a hierarchy, which creates an understandable organization for users.
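To make the parts of the identifier concrete, the sketch below splits a URL of this style into its bucket name and key. The URL is assembled from the pieces described above (bucket doc, implied folder 2006-03-01, object AmazonS3.html), and split_s3_url is a hypothetical helper that only handles this bucket-first URL layout.

```python
from urllib.parse import urlparse

def split_s3_url(url):
    """Split a bucket-first S3 URL into (bucket, key)."""
    parsed = urlparse(url)
    bucket = parsed.netloc.split(".s3.")[0]  # text before ".s3." is the bucket name
    key = parsed.path.lstrip("/")            # the rest of the path is the key
    return bucket, key

bucket, key = split_s3_url("http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.html")
print(bucket)
print(key)
```

The key keeps the folder prefix as part of one flat string, which matches the point that folders are only implied.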
Amazon S3 use cases Amazon S3 is a widely used storage service, with far more use cases than could fit on one screen. The following list summarizes some of the most common ways you can use Amazon S3:
- Backup and storage: Amazon S3 is a natural place to back up files because it is highly redundant. As mentioned in the last unit, AWS stores your EBS snapshots in S3 to take advantage of its high availability.
- Media hosting: Because you can store unlimited objects, and each individual object can be up to 5 TB, Amazon S3 is an ideal location to host video, photo, and music uploads.
- Software delivery: You can use Amazon S3 to host your software applications that customers can download.
- Data lakes: Amazon S3 is an optimal foundation for a data lake because of its virtually unlimited scalability. You can increase storage from gigabytes to petabytes of content, paying only for what you use.
- Static websites: You can configure your S3 bucket to host a static website of HTML, CSS, and client-side scripts.
- Static content: Because of the limitless scaling, the support for large files, and the fact that you access any object over the web at any time, Amazon S3 is the perfect place to store static content.
Choose the right connectivity option for resources Everything in Amazon S3 is private by default. This means that all S3 resources, such as buckets, folders, and objects can only be viewed by the user or AWS account that created that resource. Amazon S3 resources are all private and protected to begin with. If you decide that you want everyone on the internet to see your photos, you can choose to make your buckets, folders, and objects public. A public resource means that everyone on the internet can see it. Most of the time, you don’t want your permissions to be all or nothing. Typically, you want to be more granular about the way you provide access to your resources.
To be more specific about who can do what with your Amazon S3 resources, Amazon S3 provides two main access management features – IAM policies and S3 bucket policies.
IAM policies Previously, you learned about creating and using IAM policies. Now, you can apply that knowledge to Amazon S3. When IAM policies are attached to IAM users, groups, and roles, the policies define which actions they can perform. IAM policies are not tied to any one AWS service and can be used to define access to nearly any AWS action. You should use IAM policies for private buckets in the following two scenarios:
- You have many buckets with different permission requirements. Instead of defining many different S3 bucket policies, you can use IAM policies.
- You want all policies to be in a centralized location. Using IAM policies allows you to manage all policy information in one location.
S3 bucket policies Like IAM policies, Amazon S3 bucket policies are defined in a JSON format. The difference is IAM policies are attached to users, groups, and roles, whereas S3 bucket policies are only attached to S3 buckets. S3 bucket policies specify what actions are allowed or denied on the bucket. For example, if you have a bucket called employeebucket, you can attach an S3 bucket policy to it that allows another AWS account to put objects in that bucket. Or if you wanted to allow anonymous viewers to read the objects in employeebucket, you can apply a policy to that bucket that allows anyone to read its objects by setting "Effect": "Allow" for "Action": ["s3:GetObject"]. S3 bucket policies can only be placed on buckets, and cannot be used for folders or objects. However, the policy that is placed on the bucket applies to every object in that bucket. You should use S3 bucket policies in the following scenarios:
- You need a simple way to do cross-account access to S3, without using IAM roles.
- Your IAM policies bump up against the defined size limit. S3 bucket policies have a larger size limit.
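A minimal sketch of the anonymous-read policy for employeebucket, built as a Python dictionary and printed as JSON. The public_read_policy helper and the PublicRead statement ID are made up for illustration, and the example only builds the document without attaching it to a bucket.

```python
import json

def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",   # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",      # anonymous viewers
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",  # every object in the bucket
            }
        ],
    }

policy = public_read_policy("employeebucket")
print(json.dumps(policy, indent=2))
```

The Resource ends in /* because the statement targets the objects inside the bucket, even though the policy itself is attached to the bucket.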
Amazon S3 encryption Amazon S3 supports encryption in transit (as data travels to and from Amazon S3) and at rest. To protect data at rest, you can use encryption, as follows:
- Server-side encryption: This allows Amazon S3 to encrypt your object before saving it on disks in its data centers and then decrypt it when you download the objects.
- Client-side encryption: You can encrypt your data client-side and then upload the encrypted data to Amazon S3. In this case, you manage the encryption process, the encryption keys, and all related tools.
To encrypt in transit, you can use client-side encryption or Secure Sockets Layer (SSL).
Amazon S3 versioning As described earlier, Amazon S3 identifies objects in part by using the object name. For example, when you upload an employee photo to Amazon S3, you might name the object employee.jpg and store it in a folder called employees. If you don’t use Amazon S3 versioning, every time you upload an object called employee.jpg to the employees folder, it will overwrite the original file. This can be an issue for several reasons, including the following:
- The employee.jpg file name is a common name for an employee photo object. You or someone else who has access to the bucket might not have intended to overwrite it, but once it’s overwritten, the original file can’t be accessed.
- You might want to preserve different versions of employee.jpg. Without versioning, if you wanted to create a new version of employee.jpg, you would need to upload the object and choose a different name for it. Having several objects all with slight differences in naming variations can cause confusion and clutter in S3 buckets.
To counteract these issues, you can use S3 versioning. Versioning keeps multiple versions of a single object in the same bucket. This preserves old versions of an object without using different names, which helps with file recovery from accidental deletions, accidental overwrites, or application failures.
If you enable versioning for a bucket, Amazon S3 automatically generates a unique version ID for the object. In one bucket, for example, you can have two objects with the same key, but different version IDs, such as employeephoto.gif (version 111111) and employeephoto.gif (version 121212). Versioning-enabled buckets let you recover objects from accidental deletion or overwrite.
- Deleting an object does not remove the object permanently. Instead, Amazon S3 puts a marker on the object that shows you tried to delete it. If you want to restore the object, you can remove the marker, and it reinstates the object.
- If you overwrite an object, it results in a new object version in the bucket. You still have access to previous versions of the object.
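The two behaviors above can be sketched with a toy in-memory bucket. This is only a model of the idea: the VersionedBucket class, its method names, and the counter-based version IDs are invented for the example and do not mirror the S3 API.

```python
import itertools

class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, data)
        self._ids = itertools.count(1)  # made-up version IDs: 1, 2, 3, ...

    def put(self, key, data):
        version_id = next(self._ids)
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        # A delete adds a marker (data=None) instead of removing anything.
        return self.put(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            return dict(versions).get(version_id)
        return versions[-1][1] if versions else None

    def remove_delete_marker(self, key):
        versions = self._versions.get(key, [])
        if versions and versions[-1][1] is None:
            versions.pop()

bucket = VersionedBucket()
v1 = bucket.put("employee.jpg", b"first photo")
v2 = bucket.put("employee.jpg", b"second photo")  # the overwrite keeps v1 around
bucket.delete("employee.jpg")
print(bucket.get("employee.jpg"))                 # hidden by the delete marker
bucket.remove_delete_marker("employee.jpg")
print(bucket.get("employee.jpg"))                 # the latest version is back
print(bucket.get("employee.jpg", version_id=v1))  # the older version is still readable
```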
Versioning states Buckets can be in one of the following three states:
- Unversioned (default): Neither new nor existing objects in the bucket have a version.
- Versioning-enabled: Versioning is enabled for all objects in the bucket.
- Versioning-suspended: Versioning is suspended for new objects. New objects added to the bucket will not receive a version, but all existing objects keep their object versions.
The versioning state applies to all objects in the bucket. Storage costs are incurred for all objects in your bucket, including all versions. To reduce your Amazon S3 bill, you might want to delete previous versions of your objects once they are no longer needed.
Six Amazon S3 storage classes When you upload an object to Amazon S3 and you don’t specify the storage class, you upload it to the default storage class – often referred to as standard storage. In previous lessons, you learned about the Amazon S3 standard storage class without even knowing it! Amazon S3 storage classes let you change your storage tier when your data characteristics change. For example, if you are accessing your old photos infrequently, you might want to change the storage class for the photos to save costs.
- Amazon S3 Standard
- This is considered general purpose storage for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics
- Amazon S3 Intelligent-Tiering
- This tier is useful if your data has unknown or changing access patterns. S3 Intelligent-Tiering stores objects in two tiers - a frequent access tier and an infrequent access tier. Amazon S3 monitors access patterns of your data and automatically moves your data to the most cost-effective storage tier based on frequency of access
- Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
- This tier is for data that is accessed less frequently but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per-GB storage price and per-GB retrieval fee. This storage tier is ideal if you want to store long-term backups, disaster recovery files, and so on.
- Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
- Unlike other S3 storage classes that store data in a minimum of 3 AZs, S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. One Zone-IA is ideal for customers who want a lower-cost option for infrequently accessed data but do not require the availability and resilience of S3 Standard or S3 Standard-IA. It’s a good choice for storing secondary backup copies of on-premises data or easily re-creatable data.
- Amazon S3 Glacier
- S3 Glacier is a secure, durable, and low-cost storage class for data archiving. You can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low yet suitable for varying needs, S3 Glacier provides three retrieval options that range from a few minutes to hours.
- Amazon S3 Glacier Deep Archive
- S3 Glacier Deep Archive is the lowest-cost Amazon S3 storage class, and supports long-term retention and digital preservation for data that might be accessed once or twice a year. It is designed for customers - particularly those in highly regulated industries, such as the financial services, healthcare, and public sectors - that retain data sets for 7 to 10 years, or longer, to meet regulatory compliance requirements.
Automate tier transitions with object lifecycle management If you keep manually changing your objects, such as your employee photos, from storage tier to storage tier, you might want to automate the process with a lifecycle policy. When you define a lifecycle policy configuration for an object or group of objects, you can choose to automate two actions – transition and expiration actions.
- Transition actions define when objects should transition to another storage class.
- Expiration actions define when objects expire and should be permanently deleted.
For example, you might transition objects to S3 Standard-IA storage class 30 days after you create them, or archive objects to the S3 Glacier storage class one year after creating them.
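A lifecycle rule with both kinds of action might look like the sketch below, written in the shape the S3 lifecycle configuration API uses for a single rule. The rule ID, the employees/ prefix, and the 730-day expiration are made-up values, and storage_class_at is a hypothetical helper that only simulates what the rule would do to an object of a given age.

```python
# One rule in the shape used by the S3 lifecycle configuration API.
# The ID, prefix, and 730-day expiration are assumptions for illustration.
LIFECYCLE_RULE = {
    "ID": "archive-employee-photos",
    "Filter": {"Prefix": "employees/"},
    "Status": "Enabled",
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},  # transition action
        {"Days": 365, "StorageClass": "GLACIER"},     # transition action
    ],
    "Expiration": {"Days": 730},                      # expiration action
}

def storage_class_at(age_days, rule=LIFECYCLE_RULE):
    """Return the storage class for an object of this age, or None if expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current

for age in (10, 45, 400, 800):
    print(age, storage_class_at(age))

# With boto3, a rule like this could be applied roughly as follows
# (the bucket name is a placeholder):
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="...", LifecycleConfiguration={"Rules": [LIFECYCLE_RULE]})
```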
The following use cases are good candidates for lifecycle management:
- Periodic logs: If you upload periodic logs to a bucket, your application might need them for a week or a month. After that, you might want to delete them.
- Data that changes in access frequency: Some documents are frequently accessed for a limited period of time. After that, they are infrequently accessed. At some point, you might not need real-time access to them, but your organization or regulations might require you to archive them for a specific period. After that, you can delete them.
Storage Service Review
Amazon EC2 instance store Instance store is ephemeral block storage. This is preconfigured storage that exists on the same physical server that hosts the EC2 instance and cannot be detached from Amazon EC2. You can think of it as a built-in drive for your EC2 instance. Instance store is generally well-suited for temporary storage of information that is constantly changing, such as buffers, caches, and scratch data. It is not meant for data that is persistent or long-lasting. If you need persistent long-term block storage that can be detached from Amazon EC2 and provide you more management flexibility, such as increasing volume size or creating snapshots, then you should use Amazon EBS.
Amazon EBS Amazon EBS is meant for data that changes frequently and needs to persist through instance stops, terminations, or hardware failures. Amazon EBS has two types of volumes – SSD-backed volumes and HDD-backed volumes. SSD-backed volumes have the following characteristics:
- Performance depends on IOPS (input/output operations per second).
- Ideal for transactional workloads, such as databases and boot volumes.
HDD-backed volumes have the following characteristics:
- Performance depends on MB/s.
- Ideal for throughput-intensive workloads, such as big data, data warehouses, log processing, and sequential data I/O.
Here are a few important features of Amazon EBS that you need to know when comparing it to other services.
- It is block storage.
- You pay for what you provision (you have to provision storage in advance).
- EBS volumes are replicated across multiple servers in a single Availability Zone.
- Most EBS volumes can only be attached to a single EC2 instance at a time.
Amazon S3 If your data doesn’t change that often, Amazon S3 might be a cost-effective and scalable storage solution for you. Amazon S3 is ideal for storing static web content and media, backups and archiving, and data for analytics. It can also host entire static websites with custom domain names. Here are a few important features of Amazon S3 to know about when comparing it to other services:
- It is object storage.
- You pay for what you use (you don’t have to provision storage in advance).
- Amazon S3 replicates your objects across multiple Availability Zones in a Region.
- Amazon S3 is not storage attached to compute.
Amazon Elastic File System (Amazon EFS) and Amazon FSx In this module, you’ve already learned about Amazon S3 and Amazon EBS. You learned that S3 uses a flat namespace and isn’t meant to serve as a standalone file system. You also learned most EBS volumes can only be attached to one EC2 instance at a time. So, if you need file storage on AWS, which service should you use? For file storage that can mount on to multiple EC2 instances, you can use Amazon Elastic File System (Amazon EFS) or Amazon FSx. The following table provides more information about each service.
Here are a few important features of Amazon EFS and Amazon FSx to know about when comparing them to other services:
- They are file storage services.
- You pay for what you use (you don’t have to provision storage in advance).
- Amazon EFS and Amazon FSx can be mounted onto multiple EC2 instances.
Relational Databases
Relational databases A relational database organizes data into tables. Data in one table can be linked to data in other tables to create relationships – hence, the relational part of the name. A table stores data in rows and columns. A row, often called a record, contains all information about a specific entry. Columns describe attributes of an entry. Here’s an example of three tables in a relational database.
This shows a table for books, a table for sales, and a table for authors. In the books table, each row includes the book ISBN, title, author, and format. Each of these attributes is stored in its own column. The books table has something in common with the other two tables: the author attribute. That common column creates a relationship between the tables. The tables, rows, columns, and relationships between them are referred to as a logical schema. With relational databases, a schema is fixed. Once the database is operational, it becomes difficult to change the schema. This requires most of the data modeling to be done upfront before the database is active.
Relational database management system A relational database management system (RDBMS) lets you create, update, and administer a relational database. Here are some common examples of relational database management systems:
- MySQL
- PostgreSQL
- Oracle
- SQL Server
- Amazon Aurora
You communicate with an RDBMS by using Structured Query Language (SQL) queries, similar to the following example: SELECT * FROM table_name. This query selects all the data from a particular table. However, the real power of SQL queries is in creating more complex queries that help you pull data from several tables to piece together patterns and answers to business problems. For example, you can query the sales table and the books table together to see sales in relation to an author’s books. This is made possible by a join.
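A join like this can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration: the column names beyond ISBN, title, author, and format, along with every row of sample data, are made up, and SQLite stands in for whichever RDBMS you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE authors (author TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT, format TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, isbn TEXT, author TEXT, copies INTEGER);

    INSERT INTO authors VALUES ('A. Writer', 'US'), ('B. Author', 'UK');
    INSERT INTO books VALUES
        ('111', 'First Book', 'A. Writer', 'Paperback'),
        ('222', 'Second Book', 'A. Writer', 'Hardcover'),
        ('333', 'Third Book', 'B. Author', 'Paperback');
    INSERT INTO sales (isbn, author, copies) VALUES
        ('111', 'A. Writer', 5),
        ('222', 'A. Writer', 3),
        ('333', 'B. Author', 7),
        ('111', 'A. Writer', 2);
""")

# Join sales to books, then total the copies sold for each author.
rows = conn.execute("""
    SELECT books.author, SUM(sales.copies) AS total_copies
    FROM sales
    JOIN books ON books.isbn = sales.isbn
    GROUP BY books.author
    ORDER BY books.author
""").fetchall()

print(rows)
conn.close()
```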
Relational database benefits Relational databases offer a number of benefits, including the following:
- Joins: You can join tables, enabling you to better understand relationships between your data.
- Reduced redundancy: You can store data in one table and reference it from other tables instead of saving the same data in different places.
- Familiarity: Relational databases have been a popular choice since the 1970s. Due to this popularity, technical professionals often have familiarity and experience with this type of database.
- Accuracy: Relational databases ensure that your data is persisted with high integrity and adheres to the atomicity, consistency, isolation, durability (ACID) principle.
Relational database use cases
Much of the world runs on relational databases. In fact, they’re at the core of many mission-critical applications, some of which you might use in your day-to-day life. Here are some common use cases for relational databases.
- Applications that have a solid schema that doesn’t change often, such as lift-and-shift applications, which are moved from on-premises to the cloud with little or no modification.
- Applications that need persistent storage that follows the ACID principle, such as:
- Enterprise resource planning (ERP) applications
- Customer relationship management (CRM) applications
- Commerce and financial applications
Choose between unmanaged and managed databases
If you want to run a relational database on AWS, you first need to select how you want to run it: managed or unmanaged. The paradigm of managed versus unmanaged services is similar to the shared responsibility model. The shared responsibility model distinguishes between AWS security responsibilities and the customer’s security responsibilities. Similarly, managed versus unmanaged can be understood as a tradeoff between convenience and control.
On-premises database
If you operate a relational database on-premises (in your own data center), you are responsible for all aspects of operation, including the data center’s security and electricity, the host machine’s management, database management, query optimization, and customer data management. You are responsible for absolutely everything, which means you have control over absolutely everything.
Unmanaged database
Now, suppose you want to shift some of the work to AWS by running your relational database on Amazon EC2. If you host a database on Amazon EC2, AWS takes care of implementing and maintaining the physical infrastructure and hardware, and installing the operating system of the EC2 instance. However, you would still be responsible for managing the EC2 instance, managing the database on that host, optimizing queries, and managing customer data. This is referred to as an unmanaged database option. In this option, AWS is responsible for and has control over the hardware and underlying infrastructure, and you are responsible and have control over management of the host and database.
Managed database
To shift more of the work to AWS, you can use a managed database service. These services provide the setup of both the EC2 instance and the database, and they provide systems for high availability, scalability, patching, and backups. However, in this model, you’re still responsible for database tuning, query optimization, and of course, ensuring that your customer data is secure. This option provides the ultimate convenience but the least amount of control compared to the two previous options.
RDS
Amazon RDS
Amazon Relational Database Service (Amazon RDS) lets customers create and manage relational databases in the cloud without the operational burden of traditional database management. For example, if you sell healthcare equipment and your goal is to be the number-one seller in the Pacific Northwest, building a database doesn’t directly help you achieve that goal, although having a database is necessary to achieve it. Amazon RDS offloads some of the undifferentiated work of creating and managing a database. You can focus on the tasks that differentiate your application instead of on infrastructure-related tasks like provisioning, patching, scaling, and restoring. Amazon RDS supports most of the popular relational database management systems, including commercial options, open source options, and an AWS-specific option. The supported Amazon RDS engines are:
- Commercial: Oracle, SQL Server
- Open Source: MySQL, PostgreSQL, MariaDB
- Cloud Native: Amazon Aurora
The cloud native option, Amazon Aurora, is a MySQL- and PostgreSQL-compatible database built for the cloud. AWS positions it as more durable, more available, and faster than the Amazon RDS versions of MySQL and PostgreSQL. To learn more about Amazon Aurora, view the Amazon Aurora FAQs.
DB instances
Just like the databases that you build and manage yourself, Amazon RDS is built on compute and storage. The compute portion is called the DB (database) instance, which runs the database engine. The supported features and configurations depend on the engine you choose for the DB instance. A DB instance can contain multiple databases with the same engine, and each database can contain multiple tables. Underneath the DB instance is an EC2 instance. However, this instance is managed through the Amazon RDS console instead of the Amazon EC2 console. When you create your DB instance, you choose the instance type and size. Amazon RDS supports the following three instance families:
- Standard, which includes general-purpose instances
- Memory Optimized, which is optimized for memory-intensive applications
- Burstable Performance, which provides a baseline performance level, with the ability to burst to full CPU usage
The DB instance you choose affects how much processing power and memory it has. The available options depend on the selected engine. You can find more information about DB instance types in the Resources section. Much like a regular EC2 instance, a DB instance uses Amazon Elastic Block Store (EBS) volumes as its storage layer. You can choose from the following Amazon EBS volume storage types:
- General purpose (SSD)
- Provisioned IOPS (SSD)
- Magnetic storage (not recommended)
Amazon RDS in an Amazon Virtual Private Cloud
When you create a DB instance, you select the Amazon Virtual Private Cloud (VPC) that your databases will live in. Then, you select the subnets that you want the DB instances to be placed in. This set of subnets is referred to as a DB subnet group. To create a DB subnet group, you specify the following:
- Availability Zones (AZs) that include the subnets you want to add
- Subnets in the AZ where your DB instances are placed
The subnets you add should be private, so they don’t have a route to the internet gateway. This ensures that your DB instance, and the data inside of it, can only be reached by the app backend. Access to the DB instance can be further restricted by using network access control lists (network ACLs) and security groups. With these firewalls, you can control, at a granular level, the type of traffic you want to allow into your database. Using these controls provides layers of security for your infrastructure and reinforces that only the backend instances have access to the database.
Secure Amazon RDS with AWS Identity and Access Management (IAM)
Network ACLs and security groups help users dictate the flow of traffic. If you want to restrict the actions and resources others can access, you can use IAM policies.
Backup data
You don’t want to lose your data. To take regular backups of your RDS instance, you can use:
- Automatic backups
- Manual snapshots
Automatic backups
Automated backups are turned on by default. They back up your entire DB instance (not just individual databases on the instance) and your transaction logs. When you create your DB instance, you set a backup window, which is the period of time during which automatic backups occur. Typically, you want to set the window during a time when your database experiences little activity, because the backup can cause increased latency and downtime.
You can retain your automated backups between 0 and 35 days. You might ask yourself, “Why set automated backups for 0 days?” The 0 days setting actually disables automatic backups. If you set it to 0, it will also delete all existing automated backups. This is not ideal.
The benefit of having automated backups is the ability to do point-in-time recovery. Point-in-time recovery creates a new DB instance using data restored from a specific point in time. This restoration method provides more granularity by restoring the full backup and then replaying the transaction logs up to the specified time.
Manual snapshots
If you want to keep your backups longer than 35 days, use manual snapshots. Manual snapshots are similar to taking Amazon EBS snapshots, except that you manage them in the Amazon RDS console. These are backups that you can initiate at any time. They exist until you delete them. For example, to meet a compliance requirement that mandates you to keep database backups for a year, you would need to use manual snapshots. If you restore data from a manual snapshot, it creates a new DB instance using the data from the snapshot.
Backup options
It is advisable to deploy both options. Automated backups are beneficial for point-in-time recovery. Manual snapshots allow you to retain backups for longer than 35 days.
Redundancy with Amazon RDS Multi-AZ
When you enable Amazon RDS Multi-AZ, Amazon RDS creates a redundant copy of your database in another AZ. You end up with two copies of your database: a primary copy in a subnet in one AZ and a standby copy in a subnet in a second AZ. The primary copy of your database provides access to your data so that applications can query and display the information. The data in the primary copy is synchronously replicated to the standby copy. The standby copy is not considered an active database, and it does not get queried by applications.
To improve availability, Amazon RDS Multi-AZ ensures that you have two copies of your database running and that one of them is in the primary role. If an availability issue arises, such as the primary database losing connectivity, Amazon RDS triggers an automatic failover. When you create a DB instance, a Domain Name System (DNS) name is provided. AWS uses that DNS name to fail over to the standby database. In an automatic failover, the standby database is promoted to the primary role, and queries are redirected to the new primary database. To ensure that you don’t lose the Multi-AZ configuration, a new standby database is created by either:
- Demoting the previous primary to standby if it’s still up and running
- Standing up a new standby DB instance
The Multi-AZ configuration is the reason you can select multiple subnets for an Amazon RDS database. Make sure that the subnets for your primary and standby copies are in different AZs.
Resources
DynamoDB
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. DynamoDB lets you offload the administrative burdens of operating and scaling a distributed database so that you don’t have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. With DynamoDB, you can create database tables that can store and retrieve any amount of data and serve any level of request traffic. You can scale up or scale down your tables’ throughput capacity without downtime or performance degradation. You can use the AWS Management Console to monitor resource usage and performance metrics. DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability.
Amazon DynamoDB core components
In DynamoDB, tables, items, and attributes are the core components that you work with. A table is a collection of items, and each item is a collection of attributes. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. The following are the basic DynamoDB components:
- Tables – Similar to other database systems, DynamoDB stores data in tables. A table is a collection of data. For instance, you could have a table called People that you could use to store personal contact information about friends, family, or anyone else of interest. You could also have a Cars table to store information about vehicles that people drive.
- Items – Each table contains zero or more items. An item is a group of attributes that is uniquely identifiable among all the other items. In a People table, each item represents a person. In a Cars table, each item represents one vehicle. Items in DynamoDB are similar in many ways to rows, records, or tuples in other database systems. In DynamoDB, there is no limit to the number of items you can store in a table.
- Attributes – Each item is composed of one or more attributes. An attribute is a fundamental data element, something that does not need to be broken down any further. For example, an item in a People table might contain attributes called PersonID, LastName, FirstName, and so on. In a Department table, an item might have attributes such as DepartmentID, Name, Manager, and so on. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.
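The relationship between tables, items, attributes, and the primary key can be sketched with a toy in-memory stand-in. This is not the DynamoDB service or its SDK; the ToyTable class and the sample attribute values are hypothetical and exist only to illustrate the concepts.

```python
from typing import Any, Dict, Optional


class ToyTable:
    """A toy, in-memory stand-in for a table (not the real DynamoDB API)."""

    def __init__(self, name: str, key_attribute: str) -> None:
        self.name = name
        self.key_attribute = key_attribute  # the primary key attribute
        self._items: Dict[Any, Dict[str, Any]] = {}

    def put_item(self, item: Dict[str, Any]) -> None:
        # Every item must carry the primary key; other attributes may vary per item.
        key = item[self.key_attribute]
        self._items[key] = dict(item)  # same key replaces the existing item

    def get_item(self, key: Any) -> Optional[Dict[str, Any]]:
        return self._items.get(key)

    def item_count(self) -> int:
        return len(self._items)


people = ToyTable("People", key_attribute="PersonID")
people.put_item({"PersonID": 101, "LastName": "Smith", "FirstName": "Fred"})
# Items in the same table do not need the same set of attributes.
people.put_item({"PersonID": 102, "LastName": "Jones", "FavoriteColor": "blue"})

print(people.get_item(101))
print(people.item_count())
```

The primary key is the only attribute every item is required to have, which is what lets items in one table differ in shape.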
Amazon DynamoDB security
DynamoDB also offers encryption at rest, which eliminates the operational burden and complexity involved in protecting sensitive data. For more information, see DynamoDB Encryption at Rest.
Choose the Right Database Service
AWS database services
AWS has a variety of database options for different use cases. The following table provides a quick look at the AWS database portfolio.
Breaking up applications and databases
As the industry changes, applications and databases change too. Today, a larger application is rarely supported by just one database. Instead, applications are broken into smaller services, each with its own purpose-built database. This shift removes the idea of a one-size-fits-all database and replaces it with a complementary database strategy. You can give each database the appropriate functionality, performance, and scale that the workload requires.
Monitoring, Optimization, and Serverless
Application Management
Monitoring
Purpose of monitoring
When operating a website like the Employee Directory Application on AWS, you might have questions like:
- How many people are visiting my site day to day?
- How can I track the number of visitors over time?
- How will I know if the website is having performance or availability issues?
- What happens if my Amazon Elastic Compute Cloud (EC2) instance runs out of capacity?
- Will I be alerted if my website goes down?
You need a way to collect and analyze data about the operational health and usage of your resources. The act of collecting, analyzing, and using data to make decisions or answer questions about your IT resources and systems is called monitoring. Monitoring provides a near real-time pulse on your system and helps answer the questions listed above. You can use the data you collect to watch for operational issues caused by events like overuse of resources, application flaws, resource misconfiguration, or security-related events. Think of the data collected through monitoring as outputs of the system, or metrics.
Use metrics to solve problems
The AWS resources that host your solutions create various forms of data that you might be interested in collecting. Each individual data point that a resource creates is a metric. Metrics that are collected and analyzed over time become statistics, such as average CPU utilization over time showing a spike. One way to evaluate the health of an Amazon EC2 instance is through CPU utilization. Generally speaking, if an EC2 instance has a high CPU utilization, it can mean a flood of requests. Or, it can reflect a process that has encountered an error and is consuming too much of the CPU. When analyzing CPU utilization, look for utilization that exceeds a specific threshold for an unusual length of time. Use that abnormal event as a cue to either manually or automatically resolve the issue through actions like scaling the instance. This is one example of a metric. Other examples of metrics EC2 instances have are network utilization, disk performance, memory utilization, and the logs created by the applications running on top of Amazon EC2.
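To make the step from individual data points to statistics concrete, here is a small sketch in plain Python. The summarize function, the 5-minute period, and the sample CPU values are assumptions for illustration; this is not how any AWS service is implemented.

```python
from statistics import mean


def summarize(datapoints, period_seconds):
    """Group (timestamp_seconds, value) pairs into fixed periods and compute statistics."""
    buckets = {}
    for timestamp, value in datapoints:
        # Align each data point to the start of its period.
        period_start = timestamp - (timestamp % period_seconds)
        buckets.setdefault(period_start, []).append(value)
    return {
        start: {"Average": mean(values), "Maximum": max(values), "SampleCount": len(values)}
        for start, values in sorted(buckets.items())
    }


# Hypothetical CPU utilization samples: (seconds since start, percent).
cpu_samples = [(0, 10.0), (60, 20.0), (300, 90.0), (360, 70.0)]
stats = summarize(cpu_samples, period_seconds=300)

for start, values in stats.items():
    print(start, values)
```

In this sample the second period averages 80%, the kind of spike the paragraph above describes.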
Types of metrics
Different resources in AWS create different types of metrics. An Amazon Simple Storage Service (Amazon S3) bucket would not have CPU utilization like an EC2 instance does. Instead, Amazon S3 creates metrics related to the objects stored in a bucket, like the overall size or the number of objects in a bucket. Amazon S3 also has metrics related to the requests made to the bucket, such as reading or writing objects. Amazon Relational Database Service (Amazon RDS) creates metrics such as database connections, CPU utilization of an instance, or disk space consumption. This is not a complete list for any of the services mentioned, but you can see how different resources create different metrics. You could be interested in a wide variety of metrics depending on your resources, goals, and questions.
Monitoring benefits
Monitoring gives you visibility into your resources, but the question now is, “Why is that important?” This section describes some of the benefits of monitoring.
- Respond to operational issues proactively before your end users are aware of them: Waiting for end users to let you know when your application is experiencing an outage is a bad practice. Through monitoring, you can keep tabs on metrics like error response rate and request latency. Over time, the metrics help signal when an outage is going to occur. This enables you to automatically or manually perform actions to prevent the outage from happening, and fix the problem before your end users are aware of it.
- Improve the performance and reliability of your resources: Monitoring the various resources that comprise your application provides you with a full picture of how your solution behaves as a system. Monitoring, if done well, can illuminate bottlenecks and inefficient architectures. This helps you drive performance and improve reliability.
- Recognize security threats and events: When you monitor resources, events, and systems over time, you create what is called a baseline. A baseline defines what activity is normal. Using a baseline, you can spot anomalies like unusual traffic spikes or unusual IP addresses accessing your resources. When an anomaly occurs, an alert can be sent out or an action can be taken to investigate the event.
- Make data-driven decisions for your business: Monitoring keeps an eye on IT operational health and drives business decisions. For example, suppose you launched a new feature for your cat photo app and now you want to know if it’s being used. You can collect application-level metrics and view the number of users who use the new feature. With your findings, you can decide whether to invest more time into improving the new feature.
- Create more cost-effective solutions: Through monitoring, you can view resources that are being underused and rightsize your resources to your usage. This helps you optimize cost and make sure you aren’t spending more money than necessary.
Visibility
AWS resources create data that you can monitor through metrics, logs, network traffic, events, and more. This data comes from components that are distributed in nature, which can lead to difficulty in collecting the data you need if you don’t have a centralized place to review it all. AWS has done that for you with a service called Amazon CloudWatch. Amazon CloudWatch is a monitoring and observability service that collects data like those mentioned in this module. CloudWatch provides actionable insights into your applications, and enables you to respond to system-wide performance changes, optimize resource usage, and get a unified view of operational health. You can use CloudWatch to:
- Detect anomalous behavior in your environments
- Set alarms to alert you when something is not right
- Visualize logs and metrics with the AWS Management Console
- Take automated actions like scaling
- Troubleshoot issues
- Discover insights to keep your applications healthy
CloudWatch
How CloudWatch works
With CloudWatch, all you need to get started is an AWS account. It is a managed service that you can use for monitoring, without managing the underlying infrastructure. The Employee Directory app is built with various AWS services working together as building blocks. Monitoring the individual services independently could be challenging. CloudWatch acts as a centralized place where metrics are gathered and analyzed. You already learned how EC2 instances post CPU utilization as a metric to CloudWatch. Different AWS resources post different metrics that you can monitor. You can view a list of services that send metrics to CloudWatch in the Resources section.
Many AWS services send metrics to CloudWatch automatically and for free, at a rate of one data point per metric per 5-minute interval. This gives you visibility into your systems without any extra cost. This is known as basic monitoring. For many applications, basic monitoring is adequate. For applications running on EC2 instances, you can get more granularity by posting metrics every minute instead of every 5 minutes, using a feature called detailed monitoring. Detailed monitoring incurs a fee. You can read about pricing on the CloudWatch Pricing Page linked in the Resources section.
CloudWatch metrics
Each metric in CloudWatch has a timestamp and is organized into containers called namespaces. Metrics in different namespaces are isolated from each other, so you can think of them as belonging to different categories. AWS services that send data to CloudWatch attach dimensions to each metric. A dimension is a name/value pair that is part of the metric’s identity. You can use dimensions to filter the results that CloudWatch returns. For example, you can get statistics for a specific EC2 instance by specifying the InstanceId dimension when you search.
Custom metrics
Suppose you have an application and you want to record the number of page views your website gets. How would you record this metric with CloudWatch? First, it’s an application-level metric. That means it’s not something the EC2 instance would post to CloudWatch by default. This is where custom metrics come in. Custom metrics allow you to publish your own metrics to CloudWatch. If you want to gain more granular visibility, you can use high-resolution custom metrics, which enable you to collect custom metrics down to a 1-second resolution. This means that you can send one data point per second per custom metric. Following are some other examples of custom metrics:
- Web page load times
- Request error rates
- Number of processes or threads on your instance
- Amount of work performed by your application
You can get started with custom metrics by programmatically sending the metric to CloudWatch using the PutMetricData API.
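A hedged sketch of publishing a page-view metric with the AWS SDK for Python (boto3) might look like the following. The namespace EmployeeDirectory/Web, the metric name PageViews, and the Page dimension are hypothetical names chosen for illustration. The publish function needs boto3 installed and AWS credentials configured, so the example only builds and prints the request parameters.

```python
from datetime import datetime, timezone


def build_page_view_metric(page: str, count: int, high_resolution: bool = False) -> dict:
    """Build keyword arguments for the CloudWatch PutMetricData call."""
    return {
        "Namespace": "EmployeeDirectory/Web",  # hypothetical custom namespace
        "MetricData": [
            {
                "MetricName": "PageViews",  # hypothetical metric name
                "Dimensions": [{"Name": "Page", "Value": page}],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(count),
                "Unit": "Count",
                # 1 = high-resolution (1-second) metric, 60 = standard resolution.
                "StorageResolution": 1 if high_resolution else 60,
            }
        ],
    }


def publish(params: dict) -> None:
    """Send the metric to CloudWatch. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_data(**params)


params = build_page_view_metric("/home", count=3, high_resolution=True)
print(params["Namespace"], params["MetricData"][0]["MetricName"])
# publish(params)  # uncomment in an environment with AWS credentials
```

Keeping the payload builder separate from the network call makes the metric definition easy to inspect before anything is sent.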
CloudWatch dashboards
Once you’ve provisioned your AWS resources and they are sending metrics to CloudWatch, you can then visualize and review that data using the CloudWatch console with dashboards. Dashboards are customizable home pages that you use for data visualization for one or more metrics through the use of widgets, such as a graph or text. You can build many custom dashboards, each one focusing on a distinct view of your environment. You can even pull data from different Regions into a single dashboard in order to create a global view of your architecture.
CloudWatch aggregates statistics according to the period of time that you specify when creating your graph or requesting your metrics. You can also choose whether your metric widgets display live data. Live data is data published within the last minute that has not been fully aggregated.
You are not bound to using CloudWatch exclusively for all your visualization needs. You can use external or custom tools to ingest and analyze CloudWatch metrics using the GetMetricData API. As far as security goes, you can control who has access to view or manage your CloudWatch dashboards through AWS Identity and Access Management (IAM) policies that get associated with IAM users, IAM groups, or IAM roles.
Amazon CloudWatch Logs
CloudWatch can also be the centralized place for logs to be stored and analyzed, using Amazon CloudWatch Logs. With CloudWatch Logs, you can monitor, store, and access log files from applications running on Amazon EC2 instances, AWS Lambda functions, and other sources. CloudWatch Logs allows you to query and filter your log data. For example, suppose you’re looking into an application logic error for your application, and you know that when this error occurs it will log the stack trace. Since you know it logs the error, you query your logs in CloudWatch Logs to find the stack trace. You can also set up metric filters on logs, which turn log data into numerical CloudWatch metrics that you can graph and use on your dashboards.
Some services are set up to send log data to CloudWatch Logs with minimal effort, like AWS Lambda. With AWS Lambda, all you need to do is give the Lambda function the correct IAM permissions to post logs to CloudWatch Logs. Other services require more configuration. For example, if you want to send your application logs from an EC2 instance into CloudWatch Logs, you need to first install and configure the CloudWatch Logs agent on the EC2 instance. The CloudWatch Logs agent enables Amazon EC2 instances to automatically send log data to CloudWatch Logs. The agent includes the following components:
- Plug-in to the AWS Command Line Interface (AWS CLI) that pushes log data to CloudWatch Logs
- Script that initiates the process to push data to CloudWatch Logs
- Cron job that ensures the daemon is always running
After the agent is installed and configured, you can view your application logs in CloudWatch Logs.
CloudWatch Logs terminology
Log data sent to CloudWatch Logs can come from different sources, so it’s important you understand how they’re organized and the terminology used to describe your logs.
- Log event: A log event is a record of activity recorded by the application or resource being monitored, and it has a timestamp and an event message.
- Log stream: Log events are then grouped into log streams, which are sequences of log events that all belong to the same resource being monitored. For example, logs for an EC2 instance are grouped together into a log stream that you can filter or query for insights.
- Log groups: Log streams are then organized into log groups. A log group is composed of log streams that all share the same retention and permissions settings. For example, if you have multiple EC2 instances hosting your application and you are sending application log data to CloudWatch Logs, you can group the log streams from each instance into one log group. This helps keep your logs organized.
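The hierarchy of log events, log streams, and log groups can be modeled with a few small data classes. This is a conceptual sketch only; the class names, the retention_days field, and the sample group and stream names are assumptions, not the CloudWatch Logs API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LogEvent:
    timestamp_ms: int  # when the activity was recorded
    message: str       # the raw event message


@dataclass
class LogStream:
    name: str  # typically one stream per monitored resource
    events: List[LogEvent] = field(default_factory=list)


@dataclass
class LogGroup:
    name: str
    retention_days: int  # shared by every stream in the group
    streams: Dict[str, LogStream] = field(default_factory=dict)

    def put_event(self, stream_name: str, event: LogEvent) -> None:
        # Create the stream on first use, then append the event to it.
        stream = self.streams.setdefault(stream_name, LogStream(stream_name))
        stream.events.append(event)

    def filter_events(self, text: str) -> List[Tuple[str, LogEvent]]:
        # Search every stream in the group for messages containing the text.
        return [
            (stream.name, event)
            for stream in self.streams.values()
            for event in stream.events
            if text in event.message
        ]


group = LogGroup("employee-directory-app", retention_days=30)
group.put_event("instance-a", LogEvent(1000, "GET /home 200"))
group.put_event("instance-b", LogEvent(1500, "Traceback: ValueError in handler"))
group.put_event("instance-a", LogEvent(2000, "GET /photo 200"))

matches = group.filter_events("Traceback")
print([(name, event.message) for name, event in matches])
```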
CloudWatch alarms
You can create CloudWatch alarms to automatically initiate actions based on sustained state changes of your metrics. You configure when alarms are triggered and the action that is performed. You first must decide which metric you want to set an alarm for, and then you define the threshold that will trigger the alarm. Next, you define the threshold’s time period. For example, if you want to set up an alarm for an EC2 instance to trigger when the CPU utilization goes over a threshold of 80%, you also must specify the time period the CPU utilization is over the threshold. You don’t want to trigger an alarm based on short temporary spikes in the CPU. You only want to trigger an alarm if the CPU is elevated for a sustained amount of time. For example, if CPU utilization is over 80% for 5 minutes or longer, there might be a resource issue. Keeping all that in mind, to set up an alarm you need to choose the metric, threshold, and time period. An alarm has three possible states.
- OK: The metric is within the defined threshold. Everything appears to be operating like normal.
- ALARM: The metric is outside the defined threshold. This could be an operational issue.
- INSUFFICIENT_DATA: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
An alarm can be triggered when it transitions from one state to another. Once an alarm is triggered, it can initiate an action. Actions can be an Amazon EC2 action, an automatic scaling action, or a notification sent to Amazon Simple Notification Service (Amazon SNS).
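The three states and the idea of a sustained breach can be sketched as a small function. This is a deliberate simplification under stated assumptions: one data point per minute, the alarm fires only when every one of the most recent data points exceeds the threshold, and a missing data point is represented by None. The real service offers more evaluation options than this.

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods):
    """Return "OK", "ALARM", or "INSUFFICIENT_DATA" for the most recent periods."""
    recent = datapoints[-evaluation_periods:]
    # Not enough data points, or a gap in the data: the state cannot be determined.
    if len(recent) < evaluation_periods or any(value is None for value in recent):
        return "INSUFFICIENT_DATA"
    # Only a sustained breach (every recent period over the threshold) raises the alarm.
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# CPU utilization per minute, threshold 80%, sustained for 5 minutes.
print(evaluate_alarm([50, 85, 90, 95, 88, 91], threshold=80, evaluation_periods=5))  # ALARM
print(evaluate_alarm([50, 95, 60, 55, 50], threshold=80, evaluation_periods=5))      # OK
print(evaluate_alarm([85, 90], threshold=80, evaluation_periods=5))                  # INSUFFICIENT_DATA
```

The second call shows why the time period matters: a single spike to 95% does not raise the alarm.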
Prevent and troubleshoot issues with CloudWatch alarms
CloudWatch Logs uses metric filters to turn the log data into metrics that you can graph or set an alarm on. For the Employee Directory application, suppose you set up a metric filter for 500-error response codes. Then, you define an alarm for that metric that will go into the ALARM state if 500-error responses go over a certain amount for a sustained time period. If it’s more than five 500-error responses per hour, the alarm should enter the ALARM state. Next, you define an action that you want to take place when the alarm is triggered. In this case, it makes sense to send an email or text alert to you so you can start troubleshooting the website, hopefully fixing it before it becomes a bigger issue. Once the alarm is set up, you feel comfortable knowing that if the error happens again, you’ll be notified promptly.
You can set up different alarms for different reasons to help you prevent or troubleshoot operational issues. In the scenario just described, the alarm triggers an Amazon SNS notification that goes to a person who looks into the issue manually. Another option is to have alarms trigger actions that automatically remediate technical issues. For example, you can set up an alarm to trigger an EC2 instance to reboot, or scale services up or down. You can even set up an alarm to trigger an Amazon SNS notification that triggers an AWS Lambda function. The Lambda function then calls any AWS API to manage your resources and troubleshoot operational issues. By using AWS services together like this, you can respond to events more quickly.
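The 500-error scenario can be approximated locally to show what a metric filter plus an alarm threshold is doing. The log line format (timestamp, method, path, status code separated by spaces) and the function names are assumptions for this sketch; a real metric filter is configured in CloudWatch Logs rather than written as application code.

```python
from collections import Counter


def count_500s_per_hour(log_lines):
    """Turn log lines into a per-hour count of 500 responses (a toy metric filter)."""
    counts = Counter()
    for line in log_lines:
        # Assumed format: "<ISO timestamp> <method> <path> <status>"
        timestamp, _method, _path, status = line.split()
        if status == "500":
            counts[timestamp[:13]] += 1  # bucket by "YYYY-MM-DDTHH"
    return counts


def hours_in_alarm(counts, max_errors_per_hour=5):
    """Return the hours in which the error count exceeded the threshold."""
    return sorted(hour for hour, n in counts.items() if n > max_errors_per_hour)


# Hypothetical log data: six 500s in the 10:00 hour, two in the 11:00 hour.
logs = [f"2024-01-01T10:{minute:02d}:00Z GET /directory 500" for minute in range(6)]
logs += [
    "2024-01-01T10:30:00Z GET /home 200",
    "2024-01-01T11:05:00Z GET /directory 500",
    "2024-01-01T11:45:00Z GET /directory 500",
]

counts = count_500s_per_hour(logs)
print(hours_in_alarm(counts))  # ['2024-01-01T10']
```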
Resources
- External Site: AWS: Getting Started with Amazon CloudWatch
- External Site: AWS: What Is Amazon CloudWatch Logs?
- External Site: AWS Services That Publish CloudWatch Metrics
- External Site: AWS: View Available Metrics
- External Site: AWS: Amazon CloudWatch Pricing
- External Site: AWS: Amazon Simple Notification Service
- External Site: AWS: EC2 Auto Scaling Actions
Solution Optimization
Availability
The availability of a system is typically expressed as a percentage of uptime in a given year or as a number of nines. In the table, you can see a list of the percentages of availability based on the downtime per year, as well as its notation in nines.
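The relationship between an availability percentage and downtime per year is simple arithmetic, sketched below. The choice of 365.25 days per year and the particular percentages listed are assumptions for this example.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumes an average year of 365.25 days


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of allowed downtime per year for a given availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for percent in (90, 99, 99.9, 99.95, 99.99, 99.999):
    hours = downtime_hours_per_year(percent)
    if hours >= 24:
        print(f"{percent}%: {hours / 24:.2f} days")
    elif hours >= 1:
        print(f"{percent}%: {hours:.2f} hours")
    else:
        print(f"{percent}%: {hours * 60:.2f} minutes")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the cost of redundancy rises so quickly at the high end.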
To increase availability, you need redundancy. This typically means more infrastructure – more data centers, more servers, more databases, and more replication of data. You can imagine that adding more of this infrastructure means a higher cost. Customers want the application to always be available, but you need to draw a line where adding redundancy is no longer viable in terms of revenue.
Improve application availability
In the current application, one EC2 instance hosts the application, the photos are served from Amazon Simple Storage Service (Amazon S3), and the structured data is stored in Amazon DynamoDB. That single EC2 instance is a single point of failure for the application. Even if the database and Amazon S3 are highly available, customers have no way to connect if the single instance becomes unavailable. One way to solve this single point of failure issue is to add one more server.
Second Availability Zone
The physical location of a server is important. On top of potential software issues at the operating system or application level, hardware issues must be considered. The problem could be in the physical server, the rack, the data center, or even the Availability Zone hosting the virtual machine. To fix the physical location issue, you can deploy a second EC2 instance in a different Availability Zone. The new instance might also solve issues with the operating system and the application. However, having more than one instance brings new challenges.
Replication, redirection, and high availability
Replication process The first challenge with multiple EC2 instances is that you need to create a process to replicate the configuration files, software patches, and application across instances. The best method is to automate where you can.
Customer redirection The second challenge is how to let the clients (the computers sending requests to your server) know about the different servers. Various tools can be used here. The most common is using a Domain Name System (DNS) where the client uses one record that points to the IP address of all available servers. However, the time it takes to update the list of IP addresses and for the clients to become aware of such change, sometimes called propagation, is typically the reason why this method isn’t always used. Another option is to use a load balancer, which takes care of health checks and distributing the load across each server. Situated between the client and the server, a load balancer avoids propagation time issues. You will learn more about load balancers in the next section.
Types of high availability The last challenge to address when having more than one server is the type of availability you need – active-passive or active-active.
- Active-Passive: With an active-passive system, only one of the two instances is available at a time. One advantage of this method is that for stateful applications where data about the client’s session is stored on the server, there won’t be any issues because the customers are always sent to the server where their session is stored.
- Active-Active: A disadvantage of active-passive and where an active-active system shines is scalability. By having both servers available, the second server can take some load for the application, allowing the entire system to take more load. However, if the application is stateful, there would be an issue if the customer’s session isn’t available on both servers. Stateless applications work better for active-active systems.
Traffic Routing with Amazon Elastic Load Balancing
ALB components:
- Listener
- Target Group
- Rules
Load balancers Load balancing refers to the process of distributing tasks across a set of resources. In the case of the Employee Directory application, the resources are EC2 instances that host the application, and the tasks are the requests being sent. You can use a load balancer to distribute the requests across all the servers hosting the application. To do this, you first need to enable the load balancer to take all of the traffic and redirect it to the backend servers based on an algorithm. The most popular algorithm is round-robin, which sends the traffic to each server one after the other. A typical request for an application starts from a client’s browser. The request is sent to a load balancer. Then, it’s sent to one of the EC2 instances that hosts the application. The return traffic goes back through the load balancer and back to the client’s browser. As you can see, the load balancer is directly in the path of the traffic. Although it is possible to install your own software load balancing solution on EC2 instances, AWS provides a service for you called Elastic Load Balancing.
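Round-robin distribution, as described above, can be sketched in a few lines. This is a toy model and not how ELB is implemented; the class name and the instance names are placeholders.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Send each request to the next target in a fixed rotation."""

    def __init__(self, targets):
        self._targets = cycle(targets)  # endless iterator over the targets

    def route(self, request):
        # The request content is ignored: round-robin only looks at order.
        return next(self._targets)


balancer = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
assignments = [balancer.route(f"request-{i}") for i in range(5)]
print(assignments)
```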
ELB features The ELB service provides a major advantage over using your own solution to do load balancing – mainly, you don’t need to manage or operate it. It can distribute incoming application traffic across EC2 instances, containers, IP addresses, and AWS Lambda functions. Other key features include the following:
- Because ELB can load balance to IP addresses, it can work in a hybrid mode, which means it also load balances to on-premises servers.
- ELB is highly available. The only thing you must ensure is that the load balancer is deployed across multiple Availability Zones.
- In terms of scalability, ELB automatically scales to meet the demand of the incoming traffic. It handles the incoming traffic and sends it to your backend application.
Health checks Taking time to define an appropriate health check is critical. Only verifying that the port of an application is open doesn’t mean that the application is working. It also doesn’t mean that simply making a call to the home page of an application is the right way either. For example, the Employee Directory application depends on a database and Amazon S3. The health check should validate all of the elements. One way to do that would be to create a monitoring webpage, like “/monitor” that will make a call to the database to ensure that it can connect and get data, and make a call to Amazon S3. Then, you point the health check on the load balancer to the “/monitor” page.
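A "/monitor" style health check can be sketched as a function that runs one check per dependency and reports an HTTP status. This is a framework-agnostic sketch: the `monitor` function and the check names are hypothetical, and real checks would query the database and Amazon S3 instead of returning constants.

```python
from typing import Callable, Dict, Tuple


def monitor(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run every dependency check and return (HTTP status, per-check results)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception:
            # A check that raises (for example, a connection error) counts as failed.
            results[name] = "failed"
    healthy = all(result == "ok" for result in results.values())
    return (200 if healthy else 503), results


def database_unreachable() -> bool:
    raise ConnectionError("database unreachable")  # simulated failure


# Hypothetical checks standing in for real database and S3 calls.
print(monitor({"database": lambda: True, "s3": lambda: True}))
print(monitor({"database": database_unreachable, "s3": lambda: True}))
```

The load balancer would treat any non-success status from this page as an unhealthy target.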
After determining the availability of a new EC2 instance, the load balancer starts sending traffic to it. If ELB determines that an EC2 instance is no longer working, it stops sending traffic to it and lets EC2 Auto Scaling know. EC2 Auto Scaling’s responsibility is to remove it from the group and replace it with a new EC2 instance. Traffic is only sent to the new instance if it passes the health check. In the case of a scale down action that EC2 Auto Scaling needs to take due to a scaling policy, it lets ELB know that EC2 instances will be terminated. ELB can prevent EC2 Auto Scaling from terminating an EC2 instance until all connections to the instance end, while preventing any new connections. That feature is called connection draining.
ELB components The ELB service is made up of three main components: rules, listeners, and target groups.
Application Load Balancer
Here are some primary features of Application Load Balancer.
- ALB routes traffic based on request data. ALB makes routing decisions based on attributes of the HTTP request, like the URL path (/upload) and host, HTTP headers and method, and the source IP address of the client. This enables granular routing to target groups.
- ALB sends responses directly to the client. ALB can reply directly to the client with a fixed response, such as a custom HTML page. It can also send a redirect to the client, which is useful when you must redirect to a specific website or redirect a request from HTTP to HTTPS, removing that work from your backend servers.
- ALB uses TLS offloading. Speaking of HTTPS and saving work from backend servers, ALB understands HTTPS traffic. To pass HTTPS traffic through ALB, you provide an SSL certificate, either by importing one by way of IAM or AWS Certificate Manager (ACM), or by creating one for free using ACM. This ensures that the traffic between the client and ALB is encrypted.
- ALB authenticates users. On the topic of security, ALB can authenticate users before they are allowed to pass through the load balancer. ALB uses the OpenID Connect protocol and integrates with other AWS services to support protocols like SAML, LDAP, Microsoft Active Directory, and more.
- ALB secures traffic. To control which traffic can reach the load balancer, you configure a security group that specifies the supported IP address ranges.
- ALB uses the round-robin routing algorithm. With round-robin, each server generally receives the same number of requests. This type of routing works for most applications.
- ALB uses the least outstanding request routing algorithm. If the requests to the backend vary in complexity, where one request might need a lot more CPU time than another, then the least outstanding request algorithm is more appropriate. It’s also the right routing algorithm to use if the targets vary in processing capabilities. An outstanding request is a request that has been sent to the backend server and hasn’t received a response yet. For example, if the EC2 instances in a target group aren’t the same size, one server’s CPU utilization will be higher than the other’s if the same number of requests are sent to each server using the round-robin routing algorithm. That same server will have more outstanding requests as well. Using the least outstanding request routing algorithm would ensure an equal usage across targets.
- ALB uses sticky sessions. If requests must be sent to the same backend server because the application is stateful, use the sticky session feature. This feature uses an HTTP cookie to remember across connections which server to send the traffic to.
Finally, ALB is specifically for HTTP and HTTPS traffic. If your application uses a different protocol, consider the Network Load Balancer.
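The least outstanding request algorithm described above can be sketched as follows. This is a toy model, not ALB's implementation; the class name, the target names, and the tie-breaking rule (first target listed wins) are assumptions for the example.

```python
class LeastOutstandingBalancer:
    """Route each request to the target with the fewest unanswered requests."""

    def __init__(self, targets):
        self.outstanding = {target: 0 for target in targets}

    def route(self) -> str:
        # min() returns the first target with the lowest count, so ties go
        # to the target listed first.
        target = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[target] += 1
        return target

    def complete(self, target: str) -> None:
        """Record that a target has returned a response."""
        self.outstanding[target] -= 1


balancer = LeastOutstandingBalancer(["large-instance", "small-instance"])
first = balancer.route()    # both idle, so the first target is chosen
second = balancer.route()   # the other target now has fewer outstanding requests
balancer.complete(first)    # the larger instance answers quickly
third = balancer.route()    # so it receives the next request as well
print(first, second, third)
```

A faster target finishes requests sooner, so it naturally receives a larger share of the traffic.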
Network Load Balancer
Here are some primary features of Network Load Balancer.
- NLB supports TCP, UDP, and TLS protocols. HTTPS uses TCP and TLS as protocols. However, NLB operates at the connection layer, so it doesn’t understand what an HTTPS request is. That means all features that require understanding the HTTP and HTTPS protocols, like routing rules based on those protocols, authentication, and the least outstanding request routing algorithm, are not available with NLB.
- NLB uses a flow hash routing algorithm. The algorithm is based on the protocol, the source IP address and source port, the destination IP address and destination port, and the TCP sequence number. If all of the parameters are the same, the packets are sent to the exact same target. If any of them are different in the next packets, the request might be sent to a different target.
- NLB has sticky sessions. Different from ALB, these sessions are based on the source IP address of the client, instead of a cookie.
- NLB supports TLS offloading. NLB understands the TLS protocol. It can also offload TLS from the backend servers, similar to how ALB works.
- NLB handles millions of requests per second. While ALB can also support this number of requests, it needs to scale to reach that number. This takes time. NLB can instantly handle millions of requests per second.
- NLB supports static and elastic IP addresses. In some situations, an application client needs to send requests directly to the load balancer IP address instead of using DNS. For example, this is useful if your application can’t use DNS or if the connecting clients require firewall rules based on IP addresses. In this case, NLB is the right type of load balancer to use.
- NLB preserves source IP address. NLB preserves the source IP address of the client when sending the traffic to the backend. With ALB, if you look at the source IP address of the requests, you will find the IP address of the load balancer. With NLB, you see the real IP address of the client, which is required by the backend application in some cases.
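The idea behind flow hash routing can be sketched by hashing the connection parameters and using the result to pick a target. The hash function NLB actually uses is not given in these notes; SHA-256 is only a stand-in here, and the function name, IP addresses, and target names are made up for the example.

```python
import hashlib


def pick_target(targets, protocol, src_ip, src_port, dst_ip, dst_port, tcp_seq):
    """Map a flow (the parameters listed above) to one target deterministically."""
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}|{tcp_seq}".encode()
    digest = hashlib.sha256(key).digest()  # stand-in hash, an assumption
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["target-1", "target-2", "target-3"]
flow = ("TCP", "203.0.113.7", 50123, "10.0.1.10", 443, 1001)

# Identical parameters always map to the same target.
print(pick_target(targets, *flow) == pick_target(targets, *flow))
```

Changing any one parameter changes the hash input, so the next flow might land on a different target.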
Select between ELB types Selecting between the ELB service types is done by determining which feature is required for your application. The table presents a list of the major features of load balancers.
EC2 Auto Scaling
Capacity issues Availability and reachability are improved by adding one more server. However, the entire system can again become unavailable if there is a capacity issue. This section looks at load issues for both types of systems discussed – active-passive and active-active.
Vertical scaling If too many requests are sent to a single active-passive system, the active server will become unavailable and hopefully failover to the passive server. But this doesn’t solve anything. With active-passive, you need vertical scaling. This means increasing the size of the server. With EC2 instances, you select either a larger type or a different instance type. This can only be done while the instance is in a stopped state. In this scenario, the following steps occur:
- Stop the passive instance. This doesn’t impact the application because it’s not taking any traffic.
- Change the instance size or type, and then start the instance again.
- Shift the traffic to the passive instance, turning it active.
- Stop, change the size, and start the previous active instance since both instances should match.
When the number of requests reduces, the same operation must be done. Even though there aren’t that many steps involved, it’s actually a lot of manual work. Another disadvantage is that a server can only scale vertically up to a certain limit. When that limit is reached, the only option is to create another active-passive system and split the requests and functionalities across them. This could require massive application rewriting. This is where the active-active system can help. When there are too many requests, this system can be scaled horizontally by adding more servers.
Horizontal scaling As mentioned, for the application to work in an active-active system, it’s already created as stateless, not storing any client sessions on the server. This means that having two servers or having four wouldn’t require any application changes. It would only be a matter of creating more instances when required and shutting them down when traffic decreases. The Amazon EC2 Auto Scaling service can take care of that task by automatically creating and removing EC2 instances based on metrics from Amazon CloudWatch. You can see that there are many more advantages to using an active-active system in comparison with an active-passive. Modifying your application to become stateless enables scalability.
ELB with EC2 Auto Scaling The ELB service integrates seamlessly with EC2 Auto Scaling. As soon as a new EC2 instance is added to or removed from the EC2 Auto Scaling group, ELB is notified. However, before it can send traffic to a new EC2 instance, it needs to validate that the application running on the EC2 instance is available. This validation is done by way of the ELB health checks feature. Monitoring is an important part of load balancers, because they should route traffic to only healthy EC2 instances. That’s why ELB supports two types of health checks.
- Establishing a connection to a backend EC2 instance using TCP, and marking the instance as available if the connection is successful.
- Making an HTTP or HTTPS request to a webpage that you specify, and validating that an HTTP response code is returned.
Traditional scaling versus auto scaling With a traditional approach to scaling, you buy and provision enough servers to handle traffic at its peak. However, this means that at night time, for example, you might have more capacity than traffic, which means you’re wasting money. Turning off your servers at night or at times where the traffic is lower only saves on electricity. The cloud works differently with a pay-as-you-go model. You must turn off the unused services, especially EC2 instances that you pay for on-demand. You could manually add and remove servers at a predicted time. But with unusual spikes in traffic, this solution leads to a waste of resources with over-provisioning or a loss of customers due to under-provisioning. The need here is for a tool that automatically adds and removes EC2 instances according to conditions you define – that’s exactly what the EC2 Auto Scaling service does.
Amazon EC2 Auto Scaling The Amazon EC2 Auto Scaling service adds and removes capacity to keep a steady and predictable performance at the lowest possible cost. By adjusting the capacity to exactly what your application uses, you only pay for what your application needs. And even with applications that have steady usage, EC2 Auto Scaling can help with fleet management. If an EC2 instance has an issue, EC2 Auto Scaling can automatically replace the instance. This means that EC2 Auto Scaling helps both to scale your infrastructure and ensure high availability.
Configure EC2 Auto Scaling components Three main components of EC2 Auto Scaling are as follows:
- Launch template or configuration: What resource should be automatically scaled?
- EC2 Auto Scaling Group: Where should the resources be deployed?
- Scaling policies: When should the resources be added or removed?
Launch templates Multiple parameters are required to create EC2 instances – Amazon Machine Image (AMI) ID, instance type, security group, additional Amazon Elastic Block Store (EBS) volumes, and more. All this information is also required by EC2 Auto Scaling to create the EC2 instance on your behalf when there is a need to scale. This information is stored in a launch template. You can use a launch template to manually launch an EC2 instance. You can also use it with EC2 Auto Scaling. It also supports versioning, which allows for quickly rolling back if there’s an issue or a need to specify a default version. This way, while iterating on a new version, other users can continue launching EC2 instances using the default version until you make the necessary changes.
You can create a launch template in one of three ways.
- The fastest way to create a template is to use an existing EC2 instance. All the settings are already defined.
- Another option is to create one from an already existing template or a previous version of a launch template.
- The last option is to create a template from scratch. The following options will need to be defined: AMI ID, instance type, key pair, security group, storage, and resource tags.
Another way to define what Amazon EC2 Auto Scaling needs to scale is by using a launch configuration. It’s similar to the launch template, but it doesn’t allow versioning or using a previously created launch configuration as a template. Nor does it allow for creating one from an already existing EC2 instance. For these reasons and to ensure that you’re getting the latest features from Amazon EC2, AWS recommends that you use a launch template instead of a launch configuration.
EC2 Auto Scaling groups The next component that EC2 Auto Scaling needs is an EC2 Auto Scaling Group. An auto scaling group helps you define where EC2 Auto Scaling deploys your resources. This is where you specify the Amazon VPC and subnets the EC2 instance should be launched in. EC2 Auto Scaling takes care of creating the EC2 instances across the subnets, so select at least two subnets that are across different Availability Zones. With Auto Scaling groups, you can specify the type of purchase for the EC2 instances. You can use On-Demand only, Spot only, or a combination of the two, which allows you to take advantage of Spot instances with minimal administrative overhead. To specify how many instances EC2 Auto Scaling should launch, you have three capacity settings to configure for the group size.
- Minimum: The minimum number of instances running in your Auto Scaling group, even if the threshold for lowering the amount of instances is reached.
- Maximum: The maximum number of instances running in your Auto Scaling group, even if the threshold for adding new instances is reached.
- Desired capacity: The number of instances that should be in your Auto Scaling group. This number must be between the minimum and maximum, inclusive. EC2 Auto Scaling automatically adds or removes instances to match the desired capacity number.
When EC2 Auto Scaling removes EC2 instances because the traffic is minimal, it keeps removing EC2 instances until it reaches a minimum capacity. Depending on your application, using a minimum of two is a good idea to ensure high availability, but you know how many EC2 instances at a bare minimum your application requires at all times. When reaching that limit, even if EC2 Auto Scaling is instructed to remove an instance, it does not, to ensure the minimum is kept. On the other hand, when the traffic keeps growing, EC2 Auto Scaling keeps adding EC2 instances. This means the cost for your application will also keep growing. That’s why you must set a maximum amount – to make sure it doesn’t go above your budget. The desired capacity is the amount of EC2 instances that EC2 Auto Scaling creates at the time the group is created. If that number decreases, EC2 Auto Scaling removes the oldest instance by default. If that number increases, EC2 Auto Scaling creates new instances using the launch template.
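The way the minimum and maximum bound the group size can be sketched as a simple clamp. This is an illustration of the rule described above rather than the service's code; the function name is an assumption.

```python
def clamp_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Keep a requested group size within the configured minimum and maximum."""
    if minimum > maximum:
        raise ValueError("minimum cannot be greater than maximum")
    return max(minimum, min(requested, maximum))


# A group configured with minimum=2 and maximum=6.
print(clamp_capacity(1, 2, 6))   # scale-in request stops at the minimum
print(clamp_capacity(10, 2, 6))  # scale-out request stops at the maximum
print(clamp_capacity(4, 2, 6))   # within bounds, used as-is
```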
Availability with EC2 Auto Scaling Different numbers for minimum, maximum, and desired capacity are used for dynamically adjusting the capacity. However, if you prefer to use EC2 Auto Scaling for fleet management, you can configure the three settings to the same number, for example four. EC2 Auto Scaling will ensure that if an EC2 instance becomes unhealthy, it is replaced so that four EC2 instances are always available. This ensures high availability for your applications.
Automation with scaling policies By default, an Auto Scaling group will be kept to its initial desired capacity. While it’s possible to manually change the desired capacity, you can also use scaling policies. In the AWS Monitoring module, you learned about Amazon CloudWatch metrics and alarms. You use metrics to keep information about different attributes of your EC2 instance, like the CPU percentage. You use alarms to specify an action when a threshold is reached. Metrics and alarms are what scaling policies use to know when to act. For example, you can set up an alarm that states when the CPU utilization is above 70% across the entire fleet of EC2 instances, trigger a scaling policy to add an EC2 instance. Three types of scaling policies are available – simple, step, and target tracking scaling.
Simple scaling policy A simple scaling policy allows you to do exactly what’s described in this module. You use a CloudWatch alarm and specify what to do when it is triggered. This can be a number of EC2 instances to add or remove, or a specific number to set the desired capacity to. You can specify a percentage of the group instead of using an amount of EC2 instances, which makes the group grow or shrink more quickly. Once the scaling policy is triggered, it waits a cooldown period before taking any other action. This is important because it takes time for the EC2 instances to start and the CloudWatch alarm might still be triggered while the EC2 instance is booting. For example, you could decide to add an EC2 instance if the CPU utilization across all instances is above 65%. You don’t want to add more instances until that new EC2 instance is accepting traffic. However, what if the CPU utilization was now above 85% across the Auto Scaling group? Adding one instance might not be the right move. Instead, you might want to add another step in your scaling policy. Unfortunately, a simple scaling policy can’t help with that.
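The cooldown behavior of a simple scaling policy can be sketched as follows. This is a toy model under stated assumptions: the class name is made up, time is passed in as plain seconds, and the 300-second cooldown is an example value.

```python
class SimpleScalingPolicy:
    """Add a fixed number of instances per alarm, then wait out a cooldown."""

    def __init__(self, adjustment: int = 1, cooldown_seconds: int = 300):
        self.adjustment = adjustment
        self.cooldown_seconds = cooldown_seconds
        self._last_scaled_at = None  # time of the last scaling action

    def on_alarm(self, now: float, current_capacity: int) -> int:
        """Return the new capacity when the alarm fires at time `now`."""
        in_cooldown = (
            self._last_scaled_at is not None
            and now - self._last_scaled_at < self.cooldown_seconds
        )
        if in_cooldown:
            return current_capacity  # ignore the alarm while instances boot
        self._last_scaled_at = now
        return current_capacity + self.adjustment


policy = SimpleScalingPolicy(adjustment=1, cooldown_seconds=300)
capacity = policy.on_alarm(now=0, current_capacity=2)           # scales out
capacity = policy.on_alarm(now=120, current_capacity=capacity)  # still cooling down
capacity = policy.on_alarm(now=400, current_capacity=capacity)  # scales out again
print(capacity)
```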
Step scaling policy This is where a step scaling policy helps. Step scaling policies respond to additional alarms even while a scaling activity or health check replacement is in progress. Similar to the example above, you might decide to add two more instances when CPU utilization is at 85% and four more instances when it’s at 95%. Deciding when to add and remove instances based on CloudWatch alarms might seem like a difficult task. This is why the third type of scaling policy exists – target tracking.
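The steps in this example can be sketched as a lookup from CPU utilization to a number of instances to add. The 85% and 95% steps come from the example above and the 65% step from the simple scaling example; the function name and table layout are assumptions.

```python
# (lower bound of CPU %, instances to add), ordered from highest to lowest.
STEPS = [(95.0, 4), (85.0, 2), (65.0, 1)]


def instances_to_add(cpu_percent: float, steps=STEPS) -> int:
    """Return the adjustment for the highest step the CPU value reaches."""
    for lower_bound, adjustment in steps:
        if cpu_percent >= lower_bound:
            return adjustment
    return 0  # below every step: no scaling action


for cpu in (50, 70, 90, 97):
    print(cpu, "->", instances_to_add(cpu))
```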
Target tracking scaling policy If your application scales based on average CPU utilization, average network utilization (in or out), or request count, then this scaling policy type is the one to use. All you need to provide is the target value to track and it automatically creates the required CloudWatch alarms.
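One way to build intuition for target tracking is a proportional rule: if the metric is above the target, grow the group by the same ratio. This is a simplified model chosen for illustration and is an assumption, not the algorithm the service is documented to use; the function name and parameters are hypothetical.

```python
import math


def target_tracking_capacity(current_capacity: int, metric_value: float,
                             target_value: float, minimum: int,
                             maximum: int) -> int:
    """Estimate the capacity needed to bring the metric back to the target."""
    # Assumes load spreads evenly, so the metric scales inversely with capacity.
    proposed = math.ceil(current_capacity * metric_value / target_value)
    return max(minimum, min(proposed, maximum))


# Four instances at 90% average CPU with a 60% target.
print(target_tracking_capacity(4, 90, 60, minimum=2, maximum=10))
# Four instances at 30% average CPU with the same target.
print(target_tracking_capacity(4, 30, 60, minimum=2, maximum=10))
```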
Supervised, Unsupervised and Reinforcement Learning
Supervised learning - the training data you feed to the algorithm includes the desired solutions, called labels
- Classification is a typical supervised learning task
- Trained with many examples along with their class and it must learn how to classify new examples
- Regression predicts a target numeric value given a set of features called predictors
- Feature means an attribute plus its value
- Ex: mileage = 15,000
- List of most important supervised learning algorithms
- K-nearest neighbors
- Linear regression
- Logistic regression
- Support vector machines (SVMs)
- Decision Trees and Random Forests
- Neural Networks
Unsupervised Learning - training data is unlabeled and the system learns without a teacher
- Most important unsupervised learning algorithms
- Clustering
- k-Means
- Hierarchical Cluster Analysis (HCA)
- Expectation Maximization
- Visualization and Dimensionality reduction
- Principal Component Analysis (PCA)
- Locally Linear Embedding (LLE)
- T-distributed Stochastic Neighbor Embedding (t-SNE)
- Association rule learning
- Apriori
- Eclat
- Clustering
- Dimensionality reduction - simplify the data without losing too much information
- Often a good idea to reduce the dimension of your training data using a dimensionality reduction algorithm before you feed it to another Machine Learning algorithm
- Anomaly detection - the system is trained with normal instances and it determines if a new instance is normal or an anomaly
- Association rule learning - discover relationships between attributes
- People who buy barbecue sauce and chips also buy steak
Semisupervised learning - partially labeled training data, usually a little labeled data and a lot of unlabeled data
- Google Photos recognizes that the same person appears in photos 1, 5, and 11, then needs you to label that person
Reinforcement Learning
- The learning system is called an agent and can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards).
- It learns by itself to determine the best strategy, called a policy
Batch vs Incremental Learning
Batch - it must be trained using all the available data
- Also called offline learning
Incremental Learning - you train the system incrementally by feeding it data instances sequentially, either individually or by small groups called mini-batches
- How fast should it adapt to changing data?
- This is called the learning rate
- A higher learning rate will rapidly adapt to new data but quickly forget old data
- With a low learning rate, the system will have more inertia (it learns more slowly)
- Big challenge - if bad data is fed to the system, its performance will gradually decline
- Requires monitoring the system and turning learning off if there is a drop in performance
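The effect of the learning rate can be seen in a toy incremental learner that keeps a running estimate of a value and nudges it toward each new data instance. This is only an illustration; the function name and the data stream are made up for the example.

```python
def online_estimate(stream, learning_rate: float, initial: float = 0.0) -> float:
    """Update an estimate one instance at a time (incremental learning)."""
    estimate = initial
    for value in stream:
        # Move the estimate toward the new value by a fraction of the error.
        estimate += learning_rate * (value - estimate)
    return estimate


# Twenty old observations of 10, followed by three new observations of 0.
stream = [10.0] * 20 + [0.0] * 3

fast = online_estimate(stream, learning_rate=0.9)  # adapts quickly, forgets old data
slow = online_estimate(stream, learning_rate=0.1)  # more inertia
print(f"high learning rate: {fast:.3f}, low learning rate: {slow:.3f}")
```

If the three new values were bad data, the high learning rate would have been thrown off almost immediately, which is why monitoring matters.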
Instance-Based vs Model-Based Learning
Instance-Based - the system learns the examples by heart, then generalizes to new cases using a similarity measure
Model-Based - Build a model of examples to generalize from then use this model to make predictions
Data Sampling/Processing
Nonrepresentative Training Data
- If the sample is too small, you will get sampling noise - nonrepresentative data as a result of chance
- If sampling method is flawed, you will get sampling bias
Poor-Quality Data
- If some instances are clearly outliers, it may help to discard them or fix the errors manually (incorrect data)
- If some instances are missing a few features, you can decide to ignore the feature altogether, ignore those instances, impute the missing values, or train one model with the feature and one without it
Irrelevant Features
- Garbage in, garbage out
- Feature Engineering
- Feature selection - selecting the most useful features to train on among existing features
- Feature extraction - combining existing features to produce a more useful one
- Creating new features by gathering new data
Overfitting
Overfitting the Training Data
- The model performs well on the training data but does not generalize well (it performs poorly on test/new data)
- Detecting patterns in the 'noise'
- Happens when the model is too complex relative to the amount and noisiness of the training data
- Solutions include:
- Simplify the model by selecting one with fewer parameters
- Gather more training data
- Reduce the noise (fix data errors and remove outliers)
Regularization
- Constraining a model to make it simpler and reduce the risk of overfitting
- The amount of regularization to apply during learning is controlled by a hyper parameter
- A hyper parameter is a parameter of a learning algorithm (not of the model)
- Must be set prior to training and remains constant during training
Underfitting the Training Data
- Model is too simple to learn the underlying structure of the data
- Reality is more complex than the model so predictions are inaccurate
- Can be fixed by:
- Selecting a more powerful model, with more parameters
- Feeding better features to the learning algorithm (feature engineering)
- Reducing the constraints on the model (reducing the regularization hyperparameter)
Testing and Validating
Split data into training set and test set
- Error rate on new cases is called generalization error (or out-of-sample error)
- If training error is low and out-of-sample error (or test error) is high, the model is overfitting
- A second holdout set called the validation set
- Train models with various hyper parameters using training set
- Select model and hyper parameters that perform best on validation set
- Test against test set to get estimate of generalization error
- Common technique is cross validation
- The training set is split into complementary subsets, and each model is trained against a different combination of these subsets and validated against the remaining parts
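The splitting step of cross validation can be sketched in plain Python. This shows a k-fold split over sample indices only (no model training); the function name is an assumption, and libraries such as scikit-learn provide ready-made versions.

```python
def k_fold_splits(n_samples: int, k: int):
    """Return k (training indices, validation indices) pairs.

    Each fold is used exactly once for validation; the remaining
    folds form the training subset for that round.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        splits.append((training, validation))
        start += size
    return splits


for training, validation in k_fold_splits(n_samples=6, k=3):
    print("train on", training, "validate on", validation)
```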
Chapter 1 Summary
- Machine learning is making machines get better at some task by learning from data instead of having to explicitly code rules
- Many types of ML systems
- Feed training set to a learning algorithm
- If model-based, it tunes some parameters to fit the model to the training set, and then hopefully makes good predictions on new cases
- If instance-based, it learns the examples by heart and uses a similarity measure to generalize to new instances
- System will not perform well if your training set is:
- Too small
- Not representative
- Noisy
- Polluted with irrelevant features
- Model needs to be neither too simple nor too complex
Types of databases
- Relational databases store data with a defined schema. These are commonly used for transactional and traditional applications
- Key-value databases are optimized to store and retrieve key-value pairs in large volumes and in milliseconds, without the performance overhead and scale limitations of relational databases
- Document databases are designed to store data as documents. Data is typically represented as a readable document
- In-memory databases are used for read-heavy and compute-intensive applications that require low-latency access to data
- Graph databases are used for applications that need to enable users to query and navigate relationships between highly connected datasets
RDS - Relational Database Service
Aurora
DynamoDB
DocumentDB
ElastiCache
Neptune
Redshift
1. Introduction
Two definitions of Machine Learning:
- “The field of study that gives computers the ability to learn without being explicitly programmed.” - Arthur Samuel (older, informal definition)
- “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” - Tom Mitchell (more modern definition)
Generally speaking, any machine learning problem can be put into one of two broad classifications:
- Supervised learning
- You are given a data set and already know what the correct output should look like as there is a relationship between the input and output
- Can be further categorized into:
- Regression Problems - trying to map input variables to some continuous function, predicting results within a continuous output
- Classification Problems - trying to map input variables into discrete categories, predicting results in a discrete output
- Unsupervised learning
- Allows us to approach problems with little or no idea of what the output should look like; we derive structure from data where we do not necessarily know the effect of the variables
- This structure can be derived from clustering data based upon relationships among the variables in the data
2. Model and Cost Function
(x(i), y(i)) -> the i-th training example, where x -> input and y -> output, with i = 1, … , m (m -> number of training examples)
h -> hypothesis
Linear Regression with One Variable
Hypothesis: h𝜃(x) = 𝜃0 + 𝜃1x
𝜃i’s: Parameters
Choose 𝜃0, 𝜃1 so that h𝜃(x) is close to y for our training examples (x, y)
Minimization problem over 𝜃0, 𝜃1: minimize J(𝜃0, 𝜃1) = (1/(2m)) * Sum[i = 1 to m] of (h𝜃(x(i)) - y(i))^2
This J is the Cost Function (also called the Squared Error Function/Mean squared error)
3. Parameter Learning
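Update rule (repeat until convergence): theta-i := theta-i - alpha * (partial derivative of the Cost Function with respect to theta-i)
Note: It must be a simultaneous update for each theta-i
Gradient descent uses the derivative of our cost function to get a direction to move in. The size of each step is determined by alpha, called the learning rate.
The derivative is the partial derivative of the Cost Function with respect to theta-i
It is finished once it has converged, i.e. you have reached the bottom of the graph (a minimum), where the derivative is zero and the updates no longer change theta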
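A minimal sketch of the update loop described above, assuming the one-variable hypothesis h(x) = theta0 + theta1 * x. The function name, the default alpha and the iteration count are illustrative choices, not values from the course.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x with batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction error for every training example
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the squared error cost function
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old thetas
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x
theta0, theta1 = gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

On this toy data the result should approach theta0 = 1 and theta1 = 2. A larger alpha may overshoot and diverge, a smaller one needs more iterations.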
4. Multivariate Linear Regression
Multivariate Linear Regression is just Linear Regression with more than one variable
A compact way to write the above formula is h𝜃(x) = 𝜃(transpose) * x, with x0 = 1
We need to change our Gradient Descent algorithm in order to use multiple variables. Note that for each theta-j, we are just taking the partial derivative of the cost function, which brings down the exponent of the squared term, cancels out the 2 in the 1/2m term and leaves the xj term as a factor.
Be sure to fully understand the equation:
- Theta term - represents the different weights or parameters we associate with a given feature
- Alpha - learning rate or step size
- n - number of features
- h(x) - hypothesis equation, what we are guessing to fit the data
- m - number of training examples
- x(i) - feature values of the i-th training example
- y(i) - actual (target) value of the i-th training example
- J(theta) - cost function or squared error function, telling you the difference between true values and model
- *** We are taking the partial derivative of J(theta) with respect to theta-j ***
Feature Scaling: If we have our features on a similar scale then gradient descent can converge more quickly. We can easily scale non-negative features by dividing each one by its maximum value, which shrinks it into the range 0 <= x <= 1. ‘Get every feature into approximately a: -1 <= xi <= 1 range.’ This is not a hard set rule; any small range that is not astronomically large or small works. As a rough guide, ranges as wide as -3 to 3 or as narrow as -1/3 to 1/3 are still fine.
Mean Normalization: Replace xi with (xi - ui) / si to make features have approximately zero mean (do not apply to x0 = 1). Here ui is the average value of feature i and si is its range (max - min).
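A small sketch of mean normalization for a single feature column, using the range (max - min) as the divisor as described above. The function name and the sample values are made up for illustration.

```python
def mean_normalize(values):
    """Rescale one feature to (value - mean) / (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    if value_range == 0:
        # A constant feature carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(v - mean) / value_range for v in values]


sizes = [1000.0, 1500.0, 2000.0]
scaled = mean_normalize(sizes)  # [-0.5, 0.0, 0.5]
```

The same mean and range must be reused when scaling new inputs at prediction time.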
A good way to debug gradient descent is to plot the cost function over the number of iterations. If J(theta) is increasing then use a smaller alpha.
If alpha is sufficiently small, then J(theta) will decrease on every iteration.
Below is another tip to improve Gradient Descent:
Note: if we fit our model with quadratic, cubic or square root features, feature scaling becomes important because squaring or cubing a feature greatly widens its range.
5. Computing Parameters Analytically
Normal equation: Method to solve for Theta analytically
To minimize J(Theta), take the derivative with respect to Theta and set it to zero, which gives Theta = (X^T * X)^-1 * X^T * y
***taken from https://eli.thegreenplace.net/2014/derivation-of-the-normal-equation-for-linear-regression to calculate the derivation ***
*** To fully understand the last step of “we derive by each component of the vector, and then combine the resulting derivatives into a vector again.” see https://eli.thegreenplace.net/2015/the-normal-equation-and-matrix-calculus/ *** - it involves the use of Matrix Identities, Symmetric Matrices and a solid understanding of partial derivatives and Matrices in general
THERE IS NO NEED TO DO FEATURE SCALING WITH THE NORMAL EQUATION
Gradient Descent
- Need to choose alpha
- Needs many iterations
- O(kn^2)
- Works well even when n is large
Normal Equation
- No need to choose alpha
- Don’t need to iterate
- Need to compute inverse O(n^3)
- Slow if n is very large
If X^T * X is non-invertible (also called singular or degenerate), this is usually due to redundant features (linearly dependent, in mathematical terms) or too many features (m <= n). In either case we delete some features or use regularization.
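A sketch of the normal equation with NumPy, assuming the design matrix X already contains the x0 = 1 column. Using the pseudo-inverse (`np.linalg.pinv`) rather than a plain inverse is a choice made here so the code still returns an answer when X^T * X is singular; the function name is illustrative.

```python
import numpy as np


def normal_equation(X, y):
    """Solve for theta analytically: theta = pinv(X^T X) X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # pinv still works when X^T X is singular (redundant or too many features)
    return np.linalg.pinv(X.T @ X) @ X.T @ y


# First column is x0 = 1; data generated from y = 1 + 2x
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 5, 7]
theta = normal_equation(X, y)
```

No alpha and no iterations are needed, and the features are used unscaled.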
6. Hypothesis Representation and Decision Boundary
Logistic/Sigmoid Function: g(z) = 1 / (1 + e^-z), with horizontal asymptotes at y = 0 as z goes to negative infinity and y = 1 as z goes to infinity.
NOTE: Remember that the decision boundary is a property of the hypothesis (its parameters), not of the training set. It comes from theta = …
7. Logistic Regression Model
-log(z) if y = 1 and -log(1-z) if y = 0 (where z = h(x)) are very useful for our cost function, as they allow us to ‘punish’ our learning algorithm for predictions that deviate from the expected values, due to how they approach infinity along the y axis.
This also ensures our cost function will be convex, so that gradient descent does not get stuck in local minima
We can rewrite our cost function so that it fits on one line. *Remember that y can only ever equal 0 or 1, as this is how y is defined for binary classification
Cost(h(x),y) = -ylog(h(x)) - (1-y)log (1 - h(x))
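A short sketch of the one-line cost above together with the sigmoid function, to show how the cost ‘punishes’ confident wrong predictions. The function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_cost(h, y):
    """Cost(h(x), y) = -y*log(h(x)) - (1 - y)*log(1 - h(x)), with y equal to 0 or 1."""
    return -y * math.log(h) - (1 - y) * math.log(1 - h)


confident_right = logistic_cost(0.99, 1)  # small cost
confident_wrong = logistic_cost(0.01, 1)  # large cost
```

Because y is either 0 or 1, exactly one of the two terms is active for any example.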
Optimization algorithms:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
Advantages (conjugate, BFGS, L-BFGS)
- No need to manually pick alpha
- Often faster than gradient descent
Disadvantages:
- More complex
We have more efficient and sophisticated algorithms than gradient descent, but we do not need to write them ourselves; we can instead use the implementations in Octave libraries.
Now, if we have multiple classes we want to classify, we can use the One-vs-all classification approach. For each class we train a classifier that predicts the probability that ‘y’ is a member of that class, while all the other classes are combined into one ‘negative’ class. To classify a new input, we pick the class whose classifier reports the highest probability.
8. Solving the Problem of Overfitting
“Underfit”/“High Bias” - using a linear prediction model when a higher order model would be more fitting
“Overfit”/“High Variance” - using too high an order of polynomial for a model
Overfitting: If we have too many features, the learned hypotheses may fit the training set very well but fail to generalize to new examples.
Note: This terminology is applied to both linear and logistic regression
We can help to mitigate the issue of overfitting through two different options:
- Reducing the number of features
- Regularizing the magnitude of parameters
The idea behind regularization is giving us a “simpler” hypothesis by penalizing some of the parameters to have small values so that they become less prone to overfitting.
Modify cost function to add a term at the end (regularization term) that will include lambda (regularization parameter). It controls the tradeoff of keeping theta parameters small and fitting the training set well.
If lambda is set too high, then we will underfit the data because our theta terms are made too small - basically a flat line
Regularized linear regression:
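A sketch of the regularized cost, assuming the usual form J(theta) = (1/(2m)) * [Sum of squared errors + lambda * Sum[j = 1 to n] of theta-j^2], where theta-0 is not penalized. The function name and the toy data are illustrative.

```python
def regularized_cost(theta, X, y, lam):
    """Squared error cost plus a penalty on every parameter except theta[0]."""
    m = len(y)
    predictions = [sum(t * x for t, x in zip(theta, row)) for row in X]
    squared_error = sum((p - target) ** 2 for p, target in zip(predictions, y))
    # Regularization term: theta[0] (the intercept) is left out by convention
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)


# Each row starts with x0 = 1; theta = [0, 1] fits this data exactly
X = [[1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0]
cost = regularized_cost([0.0, 1.0], X, y, lam=2.0)  # 0.5, all of it from the penalty
```

Even a perfect fit pays a price for non-zero parameters, which is the tradeoff lambda controls.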
Regularized logistic regression:
9. Neural Networks
We are modeling a neural network by taking in input and performing some calculation and spitting out an output.
Sigmoid (logistic) activation function.
Network has multiple layers:
Layer 1 - input layer
Layer 2 - Hidden layer (do not observe values)
Layer 3 - output layer
Essentially, our parameters are now called weights but they act in the same way. There is not a lot of new information, just renaming certain aspects that we have already covered. The new addition would be the linear progression of x -> a -> h(x).
We are essentially doing the same thing as logistic regression, our features are just now a0 a1 a2 and a3 instead of x0 x1 x2 x3
These are learned features, which can be fairly complex and add flexibility in what is fed into the calculation for our output layer.
Architectures refer to how the neurons are connected.
Below is an example of predicting x1 AND x2 (the logical ‘and’ operator) and the x1 OR x2, either x1 is true or x2 is true or they are both true.
Now we can make a more complex logical operator (in this case the XNOR logical operator) using a neural network.
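A sketch of such a network in code. The specific weights (-30/20/20 for AND, -10/20/20 for OR, 10/-20/-20 for "neither input is true") are assumed values that make each sigmoid unit saturate near 0 or 1; any weights with the same signs and large enough magnitudes would behave the same way.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, inputs):
    """One sigmoid unit; weights[0] is the bias weight."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return sigmoid(z)


AND_WEIGHTS = [-30, 20, 20]   # fires only when both inputs are 1
OR_WEIGHTS = [-10, 20, 20]    # fires when at least one input is 1
NOR_WEIGHTS = [10, -20, -20]  # fires only when both inputs are 0


def xnor(x1, x2):
    # Hidden layer: a1 = x1 AND x2, a2 = (NOT x1) AND (NOT x2)
    a1 = neuron(AND_WEIGHTS, [x1, x2])
    a2 = neuron(NOR_WEIGHTS, [x1, x2])
    # Output layer: a1 OR a2
    return neuron(OR_WEIGHTS, [a1, a2])


truth_table = {(x1, x2): round(xnor(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)}
```

The hidden layer computes two simple features and the output layer combines them, which is something a single logistic unit cannot do for XNOR.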
This can be taken further to have a multiclass classification by returning a vector.
10. NN Cost Function and Backpropagation
Now we want to find parameters to minimize J(theta). So we need to compute J(theta) and partial derivative of J(theta) wrt theta.
Given a training set we want to calculate the minimization of J(theta) along with its partial derivative.
If you had two training examples (x1, y1) and (x2, y2) - you would perform Forward Propagation on x1, Back Propagation on y1, then Forward on x2 and Back on y2.
Some more walkthrough on Back-Propagation.
Now we can ‘unroll’ our parameters to make it easier to use for our higher complexity algorithms
Gradient Checking: Make sure your forward/back prop is working correctly and accurately by comparing its gradient against a numerical estimate
Gradient Checking is very slow and costly when compared to backprop code, which is why we turn it off once backprop has been verified and before we train.
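A sketch of gradient checking with a two-sided difference, (J(theta + eps) - J(theta - eps)) / (2 * eps), applied to one parameter at a time. The toy cost function and its hand-written gradient stand in for a real network and its backprop code; all names here are illustrative.

```python
def numerical_gradient(cost, theta, epsilon=1e-4):
    """Approximate each partial derivative of cost at theta numerically."""
    grads = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        grads.append((cost(plus) - cost(minus)) / (2 * epsilon))
    return grads


def cost(theta):
    # Toy cost standing in for J(theta)
    return theta[0] ** 2 + 3 * theta[0] * theta[1]


def analytic_gradient(theta):
    # Hand-derived gradient standing in for the backprop result
    return [2 * theta[0] + 3 * theta[1], 3 * theta[0]]


theta = [1.0, 2.0]
approx = numerical_gradient(cost, theta)
exact = analytic_gradient(theta)
```

Each parameter needs two full cost evaluations, which is why this check is far slower than backprop.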
We use random initialization to break symmetry: if all weights start at the same value (for example zero), every hidden unit computes the same thing and learns the same weights.
Putting it all together:
11. Evaluating a Learning Algorithm
Machine Learning Diagnostic: A test that shows what is and isn’t working with a learning algorithm and gives guidance on how best to improve it.
High Bias -> Underfit
High Variance -> Overfit
As we increase the degree of polynomial d, the Training error tends to decrease. The cross validation error is more U-shaped: it falls at first and then rises again. By comparing the cross validation error with the training error, we can determine whether the model is underfit or overfit
Bias (underfit): J train(theta) will be high and J crossVal(theta) ≈ J train(theta)
Variance (overfit): J train(theta) will be low and J crossVal(theta) >> J train(theta)
Identifying the actual problem is the main component in trying to “decide what to do next”
12. Machine Learning System Design
A major takeaway from this section is the idea to list out all the possible ways you can approach the problem. Far too often someone will come up with an idea then instantly devote 6 months of time to that when there may be an alternative approach that would be more effective.
“Use evidence to guide our learning, not gut feeling.”
“Key tests that I often ask myself are: first, can a human expert look at the features x and confidently predict the value of y, because that’s sort of a certification that y can be predicted accurately from the features x; and second, can we actually get a large training set and train a learning algorithm with a lot of parameters on it? If you can do both, then that will more often give you a very high performance learning algorithm.”
13. Support Vector Machines
We take the logistic regression cost function and modify it a little: each curved cost term is replaced by a piecewise-linear approximation made of straight line segments.
These new functions are called cost subscript 1 of z and cost subscript 0 of z (denoting when y is equal to 1 versus when y is equal to zero).
Writing the logistic regression objective as A + lambda * B (A is the training cost term, B is the regularization term), the SVM instead uses C * A + B, so C plays a role similar to 1/lambda. If C = 1/lambda, the two objectives are minimized by the same parameters.
This is our overall optimization objective function for the SVM and if we minimize that function then we have the parameters learned by the SVM.
Unlike logistic regression, the support vector machine doesn’t output the probability but outputs one if theta transpose x is greater or equal to zero and will predict zero otherwise.
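A sketch of the two cost functions and the objective. It assumes the common hinge form, cost1(z) = max(0, 1 - z) and cost0(z) = max(0, 1 + z); the exact slope of the straight segments is an assumption here, and the function names are illustrative.

```python
def cost1(z):
    """Cost when y = 1: zero once z >= 1, growing linearly below that."""
    return max(0.0, 1.0 - z)


def cost0(z):
    """Cost when y = 0: zero once z <= -1, growing linearly above that."""
    return max(0.0, 1.0 + z)


def dot(theta, row):
    return sum(t * x for t, x in zip(theta, row))


def svm_objective(theta, X, y, C):
    """C * (sum of per-example costs) + 1/2 * sum of theta_j^2 for j >= 1."""
    data_term = sum(
        label * cost1(dot(theta, row)) + (1 - label) * cost0(dot(theta, row))
        for row, label in zip(X, y)
    )
    return C * data_term + 0.5 * sum(t ** 2 for t in theta[1:])


def svm_predict(theta, row):
    """Output 1 when theta^T x >= 0 and 0 otherwise (no probability)."""
    return 1 if dot(theta, row) >= 0 else 0


# Each row starts with x0 = 1
X = [[1.0, 1.0], [1.0, -1.0]]
y = [1, 0]
objective = svm_objective([0.0, 2.0], X, y, C=1.0)  # 2.0: no data cost, only the penalty
```

Both examples sit outside the margin (z = 2 and z = -2), so only the regularization term contributes.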
The SVM tries to establish a large margin between its decision boundary and the nearest training examples, which is why it is also called the large margin classifier.
When it behaves purely as a large margin classifier (C very large), it can be susceptible to outliers.
Kernels:
14. Clustering, Dimensionality Reduction and PCA
In unsupervised learning, we are given training data X with no corresponding y values. We can find structure in it with Clustering algorithms. The K-means algorithm is a good starting example.
K-Means Clustering
- Works by placing K cluster centroids in our data set (each at a random position or at a randomly selected data point)
- For this example we will choose 2 centroids; every data point then joins whichever cluster centroid it is closer to
- Next, each centroid moves to the ‘averaged middle’ (mean) of the data points assigned to it, and the process repeats until the centroids have converged
- The algorithm is outlined below into a two step process
- Cluster assignment step
- Move centroid
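The two steps above can be sketched in plain Python. The initial centroids are passed in explicitly here (two of the data points) so the example is repeatable; in practice they would be picked at random. Function names are illustrative.

```python
def closest(point, centroids):
    """Index of the centroid nearest to point (squared Euclidean distance)."""
    distances = [sum((p - c) ** 2 for p, c in zip(point, centroid)) for centroid in centroids]
    return distances.index(min(distances))


def k_means(points, centroids, max_iterations=100):
    for _ in range(max_iterations):
        # Cluster assignment step
        assignments = [closest(p, centroids) for p in points]
        # Move centroid step: each centroid becomes the mean of its members
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, assignments


points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, assignments = k_means(points, [points[0], points[2]])
```

Here the centroids settle at (1.0, 1.5) and (8.5, 8.0) after one move.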
K-means optimization objective
- This algorithm is optimizing its own version of the cost function in order to minimize the distance between the centroids and the data points assigned to them
- Depending on how the centroids are initialized, the algorithm can converge to a local minimum instead of the global minimum
- In order to mitigate this, we use random initialization many times and pick the clustering that gave the lowest cost
- When choosing the value of K we can use the elbow method, but in general this is not a reliable method
- You should instead evaluate K-means based on a metric for how well it performs for the later (downstream) purpose the clusters will be used for
- EX: T-shirts are pricey to order in 5 sizes instead of 3 so choose k = 3
Before we do Dimensionality Reduction, we want to do some data preprocessing. This means feature scaling and mean normalization.
Now we can reduce data from a higher dimension (say 3D) to a lower dimension (say 2D)
- PCA works to minimize the squared projection error by finding a vector u(1) to project onto (or two vectors u(1) and u(2) if we are going from 3D to 2D).
- In the example below, we are projecting data with 3 coordinates onto a 2D plane
Principal Component Analysis
- This analysis works by calculating the projection error between our data points and the surface we are projecting them onto
- Then we are simply doing another optimization of this cost function
- NOTE: this is different than linear regression - LR measures error distance strictly vertically while PCA measures it orthogonally to the plane being projected onto
- This accounts for the different results despite the two looking very similar visually
PCA Algorithm
- Reduce data from n-dim to k-dim
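- Compute the “covariance matrix”: Sigma = (1/m) * Sum[i = 1 to m] of x(i) * x(i)(transpose)
- Compute the “eigenvectors” of matrix Sigma (Σ)
- We can do this easily by using the svd() function
- This gives us 3 matrices U, S, V where U is an n x n matrix whose columns are these u vectors; the first k columns are the directions we project onto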
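The steps above can be sketched with NumPy, where `np.linalg.svd` plays the role of the svd() function. The function name `pca` and the toy data are illustrative, and only mean normalization (no further scaling) is applied here.

```python
import numpy as np


def pca(X, k):
    """Reduce the rows of X from n dimensions to k dimensions."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)             # mean normalization
    m = X_centered.shape[0]
    sigma = (X_centered.T @ X_centered) / m     # n x n covariance matrix
    U, S, _ = np.linalg.svd(sigma)              # columns of U are the u vectors
    U_reduce = U[:, :k]                         # keep the first k directions
    Z = X_centered @ U_reduce                   # projected data, m x k
    return Z, U_reduce, S


# Points that lie exactly on the line y = 2x, so one dimension is enough
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
Z, U_reduce, S = pca(X, k=1)
X_approx = Z @ U_reduce.T + np.mean(X, axis=0)  # map back to 2D
```

Because the toy points are perfectly collinear, mapping back from 1D reproduces the original data; with real data the reconstruction is only approximate.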
Number of Principal Components
- In order to choose k we will have to test multiple values
- We take the Average Squared Projection Error and the Total Variation in the data and evaluate their ratio:
- If this is <= 0.01 we say “99% of variance is retained”, which gives us a set threshold for determining our k
- We can also use that SVD function from earlier and use the S matrix calculated
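A sketch of picking k from the diagonal of S, assuming the usual rule: choose the smallest k for which (sum of the first k values) / (sum of all values) is at least the retained fraction. The function name and the sample values are made up for illustration.

```python
def choose_k(singular_values, retained=0.99):
    """Smallest k whose leading values keep at least `retained` of the variance."""
    total = sum(singular_values)
    running = 0.0
    for k, s in enumerate(singular_values, start=1):
        running += s
        if running / total >= retained:
            return k
    return len(singular_values)


# Diagonal of S from svd(), largest first (made-up numbers)
k = choose_k([6.0, 3.0, 0.95, 0.05])  # 3
```

This needs only one call to svd(), instead of rerunning PCA for every candidate k.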
Application of PCA
- Compression
- Reduces memory/disk needed to store data
- Speeds up learning algorithm
- Done by choosing k by % of variance retained
- Visualization
- K = 2 or 3
- Do Not Use…
- To try to reduce overfitting: it may work okay, but regularization is a better idea
- Always run with the original/raw data first, and only reduce with PCA if that does not do what you want
15. Anomaly Detection and Recommender Systems
Main idea consists of creating a density estimation of the dataset. Then check whether our x-test is anomalous by evaluating our density estimation ( p(x) ) at that point. We do this by comparing p(x-test) to a threshold Epsilon: if p(x-test) < Epsilon, we flag it as an anomaly.
Using the mean and variance of x we can fit a Gaussian (normal) distribution, which plots as a bell curve. The area under the curve (its integral) will always be 1, since it is a probability density.
- As you decrease the variance, the graph gets narrower (or wider if increased)
- Mean value will determine where the graph is centered
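A sketch of the density-estimation idea, assuming each feature is modeled by its own Gaussian and p(x) is the product of the per-feature densities. The function names, the training values and the Epsilon used are illustrative.

```python
import math


def fit_gaussian(values):
    """Estimate the mean and variance of one feature."""
    m = len(values)
    mu = sum(values) / m
    sigma2 = sum((v - mu) ** 2 for v in values) / m
    return mu, sigma2


def gaussian_density(x, mu, sigma2):
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def is_anomaly(x, params, epsilon):
    """Flag x when p(x), the product of per-feature densities, falls below epsilon."""
    p = 1.0
    for value, (mu, sigma2) in zip(x, params):
        p *= gaussian_density(value, mu, sigma2)
    return p < epsilon


# Rows are examples, columns are features (made-up, all considered normal)
training = [[10.0, 5.0], [11.0, 5.5], [9.0, 4.5], [10.0, 5.0]]
params = [fit_gaussian(column) for column in zip(*training)]

typical = is_anomaly([10.0, 5.0], params, epsilon=0.01)  # False
outlier = is_anomaly([20.0, 5.0], params, epsilon=0.01)  # True
```

In practice Epsilon would be tuned on a labeled cross validation set rather than fixed by hand.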
Anomaly detection algorithm
Having labeled examples (y = 0 or 1) gives us a real-number evaluation metric, so that we can more reliably evaluate our learning algorithm. This allows us to do cross validation and use different evaluation metrics (precision, recall, F1 score)
Anomaly Detection vs. Supervised Learning
Anomaly detection
- Fraud detection
- Manufacturing (engines)
- Monitoring machines in a data center
Supervised Learning
- Email spam classification
- Weather prediction
- Cancer classification
Sometimes our features give us a graph that is a non-gaussian distribution.
- We can change our features in order to try to alter the graph to appear more gaussian
- EX: x1 = log(x1) or x2 = sqrt(x2)
Multivariate Gaussian distribution
- By modeling p(x) all in one go instead of separately we can gain more information from the graph.
- Our parameters are: mean and Sigma (covariance matrix)
- This can alter our distribution to look more like an ellipse instead of a circle
The difference from the old model is that the per-feature variance terms are replaced by our new Sigma term (the covariance matrix). This new term allows us to compute p(x) in one go, as the formula uses the determinant of Sigma.
Recommender Systems
- We can use content-based recommender systems to use ratings to create our recommendations.
- This is going back to our gradient descent and minimization problems.
- Note: sum of i:r(i,j)=1 means that we only sum over the terms for which we have a rating
- We also add a sum over j from 1 to n-u in order to include the ratings of all the different users
- When given x or theta, we can estimate the other
- There is also a way to minimize both simultaneously
- The optimization equations for each individually use the same gradient term, the regularization is the difference
- Therefore we add both to our equation
- This gives us our collaborative filtering algorithm
Low rank matrix factorization
- If we factor our content recommendations into matrices, we can easily create predictions
- Now we can use mean normalization in order to make recommendations on things we have little data on
16. Large Scale Machine Learning and Pipeline
Previously, we learned that in many cases it is not a question of which algorithm runs the best but instead who has more data. If we have a learning curve with high variance, we can add more training examples in order to decrease our error, while with high bias adding more data won’t do much.
Gradient Descent
- While GD can be very accurate, it can be very computationally expensive if we have say 300,000,000 training examples as we have to run through all 300 million before we move one step
- Normal Gradient Descent is usually called Batch gradient descent
Stochastic Gradient Descent
- Here we only take one training example at a time, compute the gradient on it and update right away
- This may not take us directly to the global minimum, but on average it moves toward it
- Especially if we gradually lower our step size
Mini Batch
- A sort-of in-between method, we do not use all examples nor only 1 but instead b examples (maybe b = 10 or = 50)
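A sketch of mini-batch gradient descent for one-variable linear regression; setting b = 1 gives stochastic gradient descent and b = m gives batch gradient descent. The function name, the defaults and the fixed random seed are illustrative choices.

```python
import random


def mini_batch_gradient_descent(xs, ys, b=2, alpha=0.05, epochs=1000, seed=0):
    """Fit h(x) = theta0 + theta1 * x, updating after every b examples."""
    rng = random.Random(seed)
    theta0, theta1 = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new random order
        for start in range(0, len(indices), b):
            batch = indices[start:start + b]
            errors = [theta0 + theta1 * xs[i] - ys[i] for i in batch]
            grad0 = sum(errors) / len(batch)
            grad1 = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            # One step per mini batch instead of one step per full pass
            theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
mini_theta = mini_batch_gradient_descent(xs, ys, b=2)
sgd_theta = mini_batch_gradient_descent(xs, ys, b=1)
```

On noisy real data the parameters keep wandering around the minimum, which is why the step size is often lowered gradually.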
Checking for Convergence
Online Learning: We get information corresponding to a user/content and forever update our cost function in order to better optimize our algorithm as new data comes in
Map Reduce: We are basically splitting the training set into smaller subsets and combining our results at the end. This takes pressure off our computational expense if we spread the work between different machines/cores
Pipeline: The idea of splitting a project into smaller tasks that can be focused on by individual groups
- We can create data using artificial data synthesis in order to increase our training data size
- Can be done by adding background noise in audio or distortion to an image
Also need to evaluate how helpful doing certain tasks will actually be to the overall objective
- Does getting more data really help?
- Be sure to check that the model has low bias first (use learning curves)
- How much work would it take to get more data?
- Collecting and labeling it yourself can sometimes be easier than artificial data synthesis
- Also crowd sourcing
- Deciding on what part of pipeline should have most time dedicated to it
- Manually correct different aspects and see how your accuracy improves
- Can be a great way to see what requires more time and to avoid wasted time
Python Virtual Environment
# Creates a project folder then places virtual environment in `venv` folder inside project
mkdir project
cd project
python -m venv venv
# starts virtual environment
source venv/bin/activate
# stops virtual environment
deactivate
# store installed libraries by generating a text file listing dependencies
pip freeze > requirements.txt
# install all dependencies to recreate the developed environment
pip install -r requirements.txt
# upgrade pip
pip install --upgrade pip
Links:
Euler Problem 18
By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.
3
7 4
2 4 6
8 5 9 3
That is, 3 + 7 + 4 + 9 = 23.
Find the maximum total from top to bottom of the triangle below:
75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23
NOTE: As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o)
My Solution
def max_path_sum(triangle):
    height = len(triangle)
    # Initialize state array with the same shape as the triangle
    sum_triangle = []
    for i in range(height):
        sum_triangle.append([0 for _ in range(len(triangle[i]))])
    sum_triangle[0][0] = triangle[0][0]
    # Dynamic programming approach, loop through triangle and update states.
    # This makes it run in polynomial time vs exponential with a recursion call.
    for i in range(1, height):
        for j in range(0, i + 1):
            # Check edge cases
            if i == j:
                # Right edge: only reachable from the right edge of the row above
                sum_triangle[i][j] = sum_triangle[i-1][j-1] + triangle[i][j]
            elif j == 0:
                # Left edge: only reachable from the left edge of the row above
                sum_triangle[i][j] = sum_triangle[i-1][j] + triangle[i][j]
            else:
                # Take max of prior two states and add new value
                sum_triangle[i][j] = max(sum_triangle[i-1][j-1], sum_triangle[i-1][j]) + triangle[i][j]
    return max(sum_triangle[height-1])

# triangle is the triangle above as a list of rows, each row a list of ints
max_path_sum(triangle)
Answer: 1,074
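As an alternative sketch of the same dynamic programming idea, the triangle can be collapsed from the bottom row upward, so no edge cases are needed. The helper names `parse_triangle` and `max_path_sum_bottom_up` are illustrative, and the small four-row triangle from the problem statement is used as the check.

```python
def parse_triangle(text):
    """Turn whitespace-separated rows of numbers into a list of lists of ints."""
    return [[int(n) for n in line.split()] for line in text.strip().splitlines()]


def max_path_sum_bottom_up(triangle):
    # best[j] is the best total from position j in the current row down to the bottom
    best = list(triangle[-1])
    for row in reversed(triangle[:-1]):
        best = [value + max(best[j], best[j + 1]) for j, value in enumerate(row)]
    return best[0]


small = parse_triangle("""
3
7 4
2 4 6
8 5 9 3
""")
result = max_path_sum_bottom_up(small)  # 23, matching the worked example
```

The same `parse_triangle` helper can build the `triangle` argument for the larger problem from its text.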
Euler Problem 19
You are given the following information, but you may prefer to do some research for yourself.
- 1 Jan 1900 was a Monday.
- Thirty days has September,
April, June and November.
All the rest have thirty-one,
Saving February alone,
Which has twenty-eight, rain or shine.
And on leap years, twenty-nine.
- A leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400.
How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)?
My Solution
def leap_year(year):
    if year % 4 == 0:
        if year % 400 == 0:
            return True
        elif year % 100 == 0:
            return False
        return True
    return False

def num_sum_on_first():
    sunday_count = 0
    # Day of the year of each first of the month, no leap year
    first = [1,32,60,91,121,152,182,213,244,274,305,335]
    # Change to day of the week
    first = [x % 7 for x in first]
    # Jan 1, 1901 was a Tuesday (remainder 1), so values with remainder 6 will be Sundays.
    # 365 % 7 = 1 so starting day will shift by one each year (2 after a leap year).
    shift = 0
    for year in range(1901, 2001):
        leap = leap_year(year)
        # On leap years, March onward falls one day later; Jan and Feb are unaffected
        days = [(x + shift + (1 if leap and month >= 2 else 0)) % 7
                for month, x in enumerate(first)]
        # For each first of the month, check against sunday value
        sunday_count += days.count(6)
        # Increment for change of year
        shift += 2 if leap else 1
    return sunday_count

num_sum_on_first()
Answer: 171
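As a cross-check, the same count can be sketched with the standard library `datetime` module, which already knows the calendar rules. The function name is illustrative.

```python
from datetime import date


def count_first_sundays(start_year=1901, end_year=2000):
    """Count the months in the given years whose first day is a Sunday."""
    return sum(
        1
        for year in range(start_year, end_year + 1)
        for month in range(1, 13)
        if date(year, month, 1).weekday() == 6  # weekday(): Monday is 0, Sunday is 6
    )


total = count_first_sundays()  # 171
```

This skips the manual day arithmetic entirely, at the cost of hiding the reasoning the puzzle is meant to exercise.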