Red Hat OpenStack Administration II: Day 2 Operations for Cloud Operators
CL210
Course Objectives and Structure
Chapter 1: Navigating the Red Hat OpenStack Platform Architecture
Goal: Describe the undercloud and overcloud architecture and service components.
Objectives:
- Describe basic Red Hat OpenStack Platform architecture and terminology.
- Describe containerized services and manage containers using basic commands.
- Describe the core Red Hat OpenStack components, manage undercloud services, and view the undercloud structure.
- Describe the core Red Hat OpenStack components, manage overcloud services, and view the overcloud structure.
Introducing the Red Hat OpenStack Platform Infrastructure
Introducing Cloud Personas
OpenStack persona ecosystem
OpenStack Personas in this Course
Interpreting the Classroom Environment
Completed classroom overcloud
Guided Exercise: Navigating the Red Hat OpenStack Platform Infrastructure
Introducing Containerized Services
Introducing Containerized Services
Managing containers with systemd services
Log and Configuration File Locations
Guided Exercise: Viewing Containerized Service Structures
Describing the Undercloud
Introducing the Undercloud
Power Management on the Undercloud
Guided Exercise: Viewing the Undercloud Architecture
Introducing the Overcloud
Viewing the Overcloud Network
Completed classroom overcloud structure
Working with Service Containers
Guided Exercise: Viewing the Overcloud Architecture
Lab: Navigating the Red Hat OpenStack Platform Architecture
Summary
- Red Hat uses personas to focus training on relevant user tasks and behavior. This course uses the cloud operator persona to define Red Hat OpenStack Platform operations and use cases. A cloud operator's scope is to manage one or more OpenStack overclouds.
- Each of the main Red Hat OpenStack Platform services runs in a container. Running services in containers keeps each service in its own isolated namespace, separated from the host. The systemd service interface manages the life cycle of the containerized services.
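For example, on an overcloud node where services run as containers managed by systemd, a service and its managing unit can be inspected as follows (the tripleo_* unit naming and the keystone service are illustrative and follow recent releases):

    sudo podman ps --filter name=keystone     # running container for the service
    sudo systemctl status tripleo_keystone    # systemd unit that manages the container life cycle
    sudo podman logs keystone                 # container output; persistent logs are under /var/log/containers/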
Summary (continued)
- Enterprise clouds today are built using multiple, interconnected cloud structures. The undercloud is a provisioning and management cloud for building and managing the production clouds. Red Hat OpenStack Platform Director is the undercloud in Red Hat OpenStack Platform.
- An enterprise production cloud is known as an overcloud. Underclouds and overclouds use the same technologies, but manage different workloads. Underclouds manage cloud infrastructure, and overclouds manage production and tenant workloads.
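As an illustration of this separation, the undercloud and overcloud are addressed with separate credential files; the file names below are the usual director defaults on the undercloud node:

    source ~/stackrc          # undercloud credentials
    openstack server list     # lists the overcloud nodes, which are workloads of the undercloud
    source ~/overcloudrc      # overcloud admin credentials
    openstack server list     # lists tenant instances running in the production cloud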
Chapter 2: Operating the OpenStack Control Plane
Goal: Describe and manage the critical core services of the OpenStack control plane.
Objectives:
- Identify the shared services running on a controller node.
- Manage messages and the message broker.
- Back up and restore the control plane.
Identifying Control Plane Services and Operation
The constructed nova-novncproxy instance-specific URL
Guided Exercise: Viewing Control Plane Services and Operation
Describing Component Communication
Message Broker Exchange Concepts
A work queue distributes messages to worker consumers
Exchanges and queues use bindings to establish routing connections
Exchanges route topic messages using user-specified pattern matching
RPC calls are implemented using a preset direct exchange for the RPC response
Configuration and Log Files
Troubleshooting Service Communication over Messaging
Tracing RabbitMQ Messages
Guided Exercise: Describing Component Communication
Backing up and Restoring the Control Plane
Tools for Backing up and Restoring the Control Plane
Preparing the Backup and Control Plane Nodes
Backing up the Control Plane
Restoring the Control Plane
Guided Exercise: Backing up and Restoring the Control Plane
Lab: Operating the OpenStack Control Plane
Summary
- API endpoints are discoverable through the use of a service catalog. Endpoints are segregated by networks to avoid exposure of sensitive information.
- All Red Hat OpenStack Platform components communicate with each other using an AMQP-compliant messaging system. RabbitMQ is the default, but other messaging back ends can be used. The components communicate using RPC implemented on top of the messaging system. The Oslo Messaging library abstracts the interprocess communication away from AMQP, so developers do not need to work with AMQP directly.
Summary (continued)
- RabbitMQ has a tracing feature that allows administrators to view messages passing through the messaging system (see the example after this list). Tracing adds overhead, so it should be enabled only when required.
- There are many services that support the main components. They allow components to store data in a relational or nonrelational database, perform caching in memory to increase performance, and enable users to connect to instance consoles.
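As referenced in the tracing point above, a minimal sketch using rabbitmqctl, assuming the command is run where the broker is reachable (in containerized deployments, typically inside the RabbitMQ container):

    rabbitmqctl list_queues name messages consumers   # inspect queue depth and consumer counts
    rabbitmqctl trace_on                               # enable the firehose tracer (adds overhead)
    rabbitmqctl trace_off                              # disable tracing when finished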
Chapter 3: Managing Infrastructure Security
Goal: Protect the OpenStack infrastructure by securing service component files and endpoints.
Objectives:
Managing End-to-end Secure Services
End-to-end Secure Services
Quiz: Managing End-to-end Secure Services
Managing File-based Component Security with AIDE
Advanced Intrusion Detection Environment
Guided Exercise: Managing File-based Component Security with AIDE
Summary
- TLS-everywhere may be deployed on new and pre-existing OSP environments. It is used to provide end-to-end SSL/TLS for internal and external endpoints.
- To deploy TLS-everywhere, a properly configured IdM server with DNS functionality is required.
- AIDE is used to detect file system changes by comparing hashes against a database.
- The AIDE database is located at /var/lib/aide/aide.db.gz by default, but it can be moved to network storage and set up as read-only for security purposes.
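A minimal AIDE workflow sketch using the default RHEL paths:

    sudo aide --init                                               # build the baseline database
    sudo mv /var/lib/aide/aide.db.new.gz /var/lib/aide/aide.db.gz  # activate the new baseline
    sudo aide --check                                              # report changes against the baseline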
Chapter 4: Managing OpenStack User Security
Goal: Configure secure user privileges using domain-based identity management, scoped roles, and project organization.
Objectives:
- Describe the installation and architecture of a Red Hat Identity Management back end for the OpenStack identity service.
- Manage user tokens to implement user authorization for accessing OpenStack services.
- Manage project quotas, domains, hierarchies, and groups.
- Customize user roles.
Managing an Integrated IdM Back-end Configuration
Identity Service Architecture
Configuring IdM Integration with TripleO
Guided Exercise: Creating an OpenStack Domain for a Client Organization
Managing Identity Service Tokens
Guided Exercise: Managing Identity Service Tokens
Managing Project Organization
Guided Exercise: Managing Project Organization
Describing OpenStack Policies
Viewing the Default Policy
Overriding the Default Policy
Guided Exercise: Customizing User Roles
Lab: Managing OpenStack User Security
Summary
- The Identity service can be configured to authenticate users from external IdM environments.
- To perform user credential authentication using a Red Hat IdM back end, the Identity service requires an LDAP look-up account on the IdM server.
- There are four types of token providers: UUID, PKI, PKIZ, and Fernet. Fernet tokens are enabled by default since Red Hat OpenStack Platform 12.
Summary (continued)
- Fernet tokens have a maximum size of 250 bytes, which makes them small enough to be ideal for API calls and to minimize the data kept on disk. Each Fernet key is composed of two smaller keys: a 128-bit AES encryption key and a 128-bit SHA-256 HMAC signing key.
- Domains provide granularity in the authorization model. With domains, the resource mapping can be summarized as "a domain consists of users and projects, within which users can have roles at the project and domain level."
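For example, the domain model can be exercised with the openstack client; the domain, project, user, and role names below are illustrative, and the role name available by default varies by release:

    openstack domain create finance
    openstack project create --domain finance accounting
    openstack user create --domain finance --password-prompt operator1
    openstack role add --user operator1 --user-domain finance --project accounting member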
Chapter 5: Managing Application Deployment Resources
Goal: Create and manage the common, shared deployment resources, including custom images, flavors, and metadata services.
Objectives:
- Describe the common image formats used within Red Hat OpenStack Platform.
- Build an image using diskimage-builder.
- Customize an image using Virt-customize.
- Customize launched instances during deployment using Cloud-init.
Comparing Image Formats
Image Size Considerations
Images in OpenStack Services
Quiz: Comparing Image Formats
Building Customized Images
Diskimage-builder Architecture
Diskimage-builder Phase Subdirectories
Diskimage-builder Environment Variables
Diskimage-builder Options
Diskimage-builder Execution
Guided Exercise: Building Customized Images
Making Minor Image Customizations
Guestfish and Virt-customize Internals
Guided Exercise: Customizing Images
Initializing an Instance During Deployment
Reviewing Cloud-init Basics
Providing User Data with the Config Drive Method
Describing the Metadata Service Architecture
Cloud-init communication to the Compute Metadata API
Describing the Cloud-init Structure
Guided Exercise: Initializing an Instance During Deployment
Lab: Managing Application Deployment Resources
Summary
- The advantages and disadvantages of building an image as opposed to customizing an existing one, for reasons such as meeting organization security standards, including third-party agents, and adding operator accounts.
- When to use the Guestfish and Virt-customize tools. Use Guestfish when you need to perform low-level tasks such as partitioning disks, and use Virt-customize for all common customization tasks such as setting passwords and installing packages.
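A minimal Virt-customize sketch (the image name and settings are illustrative); the --selinux-relabel option restores the SELinux file contexts discussed in the next point:

    virt-customize -a finance-web.qcow2 \
      --install httpd \
      --root-password password:redhat \
      --selinux-relabel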
Summary (continued)
- Making changes to an image using these tools affects SELinux file contexts, because SELinux is not supported directly in the chroot environment.
- To avoid image sprawl, use a smaller set of images and perform per-instance customization using Cloud-init or a configuration management system as required.
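A sketch of per-instance customization with Cloud-init user data; the image, flavor, and network names are illustrative:

    cat > user-data << 'EOF'
    #cloud-config
    packages:
      - httpd
    runcmd:
      - systemctl enable --now httpd
    EOF

    openstack server create --image rhel8 --flavor default --network finance-net \
      --user-data user-data --config-drive true web1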
Chapter 6: Managing Storage in Red Hat OpenStack Platform
Goal: Describe and manage storage architecture and components, with an emphasis on Red Hat Ceph Storage back ends.
Objectives:
- Describe back-end storage options for OpenStack services.
- Discuss the Swift and Ceph object stores, comparing architectural considerations.
- Describe the multiple methods for configuring the new shared file-systems component.
- Interpret the behaviour of ephemeral storage configuration choices.
Implementing Block Storage
Storage in Red Hat OpenStack Platform
The Red Hat Ceph Storage Architecture
Ceph cluster RADOS back end
Ceph Storage Back-end Components
Ceph component placement on overcloud nodes
Authentication with Cephx
OpenStack image service with Red Hat Ceph Storage
OpenStack Block Storage Service with Red Hat Ceph Storage
Viewing the Ceph Cluster Status
Managing Ceph-backed Persistent Volumes
Introducing Cinder Multiattach
Guided Exercise: Implementing Block Storage
The OpenStack Object Store Architecture
The OpenStack object store architecture
Comparing Ceph with Swift for Object Storage
Eventual Consistency as an Advantage
Benefits, Use Cases, and Recommended Practices
Describing the Storage Policy
Configuration and Log Files
OpenStack Services That Use the Object Store
Guided Exercise: Comparing Object Stores
Managing Shared File Systems
Describing the Shared File Systems Service (Manila)
The Shared File Systems service architecture
Shared File Systems Service Resources
NFS-Ganesha Mediating Storage Access
Managing Shared File Systems Using the CephFS Native Driver
Guided Exercise: Managing Shared File Systems
Managing Ephemeral and Persistent Storage
Ephemeral Versus Persistent Storage
Ephemeral Storage and Libvirt
Choosing an Instance Storage Option
Creating Persistent Root Disks
Understanding Instance Migration
Guided Exercise: Managing Ephemeral and Persistent Storage
Lab: Managing Storage in Red Hat OpenStack Platform
Summary
- Red Hat OpenStack Platform supports both Red Hat Ceph Storage and NFS as storage back ends.
- The Red Hat Ceph Storage architecture is based on monitor (MON) daemons and object storage device (OSD) daemons.
- Red Hat Ceph Storage uses the CephX authentication mechanism to identify and authorize users.
- The Red Hat OpenStack Platform block storage and image services use RBDs to access Ceph.
- The Red Hat OpenStack Platform object storage service (Swift) provides scalable object storage to the tenant users.
- The OpenStack object store architecture includes a front-end service, the proxy server, and three back-end services: the account server, the object server, and the container server.
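For example, tenant users interact with the object store through containers and objects; the names below are illustrative:

    openstack container create reports              # request is handled by the proxy server
    openstack object create reports monthly.csv     # upload a local file as an object
    openstack object list reports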
Summary (continued)
- The OpenStack Shared File Systems service manages the shares that serve the file-based storage requirements. Users from the instances access the shares as remote file systems. The shares can be hosted on various back-end storage providers, including Red Hat Ceph Storage and Red Hat Gluster Storage.
- The data in ephemeral storage does not persist after instance termination. Persistent storage retains data independently of the instance life cycle, and is mainly used for data intended to be reusable.
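As a sketch of a persistent root disk (image, size, and names are illustrative), boot the instance from a volume so that the root disk survives instance termination:

    openstack volume create --image rhel8 --size 10 web-root
    openstack server create --flavor default --network finance-net --volume web-root web1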
Chapter 7: Managing OpenStack Networking
Goal: Create, manage, and troubleshoot data center and virtual network infrastructure.
Objectives:
- Discuss flat, VLAN, and GENEVE network types.
- Describe the OVN architecture and how it integrates with OVS and OpenStack Neutron.
- Describe OVN DHCP, security groups, metadata, and load balancing.
Describing Network Protocol Types
Introducing Modular Layer 2 (ML2)
Introducing Network Types
VLAN tagging on a Linux bridge
VXLAN and GENEVE frame comparison
Guided Exercise: Configuring Provider Networks
Describing Open Virtual Networking Implementation
Introduction to Open Virtual Networking (OVN)
OVN architecture
OVN control plane architecture
Comparing ML2 with OVS to ML2 with OVN
Guided Exercise: Implementing Open Virtual Networking
Open Virtual Networking Services
Multiple conntrack zones
Guided Exercise: Open Virtual Networking Services
Lab: Managing OpenStack Networking
Summary
- Open Virtual Networking (OVN) is the SDN networking project that extends OVS to provide layer 2 and layer 3 network functions. It uses GENEVE encapsulation to create tunnels between OpenStack nodes.
- The Modular Layer 2 (ML2) plug-in is a framework that enables OpenStack Networking to use various layer 2 technologies. Administrators can interact with Open vSwitch or any vendor technology, such as Cisco equipment, using the various plug-ins available for OpenStack networking.
- Self-service and provider networks can be used in combination or exclusively to define the network connection types available to instances. Provider networks give instances a layer 2 connection to networks outside of OpenStack, whereas tenant networks require an OpenStack router to connect them to an external network.
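For example, a VLAN provider network that gives instances direct layer 2 access to a data center segment might be created as follows; the physical network name, VLAN ID, and subnet values are illustrative:

    openstack network create --external --share \
      --provider-network-type vlan --provider-physical-network datacentre \
      --provider-segment 101 provider-101
    openstack subnet create --network provider-101 --subnet-range 192.0.2.0/24 \
      --gateway 192.0.2.1 --dhcp provider-101-subnet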
Chapter 8: Managing Compute Node Operations
Goal: Describe the architecture and management processes for normal and hyperconverged compute nodes. Perform common operational tasks for compute nodes to support workload uptime and scaling.
Objectives:
-
Explain the launch process and discuss the compute scheduling and compute placement services.
-
Explain Red Hat Hyperconverged Infrastructure.
-
Discuss the migration process, migrate an instance, and enable and disable a compute node.
Describing the Instance Launch Process
Describing the Instance Launch Process
Scheduling process
Guided Exercise: Describing the Instance Launch Process
Introducing Red Hat Hyperconverged Infrastructure
Introduction to HCI Nodes
Hyperconverged Configuration
Host aggregates
Guided Exercise: Launching to a Hyperconverged Compute Node
Administering Compute Nodes
Introduction to Migration and Evacuation
Guided Exercise: Administering Compute Nodes
Lab: Managing Compute Node Operations
Summary
- The instance launch process.
- The scheduler process receives resource requests from the conductor. The request is then sent to the placement service, which runs a query to find available compute nodes that can satisfy the resource requirements. The list is then sent back to the scheduler service. The scheduler uses filters to refine the search and creates a ranked list of compute nodes. The scheduler then chooses the first compute node from the list, and the conductor attempts to build the instance there.
- The placement service was included in Red Hat OpenStack Platform 13. It implements a RESTful API to address the problems of accounting for shared storage, and its generic resource pools solve the problem of incorrect capacity and usage information.
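A brief sketch of inspecting the scheduling and placement inputs; the resource provider commands assume the osc-placement client plugin is installed:

    openstack hypervisor list                              # compute nodes known to the compute service
    openstack resource provider list                       # inventories tracked by the placement service
    openstack allocation candidate list --resource VCPU=2  # providers able to satisfy a sample request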
Summary (continued)
- Hyperconverged nodes combine both compute and storage on the same node. Ceph is always used as the storage component in hyperconverged nodes. Hyperconverged storage is cheaper and more flexible than standard storage nodes.
- Resource constraints are configured to ensure that Ceph and Compute services do not interfere with each other.
- Migration is the process of moving a server instance from one compute node to another. Evacuation normally occurs when a compute node fails or is shut down; in this case, all instances on the compute node are moved to another node. Running instances can be live migrated using either shared or block storage.
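A sketch of common compute node maintenance operations; host and server names are illustrative, and migration option names vary slightly between client versions (older clients use --live <host> instead of --live-migration):

    openstack compute service set --disable --disable-reason "maintenance" \
      compute0.overcloud.example.com nova-compute     # stop scheduling new instances to the node
    openstack server migrate --live-migration web1    # live migrate a running instance
    openstack compute service set --enable compute0.overcloud.example.com nova-compute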
Chapter 9: Monitoring OpenStack Operations
Goal: Describe the monitoring framework and architecture to gather, store and use operational metrics.
Objectives:
Implementing the Service Telemetry Framework
Introduction to the Service Telemetry Framework
Service Telemetry Framework Architecture Overview
Analyzing and Visualizing Metrics
Grafana Quickview dashboard
Grafana Network Interfaces dashboard
Grafana CPU dashboard
Grafana Memory dashboard
Grafana File System dashboard
Grafana Disk dashboard
Summary
- The core components of the Service Telemetry Framework and its workflow.
- How to verify client-side communication to the STF storage domain.
- How to navigate the Grafana dashboard user interface.
- How to use PromQL queries to generate custom dashboards in Grafana.
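A minimal sketch of querying the Prometheus API in the STF storage domain directly; the route URL is illustrative and authentication depends on how the environment exposes Prometheus. The same PromQL expressions can back custom Grafana panels:

    curl -sk 'https://prometheus-stf.apps.example.com/api/v1/query?query=up' | jq .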
Chapter 10: Automating Cloud Applications
Goal: Implement and deploy complex and scalable applications using automation technologies.
Objectives:
- Describe the architecture and implementation of orchestration in OpenStack cloud-scale computing.
- Create orchestration templates for deploying single servers, multiple resources, and mass-scale applications as stacks.
- Create Ansible Playbooks for deploying single servers, multiple resources, and applications.
Managing Cloud-scale Application Deployment
Application Automation at Scale
Orchestration Service Architecture
The Orchestration Service architecture
Quiz: Managing Cloud-scale Application Deployment
Writing Heat Orchestration Templates
Heat Orchestration Template Syntax
Instance Configuration Using Orchestration
Software deployment processing
Guided Exercise: Writing Heat Orchestration Templates
Deploying Applications Using Ansible
Introduction to OpenStack Modules
Running OpenStack Modules
Using Playbooks to Automate Operations
Guided Exercise: Deploying Applications Using Ansible
Lab: Automating Cloud Applications
Summary
- The Orchestration Service (Heat) provides developers and system administrators a repeatable way to create and manage a collection of related OpenStack resources.
- The Orchestration API Service forwards requests to the Orchestration engine service using remote procedure calls (RPCs) over AMQP.
- The Orchestration engine service interprets the orchestration template and launches the stack.
- When user data is changed and the orchestration stack is updated using the openstack stack update command, the instance is deleted and recreated using the updated user-data script.
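For example, a stack life cycle with the openstack client; the template and environment file names are illustrative:

    openstack stack create --template web_stack.yaml --environment web_env.yaml web-stack
    # after editing the user data in the template, update the stack;
    # the affected instance is deleted and recreated with the new user-data script
    openstack stack update --template web_stack.yaml --environment web_env.yaml web-stack
    openstack stack resource list web-stack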
Summary (continued)
- Ansible Playbooks provide a repeatable way to interact with OpenStack to manage its resources and create tasks to orchestrate your application needs.
- The default values for Ansible modules can differ from the default values of OpenStack commands and can change the context or characteristics of the resource being manipulated.
Chapter 11: Troubleshooting OpenStack Operations
Goal: Describe and practice the tools and methods for diagnosing and troubleshooting issues encountered during deployment and management operations.
Objectives:
- Use recommended diagnostics and troubleshooting tools and techniques.
- Diagnose and troubleshoot the OpenStack networking, image, and volume services.
Diagnosing OpenStack Issues
Basic Troubleshooting Methodology Tools
Verifying OpenStack Service Status
Troubleshooting Tools for Messaging Broker
Guided Exercise: Diagnosing OpenStack Issues
Troubleshooting Common Core Issues
Troubleshooting OpenStack Networking Issues
Guided Exercise: Troubleshooting Common Core Issues
Lab: Troubleshooting OpenStack Operations
Summary
- The openstack client command includes a --debug option, which allows you to see the request and response payloads. The output can also be logged to a file for later analysis using the --log-file option (see the example after this list).
- Network issues may require troubleshooting at the physical layer or at the SDN layer. Common commands for network troubleshooting include ip link show, tcpdump, and ovs-vsctl.
- OVN implements all of its functions using OpenFlow rules. These rules can be difficult to navigate; the ovn-trace and ovs-appctl ofproto/trace commands allow administrators to trace the path for a packet with a given set of properties.
- The Compute scheduler component selects compute nodes based on an algorithm. The no valid host error means that the Compute scheduler has not identified a compute node that can provide the resources required by the instance.
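As a sketch of the tools referenced above; host names, interface names, and file paths are illustrative:

    # capture the full request and response payloads for one client call
    openstack --debug server list
    openstack --debug --log-file /tmp/osc-debug.log server list

    # inspect the physical and SDN layers on an overcloud node
    ip link show
    sudo tcpdump -nn -i eth0 icmp
    sudo ovs-vsctl show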
Summary (continued)
- To access an instance using a floating IP address, both the external network associated with that floating IP address and the internal network to which the instance is connected must be connected using a router.
- Images can be protected to prevent accidental deletion.
- To troubleshoot issues when launching instances, verify that all the required user and project infrastructure, such as networks, routers, and key pairs, are in place first.
- The OpenStack block storage service requires that the openstack user has read, write, and execute capabilities on both the volumes and the images pools in Ceph.
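For example, the Ceph client capabilities can be reviewed from a node with Ceph administrative access; the client name openstack is the director default but may differ in other deployments:

    ceph auth list                    # all registered clients and their capabilities
    ceph auth get client.openstack    # capabilities granted to the OpenStack services client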
Chapter 12: Comprehensive Review
Reviewing Red Hat OpenStack Administration II: Day 2 Operations for Cloud Operators
Lab: Building a Custom Image
Lab: Managing Project Organization
Lab: Launching an Application Stack