Red Hat OpenStack Administration II: Day 2 Operations for Cloud Operators

CL210

Welcome

Course Objectives and Structure

Schedule

Chapter 1: Navigating the Red Hat OpenStack Platform Architecture

Goal: Describe the undercloud and overcloud architecture and service components.


Objectives:

  • Describe basic Red Hat OpenStack Platform architecture and terminology.

  • Describe containerized services and manage containers using basic commands.

  • Describe the core Red Hat OpenStack components, manage undercloud services, and view the undercloud structure.

  • Describe the core Red Hat OpenStack components, manage overcloud services, and view the overcloud structure.

Introducing the Red Hat OpenStack Platform Infrastructure

Introducing Cloud Personas

OpenStack persona ecosystem

OpenStack Personas in this Course

Interpreting the Classroom Environment

Completed classroom overcloud

Guided Exercise: Navigating the Red Hat OpenStack Platform Infrastructure

Introducing Containerized Services

Introducing Containerized Services

Container Commands

Managing containers with systemd services

Log and Configuration File Locations

Guided Exercise: Viewing Containerized Service Structures

Describing the Undercloud

Introducing the Undercloud

Power Management on the Undercloud

Guided Exercise: Viewing the Undercloud Architecture

Describing the Overcloud

Introducing the Overcloud

Managing the Overcloud

Viewing the Overcloud Network

Completed classroom overcloud structure

Working with Service Containers

Managing Storage Nodes

Guided Exercise: Viewing the Overcloud Architecture

Lab: Navigating the Red Hat OpenStack Platform Architecture

Summary

  • Red Hat uses personas to focus training on relevant user tasks and behavior. This course uses the cloud operator persona to define Red Hat OpenStack Platform operations and use cases. A cloud operator's scope is to manage one or more OpenStack overclouds.
  • Each of the main Red Hat OpenStack Platform services runs in a container. Running services in containers keeps each service in its own isolated namespace, separated from the host. The systemd service interface manages the life cycle of the containerized services.
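
The container and systemd management described above can be explored with a few commands on an overcloud node. This is an illustrative sketch only; the exact unit and container names (tripleo_nova_api, nova_api) vary by deployment and release, and are assumptions here.

```shell
# List running service containers (names vary by deployment):
sudo podman ps --format "{{.Names}} {{.Status}}"

# List the systemd units that manage the containerized services:
systemctl list-units 'tripleo_*'

# Restart one containerized service through its systemd unit
# (unit name is an example):
sudo systemctl restart tripleo_nova_api.service

# Inspect the logs of a specific service container:
sudo podman logs nova_api | tail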

Summary (continued)

  • Enterprise clouds today are built using multiple, interconnected cloud structures. The undercloud is a provisioning and management cloud for building and managing the production clouds. In Red Hat OpenStack Platform, director serves as the undercloud.
  • An enterprise production cloud is known as an overcloud. Underclouds and overclouds use the same technologies, but manage different workloads. Underclouds manage cloud infrastructure, and overclouds manage production and tenant workloads.

Chapter 2: Operating the OpenStack Control Plane

Goal: Describe and manage the critical core services of the OpenStack control plane.


Objectives:

  • Identify the shared services running on a controller node.

  • Manage messages and the message broker.

  • Back up and restore the control plane.

Identifying Control Plane Services and Operation

Shared Services Overview

MariaDB

Redis

Memcached

Pacemaker

Ceph MON and MDS

NoVNC and SPICE

The constructed nova-novncproxy instance-specific URL

Guided Exercise: Viewing Control Plane Services and Operation

Describing Component Communication

RabbitMQ Overview

Message Broker Exchange Concepts

A work queue distributes messages to worker consumers

Exchanges and queues use bindings to establish routing connections

Exchanges route topic messages using user-specified pattern matching

RPC calls are implemented using a preset direct exchange for the RPC response

Oslo Messaging Library

Configuration and Log Files

Troubleshooting Service Communication over Messaging

Tracing RabbitMQ Messages

Guided Exercise: Describing Component Communication

Backing up and Restoring the Control Plane

Tools for Backing up and Restoring the Control Plane

Preparing the Backup and Control Plane Nodes

Backing up the Control Plane

Restoring the Control Plane

Guided Exercise: Backing up and Restoring the Control Plane

Lab: Operating the OpenStack Control Plane

Summary

  • API endpoints are discoverable through the service catalog. Endpoints are segregated by network to avoid exposing sensitive information.
  • All Red Hat OpenStack Platform components communicate with each other using an AMQP-compliant messaging system. RabbitMQ is the default, but other messaging back ends can be used. The components communicate using RPC implemented on top of the messaging system. The Oslo Messaging library abstracts interprocess communication away from AMQP, so developers do not need to work with AMQP directly.

Summary (continued)

  • RabbitMQ has a tracing feature that allows administrators to view messages passing through the messaging system. Tracing adds overhead, so enable it only when required.
  • There are many services that support the main components. They allow components to store data in a relational or nonrelational database, perform caching in memory to increase performance, and enable users to connect to instance consoles.
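
The topic-exchange routing mentioned above can be sketched in a few lines. This is a conceptual model only, assuming the standard AMQP wildcard rules ('*' matches exactly one dot-separated word, '#' matches zero or more); real routing is performed inside the broker, and the binding patterns shown are examples.

```python
# A minimal sketch of AMQP topic-exchange pattern matching as used by
# RabbitMQ topic exchanges. Conceptual model only, not broker code.

def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP topic binding pattern matches a routing key."""
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' can absorb zero or more words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

# Example bindings, like those a compute service might declare:
print(topic_matches("compute.*", "compute.node1"))        # True
print(topic_matches("compute.*", "compute.node1.info"))   # False
print(topic_matches("compute.#", "compute.node1.info"))   # True
```

The '#' wildcard is what lets a single binding receive every message in a subtree of routing keys, which is how fanout-style notification consumers are commonly wired to a topic exchange.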

Chapter 3: Managing Infrastructure Security

Goal: Protect the OpenStack infrastructure by securing service component files and endpoints.


Objectives:

  • Describe the process required for configuring TLS-e.

  • Deploy and configure AIDE.

Managing End-to-end Secure Services

End-to-end Secure Services

Quiz: Managing End-to-end Secure Services

Managing File-based Component Security with AIDE

Advanced Intrusion Detection Environment

Guided Exercise: Managing File-based Component Security with AIDE

Summary

  • TLS everywhere (TLS-e) can be deployed on new or existing Red Hat OpenStack Platform environments. It provides end-to-end SSL/TLS encryption for internal and external endpoints.
  • Deploying TLS-e requires a properly configured IdM server with DNS functionality.
  • AIDE is used to detect file system changes by comparing hashes against a database.
  • The AIDE database is located at /var/lib/aide/aide.db.gz by default but can be moved to network storage and set up as read-only for security purposes.
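
The hash-comparison idea behind AIDE can be sketched as follows. This is a conceptual model only, not AIDE itself: AIDE stores its baseline in /var/lib/aide/aide.db.gz in its own format and checks many more attributes than content hashes.

```python
# A simplified sketch of AIDE-style integrity checking: record a baseline
# of file content hashes, then re-scan and report files that changed.
import hashlib
import os

def sha256_of(path: str) -> str:
    """Hash a file's contents in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(paths):
    """Map each file path to its current content hash (the 'database')."""
    return {p: sha256_of(p) for p in paths}

def check_against_baseline(baseline):
    """Return the files whose hash no longer matches the baseline."""
    return [p for p, digest in baseline.items()
            if not os.path.exists(p) or sha256_of(p) != digest]
```

As with AIDE, the baseline is only trustworthy if it is protected; storing it read-only off the monitored host is what the read-only network storage recommendation above addresses.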

Chapter 4: Managing OpenStack User Security

Goal: Configure secure user privileges using domain-based identity management, scoped roles, and project organization.


Objectives:

  • Describe the installation and architecture of a Red Hat Identity Management back end for the OpenStack identity service.

  • Manage user tokens to implement user authorization for accessing OpenStack services.

  • Manage project quotas, domains, hierarchies, and groups.

  • Customize user roles.

Managing an Integrated IdM Back-end Configuration

Identity Service Architecture

Authentication Methods

Installing Red Hat IdM

Red Hat IdM Integration

Configuring IdM Integration with TripleO

Guided Exercise: Creating an OpenStack Domain for a Client Organization

Managing Identity Service Tokens

Authentication Tokens

Token Providers

Token Administration

Guided Exercise: Managing Identity Service Tokens

Managing Project Organization

Projects

Guided Exercise: Managing Project Organization

Customizing User Roles

Describing OpenStack Policies

Viewing the Default Policy

Overriding the Default Policy

Guided Exercise: Customizing User Roles

Lab: Managing OpenStack User Security

Summary

  • The Identity service can be configured to authenticate users from external IdM environments.
  • To perform user credential authentication using a Red Hat IdM back end, the Identity service requires an LDAP look-up account on the IdM server.
  • There are four types of token providers: UUID, PKI, PKIZ, and Fernet. Fernet tokens are enabled by default since Red Hat OpenStack Platform 12.

Summary (continued)

  • Fernet tokens have a maximum size of 250 bytes, which makes them small enough to be ideal for API calls and minimizes the data kept on disk. Each Fernet key is composed of two smaller keys: a 128-bit AES encryption key and a 128-bit SHA-256 HMAC signing key.
  • Domains provide granularity in the authorization model. With domains, the resource mapping can be summarized as "a domain consists of users and projects, within which users can have roles at the project and domain level."
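
The HMAC signing half of a Fernet key can be illustrated with the standard library. This sketch omits the AES-CBC encryption step that real Fernet performs, and the key below is generated locally for illustration rather than taken from a keystone key repository.

```python
# A sketch of the HMAC-SHA256 signing half of a Fernet-style token.
# Real Fernet (used by the Identity service) also AES-encrypts the
# payload and includes a version byte and timestamp.
import base64
import hashlib
import hmac
import os

signing_key = os.urandom(16)  # 128-bit signing key, as in a Fernet key

def sign(payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag and base64url-encode the result."""
    mac = hmac.new(signing_key, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload + mac)

def verify(token: bytes) -> bytes:
    """Check the tag in constant time and return the payload."""
    raw = base64.urlsafe_b64decode(token)
    payload, mac = raw[:-32], raw[-32:]
    expected = hmac.new(signing_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("invalid token signature")
    return payload

token = sign(b'{"user": "operator1", "project": "production"}')
assert verify(token) == b'{"user": "operator1", "project": "production"}'
```

Because validation needs only the key, any controller holding the key repository can verify a token without a database lookup, which is why Fernet tokens are non-persistent.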

Chapter 5: Managing Application Deployment Resources

Goal: Create and manage the common, shared deployment resources, including custom images, flavors, and metadata services.


Objectives:

  • Describe the common image formats used within Red Hat OpenStack Platform.

  • Build an image using diskimage-builder.

  • Customize an image using Virt-customize.

  • Customize launched instances during deployment using Cloud-init.

Comparing Image Formats

Common Image Formats

Image Size Considerations

Images in OpenStack Services

Quiz: Comparing Image Formats

Building Customized Images

Building a Custom Image

Diskimage-builder Architecture

Diskimage-builder Phase Subdirectories

Diskimage-builder Environment Variables

Diskimage-builder Options

Diskimage-builder Execution

Guided Exercise: Building Customized Images

Customizing Images

Making Minor Image Customizations

Guestfish and Virt-customize Internals

Use Cases

Additional Tools

Guided Exercise: Customizing Images

Initializing an Instance During Deployment

Reviewing Cloud-init Basics

Providing User Data with the Config Drive Method

Describing the Metadata Service Architecture

Cloud-init communication to the Compute Metadata API

Describing the Cloud-init Structure

Guided Exercise: Initializing an Instance During Deployment

Lab: Managing Application Deployment Resources

Summary

  • The advantages and disadvantages of building an image rather than customizing an existing one: for example, meeting organizational security standards, including third-party agents, and adding operator accounts.
  • When to use the Guestfish and Virt-customize tools. Use Guestfish when you need to perform low-level tasks, such as partitioning disks, and use Virt-customize for common customization tasks, such as setting passwords and installing packages.

Summary (continued)

  • Making changes to an image using these tools affects SELinux file contexts, because SELinux is not supported directly in the chroot environment.
  • To avoid image sprawl, use a smaller set of images and perform per-instance customization using Cloud-init or a configuration management system as required.
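
As an example of the per-instance customization recommended above, a minimal cloud-init user-data file passed at launch time might look like the following. The package, user, and key values are placeholders for this sketch.

```yaml
#cloud-config
# Illustrative user data (for example, openstack server create --user-data);
# package names and user details are examples only.
package_update: true
packages:
  - httpd
users:
  - name: operator1
    groups: wheel
runcmd:
  - systemctl enable --now httpd
```

Keeping customization in user data like this lets one generic image serve many application roles, which is the image-sprawl point above.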

Chapter 6: Managing Storage in Red Hat OpenStack Platform

Goal: Describe and manage storage architecture and components, with an emphasis on Red Hat Ceph Storage back ends.


Objectives:

  • Describe back-end storage options for OpenStack services.

  • Discuss the Swift and Ceph object stores, comparing architectural considerations.

  • Describe the multiple methods for configuring the new shared file-systems component.

  • Interpret the behavior of ephemeral storage configuration choices.

Implementing Block Storage

Storage in Red Hat OpenStack Platform

Block Storage Choices

The Red Hat Ceph Storage Architecture

Ceph cluster RADOS back end

Ceph Terminology

Ceph Storage Back-end Components

Ceph component placement on overcloud nodes

Authentication with Cephx

The Key-ring File

Command-line Tools

Authorization with Cephx

Cephx Capabilities

User Management

OpenStack image service with Red Hat Ceph Storage

OpenStack Block Storage Service with Red Hat Ceph Storage

Viewing the Ceph Cluster Status

Managing Ceph-backed Persistent Volumes

Introducing Cinder Multiattach

Guided Exercise: Implementing Block Storage

Comparing Object Stores

The OpenStack Object Store Architecture

The OpenStack object store architecture

Comparing Ceph with Swift for Object Storage

Eventual Consistency as an Advantage

The Rings

Benefits, Use Cases, and Recommended Practices

Describing the Storage Policy

Swift Commands

Configuration and Log Files

OpenStack Services That Use the Object Store

Guided Exercise: Comparing Object Stores

Managing Shared File Systems

Describing the Shared File Systems Service (Manila)

The Shared File Systems service architecture

Shared File Systems Service Resources

NFS-Ganesha Mediating Storage Access

Managing Shared File Systems Using the CephFS Native Driver

Guided Exercise: Managing Shared File Systems

Managing Ephemeral and Persistent Storage

Ephemeral Versus Persistent Storage

Ephemeral Storage and Libvirt

Choosing an Instance Storage Option

Creating Persistent Root Disks

Understanding Instance Migration

Guided Exercise: Managing Ephemeral and Persistent Storage

Lab: Managing Storage in Red Hat OpenStack Platform

Summary

  • Red Hat OpenStack Platform supports both Red Hat Ceph Storage and NFS as storage back ends.
  • The Red Hat Ceph Storage architecture is based on monitor (MON) daemons and object storage device (OSD) daemons.
  • Red Hat Ceph Storage uses the CephX authentication mechanism to identify and authorize users.
  • The Red Hat OpenStack Platform block storage and image services use RBDs to access Ceph.
  • The Red Hat OpenStack Platform object storage service (Swift) provides scalable object storage to the tenant users.
  • The OpenStack object store architecture includes a front-end service, the proxy server, and three back-end services: the account server, the object server, and the container server.

Summary (continued)

  • The OpenStack Shared File Systems service manages the shares that serve file-based storage requirements. Users access the shares from instances as remote file systems. The shares can be hosted on various back-end storage providers, including Red Hat Ceph Storage and Red Hat Gluster Storage.
  • The data in ephemeral storage does not persist after instance termination. Persistent storage retains data independently of the instance life cycle and mainly stores data intended to be reused.
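
The persistent volume operations summarized above look like this on the command line. These commands are illustrative only: the volume and server names are examples, and checking Ceph health assumes a node with a Ceph client and an appropriate keyring.

```shell
# Check overall Ceph cluster health (node with a Ceph client and keyring):
sudo ceph -s

# Create a Ceph-backed persistent volume and attach it to an instance
# (resource names are examples):
openstack volume create --size 1 vol1
openstack server add volume finance-server1 vol1
openstack volume show vol1 -f value -c status
```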

Chapter 7: Managing OpenStack Networking

Goal: Create, manage, and troubleshoot data center and virtual network infrastructure.


Objectives:

  • Discuss flat, VLAN, and GENEVE network types.

  • Describe the OVN architecture and how it integrates with OVS and OpenStack Neutron.

  • Describe OVN DHCP, security groups, metadata, and load balancing.

Describing Network Protocol Types

Introducing Modular Layer 2 (ML2)

Introducing Network Types

VLAN tagging on a Linux bridge

VXLAN and GENEVE frame comparison

Guided Exercise: Configuring Provider Networks

Describing Open Virtual Networking Implementation

Introduction to Open Virtual Networking (OVN)

OVN architecture

OVN control plane architecture

Comparing ML2 with OVS to ML2 with OVN

Guided Exercise: Implementing Open Virtual Networking

Open Virtual Networking Services

OVN Gateway Router

Security Groups

Multiple conntrack zones

Describing OVN DHCP

OVN Load Balancing

OVN Metadata

Guided Exercise: Open Virtual Networking Services

Lab: Managing OpenStack Networking

Summary

  • Open Virtual Networking (OVN) is the SDN project that extends OVS to provide layer 2 and layer 3 network functions. It uses GENEVE encapsulation to create tunnels between OpenStack nodes.
  • The Modular Layer 2 (ML2) plug-in is a framework that enables the use of various layer 2 networking technologies. Administrators can interact with Open vSwitch, or with vendor technology such as Cisco equipment, using the various drivers available for OpenStack networking.
  • Self-service and provider networks can be used in combination or exclusively to define the network connection types available to instances. Provider networks give instances a layer 2 connection to networks outside of OpenStack, whereas self-service networks require an OpenStack router to connect them to an external network.
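
Creating a provider network of the kind described above typically looks like the following. This is a hedged sketch: the physical network name (datacentre), VLAN segment, and subnet range are assumptions for illustration and depend on the deployment's network configuration.

```shell
# Create a VLAN provider network mapped to a physical network
# (names and IDs are examples):
openstack network create --external \
    --provider-network-type vlan \
    --provider-physical-network datacentre \
    --provider-segment 101 provider-101

# Add a subnet for the external address range:
openstack subnet create --network provider-101 \
    --subnet-range 10.0.101.0/24 --gateway 10.0.101.1 \
    --no-dhcp provider-101-subnet
```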

Chapter 8: Managing Compute Node Operations

Goal: Describe the architecture and management processes for normal and hyperconverged compute nodes. Perform common operational tasks for compute nodes to support workload uptime and scaling.


Objectives:

  • Explain the launch process and discuss the compute scheduling and compute placement services.

  • Explain Red Hat Hyperconverged Infrastructure.

  • Discuss the migration process, migrate an instance, and enable and disable a compute node.

Describing the Instance Launch Process

Describing the Instance Launch Process

Scheduling process

Placement Service

Guided Exercise: Describing the Instance Launch Process

Introducing Red Hat Hyperconverged Infrastructure

Introduction to HCI Nodes

Hyperconverged Configuration

Host Aggregates

Host aggregates

Guided Exercise: Launching to a Hyperconverged Compute Node

Administering Compute Nodes

Introduction to Migration and Evacuation

Storage Migration

Troubleshooting

Guided Exercise: Administering Compute Nodes

Lab: Managing Compute Node Operations

Summary

  • The instance launch process.
  • The scheduler service receives resource requests from the conductor. The request is sent to the placement service, which queries for available compute nodes that can satisfy the resource requirements and returns the list to the scheduler. The scheduler uses filters to refine the list and creates a ranked list of compute nodes. The scheduler chooses the first node in the list, and the conductor attempts to build the instance there.
  • The placement service was introduced in Red Hat OpenStack Platform 13. Its RESTful API addresses resource-tracking problems such as shared storage, and generic resource pools correct inaccurate capacity and usage reporting.
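
The filter-then-rank scheduling described above can be sketched as a toy model. The node data, resource fields, and RAM-based weigher below are invented for illustration; they are not the real nova filters or weighers.

```python
# A toy sketch of filter-then-weigh compute scheduling: drop nodes that
# cannot host the requested flavor, then rank the survivors.
nodes = [
    {"name": "compute0", "free_vcpus": 8, "free_ram_mb": 16384},
    {"name": "compute1", "free_vcpus": 2, "free_ram_mb": 2048},
    {"name": "compute2", "free_vcpus": 4, "free_ram_mb": 8192},
]

def schedule(nodes, vcpus, ram_mb):
    """Filter out nodes that cannot satisfy the request, rank by free RAM."""
    candidates = [n for n in nodes
                  if n["free_vcpus"] >= vcpus and n["free_ram_mb"] >= ram_mb]
    return sorted(candidates, key=lambda n: n["free_ram_mb"], reverse=True)

ranked = schedule(nodes, vcpus=2, ram_mb=4096)
print([n["name"] for n in ranked])  # ['compute0', 'compute2']
```

The instance lands on the first node in the ranked list; a "no valid host" error corresponds to the candidate list coming back empty.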

Summary (continued)

  • Hyperconverged nodes combine compute and storage on the same node. Ceph is always the storage component on hyperconverged nodes. Hyperconverged nodes can be cheaper and more flexible than dedicated storage nodes.
  • Resource constraints are configured to ensure that Ceph and Compute services do not interfere with each other.
  • Migration is the process of moving a server instance from one compute node to another. Evacuation normally occurs when a compute node fails or is shut down; in this case, all instances on that compute node are moved to another node. Running instances can be migrated using either shared storage or block storage.

Chapter 9: Monitoring OpenStack Operations

Goal: Describe the monitoring framework and architecture to gather, store, and use operational metrics.


Objectives:

  • Describe the components of the Service Telemetry Framework architecture, as well as the workflow for platform monitoring.

Implementing the Service Telemetry Framework

Introduction to the Service Telemetry Framework

Service Telemetry Framework Architecture Overview

Monitoring Architecture

Analyzing and Visualizing Metrics

Grafana Quickview dashboard

Grafana Network Interfaces dashboard

Grafana CPU dashboard

Grafana Memory dashboard

Grafana File System dashboard

Grafana Disk dashboard

Grafana dashboards

Summary

  • The core components of the Service Telemetry Framework and its workflow.
  • How to verify client-side communication to the STF storage domain.
  • How to navigate the Grafana dashboard user interface.
  • How to use PromQL queries to generate custom dashboards in Grafana.
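
A PromQL query of the kind used in those custom dashboards might look like the following. This is an illustrative fragment; the metric name assumes collectd metrics exported into the STF Prometheus and may differ in a given deployment.

```promql
# Average per-host CPU usage rate over the last five minutes
# (metric name is an assumption for this sketch):
avg by (host) (rate(collectd_cpu_total[5m]))
```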

Chapter 10: Automating Cloud Applications

Goal: Implement and deploy complex and scalable applications using automation technologies.


Objectives:

  • Describe the architecture and implementation of orchestration in OpenStack cloud-scale computing.

  • Create orchestration templates for deploying single servers, multiple resources, and mass-scale applications as stacks.

  • Create Ansible Playbooks for deploying single servers, multiple resources, and applications.

Managing Cloud-scale Application Deployment

Application Automation at Scale

Orchestration Service Architecture

The Orchestration Service architecture

Quiz: Managing Cloud-scale Application Deployment

Writing Heat Orchestration Templates

Heat Orchestration Template Syntax

Instance Configuration Using Orchestration

Software deployment processing

Guided Exercise: Writing Heat Orchestration Templates

Deploying Applications Using Ansible

Ansible Automation

Introduction to OpenStack Modules

Running OpenStack Modules

Using Playbooks to Automate Operations

Guided Exercise: Deploying Applications Using Ansible

Lab: Automating Cloud Applications

Summary

  • The Orchestration Service (Heat) provides developers and system administrators a repeatable way to create and manage a collection of related OpenStack resources.
  • The Orchestration API Service forwards requests to the Orchestration engine service using remote procedure calls (RPCs) over AMQP.
  • The Orchestration engine service interprets the orchestration template and launches the stack.
  • When user data is changed and the orchestration stack is updated using the openstack stack update command, the instance is deleted and recreated using the updated user-data script.
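
A minimal Heat Orchestration Template of the kind summarized above might look like the following. The image, flavor, and network names are placeholders for this sketch.

```yaml
heat_template_version: 2016-10-14

description: Minimal illustrative stack; resource names are examples.

parameters:
  network:
    type: string
    default: finance-network1

resources:
  server:
    type: OS::Nova::Server
    properties:
      image: rhel8
      flavor: default
      networks:
        - network: { get_param: network }

outputs:
  server_ip:
    value: { get_attr: [server, first_address] }
```

A stack launched from this template is created with openstack stack create and later modified with openstack stack update, which triggers the delete-and-recreate behavior described above when user data changes.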

Summary (continued)

  • Ansible Playbooks provide a repeatable way to interact with OpenStack to manage its resources and create tasks to orchestrate your application needs.
  • The default values for Ansible modules can differ from the default values of OpenStack commands and can change the context or characteristics of the resource being manipulated.
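
An equivalent Ansible task using the OpenStack modules might be sketched as follows. The openstack.cloud collection namespace is the current module location, and the cloud and resource names are examples only.

```yaml
---
# Illustrative playbook; cloud, server, image, flavor, and network
# names are examples.
- name: Launch an instance
  hosts: localhost
  tasks:
    - name: Create the server
      openstack.cloud.server:
        cloud: overcloud
        name: finance-server1
        image: rhel8
        flavor: default
        network: finance-network1
        state: present
```

As the summary notes, module defaults (for example, whether a floating IP is assigned) can differ from the openstack CLI defaults, so check each module's parameters rather than assuming CLI behavior.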

Chapter 11: Troubleshooting OpenStack Operations

Goal: Describe and practice the tools and methods for diagnosing and troubleshooting issues encountered during deployment and management operations.


Objectives:

  • Use recommended diagnostics and troubleshooting tools and techniques.

  • Diagnose and troubleshoot the OpenStack networking, image, and volume services.

Diagnosing OpenStack Issues

Basic Troubleshooting Methodology Tools

Tracing Packets in OVN

Verifying OpenStack Service Status

Troubleshooting Tools for Messaging Broker

Guided Exercise: Diagnosing OpenStack Issues

Troubleshooting Common Core Issues

Troubleshooting OpenStack Networking Issues

Image Service Issues

Managing Volumes

Guided Exercise: Troubleshooting Common Core Issues

Lab: Troubleshooting OpenStack Operations

Summary

  • The openstack client command includes a --debug option, which allows you to see the request and response payloads. The output can also be logged to a file for later analysis using the --log-file option.
  • Network issues may require troubleshooting at the physical layer or at the SDN layer. Common commands for network troubleshooting include ip link show, tcpdump, and ovs-vsctl.
  • OVN implements all of its functions using OpenFlow rules. These rules can be difficult to navigate; the ovn-trace and ovs-appctl ofproto/trace commands allow administrators to trace the path for a packet with a given set of properties.
  • The Compute scheduler component selects compute nodes based on an algorithm. The no valid host error means that the Compute scheduler has not identified a compute node that can provide the resources required by the instance.

Summary (continued)

  • To access an instance using a floating IP address, both the external network associated with that floating IP address and the internal network to which the instance is connected must be connected using a router.
  • Images can be protected to prevent accidental deletion.
  • To troubleshoot issues when launching instances, verify that all the required user and project infrastructure, such as networks, routers, and key pairs, are in place first.
  • The OpenStack block storage service requires that the openstack user has read, write, and execute capabilities in both the volumes and the images pool in Ceph.
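
The diagnostic commands summarized above look like this in practice. These are illustrative invocations; the interface name and log file path are examples.

```shell
# Capture client request and response payloads for later analysis:
openstack server list --debug --log-file /tmp/debug.log

# Physical-layer checks on a node:
ip link show
sudo tcpdump -i eth0 -n icmp

# SDN-layer checks:
sudo ovs-vsctl show
```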

Chapter 12: Comprehensive Review

Comprehensive Review

Reviewing Red Hat OpenStack Administration II: Day 2 Operations for Cloud Operators

Lab: Building a Custom Image

Lab: Managing Project Organization

Lab: Launching an Application Stack

CL210-RHOSP16.1-en-2-20250128