Red Hat

Copyright © 2008 Red Hat. This material may only be distributed subject to the terms and conditions set forth in the Open Publication License, V1.0 or later with the restrictions noted below (the latest version of the OPL is presently available at http://www.opencontent.org/openpub/).

Distribution of the work or derivative of the work in any standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from the copyright holder.

Red Hat and the Red Hat "Shadow Man" logo are registered trademarks of Red Hat, Inc. in the United States and other countries.

All other trademarks referenced herein are the property of their respective owners.

The GPG fingerprint of the security@redhat.com key is:

CA 20 86 86 2B D6 9D FC 65 F6 EC C4 21 91 80 CD DB 42 A6 0E

1801 Varsity Drive
Raleigh, NC 27606-2072
USA
Phone: +1 919 754 3700
Phone: 888 733 4281
Fax: +1 919 754 3701
PO Box 13588
Research Triangle Park, NC 27709
USA

Abstract

Documentation for the Genome tooling


Preface
1. Document Conventions
2. We Need Feedback!
1. Genome Appliances
1.1. Appliances
1.1.1. Cloud Appliance
1.1.2. Genome Appliance
1.2. Custom Machine Types
2. Getting Started
3. Tooling
3.1. genome-replace-self
3.1.1. Usage
3.2. genome-bootstrap
3.2.1. Background
3.2.2. Features
3.2.3. Installation
3.2.4. Usage
3.2.5. Advanced Mode
3.2.6. Bootstrap Parameters
3.2.7. Post bootstrapping
3.3. genomed
3.3.1. Configuration
3.4. cloudmasterd
3.5. genome-sync
3.5.1. Usage
4. Open source technologies used with Genome
4.1. Koan
4.1.1. Background
4.1.2. Installation
4.1.3. Guest Provisioning
4.1.4. Watching the VM
4.1.5. Cleaning Up
4.1.6. Known Issues
4.2. LVM
4.3. Xen Virtualization
4.4. JBoss
4.5. Source Code Management (Git)
4.6. Configuration Management (Puppet)
4.7. General
5. Cookbook
5.1. Setting up an environment to host virtual machines
5.1.1. Using genome-replace-self
5.2. Creating your own Genome Repo Appliance
5.2.1. Installing the genome-repo RPM
5.2.2. Replicate cobbler data
5.2.3. Synchronize the git repositories
5.2.4. Define access controls
5.3. Cleaning up SSL certificates
5.4. Bootstrapping a machine that already has an OS
5.5. Adding a new machine type
5.6. Changing a machine's parameters
5.7. Change the puppet master of any given machine type
5.8. Git Recipes
6. Self Tests
6.1. LVM
6.2. Xen
6.3. Git
7. Debugging
7.1. Puppet
7.2. Puppetmaster
8. Contribute
8.1. Licensing
8.2. Design Axiom
8.3. Community
8.3.1. Please Be Friendly
8.3.2. Community Communication
8.4. Working With The Code
8.4.1. Checkout The Code
A. Revision History
A.1. Logging in to Genome machines
A.2. Managing releases with the Genome tooling
A.2.1. The "release" repository
A.2.2. Creating a superproject
A.2.3. A word on pushing superprojects
A.2.4. Branching strategy
A.2.5. What about the master branch?
Glossary

Preface

The genome is fundamental to the reliable encoding and transfer of both genetic code and data. As the project name suggests, Genome is the equivalent for software systems. The project started formally in early 2008, though its origins can be traced back several years, to real struggles within Red Hat IT in developing and deploying software.

While it may not be the perfect analogy, it is indeed fitting to say that hereditary information is stored within every IT organization. The truth is that software systems, like species, face extinction through poor replication of this information. Sadly, the knowledge required to maintain and reproduce complex systems often lives only in the form of tangled configuration scripts or, worse still, only in the minds of consulting domain experts. Transferring knowledge in such a manner is practically a recipe for building legacy systems.

Taking the biological analogy a little further, briefly imagine a world in which generations of genetic information had to be manually replicated by any number of people. Now try to imagine a different world in which genetic information could only be copied exactly, that is to say, diversity is altogether unattainable. Genome aims to solve both of these problems for IT: reproducing exceedingly complicated systems in a world where heterogeneity is more the rule than the exception.

As you begin tackling these problems for your organization it cannot be emphasized enough that the collaboration amongst teams enabled by Genome is more important than any particular tool implementation. Feel free to mutate Genome into any shape or form to solve your problems. The truth is, we readily await your patches and enjoy seeing the best ideas rise to the top.

1. Document Conventions

Certain words in this manual are represented in different fonts, styles, and weights. This highlighting indicates that the word is part of a specific category. The categories include the following:

Courier font

Courier font represents commands, file names and paths, and prompts.

When shown as below, it indicates computer output:

Desktop       about.html       logs      paulwesterberg.png
Mail          backupfiles      mail      reports

bold Courier font

Bold Courier font represents text that you are to type, such as: service jonas start

If you have to run a command as root, the root prompt (#) precedes the command:

# gconftool-2

italic Courier font

Italic Courier font represents a variable, such as an installation directory: install_dir/bin/

bold font

Bold font represents application programs and text found on a graphical interface.

When shown like this: OK, it indicates a button on a graphical application interface.

Additionally, the manual uses different strategies to draw your attention to pieces of information. In order of how critical the information is to you, these items are marked as follows:

Note

A note is typically information that you need to understand the behavior of the system.

Tip

A tip is typically an alternative way of performing a task.

Important

Important information is necessary, but possibly unexpected, such as a configuration change that will not persist after a reboot.

Caution

A caution indicates an act that would violate your support agreement, such as recompiling the kernel.

Warning

A warning indicates potential data loss, as may happen when tuning hardware for maximum performance.

2. We Need Feedback!

Send comments to genome-list@redhat.com. All bugs can be posted to the Genome Trac.

Chapter 1. Genome Appliances

1.1. Appliances

Appliances in the Genome environment are machines that enable the Genome tooling.

1.1.1. Cloud Appliance

The Cloud Appliance is used to host Xen guests in the Genome environment.

1.1.1.1. System Requirements

CPU
1GHz
Memory
This depends on the number of cloud members you plan on hosting. We recommend 1.5G of RAM to start.
System architecture
A cloud appliance can be installed on either i386 or x86_64 architectures.
Hardware Virtualization
When using Fedora as the distribution for the cloud, the machine must support hardware virtualization.
Storage
This depends on how many cloud members you plan on hosting. We recommend 100G of hard drive space to start.

Important

Cloud Appliances should only be installed on a physical machine, never a virtual machine.

1.1.1.2. Features

  • Virtualization (Either Xen or KVM)

  • Genome tooling for managing the cloud

1.1.2. Genome Appliance

The Genome Appliance is the center of development in the Genome environment. In a nutshell it is the self-contained provisioning, configuration and artifact store. For this reason Genome Appliances are generally not considered volatile.

Like all other machine types it is designed to work as both a "baremetal" and a virtual machine. The main resource requirement that distinguishes this machine type is disk space, which is a function of the amount of data imported into cobbler.

Note

There was a time when Genome Appliances could be created via genome-bootstrap. This led to several "chicken and the egg" sorts of problems, so the method for provisioning Genome Appliances was switched to RPM.

1.1.2.1. Minimum System Requirements

CPU
1GHz
Memory
512M RAM
System architecture
A Genome Appliance can be installed on either i386 or x86_64 architectures.
Storage
This depends on how many distros you plan on hosting in cobbler. We recommend 50G of hard drive space to start.

1.1.2.2. Features

  • Cobbler for all RPM/provisioning

  • A Puppetmaster for all configuration

  • Bare git repos for all content located under /pub/git

  • GitWeb running on http://[hostname]/git/gitweb.cgi

  • Genomed running http://[hostname]:8106/nodes.html

1.1.2.3. Genome Appliance cloning

The state of a particular Genome Appliance can be described by the content stored under /var/www/cobbler and /pub/git. Cloning a particular Genome Appliance is really just a matter of getting the correct bits from those locations onto a new Genome Appliance.

Aside from the simple bit replication that must be performed, there are also a few "one-off" things that need to happen. This involves:

  • Getting the puppet modules to the location where the puppetmaster can see them.

  • Setting up commit hooks for the puppet module git repositories.

  • Setting up the commit hook for the Genome documentation.

See the cookbook for more information.

1.1.2.4. Genome Appliance customization

The genome-repo RPM is designed to get users up and running with a known working configuration. There are certain custom settings users of Genome will need to configure for their environment. The two most common needs for customization are adding new Genome machine types to genomed and any extra cobbler customization.

How these customizations are managed is at the user's discretion. However, since the Repo machine is already controlled by puppet it makes sense in many cases to simply use it for this as well.

For this to work a puppet module named repo_extensions must be created and exist on the module path. The class that this module must define is also called repo_extensions.
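
A minimal skeleton for such a module might look like the following sketch. The class body is intentionally empty here since its contents are entirely site-specific, and the module is published like any other puppet module (the cookbook recipe on adding a new machine type shows the full genome-sync workflow):

# Create a minimal repo_extensions module skeleton
mkdir -p repo_extensions/manifests
cat > repo_extensions/manifests/init.pp <<'EOF'
class repo_extensions {
  # Site-specific customizations go here
}
EOF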

Important

This works because, by default, the Genome Appliance's puppet external nodes script includes two classes: genomerepo::appliance and repo_extensions.

1.2. Custom Machine Types

A custom machine type in the Genome environment can be roughly described as a collection of known working puppet classes matched with an operating system (or more precisely, a cobbler profile). The list of machines that can be provisioned from a given Genome Appliance can be found when using the genome-bootstrap wizard or the genomed UI.

Note

See the cookbook for more information on creating custom machine types.

Important

From Puppet's point of view these "types" are not bound to any particular OS version. You choose the OS with genome-bootstrap or when provisioning directly with Koan. This allows users to test out different OS and application versions using the same Puppet code.

Chapter 2. Getting Started

For those who wish to get up and running quickly with Genome you can simply use the cookbook. That being said, a typical Genome environment consists of a Genome Appliance, one or more Cloud Appliances, and the machines provisioned from them.

Chapter 3. Tooling

One of the goals of Genome is not to invent new tools but rather to leverage and contribute to existing Open Source projects. This section presents the user with links to many technologies that can be considered prerequisites for contributing to Genome.

3.1. genome-replace-self

To avoid many "chicken and the egg" sorts of provisioning problems the Genome tooling provides an RPM and script called genome-replace-self. As the name suggests, this tool is a quick way to completely replace a machine. The term replace-self is borrowed from koan, and under the covers that is basically all that is happening. The script does include some helpful logic to properly install koan on whatever Red Hat based system is currently running on the machine in question.

Important

Machines set up via genome-replace-self are not always controlled by puppet. They tend to be treated more like appliances.

3.1.1. Usage

To use this tool the user must know the profile that will be used to replace-self. This can be obtained easily with koan.

 
genome-replace-self --help

# Select a profile from the list this command returns
koan -s [Your Genome Repo machine] --list=profiles 

# Only certain types of machines require the -m (metadata) flag
genome-replace-self -c [Your Genome Repo machine] -p [Profile selected in previous step]

Note

Ideally, which profile to select should be obvious from the names. A good practice is to include both the architecture and operating system in profile names.

3.2. genome-bootstrap

3.2.1. Background

With the introduction of virtualization, we are able to easily rebuild entire environments quickly; however there is a fair amount of complexity involved in doing so. We've created a tool called genome-bootstrap that automates the process of wiring a machine up to puppet.

There are essentially two ways of using genome-bootstrap. One way, which does not require root privileges, is to run with the --config-only option. This configures everything on the Genome Repo to prepare for an installation. Obviously this can also be used to update a machine's configuration.

The second way to use the tool requires root privileges since it kicks off a Koan process.

Note

The --config-only option is especially useful for bootstrapping Host machines. Quite often a machine that will be turned into a Host does not even have genome-bootstrap installed. It's easy to run genome-bootstrap with --config-only from another machine and then proceed with Koan on the soon-to-be Host.

3.2.2. Features

  1. Creates DDNS entries for your machines during the bootstrap process

  2. Creates a Cobbler system record for provisioning

  3. Guides the user through the process of setting parameters to be used by Puppet for configuration

  4. Optionally starts the Koan process

Important

As of version 0.4.0 we are no longer tied to using Red Hat's internal DDNS solution. This is great for users with stricter networking requirements.

3.2.3. Installation

The required RPMs should already be installed on any machine already bootstrapped in the Genome environment. This is useful for the --config-only mode. If you want to install Genome machines in a virtualized environment you will need to set up a Cloud Appliance. That machine comes with genome-bootstrap already installed.

If you are installing genome-bootstrap on a separate machine, like a laptop, you can easily add the Genome yum repositories and install it there. Run the following commands to create the Genome yum repository file:

# Switch to root
su -

echo """
[genome-noarch]
name=Genome (noarch)
baseurl=http://brenton.fedorapeople.org/genome/yum/Fedora-9-genome-noarch
enabled=1
gpgcheck=0

[genome-i386]
name=Genome (i386)
baseurl=http://brenton.fedorapeople.org/genome/yum/Fedora-9-genome-i386
enabled=1
gpgcheck=0
""" > /etc/yum.repos.d/genome.repo

# Install genome-bootstrap
yum install rubygem-genome-bootstrap

Should you need to install genome-bootstrap manually, yum can be configured to point to the Cobbler server running on your Genome Repo. Once yum is properly set up you can run:

# yum install rubygem-genome-bootstrap

3.2.4. Usage

genome-bootstrap does not need any parameters. Simply run the program and you will be guided through the bootstrap process.

Note

Originally genome-bootstrap had all sorts of complicated command line flags. While this was nice for scripting it was not a very pleasant UI. Switching to a wizard was the simplest way to keep the documentation up to speed with the tool. For scripting, be sure to check out the Advanced Mode for genome-bootstrap.

3.2.5. Advanced Mode

Important

Most casual users of the genome-bootstrap tool will not require this feature and can simply skip this section.

The advanced mode is most useful for scripting. It also allows for more complicated use cases which operate outside of a normal Red Hat internal network. The only required inputs are the --fqdn and --repo flags. The yaml configuration can either be specified as another flag or piped to stdin. If a --virt-path is provided then a Koan process will be started. See the --help for more information.

# genome-bootstrap advanced --help

Note

The yaml fed to genome-bootstrap must be in the same format that Puppet expects for its external nodes. You must know exactly which parameters are required for a given Genome machine. The nice thing is that this yaml can be obtained from Genomed.
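
As a rough sketch, an external nodes definition lists classes and parameters, and something along these lines could be piped to the tool. The hostnames, class name, and parameter name here are purely illustrative; the flag spellings are those described above:

# Pipe an external nodes style yaml definition to genome-bootstrap
genome-bootstrap advanced --fqdn=myhost.example.com --repo=genome-repo.example.com <<'EOF'
classes:
  - dev_workstation
parameters:
  cobbler_server: genome-repo.example.com
EOF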

3.2.6. Bootstrap Parameters

When using the genome-bootstrap wizard the user will be asked a series of questions. The answers are used as parameters for configuring Genome machines. These parameters live server side and are accessible via Genomed.

The nice thing about these parameters is that they can be used anywhere a normal variable can be used in Puppet manifests and templates. Some values are client specific such as:

  • What cobbler server a node looks to for dependencies

  • What machines a particular apache will proxy

  • What server should be considered its upstream git repo

Often you will want to change a parameter post bootstrap. This is usually because you want to point your machine at a different Puppetmaster or Cobbler server. For help on this process see the cookbook.

3.2.7. Post bootstrapping

After successfully running genome-bootstrap a Koan process will be started (that is, unless you specify --config-only). You can watch the installation the same way you watch any Xen guest. See the Xen documentation for more details.

Once it's finished, start the guest back up to allow the /usr/sbin/genome-firstboot script to run. This behaves exactly like the normal firstboot script on a Red Hat based system.

Note

See the appendix for information about logging in to koan'ed machines.

Important

It's also possible to bootstrap a machine that already has an OS. See the cookbook to see how this is done.

Important

Consistent machine provisioning in the Genome environment is vital for efficient team collaboration. There should be no post provisioning instructions needed to bring up a working machine. The process for creating a JBoss ESB machine should be the same as for a node on the Selenium grid.

3.3. genomed

The genomed service is a simple web app that serves as the canonical source of Genome machine information used in compiling puppet configurations. It's really quite simple, so it's probably best explained by simply showing the link: http://[your repo hostname]/genome/nodes.html. From there it should be simple to find the other features by exploration.

It's also worth noting that most of the resources (this is a RESTful service) have several representations. Try changing the URLs to end with xml or yaml.
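
For instance, a node's definition can be fetched as yaml with nothing more than curl. The hostnames here are hypothetical, and the path follows the html form of this URL shown in the cookbook recipe on changing a machine's parameters:

# Fetch a node's definition as yaml instead of html
curl http://genome-repo.example.com/genome/nodes/myhost.example.com.yaml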

3.3.1. Configuration

genomed has two configuration files. One is located at /etc/genomed/config.yml and the other at /etc/genome/machine_types.rb. The latter of the two is the only one that deserves special attention. This file is a Ruby-based Domain Specific Language that gets executed by the ruby interpreter; see the cookbook recipe on adding a new machine type for an example. A documented sample configuration ships with the genome-repo RPM, which should be sufficient to get up and running quickly. If changes are made to this file genomed must be restarted.

3.4. cloudmasterd

The cloudmasterd is a RESTful web service running on a cloud master that provides cloud computing capabilities across one or more cloud members.

The cloudmasterd service also provides a simple status page indicating the current state of the cloud members.

3.5. genome-sync

The goal of genome-sync is to make the process of synchronizing git repositories from one Genome Repo to another as easy as possible. The main mode, start, guides the user through the process.

In the start mode, work is performed in a working directory. The app then iterates over each repository, asking the user what work to perform. After this process has completed the user can publish the changes with the save mode.

3.5.1. Usage

The following one-liners are in no particular order.

# Start the synchronization wizard
genome-sync start --repo=[remote Repo machine]

# Hard reset to a given repository's state (This is the fastest way to
# get up and running with a newly created repo machine).
genome-sync start quick --repo=[remote Repo machine]

# Push content where it needs to go.  If puppet modules are updated the
# puppetmaster may need to be bounced.
genome-sync save

# Remove the working directory  
genome-sync clean

Important

genome-sync must be run as the genome user.

Note

All genome-sync modes take the --help, --verbose and --workingdir flags.

Chapter 4. Open source technologies used with Genome

4.1. Koan

4.1.1. Background

Koan is a tool coming out of Red Hat Emerging Technologies that is used to provision machines from Cobbler servers. Following the Unix philosophy, it's very simple to use and the man page will tell you everything you need to know. For more information check out the cobbler documentation.

Note

Most provisioning with the Genome tools can be done without having to work with Koan directly. However, a good understanding of its basic operation is useful for advanced usage of the Genome tooling.

4.1.2. Installation

RPMs exist for both Fedora and RHEL (through EPEL). If your repositories are configured correctly you should simply be able to yum install koan. Koan doesn't have many dependencies so if you don't feel like adding the EPEL repo to your RHEL machine you can simply install the RPM.

Once installed you should test your installation against a cobbler server.

koan -s genome-repo.usersys.redhat.com --list=profiles
koan -s genome-repo.usersys.redhat.com --list=systems

4.1.3. Guest Provisioning

Note

genome-bootstrap now wraps Koan for provisioning virtual machines. This is only included for advanced use cases.

 
koan -s genome-repo.usersys.redhat.com --virt --virt-type=xenpv --virt-path=HostVolGroup00 --system=[your hostname]

Here the most important part is obviously the --virt flag. If you pass in a Volume Group name for --virt-path, koan will automatically create (or reuse) a logical volume named in the format [name]-disk0. With cobbler much of the configuration lives on the server side (the memory, size of the logical volume, etc.). If you have different requirements you can either create a new cobbler profile or use the tooling that makes up Genome to achieve the desired results.

Tip

One trick for creating a guest with a larger logical volume than a cobbler profile specifies is to simply create the volume by hand with the size you desire. Koan will simply reuse that logical volume.
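
A sketch of that trick, assuming the guest will be named myguest and with a size chosen purely for illustration:

# Pre-create the backing volume with the size you want; koan will reuse it
lvcreate -L 40G -n myguest-disk0 HostVolGroup00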

4.1.4. Watching the VM

During the kickstart provisioning process you can connect to the virtual framebuffer, which is accessible through VNC. It's only available locally, so don't try to connect from another machine. From the Xen host you should be able to use:

 
ssh -X root@YourXenHost.usersys.redhat.com
vncviewer localhost:5900

The port may vary according to how many guests you have running. To find out which ports are being used:

 
# If you are using RHEL5 earlier than Update 2
netstat -nlp | grep vnc

# Otherwise
netstat -nlp | grep qemu-dm

4.1.5. Cleaning Up

If you would like to remove work performed by koan (a command sketch follows this list):

  • Remove the Xen configuration for the guest under /etc/xen

  • Remove the file or logical volume that backs your guest.
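
For example, for a guest named myguest backed by a logical volume (all names here are hypothetical):

# Make sure the guest is shut down, then remove its definition and storage
xm destroy myguest
rm /etc/xen/myguest
lvremove /dev/HostVolGroup00/myguest-disk0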

4.1.6. Known Issues

Provisioning will fail if a config file under /etc/xen has the same name as the machine you are trying to create. The error message is fairly cryptic, saying something like "machine already exists". The fix is to simply remove the config file.

4.2. LVM

LVM is used to back our virtualized guests. It is an extremely flexible and pervasive storage technology for Linux. One of the most useful features is the ability to create copy-on-write snapshots.
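
For example, a copy-on-write snapshot of a guest's backing volume can be created like this (names and size are illustrative; the snapshot only needs enough space to hold blocks that change):

# Create a copy-on-write snapshot of an existing logical volume
lvcreate -s -L 5G -n myguest-snap /dev/HostVolGroup00/myguest-disk0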

4.3. Xen Virtualization

Virtualization is a key component of the new architecture. Managing development, build and deployment environments on a variety of hardware and operating systems has always been extremely costly. We have a fairly low tolerance for inconsistencies in both environments, yet customization is critical to most developers and avoided at all costs in production. Virtualization is the technology that gives Genome isolation in both worlds. The development, build, and deployment environments can all be isolated and managed on virtual machines to enable different configurations and optimizations while still residing on the same machine. It also gives us the flexibility to change our virtualization option as time progresses (e.g. from Xen to KVM) while keeping the core strategy consistent for the foreseeable future.

4.4. JBoss

JBoss is going to be a cornerstone of our new infrastructure. We will be using a slightly newer version of the JBoss EAP stack with components from the JBoss SOA team to incorporate the JBoss ESB.

  • JBoss Getting Started Guide

    Make sure you are comfortable with starting and stopping JBoss as well as the server configurations, deployment mechanisms, jmx console, and general filesystem layout to find logs.

  • JBoss ESB Documentation

    These documents aren't incredibly thorough at the moment, but it should give you a good initial understanding of the JBoss ESB technologies.

  • JBoss Seam Documentation

    The reference documentation for Seam 2.0 CR3 will probably be the best document to read through.

4.5. Source Code Management (Git)

In addition to refining our infrastructure, we have also needed to refine our development and deployment practices for quite a while. We need to be able to run multiple development efforts in parallel, collaborate between them, and maintain a sane state of a deployable branch (e.g. trunk). Subversion has worked in some regards but has fallen short in our ability to utilize it for multiple development streams. Complicated merges end up very error prone and have almost always resulted in production defects. Also, given the errors around branching and merging, it has been very difficult to get the development community to maintain a clean revision history and state of our production branch. Git's distributed nature will allow development to proceed in an offline fashion and result in a small number of clean patches being applied to our production branches. So, in essence, be warned: stream of consciousness coding and commits will no longer be accepted.

4.6. Configuration Management (Puppet)

Puppet is a configuration management technology that will help us eliminate many of the manual steps required during releases. Today, the configuration and release process is extremely manual and becoming increasingly difficult to scale. Moving the configuration management aspects down to development allows developers to drive more automation into the release process by providing container and system configurations using a mechanism that can be deployed without modification into production. This also allows groups like Release Engineering to operate in more of a review role and reduces the manual steps they must perform to deploy projects.

  • Puppet Documentation

Since your virtual environment will be running a puppet master to configure all of your virtual machines, make sure you have an understanding of what the puppet master does as well as the templating process used to generate files. This knowledge will be key in enabling developers to make system configuration changes and to test and submit patches instead of making manual requests for various changes to be applied.

4.7. General

Two places that you should always look for documentation are:

Chapter 5. Cookbook

This chapter contains many useful Genome recipes. While a solid understanding of the toolset is always preferred this chapter is meant to be the user's reference manual.

5.1. Setting up an environment to host virtual machines

Virtualization is by no means a requirement to make use of the Genome tooling, though it is more common than "bare metal" provisioning.

The machine used to host virtual machines is called the Cloud Appliance. As the name suggests, there can be one to many physical machines. The first goal of this machine type is to provide an environment to host virtual machines, and for that reason Cloud Appliances are always provisioned on "bare metal". The second is to provide an effective way to manage resources amongst underutilized commodity hardware.

Since virtualization plays such a key role in the Genome environment, these machines are amongst the first that users of the Genome tooling want to get up and running.

Note

One of the main goals of the current Cloud machine tooling is to Do the simplest thing that could possibly work. This functionality, implemented through Func modules, will most likely be entirely replaced with ovirt.

5.1.1. Using genome-replace-self

The first step is to install the genome-replace-self RPM. You can do this by running one of the following commands:

rpm -Uvh --force http://brenton.fedorapeople.org/genome/yum/Fedora-9-genome-noarch/genome-replace-self-1.0.0-2.fc9.noarch.rpm

If you want to install the RPM from your local Genome server, the format would be:

rpm -Uvh --force http://$GENOME_MACHINE/cobbler/repo_mirror/Fedora-9-genome-noarch/genome-replace-self-1.0.0-2.fc9.noarch.rpm

Cloud Appliances use genome-replace-self to get up and running quickly. The key to using genome-replace-self to provision a Cloud Appliance, as shown in the example after this list, is to:

  • Use a Cloud profile.

  • Set the -m (metadata) flag appropriately.

    The value for this flag varies depending on whether the machine is the certmaster or a minion in func parlance.

    For the master set -m to certmaster=localhost

    For the minion set -m to certmaster=[Any previously created Cloud master]
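
Putting that together, the commands look roughly like this (the hostnames are placeholders, and the profile name follows the bracket convention used throughout this guide):

# Provision the cloud master
genome-replace-self -c genome-repo.example.com -p [Cloud profile] -m certmaster=localhost

# Provision a cloud member, pointing it at the master
genome-replace-self -c genome-repo.example.com -p [Cloud profile] -m certmaster=cloudmaster.example.com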

5.2. Creating your own Genome Repo Appliance

The Repo Appliance provisioning story is fairly straightforward:

  • Install the genome-repo RPM (either via kickstart or manually)

  • Replicate cobbler to pull down the bits and kickstarts

  • Synchronize the git repositories from another Repo Appliance.

  • Define access controls

5.2.1. Installing the genome-repo RPM

There are several options on how to go about this step. If another Repo Appliance is already available, genome-replace-self or koan based options are generally preferred since they require fewer manual steps.

5.2.1.1. Creating a virtual Repo Appliance with koan

# See what profiles are available
koan -s [remote Repo Appliance] --list=profiles

# If choosing a virtual machine
koan -s [remote Repo Appliance] --virt --virt-name=[Any name]  --profile=[Profile from first step] --virt-path=[Usually a Volume Group or path to a file]

5.2.1.2. Creating a baremetal Repo Appliance with genome-replace-self

The only step needed for this is to use genome-replace-self and pass in a Repo profile.

5.2.1.3. Creating a Repo Appliance out of a machine that already has an Operating System

If installing this on a machine that already has an operating system (OS) installed, simply run yum install genome-repo.

In case the genome-repo RPM is not available in the stock yum repositories for your OS you can grab the RPMs from any other Genome Repo Appliance at http://[hostname]/cobbler/repo_mirror/[distro]-genome-noarch.

Whenever installing the genome-repo RPM manually there is one other step needed to properly configure a Repo Appliance:

/sbin/service genome-repo-bootstrap start

Note

This step is designed to work "offline" since all the external dependencies are pulled down via the genome-repo RPM. That being said, there are external factors that can cause it to fail. If you suspect a Repo Appliance is not configured correctly it is always safe to run this command multiple times, as puppet will only perform outstanding tasks.

5.2.2. Replicate cobbler data

The simplest way to do this is to use cobbler's replicate functionality.

cobbler replicate --master=[remote Genome Repo machine] --full-data-sync

For more advanced usage see the cobbler manpage.

Important

Currently the use of cobbler replicate requires the user to know the root password on the master cobbler server (the remote Repo machine). This is less than desirable but for now the default password is password.

Note

Technically it is also possible to configure cobbler by hand. This obviously requires a more in depth understanding of cobbler and is outside the scope of this document.

5.2.3. Synchronize the git repositories

Since this is a newly provisioned Repo Appliance the fastest way to get up and running is to run:

# Switch to the genome user
su - genome

# "reset --hard" to all the git repositories on a remote Repo
genome-sync start quick --repo=[remote Repo Appliance]

# Push all the repos where they need to go on the new Repo Appliance
genome-sync save

# Restart the puppetmaster (see below) and re-run puppet
su -
service puppetmaster restart
puppetd --test

Important

If any new puppet modules are saved the puppetmaster will have to be restarted for them to be available.

5.2.4. Define access controls

All machines in the Genome environment have a default root password set as detailed in the appendix. By default Repo Appliances have a local user named genome who owns all content under /pub/git. Out of the box there is no password set and users of Repo Appliances should feel free to use whatever method of authentication they choose.

Typical methods of authentication are:

  • Setting a password

  • Using SSH public key authentication

Note

There is a good chance Genome will make use of FreeIPA someday.

5.3. Cleaning up SSL certificates

Due to the volatile nature of Genome machines there occasionally comes a need to clean up SSL certificates. To clean up all Puppet certs you can simply stop the puppetmaster (in the case of a Repo Appliance) and puppet services and then remove /var/lib/puppet/ssl. When you start the services back up the certificates will be created anew.
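
A sketch of that procedure on a puppet client (on a Repo Appliance, stop and start the puppetmaster service as well):

# Stop puppet, remove the certificates, and start again
service puppet stop
rm -rf /var/lib/puppet/ssl
service puppet start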

Sometimes the Puppetmaster will have a cert that corresponds to a machine previously provisioned with the same hostname. Our bootstrap process cleans this up automatically but it's not hard to get into a state where it needs to be cleaned manually on the Puppetmaster side. Luckily this is easy to do; the error from Puppet even hints at how. Log in to your Repo Appliance as the local user (usually genome) and run sudo /usr/sbin/puppetca --clean [your hostname].

Important

sudo access to puppetca has been given to the local Genome user.

5.4. Bootstrapping a machine that already has an OS

Important

This particular recipe should be considered advanced. There are plenty of ways to make a custom system incompatible with the Genome tooling. Be sure you are comfortable with how /usr/sbin/genome-firstboot works.

In some cases you might not want to reinstall the OS on a system in the Genome environment. This happens mostly for laptops where a developer simply wants a build environment so they can work "off the grid".

For this recipe the user simply needs to pass the --config-only option to genome-bootstrap to create the needed server-side configuration. Once that is done it's just a matter of installing the genome-firstboot RPM and editing /etc/sysconfig/genome-firstboot. Here's an example config file:

RUN_BOOTSTRAP=YES
export GENOME_REPO=genome-staging-repo.usersys.redhat.com
export FQDN=arch-repo.usersys.redhat.com

Note

It's always best to grab the RPMs from the Genome Repo you are going to use. That way you can be certain you are using a compatible version.

5.5. Adding a new machine type

There are several tools and config files that need to know what types of machines are available for provisioning on a particular Genome Appliance. To simplify this process Genome includes a DSL (Domain Specific Language) for describing the available machines.

Let's start with an example of creating a new developer workstation machine type. It's important to get all the wiring set up to make puppet and your machine aware of the new module before adding too much functionality.

The first step is usually to create a stubbed out module on the Genome Appliance and use genome-sync to set up the right information. To do this, run these steps (on the Genome Appliance):

# Switch to the genome user
su - genome

# Create a working directory (the 'puppet' directory is important)
mkdir -p /tmp/working/puppet/dev_workstation/manifests
cd /tmp/working/puppet/dev_workstation

# Create a simple puppet configuration
# The name of the class must match the directory name
echo """
class dev_workstation {
  package { 'vim-enhanced':
    ensure => latest;
  }
}
""" > manifests/init.pp

# Make the directory a git repository
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
git init
git add .
git commit -m "Priming the dev_workstation module"

# Run genome-sync to publish this module
# This sets up the 'public' git repositories
# and the puppet working directory
genome-sync save --workingdir=/tmp/working

# This is probably a bug in puppet, but you
# need to restart the puppetmaster when you introduce
# a new module
su -
service puppetmaster restart
exit

At this point, you should see the git repositories listed on your Genome Appliance at http://$GENOME/git/gitweb.cgi

The next step is to update the Genome DSL file to make it aware of a new 'machine' that contains this new puppet class. Machines can consist of multiple puppet classes, but in this case, we are starting simple. To do this, run these steps (on the Genome Appliance):

# Make sure the permissions are okay on the machine_types.rb file
su -
chown genome:genome /etc/genome/machine_types.rb
chown -R genome:genome /etc/puppet/modules/main
exit

# Switch to the genome user to update the machine_types.rb file
su - genome

# Clone the repo_extensions repository
git clone /pub/git/puppet/repo_extensions /tmp/repo_extensions
cd /tmp/repo_extensions

# Add the new machine consisting of your new puppet module
echo """
newmachine('developer-workstation') do
   set_classes 'dev_workstation'
end
""" >> files/machine_types.rb

# Commit and push the change
git add .
git commit -m "Adding the new developer-workstation machine"
git push

After you have pushed your changes, they will be applied the next time puppet runs on the repo, usually within 5 to 10 minutes. However, for the sake of this example, we will manually kick off a puppet run:

# Force a puppet run
su -
puppetd --test

At this point, you should see your new machine type listed on your Genome Appliance at http://$GENOME/genome/machine_types.html

genome-sync is a great way to handle moving custom machine types from one Genome Appliance to another.

5.6. Changing a machine's parameters

It's convenient to be able to change bootstrapped parameters. One typical use case is if you would like to point your machine to a different puppet master.

To do this, simply access the genomed running on the Genome Repo that your machine is subscribed to and update the yaml file. Once the yaml has been updated you will probably want to manually run Puppet (puppetd --test) to update the machine's configuration.

5.7. Change the puppet master of any given machine type

  1. Browse to http://[repo]/genome/nodes/[hostname].html and update the yaml file.

  2. Edit /etc/puppet/puppet.conf and change any instances of the current puppet master hostname to the hostname of the new puppet master

  3. Make sure there exists a machine configuration for this host on the other repo machine. If not, simply create a new one using genomed or genome-bootstrap --config-only.

  4. Remove the /var/lib/puppet/ssl directory; it will be recreated with a valid set of certificates and keys (steps 2 through 5 are sketched as commands after this list)

  5. Run puppetd --test to make sure everything worked. If all goes well, start the puppet client and you're done.
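
As a command sketch of steps 2 through 5 (the hostnames are placeholders):

# Step 2: point puppet at the new puppet master
sed -i 's/old-master.example.com/new-master.example.com/g' /etc/puppet/puppet.conf

# Step 4: remove the old certificates; they are recreated on the next run
rm -rf /var/lib/puppet/ssl

# Step 5: verify, then start the puppet client back up
puppetd --test
service puppet start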

5.8. Git Recipes

This recipe involves a situation where you have 4 commits, with 1, 2, and 4 being related and commit 3 sitting in the middle. Ideally, before you submit this work, you would like to combine 1, 2, and 4 into a single commit and have commit 3 as a second, isolated commit.

Run the following to prime your repository with this scenario:

cd ~
mkdir git-test
cd git-test
git init
touch a.txt
git add a.txt
git commit -m "Adding one file"
touch b.txt
git add b.txt
git commit -m "Adding another file"
touch different.txt
git add different.txt
git commit -m "Committing something entirely different"
touch c.txt
git add c.txt
git commit -m "Adding yet another file"

Correcting the problem:

  1. Look at the log history and get the id of the initial commit revision: git log

  2. Start a working branch and check it out to serve as the cleanup branch, starting at the initial commit: git checkout -b cleanup [REVISION]

  3. Confirm you are now working on the new branch: git status

  4. Confirm the log history only contains the initial commit: git log

  5. Now we want to add the changes from two more commits and re-commit. First, you need to get the commit revisions from the master branch: git log master

  6. Now, you need to cherry pick commits 2 and 4 without committing:

    git cherry-pick -n [REVISION 2]
    git cherry-pick -n [REVISION 4]
    
    

  7. Now you need to amend the current commit, revising the comment to include what was done in revisions 2 and 4: git commit --amend

  8. At this point, you have combined 1, 2, and 4 into a single commit. The last step is to cherry pick and commit revision 3: git cherry-pick [REVISION 3]

  9. Now you can push the changes from this branch to the desired public repository.

Chapter 6. Self Tests

To walk away with a deeper understanding than just honed copy and paste skills when using the Cookbook, you need some knowledge of the underlying technologies. These self tests will assess your ability to use the Genome tooling.

6.1. LVM

  1. How can you find out how many free extents are in your available volume groups?

  2. What command will tell you how much free space is left on your logical volume snapshot?

  3. What happens if your snapshot becomes full?

  4. How can you determine the origin logical volume of a snapshot?

  5. If your root partition is a single logical volume that occupies all extents of your only volume group, how can you free up space for creating other logical volumes / volume groups?

  6. If you are using LVM to back a Xen guest why does simply growing the logical volume not give your guest more disk space?

  7. If your volume groups or logical volumes are not showing up at their appropriate device mount points under /dev, what command(s) can you run to create the necessary device nodes? (Useful for working with LVM in rescue mode)

6.2. Xen

  1. How can you make your Xen guests start at boot time?

  2. Explain the relationship between the Dom0 (or Domain-0) and the DomU.

  3. From a user's point of view, what are the main differences between using para-virtualization and hardware assisted virtualization?

  4. What is the name of the library that both xm and virsh use?

  5. What service must be running for this library to make hypercalls? (When you figure it out, temporarily shut it off and try running virt-install)

  6. Where does virt-install create its guest configuration files?

  7. How can you completely delete a guest from the command line? (Say, if you created it originally with virt-manager).

  8. When you are running Xen's default bridged network what is the default name for your real ethernet device?

  9. What will the effect be on your guests when running a Xen host without network connectivity? Why?

  10. Give a high-level explanation of the difference between a bridged and a routed network.

  11. If you wanted to use NAT instead of the default bridged network setup what config file would you edit?

6.3. Git

  1. Name a command that is not safe to run while other people are using your repo.

  2. Say you just made a bad commit on your private branch, how can you fix it?

  3. What does git pull do under the covers? How is that different from git fetch?

  4. What is a bare repo? How can you convert a working repo into a bare one?

  5. How many bytes does it take to create a new branch?

  6. What do the commit SHA1 sums represent?

  7. What is unique about cloning a repo to a location on the same filesystem?

  8. What is the danger in using git rebase on a public branch?

  9. How can you erase all traces of a bad commit on your private branch?

  10. How can you checkout the state of your current branch 6 commits ago?

Chapter 7. Debugging

When things go wrong with Genome look here first.

7.1. Puppet

Sometimes things go wrong when puppet configurations are applied. Most of these failures are due to timing issues that a particular manifest relies on, and they are most often encountered during bootstrapping. Usually this is an indication that the manifest needs to be fixed (though there are cases that can't be worked around easily). If a configuration seems to have been applied half-way to your machine you can always force the configuration to run and watch the logging.

  • When debugging it's helpful to stop the long running Puppet service so that changes will be made to your system only when you trigger them explicitly.

    # service puppet stop
    
    

    Running Puppet manually:

    # puppetd --test
    
    

    Running Puppet manually with full debug info:

    Note

    You must stop this command with ^c

    # puppetd --debug --trace --no-daemonize
    
    

7.2. Puppetmaster

  • Stop the service

    # service puppetmaster stop
    
    

    Running Puppetmaster manually:

    # puppetmasterd --debug
    
    

    Note

    When you are done be sure to start the puppetmaster service back up.

Chapter 8. Contribute

We're excited that Genome has become a community project! There are a few things to know regarding Genome community participation.

8.1. Licensing

All Genome source and pre-built binaries are provided under the GNU General Public License, version 2.

8.2. Design Axiom

The Genome framework tries to delegate as much functionality as it can to tools that were designed for a particular function. Accordingly, any code contributed to glue tools together should be as minimal as possible to get the job done.

8.3. Community

Now that you're ready to be an active community member, here are a few directions to get you started.

8.3.1. Please Be Friendly

We strongly encourage everyone participating in the Genome community to be friendly and courteous toward other community members. Of course, being courteous is not the same as failing to constructively disagree with each other, but it does mean that we should be respectful of each other when enumerating the 42 technical reasons that a particular proposal may not be the best choice. There's never a reason to be antagonistic or dismissive toward anyone who is sincerely trying to contribute to a discussion.

8.3.2. Community Communication

The best way to participate in the community is to use the mailing list and/or the IRC channel. The mailing list is genome-list@redhat.com and the IRC channel is #genome on irc.freenode.net.

8.4. Working With The Code

If you're not familiar with the Git source code management tool, do yourself a favor and take time to get over the learning curve. It's bliss once you 'get it'.

8.4.1. Checkout The Code

Developer Checkout URI:

                    ssh://git.fedorahosted.org/git/genome

Anonymous Checkout URI:

                    git://git.fedorahosted.org/git/genome

                    or

                    http://git.fedorahosted.org/git/genome

The Genome project code is separated into several Git repositories. The code repositories are granular so that the repositories are small and easy to work with. We have separated core tooling, core documentation, puppet configuration manifests, third party tool extensions, application code, and the website into their respective Git repositories. When you clone the Git repository from fedorahosted.org/git/genome, you are actually cloning a supermodule which references all the git repositories hosted on gitorious.org. If you do want to get all the Genome code at once, you can use the fedorahosted.org/git/genome URL.

                    # Clone the Genome supermodule
                    git clone git://git.fedorahosted.org/git/genome

                    # Move into the cloned supermodule
                    cd genome
                    
                    # Then initialize the submodules
                    git submodule init

                    # Then do the actual cloning of the remote submodules, if you already have them checked out, this will update the submodules locally
                    git submodule update

If you want to work with a specific Git repository, you can review the gitorious genome project and then use the clone URLs listed for each Git repository under the project. For example, to clone the Genome tools repository you would go to http://gitorious.org/projects/genome/repos/tools and then choose a clone URL.

                    # Clone the tools git repository
                    git clone git://gitorious.org/genome/tools

Appendix A. Revision History

Revision History
Revision 1.0    Red Hat IT

Ported documentation to publican

A.1. Logging in to Genome machines

The only interesting thing about logging into Genome machines is the root password. It is currently set in the kickstart file in our Cobbler profiles. That means if you do any provisioning with Koan in the Genome environment your root password will be password. Users can change the password to anything they like once logged in.

A.2. Managing releases with the Genome tooling

One of the challenges of working on large teams is simply keeping track of all the various forms of content that make up a project. While teams have traditionally used some sort of Source Code Management tool such as Subversion or Git, the same discipline also applies to configuration artifacts and binary dependencies.

For this reason, projects making use of the Genome tooling have the ability to track all content via git repositories. Detailed below is a process that handles bundling the state of several puppet modules, RPM dependencies and source code into one deliverable that can be tracked throughout the project lifecycle.

A.2.1. The "release" repository

The release git repository is basically just a superproject which can contain any number of submodules. This allows project dependencies to be woven together as needed.

A.2.2. Creating a superproject

A superproject is really just a normal git repository for tracking the states of other repositories.

# Create a new git repository
mkdir release
cd release
git init

Once the repository has been created submodules can be added.

# Add the submodule
git submodule add [url] [path/in/superproject/to/be/created]


At this point a new file will have been added called .gitmodules. This is the submodule configuration file. Another "file" that is created is a pseudofile that corresponds to the path created when the submodule was added. Both of these should be committed.
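
For example, using the placeholder path from above:

# Commit the submodule configuration and the new path entry
git add .gitmodules path/in/superproject/to/be/created
git commit -m "Adding a submodule"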

Important

The url used when adding the submodule can be relative. This is often more desirable than hard coding the path to a specific repository. The main reason is that the content referenced by a particular release repository should actually exist on the Repo Appliance. This is a best practice that allows Repo Appliance state to be backed up with the guarantee that a project's state can be rebuilt and the machines involved can be provisioned. See the git-submodule manpage for more information.


A.2.3. A word on pushing superprojects

Typically only metadata is stored in the release superproject. For this reason copying release deliverables from one Repo Appliance to another is not as simple as using git push on only the release repository. If relative submodule paths are used (and they should be) the state referenced in all submodules must exist on a given Repo Appliance. Luckily, this is quite easy to do with genome-sync.

A.2.4. Branching strategy

Complexity, risk, and external factors all play a large role in how a particular project decides to branch. Conventions go a long way toward simplifying this process and can make projects move smoothly from development to production.

In a nutshell, the conventions are:

  • If a project is named foo then there will be a branch called foo on all git repositories touched by that project.

  • Branches that match the project name are considered to be stable and "on their way to production".

  • Using the release superproject is simply a matter of wiring up the branches for a particular project into one branch, which also bears the name of the project.

    In practice what this equates to is, after adding the submodules to a superproject, going into the submodule's directory and getting the working directory to match the desired state. If the project branch naming conventions are being followed the content can simply be fetched and then checked out.

    If the fetch/checkout process results in a change, git status at the root of the superproject will reflect the change. The changes can then be committed (on the superproject branch that corresponds to the project name), as sketched below.
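
A sketch of that workflow for a project named foo (the submodule path is a placeholder):

# Inside the superproject, move a submodule to the tip of the project branch
cd path/to/submodule
git fetch origin
git checkout foo

# Back at the superproject root, record the new submodule state
cd -
git status
git commit -a -m "Updating submodule to the latest foo"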

Note

These conventions only need to be followed by the people who are "interfaces" between teams. The use of Repo Appliances can also aid the branching strategy in that it allows each group to determine what works best for them. For example, development and release engineering (RE) teams have different goals when it comes to managing a codebase. In development a team will be more concerned with how to balance the bugfix and feature streams of a project while RE will focus more on how moving these changes through the various environments affects other projects.

A.2.5. What about the master branch?

For most git repositories the master branch really isn't even needed, and it only adds to confusion since there is no consensus as to how branches like trunk and master should be used. The main exception with the Genome tooling is the case of the puppet module repositories. The hook that checks out the module code and puts it on the modulepath needs to know the name of a particular branch to work with. That branch is the master branch.

The normal workflow for a puppet module is to test changes on the master branch and then push changes to the project branch when they are baked.

Important

This process can be followed regardless of where in the lifecycle the change occurs. Development can test their changes, push to their project branch and then QA can push the project branch into their master. Once through QA, the code can again be pushed to a project branch where RE can take over.

Glossary

cloud appliance

A server appliance that controls a number of cloud members as func minions.

A Cloud Appliance is simply a prepackaged cloud master.

See Also cloud master.

cloud master

A server that controls a number of cloud members as func minions.

cloudmasterd

A service running on a cloud master that provides the ability to control one or more cloud members.

cloud member

A server that can host virtual machines and that is controlled by a cloud master . Cloud members are controlled through the use of func. When a cloud member is added to a cloud, it is added to the cloud master as a func minion. This allows the cloud master to take control of certain functions on the cloud member. For the purposes of Genome, this means taking control of the ability to koan new virtual machines on the cloud member.

In order for a server to become a viable cloud member, it must have been kickstarted with an appropriate cobbler profile for cloud machines. This ensures that the cloud member has the correct virtual machine hosting capabilities and storage facilities.

genome appliance

A server appliance that serves as the central controlling unit in the Genome framework.