Wednesday, September 10, 2014

Creating a CentOS 7 Master Server Image


I know CentOS 7 is still in beta but, like others, I want to become familiar with it as soon as I can. So I'm going through the process of setting up a master server image. When CentOS 7 is officially released, I'll create the real master server image from these steps.

I want to start with a clean system. Just the basics. After the OS is installed, I'll add a few specific packages and customize the configuration of various parts of the server to get it just the way I need it for my environment. Everyone's environment is different and some steps below could be changed or eliminated altogether depending on what you need. (This doc will be updated as needed.)

*This server will be installed as a VM running on a VMware ESXi 5.5 host.

Installation
1) Download the CentOS 7 ISO.

2) Create the VM in vCenter
- 1 vCPU
- 4 GB memory
- 30 GB disk
- enable Memory and CPU Hot Add

3) Install the OS
- Configure Network
- set static IPv4 address, netmask, gateway
- set IPv6 to 'ignore'
- enable interface
- Configure Time
- set time zone
- enable NTP
- configure corporate time servers
- Custom Disk Partitioning
- /boot, 500 MB on a standard partition
- /, 10 GB, logical volume lv_root in volume group vg00
- /var, 5 GB, lv_var in vg00
- /home, 2 GB, lv_home in vg00
- /tmp, 2 GB, lv_tmp in vg00
- swap, 4 GB, lv_swap in vg00
- /opt, <remaining disk>, lv_opt in vg00
- Software
- infrastructure server
- Begin installation
- Set root password
- Reboot after installation completes

4) Configure Network
- uninstall NetworkManager rpm
- edit /etc/hosts
- add line with server IP address, hostname and FQDN
- edit /etc/sysconfig/network
- delete line with comment, "#Created by anaconda"
- add line "NETWORKING=yes"
- add line with "HOSTNAME=<server name>"
- edit /etc/sysconfig/network-scripts/ifcfg-ens160
- add line with DEVICE="ens160" (device name has to match the device in /sys/class/net)
- add line with BROADCAST="<broadcast address>"
- add line with NETMASK="<netmask>"
- comment out lines with UUID, HWADDR and PREFIX0
- remove zero from GATEWAY0 and IPADDR0 parameters
- edit /etc/resolv.conf
- delete line with comment, "#Generated by NetworkManager"
- add line with "options timeout:1"
- disable firewalld.service
# systemctl disable firewalld.service
- edit /etc/sysconfig/selinux and set "SELINUX=disabled"
- set hostname
# hostnamectl set-hostname <host name>
- reboot
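
After the edits in step 4, the interface file might look roughly like the sketch below. This is only an illustration: the addresses are placeholders from the 192.0.2.0/24 documentation range, the TYPE, BOOTPROTO, ONBOOT and NAME lines are assumptions about what the installer left behind, and the file is written to a scratch directory rather than /etc/sysconfig/network-scripts.

```shell
# Sketch only: writes an example ifcfg file to a scratch directory, not /etc.
# All addresses are placeholders from the 192.0.2.0/24 documentation range.
OUT_DIR=$(mktemp -d)
cat > "$OUT_DIR/ifcfg-ens160" <<'EOF'
TYPE="Ethernet"
BOOTPROTO="none"
ONBOOT="yes"
NAME="ens160"
DEVICE="ens160"
IPADDR="192.0.2.10"
NETMASK="255.255.255.0"
BROADCAST="192.0.2.255"
GATEWAY="192.0.2.1"
#UUID=...
#HWADDR=...
#PREFIX0=...
EOF

# Show the result for review.
cat "$OUT_DIR/ifcfg-ens160"
```

The commented-out lines stay in the file so the installer's original values are easy to find later.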

5) Setup Login Environment and SysAdmin Accounts

- edit /etc/bashrc
- add a few custom aliases for common commands
- set the custom default prompt
- add "HISTFILESIZE=10000"
- add "PROMPT_COMMAND='history -a'"
- add "export HISTFILESIZE"
- setup accounts for the system administrators
- add sysadmin user accounts
- set the correct UID
- chage -M -1 to never expire
- setup sudo
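
The /etc/bashrc additions from step 5 could look something like this. The history lines come straight from the list above; the aliases and the prompt are hypothetical stand-ins, since every site has its own.

```shell
# Sketch of lines appended to /etc/bashrc.
# The aliases and prompt below are hypothetical examples.
alias ll='ls -l'
alias psg='ps -ef | grep -i'
PS1='[\u@\h \W]\$ '

# Keep up to 10000 lines in the history file and export the setting.
HISTFILESIZE=10000
export HISTFILESIZE

# Append each command to the history file as soon as the prompt returns,
# so history is not lost when a session drops.
PROMPT_COMMAND='history -a'
```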

6) Setup Mail
- verify postfix is enabled
- edit /etc/aliases and add mail forward address for root
- reinitialize the alias database
# newaliases

7) Setup SSH
- edit /etc/ssh/sshd_config
- uncomment line with "Protocol 2"
- set/add banner with "Banner /etc/issue"
- scp the custom /etc/issue file from the CentOS 6 master image server
- restart sshd
# systemctl restart sshd.service

8) Setup Repositories
- add epel and rpmforge repos
- for each
- google for the URL to the correct version of the repo setup rpm
# wget <URL to rpm>
# rpm -ivh <rpm>
- set new repos to not be enabled by default by editing the corresponding /etc/yum.repos.d/*.repo file
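
Disabling a repo by default comes down to changing "enabled=1" to "enabled=0" in its .repo file. Here is a minimal sketch of that edit, run against a scratch file with made-up repo content rather than a real file in /etc/yum.repos.d.

```shell
# Sketch: flip enabled=1 to enabled=0 in a repo file.
# Works on a scratch copy; the repo content below is a made-up stand-in.
REPO_FILE=$(mktemp)
cat > "$REPO_FILE" <<'EOF'
[example]
name=Example repository
enabled=1
gpgcheck=1
EOF

# Rewrite the file with every "enabled=1" line changed to "enabled=0".
sed 's/^enabled=1/enabled=0/' "$REPO_FILE" > "$REPO_FILE.new" && mv "$REPO_FILE.new" "$REPO_FILE"
cat "$REPO_FILE"
```

With a repo disabled by default, it can still be used for a single command with "yum --enablerepo=<repo id> install <package>".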

9) Setup Root Crontab
- scp custom scripts from /usr/local/bin on the CentOS 6 master image server
- verify the scripts run correctly on CentOS 7
- scp /var/spool/cron/root file from CentOS 6 master image server as a starting point
- edit root's crontab
- adjust cron job run times and commands as needed

10) Install Various Software and RPMs
- install RPMs via Yum
- pam_krb5 (for kerberos authentication)
- ntp (for time; didn't come with infrastructure server installation)
- ncompress (contains compress utility that is used by several custom scripts)
- sg3_utils (for scsi-rescan utility)
- lsscsi (for lsscsi utility)
- nagios-nrpe (for Nagios monitoring)
- nagios-plugins (Nagios)
- nagios-common (Nagios)
- htop (great, top-like utility)
- iotop (top-like utility for io)
- iptraf (for watching network traffic)
- dstat (another top-like, information gathering utility)
- haveged (for better random number generation in a virtual machine)
- install VMware Tools
- follow normal procedures
- install Toptracker (a custom app for trending system performance data)
- follow normal procedures
- install backup software
- follow normal procedures
- update everything
# yum update
- reboot

11) Setup Kerberos Authentication to AD
- pam_krb5 was installed in previous section
- scp /etc/krb5.conf (has custom configuration) from CentOS 6 master image server
- enable kerberos authentication
# authconfig --enablekrb5 --krb5kdc=<kdc> --krb5realm=<realm> --update
- verify login with AD credentials works

12) Setup Docker
- install docker
# yum install docker
- start docker
# systemctl start docker.service
- set docker to start on boot
# systemctl enable docker.service


Thursday, September 4, 2014

SA Cheat Sheet

I need a place to jot down some general configuration helps and command syntax. Hence, the SA Cheat Sheet.
(**Primarily for VMware, RHEL and CentOS systems)
(**This is a living document and will be updated as I find more tips I need to remember)


-----
Using tar to copy directories and files to another location on the same server or a different server.
Example to copy /opt/copyme to /data/copyme on the same server:
$ cd /opt
$ tar cvfp - copyme | (cd /data && tar xvfp -)
Example to copy user bart's home directory to a different server:
$ cd /home
$ tar cvfp - bart | ssh root@targetserver 'cd /home && tar xvfp -'
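
To try the tar pipe safely before pointing it at real data, something like this sketch works. It uses two scratch directories in place of /opt and /data; the file names are made up.

```shell
# Sketch: copy a directory tree with a tar pipe, using scratch directories
# in place of /opt and /data.
SRC=$(mktemp -d)
DST=$(mktemp -d)
mkdir -p "$SRC/copyme/sub"
echo "hello" > "$SRC/copyme/sub/file.txt"

# c = create, f - = write the archive to stdout, p = preserve permissions.
# The receiving tar reads the archive from stdin and extracts it in $DST.
(cd "$SRC" && tar cfp - copyme) | (cd "$DST" && tar xfp -)

ls -R "$DST/copyme"
```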

Clear out all messages in the postfix queue
$ postsuper -d ALL

Show what is using virtual memory.
$ ps -e -o pid,vsz,comm= | sort -n -k 2

A note about xargs.
If you have two arguments, say r1.txt and r2.txt from the command “ls r*.txt”:
$ ls r*.txt | xargs rm
will execute the rm command this way...
rm r1.txt r2.txt
However,
$ ls r*.txt | xargs -I{} rm {}
will execute the args separately, this way...
rm r1.txt
rm r2.txt
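
A harmless way to see the difference is to put echo in front of rm, so xargs prints the command lines it would run instead of deleting anything. A small sketch, with the file names fed in by printf rather than ls:

```shell
# Sketch: prefix the command with echo to see how xargs would invoke rm,
# without deleting anything.
BATCHED=$(printf 'r1.txt\nr2.txt\n' | xargs echo rm)
ONE_EACH=$(printf 'r1.txt\nr2.txt\n' | xargs -I{} echo rm {})

echo "$BATCHED"     # one rm with both file names
echo "$ONE_EACH"    # one rm per file name
```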

Using ssh keys for no-password root login to ESX hosts.
For ESX 5.x, place the public key (id_dsa.pub or id_rsa.pub) in:
<ESX 5 host>:/etc/ssh/keys-root/authorized_keys
For ESX 4.x, place the public key in:
<ESX 4 host>:/.ssh/authorized_keys

Various ways of listing disks / storage on a server.
fdisk - list disks
$ fdisk -l
lsscsi - list SCSI devices and various info (rpm lsscsi)
$ lsscsi
pvscan - list physical volumes that are part of the LVM
$ pvscan

Scan a live server for new scsi disk using scsi-rescan (included in sg3_utils rpm)
$ scsi-rescan
After running scsi-rescan, new disks will show up with “fdisk -l” and can be added to LVM with the pvcreate utility.
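
Adding the new disk to LVM could be sketched as below. The device /dev/sdb and the volume group vg00 are placeholder names, and the helper functions are hypothetical. The script defaults to a dry run that only prints the commands, so nothing is changed unless DRY_RUN is set to 0.

```shell
# Sketch: print (or run) the LVM commands that add a new disk to a volume group.
# /dev/sdb and vg00 are placeholder names; DRY_RUN defaults to 1 so nothing
# is changed unless it is explicitly set to 0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

add_disk_to_vg() {
    disk=$1
    vg=$2
    run pvcreate "$disk"        # initialize the disk as an LVM physical volume
    run vgextend "$vg" "$disk"  # add the physical volume to the volume group
}

add_disk_to_vg /dev/sdb vg00
```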

Remounting a read only file system as read write:
mount -o remount,rw <file system>
Example:
$ mount -o remount,rw /var

Show server port information and connected / listening ports.
netstat (included with net-tools rpm)
$ netstat -putlan
lsof (included with lsof rpm)
$ lsof -i
$ lsof -i -P #shows what is using which ports but doesn’t convert port numbers to names
$ lsof -i :80 #shows what is using port 80

Remove the virbr0 interfaces:
$ virsh net-destroy default
$ virsh net-undefine default
$ service libvirtd stop
$ chkconfig libvirtd off

Search and replace in vi (starting at the first line through the entire file):
:1,$s/<text to match>/<replace with this>/g

Mail .forward and /etc/aliases
The aliases defined in /etc/aliases take precedence over user's .forward file settings.


Systemd Cheat Sheet

I'm still getting used to systemd commands and syntax. Here are some common tasks and their commands. (**This is a living doc and content will be added as needed)

-----
Show duration of the last boot:
$ systemd-analyze

Show how long each systemd unit took to start on last boot:
$ systemd-analyze blame

Show the systemd journal:
$ journalctl

Show all systemd journal events for today:
$ journalctl --since=today

Show all errors in the systemd journal:
$ journalctl -p err

Show the default run level:
$ systemctl get-default

Set default run level:
$ ln -sf /lib/systemd/system/<target name>.target /etc/systemd/system/default.target

Set default run level to multi-user (run level 3):
$ ln -sf /lib/systemd/system/multi-user.target /etc/systemd/system/default.target

Set default run level to graphical (run level 5):
$ ln -sf /lib/systemd/system/graphical.target /etc/systemd/system/default.target
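
Since I still think in run level numbers, a tiny helper that maps them to target names is handy. This is a sketch, not part of systemd; the function name is made up. The mapping itself follows the standard systemd run level compatibility targets.

```shell
# Sketch: map a SysV run level number to the systemd target that replaces it.
runlevel_to_target() {
    case "$1" in
        0)     echo "poweroff.target" ;;
        1)     echo "rescue.target" ;;
        2|3|4) echo "multi-user.target" ;;
        5)     echo "graphical.target" ;;
        6)     echo "reboot.target" ;;
        *)     echo "unknown run level: $1" >&2; return 1 ;;
    esac
}

runlevel_to_target 3
runlevel_to_target 5
```

On systems where systemctl supports the set-default verb, the result can be passed along with something like: systemctl set-default "$(runlevel_to_target 3)". That manages the same default.target symlink as the ln commands.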

Change to a different run level:
$ systemctl isolate <target name>.target

List active "units", as they are called (i.e. services, targets, mounts, etc):
$ systemctl

List units that are services:
$ systemctl -t service

List the status of a specific unit:
$ systemctl status <unit>
Example:
$ systemctl status sshd.service

Check if a service is enabled:
$ systemctl is-enabled <service>
Example:
$ systemctl is-enabled sshd.service
enabled

Kill all the processes of a given service:
$ systemctl kill <name>.service
Example:
$ systemctl kill crond.service

List all units and their state:
$ systemctl list-unit-files

Reload systemd, scanning for new or changed units:
$ systemctl daemon-reload

Display process details plus control group info. This is a great command to make an alias for.

$ ps xawf -eo pid,user,cgroup,args

Set the hostname:
$ hostnamectl set-hostname <name>

Tuesday, September 2, 2014

Yum Segmentation Fault

I have a server that, every now and then, spits out a "segmentation fault" error when trying to run yum. "yum clean" and the like don't fix it. The problem seems to crop up when a user installs new libraries on the server via source rather than via yum. A library link gets changed as a result and yum falls apart.

A few forums had posts about this issue but it took some effort to dig through to the ones that offered solutions. To save myself time, I'll put my steps here.

Resolution (this is a CentOS 6, 64bit server)
rpm -q zlib                     # see what version of zlib is installed
cd /lib64                       # change to the library directory
rm libz.so.1                    # remove existing link to the incorrect version, libz.so.1.2.8
ln -s libz.so.1.2.3 libz.so.1   # create link to the correct version, libz.so.1.2.3
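
To catch the problem before yum falls over, a check like the sketch below could run from cron. The function name is made up, and the file names assume the usual libz.so.<version> naming. The demonstration runs against a scratch directory standing in for /lib64.

```shell
# Sketch: report whether a library symlink points at the expected file.
check_link() {
    link=$1
    expected=$2
    actual=$(readlink "$link")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $link -> $actual"
    else
        echo "MISMATCH: $link -> $actual (expected $expected)"
        return 1
    fi
}

# Demonstration against a scratch directory standing in for /lib64.
LIBDIR=$(mktemp -d)
touch "$LIBDIR/libz.so.1.2.3"
ln -s libz.so.1.2.3 "$LIBDIR/libz.so.1"
check_link "$LIBDIR/libz.so.1" libz.so.1.2.3
```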

Monday, September 1, 2014

A Basic Root Crontab Contains

Everyone has their own layout and content preferences for cron files. These are some basic elements I usually include in root's crontab.

1) PATH statement at the top to help set up the environment. Even though this is root's crontab, cron entries don't run with the same environment that is set up for user root. Having the PATH statement in the cron file means I can shorten some of my entries. I think it looks cleaner.

2) A monthly job to gather system information. The script run by this job gathers key system information and various configuration files. The output of the script is sent to a central server where all server configurations are stored. The gathered data is a safety net in case a server needs to be rebuilt or I just need to compare the current server configuration with a previous configuration, to see what changed.

3) A job to check available disk space. This script runs a few times per hour. It checks for file systems that have filled up above a given threshold. The sysadmin receives an email when a threshold is crossed. **This is in addition to the Nagios disk monitors, which also check the servers for file systems that are filling up. I like having both disk space checks in place.

4) A job to check system security and integrity. This script checks a wide variety of critical system areas and reports differences and changes to the sysadmin.

5) A weekly cron job to check backups. This script compresses and rotates backup logs. It also sends an email report to the sysadmin about what is being backed up.

6) A cron job to verify VMtools is running. This script runs several times a day. It checks whether VMware VMtools is running or not. If it isn't, it tries to restart it. If that fails, an email is sent to the sysadmin to check into the problem.

7) A job to check NFS mounts (on servers with NFS mounts). Monitoring the status of NFS mounts is critical to making sure the environments are functioning properly. This script checks for missing NFS mounts and mounts that are stale.

8) A cron job to report on sudo activity. This script sends a daily sudo report to the sysadmin.

9) A daily job to report system errors. This job runs a custom script which parses various logs for errors and alerts of interest to the sysadmin and reports them via email.
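
As an illustration of item 3, the core of a disk space check can be sketched in a few lines. This is not my production script. The function name and the 90 percent threshold are made up, and it reads "df -P" style output on stdin so the logic can be tried on canned data.

```shell
# Sketch: list file systems whose usage is above a threshold.
# Reads `df -P` style output on stdin so the logic can be tried on canned data.
over_threshold() {
    threshold=$1
    awk -v limit="$threshold" 'NR > 1 {
        used = $5
        sub(/%/, "", used)            # strip the percent sign from "Capacity"
        if (used + 0 > limit + 0) print $6 " " $5
    }'
}

# Canned sample in df -P layout; the numbers are made up.
SAMPLE='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/vg00-lv_root 10190100 2038020 7611408 22% /
/dev/mapper/vg00-lv_var 5029504 4677439 73537 99% /var'

echo "$SAMPLE" | over_threshold 90
```

On a live server the input would come from "df -P | over_threshold 90", with the output mailed to the sysadmin when it is not empty. With a PATH statement at the top of the crontab that includes the script's directory, the cron entry only needs the script name rather than the full path.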


Friday, August 29, 2014

Getting Started with PowerCLI

Getting started with PowerCLI can be a bit overwhelming, even if you have a background in shell scripting. Thankfully, the syntax is straightforward, there are tons of resources available online and the user communities are a great place to go if you get stuck.

Launchpad to Some Helpful Resources


PowerShell

PowerCLI
     Setup
     Scripting

Saturday, August 16, 2014

SA Summit Mid-Year 2014

Every six months I hold a sysadmin team meeting I call an SA Summit. These half-day meetings are meant to get everyone on the same page and get us excited for the work ahead. To me, the technical discussions and the trainings are the most valuable. The team is great about sharing ideas. Participation is never a problem as everyone willingly contributes comments and suggestions. Each person also provides brief training to the rest of the team covering some area where they have expertise / experience.

Here is the agenda from the August 2014 meeting:

Purpose
To recognize our accomplishments.
To review the current state of systems and the server environment.
To review team goals and adjust/add goals.
To hold technical discussions and team training. 

Accomplishments
< accomplishments list >

Goals
< goals list >

State of the Systems
Slides
   - number of physical hosts and virtual machines (VMs)
   - number of physical hosts of each model of hardware
   - number of VMs in various environments
   - growth chart of VMs over the past five years
   - image of the landscape of blade chassis
   - image of the landscape of SAN disk units and switches

Technical Discussions
- Shared storage across blades
- Disaster recovery and business continuity status, testing and plans
- Monitoring script to monitor NFS mounts
- Cloud environment status and plans
- VMware environment status and plans
- Automation status and plans

Training
- DNS architecture and future plans
- Method for refreshing test environments
- VMware vCOPs 
- EMC cloning and snapshots

Saturday, March 22, 2014

Useful ESXi CLI Commands

Various ESXi 5.x Commands to Gather Information
System
List all VMs (VM name, vmx file, world ID, etc)
     #> esxcli vm process list
List all VMs (VM name, vmx file, guest OS, hardware version, etc)
     #> vim-cmd vmsvc/getallvms
Show hostname (hostname, FQDN and domain)
     #> esxcli system hostname get
Show ESXi version and build
     #> esxcli system version get
Show abbreviated ESXi version and build
     #> vmware -v
Show hardware info (name, vendor, serial number, etc)
     #> esxcli hardware platform get
Show CPU info (cores, vCPU, hyperthreading, etc)
     #> esxcli hardware cpu global get
Show memory info
     #> esxcli hardware memory get
Show load
     #> esxcli system process stats load get

Settings
Show current system time
     #> esxcli system time get
Show DNS servers
     #> esxcli network ip dns server list

Network
List nics and info
     #> esxcfg-nics -l
Another way to list nics and info
     #> esxcli network nic list
List vswitch connected to the management console
     #> esxcfg-vswitch -l
List vswitches
     #> esxcli network vswitch standard list

Show default route
     #> esxcfg-route -l

List VMs and their network
     #> esxcli network vm list

List portgroups
     #> esxcli network vswitch standard portgroup list

Show ARP table
     #> esxcli network ip neighbor list

Show active TCP/IP connections
     #> esxcli network ip connection list


Storage Adapters
List storage adapters
     #> esxcli storage core adapter list
List Fibre Channel adapters
     #> esxcli storage san fc list


Volumes
List volumes (VMFS, NAS, VFAT)
     #> esxcli storage filesystem list
Show the number of volumes
     #> ls -F /vmfs/volumes | grep \@$ | wc -l

Show unresolved VMFS snapshot volumes, if any
     #> esxcli storage vmfs snapshot list


Misc
Show Dell OpenManage info, if installed
     #> esxcli software vib list | grep -i openmanage