EFM AND PGPOOL INTEGRATION FOR HA

Obtaining high availability in any infrastructure setup is very important. Thanks to the pgpool and EDB EFM team who make this possible.

I will not go deep into explaining what is EFM or pgpool. What i intended to archived here is to show how this integration works with a demo. EFM is just a tool that can be used to archived high availability with postgres during failover or switchover. You can read more on EFM from here. Pgpool on the other hand is widely used for load balancing and connection pooling although its also used for streaming as well. You can read a guide on pgpool from here.

In this illustration, i have a cluster of EFM to monitor the database nodes (master and two standby). In case of a failover or you want to switchover, EFM can handle that by performing the switch to the next available standby node.

I also setup two nodes which make up the pgpool cluster to provide continuous application connectivity. This comes handy if for some reason, one of the pgpool node crash or stops, watchdog detech and does a failover to the next pgpool node.

Also we have the attached and the detach scripts which are called each time EFM does a failover or switchover to update the pgpool node of the new master and nodes in the cluster.

I hope this picture below will speak more for itself. but if its not clear enough, I will recommend watching the video of the below setup and how i integrated this.

Below are the two scripts used for this integration.

You might need to edit user, ports or IP as needed if you want to use.

===efm_loadbalancer_attach.sh===
==============Note that in the script below, i used the delegate IP so that this scripts could be run on any node having the VIP.If this is not used, then this script will fail if pgpool service goes down.=================================

-bash-4.2$ cat efm_loadbalancer_attach.sh
#!/bin/bash
###################################################################################
#title           : efm_loadbalance_attach.sh
#description     : Script executes pcp_attach_node/pcp_promote_node command to 
#                : notify pgpool for NEW Master or Attach returning nodes back to  
#                : pgpool cluster. 
#date            : October 27, 2018
#version         : 1.0
#bash_version    : GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
#author          : Raghavendra Rao([email protected])
#
#Note: Script runs pcp_* commands. Password less PCP configuration is MUST..
#
###################################################################################
# quit on any error - verify any undefined shell variable
set -e
set -u

# EFM variable
#-------------
EFM_HOST_FROM_HOOK=$1                 # argument from EFM fencing hook

# Pgpool cluster connection information
#--------------------------------------
PCP_USER=enterprisedb                 # PCP user name (enterprisedb or postgres)
PCP_PORT=9898                         # PCP port number as in pgpool.conf
PCP_HOST=192.168.91.222                 # hostname of Pgpool-II
PGPOOL_PATH=/usr/edb/pgpool4.0/bin    # Pgpool-II installation path
PGPOOL_PORT=9999
PCPPASSFILE=/var/efm/pcppass          # Path to PCPPASS file

# EPAS/PG DB connection information
#-----------------------------------
PGPATH=/usr/edb/as10                  # Path to EDB binaries
PGUSER=enterprisedb
PGPORT=5432
PGHOST=${EFM_HOST_FROM_HOOK}

# Pgpool pcp & db check commands
#-------------------------------
PCP_NODE_COUNT=${PGPOOL_PATH}/pcp_node_count
PCP_ATTACH_NODE=${PGPOOL_PATH}/pcp_attach_node
PCP_PROMOTE_NODE=${PGPOOL_PATH}/pcp_promote_node
PCP_NODE_INFO=${PGPOOL_PATH}/pcp_node_info
PG_ISREADY=${PGPATH}/bin/pg_isready

export PCPPASSFILE PCP_USER PCP_PORT PCP_HOST PGPOOL_PATH PGPOOL_PORT \
       PGPATH PGUSER PGPORT PG_ISREADY \
       PCP_NODE_INFO PCP_NODE_COUNT PCP_ATTACH_NODE PCP_PROMOTE_NODE

logfile=/usr/edb/efm-3.7/efm-scripts/efm_attach_"`date +"%Y%m%d%H%M%S"`".log


# Print Node status from pool_nodes to log
#-----------------------------------------

print_node_status()
{
   echo "--------------------------------------------">>$logfile
   echo " Node IP: $1 Node ID: $2 $3 :">>$logfile
   echo "--------------------------------------------">>$logfile
   ${PCP_NODE_INFO} -h ${PCP_HOST} -U ${PCP_USER} -p ${PCP_PORT} -w -n ${2} >>$logfile
}

# Get db running or not
#----------------------
db_status_check()
{
   local val=$(${PG_ISREADY} -h ${1} \
                             -U ${PGUSER} \
                             -p ${PGPORT} \
                             -t 0 \
                             -d template1 | grep "accepting" | wc -l)
   echo "$val"
}

#Get Node ID information from show pool_nodes
#---------------------------------------------
get_node_id_using_pcp()
{
  NO_OF_NODES=$(${PCP_NODE_COUNT} --host=${PCP_HOST} \
                                  --username=${PCP_USER} \
                                  --port=${PCP_PORT} \
                                  --no-password )

  for (( i=0 ; i < ${NO_OF_NODES} ; i++ ))
  do
     exists=$(${PCP_NODE_INFO} --host=${PCP_HOST} \
                               --username=${PCP_USER} \
                               --port=${PCP_PORT} \
                               --no-password ${i} | grep ${1} | wc -l)
     if [[ ${exists} -eq 1 ]]; then
        NODE_ID=${i}
        break
     fi
  done
  echo "$NODE_ID"
}


# Check local node is in Recovery or not
#---------------------------------------

db_recovery_check()
{
   local dbcheck=$(db_status_check ${PGHOST})
   val=x
   if [[ ${dbcheck} -eq 1 ]]; then
       val=$(${PGPATH}/bin/psql -h ${PGHOST} \
                                 -U ${PGUSER} \
                                 -p ${PGPORT} \
                                 -d template1 \
                                 -Atc "select pg_is_in_recovery();")
   fi
   echo "$val"
} 

# pcp_attach_node command
# -----------------------

attach_node_to_pgpool()
{
   echo -e "\nAttaching Node :">>$logfile
   echo "-----------------">>$logfile
   ${PCP_ATTACH_NODE} --host=${PCP_HOST} \
                      --username=${PCP_USER} \
                      --port=${PCP_PORT} \
                      --no-password \
                      --verbose \
                      ${1} >>$logfile
}

attaching_remaining_nodes_to_pgpool()
{
   local tot_nodes=$(${PCP_NODE_COUNT} --host=${PCP_HOST} \
                                       --username=${PCP_USER} \
                                       --port=${PCP_PORT} \
                                       --no-password )
   for (( j=0; j < 2; j++ ))
   do
   for (( i=0 ; i < ${tot_nodes} ; i++ ))
   do
      if [[ ${i} -ne ${1} ]]; then
         node_ip=$(${PCP_NODE_INFO} --host=${PCP_HOST} \
                                          --username=${PCP_USER} \
                                          --port=${PCP_PORT} \
                                          --no-password ${i} | cut -d" " -f1)
         if [[ ! -z ${node_ip} ]]; then
            local node_state=$(db_status_check $node_ip)
            if [[ ${node_state} -eq 1 ]]; then 
               echo -e "\nAttaching Node-ID: ${i} Node-IP: ${node_ip} NodeDBStatus[1-UP/0-DOWN] : ${node_state}">>$logfile 
               attach_node_to_pgpool "${i}"
               print_node_status "${node_ip}" "${i}" "after attach"
            fi
         fi
      fi
   done
   echo -e "\n*** Re-confirming DB Status to attach,due to delay when performing switchover/failover ***">>$logfile
   done
}


# Get Node Id from PgPool cluster & DB status
#--------------------------------------------
EFM_HOOKIP_NODE_ID=$(get_node_id_using_pcp $EFM_HOST_FROM_HOOK)
DB_RECOVERY_STATUS=$(db_recovery_check)


echo "-----------------------------------------------------------">>$logfile
echo "EFM Attaching Nodes Rules :- ">>$logfile
echo "--------------------------">>$logfile
echo "   Rule 1: Node should be attached if DB is in Recovery">>$logfile
echo "   Rule 2: Node should be Promoted if DB is NOT in Recovery">>$logfile
echo "-----------------------------------------------------------">>$logfile
echo "EFM Attaching Node            : $EFM_HOST_FROM_HOOK">>$logfile
echo "DB Recovery Status[True/False]: ${DB_RECOVERY_STATUS} ">>$logfile
echo "Node ID in PgPool Cluster     : ${EFM_HOOKIP_NODE_ID}">>$logfile
echo "-----------------------------------------------------------">>$logfile


# Attach Node if DB node is back as Standby
# -----------------------------------------

if [[ ${DB_RECOVERY_STATUS} == "t" ]]; then
   print_node_status $EFM_HOST_FROM_HOOK ${EFM_HOOKIP_NODE_ID} "before attach"
   attach_node_to_pgpool ${EFM_HOOKIP_NODE_ID}
   print_node_status $EFM_HOST_FROM_HOOK ${EFM_HOOKIP_NODE_ID} "after attach"
fi


# Promote Only if Node is back as Master
# --------------------------------------

if [[ ${DB_RECOVERY_STATUS} == "f" ]]; then
   ${PCP_PROMOTE_NODE} --host=${PCP_HOST} \
                       --username=${PCP_USER} \
                       --port=${PCP_PORT} \
                       --no-password \
                       --verbose \
                       ${EFM_HOOKIP_NODE_ID} >>$logfile
   print_node_status $EFM_HOST_FROM_HOOK ${EFM_HOOKIP_NODE_ID} "after promote" 
   attaching_remaining_nodes_to_pgpool $EFM_HOOKIP_NODE_ID
fi
echo -e "\nNote: Check 'show pool_nodes;' if logs show node status as 'down/waiting'">>$logfile
exit 0
==================efm_loadbalancer_detach.sh=================
==============Note that in the script below, i used the delegate IP so that this scripts could be run on any node having the VIP.If this is not used, then this script will fail if pgpool service goes down.=================================

-bash-4.2$ cat efm_loadbalancer_detach.sh
#!/bin/bash
###################################################################################
#title           : efm_loadbalance_detach.sh
#description     : Script executes pcp_detach_node command to notify pgpool on node 
#                : removal from pgpool cluster. 
#date            : October 1, 2018
#version         : 1.0
#bash_version    : GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
#author          : Raghavendra Rao([email protected])
#
#Mandatory       : SCRIPT REQUIRES PASSWORD LESS EXECUTION. PCPPASS SHOULD BE
#                : CONFIGURED BEFORE CALLING IT. REFER TO THE BLOG ON THE USAGE
###################################################################################
# quit on any error
set -e
set -u

# Set environment variables, so EFM can connect to pgpool cluster and database
# to perform node status checks
#
# EFM variable
#-------------
EFM_HOST_FROM_HOOK=$1                        # argument from EFM fencing hook

# Pgpool cluster connection information
PCP_USER=enterprisedb                 # PCP user name (enterprisedb or postgres)
PCP_PORT=9898                         # PCP port number as in pgpool.conf
PCP_HOST=192.168.91.222                # hostname of Pgpool-II
PGPOOL_PATH=/usr/edb/pgpool4.0/bin    # Pgpool-II installation path
PCPPASSFILE=/var/efm/pcppass          # Path to PCPPASS file
PGPOOL_PORT=9999

# EPAS/PG DB connection information
#-----------------------------------
PGPATH=/usr/edb/as10                 # Path to EDB binaries
PGUSER=enterprisedb
PGPORT=5432
PGHOST=${EFM_HOST_FROM_HOOK}

# Pgpool pcp & db check commands
#-------------------------------
PCP_NODE_COUNT=${PGPOOL_PATH}/pcp_node_count
PCP_NODE_INFO=${PGPOOL_PATH}/pcp_node_info
PCP_DETACH_NODE=${PGPOOL_PATH}/pcp_detach_node
PG_ISREADY=${PGPATH}/bin/pg_isready

export PCPPASSFILE PCP_USER PCP_PORT PCP_HOST PGPOOL_PATH PGPOOL_PORT \
       PGPATH PGUSER PGHOST PGPORT PG_ISREADY \
       PCP_DETACH_NODE PCP_NODE_COUNT 

logfile=/usr/edb/efm-3.7/efm-scripts/efm_detach_script_"`date +"%Y%m%d%H%M%S"`".log

# Print Node status from pool_nodes to log
#-----------------------------------------

print_node_status()
{
   echo -e "\nNode $EFM_HOST_FROM_HOOK Status $2 $3 :">>$logfile
   echo "--------------------------------------------">>$logfile
   ${PGPATH}/bin/psql -h ${PCP_HOST} \
                                  -U ${PCP_USER} \
                                  -p ${PGPOOL_PORT} \
                                  -d template1 \
                                  -c "show pool_nodes;" | awk -v ip="$1" 'NR==1 || NR==2|| $0 ~ ip' >>$logfile
}

#Get Node ID information from show pool_nodes
#---------------------------------------------
get_info_from_pool_nodes()
{
  NO_OF_NODES=$(${PCP_NODE_COUNT} --host=${PCP_HOST} \
                                  --username=${PCP_USER} \
                                  --port=${PCP_PORT} \
                                  --no-password )

  for (( i=0 ; i < ${NO_OF_NODES} ; i++ ))
  do
     exists=$(${PCP_NODE_INFO} --host=${PCP_HOST} \
                               --username=${PCP_USER} \
                               --port=${PCP_PORT} \
                               --no-password ${i} |grep ${1} | wc -l)
     if [[ ${exists} -eq 1 ]]; then
        NODE_ID=${i}
        break
     fi
  done
  echo "$NODE_ID"
}

# Get db running or not
#----------------------
db_status_check()
{
   local val=$(${PG_ISREADY} -h ${PGHOST} \
                             -U ${PGUSER} \
                             -p ${PGPORT} \
                             -t 0 \
                             -d template1 | grep "accepting" | wc -l)
   echo "$val"
}


DB_STATUS=$(db_status_check)

echo "Dettach Node = ( DBNode=Down )">>$logfile
echo "DB Status on $EFM_HOST_FROM_HOOK: `[[ $DB_STATUS -eq 1 ]] && { echo UP; } || { echo DOWN; }`">>$logfile

if [[  ${DB_STATUS} -eq 0 ]]; then
   echo "Performing Detach operation">>$logfile
   print_node_status $EFM_HOST_FROM_HOOK before detach
   EFM_HOOKIP_NODE_ID=$(get_info_from_pool_nodes $EFM_HOST_FROM_HOOK 1)
   echo -e "\nDetaching Node :">>$logfile
   echo "-----------------">>$logfile
   ${PCP_DETACH_NODE} --host=${PCP_HOST} \
                      --username=${PCP_USER} \
                      --port=${PCP_PORT} \
                      --no-password \
                      --verbose \
                      ${EFM_HOOKIP_NODE_ID} >>$logfile

   print_node_status $EFM_HOST_FROM_HOOK after detach
fi

exit 0

About the author

User Avatar

bensonyerima

Hi, I'm Benson Yerima, a database administrator with an obsession for all things tech. This blog is dedicated for helping people learn about database technology.

View all posts