Quote

"Between stimulus and response there is a space. In that space is our power to choose our response.
In our response lies our growth and freedom"


“The only way to discover the limits of the possible is to go beyond them into the impossible.”


Thursday, 3 November 2016

The Conundrum of Automation Frameworks



The industry is realizing how important and critical a proper test automation practice is to delivering world-class software, yet it is still very rarely looked at as a ‘profit center’. The moment the words ‘cost center’ get associated with a practice, all the ‘eagles’ of the world descend to look at ‘measures’ to ‘save’ cost and run with the ‘bare minimum’. The ‘required focus’ to ‘criticality’ ratio remains miserably poor. The lack of ‘desired focus’ is often compensated for with whatever is ‘available’, and sometimes with the ‘available and not so useful’. The result: an inconsequential, mediocre, fragile, maintenance-intensive, unreliable automation practice. It barely does the job for an abysmally low percentage of the targeted functions, which results in low usability and low stakeholder confidence. Under tremendous delivery pressure and customer escalations, there is no choice but to grow the manual testing team, work overtime, and so on. The story goes on and on, release after release.

The problem lies not only in planning and focus but also in the availability of an appropriate talent pool. Manual resourcing challenges are a different ball game that needs its own approach, but there are readily available technical solutions in the market and the community which, with the right customization, can turn out to be the perfect fit for an in-house problem that has seemed insurmountable for ages.

‘Agile’, dependable, extensible, cost-effective, rapid in-sprint test automation is very difficult to achieve without a proper extensible automation framework in place. Most teams rely on internal development/automation teams to develop, enhance, and customize the frameworks backing their test automation efforts. However, the framework is a complete product in itself and requires commensurate product development focus. Again, because of the cost and the possibility that the product will fail, most organizations shy away from setting up such a framework development team. So if we do not want to develop one ourselves, the only options are to look at the market or go to the community. As a community and/or practice matures, it is likely to produce some reliable solutions which can be customized and adopted.

But if you go out hunting in the market, you need to be very clear about what your ‘exact’ need or requirement is. Unless you are clear about your requirements, you are highly likely to pick up a misfit, or a solution for another problem which you do not have to solve.

So what is the requirement? Let us take the high-level requirement to be ‘Agile’, dependable, extensible, cost-effective, rapid in-sprint test automation.
So how do we achieve this? With a framework or solution that is ‘easy to use’, ‘does not require a huge learning curve’, makes it ‘easy to write tests in English words/keywords (KWs)’, offers ‘handy reporting and logging’, is ‘platform or application agnostic’, supports multiple interfaces (API, command line, GUI), is data-driven, keeps test assets decoupled, and integrates with CI. If you are planning to build, keep these features in mind. Even if you are not planning to build there are other solutions…
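
To make the "keyword" and "decoupled test assets" ideas a little more concrete, here is a minimal sketch in Python. It is only an illustration of the style, not a recommendation of any particular framework; every name in it (the keywords, the shopping-cart scenario, run_test) is hypothetical.

```python
# Hypothetical keyword library: each English phrase maps to a Python function.
def open_session(ctx, user):
    ctx["user"] = user
    ctx["cart"] = []

def add_item(ctx, item):
    ctx["cart"].append(item)

def cart_should_contain(ctx, count):
    actual = len(ctx["cart"])
    assert actual == int(count), f"expected {count} items, got {actual}"

KEYWORDS = {
    "Open Session": open_session,
    "Add Item": add_item,
    "Cart Should Contain": cart_should_contain,
}

# The test itself is plain data, decoupled from the keyword implementations.
TEST_CASE = [
    ("Open Session", "alice"),
    ("Add Item", "book"),
    ("Add Item", "pen"),
    ("Cart Should Contain", "2"),
]

def run_test(steps):
    """Run each (keyword, argument) step in order and return a simple log.

    Execution stops at the first failing step.
    """
    ctx, log = {}, []
    for keyword, arg in steps:
        try:
            KEYWORDS[keyword](ctx, arg)
            log.append((keyword, "PASS"))
        except AssertionError as exc:
            log.append((keyword, f"FAIL: {exc}"))
            break
    return log

if __name__ == "__main__":
    for keyword, result in run_test(TEST_CASE):
        print(f"{keyword}: {result}")
```

Because the steps are just data, the same keywords can be reused across many test cases, and the log returned by run_test is the seed of the 'handy reporting and logging' requirement.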

Now that we have set up the boundary requirements, we can proceed to the next steps... I will try to touch on other aspects and solutions in upcoming posts in this series…


To be continued… Stay tuned... :)

Monday, 17 October 2016

What SSH are you running?

If two servers are running different types of SSH, connecting the two servers can be a nightmare. To correct this you can convert the keys. But before that you need to know which SSH is running or supported on these servers.

Secure Shell (SSH) enables remote login or remote command execution between two hosts. An SSH connection is a cryptographically secure communication channel. SSH ensures security by providing authentication, data integrity, encryption, authorization, and forwarding/tunneling.

SSH1 and SSH2



In fact, SSH1 and SSH2 are totally different protocols in terms of design and are not compatible with each other. SSH1 has a monolithic design in which several functions, such as authentication, transport, and connection, are packed into a single protocol, while SSH2 has a layered architecture for extensibility and flexibility. For enhanced security, SSH2 has MAC-based integrity checks, flexible session re-keying, fully negotiable cryptographic algorithms, public-key certificates, etc.

Sample public key in OpenSSH format (a single line; the ssh-dss prefix marks a DSA key):


ssh-dss AAAAB3NzaC1kc3MAAAEBAKueha6mfr5OUcscc88lmQUBBgYSZ08htHFaYzke2N5WG6ql1NgwQsyY2mMRxvvGckBeInx2GvRlz1+izDs5p4UGhkMzG8qOoT2y2vLwTFQyxi4IXET1e0E8VYC0dcLfs5Zg6RxEY7GA5FiydS6dceuPnLJgCYDfyb9Qbk4rVEvREODo8dV/KRlZxecEgaeKOO7ZnEzaIVPRCVbfasdfaseaRtZvxKfGnNFI957AfZ+Hqevz1IeQNDCp00EmaNli8Ow4rjOPlH7o818r35Ea8mMoV0hkirNQ25zf/Z1LvCS3649537YDi/SVmMMpGCvT93w/TRvk5RKlwVVy+H52C8/MKEAAAAVAOuDCV61LvfKz0bd8hYEJ/gGof9XAAABAQCFRhlpWtVOhTxcWcrnZ9EbbVRZO16St5TPjL86khb7b/VjScOAgt0tslHwtEEQzImv1xRkk6ZQ1o9pvAzb1fMZrZMGIy9zUXvL0v6LNXxCxN9YIjx14OXYfH8EIQDZJGRJoxHvEvUVjv3lHnTuxbdKrcbagvakxvgjq1wVyEueilO+g+WhJm+Q+XIYRl0TK9qtsAVFmzxBxT5USZFJ+1kbG7ippfFSGWRd3KPUCVQ8iGO3IMjtIlfcuGOArbKB06kMlxsdjNjhcEIHtR0jpaEeB2X+HrVScQEoXG4S8YkiIExlIvjhrVr571BTOuO9H5VHt4CtKUxeXxKZWslulYwAAABAHm3zlMsXxPL/HOq29qf7Lk90b7El+j19E2UkyssfSu6+/k4bFf6ax2n3yEn31S5bUdNvgqmlEjdERc4SkU65b5LW2ZI1v7kRoegG+bD2Q21N9Rv/lwS7CTprenKiMMRJ8TU7FMIVT3zEZkV+etC7cbaN+09GoiFTt+h7IDmo7onlo64oSMrcc+xt++ZUzENTVBgDoS9treternELkyJqZgb1/fdEPT6wRj132yBxWLqDGmbp9msmY1us+XNDY8isF80u9yTTXGTskOtCSaeavDDtPOKN5ZR20sHpIBgt6zd6mm/zKD6OZo14BLSJr7ldwSRzNNYMtkLnNyFSYxAIrm9Y==


Sample public key in SSH2 (RFC 4716) format:


---- BEGIN SSH2 PUBLIC KEY ----
Comment: "1024-bit DSA, converted from OpenSSH by satyam@sing"
AAAAB3NzaC1kc3MAAACBAJ7QKkrLoasddfasfasdfVmKVedk1GAr/S+Cruq3/GtjRnxvJqbBbfnelWYUC+vbHc5a+7bgRsQfCgoCeGKH5wGD4CDWQMhy2XYomnGf1gUC86Hq77/Noqa02N441EFSTIEoNlU2aYi8zwVQKlgP6e22mG9sK7zSaGX639ctaigHuST8qPAAAAFQC2az8dfxHkkDZAEw+RcvRn3asdfsafdasAAIEAgYpPs6d+Kyw37ZaBarlMEaZoEfrxhUZ44SN+KoqBZYpSVwyHJ+/RB0zVUizXCmZ5RhYSsYZ5asdfasfasdfmBxogaEh5d7xxUpg/9Xctf94Jsf7vxccjZ4XYARrVikq/0L9fuKOmo4ET9iAf+GL7w2u5gzxxZr+xX5jw/A7907lOCwAAACAMoHHk0o1XkG+yeaPtuwbrHshGqTjpOUkJ/AYuQ8OBuVAOdqse1di9JpeHko26G0zoH3N+nDHMGdYYTNHzRNYRd2q20ztcAP52crZo1rtpNdvs6c+RTEIgoP3oYh1e1+rg70tWKIW3R/NYB39CESHoyqsAJ7vzOPm0iUOd36YECY=
---- END SSH2 PUBLIC KEY ----
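
The two samples differ visibly in their first line, so a small script can tell them apart before you decide whether a key needs converting. The sketch below is only a rough check under that assumption: the function name and the returned labels are made up for illustration, and it looks at the layout only, not at whether the key data is valid.

```python
def detect_public_key_format(text):
    """Guess the file format of an SSH public key from its first non-empty line.

    Returns "openssh" for the one-line form that starts with the key type,
    "rfc4716" for the block wrapped in BEGIN/END SSH2 PUBLIC KEY markers,
    and "unknown" for anything else.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return "unknown"
    first = lines[0]
    # The block format opens with a BEGIN marker line.
    if first.startswith("---- BEGIN SSH2 PUBLIC KEY ----"):
        return "rfc4716"
    # The one-line format starts with the key type, followed by base64 data.
    key_type = first.split()[0]
    known_types = ("ssh-dss", "ssh-rsa", "ssh-ed25519")
    if key_type in known_types or key_type.startswith("ecdsa-sha2-"):
        return "openssh"
    return "unknown"

if __name__ == "__main__":
    print(detect_public_key_format("ssh-dss AAAAB3NzaC1kc3M user@host"))
```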


Verifying what SSH is running/supported


Since SSH1 and SSH2 are not compatible, you need to verify which SSH is running on the machines you are connecting from and to.

Option 1: /etc/ssh/sshd_config

  
To check which SSH is running/supported on the local machine, look in /etc/ssh/sshd_config for a 'Protocol 2' or 'Protocol 1,2' line. If /etc/ssh/sshd_config has 'Protocol 2' then only SSH2 is supported, and if 'Protocol 1,2' is present then both SSH1 and SSH2 are supported.
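
If you have many machines to check, the same lookup can be scripted. The following is a minimal sketch in Python, assuming the configuration uses the plain 'Protocol <versions>' form described above; the function name is hypothetical.

```python
def supported_protocols(sshd_config_text):
    """Return the protocol versions listed by the first 'Protocol' directive.

    Commented-out lines are ignored. An empty list means no uncommented
    'Protocol' line was found.
    """
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "protocol":
            versions = parts[1].replace(" ", "").split(",")
            return sorted(int(v) for v in versions if v)
    return []

if __name__ == "__main__":
    with open("/etc/ssh/sshd_config") as handle:
        print(supported_protocols(handle.read()))
```

An empty result only tells you that the file does not set the directive explicitly; what the daemon does in that case depends on its built-in default.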

Option 2: ssh -1 or ssh -2 user@remote_server

  
If you want to verify the SSH version supported on a remote machine, you can force a protocol version with -1 or -2 and see which one connects successfully:
ssh -1 user@remote_server
ssh -2 user@remote_server
If the wrong version is used, the following error is returned: 'Protocol major versions differ: 1 vs. 2'
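
Another way to see the same thing without logging in is to read the identification string the server sends when a client connects, for example 'SSH-2.0-OpenSSH_5.3'. The protocol version is the second field, and '1.99' is the conventional value for a server that speaks SSH2 but also accepts SSH1 clients. The sketch below assumes the identification string is the first line the server sends, which is a simplification; the function names are hypothetical.

```python
import socket

def parse_ssh_banner(banner):
    """Map an SSH identification string to the protocol versions it implies."""
    if not banner.startswith("SSH-"):
        raise ValueError(f"not an SSH identification string: {banner!r}")
    proto_version = banner.split("-", 2)[1]
    if proto_version == "1.99":
        return [1, 2]  # SSH2 server that remains compatible with SSH1 clients
    return [int(proto_version.split(".")[0])]

def read_ssh_banner(host, port=22, timeout=5.0):
    """Connect to host:port and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = sock.recv(256)
    lines = data.decode("ascii", errors="replace").splitlines()
    return lines[0] if lines else ""

if __name__ == "__main__":
    # 'remote_server' is a placeholder host name.
    print(parse_ssh_banner(read_ssh_banner("remote_server")))
```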

Option 3: sshscan

  
If you want to scan an entire network or a large group of machines, sshscan can be used.
This utility may not be installed on your machine by default, so you may need to install it before you can use it.

Option 4: ssh -V


ssh -V prints the version of the locally installed SSH client, for example:

OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010


Once you know that the versions do not match, you can convert the keys to be added to the ~/.ssh/authorized_keys file using the conversion methods defined here.


Monday, 13 June 2016

Manage High Availability Clusters Using Pacemaker



Computer Cluster 

A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. Unlike ‘grid computers’, computer clusters have each node set to perform the same task, controlled and scheduled by software.

The components of a cluster are usually connected to each other through fast local area networks ("LAN"), with each node (computer used as a server) running its own instance of an operating system. In most circumstances, all of the nodes use the same hardware and the same operating system, although in some setups (e.g. using Open Source Cluster Application Resources (OSCAR)), different operating systems and/or different hardware can be used on each computer.

They are usually deployed to improve performance and availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

Computer clusters emerged as a result of the convergence of a number of computing trends, including the availability of low-cost microprocessors, high-speed networks, and software for high-performance distributed computing. They have a wide range of applicability and deployment, ranging from small business clusters with a handful of nodes to some of the fastest supercomputers in the world, such as IBM's Sequoia. The range of applications is nonetheless limited, since the software needs to be purpose-built per task. It is hence not possible to use computer clusters for casual computing tasks.


Pacemaker

Pacemaker is open source high availability resource manager software that has been used on computer clusters since 2004. Until about 2007 it was part of the Linux-HA project; it was then split out into its own project.
It implements several APIs for controlling resources, but its preferred API for this purpose is the Open Cluster Framework resource agent API. Pacemaker is generally used with Corosync or Heartbeat.

Pacemaker’s key features include:

  • Detection and recovery of node and service-level failures
  • Storage agnostic, no requirement for shared storage
  • Resource agnostic, anything that can be scripted can be clustered
  • Supports STONITH for ensuring data integrity
  • Supports large and small clusters
  • Supports both quorate and resource driven clusters 
  • Supports practically any redundancy configuration
  • Automatically replicated configuration that can be updated from any node
  • Ability to specify cluster-wide service ordering, colocation and anti-colocation
  • Unified, scriptable, cluster shell
  • Support for advanced service types:
        • Clones: for services which need to be active on multiple nodes
        • Multi-state: for services with multiple modes (e.g. master/slave, primary/secondary)

Pieces of Clusters

At a high level, the cluster is made up of three pieces:

Non-cluster-aware components: These pieces include the resources themselves, scripts that start, stop and monitor them, and also a local daemon that masks the differences between the different standards these scripts implement.

Resource management: Pacemaker provides the brain that processes and reacts to events regarding the cluster. These events include nodes joining or leaving the cluster; resource events caused by failures, maintenance, scheduled activities; and other administrative actions. Pacemaker will compute the ideal state of the cluster and plot a path to achieve it after any of these events. This may include moving resources, stopping nodes and even forcing them offline with remote power switches.

Low-level infrastructure: Corosync provides reliable messaging, membership and quorum information about the cluster (illustrated in red).


Node Configurations

While the components may not vary, a cluster can support different types of node configurations. The most common size for an HA cluster is two nodes, since that is the minimum required to provide redundancy, but many clusters consist of many more, sometimes dozens of nodes. Such configurations can sometimes be categorized into one of the following models:

Active/active — Traffic intended for the failed node is either passed onto an existing node or load balanced across the remaining nodes. This is usually only possible when the nodes use a homogeneous software configuration.

Active/passive — Provides a fully redundant instance of each node, which is only brought online when its associated primary node fails. This configuration typically requires the most extra hardware.

N+1 — Provides a single extra node that is brought online to take over the role of the node that has failed. In the case of heterogeneous software configuration on each primary node, the extra node must be universally capable of assuming any of the roles of the primary nodes it is responsible for. This normally refers to clusters that have multiple services running simultaneously; in the single service case, this degenerates to active/passive.

N+M — In cases where a single cluster is managing many services, having only one dedicated failover node might not offer sufficient redundancy. In such cases, more than one (M) standby servers are included and available. The number of standby servers is a tradeoff between cost and reliability requirements.

N-to-1 — Allows the failover standby node to become the active one temporarily, until the original node can be restored or brought back online, at which point the services or instances must be failed-back to it in order to restore high availability.

N-to-N — A combination of active/active and N+M clusters, N to N clusters redistribute the services, instances or connections from the failed node among the remaining active nodes, thus eliminating (as with active/active) the need for a 'standby' node, but introducing a need for extra capacity on all active nodes.
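
To illustrate the N-to-N idea, the toy simulation below moves the services of a failed node onto the least-loaded surviving nodes. It is only a sketch of the concept: the node and service names are invented, and the "least loaded first" rule is an assumption made for the example, not how Pacemaker actually decides placement.

```python
def fail_over_n_to_n(placement, failed_node):
    """Redistribute the services of failed_node across the surviving nodes.

    placement maps node name -> list of service names. A new mapping without
    failed_node is returned; the input is left unchanged.
    """
    survivors = {n: list(s) for n, s in placement.items() if n != failed_node}
    if not survivors:
        raise RuntimeError("no surviving node to take over services")
    for service in placement.get(failed_node, []):
        # Pick the survivor with the fewest services; ties go to the lowest name.
        target = min(survivors, key=lambda n: (len(survivors[n]), n))
        survivors[target].append(service)
    return survivors

if __name__ == "__main__":
    cluster = {"node1": ["web", "db"], "node2": ["mail"], "node3": ["dns"]}
    print(fail_over_n_to_n(cluster, "node1"))
```

Note how every surviving node ends up carrying more than before, which is the "extra capacity on all active nodes" that this model requires.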

The terms logical host or cluster logical host are used to describe the network address that is used to access services provided by the cluster. This logical host identity is not tied to a single cluster node. It is actually a network address/hostname that is linked with the service(s) provided by the cluster. If a cluster node with a running database goes down, the database will be restarted on another cluster node.

The following picture from the Pacemaker documentation shows an Active/Active configuration:


Using PCS Command Line to Manage Pacemaker and Corosync

The pcs command line interface provides the ability to control and configure corosync and pacemaker.

The general format of the pcs command is as follows:
pcs [-f file] [-h] [commands]

The pcs commands are as follows:
  • cluster: Configure cluster options and nodes.
  • resource: Create and manage cluster resources.
  • stonith: Configure fence devices for use with Pacemaker.
  • constraint: Manage resource constraints.
  • property: Set Pacemaker properties.
  • status: View current cluster and resource status.
  • config: Display complete cluster configuration in user-readable form.
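
If you drive pcs from a script, the general format above maps naturally onto an argument list. The sketch below is one possible wrapper, assuming pcs is installed and on the PATH; the helper names and the file name cib_copy.xml are hypothetical.

```python
import subprocess

def build_pcs_command(command, *args, config_file=None):
    """Build the argument list for 'pcs [-f file] command [args...]'."""
    argv = ["pcs"]
    if config_file is not None:
        argv += ["-f", config_file]
    argv.append(command)
    argv.extend(args)
    return argv

def run_pcs(command, *args, config_file=None):
    """Run pcs and return its standard output; raises if pcs exits non-zero."""
    result = subprocess.run(
        build_pcs_command(command, *args, config_file=config_file),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(build_pcs_command("status"))
    print(build_pcs_command("config", config_file="cib_copy.xml"))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when resource names or options contain spaces.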

Hope this helps clarify how Pacemaker fits into a high availability cluster setup.