Securing Cassandra

Data security is a major concern and is given top priority in every organization. Securing sensitive data and keeping it out of the hands of those who should not have access is challenging even in traditional database environments, let alone in a cloud-hosted database. Data should be secured in flight and at rest. In this blog, we will talk about securing data in a Cassandra database in a cloud environment, specifically on AWS. We will divide the blog into two parts:

  1. Secure Cassandra on AWS
  2. Cassandra data access security

Secure Cassandra on AWS

Cassandra is best used when hosted across multiple datacenters. Hosting it on the cloud across multiple datacenters reduces cost considerably and gives you the peace of mind of knowing that you can survive regional outages. However, securing the cloud infrastructure is the most fundamental activity that needs to be carried out when hosting on the cloud.

Securing Ports

Securing ports and unknown host access is the foremost thing to address when hosting on the cloud. Cassandra needs the following ports to be opened on your firewall for a multi-node cluster; otherwise the nodes will act as standalone clusters.

Public ports

Port Number Description
22 SSH port

 

Create a Security Group with default rule as SSH traffic allowed on port 22 (both inbound and outbound).

  1. Click ‘ADD RULE’ (both inbound and outbound)
  2. Choose ‘SSH’ from the ‘Type’ dropdown
  3. Enter only the allowed IPs in the ‘Source’ (inbound) / ‘Destination’ (outbound) field.

Private – Cassandra inter node ports

Ports used by the Cassandra cluster for inter-node communication must be restricted so that they communicate only within the cluster, blocking traffic from and to external resources.

Port Number Description
7000 Inter node communication without SSL encryption enabled
7001 Inter node communication with SSL encryption enabled
7199 Cassandra JMX monitoring port
5599 Private port for DSEFS inter-node communication

 

To configure inter-node communication ports in a Security Group:

  1. Click ‘ADD RULE’.
  2. Choose ‘Custom TCP Rule’ from the ‘Type’ dropdown.
  3. Enter the port number in the ‘Port Range’ column.
  4. Choose ‘Custom’ from the ‘Source’ (inbound) / ‘Destination’ (outbound) dropdown and enter the same Security Group ID as the value. This allows communication over the configured port only within the cluster, once this Security Group is attached to all the nodes in the Cassandra cluster.

Public – Cassandra client ports

The following ports need to be secured and opened only for the clients that will be connecting to our cluster.

Port Number Description
9042 Client port (CQL native transport); also carries SSL-encrypted traffic when client encryption is enabled
9142 Client port used when both encrypted and unencrypted connections are required
9160 DSE client port (Thrift)

 

To configure public ports in a Security Group:

  1. Click ‘ADD RULE’.
  2. Choose ‘Custom TCP Rule’ from the ‘Type’ dropdown.
  3. Enter the port number in the ‘Port Range’ column.
  4. Choose ‘Anywhere’ from the ‘Source’ (inbound) / ‘Destination’ (outbound).

To restrict the public ports to certain known IPs or an IP range, replace step 4 with:

  4. Choose ‘Custom’ from the ‘Source’ (inbound) / ‘Destination’ (outbound) dropdown and provide the IP value or CIDR block corresponding to the IP range.

Now that we have configured the firewall, our VMs are secured against unknown access. It is recommended to create Cassandra clusters in a private subnet within your VPC which does not have Internet access.

Create a NAT instance in a public subnet, or configure a NAT Gateway, that can route traffic from the Cassandra cluster in the private subnet for software updates.

Cassandra Data Access Security

Securing data involves the following areas:

  1. Node to node communication
  2. Client to node communication
  3. Encryption at rest
  4. Authentication and authorization

Node to Node and Client to Node Communication Encryption

Cassandra is a masterless database. The masterless design offers no single point of failure for any database process or function. Every node in Cassandra is the same; reads and writes can be served by any node for any query on the database. So, there is a lot of data transfer between the nodes of the cluster. When the database is hosted on a public cloud network, this communication needs to be secured. Likewise, the data transferred between the database and the client on the public network is always at risk. To secure the data in flight in these scenarios, encrypting the data and sending it over SSL is the widely preferred approach.

Most developers are not exposed to encryption in their day-to-day work, and setting up an encryption layer is always a tedious process. Cassandra helps here with a built-in feature. All we need to do is enable the server_encryption_options: and client_encryption_options: sections in the cassandra.yaml file and provide the required certificates and keys. Cassandra takes care of encrypting the data during node-to-node and client-to-server communication.

Additionally, Cassandra supports Client Certificate Authentication. Imagine talking to a Cassandra cluster without it: if the cluster only expects an SSL key, anyone could write a program that attaches to the cluster, executes arbitrary commands, listens to writes on arbitrary token ranges, or even creates an admin account in the system_auth table.

To avoid this, Cassandra uses Client Certificate Authentication. With this approach Cassandra takes the extra step of verifying the client against a local truststore. If it does not recognize the client's certificate, it will not accept the connection. This additional verification can be enabled by setting require_client_auth: true in the cassandra.yaml configuration file.

In the rest of the blog we will see the step-by-step process of enabling and configuring the cluster for SSL connections. If you already have a certificate, you can skip Generating Certificates using OpenSSL.

Generating Certificates using OpenSSL

Most UNIX systems have the OpenSSL tool installed. If it is not available, install OpenSSL before proceeding further.

Steps:

  1. Create a configuration file gen_ca_cert.conf with the below configurations.

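A representative gen_ca_cert.conf looks like the following (all distinguished-name values and the password are placeholders to replace with your own):

[ req ]
distinguished_name = req_distinguished_name
prompt             = no
output_password    = mypass
default_bits       = 2048

[ req_distinguished_name ]
C            = US
ST           = TX
L            = Dallas
O            = YourCompany
OU           = YourCompanyCassandra
CN           = YourCompanyClusterCA
emailAddress = admin@yourcompany.com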

2. Run the following OpenSSL command to create the CA:
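The command is along these lines (the key and certificate file names and validity period are placeholders):

$ openssl req -config gen_ca_cert.conf -new -x509 -keyout ca-key -out ca-cert -days 365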

3. You can verify the contents of the certificate you just created with the following command:
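Something like:

$ openssl x509 -in ca-cert -text -noout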

You can generate a certificate for each node if required, but doing so is not recommended, because it is very tough to maintain a separate key for each node. Imagine: when a new node is added to the cluster, its certificate needs to be added to all the other nodes, which is a tedious process. So, we recommend using the same certificate for all the nodes. The following steps will help you do that.

Building Keystore

I will explain the keystore building for a 3-node cluster; the same can be followed for an n-node cluster.

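The keytool commands below are a representative sketch; the aliases, file names and passwords are placeholders to adapt (repeat for node2 and node3):

$ keytool -genkeypair -keyalg RSA -alias node1 -validity 365 -keysize 2048 \
    -keystore node1-keystore.jks -storepass mykeypass -keypass mykeypass \
    -dname "CN=node1, OU=cassandra, O=YourCompany, C=US"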

To verify that the keystore is generated with the correct key pair information and is accessible, execute the below command:

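For example (same placeholder names as above):

$ keytool -list -v -keystore node1-keystore.jks -storepass mykeypass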

With our key stores created and populated, we now need to export a certificate from each node’s key store as a “Signing Request” for our CA:

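Along these lines, once per node (file names are placeholders):

$ keytool -certreq -keystore node1-keystore.jks -alias node1 \
    -file node1_cert_sr -storepass mykeypass -keypass mykeypass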

With the certificate signing requests ready to go, it’s now time to sign each with our CA’s public key via OpenSSL:

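A representative signing command, using the CA files created earlier (the CA password is a placeholder):

$ openssl x509 -req -CA ca-cert -CAkey ca-key -in node1_cert_sr \
    -out node1_cert_signed -days 365 -CAcreateserial -passin pass:mypass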

Add the CA certificate and the signed node certificate into each node's keystore via the -import subcommand of keytool.

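For example, importing the CA root first and then the signed certificate (placeholder names again):

$ keytool -import -keystore node1-keystore.jks -alias CARoot \
    -file ca-cert -noprompt -storepass mykeypass -keypass mykeypass
$ keytool -import -keystore node1-keystore.jks -alias node1 \
    -file node1_cert_signed -storepass mykeypass -keypass mykeypass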

Building Trust Store

Since Cassandra uses Client Certificate Authentication, we need to add a trust store to each node. This is how each node will verify incoming connections from the rest of the cluster.

We need to create the trust store by importing the CA root certificate's public key:

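A representative command (file name and password are placeholders):

$ keytool -import -keystore server-truststore.jks -alias CARoot \
    -file ca-cert -noprompt -storepass truststorepass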

Since all our instance-specific keys have now been signed by the CA, we can share this trust store instance across the cluster.

Configuring the Cluster

After creating all the required files, you can keep the keystore and truststore files in /usr/local/lib/cassandra/conf/ or any directory of your choice, but make sure that the Cassandra daemon has access to the directory. With the below configuration in the cassandra.yaml file, the inbound and outbound requests will be encrypted.

Enable Node to Node Encryption

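A representative cassandra.yaml fragment (paths and passwords are placeholders matching the files created above):

server_encryption_options:
    internode_encryption: all
    keystore: /usr/local/lib/cassandra/conf/node1-keystore.jks
    keystore_password: mykeypass
    truststore: /usr/local/lib/cassandra/conf/server-truststore.jks
    truststore_password: truststorepass
    require_client_auth: true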

Enable Client to Node Encryption

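And similarly for the client side (again, paths and passwords are placeholders):

client_encryption_options:
    enabled: true
    keystore: /usr/local/lib/cassandra/conf/node1-keystore.jks
    keystore_password: mykeypass
    truststore: /usr/local/lib/cassandra/conf/server-truststore.jks
    truststore_password: truststorepass
    require_client_auth: true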

Repeat the above process on all the nodes of the cluster, and your cluster's data is secured in flight and from unknown access.

Author Credits: This article was written by Bharathiraja S, Senior Data Engineer at 8KMiles Software Services.

Cassandra Backup and Restore Methods

Cassandra is a distributed database management system. In Cassandra, data is replicated among multiple nodes across multiple data centers. Cassandra can survive without any interruption in service when one or more nodes are down. It keeps its data in SSTable files. SSTables are stored in the keyspace directory within the data directory path specified by the ‘data_file_directories’ parameter in the cassandra.yaml file. By default, the SSTable directory path is /var/lib/cassandra/data/<keyspace_name>. However, Cassandra backups are still necessary to recover from the following scenarios:

  1. Any errors made in data by client applications
  2. Accidental deletions
  3. Catastrophic failures that require you to rebuild your entire cluster
  4. Data corruption
  5. The need to roll back the cluster to a known good state
  6. Disk failures

Cassandra Backup Methods

Cassandra provides two types of backup: snapshot-based backup and incremental backup.

Snapshot Based Backup

Cassandra provides the nodetool utility, a command line interface for managing a cluster. The nodetool utility offers a useful command for creating snapshots of the data. The nodetool snapshot command flushes memtables to disk and creates a snapshot by creating hard links to the SSTables (SSTables are immutable). The nodetool snapshot command takes snapshots on a per-node basis. To take an entire cluster snapshot, the nodetool snapshot command should be run using a parallel ssh utility, such as pssh. Alternatively, the snapshot of each node can be taken one by one.

It is possible to take a snapshot of all keyspaces in a cluster, or certain selected keyspaces, or a single table in a keyspace. Note that you must have enough free disk space on the node for taking the snapshot of your data files.

The schema does not get backed up in this method. This must be done manually and separately.

Example:

a. All keyspaces snapshot

If you want to take a snapshot of all keyspaces on the node, run the below command.

$ nodetool snapshot

The following message appears:

Requested creating snapshot(s) for [all keyspaces] with snapshot name [1496225100] Snapshot directory: 1496225100

The snapshot directory is /var/lib/cassandra/data/keyspace_name/table_name-UUID/snapshots/1496225100

b. Single keyspace snapshot

Assume you have created the keyspace university. To take a snapshot of that keyspace with a name for the snapshot, run the below command.

$ nodetool snapshot -t 2017.05.31 university

The following output appears:

Requested creating snapshot(s) for [university] with snapshot name [2017.05.31]

Snapshot directory: 2017.05.31

c. Single table snapshot

If you want to take a snapshot of only the student table in the university keyspace, run the below command.

$ nodetool snapshot --table student university

The following message appears:

Requested creating snapshot(s) for [university] with snapshot name [1496228400]

Snapshot directory: 1496228400

After completing the snapshot, you can move the snapshot files to another location like AWS S3, Google Cloud, or MS Azure. You must back up the schema as well, because Cassandra can only restore data from a snapshot when the table schema exists.

Advantages:

  1. Snapshot-based backup is simple and much easier to manage.
  2. The Cassandra nodetool utility provides the nodetool clearsnapshot command, which removes the snapshot files.

Disadvantages:

  1. For large datasets, it may be hard to take a daily backup of the entire keyspace.
  2. It is expensive to transfer large snapshot data to a safe location like AWS S3

Incremental Backup

Cassandra also provides incremental backups. By default, incremental backup is disabled. It can be enabled by changing the value of “incremental_backups” to “true” in the cassandra.yaml file, as shown below.
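In cassandra.yaml:

incremental_backups: true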

Once enabled, Cassandra creates a hard link to each SSTable flushed from a memtable in a backups directory under the keyspace data directory. In Cassandra, incremental backups contain only new SSTable files; they are dependent on the last snapshot created.

In the case of incremental backup, less disk space is required because it only contains links to new SSTable files generated since the last full snapshot.

Advantages:

  1. The incremental backup reduces disk space requirements.
  2. Reduces the transfer cost.

Disadvantages:

  1. Cassandra does not automatically clear incremental backup files, and there is no built-in tool to do so. If you want to remove the hard-linked files, you must write your own script for that.
  2. Creates lots of small files in the backup; file management and recovery are not trivial tasks.
  3. It is not possible to select a subset of column families for incremental backup.

Cassandra Restore Methods

Backups are meaningful when they are restorable in situations such as a keyspace getting deleted, a new cluster getting launched from the backup data, or a node being replaced. Restoring backed-up data is possible from snapshots; if you are using incremental backups, you also need all incremental backup files created after the snapshot. There are mainly two ways to restore data from backup: one using nodetool refresh and another using sstableloader.

Restore using nodetool refresh:

The nodetool refresh command loads newly placed SSTables onto the system without a restart. This method is used when a new node replaces a node that is not recoverable. Restoring data from a snapshot is possible if the table schema exists. Assuming you have created a new node, follow the below steps (a command example follows the list):

  1. Create the schema if not created already.
  2. Truncate the table, if necessary.
  3. Locate the snapshot folder (/var/lib/cassandra/data/keyspace_name/table_name-UUID/snapshots/snapshot_name) and copy the snapshot SSTable directory to the /var/lib/cassandra/data/keyspace_name/table_name-UUID directory.
  4. Run nodetool refresh.
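For example, for the student table in the university keyspace used earlier:

$ nodetool refresh -- university student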

Restore using sstableloader:

The sstableloader utility loads a set of SSTable files into a Cassandra cluster. The sstableloader provides the following options:

  1. Loading external data
  2. Loading existing SSTables
  3. Restore snapshots

The sstableloader does not simply copy the SSTables to every node; it transfers the relevant part of the data to each node while maintaining the replication factor. Here sstableloader is used to restore snapshots. Follow the below steps to restore using sstableloader.

  1. Create the schema if not exists.
  2. Truncate the table if necessary.
  3. Bring your backup data to a node from AWS S3, Google Cloud, or MS Azure. Example: download your backup data into /home/data.
  4. Run the below command:
    sstableloader -d <node_ip> /home/data

 

Author Credits: This article was written by Sebabrata Ghosh, Data Engineer at 8KMiles Software Services.

 

Cloud Boundaries Redefined in AWS Chennai Meetup on 30th April @8KMiles

AWS Chennai Meetup
“You don’t have to say everything to be a light. Sometimes a fire built on a hill will bring interested people to your campfire.” ― Shannon L. Alder

This is one of those days where the above quote is proven right. As a market leader in delivering quality Cloud solutions, 8K Miles has the habit of stretching every new service offered by different cloud service providers to explore and solve contemporary business problems. In yet another effort in that direction, we had a bunch of technical evangelists and architects gathering at 8K Miles today for the #AWSChennaiMeetup event, to discuss two broad areas of AWS architecture design.

1) The Pros and Cons of Architecting Microservices on AWS

2) Cloud Boundaries redefined: Running ~600 million jobs every month on AWS

AWS Chennai MeetUp I Session

Session 1: Pros and Cons of Architecting Microservices on AWS

This topic was discussed by Sudhir Jonathan from Real Image. Sudhir works as a consultant to Real Image, on the teams that build Moviebuff.com and Justickets.in. His history includes ThoughtWorks, his own startup, and a few personal projects. He is an avid coder, and his specialities include Ruby on Rails, Go, React, AWS, and Heroku, among others.

AWS Chennai Meetup

His valuable knowledge-sharing session covered the pros and cons of architecting microservices on AWS, also touching on automated deployment, inter-process communication using SQS and ECS, and cost reduction using Spot instances, ELB and Auto Scaling groups.

Session 2: Cloud Boundaries redefined: Running ~600 million jobs every month on AWS

In the world of cloud, “speed is everything”. To identify various security, compliance, risk and vulnerability drifts instantly in our customer environments, the 8K Miles cloud operations team runs ~600 million jobs every month. Mohan and Saravanan, technical architects at 8K Miles, shared their experience in running a distributed and fault-tolerant scheduler stack and how it has evolved.

AWS Chennai Meetup

During the event we also organized a simple tweet quiz on our handle @8KMiles for all the participants. Dwarak discussed each question in detail with all the participants.

AWS Chennai Meetup

For more detailed updates on this event, please check the hashtag #AWSChennaiMeetup and our handle @8KMiles on Twitter.
**Chennai Amazon Web Services Meetup is organized by AWS Technology Evangelists from Chennai for AWS Cloud enthusiasts. The goal is to conduct meetups often, and to share and learn the latest technology implementations on AWS: the challenges, the learnings, the limitations, etc.

8K Miles is a leading Silicon Valley based Cloud Services firm, specializing in high-performance Cloud computing, Analytics, and Identity Management solutions, and is emerging as one of the top solution providers for the IT and ITIS requirements on Cloud for the Pharma, Health Care and allied Life Sciences domains.

 

 

CloudWatch + Lambda Case 4: Control launch of Specific “C” type EC2 instances post office hours to save costs

We have a customer with predictable load volatility between 9 am and 6 pm who uses specific large EC2 instances during office hours for analysis; they use “c4.8xlarge” for that purpose. Their IT team wanted to control the launch of such a large instance class after office hours and during nights to control costs. Currently there is no way to restrict or control this action using Amazon IAM: we cannot create a complex IAM policy with conditions saying user A belonging to group A cannot launch instance type C every day between hours X and Y.

One stopgap sometimes followed is to have a job running which removes the relevant policy from an IAM user when certain time conditions are met. So basically we would have a job that calls an API to remove the policy that permits an IAM user or group to launch instances. This makes IAM policy management complex and makes it tough to assess/govern drift between versions.

After the introduction of CloudWatch Events, our cloud operations team started controlling this with Lambda functions. Whenever an instance is launched, it triggers a Lambda function; the function filters on whether it is the specific “C” type and checks the current time, and if the time falls outside office hours, it terminates the launched EC2 instance immediately.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen an AWS API Call (reported via AWS CloudTrail) as the event, with a Lambda function as the target.

CloudWatch Events Lambda EC2

The next step is to configure the rule details with the rule definition.

CloudWatch Events Lambda EC2

Finally, we will review the Rules Summary

CloudWatch Events Lambda EC2

Amazon Lambda Function Code Snippet (Python)
import boto3

def lambda_handler(event, context):
    # Triggered by a CloudWatch Events rule on the RunInstances API call
    # (delivered through CloudTrail).
    # print("Received event:", event)

    ec2_client = boto3.client("ec2")

    print("Event Region:", event["region"])

    event_time = event["detail"]["eventTime"]  # e.g. "2016-11-04T19:02:31Z"
    print("Event Time:", event_time)

    # Extract the hour (UTC) from the ISO-8601 timestamp
    hour = int(event_time.split("T")[1].split(":")[0])

    instance_type = event["detail"]["requestParameters"]["instanceType"]
    print("Instance Type:", instance_type)

    instance_id = event["detail"]["responseElements"]["instancesSet"]["items"][0]["instanceId"]
    print("Instance Id:", instance_id)

    # Terminate "c" family instances launched outside office hours (9 am - 6 pm)
    if instance_type.startswith("c") and (hour >= 18 or hour < 9):
        print(ec2_client.terminate_instances(InstanceIds=[instance_id]))

GitHub Gist URL:  https://github.com/cloud-automaton/automaton/blob/master/aws/events/TerminateAWSEC2.py

This post was co-authored with Priya and Ramprasad of 8KMiles.

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 3 – Controlling cross-region EBS/RDS snapshot copies for regulated industries

If you are part of a regulated industry like pharmaceuticals/life sciences/BFSI running mission-critical applications on AWS, at times, as part of compliance requirements, you will have to restrict/control data movement to a particular geographic region in the cloud. This sometimes becomes complex to restrict. Let us explore in detail:

We all know there are a variety of ways to move data from one AWS region to another, but one commonly used method is snapshot copy across AWS regions. Usually you can restrict the snapshot copy permission in an IAM policy, but what if you need the permission enabled for moving data between AWS accounts inside a region, yet still want to control the EBS/RDS snapshot copy action across regions? It can only be mitigated by automatically deleting the snapshot in the destination AWS region whenever a snapshot copy activity is detected.

Our cloud operations team used to either remove this permission in IAM altogether or monitor this activity using polling scripts for customers with multiple accounts who need this permission and still need control. Now, after the introduction of CloudWatch Events, we have configured a rule that points to an AWS Lambda function which gets triggered in near real time when a snapshot is copied to the destination AWS region. The Lambda function initiates the deletion process immediately. Though it is reactive, it is incomparably faster than manual intervention.

In this use case, an Amazon CloudWatch Events rule will identify EBS snapshot copies across regions and delete them.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen an AWS API Call (reported via AWS CloudTrail) as the event, with a Lambda function as the target.

CloudWatch Events Lambda

The next step is to configure the rule details with the rule definition.

CloudWatch Events Lambda

Finally, we will review the Rules Summary

CloudWatch Events Lambda

Amazon Lambda Function Code Snippet (Python)

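The full function is available in the GitHub Gist linked below. As an illustration only, here is a minimal sketch of the idea, assuming the rule matches the CopySnapshot API call and that the CloudTrail event carries the new snapshot ID in responseElements (names and structure may differ from the actual Gist):

import boto3

def lambda_handler(event, context):
    # Fires via a CloudWatch Events rule on the CopySnapshot API call
    # recorded by CloudTrail in the destination region.
    detail = event['detail']
    if detail['eventName'] != 'CopySnapshot':
        return
    region = event['region']  # destination region of the copy
    snapshot_id = detail['responseElements']['snapshotId']
    ec2 = boto3.client('ec2', region_name=region)
    print('Deleting snapshot %s copied into %s' % (snapshot_id, region))
    ec2.delete_snapshot(SnapshotId=snapshot_id)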

GitHub Gist URL: https://github.com/cloud-automaton/automaton/blob/master/aws/events/AWSSnapShotCopy.py


This post was co-authored with Muthukumar and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 2 – Keeping watch on AWS root user activity: normal or anomaly?

As a best practice, you should never use your AWS root account credentials to access AWS. Instead, create individual (IAM) users for anyone who needs access to your AWS account. This allows you to give each IAM user a unique set of security credentials and grant different permissions to each user. Example: create an IAM user for yourself as well, give that user administrative privileges, use that IAM user for all your work, and never share your credentials with anyone else.

Usually root has full access, and it is not ideal to restrict it in AWS IAM. Imagine you suddenly suspect some anomalous/suspicious activities performed as the root user (using EC2 APIs etc.) in your logs, beyond normal IAM user provisioning; this could be because the root user is compromised or coerced, but ultimately it is a deviation from the best practice.

In the past we used to poll the CloudTrail logs using programs and differentiate between “root” and “Root”, and our cloud operations team used to react to these anomalous behaviors. Now we can inform cloud operations and customer stakeholders in near real time using CloudWatch Events.

In this use case, an Amazon CloudWatch Events rule will identify any activities performed by the AWS root user, and notifications will be sent to SNS through AWS Lambda.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen an AWS API Call (reported via AWS CloudTrail) as the event, with a Lambda function as the target. The Lambda function detects whether the event was triggered by the root user and notifies through SNS.

CloudWatch Events Lambda Root Activity Tracking

The next step is to configure the rule details with the rule definition.

CloudWatch Events Lambda Root Activity Tracking

Finally, we will review the Rules Summary

CloudWatch Events Lambda Root Activity Tracking

Amazon Lambda Function Code Snippet (Python)

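The full function is available in the GitHub Gist linked below. As an illustration only, here is a minimal sketch of the idea, assuming the CloudTrail event carries the caller in detail.userIdentity and using a hypothetical SNS topic ARN:

import json
import boto3

# Hypothetical SNS topic for operations alerts
SNS_TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:root-activity-alerts'

def lambda_handler(event, context):
    # Fires via a CloudWatch Events rule on AWS API calls recorded by CloudTrail.
    if event['detail']['userIdentity'].get('type') == 'Root':
        boto3.client('sns').publish(
            TopicArn=SNS_TOPIC_ARN,
            Subject='Root user activity detected',
            Message=json.dumps(event['detail'], indent=2)
        )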

GitHub Gist URL:

https://github.com/cloud-automaton/automaton/blob/master/aws/events/TrackAWSRootActivity.py

This post was co-authored with Saravanan and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 1 – Avoiding malicious CloudTrail actions in your AWS account

As many of you know, AWS CloudTrail provides visibility into API activity in your AWS account. CloudTrail logging lets you see which actions users have taken and which resources have been used, along with details such as the time and date of actions and the actions that failed because of inadequate permissions. It enables you to answer important questions such as which user made an API call or which resources were acted upon in an API call. If a user disables CloudTrail logging, accidentally or with malicious intent, audit logging events will not be captured and you fail to have proper governance in place. The situation gets more complex if the user disables and then re-enables CloudTrail for a brief period of time, during which important activities can go unlogged and unaudited. In short, once CloudTrail logging is enabled it should not be disabled, and this action needs to be defended in depth.

Our cloud operations team had earlier written a program that periodically scans the CloudTrail log entries; if any log activity was missing after a period of X time, it alerted operations. The overall reaction time of our cloud operations was >15-20 minutes to mitigate this CloudTrail disable action.

Now, after the introduction of CloudWatch Events, we have configured a rule that points to an AWS Lambda function as the target. This function gets triggered in near real time when CloudTrail is disabled and automatically enables it back without any manual interaction from cloud operations. An advanced version of the program triggers a workflow which logs entries into a ticketing system as well. This event model has helped us reduce the mitigation time to less than a minute.
We have illustrated below the detailed steps on how to configure this event. We have also given the link to the GitHub repository with the basic AWS Lambda Python code that can be used by your cloud operations team.

In this use case, an Amazon CloudWatch Events rule will identify whether CloudTrail logging has been stopped in the AWS account and, if so, take corrective action by enabling it again.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen an AWS API Call (reported via AWS CloudTrail) as the event, with a Lambda function as the target.

CloudWatch Events Lambda CloudTrail

The next step is to configure the rule details with the rule definition.

CloudWatch Events Lambda CloudTrail

Finally, we will review the Rules Summary

CloudWatch Events Lambda CloudTrail

Amazon Lambda Function Code Snippet (Python)
import boto3

print('Loading function')

def lambda_handler(event, context):
    """Re-enable CloudTrail logging when a StopLogging call is detected."""
    client = boto3.client('cloudtrail')
    if event['detail']['eventName'] == 'StopLogging':
        # Turn logging back on for the trail that was stopped
        response = client.start_logging(Name=event['detail']['requestParameters']['name'])
        print(response)

 

GitHub Gist URL:

This post was co-authored with Mohan and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

Apache Solr to Amazon CloudSearch Migration Tool

In this post, we introduce a new tool called S2C – Apache Solr to Amazon CloudSearch Migration Tool. S2C is a Linux console-based utility that helps developers/engineers migrate a search index from Apache Solr to Amazon CloudSearch.

Very often customers initially build search for their website or application on top of Solr, but later run into challenges like elastic scaling and managing the Solr servers. This is a typical scenario we have observed in our years of search implementation experience. For such use cases, Amazon CloudSearch is a good choice. Amazon CloudSearch is a fully-managed service in the cloud that makes it easy to set up, manage, and scale a search solution for your website. To know more, please read the Amazon CloudSearch documentation.

We are seeing a growing trend every year: organizations of various sizes are migrating their workloads to Amazon CloudSearch and leveraging the benefits of a fully managed service. For example, Measured Search, an analytics and e-Commerce platform vendor, found it easier to migrate to Amazon CloudSearch rather than scale Solr themselves (see article for details).

Since Amazon CloudSearch is built on top of Solr, it exposes all the key features of Solr while providing the benefits of a fully managed service in the cloud such as auto-scaling, self-healing clusters, high availability, data durability, security and monitoring.

In this post, we provide step-by-step instructions on how to use the Apache Solr to Amazon CloudSearch Migration (S2C) tool to migrate from Apache Solr to Amazon CloudSearch.

Before we get into detail, you can download the S2C tool from the link below.
Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

Pre-Requisites

Before starting the migration, the following pre-requisites have to be met. The pre-requisites include installations and configuration on the migration server. The migration server could be the Solr server itself or an independent server that sits between your Solr server and your Amazon CloudSearch instance.

Note: We recommend running the migration from the Solr server instead of an independent server, as it can save time and bandwidth. It is much better if the Solr server is hosted on EC2, as the latency between EC2 and CloudSearch is relatively low.

The following installations and configuration should be done on the migration server (i.e. your Solr server or any new independent server that connects between your Solr machine and Amazon CloudSearch).

  1. The application is developed using Java. Download and install Java 8. Validate the JDK path and ensure the environment variables like JAVA_HOME, classpath, and path are set correctly.
  2. We assume you have already set up an Amazon Web Services IAM account. Please ensure the IAM user has the right permissions to access AWS services like CloudSearch.
    Note: If you do not have an AWS IAM account with the above-mentioned permissions, you cannot proceed further.
  3. The IAM user should have an AWS access key and secret key. On the application hosting server, set up the AWS environment variables for the access key and secret key. It is important that the tool runs using these AWS environment variables.
    To set up the AWS environment variables, please read the links below:
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/credentials.html
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-roles.html
    Alternatively, you can set the following AWS environment variables by running the commands below from a Linux console.
    export AWS_ACCESS_KEY=Access Key
    export AWS_SECRET_KEY=Secret Key
  4. Note: This step is applicable only if the migration server is hosted on Amazon EC2.
    If you do not have an AWS access key and secret key, you can opt for an IAM role attached to an EC2 instance. A new IAM role can be created and attached to EC2 during the instance launch. The IAM role should have access to Amazon CloudSearch.
    For more information, read the below link:
    http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
  5. Download the migration utility ‘S2C’ (you would have completed this step earlier), unzip the tool and copy it into your working directory.
    Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

S2C Utility File
The downloaded ‘S2C’ migration utility should have the following sub-directories and files.

Folder / Files – Description

bin – Binaries of the migration tool
lib – Libraries required for migration
application.conf – Configuration file that allows end users to input parameters. Requires end-user input.
logback.xml – Log file configuration. Optional; does not require end-user / developer input.
s2c – Script file that executes the migration process

Configure only application.conf and logback.xml. Do not modify any other file.
application.conf: The application.conf file has the configuration related to the new Amazon CloudSearch domain that will be created. The parameters configured in the application.conf file are explained below.

s2c {
  api {
    SchemaParser = "s2c.impl.solr.DefaultSchemaParser"
    SchemaConverter = "s2c.impl.cs.DefaultSchemaConverter"
    DataFetcher = "s2c.impl.solr.DefaultDataFetcher"
    DataPusher = "s2c.impl.cs.DefaultDataPusher"
  }

List of APIs that are executed step by step during the migration. Do not change this.

  solr {
    dir = "files"
    server-url = "http://localhost:8983/solr/collection1"
    fetch-limit = 100
  }

dir – The base directory path of Solr. Ensure the directory is present and valid. E.g.: /opt/solr/example/solr/collection1/conf
server-url – Server host, port and collection path. This endpoint is used to fetch the data. If the utility is run from a different server, ensure the IP address and port have firewall access.
fetch-limit – The number of Solr documents fetched in each batch call. This number should be set carefully by the developer. The fetch limit depends on the following factors:

  1. Record size of a Solr record (1KB or 2KB)
  2. Latency between the migration server and Amazon CloudSearch
  3. Current request load on the Solr server

E.g.: If there are 100000 Solr documents in total and the fetch limit is 100, it takes 100000 / 100 = 1000 batch calls to complete the fetch. If the size of each Solr record is 2KB, then 100000 * 2KB = 200MB of data is migrated.

  cs {
    domain = "collection1"
    region = "us-east-1"
    instance-type = "search.m3.xlarge"
    partition-count = 1
    replication-count = 1
  }

domain – CloudSearch domain name. Ensure that the domain name does not already exist.
region – AWS region for the new CloudSearch domain.
instance-type – Desired instance type for CloudSearch nodes. Choose the instance type based on the volume of data and the expected query volume.
partition-count – Number of partitions required for CloudSearch.
replication-count – Replication count for CloudSearch.

  wd = "/tmp"

wd – Temporary file path to store intermediate data files and migration log files.

Running the migration

Before launching the S2C migration tool, verify the following:

    • Solr directory path – Make sure that the Solr directory path is valid and available. The tool cannot read the configuration if the path or directory is invalid.
    • Solr configuration contents – Validate that the Solr configuration contents are correctly set inside the directory. Example: solrconfig.xml, schema.xml, stopwords.txt, etc.
    • Make sure that the working directory is present in the file system and has write permissions for the current user. It can be an existing directory or a new directory. The working directory stores the fetched data from Solr and migration logs.
    • Validate the disk size before starting the migration. If the available free disk space is less than the size of the Solr index, the fetch operations will fail.

For example, if the Solr index size is 7 GB, make sure that the disk has at least 10 GB to 20 GB of free space.
Note: The tool reads the data from Solr and stores it in a temporary directory (please see the configuration wd = /tmp in the above table).

  • Verify that the AWS environment variables are set correctly. The AWS environment variables are mentioned in the pre-requisites section above.
  • Validate the firewall rules for IP address and ports if the migration tool is run from a different server or instance. Example: Solr default port 8983 should be opened to the EC2 instance executing this tool.

Run the following command from the directory ‘{S2C filepath}’.
Example: /build/install/s2c-cli

./s2c
or
JVM_OPTS="-Xms2048m -Xmx2048m" ./s2c (with heap size)

The above invokes the ‘s2c’ shell script, which starts the search migration process. The migration process is a series of steps that require user input, as shown in the screenshots below.
Step 1: Parse the Solr schema. The first step of the migration prompts for confirmation to parse the Solr schema and Solr configuration file. During this step, the application generates a ‘Run Id’ folder inside the working directory.
  Example: /tmp/s2c/m1416220194655

The Run Id is a unique identifier for each migration. Note down the Run Id as you will need it to resume the migration in case of any failures.

Step 2: Schema conversion from Solr to CloudSearch. The second step prompts for confirmation to convert the Solr schema to a CloudSearch schema. Press any key to proceed further.

The second step will also list all the converted fields which are ready to be migrated from Solr to CloudSearch. If any fields are left out, this step allows you to correct the original schema: you can abort the migration, identify the ignored fields, rectify the schema and re-run the migration. The below screenshot shows the fields ready for CloudSearch migration.


Step 3: Data fetch. The third step prompts for confirmation to fetch the search index data from the Solr server. Press any key to proceed. This step will generate a temporary file, stored in the working directory, which will hold all the fetched documents from the Solr index.


There is also an option to skip the fetch process if all the Solr data is already stored in the temporary file. If this is the case, the prompt will look like the screenshot below.

Step 4: Data push to CloudSearch. The last and final step prompts for confirmation to push the search data from the temporary file store to Amazon CloudSearch. This step also creates the CloudSearch domain with the configuration specified in application.conf, including the desired instance type, replication count, and multi-AZ options.

If the domain is already created, the utility will prompt to use the existing domain. If you do not wish to use an existing domain, you can create a new CloudSearch domain from the same prompt.
Note: The console does not prompt for a ‘CloudSearch domain name’; instead it uses the domain name configured in the application.conf file.

Step 5: Resume (optional). If there is any failure during the fetch operation, the migration can be resumed. This is illustrated in the screenshot below.

Step 6: Verification. Log into the AWS CloudSearch management console to verify the domain and index fields.

Amazon CloudSearch allows running test queries to validate the migration as well as the functionality of your application.

Limitations and notes

  • Support for non-Linux environments is not available for now.
  • Support for Solr shards is not available for now. Solr shards need to be migrated separately.
  • Install commands may vary across Linux flavors. For example, commands for installing software, editing files, and setting permissions can differ between flavors; it is left to the engineering team to choose the right commands during the installation and execution of this migration tool.
  • Only fields configured as ‘stored’ in the Solr schema.xml are supported. Non-stored fields are ignored during schema parsing.
  • The document id (unique key) is required to have the following attributes:
    1. The document ID should be 128 characters or less in size.
    2. The document ID can contain any letter, any number, and any of the following characters: _ - = # ; : / ? @ &
    3. The below link will help you understand data preparation before migrating to CloudSearch: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/preparing-data.html
  • If these conditions are not met by a document, it will be skipped during migration. Skipped records are shown in the log file.
  • If a field type (mapped to fields) is not stored, the stopwords mapped to that particular field type are ignored.

Example 1:

<field name="description" type="text_general" indexed="true" stored="true" />

Note: The above field ‘description’ will be considered for stopwords.

Example 2:

<field name="fileName" type="string" />

Note: The above field ‘fileName’ will not be migrated, and it is ignored for stopwords.

Please do write your feedback and suggestions in the comments section below to improve this tool. The source code of the tool can be downloaded at https://github.com/8KMiles/s2c/. We have written a follow-up post in that regard.

About the Authors

Dhamodharan P is a Senior Cloud Architect at 8KMiles.

Dwarakanath R is a Principal Architect at 8KMiles.

Taking Your e-Commerce to the Cloud – Strategies to Optimize for High Traffic Demands

This whitepaper will help you build a strategy to address high-demand e-Commerce challenges, such as holiday seasons and product launches, using Amazon Web Services. You'll learn how to identify the right AWS components for your e-commerce application, take advantage of practical use cases, and prepare for traffic surges.
Read more: e-Commerce Challenges and Strategies

Loading Big Index Data into newly launched Amazon CloudSearch engine

The search tier is the most critical section of many online verticals like travel, e-commerce, classifieds, etc. If users cannot search products efficiently, they will not make their buying decisions properly, which in turn massively affects the revenues of these companies. Search tiers are usually powered by Apache Solr, FAST, Autonomy, Elasticsearch, etc. AWS also has a search service called CloudSearch, a fully managed service in the cloud that makes it easy to set up, manage, and scale a search solution for your website. Amazon CloudSearch relieves you of the worry of hardware provisioning, setup, and maintenance. As your volume of data and traffic fluctuates, Amazon CloudSearch automatically scales to meet your needs.

In AWS infrastructure, Apache Solr has been the king and the software to beat till now; recently it has got a heavy competitor in the form of Amazon CloudSearch – API 2013-01-01.

API version 2013-01-01 of Amazon CloudSearch is internally powered by a customized version of the Apache Solr engine, and it is specifically designed for running highly scalable and available search on the Amazon Web Services cloud. This 2013 CloudSearch API has lots of similarities with Apache Solr, and customers can easily migrate to this version and leverage the benefits of Amazon cloud infrastructure. We are already hearing that many AWS customers are planning their migration from FAST, Solr and the A9 engine to the Amazon CloudSearch 2013-01-01 API engine.

My team is already migrating a couple of customers to this Amazon CloudSearch 2013-01-01 API, and I have shared our experience of this process for the benefit of the AWS community.

Reference Migration Architecture and requirements:


In this article I am going to explore how to:

  • Migrate a 300+ GB index containing close to 247+ million records distributed across 105 searchable fields in a highly scalable/parallel manner on AWS infrastructure.
  • The 300+ GB index file is stored in Amazon S3.
  • A custom data loader program built on Amazon Elastic MapReduce is used for parallel loading.
  • Around ~6 search.m2.2xlarge instances are created, with 2 partitions and a replication count of 5.
  • Around 10+ m1.large EMR core nodes are used for data loading. This loader can be scaled to hundreds of nodes depending upon the volume and velocity of the data pump required.
  • Amazon CloudSearch infrastructure provisioning, automated partitioning, and replication count are handled by AWS.

Let's get into the details below:

Step 1) Create a new Amazon CloudSearch domain: We have named the search domain “bigdatasearch” and chose the search instance type search.m2.2xlarge. Since we are planning to pump and query a 300 GB index with millions of documents, it did not make sense for us to choose a smaller Amazon CloudSearch instance type. Usually the base instance type can be selected based on the number and size of the documents you are planning to maintain in Amazon CloudSearch.
Note: Here we have chosen a replication count of 5. This is a little strange in a distributed architecture, because usually a higher replication count for the master decreases the speed of document upload. But when we were playing with Amazon CloudSearch we observed that it increased the speed of uploads. In addition we also observed the following:

  • If we keep the replication count at 0 or low and use a smaller search instance type while pumping documents in parallel from multiple nodes, either the Amazon CloudSearch server fails sometimes or error rates are high.
  • If we keep the replication count at 0 or low and use a larger search instance type while pumping documents in parallel from multiple nodes, Amazon CloudSearch internally creates 3-5 nodes itself, and this shows in the replication count. We are waiting to discuss this behavior with AWS SA folks.

We will be utilizing a distributed uploading technique which we custom built using Amazon Elastic MapReduce to pump data to the Amazon CloudSearch server. This technique enables us to write more index data in parallel.

Step 2) Select how you would like to create the Amazon CloudSearch schema: Here we have chosen manual setup, since we already have a schema to be migrated to Amazon CloudSearch.

The next step is to add index fields to create your Amazon CloudSearch schema configuration.

Step 3) Adding Amazon CloudSearch index fields: Once all the fields have been configured in the schema, click on the continue button. In the schema file used, we have 100+ fields to be indexed for this particular search domain.
Step 4) Review the setup configurations and launch:
We have 100+ index fields with scaling options, instance type m2.2xlarge, and replication count 5 in the “bigdatasearch” domain.
Step 5) Wait till the Amazon CloudSearch infrastructure is provisioned for you at the back end. Usually it takes around 10 minutes; any errors encountered while creating the index fields will also be listed.
Once the Amazon CloudSearch infrastructure is provisioned at the back end, you should notice the “bigdatasearch” domain is “Active”. The search and document endpoints are published, and currently the number of searchable documents is “0”. There is only 1 CloudSearch index partition (shard) and 5 search.m2.2xlarge instances.
Step 6) Configuring synonyms: We have 2+ MB of synonyms which need to be configured in the Amazon CloudSearch domain. For this, we used the CloudSearch CLI toolkit to upload synonyms to CloudSearch:
cs-configure-analysis-scheme -d bigdatasearch --name customanalysisscheme --lang en -e cloudsearch.ap-southeast-1.amazonaws.com --synonyms customsynonyms.txt
Since the volume of index data is huge (300+ GB), we created a custom data loader built on Amazon Elastic MapReduce to pump the data in parallel into Amazon CloudSearch. Since it is built on Amazon Elastic MapReduce, we can use the same program without modification at scale to upload TBs of index data into the search system with hundreds of data loader EMR core/task nodes.
Step 7) Create the Amazon Elastic MapReduce data loader cluster configuration:
Step 8) Configure the Elastic MapReduce (EMR) capacity: We are using 10 m1.large core node instances for uploading the data from inside the AWS VPC. Depending upon the data size (GB to TB) and the upload window, we can increase the EMR core node capacity and count to speed up the data pump (upload) process.

To know more about how Spot instances can save cost on Amazon EMR, refer to: AWS Cost Saving Tip 12: Add Spot Instances with Amazon EMR.

Step 9) Add the custom data loader program JAR to EMR:
We exported the data from an MSSQL server as a flat UTF-8 dump file and stored it in Amazon S3. We are giving the 300+ GB dump file as the input for the Amazon EMR CloudSearch data loader program to upload into Amazon CloudSearch in parallel. Bucket configurations for the data loader JAR, input, output and log files are configured on this screen.

Step 10) Configure Amazon CloudSearch access policies: We need to open the CloudSearch security group access policies to accept upload requests from the EMR cluster inside the VPC. Configure the static IPs of all the instances or the IP range of the data loader clients.
Step 11) Run the Amazon Elastic MapReduce data loader job:
Step 12) Analyzing the Amazon EMR data loader job output:
The output of the job can be seen in the AWS EMR job logs. Here are a few details:
  • “Map output records” in the log tells how many records were inserted into Amazon CloudSearch; we can observe that close to 247,681,520 documents (247+ million) were pumped.
  • “Bytes Read” in the output tells the size of the data set which the job has read. We can observe 322387978332 bytes, which is equivalent to the 300+ GB of index in Amazon CloudSearch.
  • The entire pumping process took ~30 hours with 10 m1.large core nodes for us. We observed that increasing the number or capacity of the data loader EMR nodes improves the upload speed drastically.
Step 13) Clean up: Reset the replication count to the level of HA needed, ideally 1-2 nodes. Once the job is completed, revert the security access policies in Amazon CloudSearch. Terminate the EMR cluster and clean up any leftover resources.

Step 14) Analyzing the CloudSearch dashboard:
We observed that it takes some time for CloudSearch to reflect the actual count of the indexed documents.

After pumping the 300+ GB index, you can observe that 2 Amazon CloudSearch partitions (shards) are used to distribute the 247+ million documents with 100+ index fields. This is a tremendous cost saving compared to the A9-powered Amazon CloudSearch. Amazon CloudSearch has automatically created shards based on the volume of data pumped into the system. This is cool!!! It reduces the maintenance headache of the infra admins. If the Amazon CloudSearch team can make this partition concept a configurable parameter in the future, it will be useful.
Step 15) Executing sample search queries: We executed some sample product search queries on the “bigdatasearch” domain to check whether everything was fine. A distributed query was fired, and results came back sub-second from one of the partitions.
In short, it is cost-effective compared to the old A9-powered CloudSearch. Automated scaling of replication counts for request scalability and automated scaling of partitions for data scalability relieve the infra admin's headaches, and its strong Apache Solr pedigree and long list of feature additions in the coming months will make it more interesting.
After working with this service for a few weeks, we feel it is going to become the major search service on AWS in the coming years, giving a tough fight to Apache Solr and Elasticsearch deployments on EC2.
This article was co-authored with Ankit @8KMiles.

25 Best Practice Tips for architecting your Amazon VPC

In my view, Amazon VPC is one of the most important features introduced by AWS. We have been using AWS since 2008 and Amazon VPC from the day it was introduced, and I strongly feel that customer adoption of the AWS cloud gained real momentum only after the introduction of VPC into the market.
Amazon VPC comes with lots of advantages over the limitations of the Amazon Classic cloud: static private IP addresses; Elastic Network Interfaces (it is possible to bind multiple ENIs to a single instance); internal Elastic Load Balancers; advanced network access control; the ability to set up a secure bastion host; DHCP options; predictable internal IP ranges; moving NICs and internal IPs between instances; VPN connectivity; heightened security; etc. Each of these is an interesting topic on its own, and I will discuss them in detail in the future.
Today I am sharing some of our implementation experience from working with hundreds of Amazon VPC deployments, as best practice tips for the AWS user community. You can apply the relevant ones in your existing VPC or use these points as part of your migration approach to Amazon VPC.

Practice 1) Get your Amazon VPC combination right: Select the right Amazon VPC architecture first. You need to decide the right Amazon VPC & VPN setup combination based on your current and future requirements. It is tough to modify/re-design an Amazon VPC at a later stage, so it is better to design it taking into consideration your network and expansion needs for the next ~2 years. Currently different types of Amazon VPC setups are available: public-facing VPC; public and private setup VPC; Amazon VPC with public and private subnets and hardware VPN access; Amazon VPC with private subnets and hardware VPN access; software-based VPN access; etc. Choose the one that fits where you expect to be in the next 1-2 years.

Practice 2) Choose your CIDR blocks: While designing your Amazon VPC, the CIDR block should be chosen in consideration of the number of IP addresses needed and whether you are going to establish connectivity with your data center. The allowed block size is between a /28 netmask and a /16 netmask, so an Amazon VPC can contain from 16 to 65536 IP addresses. Currently an Amazon VPC's CIDR block cannot be modified once created, so it is usually best to choose a CIDR block which has more IP addresses. Also, when you design the Amazon VPC architecture to communicate with your on-premise/data center, ensure the CIDR range used in the Amazon VPC does not overlap or conflict with the CIDR blocks in your on-premise/data center. Note: if you use the same CIDR blocks while configuring the customer gateway, they may conflict.
E.g., your VPC CIDR block is 10.0.0.0/16; if you have a 10.0.25.0/24 subnet in your data center, communication from instances in the VPC to the data center will not happen, since that subnet is part of the VPC CIDR. In order to avoid these consequences it is good to have the IP ranges in different classes. Example: the Amazon VPC is in the 10.0.0.0/16 range and the data center is in the 172.16.0.0/24 range.

Practice 3) Isolate according to your use case: Create a separate Amazon VPC for each of your development, staging and production environments, or create one Amazon VPC with separate subnets/security/isolated network groups for production, staging and development. We have observed 60% of customers preferring the second choice. Choose the right one according to your use case.

Practice 4) Securing Amazon VPC: If you are running a mission-critical workload demanding complex security, you can secure the Amazon VPC like your on-premise data center, or sometimes even more tightly. Some of the tips to secure your VPC are:

  • Secure your Amazon VPC using a firewall virtual appliance or web application firewall available from the Amazon Web Services Marketplace. You can use Check Point, Sophos, etc. for this.
  • You can configure Intrusion Prevention or Intrusion Detection virtual appliances to secure the protocols and take preventive/corrective actions in your VPC.
  • Configure VM encryption tools which encrypt your root and additional EBS volumes. The key can be stored inside AWS or in your data center outside Amazon Web Services, depending on your compliance needs: http://harish11g.blogspot.in/2013/04/understanding-Amazon-Elastic-Block-Store-Securing-EBS-TrendMicro-SecureCloud.html
  • Configure privileged identity access management solutions on your Amazon VPC to monitor and audit the access of the administrators of your VPC.
  • Enable CloudTrail to audit ACL policy changes in the VPC environment: http://harish11g.blogspot.in/2014/01/Integrating-AWS-CloudTrail-with-Splunk-for-managed-services-monitoring-audit-compliance.html
  • Apply antivirus for cleansing specific EC2 instances inside the VPC. Trend Micro has a very good product for this.
  • Configure site-to-site VPN for securely transferring information between Amazon VPCs in different regions or between an Amazon VPC and your on-premise data center.
  • Follow the security groups and network ACL best practices listed below.

Practice 5) Understand Amazon VPC limits: Always design the VPC subnets in consideration of future expansion, and understand Amazon VPC's limits before using it. AWS has various limits on VPC components such as rules per security group, number of route tables, subnets etc. Some of them can be increased by raising a request with the AWS support team, while a few cannot be increased at all. Ensure these limits do not affect your overall design. Refer URL:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Appendix_Limits.html

Practice 6) IAM your Amazon VPC: When you assign people to maintain your Amazon VPC, create AWS IAM accounts with fine-grained permissions, or use sophisticated Privileged Identity Management solutions available on the AWS Marketplace.

Practice 7) Disaster Recovery or Geo-Distributed Amazon VPC Setup: When you are designing a Disaster Recovery setup using VPC, or expanding to another AWS region, you can follow these simple rules. Create your production site VPC with CIDR 10.0.0.0/16 and your DR region VPC with CIDR 172.16.0.0/16, and make sure they do not conflict with the on-premise subnet CIDR block in the event both need to be integrated with the on-premise DC as well. After creating the CIDR blocks, set up a VPC tunnel between the regions and to your on-premise DC. This will help you replicate your data using private IPs.

Practice 8) Use security groups and Network ACLs wisely: It is advisable to prefer security groups over Network ACLs inside Amazon VPC wherever applicable for better control. Security groups apply at the EC2 instance level, while Network ACLs apply at the subnet level. Security groups only support allow rules, so they are used mostly for whitelisting; to blacklist IPs, use Network ACLs, which support deny rules.

Practice 9) Tier your Security Groups: Create different security groups for the different tiers of your infrastructure architecture inside your VPC. If you have Web, App and DB tiers, create a different security group for each of them. Creating tier-wise security groups increases infrastructure security inside the Amazon VPC: EC2 instances in each tier can talk only on application-specified ports, not on all ports. If you create Amazon VPC security groups for each tier/service separately, it is also easier to open a port to a particular service. Do not use the same security group for multiple tiers of instances; that is a bad practice.
Example: Open ports to a security group instead of IP ranges. People have a tendency to open port 8080 to the 10.10.0.0/24 (web layer) range. Instead, open port 8080 to the web security group; this ensures only web security group instances can connect on port 8080, as shown in the sketch below. If someone launches a NAT instance with the NAT security group in 10.10.0.0/24, it will not be able to connect on port 8080, since access is allowed only from the web security group.
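A minimal AWS CLI sketch of this rule; both security group IDs (sg-app1234 for the app tier, sg-web5678 for the web tier) are hypothetical placeholders:

# Allow port 8080 on the app-tier group only from members of the web-tier group
aws ec2 authorize-security-group-ingress \
    --group-id sg-app1234 \
    --protocol tcp --port 8080 \
    --source-group sg-web5678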
Practice 10) Standardize your Security Group naming conventions: Following a security group naming convention inside Amazon VPC improves operations and management for large-scale deployments. It also avoids manual errors and leaks, and saves cost and time overall.
For example: simple names like Prod_DMZ_Web_SG or Dev_MGMT_Utility_SG, or complex coded ones for large-scale deployments like
USVA5LXWEBP001 - US East Virginia AZ 5 Linux Web Server Production 001
This helps in better management of security groups.
Practice 11) ELB on Amazon VPC: When using Amazon ELB for web applications, put all other EC2 instances (tiers like App, Cache, DB, BG etc.) in private subnets as much as possible. Unless there is a specific requirement where instances need outside-world access with an EIP attached, put all instances in private subnets only. As a secure practice in an Amazon VPC environment, only ELBs should be provisioned in the public subnet.
Practice 12) Control your outgoing traffic in Amazon VPC: If you are looking for better security, route the traffic going to the internet gateway through software like Squid or Sophos to restrict ports, URLs, domains etc., so that all traffic goes through a controlled proxy tier and also gets logged. Using these proxy/security systems you can also block unwanted ports; by doing so, if there is any security compromise of the application running inside the Amazon VPC, it can be detected by auditing the restricted connections captured in the logs. This helps as a corrective security measure.
Practice 13) Plan your NAT instance type: Whenever your application EC2 instances residing inside the private subnet of an Amazon VPC make Web Service/HTTP/S3/SQS calls, they go through the NAT instance. If you have designed Auto Scaling for your application tier and there is a chance that tens of app EC2 instances will make lots of web calls concurrently, the NAT instance can become a performance bottleneck. Size your NAT instance capacity depending on application needs to avoid this. Using NAT instances saves the cost of Elastic IPs and provides extra security by not exposing the instances to the outside world for internet access.
Practice 14) Spread your NAT instances across multiple subnets: What if you have hundreds of EC2 instances inside your Amazon VPC making lots of heavy web service/HTTP calls concurrently? A single NAT instance, even of the largest EC2 size, sometimes cannot handle that bandwidth and may become a performance bottleneck. In such scenarios, span your EC2 instances across multiple subnets and create a NAT for each subnet. This way you can spread your outgoing bandwidth and improve performance in your VPC-based deployments.
Practice 15) Use EIP when needed: At times you may need to keep a part of your application services in a public subnet for external communication. It is recommended practice to associate them with Amazon Elastic IPs and whitelist these IP addresses in the target services they use.
Practice 16) NAT instance practices: If needed, enable multi-factor authentication on the NAT instance. Open SSH and RDP ports only to known source and destination IPs, never to the global network (0.0.0.0/0), and only to static exit IPs, not dynamic exit IPs.
Practice 17) Plan your tunnel between your on-premise DC and Amazon VPC:
Select the right mechanism to connect your on-premise DC to the Amazon VPC. This will help you connect to the EC2 instances via private IPs in a secure manner.
  • Option 1: Secure IPSec tunnel to connect a corporate network with Amazon VPC (http://aws.amazon.com/articles/8800869755706543)
  • Option 2 : Secure communication between sites using the AWS VPN CloudHub (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPN_CloudHub.html)
  • Option 3: Use Direct Connect between the Amazon VPC and on-premise when you have lots of data to be transferred with reduced latency, or when you have spread your mission-critical workloads across cloud and on-premise. Example: Oracle RAC in your DC and the Web/App tier in your Amazon VPC. Contact us if you need help setting up Direct Connect between an Amazon VPC and your DC.
Practice 18) Always span your Amazon VPC across multiple subnets in multiple Availability Zones inside a region. This helps in architecting high availability inside your Amazon VPC properly. Example subnet classification: Web tier subnet: 10.0.10.0/24 in AZ1 and 10.0.11.0/24 in AZ2; Application tier subnet: 10.0.12.0/24 and 10.0.13.0/24; DB tier subnet: 10.0.14.0/24 and 10.0.15.0/24; Cache tier subnet: 10.0.16.0/24 and 10.0.17.0/24 etc. A CLI sketch of this layout follows below.
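As a minimal sketch, the paired web-tier subnets from the example above could be created like this; the VPC ID and AZ names are placeholders:

# Web tier: one subnet per Availability Zone for high availability
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.10.0/24 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.11.0/24 --availability-zone us-east-1b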
Practice 19) A good security practice is to have only the public subnet associated with a route table that carries a route to the internet gateway. Apply this wherever applicable.
Practice 20) Keep your data closer: For small-scale deployments in AWS where cost matters more than high availability, it is better to keep the Web/App tier in the same Availability Zone as ElastiCache, RDS etc. inside your Amazon VPC, and to design your subnets accordingly. This is not a recommended architecture for applications demanding high availability.
Practice 21) Allow and Deny Network ACLs: Create internet-outbound allow and deny Network ACLs in your VPC.
First Network ACL: allow all HTTP and HTTPS outbound traffic on the public internet-facing subnet.
Second Network ACL: deny all HTTP/HTTPS traffic; allow all traffic to the Squid proxy server or other virtual appliance.
Practice 22) Restrictive Network ACLs: Block all inbound and outbound ports and allow only application-required ports. Network ACLs are stateless traffic filters that apply to all traffic inbound or outbound from a subnet within the VPC, so return traffic must be allowed explicitly; see the sketch below. AWS recommended outbound rules: http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Appendix_NACLs.html
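A minimal AWS CLI sketch of one such stateless rule; the ACL ID and rule number are placeholders:

# Allow HTTPS outbound from the subnet via an explicit (stateless) NACL rule
aws ec2 create-network-acl-entry \
    --network-acl-id acl-0a1b2c3d \
    --rule-number 100 --egress \
    --protocol tcp --port-range From=443,To=443 \
    --cidr-block 0.0.0.0/0 --rule-action allow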
Practice 23) Create route tables only when needed, and use the Associations option to map subnets to the route tables in your Amazon VPC.
Practice 24) Use Amazon VPC Peering (new): Amazon Web Services has introduced the VPC peering feature, which is quite useful. An AWS VPC peering connection is a networking connection between two Amazon VPCs that enables you to route traffic between them using private IP addresses. Currently both VPCs must be in the same AWS region; instances in either VPC can then communicate with each other as if they were within the same network. Since AWS uses the existing infrastructure of a VPC to create a VPC peering connection, it is neither a gateway nor a VPN connection and does not rely on a separate piece of physical hardware, which essentially means there is no single point of failure for communication and no bandwidth bottleneck. A CLI sketch of setting up peering follows the scenarios below.

We have seen it be useful in the following scenarios:
  1. Large enterprises usually run multiple Amazon VPCs in a single region, and some of their applications are so interconnected that they may need to access each other privately and securely inside AWS. Examples: Active Directory, Exchange and common business services will usually be interconnected.
  2. Large enterprises have different AWS accounts for different business units/teams/departments; at times, systems deployed by some business units in different AWS accounts need to be shared or need to consume a shared resource privately. Examples: CRM, HRMS, file sharing etc. can be internal and shared. In such scenarios VPC peering comes in very useful.
  3. Customers can peer their VPC with their core suppliers' VPCs for tighter, integrated access to their systems.
  4. Companies offering infra/application managed services on AWS can now safely peer into customer Amazon VPCs and provide monitoring and management of AWS resources.
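A minimal AWS CLI sketch of wiring up peering; all resource IDs and the destination CIDR are hypothetical placeholders:

# Request and accept a peering connection between two VPCs in the same region
aws ec2 create-vpc-peering-connection --vpc-id vpc-11111111 --peer-vpc-id vpc-22222222
aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0a1b2c3d
# Route traffic destined for the peer VPC's CIDR over the peering connection
aws ec2 create-route --route-table-id rtb-33333333 \
    --destination-cidr-block 172.16.0.0/16 \
    --vpc-peering-connection-id pcx-0a1b2c3d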

Practice 25) Use Amazon VPC: It is highly recommended that you place all your new workloads inside Amazon VPC rather than Amazon Classic cloud. I also strongly recommend migrating your existing workloads from Amazon Classic cloud to Amazon VPC, in phases or in one shot, whichever is feasible. In addition to the benefits of VPC detailed at the start of the article, AWS has been introducing lots of features which are compatible only inside VPC, and the AWS Marketplace likewise has lots of products which are compatible only with Amazon VPC. So make sure you leverage this strength of VPC. If you require any help with this migration please contact me.

Readers, feel free to suggest more; I will link relevant ones in this article.

Load Testing tool comparison – JMeter on its own vs JMeter & BlazeMeter together

Load testing is an important aspect of the web application life cycle on Amazon Cloud. Some of our customers ask us to generate 50,000+ RPS to load test the scalability of their application deployed on Amazon cloud. Whenever we help such customers migrate their applications to Amazon cloud for scalability, the load testing phase itself becomes a pain: setting up the load testing infrastructure, writing automation around it, and managing, maintaining and monitoring the load test infrastructure is a headache. Our load testers and infrastructure teams were spending considerable time and effort on the above instead of focusing only on load testing. We usually work with a variety of tools, from Grinder, JMeter and HP LoadRunner to custom-engineered load testing tools, during the load testing phase. Some time back, our team started playing around with a SaaS-based load testing tool called BlazeMeter. In this article I am going to share our experience in the form of a comparison between BlazeMeter and JMeter, and why BlazeMeter has a bright future.
BlazeMeter is a highly scalable SaaS-based load testing tool that handles up to 300,000+ concurrent users. Its load test infrastructure is spread across the major AWS regions. Since most of us have been using JMeter for years, the 100% compatibility it provides with existing JMeter scripts is a good feature. BlazeMeter also provides a Chrome extension which can record browser actions and convert them to a .jmx file.

10 Things I like about BlazeMeter

Point 1) A load test becomes effective only when the load comes from different IP addresses, similar to a real-world scenario, and not from a single source IP. When multiple virtual users' load is generated from the same IP, the router as well as the server tries to cache information and optimize throughput. By using multiple IP addresses for the host, the EC2 server gets the illusion of receiving requests from multiple source IPs. It is also better that load is generated from multiple IPs so that Amazon ELB distributes the load evenly. BlazeMeter has the capability to generate load from multiple IPs, which is very important when load testing cloud applications.
Point 2) Customizing the network emulation: Online applications are usually accessed from multiple devices like PCs, laptops and mobiles, over multiple network types such as 3G, broadband etc. At times the application will also be accessed from locations with poor network bandwidth. Both these parameters play an important role in capacity planning and load testing. We can choose the bandwidth and network type emulation while doing a load test in BlazeMeter: for example, we can configure the network type (Unlimited Internet, 3G, Cable, WiFi etc.), and the bandwidth download limit per device can also be set.
Point 3) Controlling the throughput: Target throughput is a parameter of Apache JMeter that can be used to achieve a required throughput value for the application. A server's performance need not always satisfy the target throughput value mentioned in JMeter; it could provide more throughput or less. In BlazeMeter, the target throughput parameter can be controlled at run time: live server monitoring can help us identify whether our servers are performing well at, say, 5000 hits/sec, and change the throughput value at run time to a higher or lower value based on the server's performance.
Point 4) Controlling the agents: Apache JMeter works on a master-agent architecture, where the master controls multiple agents generating the load. With JMeter-based load testing on Amazon Cloud, the number of agents usually has to be decided before the test starts. The option to dynamically change the number of agents is a very good feature to have while load testing a cloud application requiring thousands of requests per second. BlazeMeter enables us to add or remove agent instances while a test is running; any instance can be marked as master or slave (agent) during the test.
Point 5) Controlling the number of simulated users on slaves (agents): A load test strategy is mainly determined by parameters like the number of concurrent users, ramp-up time, number of test engines, test iterations and test duration. Apache JMeter allows us to configure these values manually before the test is started: new EC2 instances have to be provisioned for the agents, and the IP addresses (usually Elastic IPs) of the slaves/agents have to be manually added to the master. The entire setup has to be maintained, managed and monitored during the test cycles. This is OK for a load testing environment with a few load test agents and low RPS, but imagine an environment where you have to generate thousands of RPS with 50+ agents running; managing the EC2 load test infrastructure becomes a tedious process overall for the load testing teams, as the distributed-run sketch below suggests. In BlazeMeter, once the number of concurrent users is given, the number of test engines, number of threads and engine capacity are chosen automatically. This can also be made semi-automatic, where the number of engines and number of threads are selected by the user and only the engine capacity is chosen by BlazeMeter. Since it is a managed load test infrastructure, the load testers can concentrate on testing rather than managing hundreds of EC2 load agents.
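For contrast, a minimal sketch of a manual JMeter distributed run, assuming pre-provisioned agent EC2 instances; the plan name and agent IPs are hypothetical:

# Non-GUI distributed run: the master drives the pre-provisioned remote agents
jmeter -n -t loadtest.jmx -R 10.0.1.11,10.0.1.12 -l results.jtl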
Point 6) Integrated monitoring:
BlazeMeter offers live monitoring of the essential parameters of the test servers while the test is running, which helps us decide on the number and instance type for the test. In a conventional Apache JMeter load test setup on Amazon EC2, we have to observe the key parameters using AWS CloudWatch.
BlazeMeter provides AWS CloudWatch integration: an account with IAM access has to be created, and the AWS Access Key & Secret Key values configured, so that the metrics are available in BlazeMeter's dashboard. This feature helps us understand how the assets in the cloud are reacting to our load tests and lets us tune the infrastructure accordingly.
While performing load testing, it is important to monitor not only your web servers and databases but also the agents from which the load is generated. The New Relic plugin gives us both front-end and back-end KPIs:
BlazeMeter's front-end KPIs provide insight into how many users are actually trying to access your website, mobile site or mobile apps.
BlazeMeter's back-end KPIs show how many users are getting through to your applications.
Point 7) BlazeMeter allows us to have a different CSV file per load test engine. Though this is possible in Apache JMeter, it has to be done manually by copying the files onto the JMeter agent EC2 instances with the same filename, since the agents refer to the master's properties. BlazeMeter allows us to parameterize values, including filenames, and have a different CSV file in each engine without the trouble of copying files onto specific EC2 instances; it holds the files in a common repository from which each agent can refer to them.
Point 8) Run the load test using older versions of JMeter scripts: Old scripts remain reusable with this BlazeMeter feature, which lets us run the test using any version of Apache JMeter from 2.3.2 to 2.10. Complex scripts prepared months or years ago can still be used and need not be redone, saving effort and cost.
Point 9) Schedule the test & stay relaxed: BlazeMeter, like JMeter, lets you schedule your test duration and test time so that you can run longevity tests at any time of the day. Weekly scheduling is also possible in BlazeMeter, which is an added advantage, though it is not widely used.
Point 10) Interesting plug-ins provided by BlazeMeter:
Integration with Google Analytics: At scripting time, it is enough to select the Google Analytics option and provide the Google Analytics account details. BlazeMeter obtains the last 12 months of data, creates a test with the 5 most visited pages and sets the number of concurrent users based on that record.
Integration with WordPress: BlazeMeter provides integration with WordPress, where WordPress users can test their app by using the BlazeMeter plug-in without any scripting.
Integration with Drupal & Jenkins: Plugins are available to load test Drupal and Jenkins servers as well.

Post co-authored with Harine, 8KMiles.

Architecting Highly Available ElastiCache Redis replication cluster in AWS VPC

In this post let's explore how to architect and create a highly available and scalable Redis cache cluster for your web application in AWS VPC. The ElastiCache Redis cluster is assembled in the following architecture:

  • Redis Cache Cluster inside Amazon VPC for better control and security
  • Master Redis Node 1 will be created in AZ-1 of US-West
  • Redis Read Replica Node 2 will be created in AZ-2 of US-West
  • Redis Read Replica Node 3 will be created in AZ-3 of US-West

You can position all 3 Redis nodes in different Availability Zones for high availability, or you can position the Master + Read Replica 1 in AZ1 and Read Replica 2 in AZ2; the latter reduces inter-AZ latency and might give better performance for heavily used clusters.
Step 1: Creating Cache Subnet Groups:
To create a Cache Subnet group, navigate to the ElastiCache dashboard, select Cache Subnet groups and then click "Create Cache Subnet group". Add the Subnet ID and the Availability Zone you need to use for the ElastiCache cluster.

 

We have created an Amazon VPC spreading across 3 Availability Zones. In this post we are going to place the Redis master and the 2 Redis replica slaves in these 3 Availability Zones. Since Redis will mostly be accessed by your application tier, it is better to place the nodes in the private subnets of your VPC.
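As a minimal CLI sketch, the same subnet group could be created as follows; the group name and subnet IDs are placeholders:

# Cache subnet group spanning the three private subnets (one per AZ)
aws elasticache create-cache-subnet-group \
    --cache-subnet-group-name redis-subnet-group \
    --cache-subnet-group-description "Private subnets for Redis" \
    --subnet-ids subnet-aaaa1111 subnet-bbbb2222 subnet-cccc3333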
Step 2: Creating the Redis Cache Cluster:
To create the Cache Cluster, navigate to the ElastiCache dashboard, select Launch Cache Cluster and provide the necessary details. We are launching it inside Amazon VPC, so we have to select the Cache Subnet group.
Note: It is mandatory to create the Cache Subnet group before launch if you need the ElastiCache Redis cluster inside an Amazon VPC.

 

For test purposes I have used an m1.small node for Redis. Since this is a fresh Redis installation, I have not mentioned an S3 bucket from which a persistent Redis snapshot would be used as input. On successful creation of the Cache Cluster you can see the details in the dashboard.
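A minimal CLI sketch of an equivalent launch, reusing the subnet group sketched above; the cluster ID matches the one used later in this post:

# Single-node Redis cache cluster inside the VPC (this node becomes the master)
aws elasticache create-cache-cluster \
    --cache-cluster-id redisinsidevpc \
    --engine redis \
    --cache-node-type cache.m1.small \
    --num-cache-nodes 1 \
    --cache-subnet-group-name redis-subnet-group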
Step 3: Replication Group Creation:
To create the Replication Group, select the Replication Groups option from the dashboard and then select "Create Replication Group".

Select the master Redis node "redisinsidevpc" created previously as the primary cluster ID of the Cache Cluster. Give the Replication Group ID and description as illustrated below.

Note: The Replication Group should be created only after the primary Cache Cluster node is up and running, else you will get the error shown below.

On successful creation of the Replication Group you can see the following details. You can observe from the screenshot below that there is only one primary node in US-WEST-2A and zero Redis read replicas attached to it.
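A hedged CLI sketch of the same step, wrapping the replication group around the existing primary node; the group ID mirrors the endpoint shown later in this post:

# Replication group built around the existing primary cache cluster
aws elasticache create-replication-group \
    --replication-group-id redis-replication \
    --replication-group-description "Redis HA group inside VPC" \
    --primary-cluster-id redisinsidevpc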

Step 4: Adding Read Replica Nodes:
When you select the Replication Group, you can see the option to add a Redis read replica. We are adding 2 read replicas named Redis-RR1 (in US-WEST-2B) and Redis-RR2 (in US-WEST-2C). Both read replicas point to the master node "redisinsidevpc". Currently you can add up to 5 read replica nodes per Redis master node, which is more than enough to handle thousands of messages per second; if you combine it with Redis pipelining, handling 100K messages per second from a node is a cakewalk. A CLI sketch of adding these replicas appears below.
Adding Read Replica 1 in US-WEST-2B

Adding Read Replica 2 in US-WEST-2C
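A hedged CLI sketch of adding both replicas, assuming read replicas are created as cache clusters attached to the existing replication group:

# Read replicas in two other AZs, attached to the replication group
aws elasticache create-cache-cluster \
    --cache-cluster-id redis-rr1 \
    --replication-group-id redis-replication \
    --preferred-availability-zone us-west-2b
aws elasticache create-cache-cluster \
    --cache-cluster-id redis-rr2 \
    --replication-group-id redis-replication \
    --preferred-availability-zone us-west-2c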

On successful creation you can see the following details of the Replication Group in the dashboard: there are now 3 Redis nodes listed, with the number of read replicas as 2. Placing the read replicas and the master node in multiple AZs increases high availability and protects you from node- and AZ-level failures. In our sample tests, inter-AZ replication deployments had <2 seconds of replication lag for massive writes on the master, and <1 second of replication lag between master and slave inside the same AZ. We pumped ~100K messages per second for a few minutes on an m1.large Redis instance cluster.
If you need additional read scalability, I recommend adding more read replica slaves to the master.
In your application tier you need to use the primary endpoint "redis-replication.qcdze2.0001.usw2.cache.amazonaws.com:6379" shown below to connect to Redis.

If you need to delete/reboot/modify the nodes, you can do so through the options available here.

Step 5: Promoting the Read replica:

You can promote any node to be the primary cluster using the Promote/Demote option. There will only ever be one primary node.
Note: This step is not part of the cluster creation process.

This promotion has to be carried out with caution and a proper understanding of maintaining data consistency.
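A hedged CLI sketch of the same promotion, assuming the replica created earlier as redis-rr1:

# Promote the read replica to primary (the old primary becomes a replica)
aws elasticache modify-replication-group \
    --replication-group-id redis-replication \
    --primary-cluster-id redis-rr1 \
    --apply-immediately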

Post was co-authored with Senthil, 8KMiles.

Billion Messages – Art of Architecting a scalable ElastiCache Redis tier

Whenever we design highly scalable architectures on AWS running thousands of application servers and supporting millions of requests, the usage of NoSQL solutions has become an inevitable part. One such solution we have been using for years on AWS is Redis. We love Redis.
AWS introduced ElastiCache Redis in 2013 and we started using it since it eased the management and operational effort. In this article I am going to share my experience designing large-scale Redis tiers supporting billions of messages per day on AWS: a step-by-step guide on how to deploy one, the implications you face at scale, and best practices to adopt while designing sharded + replicated Redis tiers.

Since we needed to support billions of message requests per day, and the volume was growing:

  • the ElastiCache Redis tier was designed with partitions (shards) to scale out as the customer grows
  • the ElastiCache Redis tier was designed with replica slaves for HA and read scaling as read volumes grow

When your application is growing at a rapid pace and lots of data is created every day, you cannot keep increasing (scaling up) the size of the ElastiCache node: at some point you will hit the maximum memory capacity of your EC2 instance and be forced to partition. Partitioning is the process of splitting your key-value data across multiple ElastiCache Redis instances, so that every instance contains only a subset of your key-value pairs. It allows for much larger ElastiCache Redis data stores, using the sum of the memory of many ElastiCache Redis nodes. It also allows you to scale computational power across multiple cores and multiple EC2 instances, and network bandwidth across multiple EC2 network adapters. There are two widely used partition/shard implementation techniques available for an ElastiCache Redis tier:
Technique 1) Client-side partitioning means that the Redis clients directly select the right ElastiCache Redis node to write or read a given key. Many Redis clients implement client-side partitioning; choose the right one wisely.
Technique 2) Proxy-assisted partitioning means that your clients send requests to a proxy that speaks the Redis protocol, which in turn sends requests to the right ElastiCache Redis instance according to the configured partitioning schema. Currently the most widely used proxy-assisted partitioning tool is Twemproxy, written by Manju Raj of Twitter (GitHub link: https://github.com/twitter/twemproxy). Twemproxy is a proxy developed at Twitter for the Memcached ASCII and Redis protocols. It supports automatic partitioning among multiple Redis instances and is currently the suggested way to handle partitioning with Redis.

In this article we are going to explore the proxy-assisted partitioning technique in detail for a highly scalable and available Redis tier.

Welcome to Twemproxy

Twemproxy (nutcracker) is a fast single-threaded proxy supporting the Memcached ASCII protocol and, more recently, the Redis protocol.

Installing Twemproxy:

Download the Twemproxy package.
wget http://twemproxy.googlecode.com/files/nutcracker-0.3.0.tar.gz
tar -xf nutcracker-0.3.0.tar.gz
cd nutcracker-0.3.0
./configure
make
make install

Configuration:

Twemproxy (nutcracker) can be configured through a YAML file specified by the -c or --conf-file command-line argument at process start. The configuration file is used to specify the server pools and the servers within each pool that nutcracker manages. The configuration file understands the following keys:

• listen: The listening address and port (name:port or ip:port) for this server pool.
• hash: The name of the hash function.
• hash_tag: A two-character string that specifies the part of the key used for hashing, e.g. "{}" or "$$". Hash tags enable mapping different keys to the same server as long as the part of the key within the tag is the same.
• distribution: The key distribution mode.
• timeout: The timeout value in msec that we wait for to establish a connection to the server or receive a response from a server. By default, we wait indefinitely.
• backlog: The TCP backlog argument. Defaults to 512.
• preconnect: A boolean value that controls if nutcracker should preconnect to all the servers in this pool on process start. Defaults to false.
• redis: A boolean value that controls if a server pool speaks the redis or memcached protocol. Defaults to false.
• server_connections: The maximum number of connections that can be opened to each server. By default, we open at most 1 server connection.
• auto_eject_hosts: A boolean value that controls if a server should be ejected temporarily when it fails consecutively server_failure_limit times. See liveness recommendations for information. Defaults to false.
• server_retry_timeout: The timeout value in msec to wait for before retrying on a temporarily ejected server, when auto_eject_hosts is set to true. Defaults to 30000 msec.
• server_failure_limit: The number of consecutive failures on a server that would lead to it being temporarily ejected when auto_eject_hosts is set to true. Defaults to 2.
• servers: A list of server address, port and weight (name:port:weight or ip:port:weight) for this server pool.

For More details Refer: https://github.com/twitter/twemproxy

Running and Accessing Twemproxy 

To start the proxy, just use the "nutcracker" command with the configuration file path specified, or rely on its default path (conf/nutcracker.yml).
Based on the configuration, twemproxy will run and listen on the configured address. Point your application to the twemproxy address and port instead of the Redis cluster, as in the sketch below.
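A minimal sketch of starting the proxy daemonized with a stats listener; the config path is a placeholder:

# -c config file, -s stats port, -d run as daemon
nutcracker -c /etc/nutcracker/nutcracker.yml -s 22222 -d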

Twemproxy Deployment models:

We usually deploy Twemproxy in one of the following models on AWS:

Model 1: Twemproxy as a separate proxy tier: In this model Twemproxies are deployed on separate EC2 instances, and the application tier is configured to point to the Twemproxies. The Twemproxy tier in turn maintains the mappings to the ElastiCache Redis nodes. It is better to use instances with very good I/O bandwidth for the Twemproxy tier in AWS. If you feel the instance CPU is underutilized, you can also run multiple Twemproxy processes on the same EC2 instance.

Though the above model looks clean and efficient, there are optimizations that can be applied to this architecture:
What happens when twemproxy01 fails; how will the application server instances know about it?
Why should I pay extra for Twemproxy EC2 instances; can this cost be minimized?

Model 2: Twemproxy bundled with the application tier EC2s:

In this model Twemproxies are bundled into the same box as the application server EC2 itself. Since two Twemproxies are not aware of each other's existence, it is easy to architect this model even with the application tier in Auto Scaling mode. Every application server talks to the local Twemproxy deployed in the same box; this saves cost and avoids managing an additional tier.

Reference ElastiCache Redis + Twemproxy  deployment:

(This is a reference deployment; the same can be scaled out to hundreds of nodes depending upon the need. It is a Redis partitioned + replicated setup.)
1. Two ElastiCache Redis nodes in AWS (twem01 and twem02)
2. Replication group for each ElastiCache redis nodes (twem01-rg and twem02-rg with one Read Replica each)
3. Two twemproxy servers running in separate EC2. (twemproxy01 and twemproxy02)
Once the above setup is done, please note down the endpoints. We will be using the replication group endpoints as the ElastiCache Redis endpoints in the Twemproxy configuration.

ElastiCache Redis Endpoints:

twem01-twem01.qcdze2.0001.usw2.cache.amazonaws.com:6379
twem02-twem02.qcdze2.0001.usw2.cache.amazonaws.com:6379
ElastiCache Redis Replication endpoints:

twem01-rg.qcdze2.ng.0001.usw2.cache.amazonaws.com:6379
twem02-rg.qcdze2.ng.0001.usw2.cache.amazonaws.com:6379

To test the Twemproxy setup we pumped the following keys:
Pump KV data through Twemproxy01 (keys 1-2000)
Pump KV data through Twemproxy02 (keys 2001-4000)

Configuration:
beta:
  listen: 127.0.0.1:22122
  hash: fnv1a_64
  hash_tag: "{}"
  distribution: ketama   # consistent hashing
  auto_eject_hosts: false
  timeout: 5000
  redis: true
  servers:
   - twem01-rg.qcdze2.ng.0001.usw2.cache.amazonaws.com:6379:1 server1
   - twem02-rg.qcdze2.ng.0001.usw2.cache.amazonaws.com:6379:1 server2

Test 1: Testing key accessibility. Testing the "GET" operation across both Twemproxy instances for a few sample keys.

Fetch 4 keys spread across the 4000 KV data from the Twemproxy01 EC2 instance:
[root@twemproxy01 redish]# src/redis-cli -h 127.0.0.1 -p 22122
redis 127.0.0.1:22122> get 1000
"1000-data"
redis 127.0.0.1:22122> get 2000
"2000-data"
redis 127.0.0.1:22122> get 3000
"3000-data"
redis 127.0.0.1:22122> get 4000
"4000-data"
Fetch 4 keys spread across the 4000 KV data from the Twemproxy02 EC2 instance:
[root@twemproxy02 redish]# src/redis-cli -h 127.0.0.1 -p 22122
redis 127.0.0.1:22122> get 1000
"1000-data"
redis 127.0.0.1:22122> get 2000
"2000-data"
redis 127.0.0.1:22122> get 3000
"3000-data"
redis 127.0.0.1:22122> get 4000
"4000-data"

From the above test it is evident that all 4000 KV entries inserted through both Twemproxies are accessible from both Twemproxies (on the tested sample), even though the proxies are not aware of each other. This is because the same hashing and key-mapping translation is done at the Twemproxy level on each node.

Test 2: Testing the ElastiCache Redis availability and failover mechanism:

We are going to promote the twem01-rg replication group read replica to be the primary Redis node. After the promotion we are going to test:

 

  1. Whether the Twemproxy is able to recognize the newly promoted master
  2. Whether the sample KV data is safely replicated and still accessible, to ensure the failover is successful.

To promote an ElastiCache Redis slave, just click the Promote action and confirm, or automate it using the API. During the promotion of the read replica to master we observed that the transition happens very quickly and there are no timeouts, but the response time for queries is about 4-5 seconds for about 3-4 minutes during the switch-over. In the Twemproxy configuration we can set the timeout value; it needs to be set accordingly so that during the switch-over there are no connection refusals. For this sample test we set it to 5000 msec.

Repeat Test 1:

[root@twemproxy01 redish]# src/redis-cli -h 127.0.0.1 -p 22122
redis 127.0.0.1:22122> get 1000
"1000-data"
redis 127.0.0.1:22122> get 2000
"2000-data"
redis 127.0.0.1:22122> get 3000
"3000-data"
redis 127.0.0.1:22122> get 4000
"4000-data"
Fetch 4 keys spread across the 4000 KV data from the Twemproxy02 EC2 instance:
[root@twemproxy02 redish]# src/redis-cli -h 127.0.0.1 -p 22122
redis 127.0.0.1:22122> get 1000
"1000-data"
redis 127.0.0.1:22122> get 2000
"2000-data"
redis 127.0.0.1:22122> get 3000
"3000-data"
redis 127.0.0.1:22122> get 4000
"4000-data"

From the above test it is evident that all 4000 KV entries were replicated properly between the master and slave nodes, and the transition from slave to master happened successfully with all the data intact.
Reporting

Nutcracker exposes stats at the granularity of the server pool and the servers within each pool through a stats monitoring port. The stats are essentially JSON-formatted key-value pairs, with the keys corresponding to counter names. By default, stats are exposed on port 22222 and aggregated every 30 seconds.
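A minimal sketch of reading those stats, assuming the default stats port; python -m json.tool is used only to pretty-print the JSON:

# The stats listener emits one JSON document per TCP connection
nc 127.0.0.1 22222 | python -m json.tool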

Some best practices for designing a highly scalable + available ElastiCache Redis tier:

Practice 1: Reduce the number of connections and pipeline messages:

Whenever an application instance gets a request to get/put a value in the ElastiCache Redis node, the client makes a connection to the Redis tier. Imagine a heavy-traffic site: thousands of incoming requests translate to thousands of connections from the application instances to the Redis tier. Now add Auto Scaling to your application tier with a few hundred servers scaled out, and imagine the connection complexity and overhead this architecture brings to the ElastiCache Redis tier.

The best practice is to minimize the number of connections made from your application instance to your ElastiCache Redis node. Use Twemproxy in bundled mode with the application EC2 instance; this keeps the process in close proximity and reduces the connection overhead. Secondly, Twemproxy internally uses minimal connections to the ElastiCache Redis instance by proxying multiple client connections onto one or a few server connections.
Redis also supports pipelines, where multiple requests can be pipelined and sent on a single connection; a small sketch follows after the note below. In a simple test using large application & ElastiCache nodes we were able to process 125K messages/sec in pipeline mode; now imagine what you could achieve on bigger instance types on AWS. The connection-minimization architecture of Twemproxy makes it ideal for pipelining requests and responses, saving round-trip time. For example, if Twemproxy is proxying three client connections onto a single server and it receives the requests 'get key\r\n', 'set key 0 0 3\r\nval\r\n' and 'delete key\r\n' on these three connections respectively, it will try to batch these requests and send them as a single message on the server connection.

Note: It is important to note that the "read my last write" constraint doesn't necessarily hold true when Twemproxy is configured with server_connections > 1. Consider a scenario where Twemproxy is configured with server_connections: 2. If a client makes pipelined requests, with the first request in the pipeline being set foo 0 0 3\r\nbar\r\n (write) and the second being get foo\r\n (read), the expectation is that the read of key foo returns the value bar. However, with two server connections it is possible that the write and read requests are sent on different server connections, which means their completion could race with one another. In summary, if the client expects the "read my last write" constraint, either configure Twemproxy to use server_connections: 1 or use clients that only make synchronous requests to Twemproxy.
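A minimal pipelining sketch, assuming a reachable Redis endpoint (the hostname here reuses the earlier twem01 endpoint); twemproxy batches this kind of pipelined traffic onto its few server connections:

# Stream 1000 SET commands over a single connection in pipeline mode
for i in $(seq 1 1000); do echo "SET key:$i value-$i"; done | \
    redis-cli -h twem01.qcdze2.0001.usw2.cache.amazonaws.com -p 6379 --pipe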

Practice 2: Configure the auto-eject and hashing combination properly

Design for failure is the mantra of cloud architecture, and failures are common when things are distributed at scale. Though partitioning when using ElastiCache Redis as a data store or as a cache is conceptually the same on broad lines, there is a huge difference operationally in large-scale systems. When you use ElastiCache Redis as a data store, you need to be sure that a given key always maps to the same instance; whereas if you use ElastiCache Redis as a cache and a given node is unavailable, you can always start afresh on a different node in the hash ring with consistent-hashing implementations.
To be resilient against failures, it is recommended that you configure auto_eject_hosts to false when you treat Redis as a data store and true when you treat Redis as a cache:
resilient_pool:
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
Enabling auto_eject_hosts ensures that a dead ElastiCache Redis node is ejected from the hash ring after server_failure_limit consecutive failures have been encountered on that node. A non-zero server_retry_timeout ensures that we don't incorrectly mark a node as dead forever, especially when the failures were really transient. The combination of server_retry_timeout and server_failure_limit controls the trade-off between resiliency to permanent and transient failures.
Note that an ejected node will not be included in the hash ring for any requests until the retry timeout passes. This leads to data partitioning, as keys originally on the ejected node will now be written to another node still in the pool. If ElastiCache Redis is used as a cache (in memory), then in the event of a Redis node going down the cache data will be lost; this cache miss can cascade performance problems to other tiers and altogether bring down your system on the cloud. To minimize KV cache misses, design your hash ring with ketama hashing on the Redis proxy. This minimizes cache misses in the event of a cache node failure, and it also decreases the overall rebalancing needed in your Redis tier. In addition to helping with availability problems, Redis proxy + ketama can also help your Redis farm scale out and scale down easily with minimal cache misses. To know more about ketama on ElastiCache refer to http://harish11g.blogspot.com/2013/01/amazon-elasticache-memcached-internals_8.html .
The below diagram illustrates a ElastiCache Redis Cache Farm with Consistent Hash Ring.
In short, to minimize cache misses when using auto-eject set to true, it is recommended to use ketama hashing (a consistent hashing algorithm) in your Twemproxy configuration.
ElastiCache Redis as a Data Store:

What if the data stored in your cache is important and needs to be persisted across node failures and relaunches? What if the data stored in your cache cannot be lost and needs to be replicated and promoted during failures?
Welcome to ElastiCache Redis as a data store. ElastiCache Redis offers features to persist the in-memory cache data to disk and also replicate it to a slave for high availability. If ElastiCache Redis is used as a store (persistent), you need to keep the map between keys and nodes fixed, and a fixed number of nodes. Since the data stored is important when you treat ElastiCache Redis as a data store, in the event one Redis node goes down you should have an immediate standby up and running in minutes. You can architect an ElastiCache Redis master with one or more replication slaves launched in a different AZ from the master for high availability in AWS. In the event of a master node failure or master AZ failure, a slave Redis node can be promoted in minutes to act as master. This whole high-availability design keeps the number of nodes on the hash ring stable and simple; otherwise, you will end up building a system to rebalance the keys between nodes (which is not easy) whenever nodes are added or removed during outages. In addition, ElastiCache Redis supports partial resynchronization with slaves: if the connection between a master node and a slave node is momentarily broken, the master accumulates data destined for the slave in a backlog buffer, and if the connection is restored before the buffer becomes full, a quick partial resync is done instead of a potentially longer full resync. This really saves network bandwidth during momentary failures.
In large-scale systems you will often find some partitions more heavily used than others. If the usage is read-heavy in nature, you can add up to 5 read replicas to the ElastiCache Redis master of that partition; since these replicas are used only for reads, they do not affect the hash ring structure. However, Twemproxy lacks support for read scaling with Redis replicas, so when you face this problem you will have to scale up the capacity (instance/node type) of the master and slave of that partition alone.

If you are using ElastiCache Redis as a data store, it is recommended to keep the "auto_eject_hosts" property false in Twemproxy, so that in the event of a Redis node failure it is not ejected from the hash ring. The hash ring can be built with either the ketama or modula hash algorithms, since in the event of a primary node failure the slave is promoted and the ring structure is always maintained. But if you feel there is a real possibility of the number of primary node partitions growing, or of major failures occurring, it is better to choose the ketama hash ring from the beginning. The below diagram illustrates the architecture.

Practice 3: Configure the Buffer properly:

All memory for incoming requests and outgoing responses is allocated in mbufs in Twemproxy. Mbufs enable zero copy for requests and responses flowing through the proxy. By default an mbuf is 16K bytes in size, and this value can be tuned between 512 bytes and 16M bytes using the -m or --mbuf-size=N argument; see the sketch below. Every connection has at least one mbuf allocated to it, which means the number of concurrent connections Twemproxy can support depends on the mbuf size. A small mbuf allows us to handle more connections, while a large mbuf allows us to read and write more data to and from kernel socket buffers. Large-scale web/mobile applications involving millions of hits may have small request/response sizes but lots of concurrent connections to handle in the backend. In such scenarios, when Twemproxy is meant to handle a large number of concurrent client connections, you should set the chunk size to a small value like 512 bytes to 1K bytes using the -m or --mbuf-size=N argument.
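A minimal start-up sketch of that tuning; the config path is a placeholder:

# Small mbufs (512 bytes) favor many concurrent connections over large payloads
nutcracker -c /etc/nutcracker/nutcracker.yml -m 512 -d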

Practice 4: Configure proper Timeouts
It is always a good idea to configure a Twemproxy timeout for every server pool, rather than relying purely on client-side timeouts. E.g.:

resilient_pool_with_timeout:
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
  timeout: 400
Relying only on client-side timeouts has the adverse effect that the original request may have timed out on the client-to-proxy connection but still be pending and outstanding on the proxy-to-server connection. This gets further exacerbated when the client retries the original request.

Benefits of using Twemproxy for Redis Scaling

  • Avoids reinventing the wheel. Thanks, Manju Raj (Twitter).
  • Reduces the number of connections to your cache server by acting as a proxy
  • Shards data automatically between multiple cache servers
  • Supports consistent hashing with different strategies and hashing functions
  • Can be configured to disable nodes on failure
  • Can run in multiple instances, allowing clients to connect to the first available proxy server
  • Pipelining and batching of requests, and hence saving of round trips

Disadvantages of the Partitioning Model:

Point 1) Operations involving multiple keys are usually not supported. For instance, you can't perform the intersection between two sets if they are stored in keys that are mapped to different Redis instances (actually there are ways to do this, but not directly). Redis transactions involving multiple keys cannot be used.
Point 2) The partitioning granularity is the key, so it is not possible to shard a dataset with a single huge key, such as a very big sorted set. Ideally, in such cases you should scale up that particular Redis master-slave pair to a larger EC2 instance, or programmatically stitch up the sorted set.
Point 3) When partitioning is used, data handling is more complex: for instance, you have to handle multiple RDB/AOF files, and to make a backup of your data you need to aggregate the persistence files/snapshots from multiple EC2 Redis slaves.
Point 4) Architecting a partitioned + replicated ElastiCache Redis tier is not complex. What is more complex is supporting transparent rebalancing of data, with the ability to add and remove nodes at runtime. Systems like client-side partitioning and proxies don't support this feature. However, a technique called presharding helps in this regard, with limitations: since Redis is lightweight, you can start with a lot of EC2 instances from the beginning itself. For example, if you start with 32 or 64 EC2 instances (micro or small cache node instance types) as your node capacity, it provides enough room to keep scaling capacity as your data storage needs increase. It is not a highly recommended technique, but it can still be used in production if your growth pattern is very predictable.

Future of highly scalable + available Redis tiers -> Redis Cluster

Redis Cluster is the preferred way to get automatic sharding and high availability. It is currently not production-ready. Once Redis Cluster and its clients are available on Amazon ElastiCache, it will be the de facto standard for Redis partitioning. It uses a mix of query routing and client-side partitioning.

References:
http://redis.io/documentation
https://github.com/twitter/twemproxy

This article was co-authored with Senthil.

8KMiles Cloud Connect 2013 - Lock, Stock and X Smoking EC2's

This slide deck was presented at Cloud Connect 2013. "Lock, Stock and X Smoking EC2's" was inspired by Guy Ritchie movies. It describes how we put Amazon EMR + Spot EC2 instances to use for a customer and achieved cost savings while solving a Big Data problem.

8KMiles Cloud Connect 2012: How Enterprises are leveraging Mobile Cloud Computing

This slide deck was presented by Harish Ganesan at Cloud Connect 2012. Mobile app development is big business, and everyone from graduate students to large corporations is making huge investments. The key to good app development is engagement and architecture. One of the ways to keep users engaged is to keep data fresh at all times, which requires a strong mobile backend that is both scalable and always on; this requires cloud. Join Harish Ganesan as he talks about how enterprises are leveraging cloud for mobile applications to provide dynamic, feature-rich applications without breaking the bank. This session will be beneficial for enterprise product managers, technology and innovation leaders, mobile app architects and anyone interested in understanding how cloud computing can deliver unique experiences to end users with minimal cost and time investment. We will see how to architect a mobile cloud application for an enterprise in a case-based approach: what the characteristics of such an application are, what unique challenges and intricacies the enterprise brings to the table for mobile cloud architectures, what best practices need to be adopted, and how we can solve these using AWS or other clouds.

Architecting your Mobile Application for the Cloud

This presentation talks about how to architect mobile applications in the cloud. Starting with the simplest case, it walks you through to an enterprise mobile app running on AWS with integrations to external data centers, enjoying the benefits of cloud computing such as scalability and high availability.