27 Best Practice Tips on Amazon Web Services Security Groups

AWS Security Groups are one of the most used, and most abused, configurations in an AWS environment, especially once you have been operating in the cloud for a while. Because security groups are simple to configure, users often underestimate their importance and do not follow the best practices relating to them. In reality, operating security groups day to day is far more intensive and complex than configuring them once, and hardly anybody talks about it. So in this article, I am going to share our experience of dealing with AWS Security Groups since 2008 as a set of best practice pointers, covering both configuration and day-to-day operations.
In the world of security, proactive and reactive speed determines the winner, so many of these best practices should in reality be automated. In case your organization's Dev/Ops/DevOps teams need help with security group best practice automation, feel free to contact me.

AWS has released so many security-related features in the last few years that it no longer makes sense to look at security groups in isolation. A security group should always be seen in the overall security context, and with that I start the pointers.

Practice 1: Enable AWS VPC Flow Logs at the VPC, subnet or ENI level. VPC Flow Logs can be configured to capture both accept and reject entries flowing through the ENIs and security groups of EC2 instances, ELBs and several other services. These flow log entries can be scanned to detect attack patterns, alert on abnormal activities and information flows inside the VPC, and provide valuable insights to the SOC/MS team operations.
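If your team scripts these checks in Python (one of the tools we list in Practice 27), enabling flow logs can be as small as the boto3 sketch below. The VPC ID, log group name and IAM role ARN are placeholders you must replace, and the sketch assumes the CloudWatch Logs group and delivery role already exist:

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Capture both ACCEPT and REJECT traffic for the whole VPC and
# deliver it to a CloudWatch Logs group for the SOC/MS team to scan.
response = ec2.create_flow_logs(
    ResourceIds=['vpc-0123456789abcdef0'],      # placeholder VPC ID
    ResourceType='VPC',
    TrafficType='ALL',                          # ACCEPT + REJECT entries
    LogGroupName='vpc-flow-logs',               # existing CloudWatch Logs group
    DeliverLogsPermissionArn='arn:aws:iam::111122223333:role/flow-logs-role'
)
print(response['FlowLogIds'])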

Practice 2: Use AWS Identity and Access Management (IAM) to control who in your organization has permission to create and manage security groups and network ACLs (NACLs). Isolate responsibilities and roles for better defense. For example, give only your network administrators or security admins permission to manage security groups and restrict other roles.

Practice 3: Enable AWS CloudTrail logs for your account. CloudTrail logs all security group events and is needed for the management and operations of security groups. Event streams can be created from the CloudTrail logs and processed using AWS Lambda. For example, whenever a security group is deleted, this event is captured with details in the CloudTrail logs. An AWS Lambda function can be triggered to process this security group change and alert the MS/SOC team on a dashboard or by email, as per your workflow. This is a very powerful way of reacting to events within a span of <7 minutes. Alternatively, you can process the CloudTrail logs stored in your S3 bucket as a batch at every X frequency and achieve the same, but the operations team's reaction time will then vary depending on the generation and polling frequency of the CloudTrail logs. This activity is a must for your operations team.
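As an illustration of the Lambda route, here is a minimal handler sketch in Python that could be wired to a CloudWatch Events/EventBridge rule matching security group API calls recorded by CloudTrail. The SNS topic ARN is a placeholder, and the event fields follow the standard "AWS API Call via CloudTrail" event format:

import json
import boto3

sns = boto3.client('sns')
ALERT_TOPIC = 'arn:aws:sns:us-east-1:111122223333:sg-change-alerts'  # placeholder

def handler(event, context):
    # CloudWatch Events delivers the CloudTrail record under 'detail'.
    detail = event.get('detail', {})
    event_name = detail.get('eventName')                      # e.g. DeleteSecurityGroup
    user = detail.get('userIdentity', {}).get('arn', 'unknown')
    params = detail.get('requestParameters', {})

    if event_name in ('DeleteSecurityGroup',
                      'AuthorizeSecurityGroupIngress',
                      'RevokeSecurityGroupIngress'):
        sns.publish(
            TopicArn=ALERT_TOPIC,
            Subject='Security group change: %s' % event_name,
            Message=json.dumps({'user': user, 'request': params}, default=str)
        )
    return 'ok'
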
Practice 4: Enable AWS Config for your AWS account. AWS Config records all configuration changes related to your security groups and can even send notification emails (via SNS).

Practice 5: Have proper naming conventions for your Amazon Web Services security groups. The naming convention should follow enterprise standards. For example, it can follow the notation: “AWS Region + Environment Code + OS Type + Tier + Application Code”
Security Group Name – EU-P-LWA001
AWS Region (2 chars) = EU, VA, CA etc.
Environment Code (1 char) = P-Production, Q-QA, T-Testing, D-Development etc.
OS Type (1 char) = L-Linux, W-Windows etc.
Tier (1 char) = W-Web, A-App, C-Cache, D-DB etc.
Application Code (4 chars) = A001
We have been using Amazon Web Services since 2008 and have found over the years that managing security groups across multiple environments is itself a huge task. Proper naming conventions from the beginning are a simple practice, but they will make your AWS journey manageable.

Practice 6: For defense in depth, make sure your Amazon Web Services security group naming convention is not self-explanatory, and make sure your naming standards stay internal. Example: an AWS security group named UbuntuWebCRMProd tells hackers it is a production CRM web tier running on the Ubuntu OS. Have an automated program periodically scan your AWS security group assets with regex patterns to detect information-revealing names and alert the SOC/managed services teams.

Practice 7: Periodically detect, alert on, or delete AWS security groups that do not strictly follow the organization's naming standards. Have an automated program doing this as part of your SOC/managed services operations (see the sketch below). Once this stricter control is implemented, things will fall in line automatically.
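A simple scanner covering Practices 6 and 7 could look like the sketch below; the compliance regex implements the example convention from Practice 5, and the list of "revealing" words is only illustrative:

import re
import boto3

# Example convention from Practice 5: Region(2)-Env(1)-OS+Tier+AppCode
NAME_PATTERN = re.compile(r'^[A-Z]{2}-[PQTD]-[LW][WACD]\w{4}$')
REVEALING = re.compile(r'(prod|crm|db|oracle|mysql|bank|pay)', re.IGNORECASE)

ec2 = boto3.client('ec2')
for sg in ec2.describe_security_groups()['SecurityGroups']:
    name = sg['GroupName']
    if not NAME_PATTERN.match(name):
        print('Non-compliant SG name: %s (%s)' % (name, sg['GroupId']))
    if REVEALING.search(name):
        print('Information-revealing SG name: %s (%s)' % (name, sg['GroupId']))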

Practice 8: Have automation in place to detect all EC2, ELB and other AWS assets associated with security groups. This automation helps you periodically detect Amazon Web Services security groups lying idle with no associations, alert the MS team and clean them up. Unwanted security groups accumulated over time only create confusion.
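One way to sketch this detection is to compare all groups in the account against the groups actually attached to network interfaces (EC2, ELB, RDS, ElastiCache and others all attach ENIs). A minimal boto3 example, assuming a single region and with pagination omitted for brevity:

import boto3

ec2 = boto3.client('ec2')

all_groups = {sg['GroupId']: sg['GroupName']
              for sg in ec2.describe_security_groups()['SecurityGroups']}

# Collect every group referenced by a network interface (pagination omitted).
in_use = set()
for eni in ec2.describe_network_interfaces()['NetworkInterfaces']:
    for group in eni['Groups']:
        in_use.add(group['GroupId'])

for group_id, name in all_groups.items():
    if group_id not in in_use:
        print('Unused security group: %s (%s)' % (group_id, name))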

Practice 9: In your AWS account, when you create a VPC, AWS automatically creates a default security group for that VPC. If you don't specify a different security group when you launch an instance, the instance is automatically associated with this default security group. It allows inbound traffic only from other instances associated with the “default” security group and allows all outbound traffic from the instance. The default security group specifies itself as a source security group in its inbound rules; this is what allows instances associated with the default security group to communicate with each other. This is not a good security practice. If you don't want all your instances to use the default security group, create your own security groups and specify them when you launch your instances. This applies to EC2, RDS, ElastiCache and some other AWS services. So detect usage of “default” security groups periodically and alert the SOC/MS.

Practice 10: Alerts by email and on the cloud management dashboard should be triggered whenever critical security groups or rules are added/modified/deleted in production. This is important for the reactive action of your managed services/security operations team and for audit purposes.

Practice 11: When you associate multiple security groups with an Amazon EC2 instance, the rules from each security group are effectively aggregated to create one set of rules. AWS uses this set of rules to determine whether to allow access or not. If there is more than one rule for a specific port, AWS applies the most permissive rule. For example, if you have one rule that allows access to TCP port 22 (SSH) from IP address 203.0.113.10 and another rule that allows access to TCP port 22 from everyone, then everyone has access to TCP port 22, because the most permissive rule takes precedence.
Practice 11.1: Have automated programs detect EC2 instances associated with multiple security groups/rules and alert the SOC/MS periodically. Condense these manually to 1-3 rules at most as part of your operations (see the sketch below).
Practice 11.2: Have automated programs detect conflicting security groups/rules, such as restrictive and permissive rules together, and alert the SOC/MS periodically.
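A quick check for Practices 11.1/11.2 is to list instances carrying more than a handful of security groups so the operations team can condense them; a boto3 sketch, where the threshold of 3 simply mirrors the number suggested above:

import boto3

MAX_GROUPS = 3
ec2 = boto3.client('ec2')

for reservation in ec2.describe_instances()['Reservations']:
    for instance in reservation['Instances']:
        groups = instance.get('SecurityGroups', [])
        if len(groups) > MAX_GROUPS:
            print('%s has %d security groups: %s' % (
                instance['InstanceId'], len(groups),
                [g['GroupName'] for g in groups]))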

Practice 12: Do not create overly permissive (least restrictive) security group rules like 0.0.0.0/0, which are open to everyone.
Since web servers must receive HTTP and HTTPS traffic from the Internet, only their security groups may carry permissive rules like:
0.0.0.0/0, TCP, 80, Allow inbound HTTP access from anywhere
0.0.0.0/0, TCP, 443, Allow inbound HTTPS access from anywhere
Any other least-restrictive security group created in your account should be alerted to the SOC/MS teams immediately.
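Detection of overly permissive rules is easy to automate. The sketch below flags any ingress rule open to 0.0.0.0/0 on a port other than 80 or 443; adjust the allowed list to your own policy:

import boto3

ALLOWED_OPEN_PORTS = {80, 443}   # web tiers only, per this practice
ec2 = boto3.client('ec2')

for sg in ec2.describe_security_groups()['SecurityGroups']:
    for rule in sg['IpPermissions']:
        open_to_world = any(r.get('CidrIp') == '0.0.0.0/0'
                            for r in rule.get('IpRanges', []))
        port = rule.get('FromPort')   # None for "all traffic" rules
        if open_to_world and port not in ALLOWED_OPEN_PORTS:
            print('Permissive rule in %s (%s): port %s open to 0.0.0.0/0'
                  % (sg['GroupId'], sg['GroupName'], port))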

Practice 13: Have a security policy not to launch servers on default ports like 3306, 1630, 1433, 11211, 6379 etc. If this policy is adopted, then security groups also have to be created for the new, non-default listening ports instead of the default ports. This provides a small additional layer of defense, since one cannot infer from the security group port which service the EC2 instance is running. Automated detection and alerts should be created for the SOC/MS if security groups are created with default ports.

Practice 14: Applications that must meet stricter compliance requirements like HIPAA, PCI etc. need end-to-end transport encryption implemented on the server back end in AWS. The communication from the ELB to the Web -> App -> DB -> other tiers needs to be encrypted using SSL or HTTPS. This means only secure ports like 443, 465 and 22 should be permitted in the corresponding EC2 security groups. Automated detection and alerts should be created for the SOC/MS if security groups are created with non-secure ports for regulated applications.

Practice 15: Detection, alerting and action can be driven by parsing the AWS CloudTrail logs, based on the usual patterns observed in your production environment.
Example:
15.1: A port that was opened and closed within <30 (or X) minutes in production can be a candidate for suspicious activity, if that is not a normal pattern for your production environment.
15.2: A permissive security group that was created and removed within <30 (or X) minutes can be a candidate for suspicious activity, if that is not a normal pattern for your production environment.
Detect anomalies in how long a security group change stayed in effect before being reverted in production.

Practice 16: In case ports have to be opened in Amazon Web Services security groups or a permissive AWS security group needs to be applied, Automate this entire process as part of your operations such that a security group is open for X agreed minutes and will be automatically closed aligning with your change management. Reducing manual intervention avoids operational errors and adds security.

Practice 17: Make sure SSH/RDP access is open in AWS security groups only from the jump box/bastion hosts for your VPC/subnets. Have stricter controls/policies that avoid opening SSH/RDP to other instances of the production environment. Periodically check for, alert on and close this loophole as part of your operations.

Practice 18: It is a bad practice to have SSH open to the entire Internet for emergency or remote support. By allowing the entire Internet access to your SSH port there is nothing stopping an attacker from exploiting your EC2 instance. The best practice is to allow only very specific IP addresses in your security groups; this restriction improves the protection. These could be the addresses of your office, on-premise network or DC through which you connect to your jump box.
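The check for Practices 17 and 18 can be automated by flagging any SSH/RDP rule whose source is not the bastion security group or your office CIDR; both values below are placeholders for your environment:

import boto3

BASTION_SG = 'sg-0bastion000000000'   # placeholder bastion/jump box SG
OFFICE_CIDR = '198.51.100.0/24'       # placeholder office/DC range
ADMIN_PORTS = {22, 3389}

ec2 = boto3.client('ec2')
for sg in ec2.describe_security_groups()['SecurityGroups']:
    for rule in sg['IpPermissions']:
        if rule.get('FromPort') not in ADMIN_PORTS:
            continue
        bad_cidrs = [r['CidrIp'] for r in rule.get('IpRanges', [])
                     if r['CidrIp'] != OFFICE_CIDR]
        bad_sgs = [p['GroupId'] for p in rule.get('UserIdGroupPairs', [])
                   if p['GroupId'] != BASTION_SG]
        if bad_cidrs or bad_sgs:
            print('SSH/RDP exposed in %s (%s): %s' %
                  (sg['GroupId'], sg['GroupName'], bad_cidrs + bad_sgs))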

Practice 19: Too many or too few: how many security groups are preferred for a typical multi-tiered web app is a frequently asked question.
Option 1: One security group cutting across multiple tiers is easy to configure, but it is not recommended for secure production applications.
Option 2: One security group for every instance is too much protection and tough to manage operationally over the longer term.
Option 3: An individual security group for each tier of the application. For example, have separate security groups for the ELB, Web, App, DB and Cache tiers of your application stack.
Periodically check whether Option 1 type security groups are being created in your production environment and alert the SOC/MS.

Practice 20: Avoid allowing UDP or ICMP to private instances in security groups unless specifically needed.

Practice 21: Open only specific ports; opening a range of ports in a security group is not a good practice. You can add many inbound/ingress rules to a security group, and while opening ports it is always advised to open specific ports like 80, 443, etc. rather than a range of ports like 200-300.
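Port ranges are easy to spot programmatically, since a range shows up as a rule whose FromPort and ToPort differ; a short boto3 sketch:

import boto3

ec2 = boto3.client('ec2')
for sg in ec2.describe_security_groups()['SecurityGroups']:
    for rule in sg['IpPermissions']:
        from_port, to_port = rule.get('FromPort'), rule.get('ToPort')
        if from_port is not None and to_port is not None and from_port != to_port:
            print('Port range %s-%s open in %s (%s)' %
                  (from_port, to_port, sg['GroupId'], sg['GroupName']))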

 


Practice 22: Private subnet instances can be accessed only from within the VPC CIDR IP range. Opening such instances to public IP ranges is possible, but it does not make any sense. For example, opening HTTP to 0.0.0.0/0 in the security group of a private subnet instance does not make any sense. So detect and clean up such rules.

 

Practice 23: AWS CloudTrail logs capture security-related events. AWS Lambda functions or automated programs should trigger alerts to operations when abnormal activities are detected. For example:
23.1: Alert when X number of security groups are added/deleted at “Y” hours or day by an IAM user/account.
23.2: Alert when X number of security group rules are added/deleted at “Y” hours or day by an IAM user/account.
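The per-user/per-hour counting can be done directly against the CloudTrail API; the sketch below counts security group rule additions in the last hour per IAM user, with the threshold X left as a placeholder and result pagination omitted for brevity:

from collections import Counter
from datetime import datetime, timedelta
import boto3

THRESHOLD_X = 10   # placeholder: alert above this many changes per hour
cloudtrail = boto3.client('cloudtrail')

end = datetime.utcnow()
start = end - timedelta(hours=1)
events = cloudtrail.lookup_events(
    LookupAttributes=[{'AttributeKey': 'EventName',
                       'AttributeValue': 'AuthorizeSecurityGroupIngress'}],
    StartTime=start, EndTime=end)['Events']

per_user = Counter(e.get('Username', 'unknown') for e in events)
for user, count in per_user.items():
    if count > THRESHOLD_X:
        print('ALERT: %s made %d SG rule changes in the last hour' % (user, count))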

Practice 24: In case you are an enterprise, make sure all security-group-related activities in your production environment are part of your change management process. Security group actions can be manual or automated along with your change management in an enterprise.
In case you are an agile startup or SMB and do not have a complicated change management process, then automate most of the security-group-related tasks and events as illustrated in the various best practices above. This will bring immense efficiency into your operations.

Practice 25: Use outbound/egress security group rules wherever applicable within your VPC. For example, restrict FTP connections to any server on the Internet from your VPC. This way you can avoid data dumps and important files being transferred out of your VPC. Defend harder and make it tougher!
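As an illustration of locking down egress, the sketch below removes the default allow-all outbound rule from a group and permits only HTTPS out, so FTP and arbitrary uploads from that tier are blocked at the security group level; the group ID is a placeholder:

import boto3

ec2 = boto3.client('ec2')
GROUP_ID = 'sg-0123456789abcdef0'   # placeholder security group

# Remove the default 'all traffic to 0.0.0.0/0' egress rule.
ec2.revoke_security_group_egress(
    GroupId=GROUP_ID,
    IpPermissions=[{'IpProtocol': '-1',
                    'IpRanges': [{'CidrIp': '0.0.0.0/0'}]}])

# Allow only HTTPS outbound; FTP (21) and everything else is now denied.
ec2.authorize_security_group_egress(
    GroupId=GROUP_ID,
    IpPermissions=[{'IpProtocol': 'tcp', 'FromPort': 443, 'ToPort': 443,
                    'IpRanges': [{'CidrIp': '0.0.0.0/0'}]}])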

Practice 26: For some tiers of your application, use an ELB in front of your instances as a security proxy with restrictive security groups (restrictive ports and IP ranges). This doubles your defense, but increases latency.

Practice 27: Some of the tools we use in conjunction to automate and meet the above best practices are ServiceNow, AWS CloudFormation templates, AWS APIs, Rundeck, Puppet, Chef, and automated programs written in Python, .NET and Java.

Note: In case your organization's Dev/Ops/DevOps teams need help with automating the security group best practices listed above, feel free to contact me at harish11g.aws@gmail.com

 


 

About the Author

Harish Ganesan is the Chief Technology Officer (CTO) of 8K Miles and is responsible for the overall technology direction of the 8K Miles products and services. He has around two decades of experience in architecting and developing Cloud Computing, E-commerce and Mobile application systems. He has also built large internet banking solutions that catered to the needs of millions of users, where security and authentication were critical factors. He is also a prolific blogger and frequent speaker at popular cloud conferences.

 

Apache Solr to Amazon CloudSearch Migration Tool

In this post, we are introducing a new tool called S2C – the Apache Solr to Amazon CloudSearch Migration Tool. S2C is a Linux console-based utility that helps developers/engineers migrate a search index from Apache Solr to Amazon CloudSearch.

Very often customers initially build search for their website or application on top of Solr, but later run into challenges like elastic scaling and managing the Solr servers. This is a typical scenario we have observed in our years of search implementation experience. For such use cases, Amazon CloudSearch is a good choice. Amazon CloudSearch is a fully-managed service in the cloud that makes it easy to set up, manage, and scale a search solution for your website. To know more, please read the Amazon CloudSearch documentation.

We are seeing a growing trend every year of organizations of various sizes migrating their workloads to Amazon CloudSearch and leveraging the benefits of a fully managed service. For example, Measured Search, an analytics and e-commerce platform vendor, found it easier to migrate to Amazon CloudSearch rather than scale Solr themselves (see article for details).

Since Amazon CloudSearch is built on top of Solr, it exposes all the key features of Solr while providing the benefits of a fully managed service in the cloud such as auto-scaling, self-healing clusters, high availability, data durability, security and monitoring.

In this post, we provide step-by-step instructions on how to use the Apache Solr to Amazon CloudSearch Migration (S2C) tool to migrate from Apache Solr to Amazon CloudSearch.

Before we get into detail, you can download the S2C tool in the below link.
Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

Pre-Requisites

Before starting the migration, the following pre-requisites have to be met. The pre-requisites include installations and configuration on the migration server. The migration server can be the Solr server itself, or an independent server that sits between your Solr server and your Amazon CloudSearch instance.

Note: We recommend running the migration from the Solr server instead of an independent server, as this saves time and bandwidth. It is even better if the Solr server is hosted on EC2, as the latency between EC2 and CloudSearch is relatively low.

The following installations and configuration should be done on the migration server (i.e. your Solr server or any new independent server that connects between your Solr machine and Amazon CloudSearch).

  1. The application is developed using Java. Download and install Java 8. Validate the JDK path and ensure the environment variables like JAVA_HOME, classpath and path are set correctly.
  2. We assume you have already set up an Amazon Web Services IAM account. Please ensure the IAM user has the right permissions to access AWS services like CloudSearch.
    Note: If you do not have an AWS IAM user with the above-mentioned permissions, you cannot proceed further.
  3. The IAM user should have an AWS Access Key and Secret Key. On the application hosting server, set up the Amazon environment variables for the access key and secret key. It is important that the tool is run using these AWS environment variables.
    To set up the AWS environment variables, please read the links below:
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/credentials.html
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-roles.html
    Alternatively, you can set the following AWS environment variables by running the commands below from a Linux console.
    export AWS_ACCESS_KEY=<Access Key>
    export AWS_SECRET_KEY=<Secret Key>
  4. Note: This step is applicable only if the migration server is hosted on Amazon EC2.
    If you do not have an AWS Access Key and Secret Key, you can opt for an IAM role attached to an EC2 instance. A new IAM role can be created and attached to the EC2 instance during launch. The IAM role should have access to Amazon CloudSearch.
    For more information, read the link below:
    http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
  5. Download the migration utility ‘S2C’ (you may have completed this step earlier), unzip the tool and copy it into your working directory.
    Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

S2C Utility File
The downloaded ‘S2C’ migration utility should have the following sub directories and files.

Folder / File – Description
bin – Binaries of the migration tool
lib – Libraries required for migration
application.conf – Configuration file that allows end users to input parameters. Requires end-user input.
logback.xml – Log file configuration. Optional; does not require end-user / developer input.
s2c – Script file that executes the migration process

Configure only application.conf and logback.xml. Do not modify any other files.
application.conf: This file holds the configuration related to the new Amazon CloudSearch domain that will be created. The parameters configured in the application.conf file are explained below.

s2c {
  api {
    SchemaParser = "s2c.impl.solr.DefaultSchemaParser"
    SchemaConverter = "s2c.impl.cs.DefaultSchemaConverter"
    DataFetcher = "s2c.impl.solr.DefaultDataFetcher"
    DataPusher = "s2c.impl.cs.DefaultDataPusher"
  }
List of APIs that are executed step by step during the migration. Do not change this.

  solr {
    dir = "files"
    server-url = "http://localhost:8983/solr/collection1"
    fetch-limit = 100
  }
dir – The base directory path of Solr. Ensure the directory is present and valid. E.g. /opt/solr/example/solr/collection1/conf
server-url – Server host, port and collection path. This is the endpoint used to fetch the data. If the utility is run from a different server, ensure the IP address and port have firewall access.
fetch-limit – The number of Solr documents fetched in each batch call. This number should be set carefully by the developer. The fetch limit depends on the following factors:

  1. Record size of a Solr record (1 KB or 2 KB)
  2. Latency between the migration server and Amazon CloudSearch
  3. Current request load on the Solr server

E.g.: If the total number of Solr documents is 100,000 and the fetch limit is 100, it takes 100,000 / 100 = 1,000 batch calls to complete the fetch. If the size of each Solr record is 2 KB, then 100,000 * 2 KB = 200 MB of data is migrated.

  cs {
    domain = "collection1"
    region = "us-east-1"
    instance-type = "search.m3.xlarge"
    partition-count = 1
    replication-count = 1
  }
domain – CloudSearch domain name. Ensure that the domain name does not already exist.
region – AWS region for the new CloudSearch domain.
instance-type – Desired instance type for the CloudSearch nodes. Choose the instance type based on the volume of data and the expected query volume.
partition-count – Number of partitions required for CloudSearch.
replication-count – Replication count for CloudSearch.

  wd = "/tmp"
}
wd – Temporary file path to store intermediate data files and migration log files.

Running the migration

Before launching the S2C migration tool, verify the following:

    • Solr directory path – Make sure that the Solr directory path is valid and available. The tool cannot read the configuration if the path or directory is invalid.
    • Solr configuration contents – Validate that the Solr configuration contents are correctly set inside the directory. Example: solrconfig.xml, schema.xml, stopwords.txt, etc.
    • Make sure that the working directory is present in the file system and has write permissions for the current user. It can be an existing directory or a new directory. The working directory stores the fetched data from Solr and migration logs.
    • Validate the disk size before starting the migration. If the available free disk space is less than the size of the Solr index, the fetch operation will fail.

For example, if the Solr index size is 7 GB, make sure that the disk has at least 10 GB to 20 GB of free space.
Note: The tool reads the data from Solr and stores in a temporary directory (Please see configuration wd = /tmp in the above table).

  • Verify that the AWS environment variables are set correctly. The AWS environment variables are mentioned in the pre-requisites section above.
  • Validate the firewall rules for IP address and ports if the migration tool is run from a different server or instance. Example: Solr default port 8983 should be opened to the EC2 instance executing this tool.

Run the following command from the directory ‘{S2C filepath}’
Example: /build/install/s2c-cli

./s2c
or, to set the heap size:
JVM_OPTS="-Xms2048m -Xmx2048m" ./s2c

The above will invoke the shell ‘s2c’ script that starts the search migration process. The migration process is a series of steps that require user inputs as shown in the screen shots below.
Step 1: Parse the Solr schema. The first step of the migration prompts for confirmation to parse the Solr schema and Solr configuration file. During this step, the application generates a ‘Run Id’ folder inside the working directory.
  Example: /tmp/s2c/m1416220194655

The Run Id is a unique identifier for each migration. Note down the Run Id as you will need it to resume the migration in case of any failures.

Step 2: Schema conversion from Solr to CloudSearch. The second step prompts for confirmation to convert the Solr schema to a CloudSearch schema. Press any key to proceed further.

The second step also lists all the converted fields that are ready to be migrated from Solr to CloudSearch. If any fields are left out, this step allows you to correct the original schema: you can abort the migration, identify the ignored fields, rectify the schema and re-run the migration. The screenshot below shows the fields ready for CloudSearch migration.


Step 3: Data fetch. The third step prompts for confirmation to fetch the search index data from the Solr server. Press any key to proceed. This step generates a temporary file which is stored in the working directory. This temporary file holds all the documents fetched from the Solr index.


There is also an option to skip the fetch process if all the Solr data is already stored in the temporary file. In that case, the prompt will look like the screenshot below.

Step 4: Data push to CloudSearch. The last and final step prompts for confirmation to push the search data from the temporary file store to Amazon CloudSearch. This step also creates the CloudSearch domain with the configuration specified in application.conf, including the desired instance type, replication count and multi-AZ options.

If the domain is already created, the utility will prompt to use the existing domain. If you do not wish to use an existing domain, you can create a new CloudSearch domain using the same prompt.
Note: The console does not prompt for any ‘CloudSearch domain name’ but instead it uses the domain name configured in the application.conf file.

Step 5: Resume (optional). If there is any failure during the fetch operation, the migration can be resumed. This is illustrated in the screenshot below.

Step 6: Verification. Log in to the AWS CloudSearch management console to verify the domain and index fields.

Amazon CloudSearch allows running test queries to validate the migration and as well the functionality of your application.
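If you prefer to script this verification, here is a minimal boto3 sketch, assuming Python is available on the migration server and the domain name 'collection1' configured in application.conf:

import boto3

cs = boto3.client('cloudsearch', region_name='us-east-1')

# Check that the migrated domain exists and note its endpoints.
status = cs.describe_domains(DomainNames=['collection1'])['DomainStatusList'][0]
print('Processing:', status['Processing'])
print('Search endpoint:', status.get('SearchService', {}).get('Endpoint'))
print('Doc endpoint:', status.get('DocService', {}).get('Endpoint'))

# List the index fields created during migration.
fields = cs.describe_index_fields(DomainName='collection1')['IndexFields']
print('Index fields:', [f['Options']['IndexFieldName'] for f in fields])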

Features and limitations

  • Support for non-Linux environments is not available at this time.
  • Support for Solr shards is not available at this time. Each Solr shard needs to be migrated separately.
  • The install commands may vary across Linux flavors. For example, the commands for installing software, editing files and setting permissions can differ for each Linux flavor. It is left to the engineering team to choose the right commands during the installation and execution of this migration tool.
  • Only fields configured as ‘stored’ in the Solr schema.xml are supported. Non-stored fields are ignored during schema parsing.
  • The document id (unique key) is required to have the following attributes:
    1. The document ID should be 128 characters or less in size.
    2. The document ID can contain any letter, any number, and any of the following characters: _ - = # ; : / ? @ &
    3. The link below will help you understand the data preparation needed before migrating to CloudSearch: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/preparing-data.html
  • If these conditions are not met by a document, it is skipped during migration. Skipped records are shown in the log file.
  • If a field type (mapped to fields) is not stored, the stopwords mapped to that particular field type are ignored.

Example 1:

<field name="description" type="text_general" indexed="true" stored="true" />

Note: The above field ‘description’ will be considered for stopwords.

Example 2:

<field name="fileName" type="string" />

Note: The above field ‘fileName’ will not be migrated and is ignored for stopwords.

Please do write your feedback and suggestions in the comments section below to help improve this tool. The source code of the tool can be downloaded at https://github.com/8KMiles/s2c/. We have written a follow-up post in that regard.

About the Authors
 Dhamodharan P is a Senior Cloud Architect at 8KMiles.

 

 

 

 Dwarakanath R is a Principal Architect at 8KMiles.