5 Considerations you need to know before investing in Big Data Analytics

Companies across many industries collaborate with data analytics firms to increase operational competence and make better business decisions. When big data is handled properly, it can transform a business. Yet although data analysis is a powerful tool, most companies are not ready to adopt analytics software as a practical resource. Purchasing and deploying data analytics isn’t as simple as buying off-the-shelf software; there are many things to consider before a company invests in analytics software.

You should know exactly where your company stands in terms of its analysis capabilities, and consider the following points before investing in big data analytics.

What do you want to find out from your data?

You should know what you will use your analytics software for before investing in it. If you don’t know which business problem you need to solve, then collecting data and setting up an analysis system isn’t productive. So check the areas of your company where the current process is not effective, and work out the different questions you need answered prior to investing in a solution, so you can choose an appropriate analytics partner.

Do you have enough data to work with?

You need significant and reliable data to perform data analytics. Therefore, check whether your company has enough data, or workable information, to perform analysis. You should also determine whether the company can afford, and has the ability, to collect such information. This process can become expensive considering the labor cost, the hours spent categorizing the information and the data storage, so it is also necessary to consider data aggregation and storage costs before moving forward.

Do you have the capital to invest in analytics software?

The price range for analytics software varies depending on a company’s needs. A few software vendors offer data warehousing, which can be ideal for companies that require data storage as well as analytics and have a large budget. Other vendors offer visualization systems, in both SaaS and on-premise form. As visualization comes in varied price ranges, your company should be able to find a solution that fits your budget.

Besides the software cost, you should estimate the cost of effort and services, which can run to five times the software price. The investment changes depending on the size and depth of the project, but it’s necessary to completely understand the costs involved in data analytics before investing.

Do you have the resources to work with your data?

Many analytics systems are largely automated, but you still need user interaction and management. It is necessary to have a data engineer for constant data ingestion, organization and provisioning of data marts for the data analysts and data scientists, who in turn continue to work on new insights and predictions by updating the data-processing rules and algorithms/models as business needs change. Having a designated owner for analytical decisions also avoids confusion; that person should be able to allot time and materials for scrutinizing the data and producing reports.

Are you ready to take action?

At the final stage, you will have collected data, identified the problem, invested in the software and performed the analysis; but to make all of it worthwhile, you have to be ready to act immediately and efficiently. With the newly discovered insights, you have the information required to change your organization’s practices. Executing a new project can be expensive, so it’s essential to have the resources necessary for implementing the change ready.

Data analytics can be a powerful tool to improve a company’s efficiency. So remember to consider these five factors before investing in big data analytics.

Powershell: Automating AWS Security Groups

To provision and manage EC2 instances in the AWS cloud in a way that complies with industry standards and regulations, the individuals administering them should understand the security mechanisms within the AWS framework, both those that are automatic and those that require configuration. Let’s take a look at Security Groups, which fall under the latter category.

As there is no “absolute” Security Group that can be plugged in to satisfy every need, we should always be prepared to modify them. Automating this via PowerShell provides predictable, consistent results.

What Is a Security Group?

Every VM created through the AWS Management Console (or via scripts) can be associated with one or multiple Security Groups (in the case of a VPC, up to 5). By default, all inbound traffic to the instance is blocked. We should automate the infrastructure to open only the ports that satisfy the customer’s needs, which means adding ingress/egress rules to each Security Group as per the customer requirement. For more details, have a look at the AWS Security Groups documentation.

It is important to allow traffic only from valid source IP addresses; this substantially prunes the attack surface, whereas using 0.0.0.0/0 as the IP range leaves the infrastructure vulnerable to sniffing or tampering. Traffic between VMs should always traverse Security Groups; we can achieve this by allowing the initiator’s Security Group ID as the source.


Automation Script


I have kept this as a single block; if you wish, you can turn it into a function. A few things worth considering:

  • Execution of this script will only succeed given a working pair of Secret Key & Access Key
  • This script makes use of the filtering functionality: the end user provides a name pattern, and selection of the Security Group is driven by that pattern
  • To facilitate the whole operation you have to provide certain parameters, i.e. [IpProtocol, FromPort, ToPort, Source]
  • The Source parameter can be interpreted in two ways: you can either provide IpRanges in CIDR block format or choose another Security Group as the source in the form of a UserIdGroupPair

<#
.SYNOPSIS
Simple script to safely assign/revoke ingress rules on a VPC Security Group.

.DESCRIPTION
The script first checks which rules have been specified for update; if a rule is already assigned, it does no harm.
If the assignment is successful, it can be verified in the AWS console.

NOTE: The script must be updated with a proper group-name pattern and security credentials.
#>

# Update the following lines, as needed:

Param(
    [string]$AccessKeyID="**********",
    [string]$SecretAccessKeyID="********",
    [string]$Region="us-east-1",
    [string]$GrpNamePattern="*vpc-sg-pup_winC*",
    [string]$GroupId="sg-xxxxxxxx",
    [string]$CidrIp="0.0.0.0/0",
    [switch]$SetAws=$true,
    [switch]$Revoke,
    [switch]$Rdp=$true,
    [switch]$MsSql=$true
)

$InfoObject = New-Object PSObject -Property @{
    AccessKey      = $AccessKeyID
    SecretKey      = $SecretAccessKeyID
    Region         = $Region
    GrpNamePattern = $GrpNamePattern
    GroupId        = $GroupId
    CidrIp         = $CidrIp
}

if($SetAws)
{
    Set-AWSCredentials -AccessKey $InfoObject.AccessKey -SecretKey $InfoObject.SecretKey
    Set-DefaultAWSRegion -Region $InfoObject.Region
}

# Source security group for the RDP rule (traffic is allowed from members of this group).
$PublicGroup = New-Object Amazon.EC2.Model.UserIdGroupPair
$PublicGroup.GroupId = $InfoObject.GroupId

# Select the target security group by name pattern.
$filter_platform = New-Object Amazon.EC2.Model.Filter -Property @{Name = "group-name"; Values = $InfoObject.GrpNamePattern}
$SG_Details = Get-EC2SecurityGroup -Filter $filter_platform | Select-Object GroupId, GroupName

# RDP (3389) allowed from the source security group; MS SQL (1433) allowed from the CIDR range.
$rdpPermission = New-Object Amazon.EC2.Model.IpPermission -Property @{IpProtocol="tcp";FromPort=3389;ToPort=3389;UserIdGroupPair=$PublicGroup}
$mssqlPermission = New-Object Amazon.EC2.Model.IpPermission -Property @{IpProtocol="tcp";FromPort=1433;ToPort=1433;IpRanges=$InfoObject.CidrIp}

$permissionSet = New-Object System.Collections.ArrayList

if($Rdp){ [void]$permissionSet.Add($rdpPermission) }

if($MsSql){ [void]$permissionSet.Add($mssqlPermission) }

if($permissionSet.Count -gt 0)
{
    try{
        if(!$Revoke){
            "Granting to $($SG_Details.GroupName)"
            Grant-EC2SecurityGroupIngress -GroupId $SG_Details.GroupId -IpPermissions $permissionSet
        }
        else{
            "Revoking from $($SG_Details.GroupName)"
            Revoke-EC2SecurityGroupIngress -GroupId $SG_Details.GroupId -IpPermissions $permissionSet
        }
    }
    catch{
        if($Revoke){
            Write-Warning "Could not revoke permission on $($SG_Details.GroupName)"
        }
        else{
            Write-Warning "Could not grant permission on $($SG_Details.GroupName)"
        }
    }
}

 

 

What we are looking at is the ability to automate the creation and update of Security Groups. Use this script if you run into frequently changing Security Groups.


P.S. This script has been written with VPC in mind; differences in parameter usage between VPC and EC2-Classic security groups should be taken care of.

 

Credits – Utkarsh Pandey


DevOps with Windows – Chocolatey

Conceptually, a package manager is a well understood space for anyone with even a slight understanding of how *nix environments get managed, but on Windows it was uncharted territory until recently. This is a piece of the stack that was ironically missing for so long that once you get hands-on with it, you will wonder how you ever lived without it. NuGet and Chocolatey are the two buzzwords making a lot of noise, and they are seen as the future of Windows server management.

What Is Chocolatey?

Chocolatey builds on top of the NuGet packaging format to provide package management for Microsoft Windows applications; Chocolatey is a kind of yum or apt-get, but for Windows. It is CLI based and can be used to decentralize packaging. It has a central repository located at http://chocolatey.org/.

If you have ever used the Windows built-in provider, you are probably aware of the issues it has. It doesn’t really do versioning and is a poor fit for upgrading. For any organization looking for a long-term solution that ensures the latest versions are always installed, the built-in package provider may not be the recommended option. Chocolatey takes care of all this with very little effort. In contrast to the default provider, which has no dependencies, Chocolatey requires your machine to have PowerShell 2.0 and .NET Framework 4.0 installed. Installing a package with Chocolatey is one command line that reaches out to the internet and pulls it down. Packages are versionable and upgradable; you can specify a particular version of a package and that is what gets installed.

The recommended way to install Chocolatey is by executing a PowerShell script.

Chocolatey With AWS

AWS offers Windows instances under both of its offerings: with IaaS you can launch a Windows instance as EC2, whereas with PaaS you can get one via Elastic Beanstalk.

Using Cloud Formation:

Using ‘cfn-init’, AWS CloudFormation supports downloading files and executing commands on a Windows EC2 instance. Bootstrapping a Windows instance using CloudFormation is a lot simpler than the alternatives, so we can leverage this to install Chocolatey while launching the server using a CloudFormation template. When doing this through CloudFormation we have to execute PowerShell.exe and pass the install command to it. One thing to take care of is that the Chocolatey installer and the packages it installs may modify the machine’s PATH environment variable. This adds complexity, since subsequent commands after these installations are executed in the same session, which does not have the updated PATH. To overcome this, we use a command file that sets the session’s PATH to that of the machine before it executes our command. We will create a command file ‘ewmp.cmd’ to execute a command with the machine’s PATH, and then proceed with Chocolatey and any other installation. With the sample below we install Chocolatey and then install Firefox with Chocolatey as the provider.

"AWS::CloudFormation::Init": {
    "config": {
        "files": {
            "c:/tools/ewmp.cmd": {
                "content": "@ECHO OFF\nFOR /F \"tokens=3,*\" %%a IN ('REG QUERY \"HKLM\\System\\CurrentControlSet\\Control\\Session Manager\\Environment\" /v PATH') DO PATH %%a%%b\n%*"
            }
        },
        "commands": {
            "1-install-chocolatey": {
                "command": "powershell -NoProfile -ExecutionPolicy unrestricted -Command \"Invoke-Expression ((New-Object Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))\""
            },
            "2-install-firefox": {
                "command": "c:\\tools\\ewmp choco install firefox"
            }
        }
    }
}

 

Using AWS Elastic Beanstalk:

AWS Elastic Beanstalk supports the downloading of files and execution of commands on instance creation using container customization. We can leverage this feature to install Chocolatey.

The installation above can be translated into AWS Elastic Beanstalk config files to enable the use of Chocolatey in Elastic Beanstalk. The difference when using Elastic Beanstalk is that we create YAML .config files inside the .ebextensions folder of our source bundle.

files:
  c:/tools/ewmp.cmd:
    content: |
      @ECHO OFF
      FOR /F "tokens=3,*" %%a IN ('REG QUERY "HKLM\System\CurrentControlSet\Control\Session Manager\Environment" /v PATH') DO PATH %%a%%b
      %*
commands:
  1-install-chocolatey:
    command: powershell -NoProfile -ExecutionPolicy unrestricted -Command "Invoke-Expression ((New-Object Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))"
  2-install-firefox:
    command: c:\tools\ewmp choco install firefox

 

The above works the same way as the CloudFormation sample did: it creates a command file ‘ewmp.cmd’ to execute a command with the machine’s PATH before installing Chocolatey and Firefox.

 

P.S. Chocolatey works best as a package provider for Puppet on Windows; Puppet offers great support for promoting and executing Chocolatey packages on Windows.

 

 

Credits – Utkarsh Pandey

 


Top Health IT Issues You Should Be Aware Of

Information Technology (IT) has a major role in refining the facilities of the healthcare industry, improving patient care and organizing vast quantities of health-related data. Over several years, healthcare across the country has seen remarkable growth with the help of IT, and both the public and private healthcare sectors are making use of IT to meet their new requirements and standards. Though IT is playing an important role in improving excellence in patient care, increasing efficiency and reducing cost, there are certain health IT issues you should be aware of and should fix:

Database

New databases and related tools are needed to manage huge amounts of data and improve patient care. Relational databases (such as electronic health record (EHR) systems) organize data into tables and rows and force information into predefined groups; they suit information that is easily structured, but they cannot handle unstructured data (like records, clinical notes, etc.). With non-relational databases it is easier to analyze different forms of data and avoid a rigid structure, so using a non-relational database will help manage and make proper use of the vast amount of healthcare data.

Mobile Healthcare and Data Security

With changes in financial incentives and the growth of mobile healthcare technology, patient care is shifting toward the consumer; with mobility, care can be provided easily from anywhere, at any time. Also, to reduce the money spent on health plans, additional tools are being allied with wellness and disease-management programs. However, cyber security issues are the biggest threat, as a data breach would cause huge financial loss. It is necessary to take action to prevent breaches, as this is a major issue.

With the increase in healthcare mobility, it is a must to introduce a mobile/BYOD policy that helps avoid data breaches and privacy intrusion.

Health Information Exchange (HIE)

HIE helps the sharing of healthcare data between healthcare organizations. Different concerns related to healthcare policy and standards should be analyzed before implementing such exchanges, as sensitive data is at risk.

Wireless Network and Telemedicine

Wireless networking is mandatory for employees of the healthcare industry to make medical facilities available. Migrating old health IT services to wireless access can be an expensive and challenging option due to structural limitations. The wireless issue also continues to be an obstacle to telemedicine adoption, and varying state policies on telemedicine use and reimbursement continue to restrict this emerging technique.

Data analysis

Data analysis has a major role in assisting, treating and preventing illness and providing quality care to people. Implementing a data analysis system that offers secure data storage and easy access can be an expensive and demanding task.

Cloud System

The cloud raises many questions with respect to data ownership, security and encryption. To address cloud-related issues, some providers are experimenting with cloud-based EHR systems while others build their own private clouds.

The necessity for health IT is increasing every day. Though health IT has become a major phenomenon, we should always remember that challenges will continue to appear as it progresses. So be aware, keep yourself updated on the top health IT issues and tackle them.

Related post from 8KMiles

How Cloud Computing Can Address Healthcare Industry Challenges

How pharmaceuticals are securely embracing the cloud

5 Reasons Why Pharmaceutical Company Needs to Migrate to the Cloud

8K Miles Tweet Chat 2: Azure

If you missed our latest Twitter chat on Azure, or wish to go through the chat once again, this is the right place! Here’s a recap of the 12th April tweet chat, compiled from all the questions asked and the answers given by the tweet chat participants. The official tweet chat handle of 8K Miles, @8KMilesChat, shared frequently asked questions (FAQs) related to Azure, and here’s how they were answered.

[Screenshots of the Azure tweet chat questions and answers]

We received clear answers to every question asked, and it was an informative chat on Azure. For more such tweet chats on the cloud industry, follow our Twitter handle @8KMiles.

The active participants during the tweet chat were cloud experts Utkarsh Pandey and Harish CP. Here’s a small brief on their expertise:

Utkarsh Pandey

Utkarsh is a Solutions Architect; in his current role as an AWS and Azure certified solution architect, he is responsible for cloud development services.

HarishCP

HarishCP is a Cloud Engineer. He works in the Cloud Infrastructure team, helping customers with infrastructure management and migration.


Puppet – An Introduction


The most common issue while building and maintaining large infrastructure has always been wasted time: the amount of redundant work performed by each member of the team is significant. The idea of automatically configuring and deploying infrastructure evolved out of a wider need to address this particular problem.

Puppet and Chef are two of the many configuration management packages available. They offer a framework for describing your application/server configuration in a text-based format. Instead of manually installing IIS on each of your web servers, you can write a configuration file which says “all web servers must have IIS installed”.

What Is Puppet?

Puppet is Ruby-based configuration management software, and it can run in either client-server or stand-alone mode. It can be used to manage configuration on UNIX (including OS X), Linux, and Microsoft Windows platforms. Unlike other provisioning tools that build your hosts and then leave them on their own, it is designed to interact with your hosts in a continuous fashion.

You define a “desired state” for every node (agent) on the Puppet master. If an agent node doesn’t resemble the desired state, then in Puppet terms “drift” has occurred. The actual decision on how your machine is supposed to look is made by the master, whereas agents only provide data about themselves and are then responsible for actually applying those decisions. By default each agent will contact the master every 30 minutes, which can be customized. The way this entire process works can be summed up with this workflow.

[Puppet master-agent data flow diagram]

  1. Each node sends its current information (current state) to the master in the form of facts.
  2. The Puppet master uses these facts to compile a catalog describing the desired state of that agent, and sends it back to the agent.
  3. The agent enforces the configuration as specified in the catalog, and sends a report back to the master to indicate success or failure.
  4. The Puppet master generates a detailed report which can be fed to any third-party monitoring tool.

Credits – Utkarsh Pandey


Meet 8K Miles Cloud Experts at Bio-IT World Conference & Expo ‘16

The Annual Bio-IT World Conference & Expo is around the corner! The Cambridge Healthtech’s 2016 Bio-IT World Conference and Expo is happening at Seaport World Trade Centre, Boston, MA and 8K Miles will be attending and presenting in the event. The three day spanning meet from April 5th -7th includes 13 parallel conference tracks and 16 pre-conference workshops.

 The Bio-IT World Conference & Expo continues to be a vibrant event every year that unites 3,000+ life sciences, pharmaceutical, clinical, healthcare, and IT professionals from more than 30 countries. At the conference look forward to compelling talks, including best practice case studies and joint partner presentations, which will feature over 260 of your fellow industry and academic colleagues discussing themes of big data, smart data, cloud computing, trends in IT infrastructure, genomics technologies, high-performance computing, data analytics, open source and precision medicine, from the research realm to the clinical arena.

When it comes to the cloud, Healthcare, Pharmaceutical and Life Sciences have special needs. 8K Miles makes it stress-free for your organization to embrace the cloud and reap all the benefits the cloud offers while at the same time meeting your security and compliance needs. Stop by booth #128 at the event to meet 8K Miles’ Director of Sales, Tom Crowley, a versatile, goal-oriented sales, business development and marketing professional with 20+ years of wide-ranging experience and accomplishments in the information security industry.

Also, at the event on Wednesday, April 6th from 10:20-10:40am two of our 8K Miles speakers Sudish Mogli, Vice President, Engineering and Saravana Sundar Selvatharasu, AVP, Life Sciences will be presenting on Architecting your GxP Cloud for Transformation and Innovation. They will be sharing solutions and case studies for designing and operating on the cloud for a GxP Environment, by utilizing a set of frameworks that encompasses Operations, Automation, Security and Analytics.

We are just a tweet away, to schedule a one-on-one meeting with us, tweet to @8KMiles! We look forward to meeting you at the event!

How Cloud Computing Can Address Healthcare Industry Challenges

Healthcare & Cloud Computing

The sustainability and welfare of mankind depend on the healthcare industry, yet technology is not utilized enough in healthcare, which restricts the sector’s operational competence. There are still healthcare organizations that depend on paper records, as well as those that have digitized their information. The use of technology helps coordinate care between patients and physicians and eases work across the medical community.

Cloud computing is being adopted globally to reform and modernize the healthcare sector. The healthcare industry is shifting to a model that collectively supports and coordinates workflows and medical information. Cloud computing helps the healthcare industry store large volumes of data, facilitates sharing of information among physicians and hospitals, and improves data analysis and tracking. This helps with treatments, the performance of physicians or students, costs and studies.

Overcome Challenges in Healthcare Industry through Cloud Computing

In the healthcare industry, the utmost importance should be given to the following: security, confidentiality, availability of data to users, long-term preservation, data traceability and data reversibility. Some challenges faced by healthcare IT systems relate to exchanging, maintaining and making use of huge amounts of information. Hence, while moving healthcare information into cloud computing, careful thought should be given to the type of application, i.e., the clinical and nonclinical applications the organization wants to move.

So, while moving an application to a cloud deployment model, details such as security, privacy and application requirements should be considered when setting up healthcare digitally. Cloud services can be public, private or hybrid. A clinical application will be deployed on a private or hybrid cloud, as it requires the highest level of precaution, whereas a nonclinical application fits the public cloud deployment model.

Cloud computing is emerging as a vital technology in the healthcare industry, but it is still underutilized. Those involved in healthcare, such as medical practitioners, hospitals and research facilities, should consider the different cloud service models that could address their business needs. The service models include Software as a Service (SaaS), Infrastructure as a Service (IaaS) and Platform as a Service (PaaS).

Among the three service models, SaaS, a pay-per-use business model, is the most economically attractive option, especially for small hospitals or physicians, as SaaS doesn’t need full-time IT personnel and reduces the capital expenses needed for hardware, software or operating systems.

PaaS is a perfect option for large-scale healthcare institutions that have the resources to develop cloud solutions further. IaaS is feasible for healthcare organizations that seek more scalable infrastructure, as IaaS is cost-effective and provides scalability along with security, flexibility, data protection and back-ups.

Thus, cloud computing can be a lasting solution and a game-changer for the healthcare industry, with respect to its service offerings, operating models, capabilities and end-user services. With cloud computing, the challenges faced in the healthcare industry around managing medical information and storing, retrieving or accessing data can be eliminated. Meanwhile, by adopting cloud services, the healthcare industry can overtake other industries in its use of technology, and accessing or monitoring healthcare-related information across the globe becomes easier.

Related post from 8KMiles…
How pharmaceuticals are securely embracing the cloud

Keeping watch on AWS root user activity is normal or anomaly

Avoid malicious CloudTrail action in your AWS account (CloudWatch + Lambda)

27 Best practice tips on amazon web services security groups

8K Miles Tweet Chat : AWS Key Management Service (KMS)

Follows us at Twitter @8KMiles

Did you miss the tweet chat on AWS KMS organised by 8K Miles? Well, if you missed it, or if you wish to revisit what happened in the 12th February tweet chat, this is the place. Here is a compilation of all the questions asked and the answers given by the tweet chat participants.

The official tweet chat handle of 8K Miles, @8KMilesChat, shared commonly asked questions related to AWS KMS, and here’s how they were answered.

[Screenshots of the AWS KMS tweet chat questions and answers]

That is how informative our last tweet chat was. For more such chats, stay tuned to our page for updates.

A brief summary about our cloud experts

Ramprasad

Ramprasad is a Solutions Architect with 8KMiles. In his current role as an AWS certified Solutions Architect, he works in the Cloud Infrastructure team at 8KMiles, helping customers evaluate the cloud platform and suggesting and implementing the right set of solutions and services that AWS offers.

Senthilkumar

Senthilkumar is a Senior Cloud Engineer with 8KMiles and an AWS certified Solutions Architect. He works in the Cloud Infrastructure team at 8KMiles, helping customers with DevSecOps, operational excellence and implementation.

Related post from 8KMiles….
How pharmaceuticals are securely embracing the cloud

Keeping watch on AWS root user activity is normal or anomaly

Avoid malicious CloudTrail action in your AWS account (CloudWatch + Lambda)

27 Best practice tips on amazon web services security groups

5 Reasons Why Pharmaceutical Company Needs to Migrate to the Cloud

Pharmaceutical companies are one of the major beneficiaries of emerging IT trends and technologies like cloud computing. From innovative ideas for developing new drugs to customer engagement, Pharma companies are increasingly resorting to digital platforms to aid their tasks.

While companies understand the importance of adopting digital practices, the path to transformation poses many hurdles like regulation, financial factors as well as conventional processes. This is where cloud computing can come to their rescue. Cloud computing can help pharmaceutical companies increase revenue, improve quality and save time.

Over the past few months, we have seen a trend of bigger Pharma companies acquiring smaller ones, for example Sun Pharma’s acquisition of Ranbaxy. The demand for better data management and analytics is only expected to grow. With all the new challenges that these trends create, cloud computing is a technology that pharma companies cannot afford to miss.

[Cloud computing overview diagram. Courtesy: Wikipedia]

Below are the top 5 reasons why Pharmaceutical companies should migrate to cloud

1. Cost Reduction

A large number of cloud users have noticed that using cloud-based software and infrastructure reduces their costs significantly. It requires fewer development and testing resources, which also implies lower expenditure on support and maintenance of applications. Studies show that cloud-based software can cost 50% less than a traditional software application over 10 years. Also, a typical cloud-based application takes around 4 to 6 months to be ready, whereas in-house hosted apps take around 1 to 3 years to be developed and thoroughly tested before deployment.

2. Improved Efficiency

Cloud applications have the ability to automate many typical business processes. If manual activities and duplicate data are removed, it can help your product hit the market sooner. Migrating to the cloud can also improve collaboration between various departments, suppliers and distributors. With early access to a wide range of data, businesses can gather valuable insights about the performance of their systems and plan their future strategy accordingly. It can also provide information about capacity, availability and the relationships between employees, equipment and materials. Being able to understand the scenario in real time gives you the power to operate more efficiently and make better decisions.

3. Decentralization

When a company strives to go global, clinical trials and workforces need to become decentralized which requires IT infrastructure to be decentralized. Building out exclusive data centers can prove to be expensive and distract from a Pharma company’s core business objectives. Cloud computing improves web performance for users in remote locations without having to build out additional data centers.

4. Flexibility

One of the major challenges that Pharma companies face is the need to rapidly structure their physical infrastructure and scale certain requirements up or down to better service their operations. Applications hosted on site aren’t as quick to restructure, expand and scale up. If these requirements are not met on time, a company is in danger of serious losses compared with its competitors. Migrating to the cloud helps deploy scalable IT infrastructure that can quickly adjust itself as per the company’s requirements, making sure that resources are always available when required. This not only improves the speed with which companies can adjust to new market requirements but also checks money being spent on resources that remain under-utilized.

5. Security

There were long-held security concerns about cloud computing, not only in Pharma companies but in the whole manufacturing sector. Now, this fear is being overcome by the realization that cloud-hosted data is far more secure than data hosted on site, and just as accessible. Cloud computing allows applications to run independently of hardware through a virtual environment running out of secure data centers. Hence, employees can access the same apps and documents anywhere, breaking down the barriers of geography and converting any place into a virtual “office”.

Conclusion

The above list is a short compilation of the benefits that cloud computing offers pharmaceutical companies. There is a lot more that cloud-based technology can provide to improve their long-term ROI. Cloud computing can help pharma companies develop innovative products like personalized medicine, using improved customer engagement techniques to gather data for effective marketing.

If you have got additional ideas on how pharmaceutical company can benefit from migrating to cloud computing, feel free to leave your comments.

CloudWatch + Lambda Case 4: Control launch of Specific “C” type EC2 instances post office hours to save costs

We have a customer who has predictable load volatility between 9 am and 6 pm and uses specific large EC2 instances during office hours for analysis; they use “c4.8xlarge” for that purpose. Their IT wanted to control the launch of such a large instance class after office hours and during nights to control costs, and currently there is no way to restrict or control this action using Amazon IAM alone. In short, we cannot create a complex IAM policy with conditions saying that user A belonging to group A cannot launch instance type C every day between X and Y.

One stop-gap sometimes followed is to have a job running that removes the policy from an IAM user when certain time conditions are met; basically, a job calls an API that adds or removes the policy restricting an IAM user or group from launching instances. This makes IAM policy management complex and makes drift between versions tough to assess and govern.

After the introduction of CloudWatch Events, our cloud operations team started controlling this with Lambda functions. Whenever an instance is launched, it triggers a Lambda function; the function checks whether it is the specific “C” type and checks the current time, and if the time falls after office hours, it terminates the newly launched EC2 instance immediately.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen AWS API Call (via CloudTrail) as the event source, with a Lambda function as the target.

CloudWatch Events Lambda EC2

The next step would be configuring rule details with Rule definition

CloudWatch Events Lambda EC2

Finally, we will review the Rules Summary

CloudWatch Events Lambda EC2

Amazon Lambda Function Code Snippet (Python)
import boto3

def lambda_handler(event, context):
    # print("Received event: " + json.dumps(event, indent=2))

    ec2_client = boto3.client("ec2")

    print("Event Region : " + event['region'])

    event_time = event['detail']['eventTime']   # ISO-8601 timestamp, e.g. "2016-04-19T19:15:02Z"
    print("Event Time : " + event_time)

    # Extract the hour (UTC) from the timestamp.
    hour = int(event_time.split('T')[1].split(':')[0])

    instance_type = event['detail']['requestParameters']['instanceType']
    print("Instance Type: " + instance_type)

    instance_id = event['detail']['responseElements']['instancesSet']['items'][0]['instanceId']
    print("Instance Id: " + instance_id)

    # Terminate the specific "C" class instances launched outside office hours (after 6 PM or before 8 AM).
    if instance_type.startswith('c') and (hour >= 18 or hour < 8):
        print(ec2_client.terminate_instances(InstanceIds=[instance_id]))

GitHub Gist URL:  https://github.com/cloud-automaton/automaton/blob/master/aws/events/TerminateAWSEC2.py

This post was co-authored with Priya and Ramprasad of 8KMiles.

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 3 -Controlling cross region EBS/RDS Snapshot copies for regulated industries

If you are part of a regulated industry like Pharmaceuticals/Life Sciences/BFSI running mission-critical applications on AWS, at times the compliance requirements mean you have to restrict or control data movement to a particular geographic region in the cloud. This can become complex to enforce. Let us explore it in detail:

We all know there are a variety of ways to move data from one AWS region to another, but one commonly used method is snapshot copy across AWS regions. You can usually restrict the snapshot copy permission in an IAM policy, but what if you need the permission enabled for moving data between AWS accounts inside a region, while still controlling the EBS/RDS snapshot copy action across regions? This can only be mitigated by automatically deleting the snapshot in the destination AWS region whenever a snapshot copy occurs.

Our cloud operations team used to either remove this permission altogether in IAM or monitor this activity using polling scripts for customers with multiple accounts who need the permission but still need control. Now, after the introduction of CloudWatch Events, we have configured a rule that points to an AWS Lambda function which is triggered in near real time when a snapshot is copied to a destination AWS region. The Lambda function initiates a deletion process immediately. Though it is reactive, it is incomparably faster than manual intervention.

In this use case, Amazon CloudWatch Event will identify the EBS Snapshot copies across the regions and delete them.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen AWS API Call (via CloudTrail) as the event source, with a Lambda function as the target.

CloudWatch Events Lambda

The next step would be configuring rule details with Rule definition

CloudWatch Events Lambda

Finally, we will review the Rules Summary

CloudWatch Events Lambda

Amazon Lambda Function Code Snippet (Python)

CloudWatch Events Lambda
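The code snippet above was published as a screenshot, and the authoritative version is at the GitHub link below. Purely as an illustration of the logic, here is a minimal sketch of such a function; it assumes the CloudWatch Events rule matches the EC2 CopySnapshot API call via CloudTrail, and the allowed region and event field names are assumptions for the example rather than the original implementation.

import boto3

# Region in which snapshots are allowed to reside (assumption for illustration).
ALLOWED_REGION = 'us-east-1'

def lambda_handler(event, context):
    detail = event['detail']

    # The rule is assumed to match the EC2 "CopySnapshot" API call recorded by CloudTrail.
    if detail.get('eventName') != 'CopySnapshot':
        return

    # Region where the copy landed and the ID of the newly created snapshot
    # (field names assumed from the general CloudTrail event structure).
    destination_region = detail['awsRegion']
    new_snapshot_id = detail['responseElements']['snapshotId']

    if destination_region != ALLOWED_REGION:
        # Delete the copied snapshot in the destination region to enforce data residency.
        ec2 = boto3.client('ec2', region_name=destination_region)
        ec2.delete_snapshot(SnapshotId=new_snapshot_id)
        print("Deleted snapshot " + new_snapshot_id + " copied to " + destination_region)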

GitHub Gist URL: https://github.com/cloud-automaton/automaton/blob/master/aws/events/AWSSnapShotCopy.py


This post was co-authored with Muthukumar and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 2- Keeping watch on AWS ROOT user activity is normal or anomaly ?

As a best practice, you should never use your AWS root account credentials to access AWS. Instead, create individual (IAM) users for anyone who needs access to your AWS account. This allows you to give each IAM user a unique set of security credentials and grant different permissions to each user. For example, create an IAM user for yourself as well, give that user administrative privileges, use that IAM user for all your work, and never share your credentials with anyone else.

The root user has full access, and it is not practical to restrict it in AWS IAM. Imagine you suddenly suspect anomalous or suspicious activities performed as the root user (using EC2 APIs, etc.) in your logs, beyond normal IAM user provisioning; this could be because the root user has been compromised or coerced, but either way it is a deviation from best practice.

In the past we used to poll the CloudTrail logs with programs and differentiate between “root” and “Root”, and our cloud operations team would react to these anomalous behaviors. Now we can inform cloud operations and customer stakeholders in near real time using CloudWatch Events.

In this use case, Amazon CloudWatch Events will identify any activities performed by the AWS root user, and notifications will be sent to SNS through AWS Lambda.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen AWS API Call (via CloudTrail) as the event source, with a Lambda function as the target. The Lambda function detects whether the event was triggered by the root user and notifies through SNS.

CloudWatch Events Lambda Root Activity Tracking

The next step would be configuring rule details with Rule definition

CloudWatch Events Lambda Root Activity Tracking

Finally, we will review the Rules Summary

CloudWatch Events Lambda Root Activity Tracking

Amazon Lambda Function Code Snippet (Python)

CloudWatch Events Lambda Root Activity Tracking
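The code snippet above was published as a screenshot, and the full source is at the GitHub link below. As an illustration of the approach only, here is a minimal sketch of such a function; the SNS topic ARN is a hypothetical placeholder, and the event fields follow the general CloudTrail event structure rather than the original code.

import boto3

# Hypothetical SNS topic for operations alerts (placeholder, not from the original code).
SNS_TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:root-activity-alerts'

def lambda_handler(event, context):
    detail = event['detail']
    user_identity = detail.get('userIdentity', {})

    # CloudTrail marks API calls made with root credentials with userIdentity.type == "Root".
    if user_identity.get('type') == 'Root':
        message = ("Root API activity detected: " + str(detail.get('eventName')) +
                   " on " + str(detail.get('eventSource')) +
                   " in " + str(detail.get('awsRegion')))
        # Notify cloud operations and stakeholders through SNS.
        sns = boto3.client('sns')
        sns.publish(TopicArn=SNS_TOPIC_ARN,
                    Subject='AWS root user activity detected',
                    Message=message)
        print(message)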

GitHub Gist URL:

https://github.com/cloud-automaton/automaton/blob/master/aws/events/TrackAWSRootActivity.py

This post was co-authored with Saravanan and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

CloudWatch + Lambda Case 1- Avoid malicious CloudTrail action in your AWS Account

As many of you know, AWS CloudTrail provides visibility into API activity in your AWS account. CloudTrail logging lets you see which actions users have taken and which resources have been used, along with details such as the time and date of actions and the actions that failed because of inadequate permissions. It enables you to answer important questions such as which user made an API call or which resources were acted upon in an API call. If a user disables CloudTrail logs, accidentally or with malicious intent, audit events will not be captured and you fail to have proper governance in place. The situation gets more complex if the user disables and then re-enables CloudTrail for a brief period of time, during which some important activities can go unlogged and unaudited. In short, once CloudTrail logging is enabled it should not be disabled, and this action needs to be defended in depth.

Our cloud operations team had earlier written a program that periodically scans the CloudTrail log entries; if any log activity was missing after a period of X, it alerted operations. The overall reaction time for our cloud operations was more than 15-20 minutes to mitigate this CloudTrail disable action.

Now, after the introduction of CloudWatch Events, we have configured a rule that points to an AWS Lambda function as the target. This function is triggered in near real time when CloudTrail is disabled and automatically enables it back without any manual interaction from cloud operations. An advanced version of the program triggers a workflow which logs entries into the ticketing system as well. This event model has helped us reduce the mitigation time to less than a minute.
We have illustrated below the detailed steps on how to configure this event. We have also given the link to the Git repository with basic AWS Lambda Python code that can be used by your cloud operations team.

In this use case, Amazon CloudWatch Events will identify whether the AWS account has CloudTrail enabled or not; if it is not enabled, corrective action is taken by enabling it again.

As a first step, we will create a rule in the Amazon CloudWatch Events dashboard. We have chosen AWS API Call (via CloudTrail) as the event source, with a Lambda function as the target.

CloudWatch Events Lambda CloudTrail

The next step would be configuring rule details with Rule definition

CloudWatch Events Lambda CloudTrail

Finally, we will review the Rules Summary

CloudWatch Events Lambda CloudTrail

Amazon Lambda Function Code Snippet (Python)
import json
import boto3
import sys

print('Loading function')

""" Function to define Lambda Handler """
def lambda_handler(event, context):
    try:
        client = boto3.client('cloudtrail')
        # If logging was stopped, turn it back on for the same trail.
        if event['detail']['eventName'] == 'StopLogging':
            response = client.start_logging(Name=event['detail']['requestParameters']['name'])
    except Exception as e:
        print(e)
        sys.exit()

 

GitHub Gist URL:

This post was co-authored with Mohan and Ramprasad of 8KMiles

This article was originally published in: http://harish11g.blogspot.in/

Managing User Identity across cloud-based application with SCIM

Simple Cloud Identity Management (SCIM) is a standards-based protocol for provisioning and de-provisioning user identities to cloud-based SaaS applications. SCIM’s pragmatic approach is designed to make it quick and easy to move user identities across cloud applications. Its main intent is to reduce the cost and complexity of user management operations by providing a common user schema and extension model, as well as binding documents that provide patterns for exchanging this schema using standard protocols.

SCIM is built on a model where a resource is the common denominator and all SCIM objects are derived from it. SCIM has three objects that derive directly from the Resource object: the ServiceProviderConfiguration and Schema objects are used to discover the service provider configuration, while the Core Resource object specifies the endpoint resources User, Group and Organization.
The SCIM protocol exchanges user identities between two applications over HTTP using REST (Representational State Transfer). It exposes a common user schema and resource objects expressed in JSON and XML format. SCIM requests are made via HTTP requests with different HTTP methods, and responses are returned in the body of the HTTP response, formatted as JSON or XML depending on the request.

Following are the SCIM endpoint services for standards-based user identity provisioning and de-provisioning across cloud-based applications; a minimal provisioning sketch follows the list.

# SCIM provides two endpoints to discover supported features and specific attribute details:
• GET /ServiceProviderConfigs – Specifies the service provider’s specification compliance, authentication schemes and data models.
• GET /Schemas – Specifies the service provider’s resources and attribute extensions.
# SCIM provides a REST API with a simple set of HTTP/HTTPS operations (for CRUD):

• POST – https://endpoint.com/{v}/{resource} – Creates a new resource, or a bulk of resources.
• GET – https://endpoint.com/{v}/{resource}/{id} – Retrieves a particular resource.
• GET – https://endpoint.com/{v}/{resource}?filter={attribute}{op}{value}&sortBy={attributeName}&sortOrder={ascending|descending} – Retrieves resources matching the filter parameters.
• PUT – https://endpoint.com/{v}/{resource}/{id} – Modifies a resource with a complete, consumer-specified resource (replaces the full resource).
• PATCH – https://endpoint.com/{v}/{resource}/{id} – Modifies a resource with a set of consumer-specified changes (updates part of the resource).
• DELETE – https://endpoint.com/{v}/{resource}/{id} – Deletes a particular resource.
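As an illustration only, here is a minimal Python sketch of provisioning and de-provisioning a user through such endpoints. The base URL, bearer token and the SCIM 2.0 core user schema URN are assumptions for the example; each service provider defines its own endpoint, API version and authentication details.

import json
import requests

# Hypothetical values for illustration: a real service provider supplies its own
# base URL and an OAuth 2.0 bearer token (or another authentication mechanism).
SCIM_BASE_URL = "https://endpoint.com/v2"
ACCESS_TOKEN = "example-oauth2-bearer-token"

def provision_user(user_name, given_name, family_name, email):
    """Create (provision) a user via POST /{resource}, here the Users resource."""
    payload = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],  # SCIM 2.0 core user schema
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        "active": True,
    }
    response = requests.post(
        SCIM_BASE_URL + "/Users",
        headers={"Authorization": "Bearer " + ACCESS_TOKEN,
                 "Content-Type": "application/scim+json"},
        data=json.dumps(payload),
    )
    response.raise_for_status()
    return response.json()  # The created resource, including its server-assigned "id"

def deprovision_user(user_id):
    """Delete (de-provision) a user via DELETE /{resource}/{id}."""
    response = requests.delete(
        SCIM_BASE_URL + "/Users/" + user_id,
        headers={"Authorization": "Bearer " + ACCESS_TOKEN},
    )
    response.raise_for_status()

# Example usage:
# created = provision_user("jdoe", "John", "Doe", "jdoe@example.com")
# deprovision_user(created["id"])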

How the SCIM protocol provisions user identities into a cloud (SaaS) application:

[SCIM provisioning flow diagram]

The SCIM protocol does not define a scheme for authentication and authorization, so service providers are free to choose mechanisms appropriate to their use cases. Most SaaS applications (service providers) offer the OAuth 2.0 protocol for authentication and authorization, while some provide their own authentication mechanism. Nowadays most IDM vendors (CA, SailPoint, PingIdentity, etc.) support the SCIM protocol.

If you need more information about the SCIM protocol, refer to http://www.simplecloud.info/

Keys to Ensure a Smooth and Successful Go-Live


The day is finally here. The months of planning, build, testing and training have culminated in go-live day. Will this be a successful and relatively smooth day? If you have followed the best-practice tips below, then the day will go without any major issues.

  • Early End-User Involvement
    Often the biggest complaint from end-users at go-live is the feeling that the new system was forced on them without being able to provide input. One way to help alleviate this is to engage the users early with information and question and answer sessions, where the users are able to ask questions and express concerns. These sessions along with usability labs and training provide continued involvement throughout the process and allow the end-users to take some ownership of the success of the project and go-live.
  • Document Decisions
    Throughout the implementation process, many decisions will be made. One thing that can slow the process and cause a rush close to go-live is poor documentation of decisions and the revisiting of decisions. Documenting which decisions have been made and why can help avoid back-tracking. Yes, sometimes a decision will need to be revisited based on new information, but keeping this to a minimum is important to keep the project on track and allow go-live day to be a success.
  • Testing, Testing, Testing
    This may be the most obvious point, but robust testing is extremely important to a smooth go-live. Identifying and correcting problems before go live will make the day better for everyone.
  • Be Nice
    Go-live is a big day and it will be stressful. The project team and support staff will need to keep a calm demeanor throughout the day. Listening to the users and providing answers with a smile will help the end-users stay calm as well.
  • Issue tracking and resolution
    Problems will arise at go-live and having a maintained issues log allows for the identification of trends in issues. Discussing the issues during pre-shift meetings with all project team members allows for everyone to be on the same page and helps prevent duplicate issues from being tracked. Along with tracking the issues, having a ‘fast-track’ process for fixing any issues in the system allows the end-users to see progress and move forward.

Go-live day will come with some stress and challenges, but if you follow the steps listed, it will be a smooth process and a positive experience for everyone.

Community Connect/New Practice Team Model

One effect of the new and upcoming Affordable Care Act laws is that healthcare organizations are acquiring smaller private practices and growing their organizations. Another is for smaller practices to contract with larger organizations to utilize their Electronic Medical Records (EMRs). EMRs can be prohibitively expensive for smaller practices, as can the penalties for not using them, so contracting with larger organizations allows them to keep autonomy and gain use of an EMR within a large network.
The benefits to the large organizations in both cases are clear, with the increase in patient base and potential revenue. However, this often rapid growth can present significant and numerous challenges to the administration.
EMR Project Teams
One area where these challenges are realized is with the EMR project teams. The teams are often bare-bones, so an aggressive timeline can stretch teams too thin, and existing projects (upgrades, optimization, maintenance, etc.) are often neglected by necessity.
One method to help alleviate this strain is to bring in a team dedicated to the build and roll out of the newly acquired and community connect clinics. This allows the organization analysts to focus on the existing projects, and the dedicated team to build and roll out the clinics quickly and consistently.
Success Story
Utilizing this ‘team’ approach has been shown to be successful at a recent site which uses the Epic EMR. Creating a team with 1 Project Manager, 1 Ambulatory Analyst, and 1 Cadence Analyst was shown to be an ideal team structure. The addition of a second ‘team’ of 1 Ambulatory and 1 Cadence Analyst when the timeline was very tight proved to be beneficial.
The first clinic to go up was a collaborative approach between the added team and the existing staff. Embedding the team allowed them to learn the build conventions, documentation, change management, and various organization experts for different build aspects. The go-live was also collaborative to ensure the proper processes were being used.
The second clinic was more autonomous for the team, but a point person from the staff was available for questions, and the build was reviewed for accuracy.
The following clinics were handled autonomously, which allowed the staff to work on a major upgrade. The team could still ask questions as needed, but built and rolled out clinics with limited time required from the staff.
If you would like more information on this Community Connect model and how it could work for your organization, please contact info@serj.com.

Open Sourcing S2C Tool

This is a follow up post to our previous article ‘Migrating Solr to Amazon CloudSearch using the S2C tool’. Last month, we released an open source tool ‘S2C’, a Linux console based utility that helps developers to migrate search index from Apache Solr to Amazon CloudSearch.

In this article, we share the source code of the S2C tool, which allows developers to customize and extend S2C to suit their requirements. The source code can be downloaded from the link below.

https://github.com/8KMiles/s2c/

We also provide step-by-step instructions on how to build the S2C tool from source.

Pre-Requisites

In this section, we detail the pre-requisites for building the S2C tool.
1. The application is developed in Java. Download and install Java 8. Validate the JDK path and ensure that environment variables such as JAVA_HOME, classpath and path are set correctly.

2. We will use Gradle to build the S2C tool. Download and install Gradle.
Please read getting-started.html (inside the Gradle base folder) for help setting up Gradle. Gradle is an open source build tool and does not require any prerequisites such as Maven or Ant.

3. Download the source code directly from https://github.com/8KMiles/s2c/
or alternatively
download and install Git (http://git-scm.com/book/en/v2/Getting-Started-Installing-Git) and then use the following command to clone the source from Git:

git clone https://github.com/8KMiles/s2c.git

Note: The source code is publicly available and does not require any credentials to access.

Build process

We will use Gradle, an open source build tool to build the S2C migration utility.
1. Verify the path, classpath, environment variables of Java, Gradle. Example: JAVA_HOME, GRADLE_HOME

2. Unzip the downloaded S2C source code and run the following command from the main directory. Example: E:/s2ctool/s2c-master or /opt/s2ctool/s2c-master

 ./gradlew -PexportPath=/tmp :s2c-cli:exportTarGz

The above command creates a .tgz archive (a gzipped tarball) in the ‘/tmp’ directory. The directory path can be changed if required,
or

gradle exportTarGz

The above command will create .tgz at ‘s2c-master/s2c-cli/tmp’ directory.

or

gradle exportZip

The above command will create .zip at ‘s2c-master/s2c-cli/tmp’ directory.

Build output file: s2c-cli-1.0.zip or s2c-cli-1.0.tgz

3. The build output is the final product that is deployed to perform the migration from Solr to Amazon CloudSearch. The deployment steps are discussed in detail in the original blog post ‘Migrating Solr to Amazon CloudSearch using the S2C tool’.

Please do write your feedback and suggestions in the comments section below to improve this tool. This post is the follow-up to our original article, sharing the source code of the tool.

About the Authors

 Dhamodharan P is a Senior Cloud Architect at 8KMiles.

 Dwarakanath R is a Principal Architect at 8KMiles.


27 Best Practice Tips on Amazon Web Services Security Groups

AWS Security Groups are one of the most used, and abused, configurations in an AWS environment, especially once you have been operating in the cloud for a while. Because security groups are simple to configure, users often underestimate their importance and do not follow best practices. In reality, operating security groups day to day is much more intensive and complex than configuring them once, and hardly anybody talks about it. So in this article, I am going to share our experience in dealing with AWS Security Groups since 2008, as a set of best practice pointers covering both configuration and day-to-day operations.
In the world of security, proactive and reactive speed determines the winner, so many of these best practices should be automated. If your organization’s Dev/Ops/DevOps teams need help with security group automation, feel free to contact me.

AWS has released so many security-related features over the last few years that we should no longer look at security groups in isolation; it just does not make sense anymore. A security group should always be seen in the overall security context, and with that I start the pointers.

Practice 1: Enable AWS VPC Flow Logs at the VPC, subnet or ENI level. VPC Flow Logs can be configured to capture both accept and reject entries flowing through the ENIs and security groups of EC2, ELB and several other services. These flow log entries can be scanned to detect attack patterns, alert on abnormal activities and information flow inside the VPC, and provide valuable insights to SOC/MS team operations.
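As a rough illustration, flow logs can be enabled programmatically. The sketch below uses boto3; the VPC ID, log group name and IAM role ARN are placeholders, not values from this article.

# Hypothetical sketch: enable VPC Flow Logs (both ACCEPT and REJECT traffic) for a VPC.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],        # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",                            # capture both accept and reject entries
    LogDestinationType="cloud-watch-logs",
    LogGroupName="vpc-flow-logs",                 # placeholder log group
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",  # placeholder role
)
print(response["FlowLogIds"])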

Practice 2: Use AWS Identity and Access Management (IAM) to control who in your organization has permission to create and manage security groups and network ACLs (NACLs). Isolate responsibilities and roles for better defense. For example, give only your network administrators or security admins permission to manage security groups, and restrict other roles.
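A minimal sketch of this isolation, assuming boto3 and a hypothetical policy name: create a customer-managed IAM policy covering security group and NACL actions, intended to be attached only to the network/security admin role.

# Hypothetical sketch: an IAM policy limited to security group / NACL management.
import json
import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "ec2:CreateSecurityGroup",
            "ec2:DeleteSecurityGroup",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:RevokeSecurityGroupIngress",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:CreateNetworkAcl",
            "ec2:CreateNetworkAclEntry",
            "ec2:DeleteNetworkAcl",
            "ec2:DeleteNetworkAclEntry",
        ],
        "Resource": "*",
    }],
}

iam.create_policy(
    PolicyName="SecurityGroupAdminOnly",          # placeholder policy name
    PolicyDocument=json.dumps(policy_document),
)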

Practice 3: Enable AWS CloudTrail logs for your account. CloudTrail logs all security group events and is needed for the management and operation of security groups. Event streams can be created from CloudTrail logs and processed using AWS Lambda. For example, whenever a security group is deleted, the event is captured with details in the CloudTrail logs; a Lambda function can be triggered to process this change and alert the MS/SOC team on a dashboard or by email, as your workflow requires. This is a very powerful way of reacting to events within a span of under 7 minutes. Alternatively, you can process the CloudTrail logs stored in S3 as a batch at some frequency and achieve the same result, but the operations team’s reaction time will then depend on how often the CloudTrail logs are generated and polled. This activity is a must for your operations team.
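As a minimal sketch, assuming an EventBridge (CloudWatch Events) rule that matches “AWS API Call via CloudTrail” events for DeleteSecurityGroup and invokes a Lambda function; the SNS topic ARN is a placeholder.

# Hypothetical Lambda handler: alert the SOC/MS team over SNS when a security group is deleted.
import json
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:sg-change-alerts"  # placeholder topic

def handler(event, context):
    detail = event.get("detail", {})
    message = {
        "eventName": detail.get("eventName"),
        "user": detail.get("userIdentity", {}).get("arn"),
        "requestParameters": detail.get("requestParameters"),
        "eventTime": detail.get("eventTime"),
    }
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject="Security group change detected",
        Message=json.dumps(message, indent=2),
    )
    return message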
Practice 4: Enable AWS Config for your AWS account. AWS Config records all configuration changes to your security groups and can even send email notifications.

Practice 5: Have a proper naming convention for your Amazon Web Services security groups. The naming convention should follow your enterprise standards. For example, it can follow the notation “AWS Region + Environment Code + OS Type + Tier + Application Code”:
Security Group Name – EU-P-LWA001
AWS Region (2 chars) = EU, VA, CA, etc.
Environment Code (1 char) = P-Production, Q-QA, T-Testing, D-Development, etc.
OS Type (1 char) = L-Linux, W-Windows, etc.
Tier (1 char) = W-Web, A-App, C-Cache, D-DB, etc.
Application Code (4 chars) = A001
We have been using Amazon Web Services since 2008 and have found over the years that managing security groups across multiple environments is itself a huge task. A proper naming convention from the beginning is a simple practice, but it will make your AWS journey much more manageable.

Practice 6: For defense in depth, make sure your Amazon Web Services security group naming convention is not self-explanatory, and make sure your naming standards stay internal. For example, an AWS security group named UbuntuWebCRMProd tells a hacker that it is a production CRM web tier running on the Ubuntu OS. Have an automated program periodically scan AWS security group names with regex patterns for information-revealing names and alert the SOC/managed services team.

Practice 7: Periodically detect, alert on, or delete AWS security groups that do not strictly follow the organization’s naming standards. Have an automated program do this as part of your SOC/managed services operations. Once this stricter control is implemented, things will fall in line automatically.
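A minimal sketch of the naming check behind Practices 5-7, assuming boto3; the regex simply encodes the example notation above (e.g. “EU-P-LWA001”) and would be adapted to your own standard.

# Hypothetical sketch: report security groups whose names break the example convention.
import re
import boto3

NAME_PATTERN = re.compile(r"^[A-Z]{2}-[PQTD]-[LW][WACD][A-Z][0-9]{3}$")

ec2 = boto3.client("ec2", region_name="us-east-1")

non_compliant = []
for page in ec2.get_paginator("describe_security_groups").paginate():
    for sg in page["SecurityGroups"]:
        if not NAME_PATTERN.match(sg["GroupName"]):
            non_compliant.append((sg["GroupId"], sg["GroupName"]))

for group_id, name in non_compliant:
    print(f"Non-compliant security group name: {name} ({group_id})")
    # alert the SOC/MS team here, e.g. via SNS as in the Practice 3 sketch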

Practice 8: Have automation in place to detect all EC2, ELB and other AWS assets associated with security groups. This automation helps to periodically detect Amazon Web Services security groups lying idle with no associations, alert the MS team and cleanse them. Unwanted security groups accumulated over time only create confusion.
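One hedged way to sketch this with boto3: since EC2, ELB, RDS and similar services all attach security groups through elastic network interfaces, groups with no ENI association are candidates for cleanup.

# Hypothetical sketch: list security groups not attached to any network interface.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

all_groups = {
    sg["GroupId"]: sg["GroupName"]
    for page in ec2.get_paginator("describe_security_groups").paginate()
    for sg in page["SecurityGroups"]
}

used_groups = set()
for page in ec2.get_paginator("describe_network_interfaces").paginate():
    for eni in page["NetworkInterfaces"]:
        for group in eni["Groups"]:
            used_groups.add(group["GroupId"])

for group_id, name in all_groups.items():
    if group_id not in used_groups:
        print(f"Unattached security group: {name} ({group_id})")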

Practice 9: When you create a VPC in your AWS account, AWS automatically creates a default security group for that VPC. If you don’t specify a different security group when you launch an instance, the instance is automatically associated with this default security group. It allows inbound traffic only from other instances associated with the “default” security group and allows all outbound traffic from the instance. The default security group specifies itself as a source security group in its inbound rules; this is what allows instances associated with the default security group to communicate with each other. This is not a good security practice. If you don’t want all your instances to use the default security group, create your own security groups and specify them when you launch your instances. This applies to EC2, RDS, ElastiCache and several other AWS services. So detect “default” security groups periodically and alert the SOC/MS team.
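A small sketch of that periodic detection, assuming boto3: list every “default” group and flag the ones that still have inbound rules or attached network interfaces.

# Hypothetical sketch: report usage of "default" security groups.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

defaults = ec2.describe_security_groups(
    Filters=[{"Name": "group-name", "Values": ["default"]}]
)["SecurityGroups"]

for sg in defaults:
    enis = ec2.describe_network_interfaces(
        Filters=[{"Name": "group-id", "Values": [sg["GroupId"]]}]
    )["NetworkInterfaces"]
    print(
        f"default SG {sg['GroupId']} (vpc={sg.get('VpcId')}): "
        f"{len(sg['IpPermissions'])} inbound rules, {len(enis)} attached ENIs"
    )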

Practice 10: Alerts by email and on the cloud management dashboard should be triggered whenever critical security groups or rules are added, modified or deleted in production. This is important for the reactive work of your managed services/security operations team and for audit purposes.

Practice 11: When you associate multiple security groups with an Amazon EC2 instance, the rules from each security group are effectively aggregated into one set of rules, and AWS uses this set to determine whether to allow access. If there is more than one rule for a specific port, AWS applies the most permissive rule. For example, if you have a rule that allows access to TCP port 22 (SSH) from IP address 203.0.113.10 and another rule that allows access to TCP port 22 for everyone, then everyone will have access to TCP port 22, because the most permissive rule takes precedence.
Practice 11.1: Have automated programs detect EC2 instances associated with multiple security groups/rules and alert the SOC/MS team periodically. Manually condense them to a maximum of 1-3 rules as part of your operations.

Practice 11.2: Have automated programs detect conflicting security groups/rules, such as restrictive and permissive rules applied together, and alert the SOC/MS team periodically.

Practice 12: Do not create wide-open security group rules like 0.0.0.0/0, which are open to everyone. Since web servers need to receive HTTP and HTTPS traffic from the Internet, only their security groups may be permissive in this way, for example:
0.0.0.0/0, TCP, 80, Allow inbound HTTP access from anywhere
0.0.0.0/0, TCP, 443, Allow inbound HTTPS access from anywhere
Any other wide-open security group created in your account should be alerted to the SOC/MS teams immediately.
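A minimal boto3 sketch of this check, ignoring the web ports 80/443 that a public web tier legitimately exposes:

# Hypothetical sketch: flag rules open to 0.0.0.0/0 on anything other than 80/443.
import boto3

ALLOWED_PUBLIC_PORTS = {80, 443}

ec2 = boto3.client("ec2", region_name="us-east-1")

for page in ec2.get_paginator("describe_security_groups").paginate():
    for sg in page["SecurityGroups"]:
        for perm in sg["IpPermissions"]:
            open_to_world = any(
                r.get("CidrIp") == "0.0.0.0/0" for r in perm.get("IpRanges", [])
            )
            port = perm.get("FromPort")  # None for "all traffic" rules
            if open_to_world and port not in ALLOWED_PUBLIC_PORTS:
                print(
                    f"{sg['GroupId']} ({sg['GroupName']}): "
                    f"{perm.get('IpProtocol')} port {port} open to 0.0.0.0/0"
                )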

Practice 13: Have a security policy not to launch servers on default ports like 3306, 1630, 1433, 11211, 6379, etc. If such a policy is adopted, security groups must then be created for the new, non-default listening ports instead of the default ports. This provides a small extra layer of defense, since one cannot infer from the security group port which service the EC2 instance is running. Automated detection and alerts should be created for the SOC/MS team if security groups are created with default ports.

Practice 14: Applications that must meet stricter compliance requirements like HIPAA, PCI, etc. need end-to-end transport encryption implemented on the server back end in AWS. The communication from the ELB to the Web, App, DB and other tiers needs to be encrypted using SSL or HTTPS. This means only secured ports like 443, 465 and 22 are permitted in the corresponding EC2 security groups. Automated detection and alerts should be created for the SOC/MS team if security groups for regulated applications are created on non-secure ports.

Practice 15: Detection, alerts and actions can be driven by parsing the AWS CloudTrail logs against the usual patterns observed in your production environment.
Example:
15.1: A port that was opened and closed within 30 (or X) minutes in production can be a candidate for suspicious activity if that is not a normal pattern for your production environment.
15.2: A permissive security group that was created and removed within 30 (or X) minutes can be a candidate for suspicious activity if that is not a normal pattern for your production environment.
In short, detect anomalies in how long a security group change stayed in effect before being reverted in production.
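A crude, hedged sketch of 15.1 using boto3 and the CloudTrail lookup API: pair authorize/revoke events from the last 24 hours and print those that happened within 30 minutes of each other. A production version would also match the group ID inside each event payload.

# Hypothetical sketch: find ingress rules opened and revoked again within a short window.
from datetime import datetime, timedelta
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")
WINDOW = timedelta(minutes=30)
start = datetime.utcnow() - timedelta(hours=24)

def events(name):
    found = []
    for page in cloudtrail.get_paginator("lookup_events").paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": name}],
        StartTime=start,
    ):
        found.extend(page["Events"])
    return found

opens = events("AuthorizeSecurityGroupIngress")
closes = events("RevokeSecurityGroupIngress")

for o in opens:
    for c in closes:
        if c["EventTime"] > o["EventTime"] and c["EventTime"] - o["EventTime"] < WINDOW:
            print(f"Rule opened at {o['EventTime']} and revoked at {c['EventTime']}")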

Practice 16: When ports have to be opened in Amazon Web Services security groups, or a permissive security group needs to be applied, automate the entire process as part of your operations so that the security group is open for an agreed X minutes and is then automatically closed, in line with your change management. Reducing manual intervention avoids operational errors and adds security.
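A toy sketch of that open-for-a-window workflow, assuming boto3 and a placeholder group ID; in production the revoke step would run from a scheduled Lambda/EventBridge rule tied to the change ticket rather than time.sleep().

# Hypothetical sketch: open SSH for an agreed window, then revoke the same rule.
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

GROUP_ID = "sg-0123456789abcdef0"      # placeholder security group
PERMISSION = [{
    "IpProtocol": "tcp",
    "FromPort": 22,
    "ToPort": 22,
    "IpRanges": [{"CidrIp": "203.0.113.10/32", "Description": "temporary support access"}],
}]
WINDOW_MINUTES = 30

ec2.authorize_security_group_ingress(GroupId=GROUP_ID, IpPermissions=PERMISSION)
time.sleep(WINDOW_MINUTES * 60)
ec2.revoke_security_group_ingress(GroupId=GROUP_ID, IpPermissions=PERMISSION)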

Practice 17: Make sure SSH/RDP access is open in AWS security groups only for jump boxes/bastion hosts in your VPC/subnets. Have stricter controls/policies to avoid opening SSH/RDP to other instances in the production environment. Periodically check for, alert on and close this loophole as part of your operations.

Practice 18: It is a bad practice to have SSH open to the entire Internet for emergency or remote support. By allowing the entire Internet access to your SSH port, there is nothing stopping an attacker from probing and exploiting your EC2 instance. The best practice is to allow only very specific IP addresses in your security groups; this restriction improves protection. These could be the addresses of your office, on-premise network or data center through which you connect to your jump box.

Practice 19: Too many or too few: how many security groups are preferred for a typical multi-tiered web app is a frequently asked question.
Option 1: One security group cutting across multiple tiers is easy to configure, but it is not recommended for secure production applications.
Option 2: One security group for every instance is too much protection and is tough to manage operationally in the longer term.
Option 3: An individual security group for each tier of the application; for example, separate security groups for the ELB, Web, App, DB and Cache tiers of your application stack.
Periodically check whether Option 1 style groups are being created in your production environment and alert the SOC/MS team.

Practice 20: Avoid allowing UDP or ICMP to private instances in security groups unless specifically needed.

Practice 21: Open only specific ports; opening a range of ports in a security group is not a good practice. You can add many inbound ingress rules to a security group, and it is always advisable to open specific ports like 80 and 443 rather than a range of ports like 200-300.


Practice 22: Private subnet instances can be accessed only from within the VPC CIDR IP range. Opening them to public IP ranges is possible, but it does not make any sense; for example, opening HTTP to 0.0.0.0/0 in the security group of a private subnet instance achieves nothing. Detect and cleanse such rules.

 

Practice 23: AWS CloudTrail captures security-related events. AWS Lambda functions or automated programs should trigger alerts to operations when abnormal activities are detected. For example:
23.1: Alert when X security groups are added/deleted within Y hours or days by an IAM user/account.
23.2: Alert when X security group rules are added/deleted within Y hours or days by an IAM user/account.

Practice 24: If you are an enterprise, make sure all security group activities in production are part of your change management process. Security group actions can be manual or automated within your change management.
If you are an agile startup or SMB and do not have a complicated change management process, then automate most of the security-group-related tasks and events as illustrated in the practices above. This will bring immense efficiency into your operations.

Practice 25: Use outbound/egress security group rules wherever applicable within your VPC. For example, restrict FTP connections to any server on the Internet from your VPC. This way you can avoid data dumps and important files being transferred out of your VPC. Defend harder and make it tougher!
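A hedged sketch of tightening egress with boto3, using a placeholder group ID; it removes the default allow-all outbound rule (assuming it is still present) and permits only HTTPS out.

# Hypothetical sketch: replace the default allow-all egress rule with HTTPS-only egress.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
GROUP_ID = "sg-0123456789abcdef0"  # placeholder

# Drop the default "all traffic to 0.0.0.0/0" egress rule
ec2.revoke_security_group_egress(
    GroupId=GROUP_ID,
    IpPermissions=[{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)

# Allow only outbound HTTPS
ec2.authorize_security_group_egress(
    GroupId=GROUP_ID,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)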

Practice 26: For some tiers of your application, use an ELB in front of your instances as a security proxy with restrictive security groups, that is, restrictive ports and IP ranges. This doubles your defense but adds latency.

Practice 27: Some of the tools we use to automate and meet the above best practices are ServiceNow, Amazon CFT, AWS APIs, Rundeck, Puppet, Chef, and automated programs written in Python, .NET and Java.

Note: If your organization’s Dev/Ops/DevOps teams need help with automating the security group best practices listed above, feel free to contact me at harish11g.aws@gmail.com.


About the Author

Harish Ganesan is the Chief Technology Officer (CTO) of 8K Miles and is responsible for the overall technology direction of the 8K Miles products and services. He has around two decades of experience in architecting and developing Cloud Computing, E-commerce and Mobile application systems. He has also built large internet banking solutions that catered to the needs of millions of users, where security and authentication were critical factors. He is also a prolific blogger and frequent speaker at popular cloud conferences.

 

Apache Solr to Amazon CloudSearch Migration Tool

In this post, we are introducing a new tool called S2C – Apache Solr to Amazon CloudSearch Migration Tool. S2C is a Linux console based utility that helps developers / engineers to migrate search index from Apache Solr to Amazon CloudSearch.

Very often customers initially build search for their website or application on top of Solr, but later run into challenges like elastic scaling and managing the Solr servers. This is a typical scenario we have observed in our years of search implementation experience. For such use cases, Amazon CloudSearch is a good choice. Amazon CloudSearch is a fully-managed service in the cloud that makes it easy to set up, manage, and scale a search solution for your website. To know more, please read the Amazon CloudSearch documentation.

We are seeing a growing trend every year of organizations of various sizes migrating their workloads to Amazon CloudSearch and leveraging the benefits of a fully managed service. For example, Measured Search, an analytics and e-commerce platform vendor, found it easier to migrate to Amazon CloudSearch than to scale Solr themselves (see the article for details).

Since Amazon CloudSearch is built on top of Solr, it exposes all the key features of Solr while providing the benefits of a fully managed service in the cloud such as auto-scaling, self-healing clusters, high availability, data durability, security and monitoring.

In this post, we provide step-by-step instructions on how to use the Apache Solr to Amazon CloudSearch Migration (S2C) tool to migrate from Apache Solr to Amazon CloudSearch.

Before we get into detail, you can download the S2C tool in the below link.
Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

Pre-Requisites

Before starting the migration, the following pre-requisites have to be met. The pre-requisites include installations and configuration on the migration server. The migration server could be the same Solr server or independent server that sits between your Solr server and Amazon CloudSearch instance.

Note: We recommend running the migration from the Solr server instead of independent server as it can save time and bandwidth. It is much better if the Solr server is hosted on EC2 as the latency between EC2 and CloudSearch is relatively less.

The following installations and configuration should be done on the migration server (i.e. your Solr server or any new independent server that connects between your Solr machine and Amazon CloudSearch).

  1. The application is developed in Java. Download and install Java 8. Validate the JDK path and ensure that environment variables such as JAVA_HOME, classpath and path are set correctly.
  2. We assume you have already set up an Amazon Web Services IAM account. Please ensure the IAM user has the right permissions to access AWS services like CloudSearch.
    Note: If you do not have an AWS IAM account with above mentioned permissions, you cannot proceed further.
  3. The IAM user should have an AWS access key and secret key. On the application hosting server, set up the Amazon environment variables for the access key and secret key. It is important that the application runs using the AWS environment variables.
    To set up the AWS environment variables, please read the links below. It is important that the tool is run using AWS environment variables.
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/credentials.html
    http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-roles.html
    Alternatively, you can set the following AWS environment variables by running the commands below from Linux console.
    export AWS_ACCESS_KEY=Access Key
    export AWS_SECRET_KEY=Secret Key
  4. Note: This step is applicable only if migration server is hosted on Amazon EC2.
    If you do not have an AWS Access key and Secret key, you can opt for IAM role attached to an EC2 instance. A new IAM role can be created and attached to EC2 during the instance launch. The IAM role should have access to Amazon CloudSearch.
    For more information, read the below link
    http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
  5. Download the migration utility ‘S2C’ (you may have completed this step earlier), unzip the tool and copy it into your working directory.
    Download Link: https://s3-us-west-2.amazonaws.com/s2c-tool/s2c-cli.zip

S2C Utility File
The downloaded ‘S2C’ migration utility should have the following sub directories and files.

Folder / Files – Description
bin – Binaries of the migration tool
lib – Libraries required for migration
application.conf – Configuration file that allows end users to input parameters. Requires end-user input.
logback.xml – Log file configuration. Optional; does not require end-user/developer input.
s2c – Script file that executes the migration process

Configure only application.conf and logback.xml.  Do not modify any other file.
Application.conf: The application.conf file has the configuration related to the new Amazon CloudSearch domain that will be created. The parameters configured  in the application.conf file are explained in the table below.

s2c {
  api {
    SchemaParser = "s2c.impl.solr.DefaultSchemaParser"
    SchemaConverter = "s2c.impl.cs.DefaultSchemaConverter"
    DataFetcher = "s2c.impl.solr.DefaultDataFetcher"
    DataPusher = "s2c.impl.cs.DefaultDataPusher"
  }
  solr {
    dir = "files"
    server-url = "http://localhost:8983/solr/collection1"
    fetch-limit = 100
  }
  cs {
    domain = "collection1"
    region = "us-east-1"
    instance-type = "search.m3.xlarge"
    partition-count = 1
    replication-count = 1
  }
  wd = "/tmp"
}

api – List of APIs that are executed step by step during the migration. Do not change this.
solr.dir – The base directory path of Solr. Ensure the directory is present and valid. E.g. /opt/solr/example/solr/collection1/conf
solr.server-url – Server host, port and collection path; the endpoint used to fetch the data. If the utility is run from a different server, ensure the IP address and port have firewall access.
solr.fetch-limit – Number of Solr documents fetched per batch call. This number should be set carefully by the developer; it depends on the following factors:

  1. Record size of a Solr record (1 KB or 2 KB)
  2. Latency between the migration server and Amazon CloudSearch
  3. Current request load on the Solr server

E.g.: if there are 100000 Solr documents and the fetch limit is 100, it takes 100000 / 100 = 1000 batch calls to complete the fetch. If each Solr record is 2 KB in size, then 100000 * 2 KB = 200 MB of data is migrated.

cs.domain – CloudSearch domain name. Ensure that the domain name does not already exist.
cs.region – AWS region for the new CloudSearch domain.
cs.instance-type – Desired instance type for the CloudSearch nodes. Choose the instance type based on the volume of data and the expected query volume.
cs.partition-count – Number of partitions required for CloudSearch.
cs.replication-count – Replication count for CloudSearch.
wd – Temporary file path to store intermediate data files and migration log files.

Running the migration

Before launching the S2C migration tool, verify the following:

    • Solr directory path – Make sure that the Solr directory path is valid and available. The tool cannot read the configuration if the path or directory is invalid.
    • Solr configuration contents – Validate that the Solr configuration contents are correctly set inside the directory. Example: solrconfig.xml, schema.xml, stopwords.txt, etc.
    • Make sure that the working directory is present in the file system and has write permissions for the current user. It can be an existing directory or a new directory. The working directory stores the fetched data from Solr and migration logs.
    • Validate the disk size before starting the migration. If the available free disk space is less than the size of the Solr index, the fetch operations will fail.

For example, if the Solr index size is 7 GB, make sure that the disk has at least 10 GB to 20 GB of free space.
Note: The tool reads the data from Solr and stores in a temporary directory (Please see configuration wd = /tmp in the above table).

    • Verify that the AWS environment variables are set correctly. The AWS environment variables are mentioned in the pre-requisites section above.
    • Validate the firewall rules for IP addresses and ports if the migration tool is run from a different server or instance. For example, the Solr default port 8983 should be open to the EC2 instance executing this tool.

Run the following command from the directory ‘{S2C filepath}’.
Example: /build/install/s2c-cli

./s2c
or, to run with an explicit heap size:
JVM_OPTS="-Xms2048m -Xmx2048m" ./s2c

The above invokes the ‘s2c’ shell script, which starts the search migration process. The migration is a series of steps that require user input, as shown in the screenshots below.
Step 1: Parse the Solr schema
The first step of the migration prompts for confirmation to parse the Solr schema and Solr configuration file. During this step, the application generates a ‘Run Id’ folder inside the working directory.
  Example: /tmp/s2c/m1416220194655

The Run Id is a unique identifier for each migration. Note down the Run Id as you will need it to resume the migration in case of any failures.

Step 2: Schema conversion from Solr to CloudSearch
The second step prompts for confirmation to convert the Solr schema to a CloudSearch schema. Press any key to proceed.

This step also lists all the converted fields which are ready to be migrated from Solr to CloudSearch. If any fields are left out, you can abort the migration, identify the ignored fields, rectify the original schema and re-run the migration. The screenshot below shows the fields ready for CloudSearch migration.


Step 3: Data Fetch
The third step prompts for confirmation to fetch the search index data from the Solr server. Press any key to proceed. This step generates a temporary file in the working directory containing all the documents fetched from the Solr index.


There is also an option to skip the fetch process if all the Solr data is already stored in the temporary file. In this case, the prompt looks like the screenshot below.

Step 4: Data push to CloudSearch
The last and final step prompts for confirmation to push the search data from the temporary file store to Amazon CloudSearch. This step also creates the CloudSearch domain with the configuration specified in application.conf, including the desired instance type, replication count and multi-AZ options.

If the domain is already created, the utility will prompt to use the existing domain. If you do not wish to use an existing domain, you can create a new CloudSearch domain using the same prompt.
Note: The console does not prompt for any ‘CloudSearch domain name’ but instead it uses the domain name configured in the application.conf file.

Step 5: Resume (Optional) During the migration steps, if there is any failure during the fetch operation, it can be resumed. This is illustrated in the screen shot below.

Step 6: Verification
Log in to the AWS CloudSearch management console to verify the domain and index fields.

Amazon CloudSearch allows running test queries to validate the migration as well as the functionality of your application.
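Such a verification can also be scripted. Below is a minimal boto3 sketch, assuming the domain name from application.conf; it checks the domain status and runs a simple test query against the search endpoint reported by the service.

# Hypothetical sketch: verify the migrated domain and run a test query.
import boto3

cs = boto3.client("cloudsearch", region_name="us-east-1")
status = cs.describe_domains(DomainNames=["collection1"])["DomainStatusList"][0]
endpoint = status["SearchService"]["Endpoint"]
print("Search service endpoint:", endpoint)

domain = boto3.client(
    "cloudsearchdomain",
    region_name="us-east-1",
    endpoint_url="https://" + endpoint,
)
result = domain.search(query="test", queryParser="simple", size=5)
print("Hits found:", result["hits"]["found"])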

Supported features and limitations

  • Support for other non-Linux environments is not available for now.
  • Support for Solr Shards is not available for now. The Solr shard needs to be migrated separately.
  • The install commands may vary across Linux flavors; for example, the commands for installing software, editing files and setting permissions can differ for each Linux flavor. It is left to the engineering team to choose the right commands during the installation and execution of this migration tool.
  • Only fields configured as ‘stored’ in Solr schema.xml are supported. The non-stored fields are ignored during schema parsing.
  • The document id (unique key) is required to have following attributes:
    1. Document ID should be 128 characters or less in size.
    2. Document ID can contain any letter, any number, and any of the following characters: _ - = # ; : / ? @ &
    3. The link below will help you understand data preparation before migrating to CloudSearch: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/preparing-data.html
  • If the conditions are not met in a document, it will be skipped during migration. Skipped records are shown in the log file.
  • If a field type (mapped to fields) is not stored, the stopwords mapped to that particular field type are ignored.

Example 1:

<field name="description" type="text_general" indexed="true" stored="true" />

Note: The above field ‘description’ will be considered for stopwords.

Example 2:

<field name="fileName" type="string" />

Note: The above field ‘fileName’ will not be migrated and will be ignored for stopwords.

Please do write your feedback and suggestions in the below comments section to improve this tool. The source code of the tool can be downloaded at https://github.com/8KMiles/s2c/. We have written a follow-up post in regard to that.

About the Authors
 Dhamodharan P is a Senior Cloud Architect at 8KMiles.


 Dwarakanath R is a Principal Architect at 8KMiles.


EzIAM – Moving your Identities to the Cloud – An Analysis

Before an enterprise implements an on-premise IDM (Identity Management) solution, there are a lot of factors to consider. These considerations multiply if the enterprise decides to implement a new cloud IDM solution, i.e. to move their identities partially or fully to a cloud like AWS, Azure or Google and manage those identities using a cloud IDM solution like EzIAMTM. I will touch upon these items below.

There could be 3 types of movers to the cloud.

  • A new enterprise (or start-up) planning to start its operations with a cloud IDM straightaway. These enterprises may not have an on-premise presence at all (Neo IDM Movers).
  • Enterprises planning to move only some of their existing IDM parts to the cloud and keep the rest on-premise (generally called Hybrid IDM Movers).
  • Enterprises that move their entire on-premise IDM operations to the cloud (Total IDM Movers).
Although there are some common considerations for these 3 categories of movers before they decide to move to cloud IDM, each category also has some unique issues to deal with.

New Movers to a cloud IDM Infrastructure – companies starting their operations in the Cloud & hence want to have all their identities in the new cloud IDM infrastructure from day 1 of their operations:

These are the companies that start their identity management in the cloud itself straightaway. The number of questions these enterprises need answered is far smaller compared to the other 2 categories of enterprises. The prime considerations for these types of organizations would be:

1. Will the cloud IDM solution be safe to implement (i.e safe to have my corporate users & identities exist in there) ?
2. Will the cloud IDM solution be able to address the day-to-day IDM operations/workflows that each user is going to go through?
3. Will the cloud IDM solution be able to scale for the number of users ?
4. What are the connectivity options (from a provisioning standpoint) that the cloud-idm system provides ? (i.e connecting to their applications/db’s/directories that are existing on the cloud, assuming they are a complete cloud organization)?
5. How robust these connections are (i.e in terms of number of concurrent users, data transport safety) ?
6. What are the Single Sign-On connectivity options that the solution provides ?
7. What are the advanced authentication mechanisms that the solution provides ?
8. What are the compliance and regulatory mechanisms in place ?
9. What are the data backup and recovery technologies in place ?
10. What are the log and audit mechanisms in place ?

If the organization can get convincing answers to the above questions, I think it is prudent for them to move their identities to the cloud. EzIAMTM, a cloud IDM solution from 8KMiles Inc., has the best possible answers to the above questions in the market today. It is definitely an identity-safe, data-safe and transport-safe solution, meaning identities stored within EzIAMTM directories and databases stay there in a secure manner, and when transported, either within the cloud or outside, always go through a TLS tunnel. Each component of EzIAMTM (there are 7 components/servers) is load-balanced and tuned for high-scale performance.

There are more than 30 out-of-the-box provisioning connectors available to connect to various directories, databases and software applications. The Single Sign-On connectivity options are numerous, with support for SAML 2.0, OpenID 2.0 and OAuth 2.0. Varied advanced authentication mechanisms are supported, ranging from X.509 certificate/smart card based tokens to OTP/mobile-based authentication. Being in the AWS cloud, the backup and recovery processes are as efficient as any backup process can be: daily backups of snapshots and data are taken, with the ability to recover within minutes.

Hybrid Movers to a cloud IDM infrastructure – companies moving their on-premise identities & applications to the cloud but not fully yet :

Most companies fall into this category. These movers migrate only a few parts of their IDM infrastructure to the cloud. They would typically move their applications to the cloud first, then their user stores/directories and, along with those, their identities. They would still have some applications on-premise which they need to connect to from the cloud IDM solution, and they would also want to run their daily identity workflow processes from the cloud IDM solution. This way they can streamline their operations, especially if they have offices in multiple locations with users in multiple Organizational Units (OUs) accessing multiple on-premise and cloud applications.

Hybrid movers would have the maximum expectations from their cloud IDM solution, as the solution needs to address both their on-premise and cloud assets. Generally if these movers can get answers to the following tough questions, they will be much satisfied, before they move their IDM assets to the cloud.

1. Will the cloud IDM solution enable me to have a single primary Corporate Directory in the cloud? How will it enable the move of my current on-premise primary directory/user database to the cloud?
2. Will the solution allow me to provision users from our existing on-premise endpoints to the cloud?
3. Will the solution help me keep my on-premise endpoints (that contain user identities) intact and move these endpoints to the cloud in stages?
4. I have applications, on-premise whose access is controlled by on-premise Access Control software. How can I continue to have these applications on-premise and enable access control to them via the cloud IDM solution?
5. How will the solution provide access control to the applications that I am going to move to the cloud?
6. Will the cloud IDM solution help me chalk out a new administrator/group/role/user base structure?
7. Will the solution help me control my entire IDM life-cycle management (from the day a user joins the org to the time any user leaves the org) through the cloud IDM ?
8. How exhaustive will the cloud IDM solution allow my access permission levels to be?
9. How often would the cloud IDM solution allow me to do a bulk-load of users from an on-premise directory or db?
10. What will the performance of the system be when I perform other IDM operations during this bulk-load of users?
11. Will the solution allow us to have a separate HR application which we would want to be connected and synched up with the cloud IDM Corporate Directory?
12. What are the security benefits in connectivity, transport, access control, IDM life cycle operations, provisioning, admin-access etc. that the solution offers?
13. What are the connectivity options (i.e connecting to other enterprise applications across that enterprise’s firewall’s?)
14. What SaaS applications that the solution would allow the users to connect to in the future? How would the solution control those connections through a standard universal access administration for my company?

Total Movers to a cloud IDM infrastructure – companies that move 100% of their identity infrastructure to the cloud from an on-premise datacenter :

The primary questions from the “Total Movers” of IDM to the cloud would be the following:

1. How can I move my entire IDM infrastructure without losing data, application access control, identity workflows, endpoint identity data and connectors?
2. How long would the move take?
3. Would I be able to setup a QA environment and test the system thoroughly before moving to production in the cloud?
4. How can I transition from my on-premise IDM software to a different cloud IDM software like EzIAMTM?
5. What is the learning curve for my users to use this system?
6. How can I customize the cloud IDM user interface so that it reflects my organization’s profile and IDM goals/strategies?
7. How much can I save in trained IDM skilled personnel and on-premise infrastructure costs when I move my IDM to the cloud in its entirety?

For all 3 kinds of cloud movers described above, EzIAMTM would be a perfect solution. Pretty much all the questions posed above, for all types of movers, can be answered by deploying EzIAMTM. The solution is versatile and customizable and has great connectivity options to all types of endpoints an enterprise can have. The learning curve is minimal, as the screens are intuitive, and mobile access is enabled. The ability to integrate EzIAMTM with a cloud Governance Service solution is an added incentive for the movers, as it helps them govern their identity environment efficiently.

Database Scalability simplified with Microsoft Azure Elastic Scale (Preview) for SQL Azure

Database multi-tenancy is little known in traditional enterprises, but it is a must and a very familiar topic among SaaS application developers. It is also very hard to develop and manage a multi-tenant application and database infrastructure.

SaaS developers usually build custom sharding architectures because of the deep limitations and inflexibility of SQL Server Federation services. Custom sharding and database scalability architectures are largely manual and consist of lots of moving parts. One such manual architecture usually suggested by application architects is to create a fixed set of databases with an identical schema and manage a master connection-string table that maps each sharding ID/customer ID to the right shard (refer to the illustration below).

As such, there is nothing wrong with the architecture above, but it imposes a number of challenges, such as:

1. Infrastructure Challenges

  • a) Maintaining and managing the shard metadata DB infrastructure
  • b) Splitting one noisy customer’s data from one shard to another shard
  • c) Scaling up a particular shard on an as-needed basis
  • d) Querying data across multiple shard databases
  • e) Merging shards to cut down costs

2. OLAP Challenges (Database Analysis & Warehousing)

  • a) Developers cannot issue a single query to fetch data from 2 different shards
  • b) Conducting data analysis is hard

Introducing Azure Elastic Scale

Azure introduced the “SQL Azure Elastic Scale SDK” to overcome these challenges. To get started with Azure Elastic Scale, you can download the sample app, which is a good starting point for understanding the SDK inside out. Azure Elastic Scale has the following key components that make sharding simple:

  1. Shard Map Manager (SMM)
  2. Data Dependent Routing (DDR)
  3. Multi Shard Query(MSQ)
  4. Split & Merge

Shard Map Manager

This is the key part of the SQL Azure Elastic Scale sharding function. The Shard Map Manager is essentially a master database that holds the shard mapping details along with the shard range key (Customer ID/Dept. ID/Product ID); refer to the screenshot below. When you add or remove a shard, an entry is created or removed in the SMM database.

 

Data Dependent Routing

After the shards have been defined in the Shard Map Manager DB, the Data Dependent Routing API takes care of routing each customer request to the appropriate shard. Data dependent routing uses the shard key to identify the right shard for the specific customer. Shards can also span Azure regions, which helps keep a shard close to the customer, reducing network latency and helping with compliance requirements.

DDR also caches the shard details received from the SMM DB to avoid an unwanted round trip on each request. This cache is invalidated whenever you change the shard details.

 

Multi Shard Query

Multi Shard Query is the key API that allows querying data from multiple shards and joining the result sets. Under the hood, it queries the data individually from the different shards and applies a join on the received result sets to return unified data. Multi Shard Query is only suitable when all shards share an identical schema; it is not suited to shards with custom, divergent schemas. Click here to view the complete list of APIs that are part of the Microsoft.Azure.SqlDatabase.ElasticScale namespace.

Example

public static void ExecuteMultiShardQuery(RangeShardMap<int> shardMap, string credentialsConnectionString)
        {
            // Get the shards to connect to
            IEnumerable<Shard> shards = shardMap.GetShards();

            // Create the multi-shard connection
            using (MultiShardConnection conn = new MultiShardConnection(shards, credentialsConnectionString))
            {
                // Create a simple command
                using (MultiShardCommand cmd = conn.CreateCommand())
                {
                    // Because this query is grouped by CustomerID, which is sharded,
                    // we will not get duplicate rows.
                    cmd.CommandText = @"
                        SELECT 
                            c.CustomerId, 
                            c.Name AS CustomerName, 
                            COUNT(o.OrderID) AS OrderCount
                        FROM 
                            dbo.Customers AS c INNER JOIN 
                            dbo.Orders AS o
                            ON c.CustomerID = o.CustomerID
                        GROUP BY 
                            c.CustomerId, 
                            c.Name
                        ORDER BY 
                            OrderCount";

                    // Append a column with the shard name where the row came from
                    cmd.ExecutionOptions = MultiShardExecutionOptions.IncludeShardNameColumn;

                    // Allow for partial results in case some shards do not respond in time
                    cmd.ExecutionPolicy = MultiShardExecutionPolicy.PartialResults;

                    // Allow the entire command to take up to 30 seconds
                    cmd.CommandTimeout = 30;

                    // Execute the command. 
                    // We do not need to specify retry logic because MultiShardDataReader will internally retry until the CommandTimeout expires.
                    using (MultiShardDataReader reader = cmd.ExecuteReader())
                    {
                        // Get the column names
                        TableFormatter formatter = new TableFormatter(GetColumnNames(reader).ToArray());

                        int rows = 0;
                        while (reader.Read())
                        {
                            // Read the values using standard DbDataReader methods
                            object[] values = new object[reader.FieldCount];
                            reader.GetValues(values);
                            // Extract just database name from the $ShardLocation pseudocolumn to make the output formater cleaner.
                            // Note that the $ShardLocation pseudocolumn is always the last column
                            int shardLocationOrdinal = values.Length - 1;
                            values[shardLocationOrdinal] = ExtractDatabaseName(values[shardLocationOrdinal].ToString());
                            // Add values to output formatter
                            formatter.AddRow(values);
                            rows++;
                        }

                        Console.WriteLine(formatter.ToString());
                        Console.WriteLine("({0} rows returned)", rows);
                    }
                }
            }
        }

Download the sample application from here.

Elastic Scale Split & Merge

As the name suggests, Split & Merge helps developers split or merge DB shards based on need. There are 2 key scenarios where you would need the Split & Merge functionality:

  1. Moving data from a heavily growing hotspot database to a new shard
  2. Merging 2 under-utilized databases to reduce database cost

Split & Merge is a combination of .NET APIs, a Web API and a PowerShell package. Refer to the links below for an introduction and a step-by-step implementation guide.

 

Introduction : http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-overview-split-and-merge/

Step by Step Guide: http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-configure-deploy-split-and-merge/

About the Author

 Ilyas is a Cloud Solution Architect at 8K Miles specializing in the Microsoft Azure and AWS clouds. He is also passionate about Big Data, Analytics and Machine Learning technologies.

LinkedIn || Twitter 

8K Miles Blood Donation Drive: a CSR Initiative

As part of its Corporate Social Responsibility initiatives, 8KMiles Software Services, in association with Jeevan Blood Bank and Research Centre (a public charitable trust), conducted a Blood Donation Camp at our Chennai premises on Mar 31, 2015. Many of our employees eagerly volunteered and donated blood in the interest of saving lives.

 

“I have been a Blood Donator ever since my college days, I truly enjoy the satisfaction that I get when I donate blood every time.” Vinoth, Human Resource, 8KMiles Software Services.

 

Our Blood Donors from 8K Miles Office Chennai, India

 

 

Want to become a game changer with a sense of social responsibilities? Head to our careers page.

EzIAM – On-premise and Cloud Connectivity Options

For an enterprise, a key factor in selecting a good Cloud Identity Management service is the ability of the service to connect to on-premise and cloud endpoints. Enterprises normally have user data stored in their endpoints. This user data ranges from the groups the user belongs to (a segregation of users into a classification needed by the endpoint), to the user's roles (a classification that allows the user to perform a particular function in that endpoint), to the user's access privileges (permission levels to access resources) for that particular endpoint.

The same user in an enterprise can have different types of access, perform different roles and be part of different groups in different endpoints. We normally witness that, once the number of endpoints grows in an enterprise, this user data and the related objects like groups and roles become unmanageable and untraceable. Enterprises want this problem fixed.

Any enterprise would love to know what types of access each user has in each of the endpoints at a given point in time. It would be key for them to have this information in one central place. Having this data in a single location, would help enterprise managers to look out for improper access, redundant roles in each of the endpoints on a periodic basis.

EzIAMTM (a Cloud Identity Management Service from 8KMiles Inc.) offers a provisioning directory, where relevant data (users, groups, roles) from the endpoints can be stored and accessed by the enterprise administrator. (Please refer to my previous blog, EzIAM FAQ, to learn about the origins and capabilities of EzIAMTM.) This data can then be imported into an Identity Access Governance Service, which analyzes the roles, groups and access permissions during periodic certification campaigns conducted by business managers; that can be a topic for a future blog. Now, let us delve into the various facets of endpoint connectivity available within EzIAMTM.

EndPoints:

Endpoints are directories, databases, LDAPs, applications, OS user stores, etc. Almost any endpoint in an enterprise has a data store where the users of that particular endpoint are stored. Sometimes the endpoints themselves are applications, in which case there is an application database where the user information is stored. Almost any system that contains user information can act as an endpoint for EzIAMTM. The endpoints can reside either on-premise or in a cloud.

Typically, an endpoint is a specific installation of a platform or application, such as Active Directory or Microsoft Exchange, which communicates with Identity Management to synchronize information (primarily attributes of a user stored in the endpoint). An endpoint can be:

■ An operating system (such as Windows)
■ A security product that protects an operating system (such as CA Top Secret and CA ACF2)
■ An authentication server that creates, supplies, and manages user credentials (such as CA Arcot)
■ A business application (such as SAP, Oracle Applications, and PeopleSoft)
■ A cloud application (such as Salesforce and Google Apps)

Connectors:

A connector is the software that enables communication between EzIAMTM and an endpoint system. A connector server (an EzIAMTM Server Component) uses a connector to manage an endpoint. One can generate a dynamic connector using Connector Xpress (an EzIAM Tool), or one can develop a custom static connector in Java. For each endpoint that you want to manage, you must have a connector. Connectors are responsible for representing each of the managed objects in the endpoint in a consistent manner. Connectors translate add, modify, delete, rename, and search LDAP operations on those objects into corresponding actions against the endpoint system. A connector acts as a gateway to a native endpoint type system technology. For example, to manage computers running Active Directory Services (ADS) install the ADS connector on a connector server.

Three Types of Connectors:

EzIAMTM has a rich set of On-premise connectivity options. There are 3 primary ways of connecting to endpoints;

C++ Connectors (managed by C++ Connector Server (CCS))
Java Connectors (managed by CA IAM Connector Server (CA IAM CS)).
Provisioning Server Plugins

The endpoints (in the diagram, courtesy: CA) the Connectors connect to range primarily from PeopleSoft, SalesForce (IAM CS) to AD, DB2 (C++ Connector), RACF(Prov. Server Plugin). These are just examples of connectors. A list of out-of-the-box connectors is given in “Connecting to endpoints” sub-section below.

One cannot use both CA IAM CS and CCS to manage the same endpoint type.

What Connectors Can Do:

EzIAMTM has a number of out-of-the-box Connectors that help to connect to popular endpoints. Each connector lets Identity Management within EzIAMTM, perform the following operations on managed objects on the endpoint:
■ Add
■ Modify—Changes the value of attributes, including modifying associations between them (for example, changing which accounts belong to a group).
■ Delete
■ Rename
■ Search—Queries the values of the attributes that are stored for an endpoint system or the managed objects that it contains.
For most endpoint types, all of these operations can be performed on accounts. These operations can also be performed on other managed objects if the endpoint permits it.

Connecting to Endpoints:

Popular out-of-the-box Connectors in EzIAMTM:
CA Access Control Connector
CA ACF2 v2 Connector
CA Arcot Connector
CA DLP Connector
CA SSO Connector for Advanced Policy Server
CA Top Secret Connector
IBM DB2 UDB for z/OS Connector
Google Apps Connector
IBM DB2 UDB Connector
IBM RACF v2 Connector
Kerberos Connector
Lotus Domino Connector
Microsoft Active Directory Services Connector
Microsoft Exchange Connector
Microsoft Office 365 Connector
Microsoft SQL Server Connector
Microsoft Windows Connector
Oracle Applications Connector
Oracle Connector
IBM i5/OS (OS/400) Connector
PeopleSoft Connector
RSA ACE (SecurID) Connector
RSA Authentication Manager SecurID 7 Connector
Salesforce.com Connector
SAP R/3 Connector
SAP UME Connector
Siebel Connector
UNIX ETC and NIS Connector

Ways to Create a New Connector:

One can connect to an endpoint that is not supported out-of-the-box in EzIAMTM, also. To do this, an enterprise needs to create its own connector in one of these ways:

■ Use Connector Xpress to create the connector.
■ Use the CA IAM CS SDK to create the connector.
■ Ask 8KMiles to create a connector.

Set Up Identity Management Provisioning with Active Directory:

One can use Active Directory Server (ADS) to synchronize attribute data to supported endpoints. This could be done by configuring CA IAM CS to propagate local changes in Active Directory to a cloud-based identity store using a connector. For example, assume that you have a GoogleApps installation in the cloud. You could create an ADS group named “GoogleApps” and then configure the CA IAM CS to monitor that group. CA IAM CS synchronizes any changes to the GoogleApps environment in the cloud. If you add a user to the ADS GoogleApps group, CA IAM CS uses the GoogleApps connector to trigger a “Create User” action in the GoogleApps environment proper.

To set up directory synchronization:
1. Install CA IAM CS in your environment.
2. Acquire the endpoints that you want to synchronize with. You must acquire endpoints in order to create templates in step 4.
3. Create one or more directory monitors. Monitors capture changes that you make in your local Active Directory, and report them for the synchronization.
4. Create one or more synchronization templates. Templates control settings for the directory synchronization.

Custom Connectors:

Custom connectors are connectors that can be programmed (mostly from pre-available template structures) to enable an enterprise to connect to custom endpoints, i.e. endpoints that are not supported out-of-the-box in EzIAMTM.

Custom Connector Implementation Guidelines:

It would help the developers to consider the following guidelines when designing and implementing a connector:

■ Drive as much of the connector implementation logic as possible using metadata.
■ Write code that takes advantage of the service provided by the CA IAM CS framework, like pluggable validators and converters, and connection pooling support classes.
■ Write custom connector code to address any additional specific coding requirements.

In summary, connection to endpoints is a critical aspect of modern Cloud Identity Management systems. The crucial Connector properties to look for from your Cloud Identity Management system would be,

  • the efficiency of the connectors that would dictate the speed of data transfer between the endpoint and the Corporate user store
  • the synchronization of attributes between the endpoint and the store (strong synchronization vs weak synchronization)
  • the customization aspects of the connector (connector pool size, reverse synchronization from the endpoint to the Corporate Store etc.)
  • the Validators and Convertors of datatypes (from endpoint to Directory) that the connectors offer
  • the range of endpoints that the connectors could connect to ranging from AD, LDAP, DBs, Web Services (SOAP and REST-based) to custom endpoints with custom schema & metadata

EzIAMTM is an ideal candidate in this regard as it has a rich set of on-premise and cloud connectivity options. It has all the ideal connector properties that an enterprise would need to connect to their favourite endpoints.