4 Ways the Medical Device Industry Could Benefit from Cassandra

Healthcare, one of the largest and most complex industries in the US, is heading toward tapping the potential of Big Data to make healthcare healthier. With advancements in wearable devices, e-health, m-health, and IoT, the volume of time-stamped data is increasing drastically. This Big Data, if used wisely, could enhance the quality of patient care and patient engagement, help predict diseases, reduce healthcare costs and time, and prevent security breaches and fraud. The potential of Big Data is significant, but it is no easy feat to realize it without a proper database in place.

This storyline of the healthcare industry directly or indirectly pushes medical device manufacturers to rethink their existing business models and create a value proposition for customers through devices with innovative analytics solutions. The transformation of the medical device industry is further fueled by an increasing emphasis on quality of care and treatment, digitalization of care delivery models, stern regulatory compliance focused on patient safety and cost containment, empowered and informed customers, and emerging technologies. It is tough to meet these goals with a legacy database management system. An advanced database technology like Cassandra could serve as the holy grail of Big Data because of its ability to:

  • Scale boundlessly

The volume of data generated in the healthcare industry is projected to reach zettabyte or even yottabyte scale. A traditional RDBMS is not designed to scale up to accommodate this exploding amount of data, whereas Cassandra offers the flexibility to scale to any size at low cost, simply by adding nodes to the cluster whenever required.
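Cassandra scales this way because data is distributed around a token ring: a new node takes ownership of a slice of the ring instead of forcing a full reshuffle. The toy sketch below (plain Python with MD5 standing in for Cassandra's Murmur3 partitioner, and made-up node names) illustrates why only a fraction of keys move when a node is added:

```python
import hashlib
from bisect import bisect_right

def token(value: str) -> int:
    """Hash a string to a position on the ring (Cassandra uses Murmur3; MD5 is just for illustration)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # Each node owns the segment of the ring ending at its token.
        self.ring = sorted((token(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        tokens = [t for t, _ in self.ring]
        i = bisect_right(tokens, token(key)) % len(self.ring)
        return self.ring[i][1]

keys = [f"patient-{i}" for i in range(1000)]
before = Ring(["node1", "node2", "node3"])
after = Ring(["node1", "node2", "node3", "node4"])

moved = sum(1 for k in keys if before.owner(k) != after.owner(k))
print(f"{moved} of {len(keys)} keys moved after adding a node")
```

In a real cluster each node also owns many virtual nodes (vnodes), which spreads the reclaimed slice far more evenly than this single-token sketch.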

  • Always available with a promise of zero downtime

The price of data outages and downtime in healthcare is measured not just in money and time, but in lives. A life-saving application or device cannot depend on a legacy database that has a single point of failure and suffers chronic performance issues. Cassandra’s ability to deploy across multiple data centers keeps the system running even when an entire data center is lost.

  • Seamlessly integrate across platforms and systems

In the healthcare ecosystem there are numerous sources of data, such as laboratory reports, clinical reporting systems, customer support, financial, administrative, and pharmacy data, and electronic health records (EHR), and this data can be structured or unstructured. Cassandra integrates easily with various data sources and makes it possible to search any of this data quickly within the same database platform.

  • Access Near Real-time Data

The rapidly expanding volume of accessible and consumable patient data no longer fits the legacy method of collecting and processing information. Instead, insights must be gained from data while it is in motion, giving both provider and patient real-time insights through a user-friendly interface. Cassandra can deliver near real-time performance through its column-oriented storage, fast inserts, distributed counters, and ability to take advantage of solid-state drives.
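One common Cassandra pattern for such time-stamped device data is to partition readings by device and day, so writes become fast sequential appends within a bounded partition. The sketch below is illustrative only; the table name, columns, and daily bucketing scheme are assumptions, not a schema from this article:

```python
from datetime import datetime

# Hypothetical DDL: one partition per device per day, rows clustered by
# timestamp so recent readings can be range-scanned quickly.
DDL = """
CREATE TABLE IF NOT EXISTS device_readings (
    device_id text,
    day       text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((device_id, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC);
"""

def partition_key(device_id: str, ts: datetime) -> tuple:
    """Bucket readings by calendar day to keep partitions bounded in size."""
    return (device_id, ts.strftime("%Y-%m-%d"))

print(partition_key("pump-42", datetime(2017, 6, 1, 12, 30)))
```

Bucketing by day keeps any single partition from growing without bound while still letting an application read one device's recent readings with a single partition query.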

Cassandra could be the backbone for any medical device manufacturer building competitive, intuitive analytics solutions for the healthcare industry by efficiently managing ever-expanding Big Data. Check out what’s in our box for you – Cassandra as a Service!

Author Credits: Archana Ramesh, Asst Manager- Marketing, 8KMiles. You can reach her here.

Is the Life Sciences industry leveraging all RWE data to drive better patient outcomes?

Any organization involved in any part of drug production or its value chain will undisputedly require access to, and the means to analyse, Real World Data (RWD). For starters, Real World Data is the basis for understanding both the benefits and risks of a drug after it has received regulatory approval. The FDA generally defines RWD as data acquired from disparate sources and systems apart from traditional clinical trials, and Real World Evidence (RWE) as the combination and analysis of RWD elements such as economic outcomes and a drug’s value proposition.

Real World Evidence (RWE) gives life science companies, medical providers, and insurance companies a larger and more accurate data set on which to base strategies and conclusions about the effectiveness and ROI of medications and treatments in the real world. For example, healthcare insurance providers, third-party payers, or health-plan sponsors cannot reach the right segment of patients without analysing which types of products are most effective or in demand among them. Likewise, healthcare providers without the right RWE insights can neither understand how a patient moves through treatment within their current healthcare system nor ensure optimal treatment guidelines and best practices (which may ultimately lead to penalties). Pharma manufacturers, for their part, need real-time access to RWE data to conduct observational research, speed up drug development, and understand which types of patients to target for clinical trials and marketing campaigns. In spite of these needs, the life sciences industry does not reap the maximum benefit from Real World Data: conservative market estimates suggest that major pharma players spend an average of $20 million annually on RWE and yet struggle to fully understand patients’ health and treatment.

This apparent gap between the life science industry and RWE data could be bridged using cloud-based analytics. Though the stakeholders above have different uses for RWE data, their requirements can all be satisfied with a cloud-based analytics solution.

Why does Cloud based analytics act as a bridge?
Analytics is the key to gaining insights from the available data, and since RWE data is bound to grow at an exponential rate, resorting to the cloud is the best option. Below are some important reasons why a Cloud solution does justice to your investment:

Security – Healthcare information is sensitive in many regions, so strict regulatory compliance regimes such as GxP and HIPAA are mandatory for organizations dealing with RWE data. Cloud solutions make sure these protocols are followed and data is maintained with a high level of security, keeping vulnerabilities and threats at bay.

Speed – Time is of the essence in this sector, so speed of access and accuracy of results are very important. Cloud-based analytics can swiftly process huge volumes of data, enabling accurate insights for efficient business decisions.

Scalability – The cloud can rapidly scale to any amount of data, regardless of the velocity at which it arrives from multiple sources, which is where many on-premise solutions fail. This quality is crucial to meeting the analytical demands of the end user.

Thus, RWE data, with the help of the right cloud-based analytics solution, can open ways to identify under-served markets, enhance personalized treatments and therapies, and reduce payer costs and the time taken for clinical trials. To process this humongous data coming from various sources, aggregate it in one place, and access it anytime, anywhere, across geographies, without hassles such as speed and security issues, it is best for organizations to embrace a cloud-based analytics solution.

Author Credits: Kripaa Krishnamurthy, Sr. Associate – Digital Marketing, 8KMiles. You can reach her here.

Kerberizing Cassandra

When it comes to access control, enterprises seek uncompromised mechanisms to protect data and services on the network. Cryptography based on both symmetric and asymmetric key algorithms is widely used for secure access and authorization. The well-known Kerberos protocol uses symmetric encryption and lets a client prove its identity to a server across an insecure network. Once both client and server have proven their identity, privacy and data integrity are further ensured. With Kerberos as a centralized authenticator, enterprises can implement single sign-on (SSO) without difficulty across all their applications.

In this blog, we will discuss how to enable Kerberos authentication for a distributed NoSQL database system. We will implement it for DataStax Cassandra, one of the leading big data database platforms, on a cluster of nodes. DataStax Cassandra already provides authentication based on internally controlled role names/passwords, authorization based on object permission management, authentication and authorization based on JMX usernames/passwords, and SSL encryption. We intend to explain the Kerberos integration, and the blog is divided into four sections:

  • Installing Kerberos server
  • Configuring Kerberos server
  • Connection to Cassandra
  • Connection to Cassandra cqlsh

Installing Kerberos Server

The Kerberos server is the key to the network’s security, and it is advisable to have an alternative server for recovery in case of failure. To set up the Kerberos server, install it as follows.

$ sudo apt-get install krb5-admin-server
$ sudo apt-get install krb5-kdc
$ sudo krb5_newrealm

While installing, set the Kerberos realm along with the KDC and kadmin server. We will use KERBEROS.COM as the Kerberos realm and kserver.com as the KDC and kadmin server.
Configuring Kerberos Server:
After installing Kerberos, we need to edit the default Kerberos configuration to set the realm name. Open krb5.conf in an editor:

$ sudo vi /etc/krb5.conf

Check the configuration for the Kerberos realm, and add the realm name in the [domain_realm] section:

.kerberos.com = KERBEROS.COM
kerberos.com = KERBEROS.COM

Now restart the Kerberos services so that the changes take effect.

$ sudo service krb5-admin-server restart
$ sudo service krb5-kdc restart

At this stage, the Kerberos server is ready. Next we need to set up the Kerberos clients so that the Cassandra nodes can be secured.
Cassandra Connection
In this section, we will connect each Cassandra node to the server by creating principals and editing a few files on the clients as well as on the server.
KDC server:
Here we will create a policy to attach to principals. To create a basic policy named admin:

$ sudo kadmin.local
kadmin: add_policy -minlength 8 -minclasses 3 admin
kadmin: quit

A policy named admin has been created and is ready to be attached. Now, add an admin principal (root/admin):

$ sudo kadmin.local
kadmin: addprinc -policy admin root/admin
kadmin: quit

To give the admin policy full permissions, open kadm5.acl in an editor:

$ sudo vi /etc/krb5kdc/kadm5.acl

Uncomment the */admin * line and save, so that principals matching */admin get full permissions. In /etc/hosts, add your internal IP with the realm name, for example:

x.x.x.x kerberos.com

Restart the krb5-admin-server

$ sudo service krb5-admin-server restart

We need to add a service principal and an HTTP principal for each client in the KDC using the addprinc command. To do so, log in and add the principals:

$ sudo kadmin -p root/admin
kadmin: addprinc -policy admin -randkey cassandra/fqdn
kadmin: addprinc -policy admin -randkey HTTP/fqdn
(*fqdn – the Fully Qualified Domain Name of each node.)
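To make the naming convention explicit: each node ends up with exactly two principals, both tied to its FQDN and the realm. A small illustrative helper (the realm and hostname here are placeholders from this walkthrough, not values Kerberos requires):

```python
REALM = "KERBEROS.COM"

def principals(fqdn: str, realm: str = REALM) -> list:
    """Return the service and HTTP principals expected for one Cassandra node."""
    return [f"cassandra/{fqdn}@{realm}", f"HTTP/{fqdn}@{realm}"]

for p in principals("node1.example.com"):
    print(p)
```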

To find a client’s fqdn:

$ hostname --fqdn

To ensure both principals were added successfully, check the principal names in the KDC with the listprincs command.

kadmin: listprincs

Kerberos client:
Install krb5-user on each client:

$ sudo apt-get install krb5-user

Set the realm in the same way as on the Kerberos server, i.e. KERBEROS.COM as the Kerberos realm and kserver.com as the KDC and kadmin server. Both krb5.conf files should be effectively identical.

KDC server:
A keytab is a file containing pairs of Kerberos principals and encrypted keys derived from the Kerberos password. Using a keytab, we can authenticate to various remote systems with Kerberos without entering a password. If the Kerberos password is changed, the keytab must be regenerated.
To produce a keytab for the principals:

$ sudo kadmin.local
kadmin: ktadd -k dse.keytab cassandra/fqdn
kadmin: ktadd -k dse.keytab HTTP/fqdn
(dse.keytab – the name of the keytab.)

This command creates the keytab in the current folder; we can also target another folder when generating it. Now copy the keytab to each node using the scp command.
Kerberos client:
Open /etc/hosts and add the realm name and IP, as on the server.

x.x.x.x kerberos.com

Move the keytab to the appropriate location and change the ownership and permissions of the keytab:

$ sudo chown cassandra:cassandra dse.keytab
$ sudo chmod 600 dse.keytab

Open cassandra.yaml (/etc/dse/cassandra/cassandra.yaml) in an editor and change the authenticator as below.

authenticator: com.datastax.bdp.cassandra.auth.KerberosAuthenticator

Open dse.yaml (/etc/dse/dse.yaml) in an editor and modify the kerberos_options section as below (using the same fqdn as on the server):

kerberos_options:
    keytab: /etc/dse/dse.keytab
    service_principal: cassandra/fqdn@KERBEROS.COM
    http_principal: HTTP/fqdn@KERBEROS.COM
    qop: auth

With this, Kerberos authentication has been set up for Cassandra.
Connection to Cassandra cqlsh:
cqlsh is the Python-based client for executing Cassandra Query Language statements. In this section, we will authenticate cqlsh with Kerberos by configuring cassandra.yaml.
In Server:
We need a user principal to authenticate to the server from the client. To create the user principal jane, do the following.

$ sudo kadmin.local
kadmin: addprinc jane

The resulting principal looks like jane@KERBEROS.COM.
In Client:
Now temporarily disable the Kerberos authenticator and DSE authorizer in cassandra.yaml and add the following.

authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

Now restart the dse service and start cqlsh.

$ sudo service dse restart

Log in to cqlsh to add jane as a superuser.

$ cqlsh hostname -u cassandra -p cassandra (default username and password)

Create a superuser, jane, to authenticate against the Kerberos server:

cqlsh> create user 'jane@KERBEROS.COM' SUPERUSER;

Now re-enable the Kerberos authenticator and change the authorizer in cassandra.yaml:

authenticator: com.datastax.bdp.cassandra.auth.KerberosAuthenticator
authorizer: AllowAllAuthorizer

To run cqlsh with Kerberos authentication, install the Python dependencies on the clients.

$ sudo apt-get install python-pip
$ sudo pip install pure-sasl
$ sudo apt-get install python-kerberos

Create a cqlshrc file in the .cassandra directory ($ vi /home/user/.cassandra/cqlshrc) and add this configuration:

[connection]
hostname = host-ip
port = 9042

[kerberos]
hostname = host-ip
service = cassandra
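Since cqlshrc is an INI-style file, it can also be generated and sanity-checked programmatically. A sketch using Python's standard configparser (the host IP is a placeholder, and the section layout assumes the Kerberos-enabled cqlshrc shape described in DataStax's documentation):

```python
import configparser
import io

# Build the cqlshrc contents in memory; 10.0.0.5 is a placeholder host.
cfg = configparser.ConfigParser()
cfg["connection"] = {"hostname": "10.0.0.5", "port": "9042"}
cfg["kerberos"] = {"hostname": "10.0.0.5", "service": "cassandra"}

buf = io.StringIO()
cfg.write(buf)
print(buf.getvalue())  # this text would be written to ~/.cassandra/cqlshrc
```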

Now introduce yourself to the Kerberos server:

$ kinit jane

Enter the password to get authenticated. The kinit command obtains or renews a Kerberos ticket-granting ticket. Use the command below to verify the ticket.

$ klist

Start cqlsh.

$ cqlsh

With this, cqlsh Kerberos authentication is complete; every CQL session is now authenticated and secured via Kerberos.


Author Credits: Siddharth Kumar S, Senior Associate – Big Data, 8KMiles Software Services. You can reach him here.

Solutions in Azure : Azure CDN for Maximum Bandwidth & Reduced Latency – Part II

From the first part of this article we can conclude that the CDN’s job is to enhance regular hosting by reducing bandwidth consumption, minimizing latency, and providing the scalability needed to handle abnormal traffic loads. It cuts down on round-trip time (RTT), effectively giving end users a similar response irrespective of their geographical location.

A recent MicroMarketMonitor report clearly states that the North American content delivery network market alone is expected to grow from $1.95 billion in 2013 to $7.83 billion in 2019. One significant factor driving this growth is end-user interaction with online content, so going forward CDNs are going to be a major factor when architecting any application.

Azure CDN Highlights

  1. Improve rendering speed & handle high traffic loads: Azure CDN servers manage content using a large network of POPs. This dramatically increases speed and availability, resulting in significant user-experience improvements.
  2. Designed for today’s web: Azure CDN is specifically designed for the dynamic, media-centric web of today and caters to the requirements of users who expect everything to be fast, high quality, and always on.
  3. Streaming aware: Azure CDN can help in all three common ways of serving video over HTTP: progressive download and play, HTTP pseudo-streaming, and live streaming.
  4. Dynamic content acceleration: Under the hood, Azure CDN also uses a series of techniques to serve uncacheable content faster. For example, it can route all communication from a client in India to a server in the US through an edge in India and an edge in the US, maintain a constant connection between those two edges, and apply WAN optimization techniques to accelerate it.
  5. Block spammers, scrapers, and other bad bots: Azure Content Delivery Network is built on a highly scalable reverse-proxy architecture with sophisticated DDoS identification and mitigation technologies to protect your website from DDoS attacks.
  6. When expectations are at their peak, Azure CDN delivers: Thanks to its distributed global scale, Azure Content Delivery Network handles sudden traffic spikes and heavy loads, like the start of a major product launch or a global sporting event.

Working With Azure Storage

Once the CDN is enabled on an Azure storage account, any blobs in public containers that are available for anonymous access will be cached via the CDN. Only blobs that are publicly available can be cached with the Azure CDN. To make a blob publicly available for anonymous access, you must mark its container as public. Once you do so, all blobs within that container are available for anonymous read access; you can choose to make container data public as well, or restrict access to the blobs only.

For best performance, use CDN edge caching for delivering blobs less than 10 GB in size. When you enable CDN access for a storage account, the Management Portal provides you with a CDN domain name in the following format: http://<identifier>.vo.msecnd.net/. This domain name can be used to access blobs in a public container. For example, given a public container named music in a storage account named myaccount, users can access the blobs in that container using either of the following two URLs:

  • Azure Blob service URL: http://myaccount.blob.core.windows.net/music/<blob name>
  • Azure CDN URL: http://<identifier>.vo.msecnd.net/music/<blob name>

Working With Azure Websites

You can enable CDN from your websites to cache your web content, such as images, scripts, and stylesheets (see Integrate an Azure Website with Azure CDN). When you enable CDN access for a website, the Management Portal provides you with a CDN domain name in the following format: http://<identifier>.vo.msecnd.net/. This domain name can be used to retrieve objects from a website. For example, given a folder named cdn and an image file called music.png, users can access the object using either of the following two URLs:

  • Azure Website URL: http://mySiteName.azurewebsites.net/cdn/music.png
  • Azure CDN URL: http://<identifier>.vo.msecnd.net/cdn/music.png
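The origin-to-CDN URL mapping above follows a simple, mechanical pattern. A small illustrative helper (the account name, container, blob, and endpoint identifier below are invented placeholders; Azure assigns the real endpoint identifier when you create the CDN):

```python
def storage_urls(account: str, container: str, blob: str, cdn_id: str):
    """Return the (blob-service URL, CDN URL) pair for a public blob.

    cdn_id is the identifier Azure assigns to the CDN endpoint."""
    origin = f"http://{account}.blob.core.windows.net/{container}/{blob}"
    cdn = f"http://{cdn_id}.vo.msecnd.net/{container}/{blob}"
    return origin, cdn

origin, cdn = storage_urls("myaccount", "music", "song.mp3", "az1234")
print(origin)
print(cdn)
```

Both URLs serve the same bytes; the CDN variant is simply answered from the nearest edge once the blob has been cached.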

Working With Azure Cloud Services

You can cache objects to the CDN that are provided by an Azure cloud service. Caching for cloud services has the following constraints:

  • The CDN should be used to cache static content only.
  • Your cloud service must be deployed in a production deployment.
  • Your cloud service must provide the object on port 80 using HTTP.
  • The cloud service must place the content to be cached in, or delivered from, the /cdn folder on the cloud service.

When you enable CDN access for a cloud service, the Management Portal provides you with a CDN domain name in the following format: http://<identifier>.vo.msecnd.net/. This domain name can be used to retrieve objects from a cloud service. For example, given a cloud service named myHostedService and an ASP.NET web page called music.aspx that delivers content, users can access the object using either of the following two URLs:

  • Azure cloud service URL: http://myHostedService.cloudapp.net/cdn/music.aspx
  • Azure CDN URL: http://<identifier>.vo.msecnd.net/music.aspx

Accessing Cached Content over HTTPS

Azure allows you to retrieve content from the CDN using HTTPS calls. This allows you to incorporate content cached in the CDN into secure web pages without receiving warnings about mixed security content types.

To serve your CDN assets over HTTPS, there are a couple of constraints worth mentioning:

  • You must use the certificate provided by the CDN. Third party certificates are not supported.
  • You must use the CDN domain to access content. HTTPS support is not available for custom domain names (CNAMEs) since the CDN does not support custom certificates at this time.

Even when HTTPS is enabled, content from the CDN can be retrieved using both HTTP and HTTPS.

Note: If you’ve created a CDN for an Azure Cloud Service (e.g. http://[XYZ].cloudapp.net/cdn/), it’s important that you create a self-signed certificate for your Azure domain ([XYZ].cloudapp.net). If you’re using Azure Virtual Machines, this can be done through IIS.

Mapping a Custom Domain to a Content Delivery Network (CDN) Endpoint

If you want to access the cached content with a custom domain, Azure lets you map your domain to a particular CDN endpoint. With that in place, you can use your own domain name in URLs to retrieve the cached content.

For detailed information on implementation, please check Map CDN to Custom Domain.

CDNs are an essential part of the current generation’s Internet, and their importance will only increase. Even now, companies are trying hard to figure out ways to move more functionality to edge servers (POP locations) in order to provide users with the fastest possible experience. Azure CDN plays a vital role here, as it satisfies current-generation CDN requirements. While implementing Azure CDN (or any CDN for that matter), the important thing is to formulate a strategy regarding the maximum lifespan of an object beforehand.

Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8K Miles Software Services, and originally published here.

There is a Microsoft Azure event happening on 16th September 2017, and it is a great opportunity for all Azure enthusiasts to meet, greet, and share ideas on the latest innovations in this field. Click here for more details.



Diagnosis of Information Security issues & Best Practices to implement Role Based Access Control in Healthcare Premises

Role-based access control (RBAC) is a method of regulating access to computer or network resources based on the roles of individual users within an enterprise. Users receive access privileges to systems based on the roles they perform in those systems, and RBAC policies in general ensure that the users they cover have the right access to the right resource at the right point of time. In recent times the healthcare industry has given significant importance to RBAC. For example, if an RBAC system is used in a hospital, every person who is allowed access to the hospital’s network has a predefined role (doctor, nurse, lab technician, administrator, etc.). If a user is assigned the role of nurse, that user can access only the resources that the nurse role has been allowed to access. Each user is assigned one or more roles, and each role is assigned one or more privileges. In a hospital EHR implementation, unclear separation of roles and chaotic access privileges across systems would cause mayhem, resulting in an implementation failure.
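The core RBAC relation (users are assigned roles, and roles are granted permissions) can be sketched in a few lines. The roles and permissions below are invented for illustration, not a recommended hospital policy:

```python
# role -> set of permissions granted to that role
ROLE_PERMS = {
    "doctor": {"read_ehr", "write_ehr", "order_lab"},
    "nurse": {"read_ehr", "record_vitals"},
    "lab_technician": {"read_lab_orders", "write_lab_results"},
}

# user -> set of roles (a user may hold more than one role)
USER_ROLES = {"alice": {"doctor"}, "bob": {"nurse"}}

def can(user: str, permission: str) -> bool:
    """A user has a permission iff at least one of their roles grants it."""
    return any(permission in ROLE_PERMS.get(role, set())
               for role in USER_ROLES.get(user, set()))

print(can("bob", "read_ehr"))   # a nurse may read EHRs
print(can("bob", "write_ehr"))  # but may not write them
```

Because access decisions go through roles rather than individual grants, reassigning a user (say, a rotating staff member) means changing one entry in the user-to-role mapping instead of auditing dozens of per-resource permissions.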

Security & RBAC Readiness Issues – Spotting of Symptoms

The first step before initiating an EHR implementation is to thoroughly assess and discover all RBAC-related issues. 8KMiles looks for the following indications, as part of a discovery process, to assess whether RBAC issues exist in a hospital or healthcare organization and, if they do, where they might exist.

1. The hospital or healthcare system has problems defining roles for a particular user.
2. The hospital or healthcare system has problems granting access to a single user among a group of users with the same job title or department.
3. A department/system/application has to rotate staff constantly (sometimes even daily), so keeping track of roles and access is very difficult.
4. There are many mini-roles that could be combined into a major role, and many such major roles exist in the system.
5. Many roles and access privileges, though defined in the system, have not been used for a while.
6. There are no systems to address Segregation of Duties (SoD) conflicts among roles and privileges.
7. There is no Access Governance solution in place to assess the roles and access privileges assigned to users.
8. Audit reports are not in place to follow the compliance process, due to the lack of an Access Governance solution.
9. Users have problems with multi-level approvals.
10. Roles that do not fit into a department’s daily scheme of activities are prevalent in the system.
11. Patient privacy data issues are a concern (from both a data-entry and a data-breach perspective).
12. Data security or RBAC security is a concern, especially during bulk uploads of patient data or during data interchange between in-house or external systems.
13. The system has both groups and roles defined, but groups are not mapped to roles the way they should be.

Security & RBAC Readiness Issue – Mitigation Processes

After a detailed analysis of the issues, an 8KMiles RBAC Process Manager, who takes ownership of the EHR implementation, will define and implement the following processes and procedures pertaining to RBAC and security at the hospital or healthcare facility:

1. Study the results of the discovery process and understand the existing security and RBAC policies in place for each system/application in each department at every location of the hospital/healthcare system.
2. Prepare an RBAC matrix (the access requirements of each department’s titles).
3. Prepare the current workflows where these roles and their access privileges come into play.
4. Note down any obvious inconsistencies and inefficiencies pertaining to these roles. For example, some roles and their access privileges may be used heavily while others are not used at all.
5. Note down any violations of Segregation of Duties (SoD) in the access privileges of the current roles. For example, a clerk in the records department should not have access to copier or print functions that would let them print, copy, and distribute EHRs.
6. Note down the critical and sensitive roles and access privileges that one needs to be careful about. These could be roles and privileges where an employee has direct read/write access to patients’ privacy data.
7. Plan to address any gaps relating to these critical and sensitive roles and access privileges before or during the EHR implementation.
8. Prepare plans to address the inconsistencies and SoD violations before the transition to the new EHR implementation. If the EHR preparation time is very short, at least have plans to address these during the implementation. Note: SoD violations are caused by pairs of roles with access privileges that, if possessed by one individual, could directly compromise the integrity of both systems in which those roles function.
9. Prepare the future workflows of roles after the EHR implementation. Note down the inconsistencies that existed before and how they were solved during, or because of, the EHR implementation.
10. Assess whether the critical and sensitive roles and access privileges were addressed effectively during the EHR implementation.
11. Assess whether micro-roles or mini-roles have been effectively rolled up under major roles (i.e., roles within roles).
12. Perform an RBAC data-owner certification process every month after the EHR implementation, wherein each data owner of each application/system attests to the need for these roles in the system. As a precursor to this process, identify all systems/applications and their data owners.
13. Perform an RBAC access certification process every month after the EHR implementation, wherein managers and supervisors attest that the employees who work under them need the roles they hold on these systems. As a precursor, identify all the managers and supervisors (if not already done) of each employee.
14. Address the orphaned and redundant roles and access privileges that come out of the above certification processes.
15. Map the relationship between the groups and roles present and verify that the group-to-role mappings are not out of sync.
16. Assess any HIPAA/SOX-related RBAC compliance issues that occur prior to, during, or after the EHR implementation and address them.
17. Apprise the hospital system’s leadership and stakeholders of the major findings and the changes made to stay compliant.
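The SoD checks in steps 5 and 8 above boil down to screening each user's role set against a list of conflicting role pairs. A minimal sketch with made-up role names (a real deployment would source the conflict list from policy, not hard-code it):

```python
from itertools import combinations

# Pairs of roles that one individual should never hold together (illustrative).
SOD_CONFLICTS = {
    frozenset({"records_clerk", "print_operator"}),
    frozenset({"payment_approver", "payment_creator"}),
}

def sod_violations(user_roles: set) -> list:
    """Return the conflicting role pairs present in a single user's role set."""
    pairs = (frozenset(p) for p in combinations(sorted(user_roles), 2))
    return [tuple(sorted(pair)) for pair in pairs if pair in SOD_CONFLICTS]

print(sod_violations({"records_clerk", "print_operator", "nurse"}))
```

Running this over every user's role set during the monthly certification process (steps 12 and 13) surfaces SoD violations automatically instead of relying on manual review.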

Security & RBAC Readiness – Best Practices

Following are some of the best practices that 8KMiles RBAC managers and personnel follow to address RBAC-related issues:

1. Establish formal Business Relationship with the prospect
2. Understand the Business needs and requirements

a. Compliance with HIPAA, SOX, SAS 70, and HL7 requirements
b. Workflow Management (Establish Flow of Identities, Establish Roles – Access Control in relation to each other)
c. Interoperability (Explore Identity Federation with External Parties, Use REST APIs)
d. Security (Plan for Identity/Data Security at Rest and in Motion)
e. Medical Records Synch (HL7, HTTPS/Encryption, Multi-Factor Auth)
f. Integration Issues – For Example, integration of the hospital EMR with subsystems like Identity and Access Management and Access Governance systems

3. Understand the role hierarchy and the relationships between groups, roles, and access permissions
4. Study the requirements systematically and come up with solutions, based on an agile methodology, for the above pain points

By following all of the above processes and RBAC best practices, organizations can secure healthcare data and also identify redundant roles, inefficient access privileges, employees who were not granted the right roles (i.e., mismatched roles), disparities between groups and roles, and access permissions still held by employees who are no longer part of the system.

Author Credits: Raj Srinivas, VP Technology at 8KMiles, You can connect with him here for more information

Solutions in Azure : Azure CDN for Maximum Bandwidth & Reduced Latency – Part I

Under the current ecosystem of the Microsoft cloud, Azure CDN is widely recognized as CDaaS (Content Delivery as a Service); with its growing network of POP locations, it can be used to offload content to a globally distributed network of servers. Its prime function is to cache static content at strategically placed locations, distributing content with the low latency and high data transfer rates that ensure faster throughput to your end users.

Azure CDN offers developers a global solution for delivering high-bandwidth content by caching it at physical nodes across the world. Requests for this content then travel a shorter distance, reducing the number of hops in between. With a CDN in place, you can be sure that static files (images, JS, CSS, videos, etc.) and website assets are served from the servers closest to your website visitors. For content-heavy websites such as e-commerce, this latency saving can be a significant performance factor.

In essence, Azure CDN puts your content in many places at once, providing superior coverage to your users. For example, when someone in London accesses your US-hosted website, the request is served through an Azure UK PoP. This is much quicker than having the visitor’s requests, and your responses, travel the full width of the Atlantic and back.
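The routing decision in that example amounts to steering the client toward the POP with the lowest expected latency. A toy model of that choice (the POP names and latency figures below are invented for illustration; real CDN routing uses DNS and live network measurements):

```python
# Hypothetical round-trip latencies (ms) from a London client to candidate POPs.
LATENCY_MS = {"uk-pop": 12, "us-east-pop": 85, "us-west-pop": 140}

def best_pop(latencies: dict) -> str:
    """DNS-based routing approximates sending the client to the fastest POP."""
    return min(latencies, key=latencies.get)

print(best_pop(LATENCY_MS))
```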

Two providers (Verizon and Akamai) supply the edge locations for Azure CDN, and each builds its CDN infrastructure in a distinct way. Verizon has been quite happy to disclose its locations; the POP locations of Azure CDN from Akamai, on the contrary, are not individually disclosed. To get the updated list of locations, keep checking Azure CDN POP Locations.

How Does Azure CDN Work?

Today, over half of all internet traffic is already served by CDNs. Those numbers are trending rapidly upward with every passing year, and Azure has been a significant contributor to that growth.

As with most Azure services, Azure CDN is not magic and actually works in a pretty simple and straightforward manner. Let's walk through a typical case:
1) A user (XYZ) requests a file (also called an asset) using a URL with a special domain name, such as <endpoint name>.azureedge.net. DNS routes the request to the best-performing Point-of-Presence (POP) location, usually the POP geographically closest to the user.
2) If the edge servers in the POP do not have the file in their cache, the edge server requests the file from the origin. The origin can be an Azure Web App, Azure Cloud Service, Azure Storage account, or any publicly accessible web server.
3) The origin returns the file to the edge server, including optional HTTP headers describing the file's Time-to-Live (TTL).
4) The edge server caches the file and returns it to the original requestor (XYZ). The file remains cached on the edge server until the TTL expires; if the origin didn't specify a TTL, the default TTL is 7 days.
5) Additional users (e.g., ABC) may then request the same file using the same URL, and may also be directed to the same POP.
6) If the TTL for the file hasn't expired, the edge server returns the file directly from the cache, resulting in a faster, more responsive user experience.
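The cache-freshness decision in steps 4 and 6 can be sketched as a small shell function. This is a simplified model, not Azure's actual implementation; the 604800-second default corresponds to the 7-day default TTL in step 4.

```shell
# Default TTL applied when the origin sends no caching headers (7 days).
DEFAULT_TTL=$((7 * 24 * 3600))

# is_fresh AGE_SECONDS [ORIGIN_TTL]
# Prints HIT when the cached copy is still within its TTL, MISS otherwise.
is_fresh() {
  age=$1
  ttl=${2:-$DEFAULT_TTL}   # fall back to the default when the origin sent no TTL
  if [ "$age" -lt "$ttl" ]; then
    echo "HIT: serve from edge cache"
  else
    echo "MISS: re-fetch from origin"
  fi
}

is_fresh 3600 7200     # cached an hour ago, 2-hour TTL
is_fresh 86400 3600    # cached a day ago, 1-hour TTL
is_fresh 100000        # no origin TTL: the 7-day default applies
```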

Reasons for using a CDN

1) To understand why Azure CDN is so widely used, we first have to recognize the issue it is designed to solve: latency. Latency is the annoying delay between the moment you request a web page and the moment its content actually appears onscreen, especially in applications where many "internet trips" are required to load content. Quite a few factors contribute to this delay, many specific to a given web page; in all cases, however, the delay duration is affected by the physical distance between you and the website's hosting server. Azure CDN's mission is to virtually shorten that physical distance, with the goal of improving site rendering speed and performance.
2) Another obvious reason for using the Azure CDN is throughput. If you look at a typical web page, about 20% of it is HTML, dynamically rendered based on the user's request; the other 80% consists of static files like images, CSS, JavaScript, and so forth. Your server has to read those static files from disk and write them to the response stream, both of which take away some of the resources available on your virtual machine. By moving static content to the Azure CDN, your virtual machine has more capacity available for generating dynamic content.

When a request for an object is first made to the CDN, the object is retrieved directly from the Blob service or from the cloud service. When a request is made using the CDN syntax, the request is redirected to the CDN endpoint closest to the location from which the request was made to provide access to the object. If the object is not found at that endpoint, then it is retrieved from the service and cached at the endpoint, where a time-to-live (TTL) setting is maintained for the cached object.

Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8KMiles Software Services and originally published here

Cortana Intelligence for Patient Length of Stay Prediction

Predictive Length of Stay
Length of Stay (LOS) is defined as the total number of days a patient stays in the hospital, from the initial admission date to the discharge date. LOS varies from patient to patient, as it depends on the patient's disease conditions and the facilities provided in the hospital.

Importance of Predictive Length of Stay
Predictive Length of Stay (PLOS) is a sophisticated model that can significantly improve the quality of treatment while decreasing the workload pressure on doctors. It enables accurate planning with existing facilities, helps staff understand patient disease conditions, and focuses on discharging patients quickly while avoiding re-admissions to the hospital.

Machine Learning Techniques for Predictive Length of Stay
Here we talk about two popular machine learning techniques that can be used for LOS prediction.

Random Forest
Random Forest is a tree-based machine learning algorithm that builds several decision trees and combines their outputs to improve model accuracy. Combining the outputs of multiple decision trees is known as ensembling, and it helps weak learners become a strong learner.

For example, when we are uncertain about a particular decision, we approach a few people for suggestions and combine all the suggestions to make the final decision. Similarly, the Random Forest mechanism builds a strong learner out of weak learners (the individual decision trees).

Random Forest can be used to solve both regression and classification problems. In regression problems, the dependent variable is continuous, whereas in classification problems the dependent variable is categorical.

An advantage of this model is that it runs efficiently on large data sets, including those with thousands of features.

Gradient Boosting
Gradient Boosting is another machine learning algorithm for building prediction models for regression and classification problems. Like other boosting methods, it builds the model in an iterative fashion, and the main objective is to minimize the loss of the model by adding weak learners using a gradient descent procedure.

Gradient descent is used to find the weights that minimize the error, or loss, of the model. In gradient boosting, weak learners (decision trees) are used to make the predictions.

An advantage of GBT is that the trees are built one at a time, with each new tree helping to correct the errors made by the previously trained trees. With each tree added, the model becomes more effective.

Microsoft Cortana Intelligence Solution for Predictive Length of Stay
As part of the Cortana Intelligence Solution, Microsoft offers a built-in, end-to-end LOS platform comprising data storage, data pipeline/processing, ML algorithms, and visualization.

Microsoft's integrated support for SQL Server services and R programming is a big advantage for any data science problem.

Hospital patient data is stored in SQL Server, and the PLOS machine learning models are executed through an R IDE. The models take their input from SQL Server, and the predicted results can be stored back in the SQL Server database. The statistics and the predicted LOS for each patient can then be visualized through PowerBI.


PLOS Model Working Procedure

To predict the length of stay of a newly admitted patient, we use two machine learning algorithms: regression Random Forest and Gradient Boosting Trees. Both models follow the procedure below.
1. Data Pre-processing and cleaning
2. Feature Engineering
3. Data Set Splitting, Training, Testing, Evaluation
4. Deploy and Visualize results

Data Pre-processing
Hospital patient data is loaded into SQL Server tables. Any missing values are replaced with -1, the mean, or the mode.

Feature Engineering
In feature engineering, the standardized feature values of the data set are used to train the predictive models.

Splitting, Training, Testing and Evaluating
The data set is split into a training set and a testing set in a specified ratio (e.g., training set: 60%, test set: 40%). These two data sets are stored in separate SQL Server tables, and the two models, regression Random Forest and Gradient Boosting Trees, are built on the training set.

Finally, we predict length of stay on the test data set and evaluate the performance metrics of the regression Random Forest and Gradient Boosting models.

Deploy and Visualize Results
Deploy PowerBI on the client machine and load the predicted results into a PowerBI dashboard, where each patient's predicted length of stay can be visualized.

Advantage of Predictive Length of Stay
This solution enables predictive length of stay for hospitals, and the predicted information is especially useful to two roles.

For hospitals that require a length-of-stay prediction solution, this is a good choice because of its robust integration of SQL Server and R code.

Chief Medical Information Officer (CMIO)
This solution helps the CMIO determine whether resources are being allocated appropriately across a hospital network, and which disease conditions are most prevalent among patients who will be staying in care facilities long term.


Care Line Manager
A Care Line Manager is directly responsible for all patients in the hospital. The main job is to monitor each patient's health status and required resources, and to plan patient discharges and the allocation of resources. Length-of-stay prediction helps the care line manager manage patient care better.


The Microsoft Cortana based solution is impressive in terms of providing the necessary components to predict patients' duration of stay in hospital. It provides flexible features for integration with hospital healthcare applications and data. The framework for pre-processing and modeling can be modified to suit specific needs, and the R programming capability will attract enthusiasm from data scientists. The basic PowerBI dashboards are user friendly and can be customized for the specific needs of hospitals. The whole solution helps hospitals plan resources effectively, such as the allocation of doctors, beds, and medicines, and avoid unnecessarily extended patient stays in hospital beds.

Author Credits: Kattula T, Senior Associate, Data Science, Analytics SBU at 8K Miles Software Services Chennai.

Image Source: Microsoft PLOS

Azure Resource Lock: Safeguard Your Critical Resources

Prevention is better than cure. There have been quite a few instances when I thought I should have applied this logic, and it has even more significance when you are working in the public cloud, more so while dealing with mission-critical resources there. There are numerous occasions when you want to protect your resources from unwarranted human actions or, to put it bluntly, when you are seeking a way to prevent other users in the organization from accidentally deleting or modifying critical resources.

Azure gives us a couple of ways to apply that level of control. The first is role-based access control (RBAC): with the Reader and various Contributor roles, RBAC is a great way to help protect resources in Azure by limiting the actions that a user can take against a resource. However, even with one of the Contributor roles, it is still possible to delete specific resources, which makes it very easy to delete an item accidentally.

Azure Resource Lock gives you the options to effectively control any such mishap. Unlike RBAC, you use management locks to apply a restriction across all users and roles. (To learn about setting permissions for users and roles, see Azure Role-Based Access Control.) Using a resource lock you can lock a particular subscription, a particular resource group, or even a specific resource. With a lock in place, authorized users can still read or modify the resource, but they CANNOT breach the lock and delete it.

To make this happen, you apply the resource lock level to one of the aforementioned scopes. You can set the lock level to CanNotDelete or ReadOnly (as of now, these are the only two options supported). CanNotDelete means authorized users can still read and modify a resource, but they can't delete it. ReadOnly means authorized users can only read from a resource; they can't modify or delete it.

When you apply a lock at a parent scope, all child resources inherit the same lock.

One point worth mentioning here: you also need to be in either the Owner or the User Access Administrator role for the desired scope, because working with resource locks requires access to the Microsoft.Authorization/* or Microsoft.Authorization/locks/* actions (only these two roles have the appropriate permissions).

Create Resource Lock Using ARM Template

With an Azure Resource Manager template, we can lock resources at creation time. An ARM template is a JSON-formatted file that provides a declarative way to define the deployment of Azure resources. Here is an example of how to create a lock on a particular storage account:



If you look at the example closely, the name of the storage account comes in via a parameter, while the most important section to notice is how the lock (utLock) is created by concatenating the resource name with /Microsoft.Authorization/ and the name of the lock.
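Since the template itself is not reproduced above, here is a minimal sketch of what it could look like, based on the description of the concatenated lock name. The parameter name and lock notes are illustrative, and the storage account itself is assumed to be deployed elsewhere in the template:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "storageAccountName": { "type": "string" }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts/providers/locks",
      "apiVersion": "2016-09-01",
      "name": "[concat(parameters('storageAccountName'), '/Microsoft.Authorization/utLock')]",
      "properties": {
        "level": "CanNotDelete",
        "notes": "Prevent accidental deletion of the storage account."
      }
    }
  ]
}
```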

Create Resource Lock using PowerShell

Placing a resource lock on an entire resource group can be helpful when you want to ensure that no resources in that group are deleted. In the example below, I create a resource lock on a particular resource group, "UT-RG".
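The original snippet is not shown; as a sketch, the classic Azure PowerShell module that provides the Remove-AzureResourceLock cmdlet mentioned below also provides a matching New-AzureResourceLock, so the command could look like this (the lock name and notes are illustrative):

```powershell
New-AzureResourceLock -LockLevel CanNotDelete `
    -LockName "UTLock" `
    -LockNotes "Protect resources in UT-RG from accidental deletion" `
    -ResourceGroupName "UT-RG"
```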


To remove the resource lock, use the Remove-AzureResourceLock cmdlet, making sure you provide the proper ResourceId.


Of late, Azure has brought this support to the ARM portal as well. To achieve the same thing via the portal, open the Settings blade for the resource, resource group, or subscription that you wish to lock and select Locks. When prompted, give the lock a name and a lock level, and you are immune to the unwanted situations discussed above. You can even lock an entire subscription as ReadOnly if malicious activity is detected.


Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8KMiles Software Services and originally published here.


Securing Cassandra

Data security is a major concern and is given top priority in every organization. Securing sensitive data and keeping it out of the hands of those who should not have access is challenging even in traditional database environments, let alone for a cloud-hosted database. Data should be secured both in flight and at rest. In this blog, we will talk about securing data in a Cassandra database in a cloud environment, specifically on AWS. We will divide the discussion into two parts.

  1. Secure Cassandra on AWS
  2. Cassandra data access security

Secure Cassandra on AWS

Cassandra is at its best when hosted across multiple datacenters. Hosting it in the cloud across multiple datacenters reduces cost considerably and brings the peace of mind of knowing that you can survive regional outages. However, securing the cloud infrastructure is the most fundamental activity that needs to be carried out when hosting in the cloud.

Securing Ports

Securing ports and preventing unknown host access is the foremost task when hosting in the cloud. Cassandra needs the following ports to be opened in your firewall for a multi-node cluster; otherwise each node will act as a standalone cluster.

Public ports

Port Number Description
22 SSH port


Create a Security Group with a default rule allowing SSH traffic on port 22 (both inbound and outbound):

  1. Click 'ADD RULE' (both inbound and outbound).
  2. Choose 'SSH' from the 'Type' dropdown.
  3. Enter only the allowed IPs in the 'Source' (inbound) / 'Destination' (outbound) field.

Private – Cassandra inter node ports

Ports used by the Cassandra cluster for inter-node communication must be restricted so that they communicate only within the cluster, blocking traffic from and to external resources.

Port Number Description
7000 Inter node communication without SSL encryption enabled
7001 Inter node communication with SSL encryption enabled
7199 Cassandra JMX monitoring port
5599 Private DSEFS inter-node communication port


To configure inter-node communication ports in a Security Group:

  1. Click ‘ADD RULE’.
  2. Choose ‘Custom TCP Rule’ from the ‘Type’ dropdown.
  3. Enter the port number in the ‘Port Range’ column.
  4. Choose ‘Custom’ from the ‘Source’ (inbound) / ‘Destination’ (outbound) dropdown and enter the same Security Group ID as the value. This allows communication only within the cluster over the configured port, when this Security Group would be attached to all the nodes in the Cassandra cluster.

Public – Cassandra client ports

The following ports need to be secured and opened only for the clients that will be connecting to our cluster.

Port Number Description
9042 Client (CQL native) port without SSL encryption enabled
9142 Client (CQL native) port with SSL encryption enabled; should be open when both encrypted and unencrypted connections are required
9160 DSE client port (Thrift)


To configure public ports in a Security Group:

  1. Click ‘ADD RULE’.
  2. Choose ‘Custom TCP Rule’ from the ‘Type’ dropdown.
  3. Enter the port number in the ‘Port Range’ column.
  4. Choose ‘Anywhere’ from the ‘Source’ (inbound) / ‘Destination’ (outbound).

To restrict the public ports to a certain known IP or IP range, replace step 4 with:

4. Choose 'Custom' from the 'Source' (inbound) / 'Destination' (outbound) dropdown and provide the IP value or the CIDR block corresponding to the IP range.

Now that we have configured the firewall, our VMs are secured against unknown access. It is recommended to create Cassandra clusters in a private subnet within your VPC that does not have Internet access.

Create a NAT instance in a public subnet or configure NAT Gateway that can route the traffic from the Cassandra cluster in the private subnet for software updates.

Cassandra Data Access Security

Securing data involves the following security aspects:

  1. Node to node communication
  2. Client to node communication
  3. Encryption at rest
  4. Authentication and authorization

Node to Node and Client to Node Communication Encryption

Cassandra is a master-less database, a design that offers no single point of failure for any database process or function. Every node in Cassandra is the same, and reads and writes are served by every node for any query on the database, so there is a lot of data transfer between the nodes in the cluster. When the database is hosted on a public cloud network, this communication needs to be secured. Likewise, the data transferred between the database and clients over the public network is always at risk. To secure data in flight in these scenarios, encrypting the data and sending it over SSL is the widely preferred approach.

Most developers are not exposed to encryption in their day-to-day work, and setting up an encryption layer is always a tedious process. Cassandra helps here with a built-in feature: all we need to do is enable the server_encryption_options and client_encryption_options sections in the cassandra.yaml file and provide the required certificates and keys. Cassandra then takes care of encrypting data during node-to-node and client-to-server communication.

Additionally, Cassandra supports client certificate authentication. Imagine talking to a Cassandra cluster without it: if the cluster only expects an SSL key, we could write programs that attach to the cluster and execute arbitrary commands, listen to writes on arbitrary token ranges, or even create an admin account in the system_auth table.

To avoid this, Cassandra uses client certificate authentication to take the extra step of verifying the client against a local trust store. If it does not recognize the client's certificate, it will not accept the connection. This additional verification is enabled by setting require_client_auth: true in the cassandra.yaml configuration file.

In the rest of the blog we will see step by step process of enabling and configuring the cluster for SSL connection. If you have a certificate already, you can skip Generating certificates using OpenSSL.

Generating Certificates using OpenSSL

Most UNIX systems have the OpenSSL tool installed. If it is not available, install OpenSSL before proceeding further.


  1. Create a configuration file gen_ca_cert.conf with the below configurations.


2. Run the following OpenSSL command to create the CA:

3. You can verify the contents of the certificate you just created with the following command:
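Since the original configuration and commands are not shown above, here is an illustrative reconstruction of all three steps based on standard OpenSSL usage. The organization names, filenames, and password are placeholders:

```shell
# 1. A minimal gen_ca_cert.conf for creating the root CA
cat > gen_ca_cert.conf <<'EOF'
[ req ]
distinguished_name = req_distinguished_name
prompt             = no
output_password    = capass123
default_bits       = 2048

[ req_distinguished_name ]
C  = US
O  = ExampleOrg
OU = ExampleCluster
CN = rootCa
EOF

# 2. Create the CA: a self-signed certificate plus an encrypted private key
openssl req -config gen_ca_cert.conf -new -x509 \
        -keyout ca-key.pem -out ca-cert.pem -days 365

# 3. Verify the contents of the certificate you just created
openssl x509 -in ca-cert.pem -text -noout
```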

You can generate a certificate for each node if required, but doing so is not recommended, because it is very tough to maintain a separate key for each node. Imagine adding a new node to the cluster: its certificate would need to be added to all the other nodes, which is a tedious process. So we recommend using the same certificate for all the nodes. The following steps show how.

Building Keystore

I will explain building the keystore for a 3-node cluster; the same steps apply to an n-node cluster.


To verify that the keystore is generated with correct key pair information and accessible, execute the below command


With our key stores created and populated, we now need to export a certificate from each node’s key store as a “Signing Request” for our CA:


With the certificate signing requests ready to go, it’s now time to sign each with our CA’s public key via OpenSSL:


Add the CA to each node's keystore via the -import subcommand of keytool.


Building Trust Store

Since Cassandra uses Client Certificate Authentication, we need to add a trust store to each node. This is how each node will verify incoming connections from the rest of the cluster.

We need to create trust store by importing CA root certificate’s public key:


Since all our instance-specific keys have now been signed by the CA, we can share this trust store instance across the cluster.

Configuring the Cluster

After creating all the required files, you can keep the keystore and truststore files in /usr/local/lib/cassandra/conf/ or any directory of your choice, but make sure that the Cassandra daemon has access to that directory. With the below configuration in the cassandra.yaml file, inbound and outbound requests will be encrypted.

Enable Node to Node Encryption
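The original snippet is not shown; a sketch of the relevant server_encryption_options section of cassandra.yaml could look like this (paths and passwords are illustrative):

```yaml
server_encryption_options:
  internode_encryption: all        # encrypt all inter-node traffic
  keystore: /usr/local/lib/cassandra/conf/node1-keystore.jks
  keystore_password: keypass123
  truststore: /usr/local/lib/cassandra/conf/truststore.jks
  truststore_password: trustpass123
  require_client_auth: true        # verify peer certificates against the truststore
```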


Enable Client to Node Encryption
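A sketch of the corresponding client_encryption_options section (again, paths and passwords are illustrative):

```yaml
client_encryption_options:
  enabled: true
  optional: false                  # reject unencrypted client connections
  keystore: /usr/local/lib/cassandra/conf/node1-keystore.jks
  keystore_password: keypass123
  require_client_auth: true
  truststore: /usr/local/lib/cassandra/conf/truststore.jks
  truststore_password: trustpass123
```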


Repeat the above process on all the nodes in the cluster, and your cluster's data is secured in flight and from unknown parties.

Author Credits: This article was written by Bharathiraja S, Senior Data Engineer at 8KMiles Software Services.

Cassandra Backup and Restore Methods


Cassandra is a distributed database management system in which data is replicated among multiple nodes across multiple data centers, so it can survive without any interruption in service when one or more nodes are down. Cassandra keeps its data in SSTable files, stored in the keyspace directory within the data directory path specified by the data_file_directories parameter in the cassandra.yaml file. By default, the SSTable directory path is /var/lib/cassandra/data/<keyspace_name>. However, Cassandra backups are still necessary to recover from the following scenarios:

  1. Errors made in data by client applications
  2. Accidental deletions
  3. Catastrophic failures that require you to rebuild your entire cluster
  4. Data corruption
  5. Rolling back the cluster to a known good state
  6. Disk failure

Cassandra Backup Methods

Cassandra provides two types of backup: snapshot-based backup and incremental backup.

Snapshot Based Backup

Cassandra provides the nodetool utility, a command line interface for managing a cluster. The nodetool utility offers a useful command for creating snapshots of the data: nodetool snapshot flushes memtables to disk and creates a snapshot by creating hard links to the SSTables (SSTables are immutable). The nodetool snapshot command takes snapshots on a per-node basis. To take a snapshot of the entire cluster, run the command using a parallel ssh utility, such as pssh, or take the snapshot of each node one by one.

It is possible to take a snapshot of all keyspaces in a cluster, or certain selected keyspaces, or a single table in a keyspace. Note that you must have enough free disk space on the node for taking the snapshot of your data files.

The schema does not get backed up by this method; it must be backed up manually and separately.


a. All keyspaces snapshot

If you want to take snapshot of all keyspaces on the node then run the below command.

$ nodetool snapshot

The following message appears:

Requested creating snapshot(s) for [all keyspaces] with snapshot name [1496225100]

Snapshot directory: 1496225100

The snapshot directory is /var/lib/cassandra/data/<keyspace_name>/<table_name>-<UUID>/snapshots/1496225100.

b. Single keyspace snapshot

Assume you have created the keyspace university. To take a snapshot of this keyspace with a custom snapshot name, run the below command.

$ nodetool snapshot -t 2017.05.31 university

The following output appears:

Requested creating snapshot(s) for [university] with snapshot name [2017.05.31]

Snapshot directory: 2017.05.31

c. Single table snapshot

If you want to take a snapshot of only the student table in the university keyspace then run the below command

$ nodetool snapshot --table student university

The following message appears:

Requested creating snapshot(s) for [university] with snapshot name [1496228400]

Snapshot directory: 1496228400

After completing the snapshot, you can move the snapshot files to another location such as AWS S3, Google Cloud, or MS Azure. You must also back up the schema, because Cassandra can only restore data from a snapshot when the table schema exists.


Pros:

  1. Snapshot-based backup is simple and much easier to manage.
  2. The Cassandra nodetool utility provides the nodetool clearsnapshot command, which removes the snapshot files.


Cons:

  1. For large data sets, it may be hard to take a daily backup of the entire keyspace.
  2. It is expensive to transfer large snapshot data to a safe location like AWS S3.

Incremental Backup

Cassandra also provides incremental backups. By default, incremental backup is disabled; it can be enabled by changing the value of incremental_backups to true in the cassandra.yaml file.
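The corresponding cassandra.yaml change is a single line:

```yaml
incremental_backups: true
```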

Once enabled, whenever a memtable is flushed to an SSTable, Cassandra creates a hard link to the flushed SSTable in a backups directory under the keyspace data directory. Incremental backups contain only new SSTable files, and they are dependent on the last snapshot created.

In the case of incremental backup, less disk space is required because it only contains links to new SSTable files generated since the last full snapshot.


Pros:

  1. The incremental backup reduces disk space requirements.
  2. It reduces the transfer cost.


Cons:

  1. Cassandra does not automatically clear incremental backup files, and there is no built-in tool to clear them. If you want to remove the hard-linked files, you must write your own script.
  2. It creates lots of small files in the backup; file management and recovery are not trivial tasks.
  3. It is not possible to select a subset of column families for incremental backup.

Cassandra Restore Methods

Backups are meaningful only when they are restorable in situations such as a keyspace being deleted, a new cluster being launched from the backup data, or a node being replaced. Restoring backed-up data is possible from snapshots; if you are using incremental backups, you also need all the incremental backup files created after the snapshot. There are two main ways to restore data from a backup: nodetool refresh and sstableloader.

Restore using nodetool refresh:

The nodetool refresh command loads newly placed SSTables into the system without a restart. This method is used when a new node replaces a node that is not recoverable. Restoring data from a snapshot is possible only if the table schema exists. Assuming you have created a new node, follow the steps below.

  1. Create the schema if it is not created already.
  2. Truncate the table, if necessary.
  3. Locate the snapshot folder (/var/lib/cassandra/data/<keyspace_name>/<table_name>-<UUID>/snapshots/<snapshot_name>) and copy the snapshot SSTable files into the /var/lib/cassandra/data/<keyspace_name>/<table_name>-<UUID> directory.
  4. Run nodetool refresh.

Restore using sstableloader:

The sstableloader loads a set of SSTable files in a Cassandra cluster. The sstableloader provides the following options.

  1. Loading external data
  2. Loading existing SSTables
  3. Restore snapshots

The sstableloader does not simply copy the SSTables to every node; it transfers the relevant part of the data to each node while maintaining the replication factor. Here, sstableloader is used to restore snapshots. Follow the steps below to restore using sstableloader.

  1. Create the schema if not exists.
  2. Truncate the table if necessary.
  3. Bring your backup data to a node from AWS S3, Google Cloud, or MS Azure. Example: download your backup data to /home/data.
  4. Run the below command
    sstableloader -d ip /home/data


Author Credits: This article was written by Sebabrata Ghosh, Data Engineer at 8KMiles Software Services; you can reach him here.


8KMiles strikes the right balance between Cloud Security and Performance

The healthcare industry is one of the sectors that faces major challenges in embracing cloud transformation. Regulation-specific security requirements and huge amounts of sensitive data are the major reasons, and there is a constant need for technology and information heads in healthcare organizations to maintain the right equilibrium between security and privacy without compromising IT infrastructure budgets and performance. In this context, a CMMI Level 5 healthcare prospect approached 8KMiles with a specific set of requirements. The prospect company was using the CPSI application and had an enormous number of rectifications that needed to be either made or migrated. The prospect chose 8KMiles as its preferred development partner because 8KMiles is a state-of-the-art solution provider with an agile team of experts who practice Scrum and are ready to take up ad-hoc requirements with a 24/7 development support system.

8KMiles worked extensively and collaborated with the prospect company to:

1) Establish formal Business Relationship with the prospect.

2) Understand the Business needs and requirements:

a. User Interface – The interface involves multiple billing screens, complicating navigation and unduly delaying task completion.

b. E-mail Messaging Limitation – It is not possible to send messages to more than one person.

c. Compliance with HL7 Requirements – Integration with other HL7-compliant systems is minimal; the system is unable to interface with the radiologist even with an HL7 interface through a fairly basic MS SQL Server database.

d. Workflow Management – The workflow management does not capture many areas of healthcare, thus missing out on benefits.

e. Interoperability – CPSI does not allow FHIR (Fast Healthcare Interoperability Resources) compatible APIs for open access to patient data.

f. Security – A multi-layered approach to security is not provided, limiting employee education.

g. Medical Records Sync – Updated information on patients treated at different facilities is not available on the fly.

h. Lack of Standardized Terminology, System Architecture, and Indexing – The system is inflexible and incapable of capturing the diverse requirements of the different healthcare disciplines.

i. Integration Issues – Seamless integration of the hospital EMR with the physician office EMR is not happening.

j. Switches – There were too many switches in the role hierarchy that were not recorded properly.

3) 8KMiles studied the requirements systematically and came up with solutions based on Agile methodology for the above pain points:

a. User Interface – The interface involves a SSO which allows the User to provide a onetime Credential making them to Open N-Number of Application/Resources with a single click .

b. E-mail Messaging limitation :

i. The 8K Miles Access Governance & Identity Management Solution can send multiple e-mails based on approvals, rejections, attestations and re-certifications.

ii. Multi-level approval and messaging are possible.

c. Compliance with HL7 requirements:

i. The 8K Miles Access Governance & Identity Management Solution allows the customer to integrate any database, such as MS SQL Server, Oracle or IBM DB2.

ii. Customers and employees can integrate with various portals through SSO (Single Sign-On); for example, a radiologist can log in to multiple portals with a single credential.

d. Workflow Management – Workflow management and policy compliance facilitate capturing areas of restriction in healthcare, such as providing the right access to the right resource for the right user at the right time.

e. Interoperability – The 8KMiles Access Governance & Identity Management Solution and SSO help provide fast access to FHIR-related applications.

f. Security – The 8K Miles Access Governance & Identity Management Solution provides multi-level and parallel approvals.

g. Medical Records Synch – The 8K Miles Access Governance & Identity Management Solution is integrated and synced with different databases, so updated information on patients treated at different facilities is available on the fly at any point of time.

h. Lack of standardized terminology, system architecture and indexing – The solution is highly customizable and flexible enough to handle any healthcare requirement related to identity and access governance.

i. Integration Issues – Seamless integration of the hospital EMR with the physician-office EMR is provided using SSO.

j. Switches – The 8K Miles Access Governance & Identity Management Solution helps distribute switches and roles to multiple users on a daily basis.

If you are experiencing similar problems in your healthcare business, please write to sales@8kmiles.com.

Cost Optimization Tips for Azure Cloud-Part III

In continuation of my previous blog, I am going to jot down more tips on how to optimize cost while moving to the Azure public cloud.


With Microsoft introducing the next generation of Azure deployments via Azure Resource Manager (ARM), you can gain a significant performance improvement just by upgrading VMs to the latest versions (from Azure V1 to Azure V2). In most cases the price is the same or nearly the same.
For example, upgrading a DV1-series VM to the DV2 series gives you 35-40% faster processing at the same price point.


It is not enough to shut down VMs from within the instance to avoid being billed, because Azure continues to reserve the compute resources for the VM, including a reserved public IP. Unless you need VMs to be up and running all the time, shut down and deallocate them to save on cost. This can be done from the Azure Management portal or Windows PowerShell.


If you delete a VM, the VHDs are not deleted. That means you can safely delete the VM without losing data. However, you will still be charged for storage. To delete the VHD, delete the file from Blob storage.

  •  When an end-user’s PC makes a DNS query, it doesn’t contact the Traffic Manager Name servers directly. Instead, these queries are sent via “recursive” DNS servers run by enterprises and ISPs. These servers cache the DNS responses, so that other users’ queries can be processed more quickly. Since these cached responses don’t reach the Traffic Manager Name servers, they don’t incur a charge.

The caching duration is determined by the “TTL” parameter in the original DNS response. This parameter is configurable in Traffic Manager; the default is 300 seconds, and the minimum is 30 seconds.

By using a larger TTL, you can increase the amount of caching done by recursive DNS servers and thereby reduce your DNS query charges. However, increased caching also affects how quickly changes in endpoint status are picked up by end users, i.e. end-user failover times in the event of an endpoint failure become longer. For this reason, we don’t recommend using very large TTL values.

Likewise, a shorter TTL gives more rapid failover times, but since caching is reduced, the query counts against the Traffic Manager name servers will be higher.

By allowing you to configure the TTL value, Traffic Manager enables you to make the best choice of TTL based on your application’s business needs.

  • If you provide write access to a blob, a user may choose to upload a 200GB blob. If you’ve given them read access as well, they may choose to download it 10 times, incurring 2TB in egress costs for you. Again, provide limited permissions to help mitigate the potential of malicious users. Use short-lived Shared Access Signatures (SAS) to reduce this threat (but be mindful of clock skew on the end time).
  • Azure App Service charges apply to apps in the stopped state. Delete apps that are not in use, or move them to the Free tier, to avoid charges.
  • In Azure Search, the Stop button is meant to stop traffic to your service instance. Your service is still running and will continue to be charged the hourly rate.
  • Use Blob storage to store images, videos and text files instead of storing them in SQL Database. The cost of Blob storage is much less than SQL Database: a 100GB SQL Database costs $175 per month, but the same Blob storage costs only $7 per month. To reduce cost and increase performance, put the large items in Blob storage and store the blob record key in SQL Database.
  • Cycle out old records and tables in your database. This saves money, and knowing what you can or cannot delete is important if you hit your database max size and need to quickly delete records to make space for new data.
  • If you intend to use a substantial amount of Azure resources for your application, you can choose a volume purchase plan. These plans allow you to save 20 to 30% of your data center cost for your larger applications.
  • Use a strategy for removing old backups such that you maintain history but reduce storage needs. If you maintain backups for the last hour, day, week, month and year, you have good backup coverage while not incurring more than 25% of your database costs for backup. With a 1GB database, your cost would be $9.99 per month for the database and only $0.10 per month for the backup space.
  • An advantage of Azure DocumentDB stored procedures is that they enable applications to perform complex batches and sequences of operations directly inside the database engine, closer to the data, so the network traffic latency cost of batching and sequencing operations can be completely avoided. Another advantage of stored procedures is that they get implicitly precompiled to byte-code format upon registration, avoiding script compilation costs at the time of each invocation.
  • The default cloud service size is ‘Small’. You can change it to ‘Extra Small’ in your cloud service – Properties – Settings. This will reduce your costs from $90 to $30 a month at the time of writing. The difference between ‘Extra Small’ and ‘Small’ is that the virtual machine memory is 780 MB instead of 1780 MB.
  • Windows Azure Diagnostics may inflate your bill with storage transactions if you do not control it properly.

We need to define what kinds of logs (IIS logs, crash dumps, FREB logs, arbitrary log files, performance counters, event logs, etc.) are to be collected and sent to Windows Azure Storage, either on a schedule or on demand.

However, if you do not carefully define what you really need from the diagnostic info, you might end up with an unexpected bill.

Assuming the following figures:

  • You have an application that requires the high processing power of 100 instances
  • You apply 5 performance counter logs (Processor % Processor Time, Memory Available Bytes, Physical Disk % Disk Time, Network Interface Bytes Total/sec, Processor Interrupts/sec)
  • You perform a scheduled transfer every 5 seconds
  • The instances run 24 hours per day, 30 days per month

How much it costs for Storage Transaction per month?

5 counters X 12 times X 60 min X 24 hours X 30 days X 100 instances = 259,200,000 transactions

$ 0.01 per 10,000 transactions X 259,200,000 transactions = $ 259.20 per month

To bring this down, ask whether you really need to monitor all 5 performance counters every 5 seconds. What if you reduced them to 3 counters and monitored them every 20 seconds?

3 counters X 3 times X 60 min X 24 hours X 30 days X 100 instances = 38,880,000 transactions

$ 0.01 per 10,000 transactions X 38,880,000 transactions = $ 38.88 per month

You can see how much these numbers save. Windows Azure Diagnostics is genuinely needed, but using it improperly may cost you unnecessary money.
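The arithmetic above generalizes to a small helper. As a sketch: the counter counts, transfer intervals and the $0.01-per-10,000-transactions rate are the example figures from this post, not current Azure pricing.

```python
def diagnostics_transaction_cost(counters, transfer_interval_sec, instances,
                                 hours_per_day=24, days=30,
                                 rate_per_10k=0.01):
    """Estimate monthly storage-transaction cost for scheduled
    performance-counter transfers (one transaction per counter per transfer)."""
    transfers_per_min = 60 / transfer_interval_sec
    transactions = counters * transfers_per_min * 60 * hours_per_day * days * instances
    cost = transactions / 10_000 * rate_per_10k
    return transactions, cost

# 5 counters transferred every 5 seconds on 100 instances
tx, cost = diagnostics_transaction_cost(5, 5, 100)
print(f"{tx:,.0f} transactions -> ${cost:.2f}/month")   # 259,200,000 -> $259.20

# 3 counters transferred every 20 seconds on 100 instances
tx, cost = diagnostics_transaction_cost(3, 20, 100)
print(f"{tx:,.0f} transactions -> ${cost:.2f}/month")   # 38,880,000 -> $38.88
```

Plugging in your own counter set and transfer interval before enabling diagnostics makes the bill predictable rather than surprising.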

  • Suppose an application organizes blobs into a different container for each user and allows users to check the size of each container. For that, a function is created to loop through all the files inside the container and return the size. This functionality is exposed on a UI screen, and an admin typically calls it a few times a day.

Assuming the following figures for illustration:

  • There are 1,000 users.
  • There are 10,000 files on average in each container.
  • The admin calls this function 5 times a day on average.

How much does it cost in storage transactions per month?

Remember: a single Get Blob request is considered 1 transaction!

1,000 users X 10,000 files X 5 queries X 30 days = 1,500,000,000 transactions

$ 0.01 per 10,000 transactions X 1,500,000,000 transactions = $ 1,500 per month

Well, that’s not cheap at all, so let’s bring it down.

Do not expose this functionality as a real-time query to the admin. Consider automatically running the function once a day and saving the size somewhere, and just let the admin view the daily result (day by day). By limiting the admin to one view per day, the monthly cost looks like this:

1,000 users X 10,000 files X 1 query X 30 days = 300,000,000 transactions

$ 0.01 per 10,000 transactions X 300,000,000 transactions = $ 300 per month
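The “compute once a day, serve from cache” design can be sketched as follows. This is a minimal illustration: the expensive listing call is stubbed out, since a real implementation would enumerate blobs through the Azure storage SDK.

```python
import datetime

class ContainerSizeCache:
    """Serve container sizes from a daily snapshot instead of
    re-listing every blob on each admin request."""

    def __init__(self, compute_size_fn):
        self._compute = compute_size_fn   # expensive: one Get Blob per file
        self._sizes = {}
        self._snapshot_date = None

    def size(self, container, today=None):
        today = today or datetime.date.today()
        if self._snapshot_date != today:      # invalidate at most once a day
            self._sizes = {}
            self._snapshot_date = today
        if container not in self._sizes:      # first request today pays the cost
            self._sizes[container] = self._compute(container)
        return self._sizes[container]

listings = []
def fake_list_container(container):
    listings.append(container)   # stands in for the per-file Get Blob calls
    return 42

cache = ContainerSizeCache(fake_list_container)
day = datetime.date(2017, 1, 1)
cache.size("user1", day)
cache.size("user1", day)         # served from cache, no second listing
print(len(listings))             # 1
```

However many times the admin opens the screen, each container is enumerated at most once per day, which is exactly what drops the bill from $1,500 to $300 in the example above.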

Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8KMiles Software Services and originally published here

Cost Optimization Tips for Azure Cloud-Part II

Cloud computing comes with myriad benefits with its various as-a-service models and hence most businesses consider it wise to move their IT infrastructure to cloud. However, many IT admins worry that hidden costs will lower their department’s total cost of ownership.

We believe that it is more about estimating your requirements correctly and managing resources in the right way.

Microsoft Azure Pricing

Microsoft Azure allows you to quickly deploy infrastructures and services to meet all of your business needs. You can run Windows and Linux based applications in 22 Azure data-center regions, delivered with enterprise grade SLAs. Azure services come with:

  • No upfront costs
  • No termination fees
  • Pay only for what you use
  • Per-minute billing

You can calculate your expected monthly bill using Pricing Calculator and track your actual account usage and bill at any time using the billing portal.

1. Azure allows you to set a monthly spending limit on your account. So, if you forget to turn off your VMs, your Azure account will get disabled before you run over your predefined monthly spending limit. You can also set email billing alerts if your spend goes above a preconfigured amount.

2. It is not enough to shut down VMs from within the instance to avoid being billed because Azure continues to reserve the compute resources for the VM including a reserved public IP. Unless you need VMs to be up and running all the time, shut down and deallocate them to save on cost. This can be achieved from Azure Management portal or Windows Powershell.

3. Delete unused VPN gateways and application gateways, as they are charged whether they run inside a virtual network or connect to other virtual networks in Azure. Your account is charged based on the time a gateway is provisioned and available.

4. To avoid reserved IP address charges, keep at least one VM running at all times; the first five reserved public IPs in use are included at no extra charge. If you shut down all the VMs in a service, Microsoft is likely to reassign that IP to some other customer’s cloud service, which can hamper your business.

5. Minimize the number of compute hours by using auto scaling. Auto scaling can minimize the cost by reducing the total compute hours so that the number of nodes on Azure scales up or down based on demand.

6. When an end-user’s PC makes a DNS query, recursive DNS servers run by enterprises and ISPs cache the DNS responses. These cached responses don’t incur charge as they don’t reach the Traffic Manager Name servers. The caching duration is determined by the “TTL” parameter in the original DNS response. With larger TTL value, you can reduce DNS query charges but it would result in longer end-user failover times. On the other hand, shorter TTL value will reduce caching resulting in more query counts against Traffic Manager Name server. Hence, configure TTL in Traffic Manager based on your business needs.

7. Blob storage offers a cost-effective solution to store graphics data. Blob storage of type Table and Queue of 2 GB costs $0.14/month, while block blob storage costs just $0.05/month. A SQL Database of similar capacity costs $4.98/month. Hence, use Blob storage to store images, videos and text files instead of storing them in SQL Database.

To reduce cost and increase performance, put large items in Blob storage and store the blob record key in SQL Database.

Above tips will definitely help you cut cost on Azure and leverage the power of cloud computing to the best!


Cost Optimization Tips for Azure Cloud-Part I

In general, there are quite a few driving forces behind the rapid adoption of cloud platforms of late, but doing it within budget is the actual challenge. The key benefit of public cloud providers like Azure is the pay-as-you-go pricing model, which frees customers from capital investment; however, cloud expenses can start to add up and soon get out of control if you are not practicing effective cost management. Taking control of your cloud costs, and deciding on a sound cost-management strategy, needs attention and care.

In these articles I will try to outline a few of Azure’s cost-saving and optimization considerations. This is going to be a three-part series; the first part could be subtitled “7 considerations for a highly effective Azure architecture”, because it covers the subject from an architect’s point of view.

1. Design for Elasticity

Elasticity has been one of the fundamental properties of Azure, and it drives many of its economic benefits. By designing your architecture for elasticity you avoid over-provisioning of resources; restrict yourself to using only what is needed. Azure offers an umbrella of services that help customers get rid of under-utilized resources (always make use of services like VM scale sets and autoscaling).

2. Leverage Azure Application Services (Notification, Queue, Service Bus etc.)
Application services in Azure don’t just help with performance optimization; they can greatly affect the cost of the overall infrastructure. Judiciously decide which services your workload needs and provision them in an optimal way. Make use of existing services; don’t try to reinvent the wheel.
When you install software yourself to meet a requirement, you get the benefit of customizable features, but the trade-off is immense: you have to run an instance for it, which in turn ties the availability of that software to a particular VM. If you instead choose services from Azure, you enjoy built-in availability, scalability and high performance, with the option of pay-as-you-go.

3. Always Use Resource Group
Keep related resources in close proximity; that way you save money on communication among the services, and in addition the application gets a performance boost, as latency is no longer a factor. In later articles I will talk specifically about other benefits this particular service can offer.

4. Off Load From Your Architecture
Try to offload as much as possible by distributing work to better-suited services; this not only reduces the maintenance headache but helps optimize cost too. Move session-related data off the server, and optimize the infrastructure for performance and cost by caching and edge-caching static content.

Combine multiple JS and CSS files into one, then compress and minify them. Once bundled into compressed form, move them to Azure Blob storage. When your static content is popular, front it with Azure Content Delivery Network. Use Blob + Azure CDN, as it reduces cost as well as latency (depending on the cache-hit ratio). For anything related to media streaming, make use of Azure CDN, as it frees you from running Adobe FMS.

5. Caching And Compression For CDN Content
After analyzing multiple customer subscriptions, we can see a pattern of modest to huge CDN spends. A common cause is that customers forget to enable caching for CDN resources at origin servers such as Azure Blob. You should enable compression for content like CSS, JavaScript, text files, JSON and HTML to ensure cost savings on bandwidth. Customers also frequently deploy production changes and forget to re-enable caching and compression for static resources and dynamic content such as text/HTML/JSON. We recommend a post-deploy job as part of your release automation to ensure client-side caching, server-side compression, etc. are enabled for your application and resources.

6. Continuous Optimization In Your Architecture
If you have been using Azure for the past few years, there is a high possibility you are using outdated services. Although you should not tinker too much with an architecture once designed, it is good to take a look and see whether there are things that can be replaced with new-generation services. They might be a better fit for the workload and offer the same results at less expense. Always match resources to the workload.
This not only gives you instant benefits, it offers recurring savings in every subsequent month’s bill.

7. Optimize The Provisioning Based On Consumption Trend

You need to be aware of what you are using; there is no need to waste money on expensive instances or services you don’t need. Automatically turn off what you don’t need; services like Azure Automation can help you achieve that. Make use of Azure services like autoscaling, VM scale sets and Azure Automation to keep services uninterrupted even when traffic increases beyond expectations. A special mention goes to Azure DevTest Labs, a service designed specifically for development and testing scenarios: with it, end users can model their infrastructure so that they are charged only for office hours (usually 8×5), and these settings are customizable, which makes it even more flexible. When dealing with Azure Storage, make use of the appropriate storage classes with the required redundancy options. Services like File storage, page blobs and block blobs each have their specific purpose, so be clear about them while designing your architecture.

Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8KMiles Software Services and originally published here

Enhanced Security In Cloud Computing – A Traditional Approach In Modern Technology

Cloud computing is now part of our day-to-day activities, and we can’t deny the fact that the applications on our smartphones are integrated with the cloud. The data that is uploaded, stored or downloaded via the cloud needs to be secured both at rest and in motion.

Watermarking is an authenticity technique that helps secure data and enhances cloud computing security. Have you ever thought about how we use watermarking in our everyday activities, and how it is available in our wallets or purses? Yes, I am talking about currency notes, which carry a watermark.

Can this be digitized? Yes, it already has been: we often see it on our TV channels, which carry a digital watermark with their logo, be it BBC or our local channels.

Consider introducing the traditional technique of watermarking into cloud computing: it can help prevent breaches and alleviate the security threats that have arisen with the growth of technology.

We all know that the cloud business model supports on-demand, pay-for-use, and economies-of-scale IT services over the Internet, and that virtualized data centers combine to form the Internet cloud. To support multiple tenants’ data residing on the same cloud, the cloud needs to be designed to be secure and private, because security breaches lead to data being compromised. Cloud platforms are dynamically built through virtualization with provisioned hardware, software, networks, and data sets. The idea is to migrate desktop computing to a service-oriented platform using virtual server clusters at data centers.

We need to identify best-practice processes for cost-effective security enhancements in cloud computing, and watermarking has been found to fit this category. Increasing public cloud usage with security-enhanced clouds, for example by using digital watermarking techniques, helps improve revenue for both the cloud service provider and the client.

Digital watermarking is a method that can be applied to protect documents, images, video, software, and relational databases. These techniques protect shared data objects and massively distributed software modules.

Combined with data coloring, this can prevent data objects from being damaged, stolen, altered, or deleted. Protecting a data center requires first securing cloud resources and upholding user privacy and data integrity.


The new approach could be more cost-effective than using traditional encryption and firewalls to secure clouds. It can be implemented to protect data-center access at a coarse-grained level and secure data access at a fine-grained file level. It can be interlinked with security as a service (SECaaS) and data protection as a service (DPaaS) and be widely used for personal, business, finance, and digital government data. It safeguards user authentication and tightens data access control in public clouds.

Public watermarked clouds are an effective solution for security threats.
They ensure confidentiality, integrity, and availability in a multi-tenant environment. Computing clouds with enhanced privacy controls demand ubiquity, efficiency, security, and trustworthiness.

Effective trust management, guaranteed security, user privacy, data integrity, mobility support, and copyright protection are crucial to the universal acceptance of cloud as a ubiquitous service. Effective, low-cost usage of public clouds leads to satisfied customers.

This blog has thus helped identify the different security threats in cloud computing, along with best-practice processes for cost-effective security enhancements, which will in turn benefit organizations.

Author Credits: This article was written by Ramya Deepika, Cloud Architect at 8KMiles Software Services and originally published here

Benchmarking Sentiment Analysis Systems

Gone are the days when consumers depended on word-of-mouth from their near and dear ones for a product purchase. The Gen-Y generation now largely goes by online reviews, not only to get a virtual look and feel of the product but also to understand its pros and cons. Online reviews come from various sources, such as forum discussions, blogs, microblogs, Twitter and social networks, and are humongous in volume, which has led to the inception and rapid growth of sentiment analysis.

Sentiment analysis helps to understand the opinion of people towards a product or an issue.  Sentiment analysis has grown to be one of the most active research areas in Natural Language Processing (NLP). It is also widely studied in data mining, web mining and text mining.  In this blog, we will discuss the techniques to evaluate and benchmark sentiment analysis feature in NLP products.

8KMiles’ recent engagement with a leading cloud provider involved applying sentiment analysis to different review datasets with some of the top products available in the market, and assessing their effectiveness in correctly identifying reviews as positive, negative or neutral. Our own team tracked opinions across an enormous number of movie reviews from IMDb and product reviews from Amazon and Yelp, and predicted sentiment polarity with very accurate results. Tweets are different from reviews because of their purpose: while reviews represent the summarized thoughts of their authors, tweets are more casual and limited to 140 characters of text. Because of this, accuracy results for tweets vary significantly from other datasets. A systematic approach to benchmarking the accuracy of sentiment polarity helps reveal the strengths and weaknesses of various products under different scenarios. Here, we share some of the top-performing products and key information on how accuracy is evaluated for various NLP APIs and how a comparison report is prepared.

There is a wide range of products available in the market; a few important products with their language support are shown below.

  • Google NL API – English, Spanish, Japanese
  • Microsoft Linguistic Analysis API – English, Spanish, French, Portuguese
  • IBM AlchemyAPI – English, French, Italian, German, Portuguese, Russian and Spanish
  • Stanford CoreNLP – English
  • Rosette Text Analytics – English, Spanish, Japanese
  • Lexalytics – English, Spanish, French, Japanese, Portuguese, Korean, etc.


Not all products return sentiment polarity directly. Some directly return a polarity such as positive, negative or neutral, whereas others return scores, and these score ranges in turn have to be converted into a polarity if we want to compare products. The following sections explain the results returned by some of the APIs.

Google’s NL API sentiment analyzer returns numerical score and magnitude values which represent the overall attitude of the text. After analyzing the results for various ranges, the range from -0.1 to 0.1 was found to be appropriate for neutral sentiment. Any score greater than 0.1 was considered positive, and any score less than -0.1 was considered negative.

Microsoft Linguistic Analysis API returns a numeric score between 0 & 1. Scores closer to 1 indicate positive sentiment, while scores closer to 0 indicate negative sentiment. A range of scores between 0.45 and 0.60 might be considered as neutral sentiment. Scores less than 0.45 may be used as negative sentiment and scores greater than 0.60 may be taken as positive sentiment.

IBM Alchemy API returns a score as well as sentiment polarity (positive, negative or neutral). So, the sentiment label can be used directly to calculate the accuracy.

Similarly, Stanford CoreNLP API returns 5 labels viz. very positive, positive, very negative, negative and neutral. For comparison with other products, very positive and positive may be combined and treated as a single group called positive and similarly very negative and negative may be combined and treated as a single group called negative.
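Using the thresholds described above, a simple normalization layer can map each API’s raw output onto a common polarity scale. Note that the cut-offs are the ones chosen in this post, not values mandated by the vendors:

```python
def polarity_from_google(score):
    """Google NL: score in [-1, 1]; [-0.1, 0.1] treated as neutral."""
    if score > 0.1:
        return "positive"
    if score < -0.1:
        return "negative"
    return "neutral"

def polarity_from_microsoft(score):
    """Microsoft Linguistic Analysis: score in [0, 1]; [0.45, 0.60] neutral."""
    if score > 0.60:
        return "positive"
    if score < 0.45:
        return "negative"
    return "neutral"

def polarity_from_stanford(label):
    """Stanford CoreNLP: fold 'very positive'/'very negative' into 3 classes."""
    return label.replace("very ", "")

print(polarity_from_google(0.35))               # positive
print(polarity_from_microsoft(0.5))             # neutral
print(polarity_from_stanford("very negative"))  # negative
```

Once every API’s output is reduced to the same three labels, the predictions can be compared against the labelled dataset in a single confusion matrix.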

After above conversion, we need a clean way to show the actual and predicted results for the sentiment polarities. This is explained with examples in the following section.

Confusion Matrix

A confusion matrix contains information about the actual and predicted classifications made by a classification system. Let’s consider an example to get a better understanding.

The example is based on a dataset of 1,500 reviews, split into 780 positives, 492 negatives and 228 neutrals. Product A predicted 871 positives, 377 negatives and 252 neutrals, whereas Product B predicted 753 positives, 404 negatives and 343 neutrals.

From the matrix we can easily see that Product A rightly identifies 225 negative reviews as negative, but it wrongly classifies 157 negative reviews as positive and 110 negative reviews as neutral.

Note that all the correct predictions lie along the diagonal of the matrix (614, 225 and 55 for Product A). This makes it quick to identify the errors, as they are the values outside the diagonal.
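Product A’s full matrix can be reconstructed from the figures quoted above. One caveat: the two neutral-row error cells (100 and 73) are inferred from the stated row and column totals, so they are an assumption rather than numbers given in the text.

```python
# rows = actual class, columns = predicted class (Product A)
labels = ["positive", "negative", "neutral"]
product_a = {
    "positive": {"positive": 614, "negative": 79,  "neutral": 87},   # 780 actual
    "negative": {"positive": 157, "negative": 225, "neutral": 110},  # 492 actual
    "neutral":  {"positive": 100, "negative": 73,  "neutral": 55},   # 228 actual (errors inferred)
}

# Row sums give the actual counts, column sums the predicted counts.
actual_totals = {r: sum(product_a[r].values()) for r in labels}
predicted_totals = {c: sum(product_a[r][c] for r in labels) for c in labels}
correct = sum(product_a[l][l] for l in labels)   # the diagonal

print(actual_totals)     # {'positive': 780, 'negative': 492, 'neutral': 228}
print(predicted_totals)  # {'positive': 871, 'negative': 377, 'neutral': 252}
print(correct)           # 894
```

Checking that the row and column sums reproduce the stated totals is a quick sanity test that the matrix was transcribed correctly.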

Precision, Recall & F-Measure
Precision measures the exactness of a classifier. Higher precision means fewer false positives, while lower precision means more false positives. Recall measures the completeness, or sensitivity, of a classifier. Higher recall means fewer false negatives, while lower recall means more false negatives.

  • Precision = True Positive / (True Positive + False Positive)
  • Recall = True Positive / (True Positive + False Negative)

The F1 score is a measure of a test’s accuracy. It considers both the precision and the recall of the test: the F1 score is the harmonic mean of precision and recall, and it tells you how your system is performing overall.

  • F1-Measure= [2 * (Precision * Recall) / (Precision + Recall)]
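These formulas translate directly into code. In the multi-class case, the “true positives” for a class are its diagonal cell, while the false positives and false negatives are the rest of its column and row; the matrix below uses the Product A figures discussed above (with the neutral-row error cells inferred from the stated totals).

```python
def precision_recall_f1(matrix, label):
    """matrix[actual][predicted] -> counts; returns (precision, recall, f1)."""
    tp = matrix[label][label]
    fp = sum(matrix[a][label] for a in matrix if a != label)         # column minus diagonal
    fn = sum(matrix[label][p] for p in matrix[label] if p != label)  # row minus diagonal
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

matrix = {
    "positive": {"positive": 614, "negative": 79,  "neutral": 87},
    "negative": {"positive": 157, "negative": 225, "neutral": 110},
    "neutral":  {"positive": 100, "negative": 73,  "neutral": 55},
}
p, r, f1 = precision_recall_f1(matrix, "positive")
print(round(p, 2), round(r, 2))   # 0.7 0.79
```

Running this for each class and each product yields exactly the per-class comparison described in the next paragraphs.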

Here is the Precision, Recall and F1-Score for Product A and Product B.
Product A achieves 70% precision in finding positive sentiment. This is calculated as 614 divided by 871 (see the confusion-matrix figures). This means that of the 871 reviews Product A identified as positive, 70% are correct (precision) and 30% are incorrect.

Product A achieves 79% recall in finding positive sentiment. This is calculated as 614 divided by 780. This means that of the 780 reviews Product A should have identified as positive, it identified 79% correctly (recall) and 21% ((79 + 87)/780) incorrectly.

It is desired to have both high precision and high recall to get a final high accuracy. F1 score considers both precision and recall and gives a single number to compare across products. Based on F1 score comparison, the following is arrived for the given dataset.

  • Product B is slightly better than Product A in finding positive sentiment.
  • Product B is better than Product A in finding negative sentiment.
  • Product B is slightly better than Product A in finding neutral sentiment.

Final accuracy can be calculated as the number of correct predictions divided by the total number of records.

To conclude, for the given dataset, Product B performs better than Product A. It is important to consider multiple datasets and take the average accuracy to determine each product’s final standing.

Author Credits: This article was written by Kalyan Nandi, Lead, Data Science at Big Data Analytics SBU, 8KMiles Software Services.

Azure Virtual Machine – Architecture

Microsoft Azure is built on Microsoft’s definition of commodity infrastructure. The most intriguing part of Azure is the cloud operating system at its heart. In Azure’s initial days it ran on a fork of Windows as its underlying platform, named the Red Dog operating system and Red Dog hypervisor; indeed, the project that became Azure was originally named Project Red Dog. David Cutler was the brain behind designing and developing the various Red Dog core components, and it was he who gave it this name. In his own words, the premise of Red Dog (RD) was being able to share a single compute node across several properties. This enables better utilization of compute resources and the flexibility to move capacity as properties are added, deleted, and need more or less compute power. This in turn drives down capital and operational expenses.

It was actually a custom version of Windows, and the driving reason for this customization was that Hyper-V at the time didn’t have the features Azure needed (particularly support for booting from VHD). If you try to understand the main components of the architecture, you can count four pillars:

  • Fabric Controller
  • Storage
  • Integrated Development Tools and Emulated Execution Environment
  • OS and Hypervisor

Those were the early days of Azure (around 2006). As the platform matured, running a fork of an OS proved not ideal in terms of cost and complexity, so the Azure team talked to the Windows team and efforts were made to use Windows itself. Windows eventually caught up, and Azure now runs on Windows.

Azure Fabric Controller
One component that has contributed immensely to Azure's success is the Fabric Controller. The Fabric Controller owns all the resources in the entire cloud and runs on a subset of nodes in a durable cluster. It manages the placement, provisioning, updating, patching, capacity, load balancing, and scale-out of nodes in the cloud, all without any operational intervention.

The Fabric Controller, which is still the backbone of Azure compute, is the kernel of the Microsoft Azure cloud operating system. It regulates the creation, provisioning, de-provisioning, and supervision of all virtual machines and their back-end physical servers. In other words, it provisions, stores, delivers, monitors, and commands the virtual machines (VMs) and physical servers that make up Azure. As an added benefit, it also detects and responds to both software and hardware failures automatically.

Patch Management
A common misconception about Microsoft's patch-management workflow is that it patches every node in place, just as we do in our own environments. Things in the cloud are a little different. Azure hosts are image-based (hosts boot from VHD) and follow image-based deployment, so instead of having individual patches delivered, Azure rolls out a new VHD of the host operating system. Rather than patching each host separately, Azure updates the image in one place and, because the update is orchestrated, can use that image to update the whole environment.

This offers a major advantage in host maintenance: the volume itself can be replaced, enabling quick rollback. Host updates roll out every few weeks (every 4-6 weeks), with an approach where updates are well tested before being rolled out broadly to the data centers. It is Microsoft's responsibility to ensure that each rollout is tested before the data-center servers are updated. To do so, implementation starts with a few Fabric Controller stamps, which serve as a pilot cluster; once that is through, the update is gradually pushed to the production (data center) hosts. The underlying technology behind this is the Update Domain (UD). When you create VMs and put them in an availability set, they are bucketed into update domains (you get 5 by default, with provisions to increase this to 20). All the VMs in the availability set are distributed equally among these UDs. Patching then takes place in batches, and Microsoft ensures that only a single update domain is patched at a time. You can call this a staged rollout. To understand this in more detail, let's see how the Fabric Controller manages partitioning.

Azure's Fabric Controller manages two types of partitions: Update Domains (UDs) and Fault Domains (FDs). Together they provide not only high availability but also infrastructure resiliency, empowering Azure to recover from failures and continue to function. It is not about avoiding failures, but about responding to failures in a way that avoids downtime or data loss.

Update Domain: An Update Domain is used to upgrade a service’s role instances in groups. Azure deploys service instances into multiple update domains. For an in-place update, the FC brings down all the instances in one update domain, updates them, and then restarts them before moving to the next update domain. This approach prevents the entire service from being unavailable during the update process.

Fault Domain: Fault Domain defines potential points of hardware or network failure. For any role with more than one instance, the FC ensures that the instances are distributed across multiple fault domains, in order to prevent isolated hardware failures from disrupting service. All exposure to server and cluster failure in Azure is governed by fault domains.
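As a rough illustration (not Azure's actual placement algorithm), round-robin assignment shows how an availability set's VMs spread across the default five update domains, so that a staged rollout only ever takes down one domain's VMs at a time:

```python
# Illustrative sketch: distribute an availability set's VMs evenly across
# update domains (Azure's default is 5, expandable to 20).

def assign_update_domains(vm_names, ud_count=5):
    """Round-robin VMs into update domains; returns {ud_index: [vm, ...]}."""
    placement = {ud: [] for ud in range(ud_count)}
    for i, vm in enumerate(vm_names):
        placement[i % ud_count].append(vm)
    return placement

vms = [f"vm{i}" for i in range(7)]
layout = assign_update_domains(vms)
# Patching UD 0 takes down only vm0 and vm5; the other five VMs keep running.
print(layout[0])  # ['vm0', 'vm5']
```

Fault domains work analogously for hardware failure: instances are spread so that no single rack or power unit hosts them all.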

Azure Compute Stamp
In Azure, resources are divided into stamps, where each stamp has one Fabric Controller, and this Fabric Controller is responsible for managing the VMs inside that stamp. There are only two types of stamp: compute stamps and storage stamps. The Fabric Controller is also not a single instance; it is distributed. Based on the available information, Azure runs five replicas of the Fabric Controller and uses a synchronous mechanism to replicate state between them. In this setup, one replica is the primary, and the control plane talks to it. It is the responsibility of this primary to act on each instruction (for example, provisioning a VM) and to let the other replicas know about it. Only when at least three of them acknowledge that the operation is going to happen does the operation take place (this is called a quorum-based approach).
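The quorum rule just described, where an operation proceeds once a majority (3 of 5) of the replicas acknowledge it, can be sketched as a one-line check (names here are illustrative, not Azure internals):

```python
# Simplified sketch of the quorum-based approach: the primary commits an
# operation only after a majority of the five replicas acknowledge it.

def commit_with_quorum(acks, replica_count=5):
    """Return True if enough replicas acknowledged to form a majority."""
    quorum = replica_count // 2 + 1   # 3 of 5
    return acks >= quorum

print(commit_with_quorum(3))  # True: majority reached, operation proceeds
print(commit_with_quorum(2))  # False: insufficient acks, operation waits
```

The majority requirement is what lets the cluster survive the loss of up to two replicas without losing committed state.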

VM Availability
Talking about Azure Virtual Machines, three major components (compute, storage, and networking) constitute an Azure VM. When discussing Azure VM resiliency with customers, they typically assume it is comparable to their on-premises VM architecture and expect the same features in Azure. That is not the case, so I wanted to put this together to provide more clarity on the VM construct in Azure and to show how VM availability in Azure is typically more resilient than most on-premises configurations.
When we talk about a virtual machine in Azure, we must take two dependencies into consideration: Windows Azure Compute (to run the VMs) and Windows Azure Storage (to persist the state of those VMs). This means you don't have a single SLA; you actually have two, and they need to be aggregated, since a failure in either could render your service temporarily unavailable.
In this article, let's focus the discussion on the compute (VM) and storage components.

Azure Storage: You can check my other article, where I discuss in great detail how an Azure Storage stamp is a cluster of servers hosted in an Azure data center. These stamps follow a layered architecture with built-in redundancy to provide high availability. Multiple replicas (most often three) of each file, referred to as an extent, are maintained on different servers partitioned between update domains and fault domains. Each write operation is performed synchronously (for intra-stamp replication), and control is returned only after all three copies complete the write, making the write operation strongly consistent.
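A toy sketch of the synchronous intra-stamp replication just described, assuming a simplified in-memory replica set (the class and method names are illustrative, not Azure Storage's API):

```python
# Toy model: a write returns only after all three extent replicas have
# applied it, which is what makes the write strongly consistent.

class ExtentReplicaSet:
    def __init__(self, replica_count=3):
        # Each replica modeled as a simple key-value store.
        self.replicas = [dict() for _ in range(replica_count)]

    def write(self, key, value):
        # Synchronous replication: apply to every replica before returning.
        for replica in self.replicas:
            replica[key] = value
        # Control returns only once every copy holds the new value.
        return all(r.get(key) == value for r in self.replicas)

extents = ExtentReplicaSet()
print(extents.write("block-42", "data"))  # True: all 3 copies committed
```

Because no replica can lag behind an acknowledged write, a subsequent read from any replica sees the latest value.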

Virtual Machine:


Microsoft Azure has provided a means to detect health of virtual machines running on the platform and to perform auto-recovery of those virtual machines should they ever fail. This process of auto-recovery is referred to as “Service Healing”, as it is a means of “healing” your service instances. In this case, Virtual Machines and the Hypervisor physical hosts are monitored and managed by the Fabric Controller. The Fabric Controller has the ability to detect failures.

Detection works in two modes: reactive and proactive. If the FC detects a failure in reactive mode (missing heartbeats) or proactive mode (known situations leading to a failure) on a VM or a hypervisor host, it initiates recovery by redeploying the VM on a healthy host (the same host or another host), marks the failed resource as unhealthy, and removes it from rotation for further diagnosis. This process is also known as self-healing or auto-recovery.
The diagram above shows the different layers of the system where faults can occur and the health checks that Azure performs to detect them.

*The auto-recovery mechanism is enabled and available on virtual machines of all sizes and offerings, across all Azure regions and data centers.

Author Credits: This article was written by Utkarsh Pandey, Azure Solution Architect at 8KMiles Software Services and originally published here


Tale of how a Fortune 50 Giant got the right Identity Access Partner

As organizations constantly seek to expand their market reach and attract new business opportunities, identity management (specifically SSO, user provisioning, and management) has evolved as an enabling technology to mitigate risk and improve operational efficiency. As the shift to cloud-based services accelerates, identity management capabilities can be delivered as hosted services to drive more operational efficiency and improve business agility. A Fortune 50 giant designed and developed a cloud-based identity management solution and offered existing and prospective SaaS vendors the opportunity to integrate with its product, testing it and taking it live as a full-fledged single sign-on solution. 8KMiles, being a cloud-based identity services company, accepted the engagement to fulfill the client's requirement.
The company chose 8KMiles as its preferred partner because 8KMiles is a state-of-the-art solution provider that practices the Scrum methodology. 8KMiles never hesitated to take up ad-hoc requirements, thanks to its industry-specific team of experts offering 24/7 development support. 8KMiles pitched in to help the client by identifying their pain points and AS-IS scenarios, and later worked extensively with the company and its SaaS vendors to:
1. Establish formal Business Relationship with SaaS Vendors
2. Pre-qualify SaaS Vendor
3. Configure SaaS Application for Partner company on Identity Cloud Service SAML SSO integration, Test and Certify
4. Prepare IDP Metadata
5. Establish a stringent QA process
6. Complete Documentation
a. Conformance and interoperability test report
b. SAML SSO Technical documentation
c. A video explaining the steps involved in the integration
d. Provide metadata configuration and mapping attributes details
7. Build Monitoring Tool
8. Adopt Quality Assurance with 2 level Testings (Manual & Automation)
9. Configure, integrate, troubleshoot, monitor and produce reports using 8KMiles MISPTM tool.

Thus, 8KMiles enabled this Fortune 50 Biggie to attain the following business benefits:
• Refinement of user self-service functionalities
• Activation of users & groups and linking of SaaS applications to user accounts in the cloud
• Enablement of SSO to these SaaS Apps & enable user access via SAML2.0
• Usage of OAuth 2.0 to authorize changes to configuration.
• Adoption & Testing of different methods of SSO for the same SaaS App
• Documentation of the process in a simplistic manner
• Automation to test & report on all aspects of the integration without human involvement
For more information or details about our Cloud Identity access solution, please write to sales@8kmiles.com

Author Credit:  Ramprasshanth Vishwanathan, Senior Business Analyst- IAM SBU

LifeSciences Technology Trends to expect in 2017

Life Sciences industry dynamics are constantly changing, especially in handling ever-growing data, using modern cloud technology, implementing agile business models, and aligning with compliance standards. Here are some of the Life Sciences tech trends predicted for 2017.

1) Cloud to manage Ever-growing Data

The growing volume of data is one of the major concerns among Life Sciences players. There is a constant need to manage and optimize this vast data into actionable information in real time, and this is where cloud technology provides the agility required. Life Sciences companies will continue to shift to the cloud to address inefficiencies and to streamline and scale their operations.

2) Analytics to gain importance

Data is the key driver for any pharma or Life Sciences organization and will determine the way drugs are developed and brought to market. The data is generally distributed and fragmented across clinical trial systems, databases, research data, physician notes, hospital records, and more; analytics will aid to a great extent in analyzing, exploring, and curating this data to realize real business benefits from the data ecosystem. 2017 will see a rise in trends like risk analytics, product-failure analytics, drug-discovery analytics, predictive analytics for supply disruptions, and visualizations.

3) Lifesciences and HCPs will now go Digital for interactions

There was a time when online engagement was just a dream due to limitations in technology and regulation. Embracing digital channels opens up faster modes of communication among Life Sciences players, HCPs, and consumers. These engagements are not only easy and compliant but are integrated with applications to meet industry requirements. This will also help Life Sciences players reach more HCPs and meet customers' growing expectations for online interactions.

4) Regulatory Information Management will be the prime focus

When dealing with overseas markets, it is often critical to keep track of all regulatory information at various levels. Often, information on product registrations, submission content plans, health authority correspondence, source documents, and published dossiers is disconnected and not recorded in one centralized place. Programs that help align and streamline all regulatory activities will therefore gain momentum this year.

To conclude, Daniel Piekarz, Head of Healthcare and Life Sciences Practice, DataArt stated that, “New start-ups will explode into the healthcare industry with disruptive augmented reality products without the previous limitations of virtual reality. As this technology advances the everyday healthcare experience, it will exist on the line between the real world and virtual in what is being called mixed reality.” Thus 2017 will see a paradigm shift in the way technology will revolutionize Life Sciences players’ go-to market leading to early adopters of the above gaining the competitive edge and reaping business benefits as compared to laggards!

Identity Federation – 10 Best Practices from a User’s perspective

Federation is a concept that deals with the connection of two parties/providers: an Identity Provider (IDP) and a Service Provider (SP). One vets the credentials of the user, and the other provides a service to the user depending on the successful vetting of the credential by the first provider. While setting up these federations, the two parties can follow certain best practices that make the federation experience holistic for the user. This blog post explores and highlights these practices.

Let us start with the SP side, as this is where the user lands after a federation. The following are some of the best practices to be followed on the SP side.

  1. If the user has reached the SP for the first time, it is good practice (with the user's consent and due regard for their privacy) to store some identifying information (such as an immutable ID or email ID) of the user at the SP, so that the user's subsequent visits can be tied to it. This helps ensure the user gets a better service experience at the SP each time. If the intention of the federation is not to expose or tailor user-specific sites, then this need not be followed.
  2. The SP should be able to link the user to multiple applications protected by the SP, with the identifying information from the federated transaction, preferably immediately after federation time, in order to establish continuity of services that the particular user was offered last time they logged in to the SP applications and/or tailor the application’s preferences to the federated user’s profile.
  3. Wherever possible, it is better to use local or remote provisioning of the user at the SP. Critical aspects like security, privacy, and the organization's policy on handling external users and their attributes dictate which type of provisioning is best. This provisioning process helps speed up the user experience at the SP application and assists in giving better service to the same returning user.
  4. Sending the right assertion parameters to the downstream application.

This is critical, as some of the vital information such as role information, auxiliary user attributes, preferences that the application requires need to be passed on appropriately to the application.  The application might be making important decisions based on these parameters in order to address the user’s needs correctly.

  5. Redirect the user to appropriate URLs at the Service Provider in both the "user success" and "user failure" cases. Failure could occur for the following reasons:

a) User not having the right role, privilege or permission to access the site or part of the site, as the assertion did not have them

b) User got authenticated correctly at the IDP, but IDP failed to send the right assertion to the SP

c) Failure of user disambiguation process at the SP

d) User unable to be linked to the right accounts at the SP

In each case, if the failure URL gives an appropriate error message, the user will know exactly why they could not access the resource. Ticketing software can then help the user raise a ticket for the failed transaction and obtain a solution from the SP.
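One way an SP could surface the failure reasons (a) through (d) above is to map each to a meaningful message on the failure URL. This is a hedged sketch with hypothetical reason codes and wording; the article does not prescribe any particular format:

```python
# Map the SP-side federation failure causes to user-facing messages,
# so the failure URL can tell the user exactly why access was denied.
# Reason codes and messages below are illustrative.

FAILURE_MESSAGES = {
    "missing_role": "Your account lacks the role or permission required "
                    "for this resource; the assertion did not include it.",
    "bad_assertion": "You authenticated at the IDP, but the IDP failed to "
                     "send a valid assertion to this service.",
    "disambiguation_failed": "We could not uniquely match your federated "
                             "identity to a local account.",
    "account_link_failed": "Your identity could not be linked to the "
                           "correct local accounts.",
}

def failure_redirect(reason):
    """Build the failure-URL path and a meaningful message for the user."""
    message = FAILURE_MESSAGES.get(reason, "Unknown federation error.")
    return f"/federation/error?reason={reason}", message

url, msg = failure_redirect("bad_assertion")
print(url)  # /federation/error?reason=bad_assertion
```

A ticketing link on the same error page can then pre-fill the reason code, giving support staff the context they need.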

Let us now focus on the IDP side, as this is where the user usually authenticates in order to reach an SP application in a federation.   The following are some of the best practices to be followed on the IDP side:

  1. The most important thing is for the IDP to display a meaningful error to the user if and when authentication fails at the IDP. This makes it easier for the user to know whether a credential issue, network issue, domain issue, or some other issue caused the authentication to fail.
  2. The IDP should mention to the user (either in their website or application) what the supported types of credentials allowed for authentication are. This could vary from userid/password to X509 Certificates, smartcards, OTPs or other hardware/software tokens.   The user interface should appropriately lead the user to the right type of authentication, using the right type of credentials, depending on the type of service he/she wishes to get from the IDP.
  3. The IDP should be able to issue assertions to the SP that contain details like the level of assurance at which the user credential was accepted and, if applicable, other user attributes like role and user preferences. This is in addition to the primary subject information that the IDP is contracted to send to the SP, agreed during the initial metadata exchange. These extra attributes help the SP and its applications tailor their user preferences.
  4. If the IDP supports a particular profile and service, it should support all the standard options/features linked with those profiles/services. Otherwise, users should be told what is supported, to ensure they are not misled into assuming that all related options/features are available. For example:

a) If the IDP is supporting IDP-initiated Browser Post Profile, then it would be better if it supports IDP initiated Single Logout, Common Name ID Formats linked with the Browser Post Profile, Signing and Encryption of Assertions, Protocol Response in POST Formats etc.

b) if the IDP is supporting SP-initiated Browser Post Profile, then it would be better if it supports IDP or SP initiated Single Logout, Common NameID Formats, Signing and Encryption of Assertions, Protocol Response in POST Formats, Relay State, Accept AuthenRequest in GET and POST Formats, support “allow create” of new IDs if an ID is not already present for a federation transaction etc.

c) if the IDP is supporting multiple Protocols and features such as delegated authentication, redirection to other IDPs, etc., it should clearly mention the protocols and the corresponding profiles, features supported in each of the IDP supported website/application.

d) If exclusively a particular feature is not followed or supported by the IDP, it should be clearly mentioned by the IDP to its users.

All the above should be provided in layman's terms, so that the user can understand which features are supported and which are not.

  5. The IDP should clearly state the conditions associated with the privacy clauses/rules/protections applied to user credentials/identities and their secure transport. This keeps the user informed about how their credentials will be used and highlights the protection measures that make the federated transaction secure.


Author Bio:
Raj Srinivas, the author of this blog, is an IAM and Security Architect with 8K Miles. His passion is analyzing the problems enterprises face in the IAM & Cloud Security domain, across verticals that include Banking, Insurance, HealthCare, Government, Finance & Mortgage, and providing in-depth solutions that have far-reaching effects for the enterprise.

SaaS Data Security More Critical Now Than Ever Before in Healthcare

If, as a healthcare payer or provider, you are using Software-as-a-Service (SaaS) solutions to provide better service to your patients and customers, data security may be as critical to you as your business itself. The healthcare industry has shifted to cloud-based solutions to maintain electronic Protected Health Information (ePHI), and given the sensitivity of that information, data security has become more important now than ever before.

In order to keep pace with growing demand, the healthcare industry has faced pressure to provide faster, better, and more accessible care by adopting new technologies while complying with industry mandates like the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health (HITECH) Act.

Why Healthcare needs Data Security in SaaS applications?

The astonishing number of data breaches and attacks on healthcare data has forced the organizations involved to look for stronger methods of data security at every level, be it the physical level or the application level.

According to a recent study by Symantec Corporation, approximately 39 percent of breaches in 2015 occurred in the health services sector. The same report found that ransomware and tax fraud rose as increasingly sophisticated attack tactics were adopted by organized criminals with extensive resources. These criminals run professional operations and adopt best business practices to exploit the loopholes in ePHI security: they first identify the vulnerabilities and then exploit the weaknesses of unsecured systems. Stolen health records are then sold on the black market for ten times the value of a stolen credit card.

Kevin Haley, director of Symantec Security Response, said: “Advanced criminal attack groups now echo the skill sets of nation-state attackers. They have extensive resources and a highly-skilled technical staff that operate with such efficiency that they maintain normal business hours and even take the weekends and holidays off.”

Loopholes in Healthcare Data Security

Public cloud services are cost-efficient because the infrastructure often involves shared multitenant environments, whereby consumers share components and resources with other consumers often unknown to them. However, this model has many associated risks. It gives one consumer a chance to access the data of another and there is even a possibility that data could be co-mingled.

Cloud services allow data to be stored in many locations as part of Business Continuity Plan (BCP). It can be beneficial in case of an emergency such as a power outage, fire, system failure or natural disaster. If data is made redundant or backed up in several locations, it can provide reassurance that critical business operations will not be interrupted.

However, consumers that do not know where their data resides lose control of ePHI at another level. Knowing where their data is located is essential for knowing which laws, rules and regulations must be complied with. Certain geographical locations might expose ePHI to international laws that change who has access to data in contradiction to HIPAA and HITECH laws.

Many employees use their smartphones that do not have the capability to send and receive encrypted email. So, while answering emails at home from their phone, employees may be putting sensitive data at risk.

Bring Your Own Device (BYOD) policies also put data at risk if devices are lost or stolen. Logging on to insecure internet connections can also put business and patient information at risk. Storing sensitive data on unsecured local devices like laptops, tablets or hard drives can also expose unencrypted information at the source.


It is obvious from such startling statistics that large numbers of data breaches and cyber-attacks can occur only if the applications and data storage are not secure. All the employees involved should be given unique usernames and passwords and must be trained on how to keep login credentials secure, in addition to training sessions on the Privacy and Security Rules.

Transferring data to the cloud comes with various issues that complicate HIPAA compliance for covered entities, Business Associates (BAs), and cloud providers such as control, access, availability, shared multitenant environments, incident readiness and response, and data protection. Although storage of ePHI in the cloud has many benefits, consumers and cloud providers must be aware of how each of these issues affects HIPAA and HITECH compliance.

The need of the hour is for all the parties involved to come together and take responsibility for data security at their end and beyond.

It is better to invest in securing SaaS applications and medical data instead of paying huge fines which could be in millions of dollars!

Related Posts :-

Steps to HIPAA Compliance for Cloud-Based Systems

Why Healthcare Organizations Need to Turn to Cloud

Steps to HIPAA Compliance for Cloud-Based Systems

The rapid growth of cloud computing has also led to a rapid growth in concerns pertaining to security and privacy in cloud-based infrastructure. Such fears create a strong need for healthcare organizations to understand and implement cloud computing while remaining compliant with the Health Insurance Portability and Accountability Act (HIPAA).

The benefits offered by cloud-based technology are too good to let go. The agility and flexibility gained by utilizing public, private, and hybrid clouds are quite compelling. What we need are cloud-based environments that provide secure, HIPAA-compliant solutions.

But how do you achieve HIPAA compliance in the cloud?


Image Source: Mednautix

Follow the steps below to better understand how to ensure HIPAA compliance and reduce your risk of a breach.

1.      Create a Privacy Policy

Create a comprehensive privacy policy and make sure your employees are aware of it.

2.      Conduct trainings

Having a privacy policy in place isn't enough; you must also make sure it is implemented. Employees must be given all required training during the onboarding process, and you should require this training for all third-party vendors as well. Develop online refresher courses in HIPAA security protocols and make it mandatory for all employees and vendors to take them at regular intervals.

3.      Quality Assurance Procedure

Make sure all the quality assurance standards are met and are HIPAA compliant. Conduct surprise drills to find out loopholes, if any.

4.      Regular audits

Perform regular risk assessment programs to check the probability of HIPAA protocol breach and evaluate potential damage in terms of legal, financial and reputational effects on your business. Document the results of your internal audits and changes that need to be made to your policies and procedures. Based on your internal audit results, review audit procedure and update with necessary changes.

5.      Breach Notification SOP

Create a standard operating procedure (SOP) document mentioning details about what steps should be taken in order to avoid a protocol breach. Mention steps to be followed in case a patient data breach occurs.

Most often you will have a cloud service provider who takes care of a wide range of requirements, from finding resources and developing and hosting apps to maintaining the cloud-based infrastructure. While the primary responsibility for HIPAA compliance falls on the healthcare company, compliance requirements can extend to cloud service providers as “business associates”.

Are your cloud service providers HIPAA business associates?

Figuring out whether your cloud service provider can be considered a HIPAA business associate can be tough, and the answer may vary depending on the type of cloud usage. Given that the cloud provider is an active participant, it must also adhere to security requirements such as encryption, integrity controls, transmission protections, monitoring, management, employee screening, and physical security.

Investing in HIPAA compliance procedures can save you from many hassles. Follow these steps and minimize your risk of being found noncompliant.

Ransomware on the Rise: What You Can Do To Protect Your Organisation From The Attack

Ransomware is malicious software used by cyber criminals to hold your computer files or data hostage and demand a payment from you to release the data. It is a popular method used by malware authors to extract money from organisations and individuals. Different ransomware varieties use different ways to get onto a person's computer, but the most common techniques are installing software or using social-engineering tactics, like displaying fake messages from a law-enforcement department, to attack a victim's computer. The criminals do not restore access until the ransom is paid.

Ransomware is very scary because the files, once encrypted, are almost beyond repair. But you can survive an attack if you have prepared your system. Here are a few measures that will help you protect your organisation.

Data Backup

To defeat ransomware, it is important to back up your data regularly. Once attacked, you will lose access to all your documents; but if you can clean your machine and restore your system and lost documents from backup, you need not worry. So back up your files to an external hard drive or a backup service; then you can turn off your computer and start over with a fresh setup after an attack.
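A minimal sketch of this backup habit in Python; the function name and directory layout are assumptions for illustration, and a real setup would use a dedicated backup tool with versioned, offline copies:

```python
# Illustrative sketch: mirror a working directory to an external backup
# location so a cleaned machine can be restored after an attack.

import pathlib
import shutil

def backup(source_dir, backup_dir):
    """Mirror source_dir into backup_dir, replacing any previous copy."""
    source = pathlib.Path(source_dir)
    dest = pathlib.Path(backup_dir) / source.name
    if dest.exists():
        shutil.rmtree(dest)          # drop the stale copy
    shutil.copytree(source, dest)    # write a full fresh mirror
    return dest
```

Run on a schedule, this keeps an up-to-date copy on a drive you can disconnect; the key point is that the backup lives outside the machine that might get encrypted.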

Use Reputable Security Precaution

Using both antivirus software and a firewall will help protect you. It is critical to keep the software up to date and maintain a strong firewall; otherwise hackers can easily get in through security holes. Also, purchase antivirus software from a reputable company, because a lot of fake security software exists.

Ransomware Awareness Training

It is important to be aware of cyber-security issues and to be properly trained to identify phishing attempts. Creating awareness among staff helps them take action and deal with ransomware. As the methods used by hackers constantly change, it is necessary to keep your users up to date. It is tough for untrained users to question the origin of a well-crafted phishing email, so providing security training to staff is the best way to prevent malware infection through social engineering.

Disconnect from Internet

If you are suspicious about a file or receive a ransom note, immediately stop communicating with the server. By disconnecting from the internet you might lessen the damage, as it takes some time for ransomware to encrypt all your files. This isn't foolproof, but disconnecting from the internet is better than nothing, and you can always reinstall software if you have backed up your data.

Check File Extensions

Always display full file extensions; it makes suspicious files easier to spot. If possible, filter the files in your mail by extension; for example, you can reject mails sent with '.EXE' attachments. If you do need to exchange .EXE files within your organisation, it is better to use password-protected ZIP files.
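A simple filter along these lines can flag risky attachment names, including double-extension tricks such as 'invoice.pdf.exe'; the blocked list below is illustrative, not exhaustive.

```python
# Extensions commonly abused by ransomware droppers (illustrative list).
BLOCKED = {".exe", ".scr", ".js", ".vbs", ".bat"}

def is_suspicious(filename: str) -> bool:
    """Flag attachments whose final extension is blocked, which also
    catches double-extension tricks like 'invoice.pdf.exe'."""
    name = filename.lower().strip()
    return any(name.endswith(ext) for ext in BLOCKED)

# is_suspicious("report.pdf")      -> False
# is_suspicious("invoice.pdf.exe") -> True
```

Checking only the final extension is deliberate: it is the part the operating system actually uses to decide whether a file is executable.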

Exercise Caution, Warn Authorities, Never Pay

Avoid clicking links inside emails and visiting suspicious websites. If your PC falls under attack, it is better to use another computer to research the details. Also, report the attack to the FBI or your local cybercrime authority. Finally, never pay the ransom: the criminals may simply continue to demand more from you and may never release your information. Taking precautions to protect your data and staying alert are the best ways to prevent a ransomware attack.

In reality, dealing with ransomware requires an effective backup plan, so you can protect your organisation from an attack.

Why Healthcare Organizations Need to Turn to Cloud

It is important for every healthcare organization to develop an effective IT roadmap in order to provide the best services to customers and patients. Most healthcare payers and providers are moving to cloud-based IT infrastructure in order to realize benefits that were once considered unimaginable.

But, before moving ahead, let’s check out some industry statistics and research studies.

Healthcare Organizations and Cloud Computing Statistics

Source: Dell GTAI

According to Dell's Global Technology Adoption Index 2015, adoption of cloud technology in healthcare increased from 25% in 2014 to 41% in 2015.

Spending on cloud computing in global healthcare (in simpler terms, hosted medical services) was $4.2bn in 2014, and it is projected to grow by about 20% every year until 2020, reaching $12.6bn.

North America is the biggest consumer of cloud computing services and by 2020 its spending on cloud based solutions will reach $5.7bn.
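As a quick sanity check on these figures, compounding $4.2bn at 20% per year over the six years to 2020 lands very close to the projected $12.6bn:

```python
base = 4.2    # global healthcare cloud spend, $bn
rate = 0.20   # projected annual growth rate
years = 6     # years of compounding to 2020

projected = base * (1 + rate) ** years
print(round(projected, 1))  # -> 12.5, in line with the $12.6bn projection
```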

What kind of data can be moved to Cloud?

Critical healthcare applications can be hosted on a cloud platform in order to increase their accessibility and availability. Apart from these, the hardware, software and data mentioned below can also be moved to the cloud.

  • Email
  • Electronic Protected Health Information (ePHI)
  • Picture archiving and communication systems
  • Pharmacy information systems
  • Radiology information systems
  • Laboratory information systems
  • Disaster recovery systems
  • Databases & Back up data

Why Healthcare Organizations should move to Cloud?

1.      Low Cost

Healthcare organizations can reduce IT costs to a significant extent by moving to the cloud. Cloud-based software requires fewer resources for development and testing, which means fewer resources for maintenance and more robust solutions at a lower cost. It is believed that over a period of 10 years, cloud-based applications cost about 50% less than traditional in-house hosted applications.

2.      More Accessibility

It is important that healthcare data is available to doctors as quickly as possible, so that they can diagnose and analyze a patient's condition promptly and take the right steps to improve it. Cloud computing improves performance for users in remote locations as well, without having to build out additional data centers.

3.      Higher Flexibility

A cloud-based platform allows organizations to scale up or down based on their needs. With conventional on-premise hosted solutions, it can be tough to align physical infrastructure quickly with varying demand. Migrating to the cloud helps deploy scalable IT infrastructure that adjusts itself to requirements, making sure that resources are always available when required.

4.      Improved Efficiency

Moving to the cloud also helps avoid spending money on infrastructure that ends up under-utilized. With easy access to a wide range of data, businesses can gather valuable insights about the performance of their systems and plan future strategy accordingly. Pharmaceutical companies, hospitals and doctors can focus on their core objective (giving the best possible treatment and service to patients) while the cloud service providers take care of their IT needs.

5.      More Reliability

Cloud-based software remains available 24/7 from anywhere to any authorized person with an internet connection. Apart from that, its distributed architecture makes it easier to recover from losses due to natural disasters.


The cloud's resiliency and high availability make it a cost-effective alternative to on-site hosted solutions. However, security has been a major barrier to cloud adoption in many verticals. It is especially critical in the healthcare industry, which is regulated by the HIPAA and HITECH Acts, and it plays a major role in such organizations' decisions to move their data into a public cloud.

7 Tips to Save Costs in Azure Cloud

Cloud computing comes with myriad benefits through its various as-a-service models, and hence most businesses consider it wise to move their IT infrastructure to the cloud. However, many IT admins worry that hidden costs will raise their department's total cost of ownership.

We believe that it is more about estimating your requirements correctly and managing resources in the right way.

Microsoft Azure Pricing

Microsoft Azure allows you to quickly deploy infrastructure and services to meet all of your business needs. You can run Windows- and Linux-based applications in 22 Azure data center regions, delivered with enterprise-grade SLAs. Azure services come with:

  • No upfront costs
  • No termination fees
  • Pay only for what you use
  • Per minute billing

You can calculate your expected monthly bill using the Pricing Calculator and track your actual account usage and bill at any time using the billing portal.

How to save cost on Azure Cloud?

  1. Azure allows you to set a monthly spending limit on your account. So, if you forget to turn off your VMs, your Azure account will be disabled before you run over your predefined monthly spending limit. You can also set email billing alerts that trigger if your spend goes above a preconfigured amount.
  2. It is not enough to shut down VMs from within the instance to avoid being billed, because Azure continues to reserve the compute resources for the VM, including a reserved public IP. Unless you need VMs to be up and running all the time, shut down and deallocate them to save costs. This can be done from the Azure Management portal or Windows PowerShell.
  3. Delete unused VPN gateways and application gateways, as they are charged whether they run inside a virtual network or connect to other virtual networks in Azure. Your account is charged based on the time a gateway is provisioned and available.
  4. To avoid reserved IP address charges, keep at least one VM up and running at all times; the first five reserved public IPs in use are included at no extra charge. If you shut down all the VMs in a service, Microsoft is likely to reassign that IP to some other customer's cloud service, which can hamper your business.
  5. Minimize the number of compute hours by using auto scaling. Auto scaling reduces total compute hours, and hence cost, by scaling the number of nodes on Azure up or down based on demand.
  6. When an end-user's PC makes a DNS query, recursive DNS servers run by enterprises and ISPs cache the DNS responses. These cached responses don't incur charges, as they never reach the Traffic Manager name servers. The caching duration is determined by the "TTL" parameter in the original DNS response. A larger TTL value reduces DNS query charges but results in longer end-user failover times, while a shorter TTL value reduces caching, resulting in more queries against the Traffic Manager name servers. Hence, configure the TTL in Traffic Manager based on your business needs.
  7. Blob storage offers a cost-effective solution for storing graphics data. For 2 GB, Table and Queue storage costs $0.14/month, while block blob storage costs just $0.05/month.
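The TTL trade-off in tip 6 can be roughly quantified: each recursive resolver re-queries Traffic Manager about once per TTL window, so billable query volume scales inversely with TTL. The resolver count below is hypothetical, chosen only to illustrate the scaling.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month

def monthly_queries(num_resolvers: int, ttl_seconds: int) -> int:
    """Rough upper bound on billable DNS queries per month:
    each recursive resolver re-queries about once per TTL window."""
    return num_resolvers * SECONDS_PER_MONTH // ttl_seconds

# Hypothetical fleet of 1,000 recursive resolvers:
# monthly_queries(1000, 30)  -> 86,400,000 queries with a 30 s TTL
# monthly_queries(1000, 300) -> 8,640,000 queries with a 300 s TTL
```

A 10x larger TTL cuts billable queries by roughly 10x, at the price of end-users taking up to that much longer to see a failover.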

SQL Database

A SQL Database of similar capacity will cost $4.98/month. Hence, use blob storage to store images, videos and text files instead of storing them in a SQL Database.

To reduce cost and increase performance, put large items in blob storage and store the blob record key in the SQL database.
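A back-of-the-envelope comparison using the per-2 GB prices quoted above shows how quickly this blob-first pattern pays off; the 100 GB figure is just an example workload, and real pricing varies by tier and region.

```python
BLOB_PER_2GB = 0.05  # block blob price quoted above, $/month per 2 GB
SQL_PER_2GB = 4.98   # SQL Database of similar capacity, $/month per 2 GB

def monthly_saving(total_gb: float) -> float:
    """Saving from keeping bulk media in blob storage instead of SQL,
    extrapolating the quoted per-2 GB prices linearly."""
    units = total_gb / 2
    return round(units * (SQL_PER_2GB - BLOB_PER_2GB), 2)

# e.g. monthly_saving(100) -> 246.5 for 100 GB of images and videos
```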

The tips above will definitely help you cut costs on Azure and leverage the power of cloud computing to the fullest!