From the contents of emails to intellectual property, business plans, proprietary training documentation, and much more, most enterprises manage vast amounts of unstructured data containing valuable and sensitive information. The sheer volume of unstructured data created and managed by most enterprises can be enough to drive up storage costs substantially. In addition to managing unstructured data, understanding which unstructured data is sensitive and protecting it is a crucial concern for the modern enterprise.
But protecting this data can be challenging. By its nature, unstructured data is difficult to locate within the enterprise network, to protect from unauthorized access, and to keep from exiting the secure company environment.
To gain some insight into the most effective tactics and strategies that today's security leaders turn to when it comes to protecting unstructured data, we asked a panel of security pros and other business leaders to answer this question:
"What's the most important tip for companies looking to protect unstructured data?"
Find out what our experts suggest for better protecting your company's sensitive unstructured data by reading their responses below.
Meet Our Panel of Security Professionals:
Thomas Fischer is global security advocate at Digital Guardian, based out of our EMEA headquarters in London. In addition to his role at Digital Guardian, Thomas is director of the BSides London conference.
“Unstructured data” typically refers to…
Data that is in human readable format. Unlike structured data, which is typically programmatically correct and machine readable (e.g. a database), human created data like email, text, spreadsheets, video, pictures, etc. is referred to as unstructured and probably comprises up to 80% of data generated today.
A critical characteristic of unstructured data is that it will contain a company’s sensitive data or IP and probably represents, in some form or another, the sum total of all the knowledge that has been created or collected in your organisation. If this data were leaked, it could be detrimental to the company’s image or could even result in heavy fines if you are subject to legislation like GDPR.
Unstructured data needs to be handled securely and this should be addressed in your data protection policies. “Too big to handle” is no longer a valid excuse to not manage this data.
Unstructured data protection starts with asking and answering some key questions like:
- Who’s been viewing or collecting these files? (e.g. employees, customers, patients, students, etc.)
- Who’s been modifying these files?
- Who currently has access to sensitive unstructured data, what access controls are in place, and how do you control super user rights?
In order to best answer these questions you’ll need to first understand and identify what kind of sensitive information exists in unstructured data and where it is residing. A key element in this would be to scan your data stores and even endpoints for key company terminologies (e.g. product IDs, product descriptions, keywords in documents) or perhaps personal data like names, DoB, addresses, biometric data, etc.
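As a rough illustration of the kind of scan described above, here is a minimal Python sketch. The patterns (an email address, a "DoB:" date, and a `PRD-12345` style product ID) are hypothetical placeholders, not a recommended rule set; a real scan would use terms and formats specific to your organisation and would also need to handle binary formats such as office documents.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration; tune these to your own data.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "date_of_birth": re.compile(r"\bDoB[:\s]+\d{4}-\d{2}-\d{2}\b", re.IGNORECASE),
    "product_id": re.compile(r"\bPRD-\d{5}\b"),  # assumed internal ID format
}


def scan_text(text):
    """Return {pattern name: list of matches} for patterns found in text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits


def scan_tree(root):
    """Scan every readable file under root; return {path: hits} for files with hits."""
    report = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        hits = scan_text(text)
        if hits:
            report[str(path)] = hits
    return report
```

Calling `scan_tree` on a file share or a user's documents folder returns only the files that contain at least one match, which gives a first map of where sensitive content resides.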
This understanding will help you target the controls to put into place based on the nature of the assets and their criticality. Once the unstructured data is understood, you can start to put controls into place to monitor and control who modifies the files, copies them, or even changes their access permissions (e.g. an HR document suddenly becomes available to all staff).
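One way to picture the "HR document suddenly becomes available to all staff" case is to compare two snapshots of file permissions. The sketch below assumes POSIX-style permission bits, where the "other" read bit stands in for "all staff"; it does not inspect Windows ACLs or share-level permissions, and the function names are illustrative only.

```python
import os
import stat


def snapshot_modes(root):
    """Record the permission bits of every file under root."""
    modes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                modes[path] = stat.S_IMODE(os.stat(path).st_mode)
            except OSError:
                continue  # file vanished or is inaccessible; skip it
    return modes


def newly_world_readable(before, after):
    """Return paths readable by 'other' in after but not in before.

    Files that are new in after and world-readable are flagged as well.
    """
    flagged = []
    for path, mode in after.items():
        was_open = before.get(path, 0) & stat.S_IROTH
        is_open = mode & stat.S_IROTH
        if is_open and not was_open:
            flagged.append(path)
    return sorted(flagged)
```

Taking a snapshot on a schedule and comparing it with the previous one yields a short list of files whose exposure has widened and that deserve a closer look.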
Jake Frazier is a senior managing director at FTI Consulting, based in Houston. He heads the information governance and compliance practice in the technology segment. Frazier assists legal, records, information technology and information security departments to identify, develop, evaluate and implement in-house electronic discovery and information governance processes, programs and solutions.
"One of the most meaningful and proactive ways to address protecting unstructured (and structured) data is through..."
Information governance. While information governance is a broad-reaching concept, a key initiative under it is the differentiation of data types across the organization and implementation of stronger security protocols around the corporation’s most sensitive data, or crown jewels. As we’ve learned from the long list of publicized data breaches, there is an increasing need for companies to get smarter about locating, organizing and securing their truly sensitive data.
A company's crown jewels can be separated into several categories: data that must be preserved for legal or regulatory obligations (e.g. legal holds), valuable data assets (IP or customer lists) and data that must be secured to protect customers and employees (customer PII, employee information). When considering steps for securing critical information, organizations should look for solutions that protect against threats like hackers, but also safeguard data from those inside the organization. Key stakeholders from across legal, IT, records management, compliance and the C-suite should all be involved in the process, which can include the following important steps:
- Establishing a sophisticated, central repository for the crown jewels, with granular security including authentication, access tiers and controlled permissions
- Supporting sufficient storage and backup for the crown jewels database
- Enabling tracking for which employees are placing information in that repository and accessing data stored there
- Encryption of sensitive documents
- Implementing Transport Layer Security (TLS), the successor to the Secure Sockets Layer (SSL) protocol, which manages authentication and encrypted communication between parties on a network
- Using security information and event management (SIEM) tools to analyze security activity in real-time
- Password protecting devices and keeping passwords protected and separate from encrypted documents
- Employing remote access to wipe and locate lost or stolen devices
- Training employees on policies, procedures and safeguards to ensure widespread adoption and enforcement of programs
Some of the same techniques that help organizations identify their crown jewels can also help find documents that no longer have any value and should be deleted. Valuable information should be stored under lock and key, while the junk should be tossed out.
Greg Kelley is the Chief Technology Officer at Vestige Digital Investigations.
"The most important tip for protecting unstructured data is..."
Knowing what data you have! Where are the sources of your unstructured data, how is it protected, who has access to it and how is it being monitored? The answers to these questions are quite often incomplete without an unbiased outside source reviewing a company's IT infrastructure. At larger companies, the internal audit group can often handle this task. But most companies really need to consider outside advisors, be they outside counsel or consultants. The job of outside advisors is to tell the company something it doesn't know, and they are often good at that task. Too many times I have had clients turn to me post-breach with responses such as "we thought it was all encrypted," "we assumed there was limited access," or "we thought our logs went back six months, but they only go back six days."
Prashant is responsible for the business operations and product direction of RSA's Governance & Lifecycle products within the RSA SecurID Suite. Prashant has over two decades of experience with leading cross functional teams in Product Strategy, Design, Development, Quality Assurance, Training, Technical documentation & Customer Support.
"While it is natural to first address whether access policy and control objectives are being met with respect to unstructured data..."
The real challenge should be to ascertain ownership of unstructured data. Usually, we understand who owns the application and who, therefore, should make key access decisions. It's not as clear, however, when it comes to things like shared folders, the data within them and the associated metadata. Who owns the unstructured data ultimately? Its creator? Its highest contributor? Or, is it the person who accesses it the most? Having a seamless process for identifying business data owners and validating that ownership is absolutely necessary for effective governance. Ensuring unstructured data is protected the same way as applications and user identities helps to mitigate risks down to the next level.
Dr. Pierson is the Chief Security Officer & General Counsel for Viewpost. As a recognized cybersecurity and privacy expert, he serves on the DHS’s Data Privacy and Integrity Advisory Committee and Cybersecurity Subcommittee, is a Distinguished Fellow of the Ponemon Institute, and previously was the Chief Privacy Officer for the Royal Bank of Scotland (RBS).
"The most critical security function for companies is..."
Controlling access to unstructured data. This data may contain sensitive information or personally identifiable information (PII), and thus ensuring the right persons have access to the data for the stated business purpose is key. Securing access to this data starts with controlling where the data resides, the access points to the data, and the rights and permissions of the persons accessing the data. While companies no longer have clearly defined endpoints because of mobile devices, access on cell phones, and home access, securing unstructured data can still be achieved through sound identity and access management (IAM).
Bryan Seely is a world famous cyber security expert, ethical hacker, author and former U.S. Marine.
"The most important tip for protecting unstructured data is..."
To think about threats not just from outside an organization, but from inside as well. Good employees can turn into threats or become compromised by malware or other things, so instead of treating all threats as outside of the wall, organizations must use rights management, file sharing fundamentals, and other techniques to compartmentalize information so that data loss, theft or other bad things are minimized at all costs. There are a ton of solutions for this type of thing.
John Allen is Vice President for Information Security and Governance with PrimePay, LLC. He is an experienced professional with over 20 years of experience specializing in information security and enterprise infrastructure and operations for clients in the financial services and insurance industries.
"Email encryption is relevant to protecting unstructured data, but the most important aspect is..."
Data classification. Having a program in place to identify sensitive data and prevent it from being included in unstructured data stores is preferable to the alternative of treating all data as classified. When everything is considered classified, ease of access suffers and the importance of protecting specific data can be lost by system users.
Mihai is an avid tech reader and occasional writer. He is an IT consultant for Unigma, a cloud management company and for ComputerSupport, an IT support company providing managed IT services, cloud services and onsite support across the United States. His area of interest includes Amazon Web Services, Google Cloud, Microsoft Azure infrastructures, general tech and risk management.
"Protecting unstructured data is not an easy task, it must be approached with caution..."
First of all, I would implement policy-based access controls in order to prevent unauthorized access, and I would use encryption to protect data from the moment it is created, ensuring that it remains secure whenever it is moved from one database to another, archived for backup purposes, etc. Any company, big or small, should have a security plan that includes unstructured data protection. I know that businesses want to rapidly access and modify unstructured data and that this can be risky, but with a good security plan, many sensitive operations can still be done in a safe environment.
Swapnil Bhagwat is Senior Manager - design & digital media, implementing digital, design, and web strategies for the group companies. He is an MBA graduate with work experience in the US, UK and Europe. Swapnil has worked for more than a decade across a range of businesses for the global markets.
"It’s a well-known fact that sensitive information may reside in unstructured data in the form of office documents, emails, and other contents..."
This unstructured data may contain employee and customer information, operational data, and intellectual property. The data that a company handles should be classified into different data types. Once the basic list of data types is chalked out, representatives from all departments like IT, HR, legal, finance, and others should identify the data that they have forgotten about. The next step should be to map the data and formulate a handling policy that outlines how each classified data type should be managed. Companies should then ensure that the way data is actually handled conforms to the policy laid down.
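To make the idea of a handling policy concrete, here is a minimal Python sketch of a policy table and a conformance check. The data types, departments, and rules are invented for illustration and are not a recommended policy; the point is only that once data types are classified, each one can carry explicit, checkable handling rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRule:
    encrypt_at_rest: bool
    allowed_departments: frozenset


# Hypothetical policy table: one rule per classified data type.
POLICY = {
    "customer_info": HandlingRule(True, frozenset({"sales", "legal"})),
    "employee_info": HandlingRule(True, frozenset({"hr", "legal"})),
    "intellectual_property": HandlingRule(True, frozenset({"engineering", "legal"})),
    "operational": HandlingRule(False, frozenset({"it", "finance", "hr"})),
}


def violations(data_type, encrypted, department):
    """Return a list of policy violations for one stored item and its accessor."""
    rule = POLICY.get(data_type)
    if rule is None:
        # Data nobody classified is itself a finding.
        return [f"unclassified data type: {data_type}"]
    problems = []
    if rule.encrypt_at_rest and not encrypted:
        problems.append("must be encrypted at rest")
    if department not in rule.allowed_departments:
        problems.append(f"department '{department}' is not permitted")
    return problems
```

For example, `violations("employee_info", False, "sales")` reports both the missing encryption and the unauthorized department, while an unknown data type is reported as unclassified.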
Oliver Howe is the IT Director at Rocketseed. He is an IT professional with over 20 years of experience, primarily in online business communications.
"First of all, you need to identify where all of your unstructured data resides and then..."
Categorize each segment. Only then can policies be put in place to adequately cover every possible outcome.
Types of Unstructured Data:
- Personally identifiable - customer data, employee data
- Financial - banking details, transactions, invoices
- Documentation - created by current and ex-employees, contractors, affiliates
- Web pages
- Forum postings
- Skype chats
There is a trade off between protection and preventing users from being able to perform their jobs properly. Everything can be locked up and encrypted and held behind a corporate firewall, but not everyone will have the capability to easily access and return such items. This could lead to more data loss as people revert to personal portable storage devices for ease of use.
Jonathan Gossels is president of SystemExperts, an IT security and compliance consulting firm.
"Most companies are very good at protecting data that they know about and consider sensitive..."
They restrict access to the HR systems where compensation data is available. They put access controls and monitoring procedures on systems that store critical intellectual property like formulas or key financial analytics.
Typically, they have formal policies and associated technology deployments and procedures to protect sensitive data.
When someone downloads that data from a secure environment into an Excel spreadsheet or a thumb drive, all the controls are gone.
Technology can’t solve this – this is a human problem. It can only reasonably be addressed through appropriate use policies and extensive and ongoing user awareness training. Employees need to understand: DON’T TAKE SENSITIVE DATA OUT OF ITS CONTROLLED ENVIRONMENT!
Chris Carter is the CEO of Approyo, a leading global SAP technology data service provider. Mr. Carter has been in the Big Data industry and the SAP industry for almost 25 years. He has been nationally recognized by The American SAP Users Group, SAP, Hadoop World and more.
"Protecting unstructured data comes down to..."
The storage of the data and placing it in secure facilities or secured data reception to allow the data to flow back and forth between where it is stored and where it is used. In my world, we fully believe in securing the data host AND the link between it and the receptor where the data will be used/run.
Without security at each point, a single unsecured step in the chain becomes a point of failure that opens anyone to a breach.
Dirk Garner is a Principal Consultant with Garner Consulting and has over 25 years of first-hand experience in technology strategy, architecture, and engineering. Dirk Garner guides companies to become data-centric through understanding and embracing modern data integration concepts and accelerating time to value with advanced data integration technologies.
"It is far easier to secure unstructured data in 2017 than it was even a few years ago..."
There are products that integrate Kerberos security with LDAP and AD and make administration easy and straightforward. The real challenge is extending your data governance policies to cover unstructured data so as to meet everyone's requirements. If you don't currently have data access policies, you will need to establish them and that will require collaboration with several departments across your organization. All of the tried and proven data management strategies used to govern relational data can be helpful as guidelines here, but unstructured data brings unique challenges into the mix. Since the objective of storing all of the unstructured data is to find valuable insight, you must allow some degree of freedom without crossing any ethical or legal lines.
Ray Walsh is a cyber security analyst at BestVPN.com. He is interested in politics and, in particular, the subject of International Relations. He is an advocate for freedom of speech, equality and personal privacy. On a more personal level, he likes to stay active, loves snowboarding, enjoys seafood, and likes to listen to trap music.
"Unstructured files are all the files that hang around on individual machines..."
Those could be text messages, emails, photos, videos or audio recordings. There is a lot of crucial information spread across an entire firm’s machines that is likely to include workings and trade secrets of the business. That is why it is becoming more common for firms to consider solutions like granular protection with file, application, or full disk virtual machine-level encryption. Add to that good policy based controls and you have the ability to encrypt sensitive information as it is being created. As is always the case with cybersecurity, the implementation of encryption coupled with fluidity of access to data is the key to the solution.
Natacha is the PR Director at CareSkore.
"The most important tip for companies or individuals looking to protect their unstructured data is..."
Obtaining granular protection with file, application, or full disk virtual machine-level encryption; plus a centralized key and policy administration for unstructured data at rest. You can leverage analytics, big data, speech recognition and artificial intelligence to work with unstructured data.
Brian McNamara is the owner of Information Business System, LLC, working as an Information Technology specialist producing solid results and cost savings to business throughout the NJ area. His strengths are using companies' current technology more efficiently and in ways they have not considered to increase business productivity. With his 20+ years in the business world, he has the skills to break procedures into more manageable pieces which makes it easier to surmise where the inefficiencies and bottlenecks are.
"First, from a high-level perspective..."
The first step is finding unstructured data. Most people forget about this type of data. The next step is categorizing it as static or variable data: will the data change? Finally, there has to be someone (or a group) in charge of organizing the data. Yes, the data has to be organized in some fashion. Until this is done, it's usually residing in an old database, app, or huge folder. This is no different than a paper records filing and conversion project. There must be some type of order so that the data can be located in a future search. In fact, once you go through this exercise, you find that a lot of the data is no longer needed or wanted.
This data, organized or not, needs the SAME strategies of backup (including offsite) as all company data.
Bob Graham is the CEO of BlackRidge Technology. He is a seasoned executive with broad business and entrepreneurial experience in the technology and services industry, from Fortune 500 firms to start-ups. He has an extensive background in venture capital, engineering, marketing, sales, and operations, along with living and working in non-English speaking environments, including Asia, Europe, and the Americas. Bob also worked as a technology analyst on the subjects of storage, networking, servers, and services.
"The most important thing companies can do to protect unstructured data is to..."
Develop a network segmentation strategy, based on users' permissions and identities.
Kathy is a Marketing Manager at Tie National, LLC.
"47% of data breaches were caused by a malicious or criminal attack..."
(Source: Ponemon Institute, 2015 Cost of Data Breach Study). If your business no longer works with a certain vendor or an employee has parted ways, one of your first steps is to change all shared passwords and revoke their access to all file shares, whether they were set up as an administrator or as a guest. Prevention is one of the best methods of protection. Stay on top of who has been granted access, and you greatly reduce the risk posed to your unstructured data.
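The access-revocation step can be sketched in a few lines of Python. This assumes a simplified, hypothetical model in which each file share is a mapping of usernames to roles; real file-sharing platforms expose this through their own admin consoles or APIs, which are not shown here.

```python
def revoke_user(shares, user):
    """Remove user from every share's access list, whatever their role.

    shares maps a share name to {username: role}. Returns the names of the
    shares that were changed so the revocation can be logged and reviewed.
    """
    changed = []
    for share_name, members in shares.items():
        if user in members:
            del members[user]
            changed.append(share_name)
    return sorted(changed)
```

Running it a second time for the same user returns an empty list, which is a quick way to confirm that no access remains.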
Aaron Fuller is the principal and owner of Superior Data Strategies LLC. Located in Lansing, Michigan, Superior Data Strategies focuses on data warehousing, dimensional data marts, operational data stores, service-oriented architecture and data integration in a variety of industries.
"The most important thing that companies can do to protect their unstructured data is..."
To avoid treating your data lake as if it is a data dumping ground. To do this:
- Know what data you are bringing into your data lake before you acquire it.
- Segregate different sources and/or subject areas of unstructured data into data ponds that each have their own levels of security based on needs.
- Create processes that regularly search through your data ponds for sensitive/private data, such as social security numbers, medical records and credit card numbers, to be sure you don't accidentally violate security laws and policies by grabbing unstructured data.
- Finally, clean out unstructured data from the lake when you know it isn't needed anymore. The more cluttered the data lake becomes, the higher the chance that something bad could happen!
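The regular search for sensitive data mentioned in the list above could look something like the following Python sketch. It assumes US-style social security numbers written as `NNN-NN-NNNN` and card numbers of 13 to 19 digits, and it uses the Luhn checksum to discard digit runs that cannot be valid card numbers. It is a starting point under those assumptions, not a complete detector: medical records, for instance, would need entirely different rules.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_sensitive(text):
    """Return SSN-like strings and Luhn-valid card numbers found in text."""
    cards = [match for match in CARD_RE.findall(text) if luhn_valid(match)]
    return {"ssn": SSN_RE.findall(text), "card": cards}
```

The Luhn check matters because data lakes are full of long digit strings (order numbers, timestamps, tracking codes); filtering on the checksum cuts down on false alarms before anyone has to review the results.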