Enterprise Data is the data that institutions produce and process. The intentional or unintentional removal of this data from the institution, by whatever method, is called “Data Leakage.” The NSA, Sony, and HSBC leaks are well-known examples.
Enterprise Data falls into two categories: “Structured Data” and “Unstructured Data.” Data stored in databases, in tables with predetermined column types, is called Structured Data; everything else is Unstructured Data.
“Structured Data” is mainly accessed through an enterprise application interface (records are entered, modified, and read in the database via the application). “Unstructured Data” is data created directly by users, such as Word documents, Excel spreadsheets, text files, PDF documents, and emails.
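To make the distinction concrete, here is a toy sketch in Python; the table, column, and file names are made up for illustration only:

```python
import sqlite3

# Structured Data: rows in a table whose columns and types are fixed up front,
# reached through queries rather than opened as a file.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT, balance REAL)")
db.execute("INSERT INTO customers VALUES (1, 'Acme Ltd.', 1250.75)")
print(db.execute("SELECT name, balance FROM customers").fetchall())

# Unstructured Data: free-form content created by a user; nothing about its
# layout or meaning is predetermined.
with open("q3_summary.txt", "w") as f:
    f.write("Q3 draft: Acme Ltd. balance is roughly 1.2k, details attached.\n")
```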
Contrary to popular belief, most institutional data (around 70%) is “Unstructured Data,” and it grows exponentially. Also contrary to popular belief, much of this “Unstructured Data” contains confidential and valuable information. Top management and business units continuously pull reports from corporate applications. These reports are often dense and hard to interpret, so some users transfer them into an Excel sheet, where they produce more meaningful reports (pivot tables, charts, etc.), or they paste screenshots of these reports into the report files (PowerPoint, Word, or PDF) they create.
“Unstructured Data” does not stay put. Due to the nature of the work, it is always in motion: it is sent to someone by mail, copied to a cloud storage area, taken out of the company on mobile devices, memory sticks, and notebooks. As a result, data that is relatively safe in corporate storage areas is exposed to all kinds of attacks in mobile and cloud environments.
Every institution takes a series of measures to prevent its private information from leaking out and believes these measures will be enough to protect it. While taking these measures, it makes a number of preliminary assumptions. One of the most critical assumptions is that the attack will come from outside and that corporate data will be stolen if that attack succeeds, so it makes a heavy firewall investment to eliminate this possibility. Another assumption is that all critical data is stored in databases, so serious measures are taken to secure those as well.
The real problem starts at this point, because most data leaks come from the inside, caused by malicious software or by malicious or careless users. Firewall and database security are, of course, very important and cannot be neglected, but they are not enough; additional measures must be taken.
- The first of these is “Data Classification.”
Not all data requires the same level of privacy. Some data, such as marketing material, can easily be shared with people outside the organization; some should be shared only with the institution’s employees; and some should be shared only with specific groups or individuals. The prerequisite for enforcing this is correct classification of the data. We can classify data in two ways:
The first is to automatically scan all storage areas with a system driven by predefined policies, and to find and automatically classify the files that match those policies. This method has a high error rate, so a competent person must review the results and verify their accuracy.
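The sketch below shows the idea of policy-driven scanning, assuming simple regex-based policies and hypothetical labels; real classification products use far richer detectors (fingerprints, machine learning, metadata) and many more file formats:

```python
import re
from pathlib import Path

# Hypothetical policies: each classification label maps to content patterns.
POLICIES = {
    "Confidential": [re.compile(r"\b\d{16}\b"),       # a bare card-like number
                     re.compile(r"salary", re.I)],
    "Internal":     [re.compile(r"project plan", re.I)],
}

def classify_file(path: Path) -> str:
    """Return the first matching label, or 'Public' if no policy matches."""
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        return "Unreadable"
    for label, patterns in POLICIES.items():
        if any(p.search(text) for p in patterns):
            return label
    return "Public"

def scan(root: str) -> dict:
    """Walk a storage area and propose a label for every file found."""
    return {str(p): classify_file(p) for p in Path(root).rglob("*") if p.is_file()}

if __name__ == "__main__":
    # The output is only a proposal; a reviewer should confirm each label.
    for file, label in scan("./shared-drive").items():
        print(f"{label:12} {file}")
```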
The second method is to have the data owner classify the data at the moment it is created. This method has a higher probability of success, and applying mandatory, predetermined policies increases that probability even further.
The main problem with this method is that users tire of classifying and start labeling all data at the lowest or highest level. To counter this, it is necessary to raise corporate awareness and to pick the most user-friendly data classification software available.
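A minimal sketch of the second method, assuming a hypothetical save hook and label set rather than any real product’s API: the owner must pick a valid label before the document can be saved, which is the “coercive, predetermined policy” in practice.

```python
# Illustrative labels and save hook; not a vendor API.
LABELS = ("Public", "Internal", "Restricted", "Confidential")

class UnclassifiedError(Exception):
    pass

def save_document(path: str, content: str, label: str | None) -> None:
    """Refuse to save until the data owner picks a valid classification label."""
    if label not in LABELS:
        raise UnclassifiedError(f"Choose one of {LABELS} before saving {path}")
    # Persist the label alongside the content (here: a simple header line).
    with open(path, "w") as f:
        f.write(f"[CLASSIFICATION: {label}]\n{content}")

save_document("offer.txt", "Pricing offer for tender 2024/17", "Confidential")
# save_document("notes.txt", "...", None)  # would raise UnclassifiedError
```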
- Another measure is establishing a “DLP (Data Loss / Leak Prevention)” system.
For DLP (Data Loss / Leak Prevention) to work correctly, as noted above, it must first know where the data is and how it is classified. It is then essential to map the data flows, how the data is used and where it sits in the overall workflow, and to correct the flows that are wrong. Unfortunately, DLP alone is not a magic wand and cannot prevent data leakage on its own; it needs supporting systems.
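As a rough illustration of how DLP builds on classification, here is a sketch of an egress check that assumes files already carry labels from the previous step; the domain name and blocked-label set are invented, and real DLP products inspect many channels (mail, web, USB, cloud sync) with far richer rules:

```python
INTERNAL_DOMAIN = "example.com"       # hypothetical corporate mail domain
BLOCKED_OUTSIDE = {"Confidential"}    # labels that must not leave the company

def allow_send(recipient: str, attachment_labels: list[str]) -> bool:
    """Return True if the message may leave, False if it must be blocked."""
    external = not recipient.lower().endswith("@" + INTERNAL_DOMAIN)
    leaking = any(label in BLOCKED_OUTSIDE for label in attachment_labels)
    return not (external and leaking)

# An internal mail with a confidential attachment passes;
# the same attachment addressed outside the company is blocked.
print(allow_send("colleague@example.com", ["Confidential"]))  # True
print(allow_send("partner@other.com", ["Confidential"]))      # False
```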
Another task needed to achieve this is establishing a governance system. Institutional data is constantly created, accessed, and circulated among users; that circulation and access cannot be prevented, but it can be controlled and supervised. This is called “Data Access Governance (DAG).” DAG briefly answers the following questions (a sketch of such an access report follows the list):
- Who owns the data?
- Where did the user get this authorization?
- What level of access authority does the user have?
- Which user is authorized to access this data?
- What data is in which enterprise resource (File Server, NAS, AD, Exchange, SharePoint)?
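The sketch below shows the kind of inventory a DAG report draws on, with hypothetical resources, owners, and permission entries; real DAG tools collect this directly from Active Directory, file-server ACLs, Exchange, and SharePoint.

```python
PERMISSIONS = [
    # (resource, data owner, user/group, access level, granted via)
    ("\\\\fileserver\\finance\\budget.xlsx", "CFO office",  "grp_finance", "Read/Write", "AD group"),
    ("\\\\fileserver\\finance\\budget.xlsx", "CFO office",  "j.smith",     "Read",       "direct ACL"),
    ("SharePoint:/sites/hr/contracts",        "HR director", "grp_hr",      "Read/Write", "site membership"),
]

def owner_of(resource: str):
    """Answer: who owns the data?"""
    return next((owner for res, owner, *_ in PERMISSIONS if res == resource), None)

def who_can_access(resource: str):
    """Answer: which users are authorized, at what level, and via which grant?"""
    return [(user, level, via) for res, _owner, user, level, via in PERMISSIONS if res == resource]

target = "\\\\fileserver\\finance\\budget.xlsx"
print("Owner:", owner_of(target))
for user, level, via in who_can_access(target):
    print(f"{user:12} {level:11} granted via {via}")
```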
Enterprise Data and Process Innovation
In today’s changing economic climate, organizations are looking for ways to expand their business and to increase or maintain their market share. To adapt to ever-changing market conditions and the demands of a new generation, they must manage the interaction between critical processes and corporate components at the right time and with the right moves. At this point, process innovation emerges as a fundamental approach. Notably, leading institutions have made process innovation one of the most important agenda items in annual evaluations and investor meetings. On the other hand, they have difficulty integrating innovation into their existing architectures: for organizations that have spent years standardizing processes and building architectural designs around that experience, adopting a more flexible and creative perspective is harder than expected. In response to this challenge, investments have started to move towards service-based architecture, cloud computing, mobile capabilities, social media applications, and similar technologies that support process innovation.
In summary, instead of treating the two as a paradox, institutions now aim to manage their innovation processes, which feed on creative ideas, together with the standardization efforts that establish solid structures, and to direct their technology investments accordingly.
At first glance, the innovation perspective conflicts with enterprise data requirements. Yet when we look at the definition and purpose of innovation, it is evident that a backbone structure is needed for innovation to come to life. It should not be forgotten that innovation, especially at the idea stage, is a creative process; but as that creative idea moves through design, development, and implementation, it needs an infrastructure shaped by an architectural perspective. From an enterprise data perspective, the essentials are standards, structural approaches, and management processes. At the same time, future expectations require these structures to adapt to conditions of constant change, and so the need for flexibility and thinking without borders, that is, the need for innovation, emerges.
The approach to take here is to free both perspectives from the perception of paradox and manage them in a coordinated way so that they support each other. Thus, process innovation can come to life. When investments are made in technologies that serve this direction, synergy is created, and institutions can build structures that adapt rapidly and economically to change in competitive environments. Change will always happen, and institutions will either adapt or disappear. Moreover, it is not enough merely to keep up with change; it is essential to do so in the most economical way.